Re: Features not present in Solr

2010-03-23 Thread Andrzej Bialecki

On 2010-03-23 06:25, David Smiley @MITRE.org wrote:


I use Endeca and Solr.

A few notable things in Endeca but not in Solr:
1. Real-time search.




2. related record navigation (RRN) is what they call it.  This is the
ability to join in other records, something Lucene/Solr definitely can't do.


Could you perhaps elaborate a bit on this functionality? Your 
description sounds intriguing - it reminds me of ParallelReader, but I'm 
probably completely wrong ...



--
Best regards,
Andrzej Bialecki 
 ___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: Question about query

2010-03-23 Thread Armando Ota

Hey ...

10x for your reply ... unfortunately this is not the case for me ...
I have canceled the feature which needs this ...

Kind regards

Armando


Erick Erickson wrote:

One thing I've seen suggested is to add the number of values to
a separate field, say topic_count. Then, in your situation above
you could append AND topic_count=1. This can extend
to work if you wanted any number of matches (and only
that number). For instance,
topic=5 AND topic=10 AND topic=20 AND topic_count=3 would
give you article 4.

Don't know if this works in your particular situation

Erick
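
A minimal sketch of that idea, assuming an indexed topic_count field is added to the
schema (field and type names here are only an example):

<field name="topic" type="int" indexed="true" stored="true" multiValued="true"/>
<field name="topic_count" type="int" indexed="true" stored="true"/>

The query for "no topic, or only topic 1" would then look something like:

q=topic:0 OR (topic:1 AND topic_count:1)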

On Mon, Mar 22, 2010 at 10:32 AM, Armando Ota armando...@siol.net wrote:

  

Hi

I need a little help with query for my problem (if it can be solved)

I have a field in a document called topic

this field contains some values, 0 (for no topic) or  1 (topic 1), 2, 3,
etc ...

It can contain many values like 1, 10, 50, etc (for 1 doc)

So now to the problem:
I would like to get documents that have 0 as the topic value, plus documents
that only have, for example, 1 as the topic value

articles for example:
article 1 topics: 1, 5, 10, 20, 24
article 2 topics: 0
article 3 topics: 1
article 4 topics: 5, 10, 20
article 5 topics: 1, 13, 19

So I need a search query that returns only articles 2 and 3, not the other articles
that have 1 among their topic values

Can that be done ? Any help appreciated

Kind regards

Armando





  


Configuring multiple SOLR apps to play nice with MBeans / JMX

2010-03-23 Thread Constantijn Visinescu
Hi,

I'm having a problem trying to get multiple Solr applications to run in the
same servlet container, because they all try to claim "solr" as the name/category
to put their MBeans under, and that causes exceptions/crashes for all the
applications after the first.

I've read http://wiki.apache.org/solr/SolrJmx and it shows configuration
options to define a JMX server agentId or to provide your own JMX URL, but I
don't want either (I think).

I just want my webapps to show up as "solr1", "solr2" and "solr3" when
monitoring them, rather than all of them racing for "solr" and every one after
the first crashing.

Right now I've disabled JMX, and that works to get my apps started at least,
but it's not what I want either.

Anyone know how to configure Solr to do this?
If a configuration option like <jmx name="solr1"/> existed, that would fix my
problem, but I can't seem to find it in the documentation.

Thanks in advance,
Constantijn Visinescu


Re: Configuring multiple SOLR apps to play nice with MBeans / JMX

2010-03-23 Thread Charl Mert
Hi Constantijn,

I'm not too sure about the JMX monitoring side of things, but having looked
at Solr's MultiCore feature, http://wiki.apache.org/solr/CoreAdmin ,
it seems really simple to create multiple Solr cores that could all be
configured to point to one MBean server.

When creating a core you can specify a name like solr1 or solr2:
http://localhost:8983/solr/admin/cores?action=CREATE&name=solr_01&instanceDir=/etc/solr/multicore/core2&config=solrconfig.xml&schema=schema.xml&dataDir=data

This is made possible by the fact that each core can have its own
solrconfig.xml.
See example/multicore/ in your solr distribution.
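
A minimal multicore solr.xml along those lines might look like this (core names and
instance directories are just an example):

<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="solr1" instanceDir="core1"/>
    <core name="solr2" instanceDir="core2"/>
  </cores>
</solr>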

Hope this helps.

Regards
Charl Mert



On Tue, Mar 23, 2010 at 12:10 PM, Constantijn Visinescu
baeli...@gmail.comwrote:

 Hi,

 I'm having a problem trying to get multiple solr applications to run in the
 same servlet container because they all try to claim solr as a
 name/category to put their mbeans under and that causes exceptions/crashes
 for all the applications after the first.

 I've read http://wiki.apache.org/solr/SolrJmx and it shows configuration
 options to define a JMX server agentID or to provide your own JMX url but i
 don't want either. (i think)

 I just want my webapps to show as solr1, solr2 and solr3 when
 monitoring them rather then all of them trying to race for solr and
 having
 all of them after the first crash.

 Right now I've disabled JMX and that works to get my apps started at least,
 but it's not what i want either.

 Anyone know how to configure solr to do this?
 If a configuration option like <jmx name="solr1"/> exists that'd fix my
 problem but i can't seem to find it in the documentation.

 Thanks in advance,
 Constantijn Visinescu



feature request for invalid data formats

2010-03-23 Thread Király Péter

Hi,

I don't know whether this is the right place to ask, or whether there is a special
tool for issue requests.

If I set a field to int, but the input contains a string, Solr reports
an error like this:


2010.03.23. 13:27:23 org.apache.solr.common.SolrException log
SEVERE: java.lang.NumberFormatException: For input string: "1595-1600"
   at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)

   at java.lang.Integer.parseInt(Integer.java:456)

It would be a great help in some cases if I could know which field contained
this data in the wrong format.


I would like to see something like this:
SEVERE: java.lang.NumberFormatException: For input string: "1595-1600" in
field named "date".


On the client side I have another problem. If I use post.jar, or a PHP
client, my error messages always look like this:

SimplePostTool: FATAL: Solr returned an error: For_input_string_15951600
__javalangNumberFormatException_For_input_string_15951600
___at_javalangNumberFormatExceptionforInputStringNumberFormat

(I added some line breaks for the sake of readability.)

Couldn't a string be returned in the same format as in the Solr log?

Péter



Perfect Match

2010-03-23 Thread Nair, Manas
Hello Experts,
 
I need help on one of my issues with perfect matching of terms.
 
I have a collection of artists which are stored in the index against the field
name artist_t, which is a text-type field. This field consists of values like
[dora, Dora The Explorer, Princess Dora The explorer] across various docs,
as in:
 
<doc>
  <field name="artist_t">Dora</field>
</doc>
<doc>
  <field name="artist_t">Dora The Explorer</field>
</doc>
<doc>
  <field name="artist_t">Princess Dora The Explorer</field>
</doc>
 
I am searching specifically on artist_t, like q=artist_t:Dora.
What I need is the one document which matches exactly with "Dora", i.e. the first
doc. "Dora The Explorer" and "Princess Dora The Explorer" should not come along
with it.
 
But I am getting all the above.
 
To tackle this problem, I tried to copyField artist_t to a new field
called artist_s, which is of type string, and indexed the content again. But this
approach also doesn't help.
I tried to create a new field type with KeywordTokenizer, created a field of that
type, and copied artist_t to this field. That also doesn't work.
 
Is there any way of doing this??
 
I need an exact match, i.e. if I search for artist_t:Dora The Explorer, I should get
only the second doc and not the third one (Princess Dora The Explorer).
 
 
Please Help!!
 
Manas


Re: Perfect Match

2010-03-23 Thread Ahmet Arslan

 I need help on one of my issues with perfect matching of
 terms.
  
 I have a collection of artists which are stored in the
 index against the field name artist_t which is a text type
 field. This field consists of values like [dora, Dora The
 Explorer, Princess Dora The explorer] across various docs
 as in 
  
 <doc>
 <field name="artist_t">Dora</field>
 </doc>
 <doc>
 <field name="artist_t">Dora The Explorer</field>
 </doc>
 <doc>
 <field name="artist_t">Princess Dora The Explorer</field>
 </doc>
  
 I am searching specifically on artist_t like
 q=artist_t:Dora.
 What I need is the one document which matches exactly with
 Dora, ie. the first doc. Dora the Explorer and Princess
 Dora The Explorer should not come along with it.
  
 But I am getting all the above.
  
 To tackle this problem, I tried to copyfield this artist_t
 to a new field called artist_s which is of type string and
 indexed the content again. But this approach also doesnt
 help.

with type="string", q=artist_s:Dora should return only
<doc>
<field name="artist_s">Dora</field>
</doc>

 I tried to create a new field type with Keyword Tokenizer.
 and tried to create a field of that type and copied artist_t
 to this field. That also doesnt work.

Maybe you have trailing whitespace in your artist values? Can you try adding
TrimFilterFactory after KeywordTokenizerFactory?

 Is there any way of doing this??
  
 I need exact match ie. if I search for artist_t:Dora The
 Explorer, I should get only the second doc and not the third
 one(Princess Dora The Explorer).

Note that q=artist_t:Dora The Explorer is parsed into artist_t:Dora 
defaultField:The defaultField:Explorer

Can you do your tests with q=artist_s:Dora?
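
For reference, a sketch of the schema.xml entries that give you such a string copy of
the field (assuming artist_s isn't already covered by a *_s dynamic field in your schema):

<field name="artist_s" type="string" indexed="true" stored="true"/>
<copyField source="artist_t" dest="artist_s"/>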


  


RE: Perfect Match

2010-03-23 Thread Nair, Manas
Thank you Ahmet. You were right: artist_s:Dora is bringing results.
But I need artist_s:Dora the explorer to bring only those results which contain
"Dora the explorer".

I tried artist_s:"Dora the explorer" (phrase search).. that is working.
But artist_s:Dora the explorer is not working. Is there any way to make
artist_s:Dora the explorer return results that contain this in them?
 
Thanks.



From: Ahmet Arslan [mailto:iori...@yahoo.com]
Sent: Tue 3/23/2010 9:32 AM
To: solr-user@lucene.apache.org
Subject: Re: Perfect Match




 I need help on one of my issues with perfect matching of
 terms.
 
 I have a collection of artists which are stored in the
 index against the field name artist_t which is a text type
 field. This field consists of values like [dora, Dora The
 Explorer, Princess Dora The explorer] across various docs
 as in
 
 <doc>
 <field name="artist_t">Dora</field>
 </doc>
 <doc>
 <field name="artist_t">Dora The Explorer</field>
 </doc>
 <doc>
 <field name="artist_t">Princess Dora The Explorer</field>
 </doc>
 
 I am searching specifically on artist_t like
 q=artist_t:Dora.
 What I need is the one document which matches exactly with
 Dora, ie. the first doc. Dora the Explorer and Princess
 Dora The Explorer should not come along with it.
 
 But I am getting all the above.
 
 To tackle this problem, I tried to copyfield this artist_t
 to a new field called artist_s which is of type string and
 indexed the content again. But this approach also doesnt
 help.

with type="string", q=artist_s:Dora should return only
<doc>
<field name="artist_s">Dora</field>
</doc>

 I tried to create a new field type with Keyword Tokenizer.
 and tried to create a field of that type and copied artist_t
 to this field. That also doesnt work.

May be you have trailing white-spaces in your artists? Can you try with adding 
TrimFilterFactory after KeywordTokenizerFactory?

 Is there any way of doing this??
 
 I need exact match ie. if I search for artist_t:Dora The
 Explorer, I should get only the second doc and not the third
 one(Princess Dora The Explorer).

Note that q=artist_t:Dora The Explorer is parsed into artist_t:Dora 
defaultField:The defaultField:Explorer

Can you do your tests with q=artist_s:Dora?


 




Re: Perfect Match

2010-03-23 Thread Erick Erickson
What Ahmet was getting at was that you need parentheses to ensure that
all your terms go against the artist_s field. Something like
artist_s:(Dora The Explorer). But watch capitalization.

Adding debugQuery=on to your query will show you a lot about what's going
on.

HTH
Erick
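
To make that concrete, two hedged example queries against the fields discussed above
(URL escaping omitted; debugQuery=on shows the parsed query):

q=artist_t:(Dora The Explorer)&debugQuery=on
q=artist_s:"Dora The Explorer"&debugQuery=on

The first groups all terms onto one field; the second quotes the whole value, which is
what an untokenized string field needs for an exact match.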

On Tue, Mar 23, 2010 at 9:59 AM, Nair, Manas manas.n...@mtvnmix.com wrote:

 Thankyou Ahmet. You were right. artist_s:Dora is bringing results.
 But I need artist_s:Dora the explorer to bring only those results which
 contain Dora the explorer.

 I tried to give artist_s:Dora the explorer (phrase search).. that is
 working. But artist_s:Dora the explorer is not working. Any way to make this
 artist_s:Dora the explorer to return results that contain this in them.

 Thanks.

 

 From: Ahmet Arslan [mailto:iori...@yahoo.com]
 Sent: Tue 3/23/2010 9:32 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Perfect Match




  I need help on one of my issues with perfect matching of
  terms.
 
  I have a collection of artists which are stored in the
  index against the field name artist_t which is a text type
  field. This field consists of values like [dora, Dora The
  Explorer, Princess Dora The explorer] across various docs
  as in
 
  <doc>
  <field name="artist_t">Dora</field>
  </doc>
  <doc>
  <field name="artist_t">Dora The Explorer</field>
  </doc>
  <doc>
  <field name="artist_t">Princess Dora The Explorer</field>
  </doc>
 
  I am searching specifically on artist_t like
  q=artist_t:Dora.
  What I need is the one document which matches exactly with
  Dora, ie. the first doc. Dora the Explorer and Princess
  Dora The Explorer should not come along with it.
 
  But I am getting all the above.
 
  To tackle this problem, I tried to copyfield this artist_t
  to a new field called artist_s which is of type string and
  indexed the content again. But this approach also doesnt
  help.

 with type="string", q=artist_s:Dora should return only
 <doc>
 <field name="artist_s">Dora</field>
 </doc>

  I tried to create a new field type with Keyword Tokenizer.
  and tried to create a field of that type and copied artist_t
  to this field. That also doesnt work.

 May be you have trailing white-spaces in your artists? Can you try with
 adding TrimFilterFactory after KeywordTokenizerFactory?

  Is there any way of doing this??
 
  I need exact match ie. if I search for artist_t:Dora The
  Explorer, I should get only the second doc and not the third
  one(Princess Dora The Explorer).

 Note that q=artist_t:Dora The Explorer is parsed into artist_t:Dora
 defaultField:The defaultField:Explorer

 Can you do your tests with q=artist_s:Dora?








Re: SOLR-1316 How To Implement this autosuggest component ???

2010-03-23 Thread Alexey Serba
 Error loading class 'org.apache.solr.spelling.suggest.Suggester'
Are you sure you applied the patch correctly?
See http://wiki.apache.org/solr/HowToContribute#Working_With_Patches

Checkout Solr trunk source code (
http://svn.apache.org/repos/asf/lucene/solr/trunk ), apply patch,
verify that everything went smoothly, build solr and use built version
for your tests.

On Mon, Mar 22, 2010 at 9:42 PM, stocki st...@shopgate.com wrote:

 I patched a nightly build of Solr.
 The patch runs, the classes are in the correct folder, but when I replace spellcheck
 with this spellcheck config like in the comments, Solr cannot find the classes =(

 <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
    <lst name="spellchecker">
      <str name="name">suggest</str>
      <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
      <str name="lookupImpl">org.apache.solr.spelling.suggest.jaspell.JaspellLookup</str>
      <str name="field">text</str>
      <str name="sourceLocation">american-english</str>
    </lst>
  </searchComponent>


 -- SCHWERWIEGEND: org.apache.solr.common.SolrException: Error loading class
 'org.apache.solr.spelling.suggest.Suggester'


 Why is that?? I think no one has as much trouble running a patch as
 me =( :D


 Andrzej Bialecki wrote:

 On 2010-03-19 13:03, stocki wrote:

 hello..

 I am trying to implement the autosuggest component from this link:
 http://issues.apache.org/jira/browse/SOLR-1316

 but I have no idea how to do this!?? Can anyone give me some tips?

 Please follow the instructions outlined in the JIRA issue, in the
 comment that shows fragments of XML config files.


 --
 Best regards,
 Andrzej Bialecki     
   ___. ___ ___ ___ _ _   __
 [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
 ___|||__||  \|  ||  |  Embedded Unix, System Integration
 http://www.sigram.com  Contact: info at sigram dot com




 --
 View this message in context: 
 http://old.nabble.com/SOLR-1316-How-To-Implement-this-autosuggest-component-tp27950949p27990809.html
 Sent from the Solr - User mailing list archive at Nabble.com.




Re: SOLR-1316 How To Implement this autosuggest component ???

2010-03-23 Thread stocki

Okay,
I did this..

but one file was not updated correctly:
Index: trunk/src/java/org/apache/solr/util/HighFrequencyDictionary.java
(from the suggest.patch)

I checked it out from Eclipse, applied the patch, and made a new solr.war ... is that the
right way?
I thought that by making a war I didn't need to make a build.

How do I make a build?




Alexey-34 wrote:
 
 Error loading class 'org.apache.solr.spelling.suggest.Suggester'
 Are you sure you applied the patch correctly?
 See http://wiki.apache.org/solr/HowToContribute#Working_With_Patches
 
 Checkout Solr trunk source code (
 http://svn.apache.org/repos/asf/lucene/solr/trunk ), apply patch,
 verify that everything went smoothly, build solr and use built version
 for your tests.
 
 On Mon, Mar 22, 2010 at 9:42 PM, stocki st...@shopgate.com wrote:

 i patch an nightly build from solr.
 patch runs, classes are in the correct folder, but when i replace
 spellcheck
 with this spellchecl like in the comments, solr cannot find the classes
 =(

  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
     <lst name="spellchecker">
       <str name="name">suggest</str>
       <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
       <str name="lookupImpl">org.apache.solr.spelling.suggest.jaspell.JaspellLookup</str>
       <str name="field">text</str>
       <str name="sourceLocation">american-english</str>
     </lst>
   </searchComponent>


  -- SCHWERWIEGEND: org.apache.solr.common.SolrException: Error loading class
  'org.apache.solr.spelling.suggest.Suggester'


 why is it so ??  i think no one has so many trouble to run a patch
 like
 me =( :D


 Andrzej Bialecki wrote:

 On 2010-03-19 13:03, stocki wrote:

 hello..

 i try to implement autosuggest component from these link:
 http://issues.apache.org/jira/browse/SOLR-1316

 but i have no idea how to do this !?? can anyone get me some tipps ?

 Please follow the instructions outlined in the JIRA issue, in the
 comment that shows fragments of XML config files.


 --
 Best regards,
 Andrzej Bialecki     
   ___. ___ ___ ___ _ _   __
 [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
 ___|||__||  \|  ||  |  Embedded Unix, System Integration
 http://www.sigram.com  Contact: info at sigram dot com




 --
 View this message in context:
 http://old.nabble.com/SOLR-1316-How-To-Implement-this-autosuggest-component-tp27950949p27990809.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 

-- 
View this message in context: 
http://old.nabble.com/SOLR-1316-How-To-Implement-this-patch-autoComplete-tp27950949p28001938.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: use termscomponent like spellComponent ?!

2010-03-23 Thread Grant Ingersoll

On Mar 22, 2010, at 12:09 PM, stocki wrote:

 
 Thanks.
 
 I tried to patch Solr with SOLR-1316 but it doesn't work =(
 
 Do I need to check out the nightly from svn?
 http://svn.apache.org/repos/asf/lucene/solr/ 

Yes, you will need to work from trunk.

 
 When I apply the patch and then create the WAR it is only 40 MB ...
 
 
 
 
 Grant Ingersoll-6 wrote:
 
 See https://issues.apache.org/jira/browse/SOLR-1316
 
 
 On Mar 21, 2010, at 2:34 PM, stocki wrote:
 
 
  Hello.
  
  I am playing with Solr but I didn't find the perfect solution for me.
  
  My goal is a search like the Amazon search from the iPhone app. ;)
  
  Is it possible to use the TermsComponent like the SpellComponent? So that
  the TermsComponent works with more than one single term?!
  
  I have these 3 docs with the name field in my index:
  - nikon one
  - nikon two
  - nikon three
  
  So when I search for "nik", the TermsComponent suggests "nikon". That's correctly
  what I want.
  But when I type "nikon on" I want Solr to suggest "nikon one".
  
  How is that realizable??? Pleeease help me somebody ;)
  
  A merge of TC and SC would be the best solution, I think.
  
  <field name="name" type="textgen" indexed="true" stored="true" required="true"/>
  This is my search field. Did I use the correct type?
 
 
 -- 
 View this message in context:
 http://old.nabble.com/use-termscomponent-like-spellComponent--%21-tp27977008p27977008.html
 Sent from the Solr - User mailing list archive at Nabble.com.
 
 
 --
 Grant Ingersoll
 http://www.lucidimagination.com/
 
 Search the Lucene ecosystem using Solr/Lucene:
 http://www.lucidimagination.com/search
 
 
 
 
 -- 
 View this message in context: 
 http://old.nabble.com/use-termscomponent-like-spellComponent--%21-tp27977008p27988620.html
 Sent from the Solr - User mailing list archive at Nabble.com.
 

--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene: 
http://www.lucidimagination.com/search



RE: PDFBox/Tika Performance Issues

2010-03-23 Thread Giovanni Fernandez-Kincade
Sorry for the late reply - been out of town for a couple of days. 

From my solrconfig:

<requestHandler name="/update/extract"
    class="org.apache.solr.handler.extraction.ExtractingRequestHandler"
    startup="lazy">
  <lst name="defaults">
    <str name="uprefix">ignored_</str>
    <str name="map.content">text</str>
  </lst>
</requestHandler>


-Original Message-
From: Grant Ingersoll [mailto:gsi...@gmail.com] On Behalf Of Grant Ingersoll
Sent: Saturday, March 20, 2010 8:43 AM
To: solr-user@lucene.apache.org
Subject: Re: PDFBox/Tika Performance Issues

What's your configuration look like for the ExtractReqHandler?

On Mar 19, 2010, at 2:42 PM, Giovanni Fernandez-Kincade wrote:

 Yeah I've been trying that - I keep getting this error when indexing a PDF 
 with a trunk-build:
 
   Apache Tomcat/5.5.27 - Error report
   HTTP Status 500 - org.apache.solr.handler.
   
 ContentStreamLoader.load(Lorg/apache/solr/request/SolrQueryRequest;Lorg/apache/solr/response/SolrQueryResponse;Lorg/apache/solr/common/util/ContentStream;)
   V  java.lang.AbstractMethodError: 
 org.apache.solr.handler.ContentStreamLoader.load(Lorg/apache/solr/request/SolrQueryRequest;Lorg/apache/solr/response/SolrQueryResponse;Lorg/apache/solr/common/util/ContentStream;)V
  
   at 
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)

   at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)

   at 
 org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:233)

   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1321)   
   at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)

   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)

   at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)

   at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)

   at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)

   at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) 
   at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117) 
   at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)   
 at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875) 
   at 
 org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
at 
 org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
at 
 org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
at 
 org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
at java.lang.Thread.run(Unknown Source)  type  Status report   message 
   
 org.apache.solr.handler.ContentStreamLoader.load(Lorg/apache/solr/request/SolrQueryRequest;Lorg/apache/solr/response/SolrQueryResponse;Lorg/apache/solr/common/util/ContentStream;)V
   java.lang.AbstractMethodError: 
 org.apache.solr.handler.ContentStreamLoader.load(Lorg/apache/solr/request/SolrQueryRequest;Lorg/apache/solr/response/SolrQueryResponse;Lorg/apache/solr/common/util/ContentStream;)V
at 
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at 
 org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:233)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1321)   at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) 
   at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117) 
   at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)   
 at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875) 
   at 
 

Re: PDFBox/Tika Performance Issues

2010-03-23 Thread Mattmann, Chris A (388J)
Hi Giovanni,

The error that you're showing in your logs below indicates that this method
signature:

org.apache.solr.handler.ContentStreamLoader.load(Lorg/apache/solr/request/SolrQueryRequest;Lorg/apache/solr/response/SolrQueryResponse;Lorg/apache/solr/common/util/ContentStream;)

doesn't match what was expected. Are you sure you don't have another Solr jar 
on the classpath somewhere, or in your web server? Are you using Jetty, or 
Tomcat?

Thanks,
Chris



On 3/23/10 7:59 AM, Giovanni Fernandez-Kincade 
gfernandez-kinc...@capitaliq.com wrote:

Sorry for the late reply - been out of town for a couple of days.

From my solrconfig:

<requestHandler name="/update/extract"
    class="org.apache.solr.handler.extraction.ExtractingRequestHandler"
    startup="lazy">
  <lst name="defaults">
    <str name="uprefix">ignored_</str>
    <str name="map.content">text</str>
  </lst>
</requestHandler>


-Original Message-
From: Grant Ingersoll [mailto:gsi...@gmail.com] On Behalf Of Grant Ingersoll
Sent: Saturday, March 20, 2010 8:43 AM
To: solr-user@lucene.apache.org
Subject: Re: PDFBox/Tika Performance Issues

What's your configuration look like for the ExtractReqHandler?

On Mar 19, 2010, at 2:42 PM, Giovanni Fernandez-Kincade wrote:

 Yeah I've been trying that - I keep getting this error when indexing a PDF 
 with a trunk-build:

   Apache Tomcat/5.5.27 - Error report
   HTTP Status 500 - org.apache.solr.handler.
   
 ContentStreamLoader.load(Lorg/apache/solr/request/SolrQueryRequest;Lorg/apache/solr/response/SolrQueryResponse;Lorg/apache/solr/common/util/ContentStream;)
   V  java.lang.AbstractMethodError: 
 org.apache.solr.handler.ContentStreamLoader.load(Lorg/apache/solr/request/SolrQueryRequest;Lorg/apache/solr/response/SolrQueryResponse;Lorg/apache/solr/common/util/ContentStream;)V
   at 
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
   at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
   at 
 org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:233)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1321)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
   at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
   at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
   at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
   at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) 
   at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117) 
   at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)   
 at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875) 
   at 
 org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
at 
 org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
at 
 org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
at 
 org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
at java.lang.Thread.run(Unknown Source)  type  Status report   message 
   
 org.apache.solr.handler.ContentStreamLoader.load(Lorg/apache/solr/request/SolrQueryRequest;Lorg/apache/solr/response/SolrQueryResponse;Lorg/apache/solr/common/util/ContentStream;)V
   java.lang.AbstractMethodError: 
 org.apache.solr.handler.ContentStreamLoader.load(Lorg/apache/solr/request/SolrQueryRequest;Lorg/apache/solr/response/SolrQueryResponse;Lorg/apache/solr/common/util/ContentStream;)V
at 
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at 
 org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:233)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1321)   at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
at 
 

Spatial queries

2010-03-23 Thread Jean-Sebastien Vachon
Hi All,

I am using the package from JTeam to perform spatial searches on my index. I'd 
like to know if it is possible
to build a query that uses multiple clauses. Here is an example:

q={!spatial lat=123 long=456 radius=10} OR {!spatial lat=111 long=222 
radius=20}title:java

Basically that would return all documents having the word java in the title 
field and that are either
within 10 miles from the first location OR 20 miles from the second.

I've made a few tries but it does not seem to be supported. I'm still wondering
if it would make sense to support this kind of query.

I could use multiple queries and merge the results myself but then I need some 
faceting.

Thanks


Re: Issue w/ highlighting a String field

2010-03-23 Thread Markus Jelsma
Hello,


Check out the wiki [1] on what options to use for highlighting and other 
components.


[1]: http://wiki.apache.org/solr/FieldOptionsByUseCase


Cheers,



On Tuesday 23 March 2010 17:11:42 Saïd Radhouani wrote:
 I have trouble with highlighting a field of type string. It looks like
 highlighting only works with tokenized fields, e.g. it worked with
 text and another type I defined. Is this true, or am I making a mistake that
 is preventing me from getting highlighting to work on string?
 
 Thanks for your help.
 

Markus Jelsma - Technisch Architect - Buyways BV
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350



Re: Issue w/ highlighting a String field

2010-03-23 Thread Saïd Radhouani
Thanks Markus. It says that a tokenizer must be defined for the field. Here
is the fieldType I'm using and the field I want to highlight on. As you can
see, I defined a tokenizer, but it's not working. Any idea?

In the schema:

<fieldType name="text_Sort" class="solr.TextField"
    sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>

<field name="title_sort" type="text_Sort" indexed="true"
    stored="true" multiValued="false"/>

In solrconfig.xml:
 <str name="hl.fl">title_sort text_description</str>

At the same time, I wanted to highlight phrases (including stop words), but
it's not working. I use  and as you can see in my fieldType, I don't have
a stopword filter. Any idea?

Thanks a lot,
-S.


Thanks


2010/3/23 Markus Jelsma mar...@buyways.nl

 Hello,


 Check out the wiki [1] on what options to use for highlighting and other
 components.


 [1]: http://wiki.apache.org/solr/FieldOptionsByUseCase


 Cheers,



 On Tuesday 23 March 2010 17:11:42 Saïd Radhouani wrote:
  I have trouble with highlighting field of type string. It looks like
  highlighting is only working with tokenized fields, f.i., it worked with
  text and another type I defined. Is this true, or I'm making a mistake
 that
  is preventing me to have the highlighting option working on string?
 
  Thanks for your help.
 

 Markus Jelsma - Technisch Architect - Buyways BV
 http://www.linkedin.com/in/markus17
  050-8536620 / 06-50258350




Re: Issue w/ highlighting a String field

2010-03-23 Thread Erick Erickson
Did you restart Solr and reindex? Just changing the field definition
won't help you without reindexing...

One thing worries me about your fragment: you call it text_Sort.
If you really intend to sort by this field, it may NOT be tokenized;
you'll probably have to use copyField...

HTH
Erick
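
A hedged sketch of that copyField arrangement (field names are only an example; the
text_Sort type is the one defined below):

<field name="title" type="text" indexed="true" stored="true"/>
<field name="title_sort" type="text_Sort" indexed="true" stored="true" multiValued="false"/>
<copyField source="title" dest="title_sort"/>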

On Tue, Mar 23, 2010 at 12:45 PM, Saïd Radhouani r.steve@gmail.comwrote:

 Thanks Markus. It says that a tokenizer ust be defined for the field.
 Here's
 is the fildType I'm using and the field I want to highlight on. As you can
 see, I defined a tokenizer, but it's not working though. Any idea?

 In the schema:

<fieldType name="text_Sort" class="solr.TextField"
 sortMissingLast="true" omitNorms="true">
<analyzer>
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.TrimFilterFactory"/>
</analyzer>
</fieldType>

<field name="title_sort" type="text_Sort" indexed="true"
 stored="true" multiValued="false"/>

 In solrconfig.xml:
 <str name="hl.fl">title_sort text_description</str>

 At the same time, I wanted to highlight phrases (including stop words), but
 it's not working. I use  and as you can see in my fieldType, I don't have
 a stopword filter. Any idea?

 Thanks a lot,
 -S.


 Thanks


 2010/3/23 Markus Jelsma mar...@buyways.nl

  Hello,
 
 
  Check out the wiki [1] on what options to use for highlighting and other
  components.
 
 
  [1]: http://wiki.apache.org/solr/FieldOptionsByUseCase
 
 
  Cheers,
 
 
 
  On Tuesday 23 March 2010 17:11:42 Saïd Radhouani wrote:
   I have trouble with highlighting field of type string. It looks like
   highlighting is only working with tokenized fields, f.i., it worked
 with
   text and another type I defined. Is this true, or I'm making a mistake
  that
   is preventing me to have the highlighting option working on string?
  
   Thanks for your help.
  
 
  Markus Jelsma - Technisch Architect - Buyways BV
  http://www.linkedin.com/in/markus17
   050-8536620 / 06-50258350
 
 



Re: Issue w/ highlighting a String field

2010-03-23 Thread Saïd Radhouani
Thanks Erick. Actually, I restarted and reindexed a number of times, but it is still
not working.

Re: your question, I intend to use this field for automatic phrase
boosting; is that ok?

<str name="pf">title_sort</str>

Thanks.

2010/3/23 Erick Erickson erickerick...@gmail.com

 Did you restart solr and reindex? just changing the field definition
 won't help you without reindexing...

 One thing worries me about your fragment, you call it text_Sort.
 If you really intend to sort by this field, it may NOT be tokenized,
 you'll probably have to use copyfield

 HTH
 Erick

 On Tue, Mar 23, 2010 at 12:45 PM, Saïd Radhouani r.steve@gmail.com
 wrote:

  Thanks Markus. It says that a tokenizer ust be defined for the field.
  Here's
  is the fildType I'm using and the field I want to highlight on. As you
 can
  see, I defined a tokenizer, but it's not working though. Any idea?
 
  In the schema:
 
 <fieldType name="text_Sort" class="solr.TextField"
  sortMissingLast="true" omitNorms="true">
 <analyzer>
 <tokenizer class="solr.KeywordTokenizerFactory"/>
 <filter class="solr.LowerCaseFilterFactory"/>
 <filter class="solr.TrimFilterFactory"/>
 </analyzer>
 </fieldType>

 <field name="title_sort" type="text_Sort" indexed="true"
  stored="true" multiValued="false"/>

  In solrconfig.xml:
  <str name="hl.fl">title_sort text_description</str>
 
  At the same time, I wanted to highlight phrases (including stop words),
 but
  it's not working. I use  and as you can see in my fieldType, I don't
 have
  a stopword filter. Any idea?
 
  Thanks a lot,
  -S.
 
 
  Thanks
 
 
  2010/3/23 Markus Jelsma mar...@buyways.nl
 
   Hello,
  
  
   Check out the wiki [1] on what options to use for highlighting and
 other
   components.
  
  
   [1]: http://wiki.apache.org/solr/FieldOptionsByUseCase
  
  
   Cheers,
  
  
  
   On Tuesday 23 March 2010 17:11:42 Saïd Radhouani wrote:
I have trouble with highlighting field of type string. It looks
 like
highlighting is only working with tokenized fields, f.i., it worked
  with
text and another type I defined. Is this true, or I'm making a
 mistake
   that
is preventing me to have the highlighting option working on string?
   
Thanks for your help.
   
  
   Markus Jelsma - Technisch Architect - Buyways BV
   http://www.linkedin.com/in/markus17
    050-8536620 / 06-50258350
  
  
 



Cannot fetch urls with target=_blank

2010-03-23 Thread Stefano Cherchi
As in the subject: when I try to fetch a page whose link should open in a new window,
Nutch fails.

I know it is not a Solr issue, actually, but I beg for a hint.

S

 -- 
Anyone proposing to run Windows on servers should be prepared to explain 
what they know about servers that Google, Yahoo, and Amazon don't.
Paul Graham


A mathematician is a device for turning coffee into theorems.
Paul Erdos (who obviously never met a sysadmin)







DIH - Deleting documents

2010-03-23 Thread André Maldonado
Hy all.

How can I delete documents when using DataImportHandler on a delta import?

Thanks

Then those who were in the boat came and worshiped him, saying: Truly you are
the Son of God. (Matthew 14:33)


Re: DIH - Deleting documents

2010-03-23 Thread Mauricio Scheffer
Take a look at the DIH special commands:
http://wiki.apache.org/solr/DataImportHandler#Special_Commands
http://wiki.apache.org/solr/DataImportHandler#Special_CommandsSome other
options:
http://stackoverflow.com/questions/1555610/solr-dih-how-to-handle-deleted-documents

Cheers,
Mauricio
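
For a delta import specifically, a hedged sketch of what this can look like in
data-config.xml (table, column, and marker names are only assumptions about the schema):

<entity name="item" pk="id"
        query="select id, name from item"
        deltaQuery="select id from item where last_modified &gt; '${dataimporter.last_index_time}'"
        deletedPkQuery="select id from item where deleted = 1"/>

deletedPkQuery is the usual DIH hook for picking up deleted rows on delta-import; the
special commands on that wiki page ($deleteDocById, $deleteDocByQuery) can instead be
returned as columns from the query itself.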

2010/3/23 André Maldonado andre.maldon...@gmail.com

 Hy all.

 How can I delete documents when using DataImportHandler on a delta import?

 Thank's

 Então aproximaram-se os que estavam no barco, e adoraram-no, dizendo: És
 verdadeiramente o Filho de Deus. (Mateus 14:33)



Performing Starts with searches

2010-03-23 Thread Vladimir Sutskever
How do I perform a "starts with" search in Lucene/Solr?

Ex: I need all results that start with "Bill" - NOT ones that just contain "Bill"
somewhere in the search string.



Thank You

-Vladimir


RE: lowercasing for sorting

2010-03-23 Thread Binkley, Peter
Solr makes this easy:

<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>

You can populate this field from another field using copyField, if you
also need to be able to search or display the original values.
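
Putting that together, a minimal sketch (type and field names are only an example):

<fieldType name="string_lc" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="title_lc" type="string_lc" indexed="true" stored="false"/>
<copyField source="title" dest="title_lc"/>

and then sort on it with sort=title_lc asc.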

Just out of curiosity, can you tell us anything about what the Globe and
Mail is using Solr for? (assuming the question is work-related)

Peter


 -Original Message-
 From: Nagelberg, Kallin [mailto:knagelb...@globeandmail.com] 
 Sent: Tuesday, March 23, 2010 11:07 AM
 To: 'solr-user@lucene.apache.org'
 Subject: lowercasing for sorting
 
 I'm trying to perform a case-insensitive sort on a field in 
 my index that contains values like
 
 aaa
 bbb
 AA
 BB
 
 And I get them sorted like:
 
 aaa
 bbb
 AA
 BB
 
 When I would like them:
 
 aa
 aaa
 bb
 bbb
 
 To do this I'm trying to set up a fieldType whose sole purpose 
 is to lowercase a value on query and index. I don't want to 
 tokenize the value, just lowercase it. Any ideas?
 
 Thanks,
 Kallin Nagelberg
 


RE: lowercasing for sorting

2010-03-23 Thread Nagelberg, Kallin
Thanks, and my cover is apparently blown :P

We're looking at solr for a number of applications, from taking the load off 
the database, to user searching etc. I don't think I'll get fired for saying 
that :P

Thanks,
Kallin Nagelberg

-Original Message-
From: Binkley, Peter [mailto:peter.bink...@ualberta.ca] 
Sent: Tuesday, March 23, 2010 2:09 PM
To: solr-user@lucene.apache.org
Subject: RE: lowercasing for sorting

Solr makes this easy:

<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>

You can populate this field from another field using copyField, if you
also need to be able to search or display the original values.

Just out of curiosity, can you tell us anything about what the Globe and
Mail is using Solr for? (assuming the question is work-related)

Peter


 -Original Message-
 From: Nagelberg, Kallin [mailto:knagelb...@globeandmail.com] 
 Sent: Tuesday, March 23, 2010 11:07 AM
 To: 'solr-user@lucene.apache.org'
 Subject: lowercasing for sorting
 
 I'm trying to perform a case-insensitive sort on a field in 
 my index that contains values like
 
 aaa
 bbb
 AA
 BB
 
 And I get them sorted like:
 
 aaa
 bbb
 AA
 BB
 
 When I would like them:
 
 aa
 aaa
 bb
 bbb
 
 To do this I'm trying to setup a fieldType who's sole purpose 
 is to lowercase a value on query and index. I don't want to 
 tokenize the value, just lowercase it. Any ideas?
 
 Thanks,
 Kallin Nagelberg
 


RE: PDFBox/Tika Performance Issues

2010-03-23 Thread Giovanni Fernandez-Kincade
I don't think so. 

I'm using Tomcat on my servers, but I set up my local machine with the 
Eclipse-Jetty plugin from that Lucid article and I'm getting the same error. 

These are the library references in my Eclipse project:
apache-solr-core-1.5-dev.jar
apache-solr-dataimporthandler-1.5-dev.jar
apache-solr-solrj-1.5-dev.jar
commons-codec-1.3.jar
commons-csv-1.0-SNAPSHOT-r609327.jar
commons-fileupload-1.2.1.jar
commons-httpclient-3.1.jar
commons-io-1.4.jar
geronimo-stax-api_1.0_spec-1.0.1.jar
google-collect-1.0.jar
jcl-over-slf4j-1.5.5.jar
lucene-analyzers-2.9.2.jar
lucene-collation-2.9.2.jar
lucene-core-2.9.2.jar
lucene-fast-vector-highlighter-2.9.2.jar
lucene-highlighter-2.9.2.jar
lucene-memory-2.9.2.jar
lucene-misc-2.9.2.jar
lucene-queries-2.9.2.jar
lucene-snowball-2.9.2.jar
lucene-spatial-2.9.2.jar
lucene-spellchecker-2.9.2.jar
slf4j-api-1.5.5.jar
slf4j-jdk14-1.5.5.jar
wstx-asl-3.2.7.jar
apache-solr-cell-1.4-dev.jar
asm-3.1.jar
bcmail-jdk15-1.45.jar
bcprov-jdk15-1.45.jar
commons-codec-1.3.jar
commons-compress-1.0.jar
commons-io-1.4.jar
commons-lang-2.1.jar
commons-logging-1.1.1.jar
dir.txt
dom4j-1.6.1.jar
fontbox-1.0.0.jar
geronimo-stax-api_1.0_spec-1.0.1.jar
hamcrest-core-1.1.jar
icu4j-3.8.jar
jempbox-1.0.0.jar
junit-3.8.1.jar
log4j-1.2.14.jar
lucene-core-2.9.1-dev.jar
lucene-misc-2.9.1-dev.jar
metadata-extractor-2.4.0-beta-1.jar
mockito-core-1.7.jar
nekohtml-1.9.9.jar
objenesis-1.0.jar
ooxml-schemas-1.0.jar
pdfbox-1.0.0.jar
poi-3.6.jar
poi-ooxml-3.6.jar
poi-ooxml-schemas-3.6.jar
poi-scratchpad-3.6.jar
tagsoup-1.2.jar
tika-app-0.7-SNAPSHOT.jar
tika-core-0.7-SNAPSHOT.jar
tika-parsers-0.7-SNAPSHOT.jar
xercesImpl-2.8.1.jar
xml-apis-1.0.b2.jar
xmlbeans-2.3.0.jar

-Original Message-
From: Mattmann, Chris A (388J) [mailto:chris.a.mattm...@jpl.nasa.gov] 
Sent: Tuesday, March 23, 2010 11:03 AM
To: solr-user@lucene.apache.org
Subject: Re: PDFBox/Tika Performance Issues

Hi Giovanni,

The error that you're showing in your logs below indicates that this message 
signature:

org.apache.solr.handler.ContentStreamLoader.load(Lorg/apache/solr/request/SolrQueryRequest;Lorg/apache/solr/response/SolrQueryResponse;Lorg/apache/solr/common/util/ContentStream;)

doesn't match what was expected. Are you sure you don't have another Solr jar 
on the classpath somewhere, or in your web server? Are you using Jetty, or 
Tomcat?

Thanks,
Chris



On 3/23/10 7:59 AM, Giovanni Fernandez-Kincade 
gfernandez-kinc...@capitaliq.com wrote:

Sorry for the late reply - been out of town for a couple of days.

From my solrconfig:

<requestHandler name="/update/extract"
    class="org.apache.solr.handler.extraction.ExtractingRequestHandler"
    startup="lazy">
  <lst name="defaults">
    <str name="uprefix">ignored_</str>
    <str name="map.content">text</str>
  </lst>
</requestHandler>


-Original Message-
From: Grant Ingersoll [mailto:gsi...@gmail.com] On Behalf Of Grant Ingersoll
Sent: Saturday, March 20, 2010 8:43 AM
To: solr-user@lucene.apache.org
Subject: Re: PDFBox/Tika Performance Issues

What's your configuration look like for the ExtractReqHandler?

On Mar 19, 2010, at 2:42 PM, Giovanni Fernandez-Kincade wrote:

 Yeah I've been trying that - I keep getting this error when indexing a PDF 
 with a trunk-build:

   Apache Tomcat/5.5.27 - Error report
   HTTP Status 500 - org.apache.solr.handler.
   
 ContentStreamLoader.load(Lorg/apache/solr/request/SolrQueryRequest;Lorg/apache/solr/response/SolrQueryResponse;Lorg/apache/solr/common/util/ContentStream;)
   V  java.lang.AbstractMethodError: 
 org.apache.solr.handler.ContentStreamLoader.load(Lorg/apache/solr/request/SolrQueryRequest;Lorg/apache/solr/response/SolrQueryResponse;Lorg/apache/solr/common/util/ContentStream;)V
   at 
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
   at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
   at 
 org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:233)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1321)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
   at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
   at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
   at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
   at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) 
   at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117) 
   at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)   

Re: Issue w/ highlighting a String field

2010-03-23 Thread Ahmet Arslan
 Thanks Erik. Actually, I restarted
 and reindexed numers of time, but still
 not working.

Highlighting on string-typed fields works perfectly. See the output of:

http://localhost:8983/solr/select/?q=id%3ASOLR1000&version=2.2&start=0&rows=10&indent=on&hl=true&hl.fl=id

But there must be a match/hit to get highlighting. What is your query and 
candidate field content that you want to highlight?


  


Re: DIH - Deleting documents

2010-03-23 Thread blargy

Are there any examples out there for using these special commands? I'm not
quite sure of the syntax. Any simple example will suffice. Thanks


mausch wrote:
 
 Take a look at the DIH special commands:
 http://wiki.apache.org/solr/DataImportHandler#Special_Commands
 http://wiki.apache.org/solr/DataImportHandler#Special_CommandsSome other
 options:
 http://stackoverflow.com/questions/1555610/solr-dih-how-to-handle-deleted-documents
 
 Cheers,
 Mauricio
 
 2010/3/23 André Maldonado andre.maldon...@gmail.com
 
 Hy all.

 How can I delete documents when using DataImportHandler on a delta
 import?

 Thank's

 Então aproximaram-se os que estavam no barco, e adoraram-no, dizendo: És
 verdadeiramente o Filho de Deus. (Mateus 14:33)

 
 

-- 
View this message in context: 
http://old.nabble.com/DIH---Deleting-documents-tp28004771p28005199.html
Sent from the Solr - User mailing list archive at Nabble.com.



RE: Perfect Match

2010-03-23 Thread Ahmet Arslan
 Thankyou Ahmet. You were right.
 artist_s:Dora is bringing results.
 But I need artist_s:Dora the explorer to bring only those
 results which contain Dora the explorer.
  
 I tried to give artist_s:Dora the explorer (phrase
 search).. that is working. But artist_s:Dora the explorer is
 not working. Any way to make this artist_s:Dora the explorer
 to return results that contain this in them.

I learned this from Chris Hostetter's message [1]. You can use
q={!field f=artist_s}Dora the explorer
instead of q=artist_s:Dora the explorer.

[1]http://search-lucene.com/m/rrHVV1ZhO4j/this+is+what+the+%22field%22+QParserPlugin+was+invented+for
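
In a raw request URL that would be encoded along these lines (host, port and core name
are hypothetical):

http://localhost:8983/solr/select?q=%7B!field+f%3Dartist_s%7DDora+the+explorer&debugQuery=on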


  


Re: DIH - Deleting documents

2010-03-23 Thread André Maldonado
In my case I will solve the problem with postImportDeleteQuery.

Thanks

Then those who were in the boat came and worshiped him, saying: Truly you are
the Son of God. (Matthew 14:33)
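
For the archives, a hedged sketch of that attribute in data-config.xml (the entity and
the deleted:true marker are only an example; the attribute takes a Solr query that is
used to delete documents from the index after the import runs):

<entity name="item"
        query="select id, name, deleted from item"
        postImportDeleteQuery="deleted:true"/>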


On Tue, Mar 23, 2010 at 15:29, blargy zman...@hotmail.com wrote:


 Are there any examples out there for using these special commands? Im not
 quite sure of the syntax. Any simple example will suffice. Thanks


 mausch wrote:
 
  Take a look at the DIH special commands:
  http://wiki.apache.org/solr/DataImportHandler#Special_Commands
  http://wiki.apache.org/solr/DataImportHandler#Special_CommandsSome
 other
  options:
 
 http://stackoverflow.com/questions/1555610/solr-dih-how-to-handle-deleted-documents
 
  Cheers,
  Mauricio
 
  2010/3/23 André Maldonado andre.maldon...@gmail.com
 
  Hy all.
 
  How can I delete documents when using DataImportHandler on a delta
  import?
 
  Thank's
 
  Então aproximaram-se os que estavam no barco, e adoraram-no, dizendo:
 És
  verdadeiramente o Filho de Deus. (Mateus 14:33)
 
 
 

 --
 View this message in context:
 http://old.nabble.com/DIH---Deleting-documents-tp28004771p28005199.html
 Sent from the Solr - User mailing list archive at Nabble.com.




Impossible Boost Query?

2010-03-23 Thread blargy

I was wondering if this is even possible. I'll try to explain what I'm trying
to do to the best of my ability. 

Ok, so our site has a bunch of products that are sold by any number of
sellers. Currently when I search for some product I get back all products
matching that search term but the problem is there may be multiple products
sold by the same seller that are all closely related, therefore their scores
are related. So basically the search ends up with results that are all
closely clumped together by the same seller, but I would much prefer
to distribute these results across sellers (giving each seller a fair shot to
sell their goods).

Is there any way to add some boost query for example that will start
weighing products lower when their seller has already been listed a few
times. For example, right now I have

Product foo by Seller A
Product foo by Seller A
Product foo by Seller A
Product foo by Seller B
Product foo by Seller B
Product foo by Seller B
Product foo by Seller C
Product foo by Seller C
Product foo by Seller C

where each result is very close in score. I would like something like this

Product foo by Seller A
Product foo by Seller B
Product foo by Seller C
Product foo by Seller A
Product foo by Seller B
Product foo by Seller C


basically distributing the results over the sellers. Is something like this
possible? I don't care if the solution involves a boost query or not. I just
want some way to distribute closely related documents.

Thanks!!!
-- 
View this message in context: 
http://old.nabble.com/Impossible-Boost-Query--tp28005354p28005354.html
Sent from the Solr - User mailing list archive at Nabble.com.



Out of Memory

2010-03-23 Thread Neil Chaudhuri
I am using the DataImportHandler to index literally millions of documents in an 
Oracle database. Not surprisingly, I got the following after a few hours:

java.sql.SQLException: ORA-04030: out of process memory when trying to allocate 
4032 bytes (kolaGetRfcHeap,kghsseg: kolaslCreateCtx)

Has anyone come across this? What are the ways around this, if any?

Thanks.


Solr Self-Join Query

2010-03-23 Thread Vladimir Sutskever
Hi Guys/Gals,


I have columns like so in my index
 
client_id,
client_name,
client_parent_id


Does SOLR support self-join queries?

Example:

client_name:wallmart AND (client_parent_id!=client_id)

I need all entries that match wallmart and do NOT have 
client_parent_id==client_id

Thank you for your help
-Vladimir


RE: Out of Memory

2010-03-23 Thread Craig Christman
Is this on Oracle 10.2.0.4?  Looking at the Oracle support site there's a 
memory leak using some of the XML functions that can be fixed by upgrading to 
10.2.0.5, 11.2, or by using 10.2.0.4 Patch 2 in Windows 32-bit.

-Original Message-
From: Neil Chaudhuri [mailto:nchaudh...@potomacfusion.com]
Sent: Tuesday, March 23, 2010 3:21 PM
To: 'solr-user@lucene.apache.org'
Subject: Out of Memory

I am using the DataImportHandler to index literally millions of documents in an 
Oracle database. Not surprisingly, I got the following after a few hours:

java.sql.SQLException: ORA-04030: out of process memory when trying to allocate 
4032 bytes (kolaGetRfcHeap,kghsseg: kolaslCreateCtx)

Has anyone come across this? What are the ways around this, if any?

Thanks.


RE: Out of Memory

2010-03-23 Thread Dennis Gearon
Now THAT's real open source help! Nice job Craig.
Dennis Gearon

Signature Warning

EARTH has a Right To Life,
  otherwise we all die.

Read 'Hot, Flat, and Crowded'
Laugh at http://www.yert.com/film.php


--- On Tue, 3/23/10, Craig Christman cchrist...@caci.com wrote:

 From: Craig Christman cchrist...@caci.com
 Subject: RE: Out of Memory
 To: solr-user@lucene.apache.org solr-user@lucene.apache.org
 Date: Tuesday, March 23, 2010, 1:01 PM
 Is this on Oracle 10.2.0.4? 
 Looking at the Oracle support site there's a memory leak
 using some of the XML functions that can be fixed by
 upgrading to 10.2.0.5, 11.2, or by using 10.2.0.4 Patch 2 in
 Windows 32-bit.
 
 -Original Message-
 From: Neil Chaudhuri [mailto:nchaudh...@potomacfusion.com]
 Sent: Tuesday, March 23, 2010 3:21 PM
 To: 'solr-user@lucene.apache.org'
 Subject: Out of Memory
 
 I am using the DataImportHandler to index literally
 millions of documents in an Oracle database. Not
 surprisingly, I got the following after a few hours:
 
 java.sql.SQLException: ORA-04030: out of process memory
 when trying to allocate 4032 bytes (kolaGetRfcHeap,kghsseg:
 kolaslCreateCtx)
 
 Has anyone come across this? What are the ways around this,
 if any?
 
 Thanks.



Re: Solr Self-Join Query

2010-03-23 Thread Otis Gospodnetic
Vladimir,

Think of Solr/Lucene index as a single, flat, denormalized table, where the 
columns are called fields.

client_id:walmart
client_name:Walmart
client_parent_id:walmart

The query that I think you are looking for then becomes:

+client_id:walmart -client_parent_id:walmart


Otis 

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/



- Original Message 
 From: Vladimir Sutskever vladimir.sutske...@jpmorgan.com
 To: solr-user@lucene.apache.org solr-user@lucene.apache.org
 Sent: Tue, March 23, 2010 3:36:42 PM
 Subject: Solr Self-Join Query
 
 Hi Guys/Gals,
 
 I have columns like so in my index
 
 client_id,
 client_name,
 client_parent_id
 
 Does SOLR support queries of self-join.
 
 Example:
 
 client_name:wallmart AND (client_parent_id!=client_id)
 
 I need all entries that match wallmart and do NOT have 
 client_parent_id==client_id
 
 Thank you for your help
 -Vladimir
 This email is confidential and subject to important disclaimers and
 conditions including on offers for the purchase or sale of
 securities, accuracy and completeness of information, viruses,
 confidentiality, legal privilege, and legal entity disclaimers,
 available at http://www.jpmorgan.com/pages/disclosures/email.


Re: Impossible Boost Query?

2010-03-23 Thread Otis Gospodnetic
Would Field Collapsing from SOLR-236 do the job for you?

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/



- Original Message 
 From: blargy zman...@hotmail.com
 To: solr-user@lucene.apache.org
 Sent: Tue, March 23, 2010 2:39:48 PM
 Subject: Impossible Boost Query?
 
 
I was wondering if this is even possible. I'll try to explain what I'm trying
to do to the best of my ability. 

Ok, so our site has a bunch of products that are sold by any number of
sellers. Currently when I search for some product I get back all products
matching that search term but the problem is there may be multiple products
sold by the same seller that are all closely related, therefore their scores
are related. So basically the search ends up with results that are all
closely clumped together by the same seller but I would much rather prefer
to distribute these results across sellers (given each seller a fair shot to
sell their goods). 

Is there any way to add some boost query for example that will start
weighing products lower when their seller has already been listed a few
times. For example, right now I have

Product foo by Seller A
Product foo by Seller A
Product foo by Seller A
Product foo by Seller B
Product foo by Seller B
Product foo by Seller B
Product foo by Seller C
Product foo by Seller C
Product foo by Seller C

where each result is very close in score. I would like something like this

Product foo by Seller A
Product foo by Seller B
Product foo by Seller C
Product foo by Seller A
Product foo by Seller B
Product foo by Seller C

basically distributing the results over the sellers. Is something like this
possible? I don't care if the solution involves a boost query or not. I just
want some way to distribute closely related documents.

Thanks!!!
-- 
View this message in context:
http://old.nabble.com/Impossible-Boost-Query--tp28005354p28005354.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Cannot fetch urls with target=_blank

2010-03-23 Thread Otis Gospodnetic
hi Stefano,

nutch-user@ is a much better place to ask this question, really.  You'll also 
want to include more info about how Nutch fails.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/



- Original Message 
 From: Stefano Cherchi stefanocher...@yahoo.it
 To: solr-user@lucene.apache.org
 Sent: Tue, March 23, 2010 1:40:46 PM
 Subject: Cannot fetch urls with target=_blank
 
 As in subject: when I try to fetch a page whose link should open in new window, 
 Nutch fails. 

I know it is not a Solr issue, actually, but I beg for a hint.

S

-- 
Anyone proposing to run Windows on servers should be prepared to explain 
what they know about servers that Google, Yahoo, and Amazon don't.
Paul Graham


A mathematician is a device for turning coffee into theorems.
Paul Erdos (who obviously never met a sysadmin)


Re: Features not present in Solr

2010-03-23 Thread David Smiley @MITRE.org

Interesting.  Do you have a reference (e.g. a patch, post, ...) to people
actually doing this?  The FieldCache seems like cheating because it's
in-memory and there is a limited amount of memory, so for large data sets I
have to wonder.


Grant Ingersoll-6 wrote:
 
 
 On Mar 23, 2010, at 4:17 AM, Andrzej Bialecki wrote:
 
 On 2010-03-23 06:25, David Smiley @MITRE.org wrote:
 
 I use Endeca and Solr.
 
 A few notable things in Endeca but not in Solr:
 1. Real-time search.
 
 
 2. related record navigation (RRN) is what they call it.  This is the
 ability to join in other records, something Lucene/Solr definitely can't
 do.
 
 Could you perhaps elaborate a bit on this functionality? Your description
 sounds intriguing - it reminds me of ParallelReader, but I'm probably
 completely wrong ...
 
 
 AIUI, it just allows you to do joins like in a db.  So, given a music
 band, get related things like band members, albums, etc.  You can do this
 in Lucene with some work by leveraging Field Cache, but it gets tricky in
 light of freq. updates.
 

-- 
View this message in context: 
http://old.nabble.com/Features-not-present-in-Solr-tp27966315p28006723.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Configuring multiple SOLR apps to play nice with MBeans / JMX

2010-03-23 Thread Otis Gospodnetic
Wow, this sounds interesting.  I never looked at JMX with multiple Solr 
instances.
I wonder if this calls for a new JIRA issue...

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/



- Original Message 
 From: Constantijn Visinescu baeli...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Tue, March 23, 2010 8:42:30 AM
 Subject: Re: Configuring multiple SOLR apps to play nice with MBeans / JMX
 
 Hi,

Multicore lets me have multiple cores in a single instance.

However, since I have 3 different webapps with embedded Solr, that means I
have 3 different instances of Solr (and they're all trying to park their
JMX MBeans under the same name, namely solr).

Constantijn


On Tue, Mar 23, 2010 at 11:44 AM, Charl Mert ch...@knowledgetree.com wrote:

  Hi Constantijn,

  I'm not too sure about the JMX monitoring side of things but having looked
  at the Solr's MultiCore http://wiki.apache.org/solr/CoreAdmin feature it
  seems really simple to create multiple solr cores that could all be
  configured to point to one MBean server.

  When creating a core you can specify name like solr1, solr2:

  http://localhost:8983/solr/admin/cores?action=CREATE&name=solr_01&instanceDir=/etc/solr/multicore/core2&config=solrconfig.xml&schema=schema.xml&dataDir=data

  This is made possible due to the fact that each core can have it's own
  solrconfig.xml
  See example/multicore/ in your solr distribution.

  Hope this helps.

  Regards
  Charl Mert





Re: [POLL] Users of abortOnConfigurationError ?

2010-03-23 Thread Ryan McKinley
The 'abortOnConfigurationError' option was added a long time ago...
at the time, there were many errors that would just be written to the
logs but startup would continue normally.

I felt (and still do) that if there is a configuration error
everything should fail loudly.  The option in solrconfig.xml was added
as a back-compatible way to get both behaviors.

I don't see any value in letting solr continue working even though
something was configured wrong.

Does a lack of replies to this thread imply that everyone agrees?
(Reading the email, and following directions, I should just ignore
this email)

Ryan


On Thu, Mar 18, 2010 at 9:12 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 Due to some issues with the (lack of) functionality behind the
 abortOnConfigurationError option in solrconfig.xml, I'd like to take a
 quick poll of the solr-user community...

  * If you have never heard of the abortOnConfigurationError
   option prior to this message, please ignore this email.

  * If you have seen abortOnConfigurationError in solrconfig.xml,
   or in error messages when using Solr, but you have never
   modified the value of this option in your configs, or changed
   it at run time, please ignore this email.

  * If you have ever set abortOnConfigurationError=false, either
   in your config files or at run time, please reply to these
   three questions...

 1) What version of Solr are you using ?

 2) What advantages do you perceive that you have by setting
   abortOnConfigurationError=false ?

 3) What problems do you suspect you would encounter if this
   option was eliminated in future versions of Solr ?

 Thank you.

 (For people who are interested, the impetuses for this Poll can be found in
 SOLR-1743, SOLR-1817, SOLR-1824, and SOLR-1832)


 -Hoss




Re: Impossible Boost Query?

2010-03-23 Thread blargy

Possibly. How can I install this as a contrib or do I need to actually
perform the patch?


Otis Gospodnetic wrote:
 
 Would Field Collapsing from SOLR-236 do the job for you?
 
 Otis
 
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Hadoop ecosystem search :: http://search-hadoop.com/
 
 
 

-- 
View this message in context: 
http://old.nabble.com/Impossible-Boost-Query--tp28005354p2800.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Impossible Boost Query?

2010-03-23 Thread blargy

Maybe a better question is... how can I install this and will it work with
1.4?

Thanks


blargy wrote:
 
 Possibly. How can I install this as a contrib or do I need to actually
 perform the patch?
 
 
 Otis Gospodnetic wrote:
 
 Would Field Collapsing from SOLR-236 do the job for you?
 
 Otis
 
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Hadoop ecosystem search :: http://search-hadoop.com/
 
 
 

-- 
View this message in context: 
http://old.nabble.com/Impossible-Boost-Query--tp28005354p28007880.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Configuring multiple SOLR apps to play nice with MBeans / JMX

2010-03-23 Thread Chris Hostetter

: I'm having a problem trying to get multiple solr applications to run in the
: same servlet container because they all try to claim solr as a

Hmmm... I think you're in new territory here.   I don't know that anyone 
has ever mentioned doing this before.

Honestly: I thought the hierarchical nature of JMX would mean that 
the Servlet Container would start up a JMX server and present a separate 
branch to each webapp in isolation -- based on what you're saying, it 
sounds like different webapps can actually break each other by mucking 
with JMX Beans/values.

: If a configuration option like jmx name=solr1 / exists that'd fix my
: problem but i can't seem to find it in the documentation.

It doesn't, but it would probably be pretty trivial to add if you want to 
take a stab at a patch for it.
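
In the meantime, the options that the SolrJmx wiki page already documents might 
be a workaround: give each webapp its own connector via serviceUrl, at the cost 
of a separate JMX url/port per webapp.  Something like this in each webapp's 
solrconfig.xml (the ports are placeholders; the rootName attribute in the second 
line is purely hypothetical -- it does not exist today, and is roughly what such 
a patch would add):

  <jmx serviceUrl="service:jmx:rmi:///jndi/rmi://localhost:9991/solr1"/>

  <!-- hypothetical, would require the patch discussed above -->
  <jmx rootName="solr1"/>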


-Hoss



Re: use termscomponent like spellComponent ?!

2010-03-23 Thread Chris Hostetter

: so when I search for nik, the TermsComponent suggests nikon. that's exactly
: what I want.
: but when I type nikon on I want Solr to suggest nikon one, 

try using copyField to index an untokenized version of your field, so that 
nikon one is a single term, then nikon on as a prefix will match that 
in the TermsComponent.
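
ie: something like this in schema.xml (the field names and the lowercasing are 
just a sketch -- adjust to your schema), plus a /terms handler wired to the 
TermsComponent:

  <fieldType name="prefix_full" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

  <field name="text_prefix" type="prefix_full" indexed="true" stored="false"/>
  <copyField source="text" dest="text_prefix"/>

...and then query the untokenized copy:

  http://localhost:8983/solr/terms?terms.fl=text_prefix&terms.prefix=nikon+on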



-Hoss



Re: How to get Facet results only on a range of search results documents

2010-03-23 Thread Chris Hostetter

: I would like to return Facet results only on the range of search results
: (say 1-100) not on the whole set of search results. Any idea how can I do
: it?

That's pretty trivial to do in the client layer (fetch the first 100 
results, iterate over them, and count per facet field).
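
A rough SolrJ sketch of that (the url, query, and field name are made up, and 
the field you count has to be stored):

  import java.util.HashMap;
  import java.util.Map;
  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.client.solrj.response.QueryResponse;
  import org.apache.solr.common.SolrDocument;

  public class PageFacetCounter {
    public static void main(String[] args) throws Exception {
      SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
      // fetch only the first 100 results, returning the field we want to count
      SolrQuery q = new SolrQuery("ipod").setRows(100).addField("category");
      QueryResponse rsp = server.query(q);
      // count field values over just the docs that came back (at most 100)
      Map<Object, Integer> counts = new HashMap<Object, Integer>();
      for (SolrDocument doc : rsp.getResults()) {
        Object val = doc.getFieldValue("category");
        if (val == null) continue;
        Integer c = counts.get(val);
        counts.put(val, c == null ? 1 : c + 1);
      }
      System.out.println(counts);
    }
  }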

If you really wanted this to happen server side, you could write a custom 
subclass of the QueryComponent that used the DocList to build and replace 
the DocSet ... that way faceting would only know about the documents on 
the current page.
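
A very rough, untested sketch of that idea (the class name is made up; check 
the exact classes against your version of Solr):

  import java.io.IOException;
  import org.apache.solr.handler.component.QueryComponent;
  import org.apache.solr.handler.component.ResponseBuilder;
  import org.apache.solr.search.DocIterator;
  import org.apache.solr.search.DocListAndSet;
  import org.apache.solr.search.HashDocSet;

  // swaps the full result DocSet for just the current page, so a downstream
  // FacetComponent only counts the documents on that page
  public class CurrentPageQueryComponent extends QueryComponent {
    @Override
    public void process(ResponseBuilder rb) throws IOException {
      super.process(rb);  // run the normal query first
      DocListAndSet results = rb.getResults();
      if (results == null || results.docList == null) return;
      int[] ids = new int[results.docList.size()];
      int i = 0;
      for (DocIterator it = results.docList.iterator(); it.hasNext(); ) {
        ids[i++] = it.nextDoc();
      }
      results.docSet = new HashDocSet(ids, 0, i);
    }
  }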


-Hoss



Re: Impossible Boost Query?

2010-03-23 Thread Otis Gospodnetic
You'd likely want to get the latest patch and trunk and try applying.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/



- Original Message 
 From: blargy zman...@hotmail.com
 To: solr-user@lucene.apache.org
 Sent: Tue, March 23, 2010 6:10:22 PM
 Subject: Re: Impossible Boost Query?
 
 
Maybe a better question is... how can I install this and will it work with
1.4?

Thanks




Re: dismax and q.op

2010-03-23 Thread Chris Hostetter

:  *I haven't mentioned value for mm*
...
: My result:- No results; but each of the terms individually gave me results!

http://wiki.apache.org/solr/DisMaxRequestHandler#mm_.28Minimum_.27Should.27_Match.29

The default value is 100% (all clauses must match)
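
If you want OR-like behavior you have to say so explicitly, either per request 
(mm=1) or as a default on your handler in solrconfig.xml (the handler name and 
qf below are just placeholders):

  <requestHandler name="dismax" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">dismax</str>
      <str name="qf">title^2 body</str>
      <str name="mm">1</str>
    </lst>
  </requestHandler>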

: 2. Does the default operator specified in schema.xml take effect when we use
: dismax also or is it only for the *standard* request handler. If it has an

dismax doesn't look at the default operator, or q.op.

: 3. How does q.alt and q difer in behavior in the above case. I found q.alt
: to be giving me the results which I got when I used the standard RH also.
: Hence used it.

q.alt is used if and only if there is no q param (or the q param is blank) 
... the number of matches q gets, or the value of mm, makes no 
difference.

: 4. When I make a change to the dismax set up I have in solrconfig.xml I
: believe i just have to bounce the SOLR server.Do i need to re-index again
: for the change to take effect

no ... changes to query time options like your SearchHandler configs 
don't require reindexing ... changes to your schema.xml *may* require 
reindexing.

: 5. If I use the dismax how do I see the ANALYSIS feature on the admin
: console other wise used for *standard* RH.

I'm afraid I don't understand this question ... analysis.jsp just shows 
you the index and query time analysis that is performed when certain 
fields are used -- it doesn't know/care about your choice of parser ... it 
knows nothing about query parser syntax.



-Hoss



Re: Impossible Boost Query?

2010-03-23 Thread blargy

Thanks but I'm not quite sure how to apply the patch. I just use the
packaged solr-1.4.0.war in my deployment (no compiling, etc). Is there a way
I can patch the war file?

Any instructions would be greatly appreciated. Thanks


Otis Gospodnetic wrote:
 
 You'd likely want to get the latest patch and trunk and try applying.
 
 Otis
 
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Hadoop ecosystem search :: http://search-hadoop.com/
 
 
 

-- 
View this message in context: 
http://old.nabble.com/Impossible-Boost-Query--tp28005354p28008495.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: How to Combine Dismax Query Handler and Clustering Component

2010-03-23 Thread Chris Hostetter

: How do we combine clustering component and Dismax query handler?

The dismax *handler* is now just the SearchHandler with defType=dismax ... 
so if you follow the examples for setting up the clustering component on 
an instance of SearchHandler, all you have to do is configure that 
instance to use the DismaxQParserPlugin by using defType=dismax as a 
(default or invariant) query param.
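
ie: something along these lines in solrconfig.xml (adapted from the clustering 
example config -- the component and engine names, and the qf, may differ in 
your setup):

  <requestHandler name="/clustering" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">dismax</str>
      <str name="qf">title^2 body</str>
      <bool name="clustering">true</bool>
      <str name="clustering.engine">default</str>
    </lst>
    <arr name="last-components">
      <str>clusteringComponent</str>
    </arr>
  </requestHandler>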



-Hoss



Re: 64 bit integers (MySQL bigint) and SOLR

2010-03-23 Thread Chris Hostetter
: 
: The primary key for my database is a BIGINT,  basically a 64 bit integer.  The
: value is well below the 32 bit maximum (about 230 million right now) but
: someday in the future that might not be the case.  In the schema, we have it
: mapped to a tint field type as defined in the example schema.  Is this going
: to work?  It is 64 bit CentOS 5.4 with 64 bit Sun JDK 1.6.0_18.  I did some
: searching and was not able to determine much.

No.  But a TrieLongField should.  

The Int and Long in the FieldType names correspond directly to the 
Java primitive types, which do not change regardless of whether you have a 
64-bit JVM...
  http://java.sun.com/docs/books/tutorial/java/nutsandbolts/datatypes.html

(FYI: this is fairly trivial to test ... just index a really big number and 
see if it sorts/searches properly)
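
ie: something like this in schema.xml (the tlong type is the one from the 1.4 
example schema; the field name is whatever you use for your key, and you will 
need to reindex after changing the type):

  <fieldType name="tlong" class="solr.TrieLongField" precisionStep="8"
             omitNorms="true" positionIncrementGap="0"/>
  <field name="id" type="tlong" indexed="true" stored="true" required="true"/>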


-Hoss



Re: [POLL] Users of abortOnConfigurationError ?

2010-03-23 Thread Chris Hostetter

: I felt (and still do) that if there is a configuration error
: everything should fail loudly.  The option in solrconfig.xml was added
: as a back-compatible way to get both behaviors.

Oh man ... I completely remembered that backwards ... I thought you were 
the one that was arguing in favor of letting people set 
abortOnConfigurationError=false so that they could use handlerA even if handlerB 
didn't init properly.

: Does a lack replies to this thread imply that everyone agrees?

I think so.


-Hoss



Re: [ANN] Zoie Solr Plugin - Zoie Solr Plugin enables real-time update functionality for Apache Solr 1.4+

2010-03-23 Thread brad anderson
I see, so when you do a commit it adds it to Zoie's ramdirectory. So, could
you just commit after every document without having a performance impact and
have real time search?

Thanks,
Brad

On 20 March 2010 00:34, Janne Majaranta janne.majara...@gmail.com wrote:

 To my understanding it adds an in-memory index which holds the recent
 commits and which is flushed to the main index based on the config options.
 Not sure if it helps to get solr near real time. I am evaluating it
 currently, and I am really not sure if it adds anything because of the cache
 regeneration of solr on every commit ??

 -Janne

 Lähetetty iPodista

 brad anderson solrinter...@gmail.com kirjoitti 19.3.2010 kello 20.53:


  Indeed, which is why I'm wondering what is Zoie adding if you still need
 to
 commit to search recent documents. Does anyone know?

 Thanks,
 Brad

 On 18 March 2010 19:41, Erik Hatcher erik.hatc...@gmail.com wrote:

  When I don't do the commit, I cannot search the documents I've indexed.
 -
 that's exactly how Solr without Zoie works, and it's how Lucene itself
 works.  Gotta commit to see the documents indexed.

  Erik



 On Mar 18, 2010, at 5:41 PM, brad anderson wrote:

 Tried following their tutorial for plugging zoie into solr:

  http://snaprojects.jira.com/wiki/display/ZOIE/Zoie+Server

 It appears it only allows you to search on documents after you do a
 commit?
 Am I missing something here, or does plugin not doing anything.

 Their tutorial tells you to do a commit when you index the docs:

 curl http://localhost:8983/solr/update/csv?commit=true --data-binary
 @books.csv -H 'Content-type:text/plain; charset=utf-8'


 When I don't do the commit, I cannot search the documents I've indexed.

 Thanks,
 Brad

 On 9 March 2010 23:34, Don Werve d...@madwombat.com wrote:

 2010/3/9 Shalin Shekhar Mangar shalinman...@gmail.com


 I think Don is talking about Zoie - it requires a long uniqueKey.



  Yep; we're using UUIDs.






Re: HTTP Status 500 - null java.lang.IllegalArgumentException at java.nio.Buffer.limit(Buffer.java:249)

2010-03-23 Thread Chris Hostetter

: I am doing a really simple query on my index (it's running in tomcat):
: 
: http://host:8080/solr_er_07_09/select/?q=hash_id:123456
...

details please ...

http://wiki.apache.org/solr/UsingMailingLists

... what version of solr? lucene? tomcat? 

: I built the index on a different machine than the one I am doing the

...ditto for that machine.

are you sure the md5 checksums match for both copies of the index (ie: did 
it get corrupted when you copied it)?

what does CheckIndex say about the index?
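
(something along these lines, adjusting the path to the lucene-core jar that 
ships with your Solr:

  java -cp lucene-core-2.9.1.jar org.apache.lucene.index.CheckIndex /path/to/solr/data/index
)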

: query on though the configuration is exactly the same. I can do the same
: query using solrj (I have an app doing that) and it works fine.

that seems highly bizarre ... are you certain it's the exact same query?  
what does the tomcat log say about the two requests?



-Hoss



Re: release schedule?

2010-03-23 Thread Chris Hostetter

: I'm new to this list, so please excuse me if I'm asking in the wrong
: place. 

you're definitely in the right place.


: -  Are there any planned Solr releases for this year?
: 
: -  What are the planned release dates/contents, etc.?

releases aren't really planned .. they happen when the software is in a 
state that the development community feels like it should be released.

: -  Are there any beta releases to work with in the meantime?

there are automated builds of the trunk that happen nightly, you can 
always use these to test out new features that have been added since the 
most recent release, but features (and APIs) can and do change on the 
trunk so don't assume that once something exists in a nightly build that 
it will definitely be in the next release.


-Hoss



SOLR-236 patch with version 1.4

2010-03-23 Thread blargy

Is the field collapsing patch (236) not compatible with Solr 1.4?

$ patch -p0 -i ~/Desktop/SOLR-236.patch 
patching file src/test/test-files/solr/conf/solrconfig-fieldcollapse.xml
patching file
src/java/org/apache/solr/search/fieldcollapse/collector/DocumentGroupCountCollapseCollectorFactory.java
patching file
src/java/org/apache/solr/search/fieldcollapse/CollapseGroup.java
patching file
src/java/org/apache/solr/search/fieldcollapse/AdjacentDocumentCollapser.java
patching file src/java/org/apache/solr/search/DocSetAwareCollector.java
patching file src/java/org/apache/solr/search/SolrIndexSearcher.java
Hunk #1 FAILED at 17.
Hunk #2 FAILED at 530.
Hunk #3 FAILED at 586.
Hunk #4 FAILED at 610.
Hunk #5 FAILED at 663.
Hunk #6 FAILED at 705.
Hunk #7 FAILED at 716.
Hunk #8 FAILED at 740.
Hunk #9 FAILED at 1255.
9 out of 9 hunks FAILED -- saving rejects to file
src/java/org/apache/solr/search/SolrIndexSearcher.java.rej
patching file
src/java/org/apache/solr/handler/component/CollapseComponent.java
patching file
src/java/org/apache/solr/search/fieldcollapse/collector/CollapseCollectorFactory.java
patching file
src/java/org/apache/solr/search/fieldcollapse/collector/aggregate/AggregateFunction.java
patching file
src/test/org/apache/solr/search/fieldcollapse/NonAdjacentDocumentCollapserTest.java
patching file src/java/org/apache/solr/util/DocSetScoreCollector.java
patching file
src/java/org/apache/solr/search/fieldcollapse/AbstractDocumentCollapser.java
patching file
src/java/org/apache/solr/search/fieldcollapse/util/Counter.java
patching file
src/java/org/apache/solr/search/fieldcollapse/DocumentCollapser.java
patching file
src/solrj/org/apache/solr/client/solrj/response/FieldCollapseResponse.java
patching file
src/java/org/apache/solr/search/fieldcollapse/collector/FieldValueCountCollapseCollectorFactory.java
patching file
src/java/org/apache/solr/search/fieldcollapse/collector/CollapseCollector.java
patching file
src/test/org/apache/solr/search/fieldcollapse/DistributedFieldCollapsingIntegrationTest.java
patching file
src/test/org/apache/solr/client/solrj/response/FieldCollapseResponseTest.java
patching file
src/java/org/apache/solr/search/fieldcollapse/collector/aggregate/MaxFunction.java
patching file src/test/test-files/solr/conf/solrconfig.xml
Hunk #1 FAILED at 396.
Hunk #2 FAILED at 418.
2 out of 2 hunks FAILED -- saving rejects to file
src/test/test-files/solr/conf/solrconfig.xml.rej
patching file
src/java/org/apache/solr/search/fieldcollapse/collector/CollapseContext.java
patching file
src/java/org/apache/solr/search/fieldcollapse/collector/aggregate/MinFunction.java
patching file
src/solrj/org/apache/solr/client/solrj/response/QueryResponse.java
Hunk #1 FAILED at 17.
Hunk #2 FAILED at 42.
Hunk #3 FAILED at 58.
Hunk #4 FAILED at 125.
Hunk #5 FAILED at 298.
5 out of 5 hunks FAILED -- saving rejects to file
src/solrj/org/apache/solr/client/solrj/response/QueryResponse.java.rej
patching file src/test/test-files/fieldcollapse/testResponse.xml
patching file
src/java/org/apache/solr/search/fieldcollapse/NonAdjacentDocumentCollapser.java
patching file
src/java/org/apache/solr/search/fieldcollapse/collector/AbstractCollapseCollector.java
patching file src/java/org/apache/solr/handler/component/QueryComponent.java
Hunk #1 FAILED at 522.
1 out of 1 hunk FAILED -- saving rejects to file
src/java/org/apache/solr/handler/component/QueryComponent.java.rej
patching file
src/java/org/apache/solr/search/fieldcollapse/DocumentCollapseResult.java
patching file
src/test/org/apache/solr/handler/component/CollapseComponentTest.java
patching file
src/java/org/apache/solr/search/fieldcollapse/collector/DocumentFieldsCollapseCollectorFactory.java
patching file src/test/test-files/solr/conf/schema-fieldcollapse.xml
patching file src/common/org/apache/solr/common/params/CollapseParams.java
patching file src/solrj/org/apache/solr/client/solrj/SolrQuery.java
Hunk #1 FAILED at 17.
Hunk #2 FAILED at 50.
Hunk #3 FAILED at 76.
Hunk #4 FAILED at 148.
Hunk #5 FAILED at 197.
Hunk #6 FAILED at 665.
Hunk #7 FAILED at 721.
7 out of 7 hunks FAILED -- saving rejects to file
src/solrj/org/apache/solr/client/solrj/SolrQuery.java.rej
patching file
src/test/org/apache/solr/search/fieldcollapse/AdjacentCollapserTest.java
patching file
src/java/org/apache/solr/search/fieldcollapse/collector/AggregateCollapseCollectorFactory.java
patching file
src/java/org/apache/solr/search/fieldcollapse/collector/aggregate/SumFunction.java
patching file src/java/org/apache/solr/search/DocSetHitCollector.java
Hunk #1 FAILED at 17.
Hunk #2 FAILED at 28.
2 out of 2 hunks FAILED -- saving rejects to file
src/java/org/apache/solr/search/DocSetHitCollector.java.rej
patching file
src/java/org/apache/solr/search/fieldcollapse/collector/aggregate/AverageFunction.java
patching file
src/test/org/apache/solr/search/fieldcollapse/FieldCollapsingIntegrationTest.java

-- 
View this message in context: 
http://old.nabble.com/SOLR-236-patch-with-version-1.4-tp28008954p28008954.html

Re: SOLR-1316 How To Implement this autosuggest component ???

2010-03-23 Thread Lance Norskog
You need 'ant' to do builds.  At the top level, do:
ant clean
ant example

These will build everything and set up the example/ directory. After that, run:
ant test-core

to run all of the unit tests and make sure that the build works. If
the autosuggest patch has a test, this will check that the patch went
in correctly.

Lance

On Tue, Mar 23, 2010 at 7:42 AM, stocki st...@shopgate.com wrote:

 okay,
 i did this...

 but one file was not updated correctly:
 Index: trunk/src/java/org/apache/solr/util/HighFrequencyDictionary.java
 (from the suggest.patch)

 i checked it out from eclipse, applied the patch, and made a new solr.war ...
 is that the right way?
 i thought that by making a war i didn't need to make a build.

 how do i make a build?




 Alexey-34 wrote:

 Error loading class 'org.apache.solr.spelling.suggest.Suggester'
 Are you sure you applied the patch correctly?
 See http://wiki.apache.org/solr/HowToContribute#Working_With_Patches

 Checkout Solr trunk source code (
 http://svn.apache.org/repos/asf/lucene/solr/trunk ), apply patch,
 verify that everything went smoothly, build solr and use built version
 for your tests.

 On Mon, Mar 22, 2010 at 9:42 PM, stocki st...@shopgate.com wrote:

 i patch an nightly build from solr.
 patch runs, classes are in the correct folder, but when i replace
 spellcheck
 with this spellchecl like in the comments, solr cannot find the classes
 =(

 searchComponent name=spellcheck class=solr.SpellCheckComponent
    lst name=spellchecker
      str name=namesuggest/str
      str
 name=classnameorg.apache.solr.spelling.suggest.Suggester/str
      str
 name=lookupImplorg.apache.solr.spelling.suggest.jaspell.JaspellLookup/str
      str name=fieldtext/str
      str name=sourceLocationamerican-english/str
    /lst
  /searchComponent


 -- SCHWERWIEGEND: org.apache.solr.common.SolrException: Error loading
 class
 'org.ap
 ache.solr.spelling.suggest.Suggester'


 why is it so ??  i think no one has so many trouble to run a patch
 like
 me =( :D


 Andrzej Bialecki wrote:

 On 2010-03-19 13:03, stocki wrote:

 hello..

 i try to implement autosuggest component from these link:
 http://issues.apache.org/jira/browse/SOLR-1316

 but i have no idea how to do this !?? can anyone get me some tipps ?

 Please follow the instructions outlined in the JIRA issue, in the
 comment that shows fragments of XML config files.


 --
 Best regards,
 Andrzej Bialecki     
   ___. ___ ___ ___ _ _   __
 [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
 ___|||__||  \|  ||  |  Embedded Unix, System Integration
 http://www.sigram.com  Contact: info at sigram dot com




 --
 View this message in context:
 http://old.nabble.com/SOLR-1316-How-To-Implement-this-autosuggest-component-tp27950949p27990809.html
 Sent from the Solr - User mailing list archive at Nabble.com.





 --
 View this message in context: 
 http://old.nabble.com/SOLR-1316-How-To-Implement-this-patch-autoComplete-tp27950949p28001938.html
 Sent from the Solr - User mailing list archive at Nabble.com.





-- 
Lance Norskog
goks...@gmail.com


Re: Impossible Boost Query?

2010-03-23 Thread Lance Norskog
At this point (and for almost 3 years :) field collapsing is a source
patch. You have to check out the Solr trunk from the Apache subversion
server, apply the patch with the 'patch' command, and build the new
Solr with 'ant'.
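
Roughly (the paths here are placeholders; the dist target drops the new .war 
under dist/):

  svn checkout http://svn.apache.org/repos/asf/lucene/solr/trunk solr-trunk
  cd solr-trunk
  patch -p0 -i /path/to/SOLR-236.patch
  ant clean dist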

On Tue, Mar 23, 2010 at 4:13 PM, blargy zman...@hotmail.com wrote:

 Thanks but Im not quite show on how to apply the patch. I just use the
 packaged solr-1.4.0.war in my deployment (no compiling, etc). Is there a way
 I can patch the war file?

 Any instructions would be greatly appreciated. Thanks


 Otis Gospodnetic wrote:

 You'd likely want to get the latest patch and trunk and try applying.

 Otis
 
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Hadoop ecosystem search :: http://search-hadoop.com/




 --
 View this message in context: 
 http://old.nabble.com/Impossible-Boost-Query--tp28005354p28008495.html
 Sent from the Solr - User mailing list archive at Nabble.com.





-- 
Lance Norskog
goks...@gmail.com


Re: Impossible Boost Query?

2010-03-23 Thread Lance Norskog
Also, there is a 'random' field type which generates random numbers. That
might help you as well.

On Tue, Mar 23, 2010 at 7:18 PM, Lance Norskog goks...@gmail.com wrote:
 At this point (and for almost 3 years :) field collapsing is a source
 patch. You have to check out the Solr trunk from the Apache subversion
 server, apply the patch with the 'patch' command, and build the new
 Solr with 'ant'.





-- 
Lance Norskog
goks...@gmail.com


Re: HTTP Status 500 - null java.lang.IllegalArgumentException at java.nio.Buffer.limit(Buffer.java:249)

2010-03-23 Thread Lance Norskog
That area of the Lucene code throws NullPointerExceptions and
ArrayIndexOutOfBoundsExceptions, but they are all caused by corrupt indexes.
They should be caught and wrapped.

On Tue, Mar 23, 2010 at 4:33 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 : I am doing a really simple query on my index (it's running in tomcat):
 :
 : http://host:8080/solr_er_07_09/select/?q=hash_id:123456
        ...

 details please ...

    http://wiki.apache.org/solr/UsingMailingLists

 ... what version of solr? lucene? tomcat?

 : I built the index on a different machine than the one I am doing the

 ...ditto for that machine.

  are you sure the md5 checksums match for both copies of the index (ie: did
  it get corrupted when you copied it)?

  what does CheckIndex say about the index?

 : query on though the configuration is exactly the same. I can do the same
 : query using solrj (I have an app doing that) and it works fine.

  that seems highly bizarre ... are you certain it's the exact same query?
  what does the tomcat log say about the two requests?



 -Hoss





-- 
Lance Norskog
goks...@gmail.com


phrase segmentation plugin in component, analyzer, filter or parser?

2010-03-23 Thread Tommy Chheng

 I'm writing an experimental phrase segmentation plugin for solr.

My current plan is to write it as a SearchComponent by overriding the 
queryString with the new grouped query.
ex. (university of california irvine 2009) will be re-written to 
"university of california irvine" 2009



Is the SearchComponent the right class to extend for this type of logic?
I picked the component because it was one place where I could get access 
to overwrite the whole query string.


Or is it better design to write it as an analyzer, tokenizer, filter or 
parser plugin?
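
For reference, the kind of component I have in mind looks roughly like this 
(the segmentation itself is stubbed out, and it would be registered as a 
first-component so it runs before QueryComponent -- whether rewriting the 
query string this way plays nicely with the rest of the chain is exactly my 
open question):

  import java.io.IOException;
  import org.apache.solr.common.params.CommonParams;
  import org.apache.solr.handler.component.ResponseBuilder;
  import org.apache.solr.handler.component.SearchComponent;

  public class PhraseSegmentationComponent extends SearchComponent {
    @Override
    public void prepare(ResponseBuilder rb) throws IOException {
      String q = rb.getQueryString();
      if (q == null) {
        q = rb.req.getParams().get(CommonParams.Q);
      }
      if (q != null) {
        rb.setQueryString(segment(q));  // QueryComponent picks this up
      }
    }

    @Override
    public void process(ResponseBuilder rb) throws IOException {
      // nothing to do at process time
    }

    // stub: the real phrase segmenter plugs in here
    private String segment(String q) {
      return q;
    }

    @Override
    public String getDescription() { return "phrase segmentation"; }
    @Override
    public String getSource() { return "$URL$"; }
    @Override
    public String getSourceId() { return "$Id$"; }
    @Override
    public String getVersion() { return "1.0"; }
  }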



--
Tommy Chheng
Programmer and UC Irvine Graduate Student
Twitter @tommychheng
http://tommy.chheng.com