Re: NPE on MERGEINDEXES

2009-06-01 Thread Koji Sekiguchi
Reopened SOLR-1051:
https://issues.apache.org/jira/browse/SOLR-1051?focusedCommentId=12715030&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12715030

Koji

Koji Sekiguchi wrote:
 Maybe I did something wrong, I got NPE when trying to MERGEINDEXES:

 http://localhost:8983/solr/admin/cores?action=MERGEINDEXES&core=core0&indexDirs=indexname

 java.lang.NullPointerException
 at
 org.apache.solr.update.processor.RunUpdateProcessor.<init>(RunUpdateProcessorFactory.java:55)
 at
 org.apache.solr.update.processor.RunUpdateProcessorFactory.getInstance(RunUpdateProcessorFactory.java:43)
 at
 org.apache.solr.update.processor.UpdateRequestProcessorChain.createProcessor(UpdateRequestProcessorChain.java:55)
 at
 org.apache.solr.handler.admin.CoreAdminHandler.handleMergeAction(CoreAdminHandler.java:191)
 at
 org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:151)
 at
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 at
 org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:301)
 at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:174)
 at
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
 at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
 at
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
 at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
 at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
 at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
 at
 org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
 at
 org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
 at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
 at org.mortbay.jetty.Server.handle(Server.java:285)
 at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
 at
 org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
 at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
 at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
 at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
 at
 org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
 at
 org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)

 I'm using the latest trunk.

 Solr was started:

 $ cd example
 $ java -Dsolr.solr.home=./multicore -jar start.jar


 Thank you,

 Koji
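
For readers scripting against the core admin API, a merge request like the one
above can also be issued programmatically. The sketch below is purely
illustrative: the host, core name and index directory are placeholders, and the
indexDirs parameter name is taken from the message above and may differ between
trunk revisions.

import urllib.parse
import urllib.request

# Illustrative only: ask core0's CoreAdminHandler to merge in another
# on-disk index. Core name and index path are placeholders.
params = urllib.parse.urlencode({
    "action": "MERGEINDEXES",
    "core": "core0",
    "indexDirs": "/path/to/other/index",
})
url = "http://localhost:8983/solr/admin/cores?" + params
with urllib.request.urlopen(url) as resp:
    print(resp.read().decode("utf-8"))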



   



Re: When searching for !@#$%^&*() all documents are matched incorrectly

2009-06-01 Thread Sam Michaels

Walter,

The analysis link does not produce any matches for either @ or !@#$%^&*()
strings when I try to match against bathing. I'm worried that this might be
the symptom of another problem (which has not revealed itself yet) and want
to get to the bottom of this...

Thank you.
sm


Walter Underwood wrote:
 
 Use the [analysis] link on the Solr admin UI to get more info on
 how this is being interpreted.
 
 However, I am curious about why this is important. Do users enter
 this query often? If not, maybe it is not something to spend time on.
 
 wunder
 
 On 5/31/09 2:56 PM, Sam Michaels mas...@yahoo.com wrote:
 
 
 Here is the output from the debug query when I'm trying to match the
 String @
 against Bathing (should not match)
 
 <str name="GLOM-1">
 3.2689073 = (MATCH) weight(activity_type:NAME in 0), product of:
   0.9994 = queryWeight(activity_type:NAME), product of:
 3.2689075 = idf(docFreq=153, numDocs=1489)
 0.30591258 = queryNorm
   3.2689075 = (MATCH) fieldWeight(activity_type:NAME in 0), product of:
 1.0 = tf(termFreq(activity_type:NAME)=1)
 3.2689075 = idf(docFreq=153, numDocs=1489)
 1.0 = fieldNorm(field=activity_type, doc=0)
 </str>
 
 Looks like the AND clause in the search string is ignored...
 
 SM.
 
 
 ryantxu wrote:
 
 two key things to try (for anyone ever wondering why a query matches
 documents)
 
 1.  add debugQuery=true and look at the explain text below --
 anything that contributed to the score is listed there
 2.  check /admin/analysis.jsp -- this will let you see how analyzers
 break text up into tokens.
 
 Not sure off hand, but I'm guessing the WordDelimiterFilterFactory has
 something to do with it...
 
 
 On Sat, May 30, 2009 at 5:59 PM, Sam Michaels mas...@yahoo.com wrote:
 
 Hi,
 
 I'm running Solr 1.3/Java 1.6.
 
 When I run a query like  - (activity_type:NAME) AND
 title:(\!@#$%\^\*\(\))
 all the documents are returned even though there is not a single match.
 There is no title that matches the string (which has been escaped).
 
 My document structure is as follows
 
 <doc>
   <str name="activity_type">NAME</str>
   <str name="title">Bathing</str>

 </doc>
 
 
 The title field is of type text_title which is described below.
 
 <fieldType name="text_title" class="solr.TextField" positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <!-- in this example, we will only use synonyms at query time
     <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt"
             ignoreCase="true" expand="false"/>
     -->
     <filter class="solr.WordDelimiterFilterFactory"
             generateWordParts="1" generateNumberParts="1" catenateWords="1"
             catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
             ignoreCase="true" expand="true"/>
     <filter class="solr.WordDelimiterFilterFactory"
             generateWordParts="1" generateNumberParts="1" catenateWords="1"
             catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
   </analyzer>
 </fieldType>
 
 When I run the query against Luke, no results are returned. Any
 suggestions
 are appreciated.
 
 
 --
 View this message in context:
 http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-document
 s-are-matched-incorrectly-tp23797731p23797731.html
 Sent from the Solr - User mailing list archive at Nabble.com.
 
 
 
 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23815688.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: When searching for !@#$%^&*() all documents are matched incorrectly

2009-06-01 Thread Yonik Seeley
OK, here's the deal:

<str name="rawquerystring">-features:foo features:(\!@#$%\^\*\(\))</str>
<str name="querystring">-features:foo features:(\!@#$%\^\*\(\))</str>
<str name="parsedquery">-features:foo</str>
<str name="parsedquery_toString">-features:foo</str>

The text analysis is throwing away non-alphanumeric chars (probably
the WordDelimiterFilter).  The Lucene (and Solr) query parser throws
away term queries when the token is zero length (after analysis).
Solr then interprets the leftover -features:foo as all documents
not containing foo in the features field, so you get a bunch of
matches.

-Yonik
http://www.lucidimagination.com
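
For anyone wanting to see this for themselves, the debug entries quoted above
can be fetched and inspected from a script. This is only a sketch with assumed
host, port and field names; it runs a query with debugQuery=true and prints how
the parser rewrote it after analysis.

import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Sketch: run the query with debugQuery=true and print the raw vs. parsed
# query strings from the debug section of the XML response.
params = urllib.parse.urlencode({
    "q": r"-features:foo features:(\!@#$%\^\*\(\))",
    "debugQuery": "true",
    "rows": "0",
})
url = "http://localhost:8983/solr/select?" + params
with urllib.request.urlopen(url) as resp:
    root = ET.parse(resp).getroot()

for elem in root.iter("str"):
    if elem.get("name") in ("rawquerystring", "querystring", "parsedquery"):
        print(elem.get("name"), "=>", elem.text)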






Re: When searching for !@#$%^&*() all documents are matched incorrectly

2009-06-01 Thread Sam Michaels

So the fix for this problem would be to either:

1. Stop using WordDelimiterFilter for queries (what is the alternative?), OR
2. Disallow search strings that contain no alphanumeric characters.

SM.


-- 
View this message in context: 
http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23816242.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: When searching for !@#$%^&*() all documents are matched incorrectly

2009-06-01 Thread Yonik Seeley
On Mon, Jun 1, 2009 at 10:50 AM, Sam Michaels mas...@yahoo.com wrote:

 So the fix for this problem would be

 1. Stop using WordDelimiterFilter for queries (what is the alternative?), OR
 2. Disallow search strings that contain no alphanumeric characters.

Short term workaround for you, yes.
I would classify this surprising behavior as a bug we should
eventually fix though.  Could you open a JIRA issue for it?

-Yonik
http://www.lucidimagination.com
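
Until a proper fix lands, the second workaround can be enforced on the client
side before the query ever reaches Solr. A minimal sketch (the rejection policy
is just an assumption about what the application wants):

import re

def has_searchable_text(user_query):
    # Reject query strings with no alphanumeric characters at all, since
    # analysis would strip them to nothing and the remaining clauses could
    # match far more documents than intended.
    return re.search(r"[A-Za-z0-9]", user_query) is not None

for q in ["!@#$%^*()", "dui california"]:
    print(repr(q), "->", "send to Solr" if has_searchable_text(q) else "skip")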


Highlighting and Field options

2009-06-01 Thread ashokc

Hi,

The 'content' field that I am indexing is usually large (e.g. a PDF doc of a
few MB in size). I need highlighting to be on. This 'seems' to require that
I set the 'content' field to be STORED, which returns the whole content field
in the search result XML for each matching document. The highlighted text is
also returned in a separate block. But I do NOT need the entire content field
to display the search results; I only use the highlighted segments to display
a brief description of each hit. The fact that Solr returns the entire content
field makes the returned XML unnecessarily huge and makes for longer response
times. How can I have Solr return ONLY the highlighted text for each hit and
NOT the entire 'content' field? Thanks
- ashok
-- 
View this message in context: 
http://www.nabble.com/Highlighting-and-Field-options-tp23818019p23818019.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Highlighting and Field options

2009-06-01 Thread Jay Hill
Use the fl param to ask for only the fields you need, but also keep hl=true.
Something like this:

http://localhost:8080/solr/select/?q=bear&version=2.2&start=0&rows=10&indent=on&hl=true&fl=id

Note that fl=id means the only field returned in the XML will be the id
field.

Highlights are still returned in the highlight element, but you won't get
back the unneeded content field.

-Jay
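
To make that concrete, here is a small sketch (host, port and highlight field
are assumptions; adjust to your setup) that keeps hl=true, asks only for id,
and reads snippets from the highlighting section of the XML response:

import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Sketch: return only the id field, keep highlighting on, and print the
# highlight snippets per matching document.
params = urllib.parse.urlencode({
    "q": "bear",
    "fl": "id",
    "hl": "true",
    "hl.fl": "content",   # assumed highlight field
    "rows": "10",
})
url = "http://localhost:8080/solr/select?" + params
with urllib.request.urlopen(url) as resp:
    root = ET.parse(resp).getroot()

for lst in root.findall("lst"):
    if lst.get("name") == "highlighting":
        for doc in lst:  # one child element per matching document id
            snippets = [s.text for s in doc.iter("str")]
            print(doc.get("name"), snippets)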






How to get number of optimizes

2009-06-01 Thread iamithink

Hello,

I'm looking for a simple way to automate (in a shell script) a request for
the number of times an index has been optimized (since the Solr webapp has
last started).  I know that this information is available on the Solr stats
page (http://host:port/solr/admin/stats.jsp) under Update
Handlers/stats/optimizes, but I'm looking for a simpler way than to retrieve
the page using wget or similar and parse the HTML.  More generally, is there
a convenient way to get at the other data presented on the Stats page?  I'm
currently using Solr 1.2 but will be migrating to 1.3 soon in case that
makes a difference.

Thanks... 
-- 
View this message in context: 
http://www.nabble.com/How-to-get-number-of-optimizes-tp23818563p23818563.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: How to get number of optimizes

2009-06-01 Thread Eric Pugh

Not sure if it's simpler, but the JMX interface is more structured.

I think that just grabbing the page and parsing out the content with
your favorite tool (Ruby & Hpricot) is pretty simple.


Eric




-
Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | 
http://www.opensourceconnections.com
Free/Busy: http://tinyurl.com/eric-cal






Re: Keyword Density

2009-06-01 Thread Alex Shevchenko
Hi all,

Is there a way to perform filtering based on keyword density?

Thanks

-- 
Alex Shevchenko


Re: User search in Facebook like

2009-06-01 Thread Vincent Pérès

Thanks a lot for your answer, it fixed all my issues!

It's working really well!

Cheers,
Vincent
-- 
View this message in context: 
http://www.nabble.com/User-search-in-Facebook-like-tp23804854p23818867.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: How to get number of optimizes

2009-06-01 Thread iamithink

Thanks for the quick response.  I agree that for this one-off task the grab
and parse method works fine, but I'll keep the JMX interface in mind for
other tasks in the future.

Here's my particular hack solution in case this helps anyone else:

wget -q -O- http://hostname:port/solr/admin/stats.jsp | awk
'/optimizes/{getline;print}'




-- 
View this message in context: 
http://www.nabble.com/How-to-get-number-of-optimizes-tp23818563p23818964.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: How to get number of optimizes

2009-06-01 Thread Otis Gospodnetic

Hello,

That stats page is really XML + XSLT that transforms the XML to HTML.  View the 
source of the stats page.  That should make it very easy to parse the stats 
response/page and extract the data you need.

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
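
Building on that, a rough sketch that parses the stats output instead of
grepping the HTML; the host and port are placeholders, and the exact element
layout of stats.jsp varies by Solr version, so the code simply looks for any
entry whose name attribute is "optimizes":

import urllib.request
import xml.etree.ElementTree as ET

url = "http://localhost:8983/solr/admin/stats.jsp"
with urllib.request.urlopen(url) as resp:
    root = ET.parse(resp).getroot()

# Walk the whole tree and print anything labelled "optimizes".
for elem in root.iter():
    if elem.get("name") == "optimizes":
        print((elem.text or "").strip())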






RE: Java OutOfmemory error during autowarming

2009-06-01 Thread Francis Yakin

Hi Chris,

I am new in solr.

When it is initialized for the first time, how can I change it?

Thanks

Francis

-Original Message-
From: Chris Harris [mailto:rygu...@gmail.com]
Sent: Sunday, May 31, 2009 3:00 PM
To: solr-user@lucene.apache.org
Subject: Re: Java OutOfmemory error during autowarming

Solr offers no configuration for FieldCache, neither in solrconfig.xml nor 
anywhere else; rather, that cache gets populated automatically in the depths of 
Lucene when you do a sort (or also apparently, as Yonik says, when you use a 
field in a function query).

From the wiki: 'Lucene has a low level FieldCache which is used for sorting 
(and in some cases faceting). This cache is not managed by Solr -- it has no 
configuration options and cannot be autowarmed -- it is initialized the first 
time it is used for each Searcher.' (
http://wiki.apache.org/solr/SolrCaching)

2009/5/29 Francis Yakin fya...@liquid.com


 I know, but the FieldCache is not in the solrconfig.xml


 -Original Message-
 From: Yonik Seeley [mailto:ysee...@gmail.com]
 Sent: Friday, May 29, 2009 10:47 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Java OutOfmemory error during autowarming

 On Fri, May 29, 2009 at 1:44 PM, Francis Yakin fya...@liquid.com wrote:
 
  There is no FieldCache entries in solrconfig.xml ( BTW we are
  running version 1.2.0)

 Lucene FieldCache entries are created when you sort on a field or when
 you use a field in a function query.

 -Yonik



Re: Keyword Density

2009-06-01 Thread Alex Shevchenko
Something like that. Just not '> N times' but 'number of times foo
appears / total number of words > some value'.

On Mon, Jun 1, 2009 at 21:00, Otis Gospodnetic
otis_gospodne...@yahoo.comwrote:


 Hi Alex,

 Could you please provide an example of this?  Are you looking to do
 something like find all docs that match name:foo and where foo appears > N
 times (in the name field) in the matching document?

  Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch







-- 
Alex Shevchenko


Dismax handler phrase matching question

2009-06-01 Thread anuvenk

Hello,

   I'm using the dismax handler for the phrase matching. I have a few legal
resources in my index in the following format for example

title  state 

dui faq1   california   
dui faq2   florida
dui faq3   federal

Now I want to be able to return federal results irrespective of the state.
For example, dui california should return all federal results for 'dui'
along with the california results. I was thinking of a synonym mapping for the
states like 'state name' => 'federal'
(i.e california,federal
florida, federal
maine, federal
etc
)
Is there a better way though?
-- 
View this message in context: 
http://www.nabble.com/Dismax-handler-phrase-matching-question-tp23820340p23820340.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Keyword Density

2009-06-01 Thread Alex Shevchenko
But I don't need to sort using this value. I need to cut results where this
value (for a particular term of the query!) is not in some range.
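
Purely to illustrate the kind of cutoff being described (this would have to be
computed client-side over stored or returned text; it is not something stock
Solr does for you), a tiny sketch:

def keyword_density(term, text):
    # Fraction of whitespace-separated words equal to the term.
    words = text.lower().split()
    return words.count(term.lower()) / float(len(words)) if words else 0.0

docs = [("doc1", "foo bar foo baz"), ("doc2", "bar baz qux quux")]
kept = [name for name, text in docs
        if 0.1 <= keyword_density("foo", text) <= 0.9]
print(kept)  # -> ['doc1']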

On Mon, Jun 1, 2009 at 22:20, Walter Underwood wunderw...@netflix.comwrote:

 That is the normal relevance scoring formula in Solr and Lucene.
 It is a bit fancier than that, but you don't have to do anything
 special to get that behavior.

 Solr also uses the inverse document frequency (rarity) of each
 word for weighting.

 Look up tf.idf for more info.

 wunder






-- 
Alex Shevchenko


Re: Keyword Density

2009-06-01 Thread Walter Underwood
That is the normal relevance scoring formula in Solr and Lucene.
It is a bit fancier than that, but you don't have to do anything
special to get that behavior.

Solr also uses the inverse document frequency (rarity) of each
word for weighting.

Look up tf.idf for more info.

wunder
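
For reference, the basic weight being referred to, in its simplest textbook
form (Lucene's actual Similarity adds norms, boosts and a coordination factor
on top of this), is:

    tf-idf(t, d) = tf(t, d) * log(N / df(t))

where tf(t, d) is how often term t occurs in document d, df(t) is the number of
documents containing t, and N is the total number of documents.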





What would be a good date for downloading a stable solr release

2009-06-01 Thread sroussey

We have too many issues with 1.3 running for longer than 12 hours and want to
look into a more updated version, either a nightly or a specific svn
revision that we can pull to replace it. Any recommendations for a date
since the 1.3.0 release 9 months ago? Doesn't have to be super new or
anything, just something that won't constantly run out of memory all the
time and is relatively stable that people have good experience with. Thanks!
-- 
View this message in context: 
http://www.nabble.com/What-would-be-a-good-date-for-downloading-a-stable-solr-release-tp23820504p23820504.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: What would be a good date for downloading a stable solr release

2009-06-01 Thread Walter Underwood
What sort of issues? We run Solr 1.3 for days or weeks with almost no
problems. We have one odd failure that we haven't been able to reproduce
in test, but it is very rare, once or twice per month across five servers.

wunder

On 6/1/09 12:34 PM, sroussey srous...@network54.com wrote:

 
 We have too many issues with 1.3 running for longer than 12 hours and want to
 look into a more updated version, either a nightly or a specific svn
 revision that we can pull to replace it. Any recommendations for a date
 since the 1.3.0 release 9 months ago? Doesn't have to be super new or
 anything, just something that won't constantly run out of memory all the
 time and is relatively stable that people have good experience with. Thanks!



Re: What would be a good date for downloading a stable solr release

2009-06-01 Thread Otis Gospodnetic

Hi,

1.3 is quite solid, so my guess is memory problems may be a question of 
configuration, inappropriate data input or analysis or inadequate hw.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch






Unable to Search German text

2009-06-01 Thread Manepalli, Kalyan
Hi All,
I am facing an issue while adding multi language support in the 
Solr.
Here is what I am doing.
1) I have a field of type text_de whose analyzer uses
SnowballPorterFilterFactory with German2 as the language.
2) I copy the German locationName into this field at index time.
3) I can see that the German text is converted into its corresponding English
text, but the doc is still not returned when searched.

Example:
Add a document with locationName Köln into the field text_de.
When I run the analyzer on this field, it displays Köln converted into koln.
But when I search for koln I don't get any results.

Any suggestions on this will be very helpful


Thanks,
Kalyan Manepalli



RE: Unable to Search German text

2009-06-01 Thread Manepalli, Kalyan
I found what I was doing wrong. The XML document that I was posting didn't have 
the character encoding declared, due to which Solr was ignoring the special characters.

Thanks,
Kalyan Manepalli
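
For anyone hitting the same issue, the important part is declaring the encoding
both in the XML itself and in the Content-Type header when posting. A
hypothetical sketch (host and field name are placeholders; a separate commit is
still needed afterwards):

import urllib.request

doc = """<?xml version="1.0" encoding="UTF-8"?>
<add>
  <doc>
    <field name="locationName">Köln</field>
  </doc>
</add>"""

req = urllib.request.Request(
    "http://localhost:8983/solr/update",
    data=doc.encode("utf-8"),
    headers={"Content-Type": "text/xml; charset=utf-8"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode("utf-8"))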



Solr.war

2009-06-01 Thread Francis Yakin

We are planning to upgrade solr 1.2.0 to 1.3.0

Under 1.3.0, which war file do I need to use and deploy for my application?

We are using weblogic.

There are two war files: /opt/apache-solr-1.3.0/dist/apache-solr-1.3.0.war and
/opt/apache-solr-1.3.0/example/webapps/solr.war.
Which one are we supposed to use?


Thanks

Francis




Error sorting random field with June 1, 2009 Solr 1.4 nightly

2009-06-01 Thread Robert Purdy

Hey all, 

I was just wondering if anyone else is getting an error with today's nightly
while sorting the random field.

Thanks Rob.

Jun 1, 2009 4:52:37 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
at org.apache.lucene.search.SortField.getComparator(SortField.java:483)
at
org.apache.lucene.search.FieldValueHitQueue$OneComparatorFieldValueHitQueue.<init>(FieldValueHitQueue.java:80)
at
org.apache.lucene.search.FieldValueHitQueue.create(FieldValueHitQueue.java:190)
at
org.apache.lucene.search.TopFieldCollector.create(TopFieldCollector.java:851)
at
org.apache.solr.search.SolrIndexSearcher.sortDocSet(SolrIndexSearcher.java:1360)
at
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:868)
at
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:337)
at
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:176)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1328)
at
org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:51)
at org.apache.solr.core.SolrCore$4.call(SolrCore.java:1158)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
at java.util.concurrent.FutureTask.run(FutureTask.java:123)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
at java.lang.Thread.run(Thread.java:613)


-- 
View this message in context: 
http://www.nabble.com/Error-sorting-random-field-with-June-1%2C-2009-Solr-1.4-nightly-tp23824012p23824012.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Using Chinese / How to ?

2009-06-01 Thread Grant Ingersoll
Can you provide details on the errors? I don't think we have a specific
how-to, but I wouldn't think it would be much different from 1.2.


-Grant
On May 31, 2009, at 10:31 PM, Fer-Bj wrote:



Hello,

is there any how to already created to get me up using SOLR 1.3  
running

for a chinese based website?
Currently our site is using SOLR 1.2, and we tried to move into 1.3  
but we
couldn't complete our reindex as it seems like 1.3 is more strict  
when it

comes to special chars.

I would appreciate any help anyone may provide on this.

Thanks!!
--
View this message in context: 
http://www.nabble.com/Using-Chinese---How-to---tp23810129p23810129.html
Sent from the Solr - User mailing list archive at Nabble.com.



--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Re: Solr.war

2009-06-01 Thread Koji Sekiguchi

They are identical. solr.war is a copy of apache-solr-1.3.0.war.
You may want to look at the example target in build.xml:

 <target name="example"
         description="Creates a runnable example configuration."
         depends="init-forrest-entities,dist-contrib,dist-war">
   <!-- copy apache-solr-1.3.0.war to solr.war -->
   <copy file="${dist}/${fullnamever}.war"
         tofile="${example}/webapps/${ant.project.name}.war"/>

Koji




  




Filter query results do not match facet counts

2009-06-01 Thread shopDave

I am using the 2009-05-27 build of Solr 1.4.  Under this build, I get a facet
count of 7 for the value Seasonal on my category field.  However, when I do
a filter query of 'fq=cat:Seasonal', I get only 1 result.

I switched back to Solr 1.3 to see if it's a problem with my config.  I
found that the counts and filter queries work as expected under 1.3.

Any ideas?  I have nearly the same configuration between the nightly builds and
1.3.  I believe the only change I made was to label the schema version 1.2.
I'm using the default data types from each original document.
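
One way to narrow this down is to compare, against the same index, the facet
count and the numFound of the equivalent filter query. A sketch with assumed
host/port and the field and value from the message above:

import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

base = "http://localhost:8983/solr/select?"
facet_url = base + urllib.parse.urlencode(
    {"q": "*:*", "rows": "0", "facet": "true", "facet.field": "cat"})
fq_url = base + urllib.parse.urlencode(
    {"q": "*:*", "rows": "0", "fq": "cat:Seasonal"})

with urllib.request.urlopen(facet_url) as resp:
    facet_root = ET.parse(resp).getroot()
with urllib.request.urlopen(fq_url) as resp:
    fq_root = ET.parse(resp).getroot()

# numFound comes from the <result> element; the facet count is the <int>
# named "Seasonal" under facet_fields.
print("fq numFound:", fq_root.find("result").get("numFound"))
print("facet count:", [e.text for e in facet_root.iter("int")
                       if e.get("name") == "Seasonal"])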
-- 
View this message in context: 
http://www.nabble.com/Filter-query-results-do-not-match-facet-counts-tp23824980p23824980.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Using Chinese / How to ?

2009-06-01 Thread Fer-Bj

I'm sending 3 files:
- schema.xml
- solrconfig.xml
- error.txt (with the error description) 

I can confirm by now that this error is due to characters that are invalid in
the XML format (ASCII 0 or 11).
However, this problem is now taking a different direction: how to start
using CJK analysis instead of the English one!
http://www.nabble.com/file/p23825881/error.txt error.txt 
http://www.nabble.com/file/p23825881/solrconfig.xml solrconfig.xml 
http://www.nabble.com/file/p23825881/schema.xml schema.xml 
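
Since the failures come down to control characters that XML 1.0 simply does not
allow, one pragmatic fix is to strip them from field values before building the
update message. A rough sketch (the exact character policy is an assumption):

import re

# XML 1.0 allows tab, newline and carriage return but not the other
# control characters below 0x20; strip those before posting.
_XML_INVALID = re.compile("[\x00-\x08\x0b\x0c\x0e-\x1f]")

def strip_invalid_xml_chars(text):
    return _XML_INVALID.sub("", text)

print(repr(strip_invalid_xml_chars("bad\x00value\x0bhere")))  # 'badvaluehere'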


 
 
 

-- 
View this message in context: 
http://www.nabble.com/Using-Chinese---How-to---tp23810129p23825881.html
Sent from the Solr - User mailing list archive at Nabble.com.



Solr multiple keyword search as google

2009-06-01 Thread The Spider

Hi,
   I am using a Solr nightly build for my search.
I have to search in the location field of the table, which is not my default
search field.
I will briefly explain my requirement below:
I want to get the same/similar result when I give the location as multiple
keywords, say San jose ca USA,
USA ca san jose, or CA San jose USA (like a Google search). That
means even if I rearrange the keywords of the location I still want to get
proper results. Is there any way to do that?
Thanks in advance
-- 
View this message in context: 
http://www.nabble.com/Solr-multiple-keyword-search-as-google-tp23826278p23826278.html
Sent from the Solr - User mailing list archive at Nabble.com.