Re: NPE on MERGEINDEXES
Reopened SOLR-1051: https://issues.apache.org/jira/browse/SOLR-1051?focusedCommentId=12715030&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12715030

Koji

Koji Sekiguchi wrote:
Maybe I did something wrong; I got an NPE when trying MERGEINDEXES:

http://localhost:8983/solr/admin/cores?action=MERGEINDEXES&core=core0&indexDirs=indexname

java.lang.NullPointerException
    at org.apache.solr.update.processor.RunUpdateProcessor.<init>(RunUpdateProcessorFactory.java:55)
    at org.apache.solr.update.processor.RunUpdateProcessorFactory.getInstance(RunUpdateProcessorFactory.java:43)
    at org.apache.solr.update.processor.UpdateRequestProcessorChain.createProcessor(UpdateRequestProcessorChain.java:55)
    at org.apache.solr.handler.admin.CoreAdminHandler.handleMergeAction(CoreAdminHandler.java:191)
    at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:151)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:301)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:174)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
    at org.mortbay.jetty.Server.handle(Server.java:285)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
    at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)

I'm using the latest trunk. Solr was started with:

$ cd example
$ java -Dsolr.solr.home=./multicore -jar start.jar

Thank you,
Koji
Re: When searching for !...@#$%^*() all documents are matched incorrectly
Walter,

The analysis link does not produce any matches for either @ or !...@#$%^*() when I try to match against "bathing". I'm worried that this might be a symptom of another problem (which has not revealed itself yet) and want to get to the bottom of this... Thank you.

sm

Walter Underwood wrote:
Use the [analysis] link on the Solr admin UI to get more info on how this is being interpreted. However, I am curious about why this is important. Do users enter this query often? If not, maybe it is not something to spend time on.

wunder

On 5/31/09 2:56 PM, Sam Michaels <mas...@yahoo.com> wrote:
Here is the output from the debug query when I'm trying to match the string @ against "Bathing" (it should not match):

<str name="GLOM-1">
3.2689073 = (MATCH) weight(activity_type:NAME in 0), product of:
  0.9994 = queryWeight(activity_type:NAME), product of:
    3.2689075 = idf(docFreq=153, numDocs=1489)
    0.30591258 = queryNorm
  3.2689075 = (MATCH) fieldWeight(activity_type:NAME in 0), product of:
    1.0 = tf(termFreq(activity_type:NAME)=1)
    3.2689075 = idf(docFreq=153, numDocs=1489)
    1.0 = fieldNorm(field=activity_type, doc=0)
</str>

Looks like the AND clause in the search string is ignored...

SM.

ryantxu wrote:
Two key things to try (for anyone ever wondering why a query matches documents):
1. Add debugQuery=true and look at the explain text below -- anything that contributed to the score is listed there.
2. Check /admin/analysis.jsp -- this will let you see how analyzers break text up into tokens.
Not sure off hand, but I'm guessing the WordDelimiterFilterFactory has something to do with it...

On Sat, May 30, 2009 at 5:59 PM, Sam Michaels <mas...@yahoo.com> wrote:
Hi,

I'm running Solr 1.3/Java 1.6. When I run a query like (activity_type:NAME) AND title:(\...@#$%\^\*\(\)), all the documents are returned even though there is not a single match. There is no title that matches the string (which has been escaped).

My document structure is as follows:

<doc>
  <str name="activity_type">NAME</str>
  <str name="title">Bathing</str>
</doc>

The title field is of type text_title, which is described below:

<fieldType name="text_title" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

When I run the query against Luke, no results are returned. Any suggestions are appreciated.
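[Editor's note: for reference, the backslash-escaping the poster applied can be sketched client-side. A minimal Python sketch; the character set below is the classic Lucene query-parser special-character list (it simplifies the two-character operators && and ||), and the helper name is hypothetical:

```python
# Characters treated as query syntax by the classic Lucene query parser
# (simplified: && and || are really two-character operators).
LUCENE_SPECIAL = set('+-&|!(){}[]^"~*?:\\')

def escape_query(text: str) -> str:
    """Backslash-escape Lucene query-parser special characters."""
    return ''.join('\\' + ch if ch in LUCENE_SPECIAL else ch for ch in text)

print(escape_query('!@#$%^*()'))  # \!@#$%\^\*\(\)
```

Note that escaping only protects the characters from the parser; the analysis chain may still discard them, which is exactly what this thread goes on to show.]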
Re: When searching for !...@#$%^*() all documents are matched incorrectly
OK, here's the deal:

<str name="rawquerystring">-features:foo features:(\...@#$%\^\*\(\))</str>
<str name="querystring">-features:foo features:(\...@#$%\^\*\(\))</str>
<str name="parsedquery">-features:foo</str>
<str name="parsedquery_toString">-features:foo</str>

The text analysis is throwing away the non-alphanumeric chars (probably the WordDelimiterFilter). The Lucene (and Solr) query parser throws away term queries when the token is zero length (after analysis). Solr then interprets the left-over -features:foo as "all documents not containing foo in the features field", so you get a bunch of matches.

-Yonik
http://www.lucidimagination.com
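[Editor's note: Yonik's diagnosis can be illustrated with a small sketch. This is a hypothetical, grossly simplified stand-in for Solr's analysis chain and query parser, not Solr's actual code; it shows how a term made entirely of punctuation analyzes to nothing, gets dropped, and leaves only the purely negative clause:

```python
import re

def analyze(token: str) -> str:
    """Crude stand-in for WordDelimiterFilter + LowerCaseFilter:
    keep only alphanumeric characters, lowercased."""
    return re.sub(r'[^0-9a-zA-Z]+', '', token).lower()

def parse(clauses):
    """Crude stand-in for the query parser: drop clauses whose
    term analyzes to the empty string."""
    kept = []
    for sign, field, term in clauses:
        term = analyze(term)
        if term:  # zero-length terms are discarded
            kept.append((sign, field, term))
    return kept

query = [('-', 'features', 'foo'), ('+', 'features', '@#$%^*()')]
print(parse(query))  # only the negative clause survives
```
]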
Re: When searching for !...@#$%^*() all documents are matched incorrectly
So the fix for this problem would be:
1. Stop using WordDelimiterFilter for queries (what is the alternative?), OR
2. Not allowing any search strings without any alphanumeric characters.

SM.
Re: When searching for !...@#$%^*() all documents are matched incorrectly
On Mon, Jun 1, 2009 at 10:50 AM, Sam Michaels <mas...@yahoo.com> wrote:
So the fix for this problem would be 1. Stop using WordDelimiterFilter for queries (what is the alternative) OR 2. Not allow any search strings without any alphanumeric characters.

Short term workaround for you, yes. I would classify this surprising behavior as a bug we should eventually fix though. Could you open a JIRA issue for it?

-Yonik
http://www.lucidimagination.com
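[Editor's note: workaround 2 above is easy to enforce in the application layer before the query ever reaches Solr. A minimal sketch with a hypothetical helper name, not part of Solr:

```python
def has_searchable_text(query: str) -> bool:
    """Reject queries containing no alphanumeric characters at all,
    since analysis would reduce every term to nothing."""
    return any(ch.isalnum() for ch in query)

print(has_searchable_text('!@#$%^*()'))  # False: nothing would survive analysis
print(has_searchable_text('bathing'))    # True
```
]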
Highlighting and Field options
Hi,

The 'content' field that I am indexing is usually large (e.g. a PDF doc of a few MB in size). I need highlighting to be on, and this seems to require that the 'content' field be STORED. That returns the whole content field in the search result XML for each matching document; the highlighted text is also returned in a separate block. But I do NOT need the entire content field to display the search results. I only use the highlighted segments to display a brief description of each hit. The fact that Solr returns the entire content field makes the returned XML unnecessarily huge, and makes for larger response times. How can I have Solr return ONLY the highlighted text for each hit and NOT the entire 'content' field?

Thanks

- ashok
Re: Highlighting and Field options
Use the fl param to ask for only the fields you need, but also keep hl=true. Something like this:

http://localhost:8080/solr/select/?q=bear&version=2.2&start=0&rows=10&indent=on&hl=true&fl=id

Note that fl=id means the only field returned in the XML will be the id field. Highlights are still returned in the highlighting element, but you won't get back the unneeded content field.

-Jay
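[Editor's note: the shape of the trimmed response can be sketched as follows. The XML below is a hypothetical snippet modeled on Solr 1.3's XML response format (element names in a real response may differ): the docs carry only id, while the snippets come back under the separate highlighting section and can be pulled out directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical trimmed response: fl=id keeps docs small, snippets
# arrive under the separate "highlighting" list.
RESPONSE = """<response>
  <result name="response" numFound="1" start="0">
    <doc><str name="id">doc1</str></doc>
  </result>
  <lst name="highlighting">
    <lst name="doc1">
      <arr name="content"><str>a &lt;em&gt;bear&lt;/em&gt; in the woods</str></arr>
    </lst>
  </lst>
</response>"""

root = ET.fromstring(RESPONSE)
snippets = {
    doc.get('name'): [s.text for s in doc.findall('arr/str')]
    for doc in root.findall("lst[@name='highlighting']/lst")
}
print(snippets)
```
]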
How to get number of optimizes
Hello,

I'm looking for a simple way to automate (in a shell script) a request for the number of times an index has been optimized (since the Solr webapp last started). I know that this information is available on the Solr stats page (http://host:port/solr/admin/stats.jsp) under Update Handlers/stats/optimizes, but I'm looking for a simpler way than retrieving the page with wget or similar and parsing the HTML. More generally, is there a convenient way to get at the other data presented on the stats page? I'm currently using Solr 1.2 but will be migrating to 1.3 soon, in case that makes a difference.

Thanks...
Re: How to get number of optimizes
Not sure if it's simpler, but the JMX interface is more structured. I think that just grabbing the page and parsing out the content with your favorite tool (Ruby with Hpricot, say) is pretty simple.

Eric

-----------------------------------------------------
Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com
Free/Busy: http://tinyurl.com/eric-cal
Re: Keyword Density
Hi All,

Is there a way to perform filtering based on keyword density?

Thanks

--
Alex Shevchenko
Re: User search in Facebook like
Thanks a lot for your answer; it fixed all my issues! It's working really well!

Cheers,
Vincent
Re: How to get number of optimizes
Thanks for the quick response. I agree that for this one-off task the grab-and-parse method works fine, but I'll keep the JMX interface in mind for other tasks in the future. Here's my particular hack solution, in case it helps anyone else:

wget -q -O- http://hostname:port/solr/admin/stats.jsp | awk '/optimizes/{getline;print}'
Re: How to get number of optimizes
Hello,

That stats page is really XML plus XSLT that transforms the XML to HTML. View the source of the stats page. That should make it very easy to parse the stats response/page and extract the data you need.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
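[Editor's note: since the stats page is XML under the hood, it can be parsed directly instead of scraped with awk. A sketch; the XML below is a hypothetical fragment shaped like stats.jsp output, and the exact element names in a real install may differ:

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment in the spirit of /admin/stats.jsp output.
STATS = """<solr>
  <entry>
    <name>updateHandler</name>
    <stats>
      <stat name="commits">12</stat>
      <stat name="optimizes">3</stat>
    </stats>
  </entry>
</solr>"""

root = ET.fromstring(STATS)
optimizes = int(root.find(".//stat[@name='optimizes']").text)
print(optimizes)
```

Parsing the XML by name is more robust than the line-oriented awk hack, which silently breaks if the page layout changes.]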
RE: Java OutOfmemory error during autowarming
Hi Chris,

I am new to Solr. When it is initialized for the first time, how can I change it?

Thanks

Francis

-----Original Message-----
From: Chris Harris [mailto:rygu...@gmail.com]
Sent: Sunday, May 31, 2009 3:00 PM
To: solr-user@lucene.apache.org
Subject: Re: Java OutOfmemory error during autowarming

Solr offers no configuration for FieldCache, neither in solrconfig.xml nor anywhere else; rather, that cache gets populated automatically in the depths of Lucene when you do a sort (or also, apparently, as Yonik says, when you use a field in a function query). From the wiki: "Lucene has a low level FieldCache which is used for sorting (and in some cases faceting). This cache is not managed by Solr; it has no configuration options and cannot be autowarmed -- it is initialized the first time it is used for each Searcher." (http://wiki.apache.org/solr/SolrCaching)

2009/5/29 Francis Yakin <fya...@liquid.com>:
I know, but the FieldCache is not in the solrconfig.xml

-----Original Message-----
From: Yonik Seeley [mailto:ysee...@gmail.com]
Sent: Friday, May 29, 2009 10:47 AM
To: solr-user@lucene.apache.org
Subject: Re: Java OutOfmemory error during autowarming

On Fri, May 29, 2009 at 1:44 PM, Francis Yakin <fya...@liquid.com> wrote:
There is no FieldCache entries in solrconfig.xml (BTW we are running version 1.2.0)

Lucene FieldCache entries are created when you sort on a field or when you use a field in a function query.

-Yonik
Re: Keyword Density
Something like that. Just not "N times" but "(number of times foo appears / total number of words) exceeds some value".

On Mon, Jun 1, 2009 at 21:00, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:
Hi Alex,

Could you please provide an example of this? Are you looking to do something like "find all docs that match name:foo and where foo appears N times (in the name field) in the matching document"?

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Dismax handler phrase matching question
Hello,

I'm using the dismax handler for phrase matching. I have a few legal resources in my index in the following format, for example:

title      state
dui faq1   california
dui faq2   florida
dui faq3   federal

Now I want to be able to return federal results irrespective of the state. For example, "dui california" should also return all federal results for 'dui' along with the california results. I was thinking of a synonym mapping for the states, like 'state name' => 'federal' (i.e. california,federal; florida,federal; maine,federal; etc.). Is there a better way though?
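[Editor's note: one alternative to query-time synonyms is to rewrite the state clause in the application before sending the query, so that it always ORs in federal. A hypothetical client-side sketch, not something proposed in this thread:

```python
def state_clause(state: str, field: str = 'state') -> str:
    """Build a filter clause matching the requested state
    plus federal-level documents."""
    return f'{field}:({state} OR federal)'

print(state_clause('california'))  # state:(california OR federal)
```

This keeps the federal fallback out of the schema, at the cost of the client having to know which field carries the state.]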
Re: Keyword Density
But I don't need to sort using this value. I need to cut results where this value (for a particular term of the query!) is not in some range.

On Mon, Jun 1, 2009 at 22:20, Walter Underwood <wunderw...@netflix.com> wrote:
That is the normal relevance scoring formula in Solr and Lucene. It is a bit fancier than that, but you don't have to do anything special to get that behavior. Solr also uses the inverse document frequency (rarity) of each word for weighting. Look up tf.idf for more info.

wunder
Re: Keyword Density
That is the normal relevance scoring formula in Solr and Lucene. It is a bit fancier than that, but you don't have to do anything special to get that behavior. Solr also uses the inverse document frequency (rarity) of each word for weighting. Look up tf.idf for more info. wunder On 6/1/09 11:46 AM, Alex Shevchenko caeza...@gmail.com wrote: Something like that. Just not 'N times' but '(number of times foo appears / total number of words) compared with some value'. On Mon, Jun 1, 2009 at 21:00, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: Hi Alex, Could you please provide an example of this? Are you looking to do something like find all docs that match name:foo and where foo appears N times (in the name field) in the matching document? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Alex Shevchenko caeza...@gmail.com To: solr-user@lucene.apache.org Sent: Monday, June 1, 2009 1:32:49 PM Subject: Re: Keyword Density Hi All, Is there a way to perform filtering based on keyword density? Thanks -- Alex Shevchenko
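For clarity, the "keyword density" Alex is asking about can be sketched client-side. Solr 1.4 has no built-in filter for this, so this hypothetical sketch computes density (term occurrences divided by total tokens) on retrieved documents and cuts those outside a range; field names and thresholds are made up for illustration:

```python
def keyword_density(text, term):
    """Fraction of whitespace-separated tokens equal to `term` (case-insensitive)."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return tokens.count(term.lower()) / float(len(tokens))

def filter_by_density(docs, field, term, low, high):
    """Keep only docs whose density for `term` in `field` lies in [low, high]."""
    return [d for d in docs
            if low <= keyword_density(d.get(field, ""), term) <= high]

docs = [
    {"id": "1", "name": "foo bar foo baz"},   # density of 'foo' = 2/4 = 0.5
    {"id": "2", "name": "foo bar baz qux"},   # density of 'foo' = 1/4 = 0.25
    {"id": "3", "name": "bar baz"},           # density of 'foo' = 0.0
]
hits = filter_by_density(docs, "name", "foo", 0.3, 1.0)  # keeps only doc "1"
```

This is a post-processing workaround, not a Solr feature; it only makes sense when result sets are small enough to re-score on the client.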
What would be a good date for downloading a stable solr release
We have too many issues with 1.3 running for longer than 12 hours and want to look into a more updated version, either a nightly or a specific svn revision that we can pull to replace it. Any recommendations for a date since the 1.3.0 release 9 months ago? Doesn't have to be super new or anything, just something that won't constantly run out of memory and is relatively stable that people have had good experience with. Thanks! -- View this message in context: http://www.nabble.com/What-would-be-a-good-date-for-downloading-a-stable-solr-release-tp23820504p23820504.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: What would be a good date for downloading a stable solr release
What sort of issues? We run Solr 1.3 for days or weeks with almost no problems. We have one odd failure that we haven't been able to reproduce in test, but it is very rare, once or twice per month across five servers. wunder On 6/1/09 12:34 PM, sroussey srous...@network54.com wrote: We have too many issues with 1.3 running for longer than 12 hours and want to look into a more updated version, either a nightly or a specific svn revision that we can pull to replace it. Any recommendations for a date since the 1.3.0 release 9 months ago? Doesn't have to be super new or anything, just something that won't constantly run out of memory all the time and is relatively stable that people have good experience with. Thanks!
Re: What would be a good date for downloading a stable solr release
Hi, 1.3 is quite solid, so my guess is the memory problems may be a question of configuration, inappropriate data input or analysis, or inadequate hardware. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: sroussey srous...@network54.com To: solr-user@lucene.apache.org Sent: Monday, June 1, 2009 3:34:01 PM Subject: What would be a good date for downloading a stable solr release We have too many issues with 1.3 running for longer than 12 hours and want to look into a more updated version, either a nightly or a specific svn revision that we can pull to replace it. Any recommendations for a date since the 1.3.0 release 9 months ago? Doesn't have to be super new or anything, just something that won't constantly run out of memory and is relatively stable that people have had good experience with. Thanks! -- View this message in context: http://www.nabble.com/What-would-be-a-good-date-for-downloading-a-stable-solr-release-tp23820504p23820504.html Sent from the Solr - User mailing list archive at Nabble.com.
Unable to Search German text
Hi All, I am facing an issue while adding multi-language support in Solr. Here is what I am doing: 1) I have a field of type text_de whose analyzer uses SnowballPorterFilterFactory with German2 as the language. 2) I copy the German locationName into this field at index time. 3) I can see that the German text is converted into its stemmed, folded form, but the doc is still not returned when searched. Example: Add a document with locationName Köln into the field text_de. When I run the analyzer on this field it displays Köln converted into koln. But when I search for koln I don't get any results. Any suggestions on this will be very helpful. Thanks, Kalyan Manepalli
RE: Unable to Search German text
I found what I was doing wrong. The XML document that I was posting didn't have the character encoding info, due to which Solr was ignoring the special chars. Thanks, Kalyan Manepalli -Original Message- From: Manepalli, Kalyan [mailto:kalyan.manepa...@orbitz.com] Sent: Monday, June 01, 2009 3:31 PM To: solr-user@lucene.apache.org Subject: Unable to Search German text Hi All, I am facing an issue while adding multi-language support in Solr. Here is what I am doing: 1) I have a field of type text_de whose analyzer uses SnowballPorterFilterFactory with German2 as the language. 2) I copy the German locationName into this field at index time. 3) I can see that the German text is converted into its stemmed, folded form, but the doc is still not returned when searched. Example: Add a document with locationName Köln into the field text_de. When I run the analyzer on this field it displays Köln converted into koln. But when I search for koln I don't get any results. Any suggestions on this will be very helpful. Thanks, Kalyan Manepalli
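For reference, the fix described above amounts to declaring the encoding in the posted update document so Solr's XML parser decodes the umlaut correctly. A minimal sketch of such an update message (field name taken from the thread):

```
<?xml version="1.0" encoding="UTF-8"?>
<add>
  <doc>
    <field name="locationName">Köln</field>
  </doc>
</add>
```

The bytes on the wire must actually be UTF-8 (and, if posting over HTTP, the Content-Type should say charset=UTF-8 as well); the declaration alone does not transcode anything.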
Solr.war
We are planning to upgrade Solr 1.2.0 to 1.3.0. Under 1.3.0, which war file do I need to use and deploy on my application? We are using WebLogic. There are two war files: one under /opt/apache-solr-1.3.0/dist/apache-solr-1.3.0.war and one under /opt/apache-solr-1.3.0/example/webapps/solr.war. Which one are we supposed to use? Thanks Francis
Error sorting random field with June 1, 2009 Solr 1.4 nightly
Hey all, I was just wondering if anyone else is getting an error with today's nightly while sorting on the random field. Thanks Rob.

Jun 1, 2009 4:52:37 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
    at org.apache.lucene.search.SortField.getComparator(SortField.java:483)
    at org.apache.lucene.search.FieldValueHitQueue$OneComparatorFieldValueHitQueue.<init>(FieldValueHitQueue.java:80)
    at org.apache.lucene.search.FieldValueHitQueue.create(FieldValueHitQueue.java:190)
    at org.apache.lucene.search.TopFieldCollector.create(TopFieldCollector.java:851)
    at org.apache.solr.search.SolrIndexSearcher.sortDocSet(SolrIndexSearcher.java:1360)
    at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:868)
    at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:337)
    at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:176)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1328)
    at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:51)
    at org.apache.solr.core.SolrCore$4.call(SolrCore.java:1158)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
    at java.util.concurrent.FutureTask.run(FutureTask.java:123)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
    at java.lang.Thread.run(Thread.java:613)

-- View this message in context: http://www.nabble.com/Error-sorting-random-field-with-June-1%2C-2009-Solr-1.4-nightly-tp23824012p23824012.html Sent from the Solr - User mailing list archive at Nabble.com.
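For anyone trying to reproduce this, the random-sort field is normally declared with solr.RandomSortField, along the lines of the stock example schema (exact names may differ in your schema), and then sorted on a dynamic instance so each seed gives a different ordering:

```
<!-- in schema.xml -->
<fieldType name="random" class="solr.RandomSortField" indexed="true"/>
<dynamicField name="random_*" type="random"/>

<!-- query: vary the suffix to change the shuffle seed -->
http://localhost:8983/solr/select?q=*:*&sort=random_1234+desc
```

If sorting on such a field throws the NPE above, the bug is in the comparator lookup during sorting, not in this declaration itself.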
Re: Using Chinese / How to ?
Can you provide details on the errors? I don't think we have a specific how to, but I wouldn't think it would be much different from 1.2 -Grant On May 31, 2009, at 10:31 PM, Fer-Bj wrote: Hello, is there any how to already created to get me up using SOLR 1.3 running for a chinese based website? Currently our site is using SOLR 1.2, and we tried to move into 1.3 but we couldn't complete our reindex as it seems like 1.3 is more strict when it comes to special chars. I would appreciate any help anyone may provide on this. Thanks!! -- View this message in context: http://www.nabble.com/Using-Chinese---How-to---tp23810129p23810129.html Sent from the Solr - User mailing list archive at Nabble.com. -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene: http://www.lucidimagination.com/search
Re: Solr.war
They are identical. solr.war is a copy of apache-solr-1.3.0.war. You may want to look at the example target in build.xml:

<target name="example" description="Creates a runnable example configuration."
        depends="init-forrest-entities,dist-contrib,dist-war">
  <!-- copy apache-solr-1.3.0.war to solr.war -->
  <copy file="${dist}/${fullnamever}.war"
        tofile="${example}/webapps/${ant.project.name}.war"/>

Koji Francis Yakin wrote: We are planning to upgrade Solr 1.2.0 to 1.3.0. Under 1.3.0, which war file do I need to use and deploy on my application? We are using WebLogic. There are two war files: one under /opt/apache-solr-1.3.0/dist/apache-solr-1.3.0.war and one under /opt/apache-solr-1.3.0/example/webapps/solr.war. Which one are we supposed to use? Thanks Francis
Filter query results do not match facet counts
I am using the 2009-05-27 build of Solr 1.4. Under this build, I get a facet count of 7 for the value Seasonal on my category field. However, when I do a filter query of 'fq=cat:Seasonal', I get only 1 result. I switched back to Solr 1.3 to see if it's a problem with my config. I found that the counts and filter queries work as expected under 1.3. Any ideas? I have nearly the same configuration between the nightly build and 1.3. I believe the only change I made was to label the schema version 1.2. I'm using the default data types from the original documents. -- View this message in context: http://www.nabble.com/Filter-query-results-do-not-match-facet-counts-tp23824980p23824980.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Using Chinese / How to ?
I'm sending 3 files: - schema.xml - solrconfig.xml - error.txt (with the error description) I can now confirm that this error is due to invalid characters for the XML format (ASCII 0 or 11). However, this problem is now taking a different direction: how to start using the CJK analyzer instead of the English one! http://www.nabble.com/file/p23825881/error.txt error.txt http://www.nabble.com/file/p23825881/solrconfig.xml solrconfig.xml http://www.nabble.com/file/p23825881/schema.xml schema.xml Grant Ingersoll-6 wrote: Can you provide details on the errors? I don't think we have a specific how-to, but I wouldn't think it would be much different from 1.2. -Grant On May 31, 2009, at 10:31 PM, Fer-Bj wrote: Hello, is there any how-to already created to get me up and running with SOLR 1.3 for a Chinese-based website? Currently our site is using SOLR 1.2, and we tried to move to 1.3, but we couldn't complete our reindex as it seems like 1.3 is stricter when it comes to special chars. I would appreciate any help anyone may provide on this. Thanks!! -- View this message in context: http://www.nabble.com/Using-Chinese---How-to---tp23810129p23810129.html Sent from the Solr - User mailing list archive at Nabble.com. -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene: http://www.lucidimagination.com/search -- View this message in context: http://www.nabble.com/Using-Chinese---How-to---tp23810129p23825881.html Sent from the Solr - User mailing list archive at Nabble.com.
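Since the reported failure comes from control characters (ASCII 0 or 11) that XML 1.0 forbids, one generic workaround is to strip them before posting documents to Solr. This is a hypothetical client-side sketch, not a Solr API:

```python
import re

# XML 1.0 only allows tab (0x09), LF (0x0A), CR (0x0D) among the control
# characters; everything else below 0x20 must be removed before the document
# is well-formed. This pattern matches exactly those forbidden characters.
_INVALID_XML_CHARS = re.compile(u'[\x00-\x08\x0b\x0c\x0e-\x1f]')

def strip_invalid_xml_chars(text):
    """Remove characters that are illegal in XML 1.0 documents."""
    return _INVALID_XML_CHARS.sub(u'', text)
```

As for the CJK question: a minimal field-type sketch (name hypothetical) using the CJK tokenizer that ships with Solr would look like `<fieldType name="text_cjk" class="solr.TextField"><analyzer><tokenizer class="solr.CJKTokenizerFactory"/></analyzer></fieldType>`, applied to the fields holding Chinese text.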
Solr multiple keyword search as google
Hi, I am using a Solr nightly build for my search. I have to search in the location field of the table, which is not my default search field. I will briefly explain my requirement below: I want to get the same/similar results when I give the location as multiple keywords, say 'San jose ca USA' or 'USA ca san jose' or 'CA San jose USA' (like a Google search). That means even if I rearrange the keywords of the location I want to get proper results. Is there any way to do that? Thanks in advance -- View this message in context: http://www.nabble.com/Solr-multiple-keyword-search-as-google-tp23826278p23826278.html Sent from the Solr - User mailing list archive at Nabble.com.
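One common way to get order-independent matching like this is the dismax query parser, which matches the individual analyzed terms regardless of their order in the query. A sketch of a handler configured against the location field (handler name and mm value are hypothetical):

```
<requestHandler name="/locsearch" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">location</str>
    <!-- require at least one query term to match -->
    <str name="mm">1</str>
  </lst>
</requestHandler>
```

With this, 'San jose ca USA' and 'USA ca san jose' produce the same set of term matches; documents matching more of the terms simply score higher.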