Re: How could I monitor solr cache
I am wondering how I could get the running status of the Solr caches. I know there is a JMX bean containing that information. I just want to know what tool or method you use to monitor the caches, in order to improve performance or detect issues.
You might find this interesting: http://sematext.com/spm/solr-performance-monitoring/index.html http://sematext.com/spm/index.html
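Beyond a hosted service, the cache stats Solr exposes (via JMX, or the admin stats page) can be polled and graphed with a small script. A minimal sketch, assuming a simplified XML layout for one cache entry; the real stats output varies by Solr version, so adjust the parsing to what yours emits:

```python
import xml.etree.ElementTree as ET

# Simplified sample of one cache entry; the real stats page layout
# differs by Solr version.
SAMPLE = """<entry>
  <name>queryResultCache</name>
  <stats>
    <stat name="lookups">1500</stat>
    <stat name="hits">1200</stat>
    <stat name="evictions">10</stat>
  </stats>
</entry>"""

def cache_hit_ratio(xml_text):
    """hits / lookups for a single cache entry."""
    stats = {s.get("name"): float(s.text)
             for s in ET.fromstring(xml_text).iter("stat")}
    return stats["hits"] / stats["lookups"]
```

Watching the hit ratio and evictions over time is usually enough to spot an undersized cache.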
solr slave's performance issue after replicate the optimized index
Hi all, I have a performance issue. I do an optimize on the Solr master every night, but since about a month ago, every time the slaves fetch the new optimized index, system CPU usage rises from 0.3-0.5% to 7-10% (daily average), and the servers' load average also becomes more than 2 times higher than normal. The load average remains high even if I restart Tomcat. After many days of testing, I found 4 ways to bring the slaves back to a normal load average:
1. reboot the Linux server
2. shut down Tomcat, manually rm the index data, and replicate again
3. shut down Tomcat, copy indexdata to indexdata2, rm indexdata, mv indexdata2 to indexdata, start Tomcat
4. shut down Tomcat, use a C program to allocate 20G of memory and free it, start the server
I can only guess it has some relationship with memory or the system cache. Is this a Solr bug, a Lucene bug, or just a system issue?
My system: CentOS 5.6 x64, Tomcat 7.0, JRockit 6, Intel E5620 x2, 24GB DDR3, Solr 3.1, index size 7G (after optimize) / 8G (before optimize). Many thanks~
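Fix #4 in the list above hints that the kernel page cache is involved rather than Solr itself. Before assuming a Solr or Lucene bug, it may help to record the Cached figure from /proc/meminfo before and after a replication. A small parsing sketch (the sample values are made up):

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style lines into {field: kB}."""
    values = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, rest = line.partition(":")
            values[key.strip()] = int(rest.split()[0])
    return values

# Made-up sample; on a slave you would call
# parse_meminfo(open("/proc/meminfo").read()) before and after
# the replication finishes and compare the Cached values.
SAMPLE = """MemTotal:       24675000 kB
MemFree:          512000 kB
Cached:         18200000 kB"""
```

If Cached collapses after the new index arrives, the slowdown is likely the old and new index copies competing for page cache, not a Solr defect.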
Re: ' invisible ' words
Hi Erick, thank you for the advice... I will be doing as you advised and update you here... - Zeki ama calismiyor... Calissa yapar... -- View this message in context: http://lucene.472066.n3.nabble.com/invisible-words-tp3158060p3181647.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: [Announce] Solr 3.3 with RankingAlgorithm NRT capability, very high performance 10000 tps
Nagendra, In another email you mentioned there's a problem where, if an existing document is updated, both the old and new version will show up in search results. Has that been solved in Solr-RA 3.3? --- On Mon, 7/18/11, Nagendra Nagarajayya nnagaraja...@transaxtions.com wrote: From: Nagendra Nagarajayya nnagaraja...@transaxtions.com Subject: [Announce] Solr 3.3 with RankingAlgorithm NRT capability, very high performance 10000 tps To: solr-user@lucene.apache.org Date: Monday, July 18, 2011, 10:43 AM Hi! I would like to announce the availability of Solr 3.3 with RankingAlgorithm and Near Real Time (NRT) search capability. The NRT performance is very high, 10,000 documents/sec with the MBArtists 390k index. The NRT functionality allows you to add documents without the IndexSearchers being closed or caches being cleared. A commit is also not needed with the document update. Searches can run concurrently with document updates. No changes are needed except for enabling NRT through solrconfig.xml. RankingAlgorithm query performance is now 3x faster than before and is exposed through the Lucene API. This release also adds support for the last document with a unique id to be searchable and visible in search results in case of multiple updates of the document. I have a wiki page that describes NRT performance in detail; it can be accessed from here: http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver3.x You can download Solr 3.3 with RankingAlgorithm (NRT version) from here: http://solr-ra.tgels.org I would like to invite you to give this version a try as the performance is very high. Regards, - Nagendra Nagarajayya http://solr-ra.tgels.org http://rankingalgorithm.tgels.org
Solr 3.3: SEVERE: java.io.IOException: seek past EOF
Hi Developers and Users, a serious problem occurred:
19.07.2011 10:50:32 org.apache.solr.common.SolrException log
SEVERE: java.io.IOException: seek past EOF
at org.apache.lucene.store.MMapDirectory$MMapIndexInput.seek(MMapDirectory.java:343)
at org.apache.lucene.index.FieldsReader.seekIndex(FieldsReader.java:226)
at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:242)
at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:471)
at org.apache.lucene.index.DirectoryReader.document(DirectoryReader.java:564)
at org.apache.solr.search.SolrIndexReader.document(SolrIndexReader.java:260)
at org.apache.solr.search.SolrIndexSearcher.doc(SolrIndexSearcher.java:440)
at org.apache.solr.util.SolrPluginUtils.optimizePreFetchDocs(SolrPluginUtils.java:270)
at org.apache.solr.handler.component.QueryComponent.doPrefetch(QueryComponent.java:358)
at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:265)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:202)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:240)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:164)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:462)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:164)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:562)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at org.apache.catalina.valves.RequestFilterValve.process(RequestFilterValve.java:210)
at org.apache.catalina.valves.RemoteAddrValve.invoke(RemoteAddrValve.java:85)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:395)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:250)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:188)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:302)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:897)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
at java.lang.Thread.run(Thread.java:736)
Fresh index with Solr 3.3. It only occurs with some words (in this case it was "Graf", no idea why). Query type (dismax, standard, edismax), highlighting and faceting have no effect; only the term being searched matters. And it seems to affect only OCR fields, which are usually larger than fields for metadata. Any ideas? Greetings, best regards, Sebastian -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-3-3-SEVERE-java-io-IOException-seek-past-EOF-tp3181869p3181869.html Sent from the Solr - User mailing list archive at Nabble.com.
How to find whether solr server is running or not
I am running an application that gets search results from a Solr server. But when the server is not running I get no response from it. Is there any way I can find out that my server is not running, so that I can give a proper error message? - Thanks Regards Romi -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-find-whether-solr-server-is-running-or-not-tp3181870p3181870.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to find whether solr server is running or not
Check the HTTP response code: anything other than 200 means the service is not OK. On 19 July 2011 14:39, Romi romijain3...@gmail.com wrote: I am running an application that get search results from solr server. But when server is not running i get no response from the server. Is there any way i can found that my server is not running so that i can give proper error message regarding it - Thanks Regards Romi -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-find-whether-solr-server-is-running-or-not-tp3181870p3181870.html Sent from the Solr - User mailing list archive at Nabble.com. -- Thanks and Regards Mohammad Shariq
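Done from application code rather than a browser, the check could look like this sketch, which treats both a non-200 status and a failed connection as "down" (the URL is a placeholder for whatever ping handler you use):

```python
import urllib.request

def is_solr_up(ping_url, timeout=2):
    """Return True only if the ping URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(ping_url, timeout=timeout) as resp:
            return resp.getcode() == 200
    except OSError:
        # Covers connection refused, DNS failures and timeouts,
        # including urllib.error.URLError (a subclass of OSError).
        return False
```

The key point is that a dead server produces an exception, not a response, so the client must catch the error and map it to its own "server down" message.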
Re: How to find whether solr server is running or not
I am getting a JSON response from Solr like this: *$.getJSON("http://192.168.1.9:8983/solr/db/select/?qt=dismax&wt=json&start=0&rows=100&q=elegant&hl=true&hl.fl=text&hl.usePhraseHighlighter=true&sort=score&json.wrf=?", function(result){* How can I check whether I get a response or not? - Thanks Regards Romi -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-find-whether-solr-server-is-running-or-not-tp3181870p3181942.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr 3.3: SEVERE: java.io.IOException: seek past EOF
Oops, false alarm. A CustomSimilarity, combined with a very small set of documents, caused the problem. Greetings, Sebastian -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-3-3-SEVERE-java-io-IOException-seek-past-EOF-tp3181869p3181943.html Sent from the Solr - User mailing list archive at Nabble.com.
Spatial Search with distance as a parameter
Hi all, I have the following problem: The documents in the index of my solr instance correspond to persons. Each document (=person) has lat/lon coordinates and additionally a travel radius. The coordinates correspond to the office of the person, the travel radius indicates a distance which the person is willing to travel. I would like to search for all persons which are willing to travel to a particular place (also given as lat/lon coordinates). In other words I have to do a query with the geofilt filter. The problem here: the distance parameter d cannot be defined in advance, but should correspond to the travel radius (which may be different for each person). Any ideas how this problem could be solved? Thanks in advance Michi
Re: How to find whether solr server is running or not
You can use ping: http://host:port/solr/admin/ping The response is something like this:
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">5</int><lst name="params"><str name="echoParams">all</str><str name="rows">10</str><str name="echoParams">all</str><str name="q">solrpingquery</str><str name="qt">search</str></lst></lst><str name="status">OK</str>
</response>
or with a JSON response: http://host:port/solr/admin/ping?wt=json
{"responseHeader":{"status":0,"QTime":2,"params":{"echoParams":"all","rows":"10","echoParams":"all","q":"solrpingquery","qt":"search","wt":"json"}},"status":"OK"}
Hope this helps. Péter -- eXtensible Catalog http://drupal.org/project/xc 2011/7/19 Romi romijain3...@gmail.com: i am getting json response from solr as *$.getJSON("http://192.168.1.9:8983/solr/db/select/?qt=dismax&wt=json&start=0&rows=100&q=elegant&hl=true&hl.fl=text&hl.usePhraseHighlighter=true&sort=score&json.wrf=?", function(result){* how can i check whether i get response or not - Thanks Regards Romi -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-find-whether-solr-server-is-running-or-not-tp3181870p3181942.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to find whether solr server is running or not
But the problem is that when the Solr server is not running, *http://host:port/solr/admin/ping* will not give me any JSON response, so how will I get the status? :( When I run this URL the browser gives me the following error: *Unable to connect. Firefox can't establish a connection to the server at 192.168.1.9:8983.* - Thanks Regards Romi -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-find-whether-solr-server-is-running-or-not-tp3181870p3182202.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: - character in search query
Anybody? -- View this message in context: http://lucene.472066.n3.nabble.com/character-in-search-query-tp3168604p3182228.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: any detailed tutorials on plugin development?
Which plugin specifically are you going to implement? On Mon, Jul 18, 2011 at 2:24 AM, deniz denizdurmu...@gmail.com wrote: anyone know any tutorials on implementing plugins? there is one page on the wiki but i dont think we can call it a tutorial... i am looking for something with some example code... - Zeki ama calismiyor... Calissa yapar... -- View this message in context: http://lucene.472066.n3.nabble.com/any-detailed-tutorials-on-plugin-development-tp3177821p3177821.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Dmitry Kan
Re: How to find whether solr server is running or not
Try doing this from a program rather than the browser. If Solr isn't running, you have to infer that fact from the lack of a response. Best Erick On Tue, Jul 19, 2011 at 7:42 AM, Romi romijain3...@gmail.com wrote: But the problem is when solr server is not runing *http://host:port/solr/admin/ping* will not give me any json response then how will i get the status :( when i run this url browser gives me following error *Unable to connect Firefox can't establish a connection to the server at 192.168.1.9:8983.* - Thanks Regards Romi -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-find-whether-solr-server-is-running-or-not-tp3181870p3182202.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to find whether solr server is running or not
I think anything but a 200 OK means it is dead, like the proverbial parrot :) François On Jul 19, 2011, at 7:42 AM, Romi wrote: But the problem is when solr server is not runing *http://host:port/solr/admin/ping* will not give me any json response then how will i get the status :( when i run this url browser gives me following error *Unable to connect Firefox can't establish a connection to the server at 192.168.1.9:8983.* - Thanks Regards Romi -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-find-whether-solr-server-is-running-or-not-tp3181870p3182202.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: - character in search query
Let's see the complete fieldType definition. Have you looked at your index with, say, Luke and seen what's actually in your index? And do you re-index after each schema change? What does your admin/analysis page look like? Have you considered PatternReplaceCharFilterFactory rather than the tokenizer? Best Erick On Tue, Jul 19, 2011 at 7:48 AM, roySolr royrutten1...@gmail.com wrote: Anybody? -- View this message in context: http://lucene.472066.n3.nabble.com/character-in-search-query-tp3168604p3182228.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Need Suggestion
Look at this link: http://wiki.apache.org/solr/DistributedSearch This will help when you have a large index. -----Original Message----- From: Rohit Gupta [mailto:ro...@in-rev.com] Sent: Friday, July 15, 2011 11:37 PM To: solr-user@lucene.apache.org Subject: Re: Need Suggestion I am using -Xms2g and -Xmx6g. What would be the ideal JVM size? Regards, Rohit From: Mohammad Shariq shariqn...@gmail.com To: solr-user@lucene.apache.org Sent: Fri, 15 July, 2011 7:27:38 PM Subject: Re: Need Suggestion Below are certain things to do to reduce search latency. 1) Do bulk inserts. 2) Commit after every ~5000 docs. 3) Do an optimize once a day. 4) In search queries, use the fq parameter. What is the size of the JVM you are using? On 15 July 2011 17:44, Rohit ro...@in-rev.com wrote: I am facing some performance issues on my Solr installation (3-core server). I am indexing live Twitter data based on certain keywords; as you can imagine, the rate at which documents are received is very high, so updates to the cores are very frequent and regular. Given below are the document counts on my three cores. Twitter - 26874747 Core2 - 3027800 Core3 - 6074253 My server has 8GB RAM, but now we are experiencing a performance drop. What can be done to improve this? Also, I have a few questions. 1. Does the number of commits take a lot of memory? Will reducing the number of commits per hour help? 2. Most of my queries are field- or date-faceting based; how do I improve those? Regards, Rohit Mobile: +91-9901768202 About Me: http://about.me/rohitg -- Thanks and Regards Mohammad Shariq
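The "commit after every ~5000 docs" advice can be packaged as a small buffering wrapper around whatever client call you use to post documents. A sketch; `send` here is a stand-in callback, not a real Solr client API:

```python
class BatchIndexer:
    """Buffer documents and send them in bulk; `send` is a stand-in
    for whatever client call posts a list of docs to Solr."""

    def __init__(self, send, batch_size=5000):
        self.send = send
        self.batch_size = batch_size
        self.buffer = []
        self.flushes = 0

    def add(self, doc):
        self.buffer.append(doc)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # One network round trip (and eventually one commit) per
        # batch instead of per document.
        if self.buffer:
            self.send(self.buffer)
            self.buffer = []
            self.flushes += 1
```

The point is that the cost of a commit is roughly fixed, so amortizing it over thousands of documents cuts the per-document overhead dramatically.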
Re: defType argument weirdness
On Jul 18, 2011, at 19:15, Naomi Dushay wrote: I found a weird behavior with the Solr defType argument, perhaps with respect to default queries? q={!defType=dismax}*:* hits. This is the confusing one: defType is a Solr request parameter, but not something that works as a local (inside the {!} brackets) parameter. Confusing, indeed. But just not how local params/defType work at the moment. So, with defType being ignored inside those curly brackets, you're getting the default lucene query parser. Check it out with debugQuery=true and see how the queries parse. Erik
Re: XInclude Multiple Elements
On Mon, Jul 18, 2011 at 8:06 PM, Chris Hostetter hossman_luc...@fucit.org wrote: Can you post the details of your JVM / ServletContainer and the full stack trace of the exception? My understanding is that fragment identifiers are a mandatory part of the xinclude/xpointer specs. It would also be good to know if you tried the explicit xpointer attribute approach on the xinclude syntax also mentioned in that thread... I think it would be something like... <xi:include href="solrconfigIncludes.xml" xpointer="xpointer(//requestHandler)"/> In general, Solr really isn't doing anything special with XInclude ... it's all just delegated to the XML libraries. You might want to start by ignoring Solr, reading up on XInclude/XPointer tutorials in general, and experimenting with command-line XML tools to figure out the syntax you need to get the final XML structures you want -- then apply that knowledge to the Solr config files. -Hoss This is running on Java 1.6.0_26 and Jetty 7.4.4.v20110707. The stack trace in the case of the use of the fragment is: 2011-07-13 18:52:42,953 [main] ERROR org.apache.solr.core.Config - Exception during parsing file: solrconfig.xml:org.xml.sax.SAXParseException: Fragment identifiers must not be used. The 'href' attribute value '../../conf/solrconfigIncludes.xml#xpointer(root/node())' is not permitted.
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
at com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.reportError(Unknown Source)
at com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.reportFatalError(Unknown Source)
at com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.handleIncludeElement(Unknown Source)
at com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.emptyElement(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at org.apache.solr.core.Config.init(Config.java:159)
at org.apache.solr.core.SolrConfig.init(SolrConfig.java:131)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:435)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207)
at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:130)
at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94)
at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:58)
at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:742)
at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:245)
at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1208)
at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:586)
at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:449)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:58)
at org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:89)
at org.eclipse.jetty.server.Server.doStart(Server.java:258)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:58)
at com.issinc.cidne.solr.App.main(App.java:41)
I did attempt the xpointer=xpointer(//requestHandler) syntax, and received this error: 2011-07-13 18:49:06,640 [main] WARN org.apache.solr.core.Config - XML parse warning in solrres:/solrconfig.xml, line 3, column 133:
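For reference, the explicit xpointer-attribute form discussed above would look roughly like this once the XInclude namespace is declared; this is a sketch only, and whether the JDK's bundled Xerces accepts the xpointer() scheme is exactly what this thread is probing:

```xml
<!-- Sketch: pull all requestHandler elements from an external file. -->
<config xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="solrconfigIncludes.xml"
              xpointer="xpointer(//requestHandler)"/>
</config>
```

Testing the same include with a standalone XML tool, outside Solr, narrows down whether the limitation is in the parser or in the config.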
Re: [Announce] Solr 3.3 with RankingAlgorithm NRT capability, very high performance 10000 tps
Yes, this problem has been solved, though not completely; there is still a refresh problem. To eliminate duplicate documents with a unique id during update, you need to set <maxBufferedDeleteTerms>1</maxBufferedDeleteTerms>. This makes the most recently updated document become searchable as well as removing the older documents. There is a catch though: if some of the fields in a document are different and this is updated, older content might show up as part of the results even though the query matches the most recent document content. I.e. if the most recent doc has afield set to <doc><afield>abc</afield></doc> and this is updated, and the old docs were <doc><afield>xyz</afield></doc>, at query time q=afield:abc matches, but the results may show <doc><afield>xyz</afield></doc>. I am still researching this. You can get more information about the performance and known issues here: http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver_3.x Regards, - Nagendra Nagarajayya http://solr-ra.tgels.org http://rankingalgorithm.tgels.org On 7/19/2011 1:21 AM, Andy wrote: Nagendra, In another email you mentioned there's a problem where if an existing document is updated both the old and new version will show up in search results. Has that been solved in Solr-RA 3.3? --- On Mon, 7/18/11, Nagendra Nagarajayya nnagaraja...@transaxtions.com wrote: From: Nagendra Nagarajayya nnagaraja...@transaxtions.com Subject: [Announce] Solr 3.3 with RankingAlgorithm NRT capability, very high performance 10000 tps To: solr-user@lucene.apache.org Date: Monday, July 18, 2011, 10:43 AM Hi! I would like to announce the availability of Solr 3.3 with RankingAlgorithm and Near Real Time (NRT) search capability. The NRT performance is very high, 10,000 documents/sec with the MBArtists 390k index. The NRT functionality allows you to add documents without the IndexSearchers being closed or caches being cleared. A commit is also not needed with the document update. Searches can run concurrently with document updates. No changes are needed except for enabling NRT through solrconfig.xml. RankingAlgorithm query performance is now 3x faster than before and is exposed through the Lucene API. This release also adds support for the last document with a unique id to be searchable and visible in search results in case of multiple updates of the document. I have a wiki page that describes NRT performance in detail; it can be accessed from here: http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver3.x You can download Solr 3.3 with RankingAlgorithm (NRT version) from here: http://solr-ra.tgels.org I would like to invite you to give this version a try as the performance is very high. Regards, - Nagendra Nagarajayya http://solr-ra.tgels.org http://rankingalgorithm.tgels.org
Re: Solr User Interface
Hi, You can send the wt param to Solr as follows: wt=json or wt=phps. In the first case, Solr results are returned in JSON format, and in the second case, in PHP serialized format. Regards. On 19/07/11 15:46, serenity keningston wrote: Hi, I installed Solr 3.2 and able to search results successfully from the crawled data, however, I would like to develop UI for the http or json response. Can anyone guide me with the tutorial or sample? I referred few thing like Ajax Solr but am not sure how to do the things. Serenity
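Once wt=json is set, the response body is plain JSON and any JSON parser can consume it. A sketch against a trimmed sample response (real responses carry more fields in responseHeader and in each doc):

```python
import json

# Trimmed sample of a wt=json select response.
SAMPLE = """{
  "responseHeader": {"status": 0, "QTime": 2},
  "response": {"numFound": 2, "start": 0,
    "docs": [{"id": "1", "name": "alpha"},
             {"id": "2", "name": "beta"}]}
}"""

def doc_names(raw):
    """Pull one field out of every returned document."""
    data = json.loads(raw)
    return [doc["name"] for doc in data["response"]["docs"]]
```

A UI layer then only has to render the list that comes back, which is why wt=json is the usual choice for browser front ends.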
Solr UI
Hi, I installed Solr 3.2 and am able to get search results successfully from the crawled data. However, I would like to develop a UI for the HTTP or JSON response. Can anyone guide me to a tutorial or sample? I referred to a few things like Ajax Solr but am not sure how to proceed. Serenity -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-UI-tp3182594p3182594.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Spatial Search with distance as a parameter
Hi Michael, It appears that you want to index circles (aka point-radius) and run a query with a point, matching documents where that point falls within an indexed circle. I'm working with a couple of Lucene/Solr committers on a geospatial module that can do this today against Solr 4 (trunk), but it's rough around the edges and needs testing. In the meantime, I suggest you look at this post from Spaceman Steve, in which he indexed lat-lon boxes and queried with a box. With a bit of creativity, you may be able to adapt it to your needs. http://lucene.472066.n3.nabble.com/intersecting-map-extent-with-solr-spatial-documents-tc3104098.html ~ David Smiley Author: http://www.packtpub.com/solr-1-4-enterprise-search-server/ On Jul 19, 2011, at 6:14 AM, Michael Lorz wrote: Hi all, I have the following problem: The documents in the index of my solr instance correspond to persons. Each document (=person) has lat/lon coordinates and additionally a travel radius. The coordinates correspond to the office of the person, the travel radius indicates a distance which the person is willing to travel. I would like to search for all persons which are willing to travel to a particular place (also given as lat/lon coordinates). In other words I have to do a query with the geofilt filter. The problem here: the distance parameter d cannot be defined in advance, but should correspond to the travel radius (which may be different for each person). Any ideas how this problem could be solved? Thanks in advance Michi
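Depending on the Solr version, another angle worth trying is a function-query filter: keep the travel radius as an indexed field and match documents whose radius is at least the distance to the query point. A sketch of the request parameters, assuming a travel_radius field and a location field used by geodist(); geodist, frange and sub are standard function-query pieces, but verify their availability against your version:

```python
from urllib.parse import urlencode

# The frange filter keeps documents where
# travel_radius - geodist() >= 0, i.e. the person is willing to
# travel at least as far as the distance from their office to the
# query point. Field names are assumptions.
params = {
    "q": "*:*",
    "sfield": "location",   # spatial field consulted by geodist()
    "pt": "49.45,11.07",    # the query point (lat,lon)
    "fq": "{!frange l=0}sub(travel_radius,geodist())",
}
query_string = urlencode(params)
```

This inverts the usual geofilt pattern: instead of one global d, the effective radius comes from each document.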
Re: Solr UI
There are several starting points for a Solr UI out there, but really the best choice is whatever fits your environment and the skills/resources you have handy. Here are a few off the top of my head:
* Blacklight - a Ruby on Rails full-featured search UI powered by Solr. It can be customized fairly easily to work with any arbitrary Solr schema, but by default it is kinda library-specific in its out-of-the-box experience. It powers UVa, Stanford, and other libraries and sites out there in production now - http://projectblacklight.org/
* Flare - the first prototype of Blacklight, fairly dusty and prototypical, but I still think a good example of how lean a search UI with a number of fancy features can be - http://wiki.apache.org/solr/Flare/HowTo
* Solritas/VelocityResponseWriter - built right into Solr; allows easy templating of Solr responses. It's the /browse interface out of the box. While probably not how someone would deploy a production search UI, it can make proofs of concept and getting up and running quite quick and easy - http://wiki.apache.org/solr/VelocityResponseWriter
And there's a new little tinkering I started a while back that might be good food for thought for the same sorts of ideas as the above, but in a slightly different direction - https://github.com/lucidimagination/Prism
Erik
On Jul 19, 2011, at 10:00, serenity wrote: Hi, I installed Solr 3.2 and able to search results successfully from the crawled data, however, I would like to develop UI for the http or json response. Can anyone guide me with the tutorial or sample? I referred few thing like Ajax Solr but am not sure how to do the things. Serenity -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-UI-tp3182594p3182594.html Sent from the Solr - User mailing list archive at Nabble.com.
query time boosting in solr
Hi, Is query-time boosting possible in Solr? Here is what I want to do: I want to boost the ranking of certain documents which have their relevant field values in a particular range (selected by the user at query time). When I do something like: http://localhost:8085/solr/select?indent=on&version=2.2&q=scientific+temper&fq=field1:[10%20TO%2030]&start=0&rows=10 - I guess it is just a filter over the normal results and not exactly a query. I tried this: http://localhost:8085/solr/select?indent=on&version=2.2&q=scientific+temper+field1:[10%20TO%2030]&start=0&rows=10 - This still worked and gave me different results, but I did not quite understand what the second query means. Does it mean: rank those documents with a field1 value in 10-30 better than those without? S -- Sowmya V.B. Losing optimism is blasphemy! http://vbsowmya.wordpress.com
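One common reading of the second URL above: with the default OR operator, the range clause becomes an optional term, so documents inside the range score higher while documents outside it can still match on the text terms. Making the boost explicit states that intent more clearly; field1 and the boost factor here are just illustrative values:

```python
from urllib.parse import urlencode

# The range clause as an optional, explicitly boosted term: docs
# with field1 in [10,30] rank higher, docs outside the range still
# match on the text terms (assumes the default OR operator).
params = {
    "q": "scientific temper field1:[10 TO 30]^2.0",
    "fl": "*,score",   # return the score to observe the boost
}
boosted_query = urlencode(params)
```

Running the same request with debugQuery=true shows exactly how each clause contributes to the score.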
Error 400 in Solr 1.4
Hi, I have a problem when I try to send the qt param to Solr 1.4 with dismax value. I get the following error from Solr response: HTTP ERROR: 400 undefined field price RequestURI=/solr/select Any idea? Regards.
Re: Error 400 in Solr 1.4
Just a hunch, ;), but I'm guessing you don't have a price field defined. qt is for selecting a request handler you have defined in your solrconfig.xml - you need to customize the parameters to your schema. Erik On Jul 19, 2011, at 04:32 , Yusniel Hidalgo Delgado wrote: Hi, I have a problem when I try to send the qt param to Solr 1.4 with dismax value. I get the following error from Solr response: HTTP ERROR: 400 undefined field price RequestURI=/solr/select Any idea? Regards.
Edismax and leading wildcards
My schema.xml currently has a content field and a content_rev field, which is the same field run through the reversed wildcard filter. My question is: does edismax support using this field? Reading through this JIRA (https://issues.apache.org/jira/browse/SOLR-1321), it seems to indicate that SolrQueryParser was updated to support such a field, but it feels like there should be something I'd need to configure to let Solr know that, when doing wildcard queries on the content field, content_rev should be used if it has a lower cost. Is that not the case?
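For what it's worth, the usual setup from SOLR-1321 puts ReversedWildcardFilterFactory on the index-time analyzer of the field you actually query, and the query parser detects it from the schema rather than from a separate cost setting. A rough sketch of such a fieldType; the attribute values are illustrative, not tuned:

```xml
<!-- Sketch only: a field type whose index analyzer also stores
     reversed tokens so leading wildcards stay cheap. -->
<fieldType name="text_rev" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ReversedWildcardFilterFactory"
            withOriginal="true"
            maxPosAsterisk="3" maxPosQuestion="2"
            maxFractionAsterisk="0.33"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
```

A field using a type like this handles leading wildcards itself, so a separate content_rev field matters only if you want to keep the reversed terms out of the main content field's index.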
Re: Error 400 in Solr 1.4
Thanks Erik for your quick reply. You are right: in my solrconfig.xml file, I had a wrong configuration option. Thanks again. On 19/07/11 16:37, Erik Hatcher wrote: Just a hunch, ;), but I'm guessing you don't have a price field defined. qt is for selecting a request handler you have defined in your solrconfig.xml - you need to customize the parameters to your schema. Erik On Jul 19, 2011, at 04:32, Yusniel Hidalgo Delgado wrote: Hi, I have a problem when I try to send the qt param to Solr 1.4 with dismax value. I get the following error from Solr response: HTTP ERROR: 400 undefined field price RequestURI=/solr/select Any idea? Regards.
Re: How to find whether solr server is running or not
I am new to Hadoop testing. Can anybody tell me about Hadoop testing: what should be sufficient for a tester to test Hadoop-based projects? Please help me. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-find-whether-solr-server-is-running-or-not-tp3181870p3182201.html Sent from the Solr - User mailing list archive at Nabble.com.
delta import exception
Hi, I am trying to trace the exception I get from the deletedPkQuery I am running. When I kick off the delta-import, the statusMessage has the following after 2 hours, but not a single document was modified or deleted: <str name="Total Rows Fetched">2813450</str> and then it bailed out when I submitted another heavy query at the mysql prompt. Does it mean that the importer was still trying to identify documents to update/delete? It might be because the deletedPkQuery takes too long to return the documents? Where is the source code for getNext()? I could not find it under apache-solr-1.4.0/src/java/org/apache/solr/handler Caused by: java.io.EOFException: Can not read response from server. Expected to read 4 bytes, read 0 bytes before connection was unexpectedly lost. at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:2455) at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2906) ... 22 more Jul 19, 2011 11:35:00 AM org.apache.solr.handler.dataimport.EntityProcessorBase getNext SEVERE: getNext() failed for query ' ***removed***' org.apache.solr.handler.dataimport.DataImportHandlerException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: The last packet successfully received from the server was 0 milliseconds ago. The last packet sent successfully to the server was 9042 milliseconds ago, which is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem.
at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:64) at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.hasnext(JdbcDataSource.java:339) at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.access$700(JdbcDataSource.java:228) at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator$1.hasNext(JdbcDataSource.java:262) at org.apache.solr.handler.dataimport.EntityProcessorBase.getNext(EntityProcessorBase.java:78) at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextDeletedRowKey(SqlEntityProcessor.java:93) at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextDeletedRowKey(EntityProcessorWrapper.java:258) at org.apache.solr.handler.dataimport.DocBuilder.collectDelta(DocBuilder.java:636) at org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:258) at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:172) at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:352) at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:391) at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370) Elaine
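On the source question: getNext() lives in org.apache.solr.handler.dataimport.EntityProcessorBase, which ships as a contrib module (contrib/dataimporthandler in the 1.4 source tree), not under src/java. On the timeout itself, the driver's own advice from the exception can be tried straight from data-config.xml; a sketch, with connection details as placeholders:

```xml
<dataConfig>
  <!-- autoReconnect follows the Connector/J hint in the exception;
       batchSize="-1" makes the MySQL driver stream rows instead of
       buffering the whole result set in memory -->
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
      url="jdbc:mysql://dbhost:3306/mydb?autoReconnect=true"
      batchSize="-1" readOnly="true"
      user="solr" password="secret"/>
  <!-- entities as before -->
</dataConfig>
```

Streaming matters for a deletedPkQuery that fetches millions of keys; raising wait_timeout on the MySQL side is the other half of the fix.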
RE: Searching for strings
Thanks Rob. It turns out this was a false alarm; I was misinterpreting a different problem with my crawl. -Original Message- From: Rob Casson [mailto:rob.cas...@gmail.com] Sent: Monday, July 18, 2011 5:58 PM To: solr-user@lucene.apache.org Subject: Re: Searching for strings chip, gonna need more information about your particular analysis chain, content, and example searches to give a better answer, but phrase queries (using quotes) are supported in both the standard and dismax query parsers that being said, lots of things may not match a person's idea of an exact string...stopwords, synonyms, slop, etc. cheers, rob On Mon, Jul 18, 2011 at 5:25 PM, Chip Calhoun ccalh...@aip.org wrote: Is there a way to search for a specific string using Solr, either by putting it in quotes or by some other means? I haven't been able to do this, but I may be missing something. Thanks, Chip
Re: Data Import from a Queue
Let me provide some more details to the question: I was unable to find any example implementations where individual documents (single document per message) are read from a message queue (like ActiveMQ or RabbitMQ) and then added to Solr via SolrJ, a HTTP POST or another method. Does anyone know of any available examples for this type of import? If no examples exist, what would be a recommended commit strategy for performance? My best guess for this would be to have a queue per core and commit once the queue is empty. Thanks. On Mon, Jul 18, 2011 at 6:52 PM, Erick Erickson erickerick...@gmail.com wrote: This is a really cryptic problem statement. you might want to review: http://wiki.apache.org/solr/UsingMailingLists Best Erick On Fri, Jul 15, 2011 at 1:52 PM, Brandon Fish brandon.j.f...@gmail.com wrote: Does anyone know of any existing examples of importing data from a queue into Solr? Thank you.
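There doesn't seem to be a canonical example for this. As a sketch of the "commit once the queue is empty" strategy, here is a toy drain loop in Python, with plain callables standing in for the Solr client (a real consumer would call SolrJ's add/commit or POST to /update; the names here are illustrative):

```python
import queue

def drain_and_index(q, add, commit):
    """Drain the message queue, adding one document per message,
    then issue a single commit once the queue is empty."""
    added = 0
    while True:
        try:
            doc = q.get_nowait()   # a real consumer would block with a timeout
        except queue.Empty:
            break
        add(doc)                   # stand-in for solr.add(doc) / HTTP POST
        added += 1
    if added:
        commit()                   # one commit per drained batch, not per doc
    return added
```

Batching like this keeps the commit frequency low, which matters because each commit opens a new searcher and invalidates Solr's caches.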
Re: defType argument weirdness
qf_dismax and pf_dismax are irrelevant -- I shouldn't have included that info. They are passed in the url and they work; they do not affect this problem. Your reminder of debugQuery was a good one - I use that a lot but forgot in this case. Regardless, I thought that defType=dismax&q=*:* is supposed to be equivalent to q={!defType=dismax}*:* and also equivalent to q={!dismax}*:*

defType=dismax&q=*:* DOESN'T WORK
<str name="rawquerystring">*:*</str>
<str name="querystring">*:*</str>
<str name="parsedquery">+() ()</str>
<str name="parsedquery_toString">+() ()</str>

leaving out the explicit query, defType=dismax WORKS
<null name="rawquerystring"/>
<null name="querystring"/>
<str name="parsedquery">+MatchAllDocsQuery(*:*)</str>
<str name="parsedquery_toString">+*:*</str>

q={!dismax}*:* DOESN'T WORK
<str name="rawquerystring">*:*</str>
<str name="querystring">*:*</str>
<str name="parsedquery">+() ()</str>
<str name="parsedquery_toString">+() ()</str>

leaving out the explicit query, q={!dismax} WORKS
<str name="rawquerystring">{!dismax}</str>
<str name="querystring">{!dismax}</str>
<str name="parsedquery">+MatchAllDocsQuery(*:*)</str>
<str name="parsedquery_toString">+*:*</str>

q={!defType=dismax}*:* WORKS
<str name="rawquerystring">{!defType=dismax}*:*</str>
<str name="querystring">{!defType=dismax}*:*</str>
<str name="parsedquery">MatchAllDocsQuery(*:*)</str>
<str name="parsedquery_toString">*:*</str>

leaving out the explicit query, q={!defType=dismax} DOESN'T WORK
org.apache.lucene.queryParser.ParseException: Cannot parse '': Encountered "<EOF>" at line 1, column 0.

On Jul 18, 2011, at 5:44 PM, Erick Erickson wrote: What are qf_dismax and pf_dismax? They are meaningless to Solr. Try adding debugQuery=on to your URL and you'll see the parsed query, which helps a lot here. If you change these to the proper dismax values (qf and pf) you'll get better results.
As it is, I think you'll see output like: <str name="parsedquery">+() ()</str> showing that your query isn't actually going against any fields. Best Erick On Mon, Jul 18, 2011 at 7:15 PM, Naomi Dushay ndus...@stanford.edu wrote: I found a weird behavior with the Solr defType argument, perhaps with respect to default queries?

defType=dismax&q=*:* no hits
q={!defType=dismax}*:* hits
defType=dismax hits

Here is the request handler, which I explicitly indicate:

<requestHandler name="search" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">lucene</str>
    <!-- lucene params -->
    <str name="df">has_model_s</str>
    <str name="q.op">AND</str>
    <!-- dismax params -->
    <str name="mm">2<-1 5<-2 6<90%</str>
    <str name="q.alt">*:*</str>
    <str name="qf_dismax">id^0.8 id_t^0.8 title_t^0.3 mods_t^0.2 text</str>
    <str name="pf_dismax">id^0.9 id_t^0.9 title_t^0.5 mods_t^0.2 text</str>
    <int name="ps">100</int>
    <float name="tie">0.01</float>
  </lst>
</requestHandler>

Solr Specification Version: 1.4.0 Solr Implementation Version: 1.4.0 833479 - grantingersoll - 2009-11-06 12:33:40 Lucene Specification Version: 2.9.1 Lucene Implementation Version: 2.9.1 832363 - 2009-11-03 04:37:25 - Naomi
Question on the appropriate software
Greetings, I'm interested in having a server-based personal document library with a few specific features, and I'm trying to determine the most appropriate tools to build it. I have the following content which I wish to include in the archive: 1. A smallish collection of technical books in PDF format (around 100) 2. Many years of several different magazine subscriptions in PDF format (probably another 100 - 200 PDFs) 3. Several years of personal documents which were scanned in and converted to searchable PDF format (300 - 500 documents) 4. I also have local mirrors of several HTML-based reference sites. I'd like the ability to index all of this content and search it from a web form (so that I and a few others can reach it from multiple locations). Here are two examples of the functionality I'm looking for: Scenario 1. What was that software that has all the nutritional data and hooks up to some USDA database? I know I read about it in one of my Linux Journals last year. Now I'd like to be able to pull up the webform and search for nutrition USDA. I'd like to restrict the search to the Linux Journal magazine PDFs (or refine the results). I'd like results to contain context snippets with each search result. Finally, most importantly, I'd like multiple results per PDF (or all occurrences). The last one is important so that I can actually quickly find the right issue (in case there is some advertisement in every issue for the last year that contains those terms). When I click on the desired result, the PDF is downloaded by my browser. Scenario 2. How much have I been paying for property taxes for the last five years again? (the bills are all scanned in) In this case I'd like to search for my property identification number (which is on the bills) and the results should show all the documents that have it, with context. Clicking on results downloads the documents. I assume this example is simple to achieve if example 1 can be done.
So in general, my question is - can this be done in a fairly straight forward manner with Solr? Is there a more appropriate tool to be using (e.g. Nutch?). Also, I have looked high and low for a free, already baked solution which can do scenario 1 but haven't been able to find something - so if someone knows of such a thing, please let me know. Thanks! -Matt
use case: structured DB records with a bunch of related files
Greetings. I have a bunch of highly structured DB records, and I'm pretty clear on how to index those. However, each of those records may have any number of related documents (Word, Excel, PDF, PPT, etc.). All of this information will change over time. Can someone point me to a use case or some good reading to get me started on configuring Solr to index the DB records and files in such a way as to relate the two types of information? By relate, I mean that if there's a hit in a related file, then I need to show the user a link to the DB record as well as a link to the file. Thanks in advance. cheers, Travis -- Travis Low, Director of Development t...@4centurion.com Centurion Research Solutions, LLC 14048 ParkEast Circle, Suite 100, Chantilly, VA 20151 703-956-6276 / 703-378-4474 (fax) http://www.centurionresearch.com
RE: Analysis page output vs. actually getting search matches, a discrepency?
Thanks Erick, Unfortunately I'm stemming the same on both sides, similar to the SOLR example settings for the text type field. Default search field is moreWords, as I want, yes. I don't have this problem for any other mfg names at all in our index of almost 10 mm product docs, and this shows that it should match, in my best estimation. Note: LucidKStemFilterFactory does not take 'Sterling' down to 'Sterl' in indexing nor searching; it stays as 'Sterling'. I have given up on this. I've decided it is just an unexplainable anomaly, and have solved it by inserting a LucidKStemFilterFactory and just modifying that word to its searchable form before hitting the WhitespaceTokenizerFactory, which is kind of hackish but solves my problem at least. This seller only has a couple hundred cheap products on our site, so I have bigger fish to fry at this point. I've wasted too much time trying to chase this down. Cheers all Robi -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Monday, July 18, 2011 5:33 PM To: solr-user@lucene.apache.org Subject: Re: Analysis page output vs. actually getting search matches, a discrepency? Hmmm, is there any chance that you're stemming one place and not the other? And I infer from your output that your default search field is moreWords, is that true and expected? You might use luke or the TermsComponent to see what's actually in the index. I'm going to guess that you'll find sterl but not sterling as an indexed term and your problem is stemming, but that's a shot in the dark. Best Erick On Mon, Jul 18, 2011 at 5:37 PM, Robert Petersen rober...@buy.com wrote: OK I did what Hoss said, it only confirms I don't get a match when I should and that the query parser is doing the expected. Here are the details for one test sku. My analysis page output is shown in my email starting this thread and here is my query debug output. This absolutely should match but doesn't.
Both the indexing side and the query side are splitting on case changes. This actually isn't a problem for any of our other content, for instance there is no issue searching for 'VideoSecu'. Their products come up fine in our searches regardless of casing in the query. Only SterlingTek's products seem to be causing us issues.

Indexed content has camel case, stored in the text field 'moreWords': SterlingTek's NB-2LH 2 Pack Batteries + Charger Combo for Canon DC301
Search term not matching with camel case: SterlingTek's
Search term matching if no case changes: Sterlingtek's

Indexing:
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" preserveOriginal="0"/>

Searching:
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1" preserveOriginal="0"/>

Thanks

http://ssdevrh01.buy.com:8983/solr/1/select?indent=on&version=2.2&q=SterlingTek%27s&fq=&start=0&rows=1&fl=*%2Cscore&qt=standard&wt=standard&debugQuery=on&explainOther=sku%3A216473417&hl=on&hl.fl=&echoHandler=true

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">4</int>
<str name="handler">org.apache.solr.handler.component.SearchHandler</str>
<lst name="params">
<str name="explainOther">sku:216473417</str>
<str name="indent">on</str>
<str name="echoHandler">true</str>
<str name="hl.fl"/>
<str name="wt">standard</str>
<str name="hl">on</str>
<str name="rows">1</str>
<str name="version">2.2</str>
<str name="fl">*,score</str>
<str name="debugQuery">on</str>
<str name="start">0</str>
<str name="q">SterlingTek's</str>
<str name="qt">standard</str>
<str name="fq"/>
</lst>
</lst>
<result name="response" numFound="0" start="0" maxScore="0.0"/>
<lst name="highlighting"/>
<lst name="debug">
<str name="rawquerystring">SterlingTek's</str>
<str name="querystring">SterlingTek's</str>
<str name="parsedquery">PhraseQuery(moreWords:"sterling tek")</str>
<str name="parsedquery_toString">moreWords:"sterling tek"</str>
<lst name="explain"/>
<str name="otherQuery">sku:216473417</str>
<lst name="explainOther">
<str name="216473417">
0.0 = fieldWeight(moreWords:"sterling tek" in 76351), product of:
  0.0 = tf(phraseFreq=0.0)
  19.502613 = idf(moreWords: sterling=1 tek=72)
  0.15625 = fieldNorm(field=moreWords, doc=76351)
</str>
</lst>
<str name="QParser">LuceneQParser</str>
<arr name="filter_queries"><str/></arr>

-Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Friday, July 15, 2011 4:36 PM To: solr-user@lucene.apache.org Subject: Re: Analysis page output vs. actually getting search matches, a discrepency? : Subject: Analysis page output vs. actually getting search matches, : a discrepency? 99% of the time when people ask questions like this, it's because of confusion about how/when QueryParsing comes into play (as opposed to
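For what it's worth, the phrase "sterling tek" in the parsed query is what WordDelimiterFilter's splitOnCaseChange plus its default English-possessive stripping would produce from SterlingTek's. A toy Python emulation (very rough; the real filter also handles catenation, number parts, preserveOriginal, and more) illustrates how both analyzer chains arrive at those two words:

```python
import re

def toy_word_delimiter(token):
    """Rough emulation of WordDelimiterFilterFactory with
    splitOnCaseChange=1 plus the default English possessive
    stripping. Not the real algorithm, just an illustration."""
    token = re.sub(r"'s$", "", token)                    # strip trailing 's
    token = re.sub(r"([a-z])([A-Z])", r"\1 \2", token)   # split on case change
    parts = re.split(r"[^0-9A-Za-z]+", token)            # split on delimiters
    return [p.lower() for p in parts if p]
```

toy_word_delimiter("SterlingTek's") gives ['sterling', 'tek'], matching the PhraseQuery in the debug output, while the no-case-change variant "Sterlingtek's" stays a single token, matching the report that only the camel-case form fails.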
Re: DIH full-import - when is commit() actally triggered?
Ahmet Arslan wrote: I am running a full import with a quite plain data-config (a root entity with three sub-entities) from a JDBC datasource. This import is expected to add approximately 10 mio documents. What I now see from my logfiles is that a newSearcher event is fired about every five seconds. This is triggered by autoCommit every 300,000 milliseconds. You need to remove <maxTime>300000</maxTime> to disable this mechanism. Thanks Ahmet, indeed I had to remove the maxDocs entry. So now a commit happens only every five minutes. -- with kind regards, Frank Wesemann Fotofinder GmbH USt-IdNr. DE812854514 Software Entwicklung Web: http://www.fotofinder.com/ Potsdamer Str. 96 Tel: +49 30 25 79 28 90 10785 Berlin Fax: +49 30 25 79 28 999 Sitz: Berlin Amtsgericht Berlin Charlottenburg (HRB 73099) Geschäftsführer: Ali Paczensky
Geospatial queries in Solr
I have looked at the code being shared on the lucene-spatial-playground and was wondering if anyone could provide some details as to its state. Specifically I'm looking to add geospatial support to my application based on a user provided polygon, is this currently possible using this extension?
Re: Using FieldCache in SolrIndexSearcher - crazy idea?
: Quite probably ... you typically can't assume that a FieldCache can be : constructed for *any* field, but it should be a safe assumption for the : uniqueKey field, so for that initial request of the multiphase distributed : search it's quite possible it would speed things up. : : Ah, thanks Hoss - I had meant to respond to the original email, but : then I lost track of it. : : Via pseudo-fields, we actually already have the ability to retrieve : values via FieldCache. : fl=id:{!func}id isn't that kind of orthogonal to the question though? ... a user can use the new pseudo-field functionality to request values from the FieldCache instead of stored fields, but specifically in the case of distributed search, when the first request is only asking for the uniqueKey values and scores, shouldn't that use the FieldCache to get those values? (w/o the user needing to jump through hoops in how the request is made/configured) -Hoss
Re: Using FieldCache in SolrIndexSearcher - crazy idea?
On Tue, Jul 19, 2011 at 3:20 PM, Chris Hostetter hossman_luc...@fucit.org wrote: [quoted text trimmed; see the previous message] Well, I was pointing out that distributed search could be easily modified to use the field-cache by changing id to id:{!func}id But I'm not sure we should do that by default - the memory of a full fieldCache entry is non-trivial for some people. Using a CSF id field would be better I think (the type where it doesn't populate a fieldcache entry). -Yonik http://www.lucidimagination.com
Re: Geospatial queries in Solr
Hi Jamie. I work on LSP; it can index polygons and query for them. Although the capability is there, we have more testing & benchmarking to do, and then we need to put together a tutorial to explain how to use it at the Solr layer. I recently cleaned up the READMEs a bit. Try downloading the trunk codebase and follow the README. It points to another README which shows off a demo webapp. At the conclusion of this, you'll need to examine the tests and webapp a bit to figure out how to apply it in your app. We don't yet have a tutorial, as the framework has been in flux, although it has stabilized a good deal. Oh... by the way, this works off of Lucene/Solr trunk. Within the past week there was a major change to trunk and LSP won't compile until we make updates. Either Ryan McKinley or I will get to that by the end of the week. So unless you have access to 2-week-old maven artifacts of Lucene/Solr, you're stuck right now. ~ David Smiley Author: http://www.packtpub.com/solr-1-4-enterprise-search-server/ On Jul 19, 2011, at 3:03 PM, Jamie Johnson wrote: [quoted text trimmed; see the original message above]
RE: Analysis page output vs. actually getting search matches, a discrepency?
Um, sorry for any confusion. I meant to say I solved my issue by inserting a charFilter before the WhitespaceTokenizerFactory to convert my problem word to a searchable form. I had a cut-n-paste malfunction below. Thanks guys. -Original Message- From: Robert Petersen [mailto:rober...@buy.com] Sent: Tuesday, July 19, 2011 11:06 AM To: solr-user@lucene.apache.org Subject: RE: Analysis page output vs. actually getting search matches, a discrepency? [quoted text trimmed; see the earlier messages in this thread]
Re: Solr Request Logging
: I am using the trunk version of solr and I am getting a ton more logging : information than I really care to see and definitely more than 1.4, but : I cant really see a way to change it. http://wiki.apache.org/solr/SolrLogging -Hoss
Using functions in fq
My documents have two prices, retail_price and current_price. I want to get products which have a sale of x%; the x is dynamic and can be specified by the user. I was trying to achieve this by using fq. If I want all Sony TVs that are at least 20% off, I want to write something like q=sony tv&fq=current_price:[0 TO product(retail_price,0.80)] but this does not work, as a function is not accepted in fq. How else can I achieve this? Thanks
Re: Using functions in fq
On Tue, Jul 19, 2011 at 6:49 PM, solr nps solr...@gmail.com wrote: My documents have two prices retail_price and current_price. I want to get products which have a sale of x%, the x is dynamic and can be specified by the user. I was trying to achieve this by using fq. If I want all sony tv's that are at least 20% off, I want to write something like q=sony tv&fq=current_price:[0 TO product(retail_price,0.80)] this does not work as the function is not expected in fq. how else can I achieve this? The frange query parser may do what you want. http://www.lucidimagination.com/blog/2009/07/06/ranges-over-functions-in-solr-14/ fq={!frange l=0 u=0.8}div(current_price, retail_price) -Yonik http://www.lucidimagination.com
Re: omitNorms and omitTermFreqAndPosition
As a general rule, if you are looking at the score explanations from debugQuery and you don't understand why you get the scores that you do, then you should actually send the score explanations along with your email when you ask why it doesn't match what you expect. In the absence of any other information to go on, I'm going to guess that the reason for the different scores is that category may be a multiValued field, and some docs are matching multiple clauses of your query -- so the coord factor of the boolean query comes into play (rewarding docs for matching multiple clauses) ... but as I said, I can't be certain because you didn't actually tell us what the score explanation said. Assuming I'm right, and assuming you want all the docs to score the same, or for the score to be driven by some other factor besides the relevancy of the query you are sending, then another general rule comes into play: if you don't care about the score of a query, then that query probably makes more sense as a filter: fq=category:(X OR Y OR Z)&q=...whatever, maybe *:*... : i have a problem with omitTermFreqAndPosition and omitNorms. : In my schema i have some fields with these property set True. : for example the field category : : then i make a query like: : select?q=category:(x OR y or Z) : : it returns all docs that have as category x or y or z. : : i make a debugQuery=on to see the score and i see every docs have different : score. : why? the tf is calculated and, also normalization. why? they should be have : the same score.. : cause it's not a full-text search but i search only docs that are inside a : group. stop : Thank you very much -Hoss
Re: Using functions in fq
I read about frange but didn't think about using it like you mentioned :) Thank you. On Tue, Jul 19, 2011 at 4:12 PM, Yonik Seeley yo...@lucidimagination.com wrote: [quoted text trimmed; see the previous message]
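To make the mapping from "x% off" to the frange bound explicit, here is a tiny helper (field names taken from the thread; the filter-string format follows Yonik's example):

```python
def discount_filter(min_pct_off):
    """Return an fq matching docs discounted by at least min_pct_off
    percent, i.e. current_price / retail_price <= 1 - x/100."""
    upper = 1.0 - min_pct_off / 100.0
    return "{!frange l=0 u=%.2f}div(current_price,retail_price)" % upper
```

The l=0 lower bound keeps out degenerate negative ratios; a user-supplied "at least 20% off" becomes an upper bound of 0.80 on the price ratio.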
solr chewing up system swap
I have arrived at a site where solr is being run under jetty. It is ubuntu 10.04 i386 hosted on AWS (xen). Our combined solr index size is a mere 21 MB. What I am seeing is that solr is steadily consuming about 150 MB of swap per week and won't relinquish it until sunspot is restarted. Oddly, Jetty doesn't seem to have any memory parameters to speak of supplied to it, which may very well be the problem in that no garbage collection is taking place, but I wanted to see if anyone else who uses solr/jetty has encountered this and whether they added some memory parameters to jetty's java args. Thanks, Matthew -- View this message in context: http://lucene.472066.n3.nabble.com/solr-chewing-up-system-swap-tp3184083p3184083.html Sent from the Solr - User mailing list archive at Nabble.com.
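For reference, the example jetty that ships with Solr takes heap settings on the java command line; without an explicit -Xmx the JVM picks its own default, which may be larger than you want on a small instance. A typical invocation looks like the following (the values are illustrative, not a recommendation; tune to your box):

```shell
# from the directory containing start.jar (Solr's example jetty)
java -Xms128m -Xmx512m -XX:+UseConcMarkSweepGC -jar start.jar
```

Capping the heap won't stop the OS from swapping on its own, but it does bound how much memory the JVM can grow into before garbage collection kicks in.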
Re: any detailed tutorials on plugin development?
Gosh, sorry for my typo in the first msg... I just realized it now. Well, anyway... I would like to find a detailed tutorial about how to implement an analyzer or a request handler plugin, but I have got nothing from the documentation on the Solr wiki... - Zeki ama calismiyor... Calissa yapar... -- View this message in context: http://lucene.472066.n3.nabble.com/any-detailed-tutorials-on-plugin-development-tp3177821p3184160.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: defType argument weirdness
On Tue, Jul 19, 2011 at 1:25 PM, Naomi Dushay ndus...@stanford.edu wrote: Regardless, I thought that defType=dismaxq=*:* is supposed to be equivalent to q={!defType=dismax}*:* and also equivalent to q={!dismax}*:* Not quite - there is a very subtle distinction. {!dismax} is short for {!type=dismax}, the type of the actual query, and this may not be overridden. The defType local param is only the default type for sub-queries (as opposed to the current query). It's useful in conjunction with the query or nested query qparser: http://lucene.apache.org/solr/api/org/apache/solr/search/NestedQParserPlugin.html -Yonik http://www.lucidimagination.com
Re: How could I monitor solr cache
I am working on dev performance tuning. I am looking for a method that could record cache status into log files. On Tue, Jul 19, 2011 at 2:24 PM, Ahmet Arslan iori...@yahoo.com wrote: I am wondering how could I get solr cache running status. I know there is a JMX containing those information. Just want to know what tool or method do you make use of to monitor cache, in order to enhance performance or detect issue. You might find this interesting : http://sematext.com/spm/solr-performance-monitoring/index.html http://sematext.com/spm/index.html
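Besides JMX, Solr 3.x exposes the same cache statistics over HTTP on the admin stats page, so a cron-style poller can append them to a log file. A minimal sketch in Python; the URL path and the stat keys (hits, lookups, etc.) are assumptions from memory of Solr 3.x and should be verified against your install:

```python
import time
import urllib.request

# Assumed Solr 3.x stats endpoint; verify the path on your install.
STATS_URL = "http://localhost:8983/solr/admin/stats.jsp"

def format_cache_line(ts, cache_name, stats):
    """Render one grep-friendly log line per cache per poll."""
    fields = " ".join(f"{k}={stats[k]}" for k in sorted(stats))
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S", time.gmtime(ts))
    return f"{stamp} {cache_name} {fields}"

def poll_once(logfile="cache-stats.log"):
    raw = urllib.request.urlopen(STATS_URL).read()
    # Parse the queryResultCache / filterCache / documentCache entries out
    # of `raw` here (XML in 3.x), then append one line per cache:
    # with open(logfile, "a") as f:
    #     f.write(format_cache_line(time.time(), name, stats) + "\n")
```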
Re: I found a sorting bug in solr/lucene
According to that bug list, there are other characters that break the sorting function. Is there a list of safe characters I can use as a delimiter? On Mon, Jul 18, 2011 at 1:31 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : When I try to sort by a column with a colon in it like : scores:rails_f, solr has cutoff the column name from the colon : forward so scores:rails_f becomes scores Yes, this bug was recently reported against the 3.x line, but no fix has yet been identified... https://issues.apache.org/jira/browse/SOLR-2606 : Can anyone else confirm this is a bug? Is this in lucene or solr? I believe : the issue resides in solr. it's specific to the param parsing, likely due to the addition of support for functions in the sort param. -Hoss -- - sent from my mobile 6176064373
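I don't know of an official whitelist, but the safe characters are the ones with no meaning in Solr's query/sort syntax: sticking to [A-Za-z0-9_] for field names avoids ':' (field/value separator), ',' (sort-clause separator), and whitespace (which separates the field from asc/desc). A small hedged helper for sanitizing dynamically generated field names:

```python
import re

def safe_field_name(name):
    """Replace anything outside [A-Za-z0-9_] with '_' so the name is
    safe to use in the sort param (and in queries generally)."""
    return re.sub(r"[^A-Za-z0-9_]", "_", name)

print(safe_field_name("scores:rails_f"))  # scores_rails_f
```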
embedded solrj doesn't refresh index
Hi, I am using embedded SolrJ. After I add a new doc to the index, I can see the change through the Solr web interface, but not from embedded SolrJ. But after I restart the embedded SolrJ instance, I do see the change. It works as if there were a cache. Does anyone know the problem? Thanks. Jianbin
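Not from the thread, but the usual explanation for this symptom: a Lucene/Solr searcher is a snapshot taken when it was opened, and an embedded instance will not see commits made through another path (e.g. the web app writing to the same index directory) until its own searcher is reopened. A toy model of that visibility rule, in plain Python with no Solr dependency:

```python
class ToyIndex:
    """Toy model of commit/searcher visibility: a searcher sees only the
    committed state at the time it was (re)opened, like Lucene's IndexSearcher."""
    def __init__(self):
        self._committed = []
        self._pending = []

    def add(self, doc):
        self._pending.append(doc)

    def commit(self):
        self._committed.extend(self._pending)
        self._pending = []

    def open_searcher(self):
        # A searcher is a frozen snapshot of the committed docs.
        return list(self._committed)

idx = ToyIndex()
stale_searcher = idx.open_searcher()  # opened before the update
idx.add("new doc")
idx.commit()
print("new doc" in stale_searcher)       # stale snapshot, like embedded solrj before restart
print("new doc" in idx.open_searcher())  # reopening (or restarting) picks up the commit
```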
how to get solr core information using solrj
hi all, Our solr server contains two cores, core0 and core1, and they both work well. Now I'm trying to find a way to get information about core0 and core1. Can solrj or some other api do this? thanks very much.
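SolrJ does have a route for this via the CoreAdmin STATUS action (CoreAdminRequest.getStatus in SolrJ), and the same information is available over plain HTTP from /solr/admin/cores?action=STATUS. A small sketch building that request in Python; the base URL is an assumption for illustration:

```python
from urllib.parse import urlencode

def core_status_url(base="http://localhost:8983/solr", core=None):
    """Build a CoreAdmin STATUS request; omit 'core' to list all cores."""
    params = {"action": "STATUS", "wt": "json"}
    if core is not None:
        params["core"] = core
    return f"{base}/admin/cores?{urlencode(sorted(params.items()))}"

print(core_status_url(core="core0"))
print(core_status_url())  # status of every core, core0 and core1 included
```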
RE: defType argument weirdness
Is it generally recognized that this terminology is confusing, or is it just me? I do understand what they do (at least well enough to use them), but I find it confusing that it's called defType as a main param, but type in a LocalParam, when to me they both seem to do the same thing -- which I think should probably be called 'queryParser' rather than 'type' or 'defType'. That's what they do, choose the query parser for the query they apply to, right? (And if they did/do different things, 'defType' vs 'type' doesn't really provide much hint as to what!) These are both the same, right, but with different param names depending on position: defType=lucene&q=foo q={!type=lucene}foo # uri escaping not shown (and then there's 'qt', often confused with defType/type by newbies, since they guess it stands for 'query type', but which should probably actually have been called 'requestHandler'/'rh' instead, since that's what it actually chooses, no? It gets very confusing). If it's generally recognized it's confusing and perhaps a somewhat inconsistent mental model being implied, I wonder if there'd be any interest in renaming these to be more clear, leaving the old ones as aliases/synonyms for backwards compatibility (perhaps with a long deprecation period, or perhaps existing forever). I know it was very confusing to me to keep track of these parameters and what they did for quite a while, and still trips me up from time to time. Jonathan From: ysee...@gmail.com [ysee...@gmail.com] on behalf of Yonik Seeley [yo...@lucidimagination.com] Sent: Tuesday, July 19, 2011 9:40 PM To: solr-user@lucene.apache.org Subject: Re: defType argument weirdness On Tue, Jul 19, 2011 at 1:25 PM, Naomi Dushay ndus...@stanford.edu wrote: Regardless, I thought that defType=dismax&q=*:* is supposed to be equivalent to q={!defType=dismax}*:* and also equivalent to q={!dismax}*:* Not quite - there is a very subtle distinction. 
{!dismax} is short for {!type=dismax}, the type of the actual query, and this may not be overridden. The defType local param is only the default type for sub-queries (as opposed to the current query). It's useful in conjunction with the query or nested query qparser: http://lucene.apache.org/solr/api/org/apache/solr/search/NestedQParserPlugin.html -Yonik http://www.lucidimagination.com
Re: How to find whether solr server is running or not
François Schiettecatte, how will I get the 200 status code? I am getting a JSON response from the Solr server, as *$.getJSON(http://192.168.1.9:8983/solr/db/select/?qt=dismax&wt=json&start=0&rows=100&q=elegant&hl=true&hl.fl=text&hl.usePhraseHighlighter=true&sort=score&json.wrf=?, function(result){ * but if the Solr server is not running this code does not execute... how can I check then that the server is not running? - Thanks Regards Romi -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-find-whether-solr-server-is-running-or-not-tp3181870p3184556.html Sent from the Solr - User mailing list archive at Nabble.com.
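$.getJSON gives you no error hook (especially with JSONP), so one common approach is a separate liveness check against Solr's ping handler (/solr/admin/ping, or /solr/db/admin/ping per core) and only issuing the query when it answers; with jQuery you would use $.ajax with an error callback instead. A sketch of the check in Python for illustration (the host/port are taken from the question, the ping path is an assumption to verify):

```python
import urllib.error
import urllib.request

def solr_is_up(ping_url, timeout=2):
    """True iff the URL answers with HTTP 200 within the timeout;
    a refused connection, timeout, or non-200 status all count as down."""
    try:
        with urllib.request.urlopen(ping_url, timeout=timeout) as resp:
            return resp.getcode() == 200
    except (urllib.error.URLError, OSError):
        return False

# e.g. solr_is_up("http://192.168.1.9:8983/solr/db/admin/ping")
```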