Re: Errors when implementing VelocityResponseWriter
Well, you need to specify a path, relative or absolute, that points to the directory where the Velocity JAR file resides. I'm not sure, at this point, exactly what you're missing, but it should be fairly straightforward. Solr logs the libraries it loads at startup, so maybe that is helpful info. 1.4.1 - does it support <lib>? (I'm not sure off the top of my head)

Erik

On Feb 15, 2011, at 12:04, McGibbney, Lewis John wrote:

> Hi Erik, thank you for the reply. I have placed all Velocity jar files in my /lib directory. As explained below, I have added the relevant configuration to solrconfig.xml; I am just wondering if the config instructions in the wiki are missing something? Can anyone advise on this? As you mentioned, my terminal output suggests that the VelocityResponseWriter class is not present and therefore the Velocity jar is not present... however this is not the case. I have specified <lib dir="./lib" /> in solrconfig.xml - is this enough, or do I need to use an exact path? I have already tried specifying an exact path and it does not seem to work either.
>
> Thank you
> Lewis
>
> From: Erik Hatcher [erik.hatc...@gmail.com]
> Sent: 15 February 2011 06:48
> To: solr-user@lucene.apache.org
> Subject: Re: Errors when implementing VelocityResponseWriter
>
> Looks like you're missing the Velocity JAR. It needs to be in some Solr-visible lib directory. With 1.4.1 you'll need to put it in solr-home/lib. In later versions, you can use the <lib> elements in solrconfig.xml to point to other directories.
>
> Erik
>
> On Feb 14, 2011, at 10:41, McGibbney, Lewis John wrote:
>
>> Hello List, I am currently trying to implement the above in Solr 1.4.1. Having moved the velocity directory from $SOLR_DIST/contrib/velocity/src/main/solr/conf to my webapp /lib directory, then adding a <queryResponseWriter name="blah" class="blah"/> followed by the responseHandler specifics, I am shown the following terminal output. I also added <lib dir="./lib" /> in solrconfig.xml. Can anyone suggest what I have not included in the config that is still required?
Thanks
Lewis

SEVERE: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.response.VelocityResponseWriter'
        at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:375)
        at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
        at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:435)
        at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1498)
        at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1492)
        at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1525)
        at org.apache.solr.core.SolrCore.initWriters(SolrCore.java:1408)
        at org.apache.solr.core.SolrCore.init(SolrCore.java:547)
        at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
        at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:273)
        at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:254)
        at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:372)
        at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:98)
        at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4382)
        at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5040)
        at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5035)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: org.apache.solr.response.VelocityResponseWriter
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:627)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
        ... 21 more
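For anyone hitting the same ClassNotFoundException: a minimal sketch of the two solrconfig.xml pieces involved, assuming the Velocity jars sit in a lib/ directory under the instance dir (the writer name "velocity" is illustrative; the class name is the one from the stack trace above):

```xml
<!-- Sketch only: load extra jars; dir is resolved relative to the instance dir -->
<lib dir="./lib" />

<!-- Register the response writer -->
<queryResponseWriter name="velocity"
                     class="org.apache.solr.response.VelocityResponseWriter"/>
```

Note that, per Erik's reply, on 1.4.1 the dependable route is dropping the jars into solr-home/lib rather than relying on <lib>.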
Re: Question regarding inner entity in dataimporthandler
Greg, a few things I noticed while reading your post:

1) You don't need a field assignment for fields where the name does not change; you can just skip that. <field column="creationDate" name="creationDate" /> - just to name one example.

2) TemplateTransformer (http://wiki.apache.org/solr/DataImportHandler#TemplateTransformer) has no name attribute, just column and template.

3) Again TemplateTransformer - I never tried it out, but it should return 'doc' in your case when ${document.documentId} has no value. It should not work like MySQL's CONCAT(), which returns null if at least one argument is null. But actually I see no reason for using RegexTransformer?!

4) Your sub-entity problem is more or less obvious ;) If ${document.categoryId} is empty (regardless of whether it's an empty string or just null), your query is invalid. What will work: wrap the variable in quotes (select field1, field2 from table where field3 = '$variable'), then it will work, with or without a value.

Hope that helps,
Stefan

On Tue, Feb 15, 2011 at 8:13 PM, Greg Georges greg.geor...@biztree.com wrote:

> OK, I think I found some information; supposedly TemplateTransformer will return an empty string if the value of a variable is null. Some people say to use the RegexTransformer instead - can anyone clarify this? Thanks
>
> -----Original Message-----
> From: Greg Georges [mailto:greg.geor...@biztree.com]
> Sent: 15 February 2011 13:38
> To: solr-user@lucene.apache.org
> Subject: Question regarding inner entity in dataimporthandler
>
> Hello all, I have searched the forums for the question I am about to ask but never found any concrete results. This is my case: I am defining the data config file with the document and entity tags. I define with success a basic entity mapped to my MySQL database, and I then add some inner entities. The problem I have is with the one-to-one relationship between my document entity and its documentcategory entity. In my document table, the documentcategory foreign key is optional.
Here is my mapping:

<entity name="document" transformer="TemplateTransformer"
        query="select DocumentID, DocumentID as documentId, CreationDate as creationDate, DocumentName as documentName, Description as description, DescriptionAbstract as descriptionAbstract, Downloads as downloads, Downloads30days as downloads30days, Downloads90days as downloads90days, PageViews as pageViews, PageViews30days as PageViews30days, PageViews90days as pageViews90days, Bookmarks as bookmarks, Bookmarks30days as bookmarks30days, Bookmarks90days as bookmarks90days, DocumentRating as documentRating, DocumentRating30days as documentRating30days, DocumentRating90days as documentRating90days, LicenseType as licenseType, BizTreeLibraryDoc as bizTreeLibraryDoc, DocFormat as docFormat, Price as price, CreatedByMemberID as memberId, DocumentCategoryID as categoryId, IsFreeDoc as isFreeDoc from document">
  <field column="id" name="id" template="doc${document.documentId}" />
  <field column="documentId" name="docId" />
  <field column="creationDate" name="creationDate" />
  <field column="documentName" name="documentName" />
  <field column="description" name="description" />
  <field column="descriptionAbstract" name="descriptionAbstract" />
  <field column="downloads" name="downloads" />
  <field column="downloads30days" name="downloads30days" />
  <field column="downloads90days" name="downloads90days" />
  <field column="pageViews" name="pageViews" />
  <field column="pageViews30days" name="pageViews30days" />
  <field column="pageViews90days" name="pageViews90days" />
  <field column="bookmarks" name="bookmarks" />
  <field column="bookmarks30days" name="bookmarks30days" />
  <field column="bookmarks90days" name="bookmarks90days" />
  <field column="documentRating" name="documentRating" />
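Stefan's fourth point applied to a sub-entity would look roughly like this; the documentcategory table and column names are guesses for illustration, not taken from Greg's actual schema:

```xml
<!-- Hypothetical sub-entity: quoting the variable keeps the SQL valid
     even when ${document.categoryId} is null or an empty string -->
<entity name="documentcategory"
        query="select CategoryName as categoryName from documentcategory
               where DocumentCategoryID = '${document.categoryId}'">
  <field column="categoryName" name="categoryName" />
</entity>
```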
Re: clustering with tomcat
On Debian you can edit /etc/default/tomcat6

> Hi, I am using Solr 1.4 with Apache Tomcat. To enable the clustering feature I followed the link http://wiki.apache.org/solr/ClusteringComponent. Please help me: how do I add -Dsolr.clustering.enabled=true to $CATALINA_OPTS? After that, which steps will be required?
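One way to pass the flag, sketched for both a packaged and a vanilla Tomcat; the file locations are assumptions and vary by distro:

```shell
# Option A (Debian/Ubuntu packaged Tomcat): append to JAVA_OPTS in /etc/default/tomcat6
#   JAVA_OPTS="$JAVA_OPTS -Dsolr.clustering.enabled=true"

# Option B (vanilla Tomcat): export before running bin/startup.sh
CATALINA_OPTS="$CATALINA_OPTS -Dsolr.clustering.enabled=true"
export CATALINA_OPTS
echo "$CATALINA_OPTS"
```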
Re: How to use XML parser in DIH for a database?
What about using http://wiki.apache.org/solr/DataImportHandler#XPathEntityProcessor ?

On Wed, Feb 16, 2011 at 10:08 AM, Bill Bell billnb...@gmail.com wrote:

> I am using DIH. I am trying to take a column in a SQL Server database that returns an XML string and use XPath to get data out of it. I noticed that XPath works with external files - how do I get it to work with a database? I need something like //insur[5][@name='Blue Cross']. Thanks.
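The usual combination for an XML column in a database is a JDBC root entity plus a FieldReaderDataSource feeding an XPathEntityProcessor sub-entity. A sketch, with invented table, column, and field names:

```xml
<dataSource name="db" driver="..." url="..." />
<dataSource name="xml" type="FieldReaderDataSource" />

<document>
  <entity name="rec" dataSource="db" query="select id, payload_xml from records">
    <!-- dataField reads the XML string from the parent entity's column -->
    <entity name="payload" dataSource="xml" dataField="rec.payload_xml"
            processor="XPathEntityProcessor" forEach="/record">
      <field column="insurer" xpath="/record/insur[@name='Blue Cross']" />
    </entity>
  </entity>
</document>
```

Note that XPathEntityProcessor supports only a subset of XPath, so a positional predicate like [5] may need checking against the wiki page.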
Re: clustering with tomcat
On Wednesday 16 February 2011 02:41 PM, Markus Jelsma wrote:
> On Debian you can edit /etc/default/tomcat6
>
>> Hi, I am using Solr 1.4 with Apache Tomcat. To enable the clustering feature I followed the link http://wiki.apache.org/solr/ClusteringComponent. Please help me: how do I add -Dsolr.clustering.enabled=true to $CATALINA_OPTS? After that, which steps will be required?

I didn't understand; can you please elaborate how to do this?
Re: clustering with tomcat
What distro are you using? On at least Debian systems you can put the -Dsolr.clustering.enabled=true environment variable in /etc/default/tomcat6. You can also, of course, remove all occurrences of ${solr.clustering.enabled} from your solrconfig.xml.

On Wednesday 16 February 2011 10:52:35 Isha Garg wrote:
> On Wednesday 16 February 2011 02:41 PM, Markus Jelsma wrote:
>> On Debian you can edit /etc/default/tomcat6
>>
>>> Hi, I am using Solr 1.4 with Apache Tomcat. To enable the clustering feature I followed the link http://wiki.apache.org/solr/ClusteringComponent. Please help me: how do I add -Dsolr.clustering.enabled=true to $CATALINA_OPTS? After that, which steps will be required?
>
> I didn't understand; can you please elaborate how to do this?

--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
Re: clustering with tomcat
On Wednesday 16 February 2011 03:32 PM, Markus Jelsma wrote:
> What distro are you using? On at least Debian systems you can put the -Dsolr.clustering.enabled=true environment variable in /etc/default/tomcat6. You can also, of course, remove all occurrences of ${solr.clustering.enabled} from your solrconfig.xml.
>
>> [...]

I have embedded Solr in apache-tomcat5.5 on Linux and I am getting this error:

HTTP Status 500 - Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong. If you want solr to continue after configuration errors, change: <abortOnConfigurationError>false</abortOnConfigurationError> in null

java.lang.NoSuchMethodError: org.carrot2.util.pool.SoftUnboundedPool.init(Lorg/carrot2/util/pool/IInstantiationListener;Lorg/carrot2/util/pool/IActivationListener;Lorg/carrot2/util/pool/IPassivationListener;Lorg/carrot2/util/pool/IDisposalListener;)V
        at org.carrot2.core.CachingController.init(CachingController.java:189)
        at org.carrot2.core.CachingController.init(CachingController.java:115)
        at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.init(CarrotClusteringEngine.java:94)
        at org.apache.solr.handler.clustering.ClusteringComponent.inform(ClusteringComponent.java:123)
        at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:486)
        at org.apache.solr.core.SolrCore.init(SolrCore.java:588)
        at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
        at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
        at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
        at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:78)
        at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3666)
        at org.apache.catalina.core.StandardContext.start(StandardContext.java:4258)
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760)
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740)
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544)
        at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:980)
        at org.apache.catalina.startup.HostConfig.deployDirectories(HostConfig.java:943)
        at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:500)
        at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1203)
        at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:319)
        at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120)
        at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1022)
        at org.apache.catalina.core.StandardHost.start(StandardHost.java:736)
        at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014)
        at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
        at org.apache.catalina.core.StandardService.start(StandardService.java:448)
        at org.apache.catalina.core.StandardServer.start(StandardServer.java:700)
        at org.apache.catalina.startup.Catalina.start(Catalina.java:552)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:295)
        at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:433)

Now can you tell me what to do? I am not familiar with distros or Debian systems.
Snappull failed
Hi,

There are a couple of Solr 1.4.1 slaves, all doing the same: pulling some snaps, handling some queries, nothing exciting. But can anyone explain a sudden nightly occurrence of this error?

2011-02-16 01:23:04,527 ERROR [solr.handler.ReplicationHandler] - [pool-238-thread-1] - : SnapPull failed
org.apache.solr.common.SolrException: Unable to download _gv.frq completely. Downloaded 209715200!=583644834
        at org.apache.solr.handler.SnapPuller$FileFetcher.cleanup(SnapPuller.java:1026)
        at org.apache.solr.handler.SnapPuller$FileFetcher.fetchFile(SnapPuller.java:906)
        at org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:541)
        at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:294)
        at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264)
        at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

All I know is that it was unable to download, but the reason eludes me. Sometimes a machine rolls out many of these errors, increasing the index size because it can't handle the already downloaded data.

Cheers,
--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
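One detail worth noticing in the error: the downloaded byte count is a suspiciously round number, which hints at a fixed cutoff (a proxy limit, timeout, or quota) rather than a random network drop. A quick sanity check on the numbers:

```shell
# 209715200 bytes is exactly 200 MiB - an unlikely place for a random failure to stop
echo $((200 * 1024 * 1024))
# bytes of _gv.frq left undownloaded
echo $((583644834 - 209715200))
```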
Re: clustering with tomcat
I have no idea; it seems you haven't compiled Carrot2 or haven't included all the jars.

On Wednesday 16 February 2011 11:29:30 Isha Garg wrote:
> [...]
>
> Now can you tell me what to do? I am not familiar with distros or Debian systems.

--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
Re: Solr not Available with Ping when DocBuilder is running
My error is that Solr is not reachable with a ping (a ping over PHP HttpRequest)...

---
System: one server, 12 GB RAM, 2 Solr instances, 7 cores, 1 core with 31 million documents, other cores 100,000.
- Solr1 for search requests - commit every minute - 4GB Xmx
- Solr2 for update requests - delta every 2 minutes - 4GB Xmx

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-not-Available-with-Ping-when-DocBuilder-is-running-tp2500214p2508686.html
Sent from the Solr - User mailing list archive at Nabble.com.
strange search-behavior over dynamic field
Hello. I have the fields reason_1 and reason_2. These two fields are covered in my schema by one dynamicField:

<dynamicField name="reason_*" type="textgen" indexed="true" stored="false"/>

I copy this field into my default text search field:

<copyField source="reason_*" dest="text"/>

And into a new field reason:

<copyField source="reason_*" dest="reason"/>

If I have two documents with exactly the same value in the reason_1 field, Solr can only find ONE document, not both. Why? Is it a behavior of Solr or wrong usage on my part?
Re: SolrException x undefined field
Hi,

We do have a validation layer for other purposes, but this layer does not know about the fields, and I would not like to replicate this configuration. Is there any way to query the Solr core about its declared fields?

thanks,
[ ]'s Leonardo da S. Souza
°v° Linux user #375225
/(_)\ http://counter.li.org/
^ ^

On Wed, Feb 16, 2011 at 9:16 AM, Savvas-Andreas Moysidis savvas.andreas.moysi...@googlemail.com wrote:

> Hi, if you have an application layer and are not directly hitting Solr, then maybe this functionality could be implemented in the validation layer prior to making the Solr call?
>
> Cheers,
> - Savvas
>
> On 16 February 2011 10:23, Leonardo Souza leonardo...@gmail.com wrote:
>
>> Hi, we are using Solr 1.4 in a big project. Now it's time to make some improvements. We use the standard query parser and we would like to handle misspelled field names. The problem is that SolrException cannot flag the problem appropriately, because this exception is used for other problems during query processing. I found some clue in the SolrException.ErrorCode enumeration but it did not help.
>>
>> thanks in advance!
Re: strange search-behavior over dynamic field
What does the admin page show as the contents of your index for reason_1? I suspect you don't really have two documents with the same value. Perhaps you gave them both the same uniqueKey and one overwrote the other. Perhaps you didn't commit the second. Perhaps... But you haven't provided enough information to go on here. What is the query (don't forget debugQuery=on)? What is the input?

Best
Erick

On Wed, Feb 16, 2011 at 6:26 AM, stockii stock.jo...@googlemail.com wrote:

> [...]
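Following Erick's suggestion, the parsed query and per-document scoring details can be pulled directly over HTTP; the core path and field value below are placeholders:

```shell
# debugQuery=on adds parsedquery and explain output to the response
curl 'http://localhost:8983/solr/select?q=reason_1:somevalue&debugQuery=on&fl=id'
```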
Re: SolrException x undefined field
Hi Stefan,

LukeRequestHandler could be a good solution; there's a lot of useful info. Does this handler work with version 1.4.x?

thanks
[ ]'s Leonardo da S. Souza
°v° Linux user #375225
/(_)\ http://counter.li.org/
^ ^

On Wed, Feb 16, 2011 at 10:41 AM, Stefan Matheis matheis.ste...@googlemail.com wrote:

> Maybe the http://wiki.apache.org/solr/LukeRequestHandler ?
>
> On Wed, Feb 16, 2011 at 1:34 PM, Savvas-Andreas Moysidis savvas.andreas.moysi...@googlemail.com wrote:
>
>> There is probably a better and more robust way of doing this, but you could make a request to /solr/admin/file/?file=schema.xml and parse the returned XML? Does anyone else know of a better way to query Solr for its schema?
>>
>> [...]
Spatial Search
Hi,

I have a very typical problem. From one of my applications I get data in this format:

<add>
  <doc>
    <field name="address">Some Address</field>
    <field name="zipcode">1</field>
  </doc>
</add>

How can I implement a spatial search for this data? Any ideas are welcome.

Regards,
Nishant Anand

This message and any attachments are solely for the intended recipient and may contain Birlasoft confidential or privileged information. If you are not the intended recipient, any disclosure, copying, use, or distribution of the information included in this message and any attachments is prohibited. If you have received this communication in error, please notify us by reply e-mail (mailad...@birlasoft.com) immediately and permanently delete this message and any attachments. Thank you.
Re: SolrCloud - Example C not working
On Wed, Feb 16, 2011 at 3:57 AM, Thorsten Scherler scher...@gmail.com wrote:
> On Tue, 2011-02-15 at 09:59 -0500, Yonik Seeley wrote:
>> On Mon, Feb 14, 2011 at 8:08 AM, Thorsten Scherler thors...@apache.org wrote:
>>> Hi all, I followed http://wiki.apache.org/solr/SolrCloud and everything worked fine till I tried Example C.
>>
>> Verified. I just tried and it failed for me too.
>
> Hi Yonik, thanks for verifying. :) Should I open an issue and move the thread to the dev list?

Yeah, thanks!

-Yonik
http://lucidimagination.com
Re: SolrException x undefined field
Regarding the wiki page: available since 1.2, so yes, it should :)

On Wed, Feb 16, 2011 at 1:55 PM, Leonardo Souza leonardo...@gmail.com wrote:

> Hi Stefan, LukeRequestHandler could be a good solution; there's a lot of useful info. Does this handler work with version 1.4.x?
>
> [...]
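To tie the suggestions together: the LukeRequestHandler can be queried over HTTP for the declared schema; the host, port, and path below are the defaults and may differ per deployment:

```shell
# show=schema returns the declared fields, types, and their flags
curl 'http://localhost:8983/solr/admin/luke?show=schema'
```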
Re: Spatial Search
Nishant, correct me if I'm wrong, but spatial search normally requires geo-information (latitude and longitude) to work. So you would need to fetch this information before putting the documents into Solr. The Google Maps API, for example, offers http://code.google.com/intl/all/apis/maps/documentation/geocoding/#ReverseGeocoding

Regards
Stefan

On Wed, Feb 16, 2011 at 1:58 PM, nishant.an...@birlasoft.com wrote:

> [...]
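Once coordinates have been obtained from a geocoder, the index side could look roughly like this sketch. The field names are invented, and the {!geofilt} syntax assumes the Solr 3.1+ spatial support rather than the 1.4 line:

```xml
<!-- schema.xml sketch: a hypothetical lat,lon field -->
<fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
<field name="geo" type="location" indexed="true" stored="true"/>
```

A query filtering within 10 km of a point would then be something like q=*:*&fq={!geofilt sfield=geo pt=28.61,77.21 d=10}.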
Re: Triggering optimise based on time interval
Renaud, just because I'm interested: what are your concerns about using cron for that?

Stefan

On Wed, Feb 16, 2011 at 2:12 PM, Renaud Delbru renaud.del...@deri.org wrote:

> [...]
Re: strange search-behavior over dynamic field
The fieldType is textgen. - --- System One Server, 12 GB RAM, 2 Solr Instances, 7 Cores, 1 Core with 31 Million Documents, other Cores ~100,000 - Solr1 for Search-Requests - commit every Minute - 4GB Xmx - Solr2 for Update-Requests - delta every 2 Minutes - 4GB Xmx -- View this message in context: http://lucene.472066.n3.nabble.com/strange-search-behavior-over-dynamic-field-tp2508711p2509166.html Sent from the Solr - User mailing list archive at Nabble.com.
Triggering optimise based on time interval
Hi, We would like to trigger an optimise every x hours. From what I can see, there is nothing in Solr (3.1-SNAPSHOT) that enables us to do such a thing. We have a master-slave configuration. The masters are tuned for fast indexing (large merge factor). However, for the moment, the master index is replicated as-is to the slaves, and therefore it does not provide very fast query times. Our idea was: - to configure the replication so that it only happens after an optimise, and - to schedule a partial optimise every x hours in order to reduce the number of segments for faster querying. We do not want to rely on a cron job for executing the partial optimise every x hours; we would prefer to configure this directly within the Solr config. Our first idea was to create a SolrEventListener, triggered postCommit, that would be in charge of executing an optimise at a regular time interval. Is this a good approach? Or are there other solutions to achieve this? Thanks, -- Renaud Delbru
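The postCommit-listener idea above boils down to "optimise only if enough time has passed since the last one". A minimal sketch of that decision logic, in Python purely for illustration (the real listener would be Java inside Solr; the six-hour interval is an arbitrary example):

```python
import time

OPTIMIZE_INTERVAL = 6 * 3600  # "every x hours"; six hours is an arbitrary example

def should_optimize(last_optimize, now=None, interval=OPTIMIZE_INTERVAL):
    """True once at least `interval` seconds have passed since the last optimise.

    In the SolrEventListener approach this check would run (in Java) on each
    postCommit event; plain Python here, purely for illustration.
    """
    if now is None:
        now = time.time()
    return (now - last_optimize) >= interval

# The cron-free loop the listener replaces could look like:
#   if should_optimize(last_run):
#       POST <optimize/> to the update handler
#       last_run = time.time()
```

The point of hooking this into postCommit is that the check runs whenever the index changes, so no external scheduler is needed; between commits, nothing happens and nothing needs to.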
Re: Triggering optimise based on time interval
Mainly technical administration effort. We are trying to have a Solr packaging that - minimises the effort to deploy the system on a machine, - reduces errors when deploying, and - centralises the logic of the Solr system. Ideally, we would like to have a central place (e.g., solrconfig) where the logic of the system is configured. In that case, the system administrator does not have to bother with a long list of tasks and checkpoints every time we need to release a new version of the Solr system or extend our clusters. He should just have to take the new release, ship it to a machine, and start up Solr. -- Renaud Delbru On 16/02/11 13:15, Stefan Matheis wrote: Renaud, just because I'm interested: what are your concerns about using cron for that? Stefan
Re: Dismax problem
It looks like you are trying to use a function query on a multi-valued field? -Yonik http://lucidimagination.com On Tue, Feb 15, 2011 at 8:34 AM, Ezequiel Calderara ezech...@gmail.com wrote: Hi, I'm having a problem while trying to do a dismax search. For example, I have the standard query URL like this: It returns 1 result. But when I try to use the dismax query type I get the following error:
15/02/2011 10:27:07 org.apache.solr.common.SolrException log GRAVE: java.lang.ArrayIndexOutOfBoundsException: 28
at org.apache.lucene.search.FieldCacheImpl$StringIndexCache.createValue(FieldCacheImpl.java:721)
at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:224)
at org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:692)
at org.apache.solr.search.function.StringIndexDocValues.init(StringIndexDocValues.java:35)
at org.apache.solr.search.function.OrdFieldSource$1.init(OrdFieldSource.java:84)
at org.apache.solr.search.function.OrdFieldSource.getValues(OrdFieldSource.java:58)
at org.apache.solr.search.function.FunctionQuery$AllScorer.init(FunctionQuery.java:123)
at org.apache.solr.search.function.FunctionQuery$FunctionWeight.scorer(FunctionQuery.java:93)
at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:297)
at org.apache.lucene.search.IndexSearcher.searchWithFilter(IndexSearcher.java:268)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:258)
at org.apache.lucene.search.Searcher.search(Searcher.java:171)
at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988)
at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884)
at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:182)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:203)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:242)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:243)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:201)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:163)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:108)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:556)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:401)
at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:281)
at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:579)
at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:1568)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
The Solr instance is running as a replication slave. This is the solrconfig.xml: http://pastebin.com/GSv2wBB4 This is the schema.xml: http://pastebin.com/5VpRT5Jj Any help? How can I find what is causing this exception? I thought that dismax didn't throw exceptions... -- __ Ezequiel. http://www.ironicnet.com
Re: Triggering optimise based on time interval
Hm okay, reasonable :) Never used it, but maybe a pointer in the right direction? http://wiki.apache.org/solr/DataImportHandler#Scheduling On Wed, Feb 16, 2011 at 2:27 PM, Renaud Delbru renaud.del...@deri.org wrote: Mainly technical administration effort. We are trying to have a Solr packaging that minimises the effort to deploy the system on a machine, reduces errors when deploying, and centralises the logic of the Solr system. Ideally, we would like to have a central place (e.g., solrconfig) where the logic of the system is configured.
Re: strange search-behavior over dynamic field
The documents don't have the same uniqueKey; only the reason is the same. I cannot show the exact search request because of our privacy policy... The query is like this: reason_1: firstname lastname, reason_2: 1234, 02.02.2011 -- in field reason: firstname lastname, 1234, 02.02.2011. The search request comes from a PHP application. On my test environment I cannot reproduce this case... =(( Okay... I don't know why, but after a delta-import it's all okay... -- View this message in context: http://lucene.472066.n3.nabble.com/strange-search-behavior-over-dynamic-field-tp2508711p2509610.html Sent from the Solr - User mailing list archive at Nabble.com.
CJKAnalyzer and Synonyms
Hi everyone, I am trying to get Synonyms working with CJKAnalyzer. Search works fine but synonyms do not work as expected. Here is my field definition in the schema file:

<fieldType name="cjk" class="solr.TextField">
  <analyzer class="org.apache.lucene.analysis.cjk.CJKAnalyzer">
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
  </analyzer>
</fieldType>

When testing on the analysis page, the synonym filter does not kick in at all. My question is: What am I doing wrong and what is the proper way of defining the field type? Thanks in advance for your help! Alex -- View this message in context: http://lucene.472066.n3.nabble.com/CJKAnalyzer-and-Synonyms-tp2510104p2510104.html Sent from the Solr - User mailing list archive at Nabble.com.
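For what it's worth: when an analyzer is declared with a class attribute, Solr uses that analyzer class as-is, so nested <filter> elements are silently ignored. A hedged sketch of the composed form instead, assuming the solr.CJKTokenizerFactory shipped with the Solr version in question (untested; verify against your schema documentation):

```xml
<!-- Sketch, not a confirmed fix: build the chain from factories so the
     synonym filter actually participates in analysis. -->
<fieldType name="cjk" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.CJKTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
  </analyzer>
</fieldType>
```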
Re: Triggering optimise based on time interval
I think you can get far by just optimising how often you do commits (as seldom as possible), as well as the mergeFactor, to get a good balance between indexing and query efficiency. It may be that you're looking for fewer segments on average, not always one fully optimised segment. If you still feel you need more optimising, by far the easiest is to implement the logic in your client, which sends an explicit optimise whenever your logic dictates. One way to hide this inside the Solr config could be to change your MergePolicy in solrconfig.xml, or to implement your own (http://lucene.apache.org/java/3_0_0/api/all/org/apache/lucene/index/MergePolicy.html) if you cannot find a suitable one. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 16. feb. 2011, at 14.12, Renaud Delbru wrote: Hi, We would like to trigger an optimise every x hours. From what I can see, there is nothing in Solr (3.1-SNAPSHOT) that enables us to do such a thing.
Re: Are there any restrictions on what kind or how many fields you can use in a Pivot Query? I get ClassCastException when I use some of my string fields, and don't when I use some other string fields
Hello Ravish, Erick, I'm facing the same issue with solr-trunk (as of r1071282). - Field configuration:

<fieldType name="normalized_string" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>

- Schema configuration:

<field name="f1" type="normalized_string" indexed="true" stored="true"/>
<field name="f2" type="normalized_string" indexed="true" stored="true"/>
<field name="f3" type="normalized_string" indexed="true" stored="true"/>

In my test index, I have documents with sparse values: some documents may or may not have a value for f1, f2 and/or f3. The number of indexed documents is around 25. I'm facing the issue at query time, depending on my query and the temperature of the index. Parameters having an effect on the reproducibility: - number of levels of the decision tree: the deeper the tree, the sooner the exception arises - facet.limit parameter: the higher the limit, the sooner the exception arises. Examples: all docs, facet-pivoting on all fields that matter, varying facet.limit:

q=*:* pivot=f1,f2,f3 facet.limit=1 : OK
q=*:* pivot=f1,f2,f3 facet.limit=2 : OK
...
q=*:* pivot=f1,f2,f3 facet.limit=8 : OK
q=*:* pivot=f1,f2,f3 facet.limit=9 : NOT OK
retry q=*:* pivot=f1,f2,f3 facet.limit=9 : NOT OK
retry q=*:* pivot=f1,f2,f3 facet.limit=9 : OK
q=*:* pivot=f1,f2,f3 facet.limit=10 : NOT OK
retry q=*:* pivot=f1,f2,f3 facet.limit=10 : NOT OK
retry q=*:* pivot=f1,f2,f3 facet.limit=10 : NOT OK
retry q=*:* pivot=f1,f2,f3 facet.limit=10 : NOT OK
retry q=*:* pivot=f1,f2,f3 facet.limit=10 : NOT OK
retry q=*:* pivot=f1,f2,f3 facet.limit=10 : OK
q=*:* pivot=f1,f2,f3 facet.limit=11 : NOT OK
...

It really looks like a cache issue. After some retries, I can finally obtain my results, and not an HTTP 500. Once I obtain my results, I can ask for more, if I wait a little. That's very odd.
So before I continue, here is my query configuration:

<query>
  <maxBooleanClauses>1024</maxBooleanClauses>
  <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
  <queryResultCache class="solr.LRUCache" size="1024" initialSize="512" autowarmCount="0"/>
  <documentCache class="solr.LRUCache" size="1024" initialSize="512" autowarmCount="0"/>
  <enableLazyFieldLoading>true</enableLazyFieldLoading>
  <queryResultWindowSize>20</queryResultWindowSize>
  <queryResultMaxDocsCached>200</queryResultMaxDocsCached>
  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <!--
      <lst><str name="q">solr</str><str name="start">0</str><str name="rows">10</str></lst>
      <lst><str name="q">rocks</str><str name="start">0</str><str name="rows">10</str></lst>
      <lst><str name="q">static newSearcher warming query from solrconfig.xml</str></lst>
      -->
    </arr>
  </listener>
  <listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst><str name="q">solr rocks</str><str name="start">0</str><str name="rows">10</str></lst>
      <lst><str name="q">static firstSearcher warming query from solrconfig.xml</str></lst>
    </arr>
  </listener>
  <useColdSearcher>false</useColdSearcher>
  <maxWarmingSearchers>2</maxWarmingSearchers>
</query>

That's very much like the default configuration. I guess that the default cache configuration is not perfectly suitable for facet pivoting, so any hint on how to tweak it right is welcome. Kind regards, -- Tanguy On 02/15/2011 06:05 PM, Erick Erickson wrote: To get meaningful help, you have to post a minimum of: 1) the relevant schema definitions for the field that makes it blow up, including the <fieldType> and <field> tags; 2) the query you used, with some indication of the field that makes it blow up; 3) what version you're using; 4) any changes you've made to the standard configurations; 5) whether you've recently installed a new version. It might help if you reviewed: http://wiki.apache.org/solr/UsingMailingLists Best Erick On Tue, Feb 15, 2011 at 11:27 AM, Ravish Bhagdev ravish.bhag...@gmail.com wrote: Looks like it's a bug? Is it not?
Ravish On Tue, Feb 15, 2011 at 4:03 PM, Ravish Bhagdev ravish.bhag...@gmail.com wrote: When I include some of the fields in my search query:
SEVERE: java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Lorg.apache.solr.common.util.ConcurrentLRUCache$CacheEntry;
at org.apache.solr.common.util.ConcurrentLRUCache$PQueue.myInsertWithOverflow(ConcurrentLRUCache.java:377)
at org.apache.solr.common.util.ConcurrentLRUCache.markAndSweep(ConcurrentLRUCache.java:329)
at org.apache.solr.common.util.ConcurrentLRUCache.put(ConcurrentLRUCache.java:144)
at org.apache.solr.search.FastLRUCache.put(FastLRUCache.java:131)
at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:904)
at org.apache.solr.handler.component.PivotFacetHelper.doPivots(PivotFacetHelper.java:121)
at org.apache.solr.handler.component.PivotFacetHelper.doPivots(PivotFacetHelper.java:126)
at org.apache.solr.handler.component.PivotFacetHelper.process(PivotFacetHelper.java:85)
Re: Term Vector Query on Single Document
On Wednesday 16 February 2011 16:49:51 Tod wrote: I have a couple of semi-related questions regarding the use of the Term Vector Component: - Using curl, is there a way to query a specific document (maybe using Tika when required?) to get a distribution of the terms it contains? No Tika involved here. You can just query a document (q=id:whatever) and enable the TVComponent. Make sure you list your fields in the tv.fl parameter. Those fields, of course, need termVectors enabled. - When I set termVectors on a field, do I need to reindex? I'm thinking 'yes'. Yes. - How expensive is setting termVectors on a field? It takes up additional disk space and RAM. It can be a lot. Thanks - Tod -- Markus Jelsma - CTO - Openindex http://www.linkedin.com/in/markus17 050-8536620 / 06-50258350
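The curl-style request described above can be sketched by building the query URL explicitly. The host, core path and handler name "/tvrh" are assumptions about the setup; the handler must be registered in solrconfig.xml, and the fields listed in tv.fl need termVectors="true" in the schema:

```python
from urllib.parse import urlencode

# Hedged sketch: fetch one document by id and ask the TermVectorComponent
# for its term information. Host/handler are illustrative assumptions.

def term_vector_url(doc_id, fields, base="http://localhost:8983/solr/tvrh"):
    params = {
        "q": "id:%s" % doc_id,
        "tv": "true",               # enable the TermVectorComponent
        "tv.tf": "true",            # ask for term frequencies
        "tv.fl": ",".join(fields),  # fields to return term vectors for
    }
    return base + "?" + urlencode(params)

print(term_vector_url("whatever", ["content"]))
# then: curl '<that URL>' to see the per-term information for the document
```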
Re: How to use XML parser in DIH for a database?
It only works on FileDataSource, right? Bill Bell Sent from mobile On Feb 16, 2011, at 2:17 AM, Stefan Matheis matheis.ste...@googlemail.com wrote: What about using http://wiki.apache.org/solr/DataImportHandler#XPathEntityProcessor ? On Wed, Feb 16, 2011 at 10:08 AM, Bill Bell billnb...@gmail.com wrote: I am using DIH. I am trying to take a column in a SQL Server database that returns an XML string and use XPath to get data out of it. I noticed that XPath works with external files; how do I get it to work with a database? I need something like //insur[5][@name='Blue Cross'] Thanks.
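If I remember the DIH docs correctly, XPathEntityProcessor is not limited to FileDataSource: a FieldReaderDataSource can feed it the XML held in a database column via a dataField reference. A hedged sketch (table, column and field names here are hypothetical, and the attribute details should be verified against the DIH documentation for your version):

```xml
<dataConfig>
  <dataSource name="db" type="JdbcDataSource" driver="..." url="..."/>
  <dataSource name="xmlField" type="FieldReaderDataSource"/>
  <document>
    <entity name="row" dataSource="db" query="SELECT id, insur_xml FROM patients">
      <!-- parse the XML held in the insur_xml column of the outer entity -->
      <entity name="insur" dataSource="xmlField" processor="XPathEntityProcessor"
              dataField="row.insur_xml" forEach="/record">
        <field column="insur_name" xpath="/record/insur/@name"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```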
Re: SolrCloud - Example C not working
2011/2/16 Yonik Seeley yo...@lucidimagination.com On Wed, Feb 16, 2011 at 3:57 AM, Thorsten Scherler scher...@gmail.com wrote: On Tue, 2011-02-15 at 09:59 -0500, Yonik Seeley wrote: On Mon, Feb 14, 2011 at 8:08 AM, Thorsten Scherler thors...@apache.org wrote: Hi all, I followed http://wiki.apache.org/solr/SolrCloud and everything worked fine till I tried Example C. Verified. I just tried and it failed for me too. Hi Yonik, thanks for verifying. :) Should I open an issue and move the thread to the dev list? Yeah, thanks! -Yonik http://lucidimagination.com Hi, For me, example C doesn't work either. I just tried it; examples A and B worked like a charm. Stijn Vanhoorelbeke
Re: SolrCloud - Example C not working
It looks like a log4j issue:
java.lang.NoClassDefFoundError: org/apache/log4j/jmx/HierarchyDynamicMBean
at org.apache.zookeeper.jmx.ManagedUtil.registerLog4jMBeans(ManagedUtil.java:51)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:114)
at org.apache.solr.cloud.SolrZkServer$1.run(SolrZkServer.java:111)
-Yonik http://lucidimagination.com On Wed, Feb 16, 2011 at 11:35 AM, Stijn Vanhoorelbeke stijn.vanhoorelb...@gmail.com wrote: For me, example C doesn't work either. I just tried it; examples A and B worked like a charm. Stijn Vanhoorelbeke
RE: Errors when implementing VelocityResponseWriter
Managed to get this working. I changed my solrconfig for the one provided in the velocity dir, repackaged the war file and redeployed on Tomcat. Although this seems like a ridiculously obvious thing to do, I somehow overlooked the repackaging aspect; this was where the problem was. Thanks for the help Erik. From: Erik Hatcher [erik.hatc...@gmail.com] Sent: 16 February 2011 08:06 To: solr-user@lucene.apache.org Subject: Re: Errors when implementing VelocityResponseWriter Well, you need to specify a path, relative or absolute, that points to the directory where the Velocity JAR file resides. I'm not sure, at this point, exactly what you're missing. But it should be fairly straightforward.
Erik On Feb 14, 2011, at 10:41, McGibbney, Lewis John wrote: Hello List, I am currently trying to implement the above in Solr 1.4.1. Having moved the velocity directory from $SOLR_DIST/contrib/velocity/src/main/solr/conf to my webapp /lib directory, then adding queryResponseWriter name=blah and class=blah followed by the responseHandler specifics, I am shown the following terminal output. I also added <lib dir="./lib" /> in solrconfig. Can anyone suggest what I have not included in the config that is still required? Thanks Lewis
SEVERE: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.response.VelocityResponseWriter'
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:375)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:435)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1498)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1492)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1525)
at org.apache.solr.core.SolrCore.initWriters(SolrCore.java:1408)
at org.apache.solr.core.SolrCore.init(SolrCore.java:547)
at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:273)
at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:254)
at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:372)
at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:98)
at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4382)
at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5040)
at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5035)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: org.apache.solr.response.VelocityResponseWriter
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at
Re: Deploying Solr CORES on OVH Cloud
Hi, Jetty on Ubuntu has been working well for us and a bunch of our customers. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Rosa (Anuncios) rosaemailanunc...@gmail.com To: solr-user@lucene.apache.org Sent: Tue, February 15, 2011 6:08:39 AM Subject: Re: Deploying Solr CORES on OVH Cloud Thanks for your response, but it doesn't help me a whole lot! Jetty vs Tomcat? Ubuntu or Debian? What are the pros of each for Solr use? Le 14/02/2011 23:12, William Bell a écrit : The first two questions are almost like religion. I am not sure we want to start a debate. Core setup is fairly easy: add a solr.xml file and subdirectories, one per core, under the Solr home directory (see the example/ directory). Make sure you use the right URL for the admin console. On Mon, Feb 14, 2011 at 3:38 AM, Rosa (Anuncios) rosaemailanunc...@gmail.com wrote: Hi, I'm a bit new to Solr. I'm trying to set up a bunch of servers (just for Solr) on the OVH cloud (http://www.ovh.co.uk/cloud/) and create new cores as needed on each server. First question: what do you recommend, Ubuntu or Debian, in terms of performance? Second question: Jetty or Tomcat, again in terms of performance and security? Third question: I've followed the wiki but I can't get the cores working... impossible to create a core or access my cores. Does anyone have a working config to share? Thanks a lot for your help. Regards,
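The "solr.xml plus one subdirectory per core" setup mentioned above can be sketched as a minimal solr.xml (core names and paths are examples only; each instanceDir needs its own conf/ with solrconfig.xml and schema.xml, as in the example/ directory of the distribution):

```xml
<!-- solr.xml in the Solr home directory -->
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="core0" instanceDir="core0"/>
    <core name="core1" instanceDir="core1"/>
  </cores>
</solr>
```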
Re: slave out of sync
Hi Tri, You could look at the stats page for each slave and compare the number of docs in them. The one(s) that are off from the rest/majority are out of sync. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Tri Nguyen tringuye...@yahoo.com To: solr-user@lucene.apache.org Sent: Mon, February 14, 2011 7:19:58 PM Subject: slave out of sync Hi, We're thinking of having a master-slave configuration where there are multiple slaves. Let's say during replication, one of the slaves does not replicate properly. How will we dectect that the 1 slave is out of sync? Tri
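The stats-page comparison Otis describes boils down to: collect numDocs from each slave and flag the ones that disagree with the majority. Fetching and parsing each slave's stats page is omitted here; the counts are passed in directly, purely for illustration:

```python
from collections import Counter

# Sketch of the suggestion above: given numDocs per slave, return the
# slaves whose count differs from the majority count.

def out_of_sync(doc_counts):
    """doc_counts: {slave_name: numDocs}. Return slaves off the majority count."""
    majority, _ = Counter(doc_counts.values()).most_common(1)[0]
    return sorted(name for name, n in doc_counts.items() if n != majority)

print(out_of_sync({"slave1": 1000, "slave2": 1000, "slave3": 997}))
# -> ['slave3']
```

In practice you would also want to compare index versions (the replication details page exposes those), since two slaves can hold the same document count but different generations.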
optimize and mergeFactor
In my own Solr 1.4, I am pretty sure that running an index optimize does give me significantly better performance. Perhaps because I use some largeish (not huge, maybe as large as 200k) stored fields. So I'm interested in always keeping my index optimized. Am I right that if I set mergeFactor to '1', essentially my index will always be optimized after every commit, and actually running 'optimize' will be redundant? What are the possible negative repercussions of setting mergeFactor to 1? Is this a really bad idea? If not 1, what about some other lower-than-usually-recommended value like 2 or 3? Anyone done this? I imagine it will slow down my commits, but if the alternative is running optimize a lot anyway, I wonder at what point I break even (if I optimize after every single commit, clearly I might as well just set the mergeFactor low, right? But if I optimize after every X documents or Y commits, I don't know what X/Y are at break-even). Jonathan
Re: optimize and mergeFactor
> In my own Solr 1.4, I am pretty sure that running an index optimize does give me significantly better performance. Perhaps because I use some largeish (not huge, maybe as large as 200k) stored fields.

200.000 stored fields? I assume that number includes your number of documents? Sounds crazy =)

> So I'm interested in always keeping my index optimized. Am I right that if I set mergeFactor to '1', essentially my index will always be optimized after every commit, and actually running 'optimize' will be redundant?

You can set mergeFactor to 2, not lower.

> What are the possible negative repercussions of setting mergeFactor to 1? Is this a really bad idea? If not 1, what about some other lower-than-usually-recommended value like 2 or 3? Anyone done this? I imagine it will slow down my commits, but if the alternative is running optimize a lot anyway, I wonder at what point I break even (if I optimize after every single commit, clearly I might as well just set the mergeFactor low, right? But if I optimize after every X documents or Y commits, I don't know what X/Y are at break-even).

This depends on the commit rate and whether there are a lot of updates and deletes instead of adds. Setting it very low will indeed cause a lot of merging and slow commits. It will also be very slow in replication, because merged files are copied over again and again, causing high I/O on your slaves. There is always a `break even`, but it depends (as usual) on your scenario and business demands.

> Jonathan
Solr multi cores or not
Hi, I have a need to index multiple applications using Solr, I also have the need to share indexes or run a search query across these application indexes. Is solr multi-core - the way to go? My server config is 2virtual CPUs @ 1.8 GHz and has about 32GB of memory. What is the recommendation? Thanks, Sai Thumuluri
Re: Shutdown hook executing for a long time
Closing a core will shut down almost everything related to the workings of a core: update and search handlers, possible warming searchers, etc. Check the implementation of the close method: http://svn.apache.org/viewvc/lucene/dev/branches/branch_3x/solr/src/java/org/apache/solr/core/SolrCore.java?view=markup

> 2011-02-16 11:32:45.489::INFO: Shutdown hook executing
> 2011-02-16 11:35:36.002::INFO: Shutdown hook complete
> The shutdown time seems to be proportional to the amount of time that Solr has been running. If I immediately restart and shut down again, it takes a fraction of a second. What causes it to take so long to shut down, and is there anything I can do to make it happen quicker?
Help with parsing configuration using SolrParams/NamedList
Hi, I'm trying to use a CustomSimilarityFactory and pass in per-field options from the schema.xml, like so:

<similarity class="org.ads.solr.CustomSimilarityFactory">
  <lst name="field_a">
    <int name="min">500</int>
    <int name="max">1</int>
    <float name="steepness">0.5</float>
  </lst>
  <lst name="field_b">
    <int name="min">500</int>
    <int name="max">2</int>
    <float name="steepness">0.5</float>
  </lst>
</similarity>

My problem is I am utterly failing to figure out how to parse this nested option structure within my CustomSimilarityFactory class. I know that the settings are available as a SolrParams object within the getSimilarity() method. I'm convinced I need to convert to a NamedList using params.toNamedList(), but my Java fu is too feeble to code the dang thing. The closest I seem to get is the top level as a NamedList, where the keys are field_a and field_b, but then my values are strings, e.g., {min=500,max=1,steepness=0.5}. Anyone who could dash off a quick example of how to do this? Thanks, --jay
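A NamedList is essentially an ordered list of (name, value) pairs whose values may themselves be NamedLists; the Java version of this would iterate params.toNamedList(), test each value with instanceof NamedList, and read the typed entries. Here is the shape of that traversal sketched in Python over a list-of-pairs stand-in (not Solr API, just the logic):

```python
# Stand-in for NamedList: an ordered list of (name, value) pairs where a
# value may itself be such a list (one per <lst name="field_x"> block).

def parse_field_options(named_list):
    """[(field, [(opt, value), ...]), ...] -> {field: {opt: value}}"""
    options = {}
    for name, value in named_list:
        if isinstance(value, list):  # a nested <lst name="field_x"> block
            options[name] = dict(value)
    return options

config = [
    ("field_a", [("min", 500), ("max", 1), ("steepness", 0.5)]),
    ("field_b", [("min", 500), ("max", 2), ("steepness", 0.5)]),
]
print(parse_field_options(config)["field_a"])
# -> {'min': 500, 'max': 1, 'steepness': 0.5}
```

If the values come back as strings like "{min=500,max=1,steepness=0.5}", the nested <lst> elements were probably flattened before reaching the factory, which suggests inspecting what params.toNamedList() actually returns rather than parsing those strings.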
Re: optimize and mergeFactor
Thanks for the answers, more questions below. On 2/16/2011 3:37 PM, Markus Jelsma wrote: 200.000 stored fields? I assume that number includes your number of documents? Sounds crazy =) Nope, I wasn't clear. I have less than a dozen stored fields, but the value of a stored field can sometimes be as large as 200kb. You can set mergeFactor to 2, not lower. Am I right though that manually running an 'optimize' is the equivalent of a mergeFactor=1? So there's no way to get Solr to keep the index in an 'always optimized' state, if I'm understanding correctly? Cool. Just want to understand what's going on. This depends on commit rate and whether there are a lot of updates and deletes instead of adds. Setting it very low will indeed cause a lot of merging and slow commits. It will also be very slow in replication because merged files are copied over again and again, causing high I/O on your slaves. There is always a `break even` point, but it depends (as usual) on your scenario and business demands. There are indeed sadly lots of updates and deletes, which is why I need to run optimize periodically. I am aware that this will cause more work for replication -- I think this is true whether I manually issue an optimize before replication _or_ whether I just keep the mergeFactor very low, right? Same issue either way. So... if I'm going to do lots of updates and deletes, and my other option is running an optimize before replication anyway, is there any reason it's going to be completely stupid to set the mergeFactor to 2 on the master? I realize it'll mean all index files are going to have to be replicated, but that would be the case if I ran a manual optimize in the same situation before replication too, I think. Jonathan
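For reference, the setting under discussion lives in solrconfig.xml; a minimal sketch, assuming the stock 1.4 config layout:

```xml
<mainIndex>
  <!-- 2 is the lowest legal value; even at 2, only an explicit
       optimize collapses the index down to a single segment -->
  <mergeFactor>2</mergeFactor>
</mainIndex>
```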
Re: Solr multi cores or not
Solr multi-core essentially just lets you run multiple separate, distinct Solr indexes in the same running Solr instance. It does NOT let you run queries across multiple cores at once. The cores are just like completely separate Solr indexes; they are just conveniently running in the same Solr instance. (Which can be easier and more compact to set up than actually setting up separate Solr instances. And they can share some config more easily. And it _may_ have implications on JVM usage, not sure). There is no good way in Solr to run a query across multiple Solr indexes; whether they are multi-core or single cores in separate Solr instances doesn't matter. Your first approach should be to try and put all the data in one Solr index (one Solr 'core'). Jonathan On 2/16/2011 3:45 PM, Thumuluri, Sai wrote: Hi, I have a need to index multiple applications using Solr; I also have the need to share indexes or run a search query across these application indexes. Is Solr multi-core the way to go? My server config is 2 virtual CPUs @ 1.8 GHz and has about 32GB of memory. What is the recommendation? Thanks, Sai Thumuluri
minimum Solr slave replication config
Solr 1.4.1. So, from the documentation at http://wiki.apache.org/solr/SolrReplication I was wondering if I could get away without having any actual configuration in my slave at all. The replication handler is turned on, but if I'm going to manually trigger replication pulls while supplying the master URL manually with the command too, by: command=fetchIndex&masterUrl=$solr_master Then I was thinking, gee, maybe I don't need any slave config at all. That _appears_ to not be true. In such a situation, when I tell the slave to fetchIndex&masterUrl=$solr_master, the command gives a 200 OK. But then when I go and check /replication?command=details on the slave, I'm actually presented with an exception: message null java.lang.NullPointerException at org.apache.solr.handler.ReplicationHandler.isPollingDisabled(ReplicationHandler.java:412) at So I'm thinking this is probably because you actually can't get away with no slave config at all. So: 1) Is this a bug? Maybe I did something I shouldn't have, but having command=details report a NullPointerException is probably not good, right? If someone who knows better agrees, I'll file it in JIRA? 2) Does anyone know what the minimal slave config is? If I plan to manually trigger replication pulls, and supply the masterUrl, maybe just an empty <lst name="slave"/>. Or are there other parameters I have to set even though I don't plan to use them? (I do not want automatic polling, only manually triggered pulls). Anyone have any advice, or should I just trial and error?
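An untested guess at a minimal slave section for solrconfig.xml for the manual-pull case (no pollInterval, so no automatic polling; the masterUrl is supplied on each request):

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- intentionally empty: no pollInterval means no polling;
         masterUrl is passed with each fetchindex command instead -->
  </lst>
</requestHandler>
```

Pulls would then be triggered with something like /replication?command=fetchindex&masterUrl=http://master:8983/solr/replication -- whether the empty slave lst is enough to avoid the NullPointerException above would need to be verified.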
Re: optimize and mergeFactor
Thanks for the answers, more questions below. On 2/16/2011 3:37 PM, Markus Jelsma wrote: 200.000 stored fields? I assume that number includes your number of documents? Sounds crazy =) Nope, I wasn't clear. I have less than a dozen stored fields, but the value of a stored field can sometimes be as large as 200kb. You can set mergeFactor to 2, not lower. Am I right though that manually running an 'optimize' is the equivalent of a mergeFactor=1? So there's no way to get Solr to keep the index in an 'always optimized' state, if I'm understanding correctly? Cool. Just want to understand what's going on. That should be it. If I remember correctly a second segment is always written; new updates aren't merged immediately. This depends on commit rate and whether there are a lot of updates and deletes instead of adds. Setting it very low will indeed cause a lot of merging and slow commits. It will also be very slow in replication because merged files are copied over again and again, causing high I/O on your slaves. There is always a `break even` point, but it depends (as usual) on your scenario and business demands. There are indeed sadly lots of updates and deletes, which is why I need to run optimize periodically. I am aware that this will cause more work for replication -- I think this is true whether I manually issue an optimize before replication _or_ whether I just keep the mergeFactor very low, right? Same issue either way. Yes. But having several segments shouldn't make that much of a difference. If search latency is just a few additional milliseconds then I'd rather have a few more segments being copied over more quickly. So... if I'm going to do lots of updates and deletes, and my other option is running an optimize before replication anyway, is there any reason it's going to be completely stupid to set the mergeFactor to 2 on the master?
I realize it'll mean all index files are going to have to be replicated, but that would be the case if I ran a manual optimize in the same situation before replication too, I think. No, it's not stupid if you allow for slow indexing and slow copying of files but want a very quick search. Jonathan
RE: Solr multi cores or not
Hmmm. Maybe I'm not understanding what you're getting at, Jonathan, when you say 'There is no good way in Solr to run a query across multiple Solr indexes'. What about the 'shards' parameter? That allows searching across multiple cores in the same instance, or shards across multiple instances. There are certainly implications here (like Relevance not being consistent across cores / shards), but it works pretty well for us... Thanks! Bob Sandiford | Lead Software Engineer | SirsiDynix P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com www.sirsidynix.com -Original Message- From: Jonathan Rochkind [mailto:rochk...@jhu.edu] Sent: Wednesday, February 16, 2011 4:09 PM To: solr-user@lucene.apache.org Cc: Thumuluri, Sai Subject: Re: Solr multi cores or not Solr multi-core essentially just lets you run multiple separate, distinct Solr indexes in the same running Solr instance. It does NOT let you run queries across multiple cores at once. The cores are just like completely separate Solr indexes; they are just conveniently running in the same Solr instance. (Which can be easier and more compact to set up than actually setting up separate Solr instances. And they can share some config more easily. And it _may_ have implications on JVM usage, not sure). There is no good way in Solr to run a query across multiple Solr indexes; whether they are multi-core or single cores in separate Solr instances doesn't matter. Your first approach should be to try and put all the data in one Solr index (one Solr 'core'). Jonathan On 2/16/2011 3:45 PM, Thumuluri, Sai wrote: Hi, I have a need to index multiple applications using Solr; I also have the need to share indexes or run a search query across these application indexes. Is Solr multi-core the way to go? My server config is 2 virtual CPUs @ 1.8 GHz and has about 32GB of memory. What is the recommendation? Thanks, Sai Thumuluri
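For illustration, a shards request spanning two cores on the same instance might look like this (host and core names made up; note the shards values omit the http:// prefix):

```
http://localhost:8983/solr/core0/select?q=foo&shards=localhost:8983/solr/core0,localhost:8983/solr/core1
```

The request is sent to one core, which fans the query out to every core listed in shards and merges the results.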
last item in results page is always the same
(I'm using solr 1.4) I'm doing a test of my index, so I'm reading out every document in batches of 500. The query is (I added newlines here to make it readable): http://localhost:8983/solr/archive_ECCO/select/ ?q=archive%3AECCO fl=uri version=2.2 start=0 rows=500 indent=on sort=uri%20asc It turns out, in this case, the query should match every document. The response shows numFound=182413. If I scan the returned values, they appear sorted properly except the last one. In other words, the uri that are returned on the first page are: 100100 100200 etc... 0006601600 0006601700 1723200600 That 499th value is returned as the 499th value on every page. That is, if I call it with start=500, then most of the entries look right, but that last value will still be 1723200600, and the true 499th value is never returned. 1723200600 should have been returned as the 181,499th item. Is this a known solr bug or is there something subtle going on? Thanks, Paul
Re: last item in results page is always the same
On Wed, Feb 16, 2011 at 5:08 PM, Paul p...@nines.org wrote: Is this a known solr bug or is there something subtle going on? Yes, I think it's the following bug, fixed in 1.4.1: * SOLR-1777: fieldTypes with sortMissingLast=true or sortMissingFirst=true can result in incorrectly sorted results. -Yonik http://lucidimagination.com
Re: Solr multi cores or not
Yes, you're right, from now on when I say that, I'll say except shards. It is true. My understanding is that the shards functionality's intended use case is for when your index is so large that you want to split it up for performance. I think it works pretty well for that, with some limitations as you mention. From reading the list, my impression is that when people try to use shards to solve some _other_ problem, they generally run into problems. But maybe that's just because the people with the problems are the ones who appear on the list? My personal advice is still to try and put everything together in one big index; Solr will give you the least trouble with that, it's what Solr likes to do best. Move to shards certainly if your index is so large that moving to shards will give you the performance advantage you need, that's what they're for; be very cautious moving to shards for other challenges that 'one big index' is giving you that you're thinking shards will solve. Shards is, as I understand it, _not_ intended as a general purpose federation function; it's specifically intended to split an index across multiple hosts for performance. Jonathan On 2/16/2011 4:37 PM, Bob Sandiford wrote: Hmmm. Maybe I'm not understanding what you're getting at, Jonathan, when you say 'There is no good way in Solr to run a query across multiple Solr indexes'. What about the 'shards' parameter? That allows searching across multiple cores in the same instance, or shards across multiple instances. There are certainly implications here (like Relevance not being consistent across cores / shards), but it works pretty well for us... Thanks!
Bob Sandiford | Lead Software Engineer | SirsiDynix P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com www.sirsidynix.com -Original Message- From: Jonathan Rochkind [mailto:rochk...@jhu.edu] Sent: Wednesday, February 16, 2011 4:09 PM To: solr-user@lucene.apache.org Cc: Thumuluri, Sai Subject: Re: Solr multi cores or not Solr multi-core essentially just lets you run multiple separate, distinct Solr indexes in the same running Solr instance. It does NOT let you run queries across multiple cores at once. The cores are just like completely separate Solr indexes; they are just conveniently running in the same Solr instance. (Which can be easier and more compact to set up than actually setting up separate Solr instances. And they can share some config more easily. And it _may_ have implications on JVM usage, not sure). There is no good way in Solr to run a query across multiple Solr indexes; whether they are multi-core or single cores in separate Solr instances doesn't matter. Your first approach should be to try and put all the data in one Solr index (one Solr 'core'). Jonathan On 2/16/2011 3:45 PM, Thumuluri, Sai wrote: Hi, I have a need to index multiple applications using Solr; I also have the need to share indexes or run a search query across these application indexes. Is Solr multi-core the way to go? My server config is 2 virtual CPUs @ 1.8 GHz and has about 32GB of memory. What is the recommendation? Thanks, Sai Thumuluri
Re: Solr multi cores or not
Hi, That depends (as usual) on your scenario. Let me ask some questions: 1. what is the sum of documents for your applications? 2. what is the expected load in queries/minute? 3. what is the update frequency in documents/minute and how many documents per commit? 4. how many different applications do you have? 5. are the query demands for the business the same (or very similar) for all applications? 6. can you easily upgrade hardware or demand more machines? 7. must you enforce security between applications, and are the clients not under your control? I'm puzzled though, you have so much memory but so little CPU. What about the disks? Size? Spinning or SSD? Cheers, Hi, I have a need to index multiple applications using Solr; I also have the need to share indexes or run a search query across these application indexes. Is Solr multi-core the way to go? My server config is 2 virtual CPUs @ 1.8 GHz and has about 32GB of memory. What is the recommendation? Thanks, Sai Thumuluri
Re: Solr multi cores or not
You can also easily abuse shards to query multiple cores that share parts of the schema. This way you have isolation with the ability to query them all. The same can, of course, also be achieved using a single index with a simple field identifying the application and using fq on that one. Yes, you're right, from now on when I say that, I'll say except shards. It is true. My understanding is that the shards functionality's intended use case is for when your index is so large that you want to split it up for performance. I think it works pretty well for that, with some limitations as you mention. From reading the list, my impression is that when people try to use shards to solve some _other_ problem, they generally run into problems. But maybe that's just because the people with the problems are the ones who appear on the list? My personal advice is still to try and put everything together in one big index; Solr will give you the least trouble with that, it's what Solr likes to do best. Move to shards certainly if your index is so large that moving to shards will give you the performance advantage you need, that's what they're for; be very cautious moving to shards for other challenges that 'one big index' is giving you that you're thinking shards will solve. Shards is, as I understand it, _not_ intended as a general purpose federation function; it's specifically intended to split an index across multiple hosts for performance. Jonathan On 2/16/2011 4:37 PM, Bob Sandiford wrote: Hmmm. Maybe I'm not understanding what you're getting at, Jonathan, when you say 'There is no good way in Solr to run a query across multiple Solr indexes'. What about the 'shards' parameter? That allows searching across multiple cores in the same instance, or shards across multiple instances. There are certainly implications here (like Relevance not being consistent across cores / shards), but it works pretty well for us... Thanks!
Bob Sandiford | Lead Software Engineer | SirsiDynix P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com www.sirsidynix.com -Original Message- From: Jonathan Rochkind [mailto:rochk...@jhu.edu] Sent: Wednesday, February 16, 2011 4:09 PM To: solr-user@lucene.apache.org Cc: Thumuluri, Sai Subject: Re: Solr multi cores or not Solr multi-core essentially just lets you run multiple separate, distinct Solr indexes in the same running Solr instance. It does NOT let you run queries across multiple cores at once. The cores are just like completely separate Solr indexes; they are just conveniently running in the same Solr instance. (Which can be easier and more compact to set up than actually setting up separate Solr instances. And they can share some config more easily. And it _may_ have implications on JVM usage, not sure). There is no good way in Solr to run a query across multiple Solr indexes; whether they are multi-core or single cores in separate Solr instances doesn't matter. Your first approach should be to try and put all the data in one Solr index (one Solr 'core'). Jonathan On 2/16/2011 3:45 PM, Thumuluri, Sai wrote: Hi, I have a need to index multiple applications using Solr; I also have the need to share indexes or run a search query across these application indexes. Is Solr multi-core the way to go? My server config is 2 virtual CPUs @ 1.8 GHz and has about 32GB of memory. What is the recommendation? Thanks, Sai Thumuluri
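A sketch of the single-index alternative mentioned above (the application field name is made up): give every document a field identifying its application and filter on it:

```
/solr/select?q=user+query&fq=application:app1
```

Since fq filters are cached independently in the filterCache, repeating the same application filter across many queries is cheap.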
Re: solr.HTMLStripCharFilterFactory not working
I updated my data importer. I used to have:

<field column="webtitle" stripHTML="true" />
<field column="webdescription" stripHTML="true" />

which wasn't working. But I changed that to

<field column="webtitle" name="webtitle" stripHTML="true" />
<field column="webdescription" name="webdescription" stripHTML="true" />

and it is working fine. On Tue, Feb 15, 2011 at 5:50 PM, Koji Sekiguchi k...@r.email.ne.jp wrote: (11/02/16 8:03), Tanner Postert wrote: I am using the data import handler and using the HTMLStripTransformer doesn't seem to be working either. I've changed webtitle and webdescription to not be copied from title and description in the schema.xml file, then set them both to just be duplicates of title and description in the data importer query:

<document name="items">
  <entity dataSource="db" name="item" transformer="HTMLStripTransformer"
          query="select title as title, title as webtitle, description as description, description as webdescription FROM ...">
    <field column="webtitle" stripHTML="true" />
    <field column="webdescription" stripHTML="true" />
  </entity>
</document>

Just for input (I'm not sure that I could help you), I'm using HTMLStripTransformer with PlainTextEntityProcessor and it works fine for me:

<dataConfig>
  <dataSource name="f" type="URLDataSource" encoding="UTF-8" baseUrl="http://lucene.apache.org/"/>
  <document>
    <entity name="solr" processor="PlainTextEntityProcessor" transformer="HTMLStripTransformer" dataSource="f" url="solr/">
      <field column="plainText" name="text" stripHTML="true"/>
    </entity>
  </document>
</dataConfig>

Koji -- http://www.rondhuit.com/en/
Re: Solr multi cores or not
I frequently use multiple cores for these reasons: * Completely different applications, such as web search and directory search, or if their update latency / query / caching requirements are very different. I can then also nuke one without affecting the other. Also, you get nice separation for monitoring each app with e.g. New Relic RPM. * Two news collections in different languages, where I don't want the TF/IDF for overlapping terms between the languages to destroy relevancy. I then use sharding if we need to return results from both cores. In production we run 3-4 cores on the same server without problems. But be aware that you have enough memory for the extra caches and a few more Java objects. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 17. feb. 2011, at 00.28, Markus Jelsma wrote: You can also easily abuse shards to query multiple cores that share parts of the schema. This way you have isolation with the ability to query them all. The same can, of course, also be achieved using a single index with a simple field identifying the application and using fq on that one. Yes, you're right, from now on when I say that, I'll say except shards. It is true. My understanding is that the shards functionality's intended use case is for when your index is so large that you want to split it up for performance. I think it works pretty well for that, with some limitations as you mention. From reading the list, my impression is that when people try to use shards to solve some _other_ problem, they generally run into problems. But maybe that's just because the people with the problems are the ones who appear on the list?
My personal advice is still to try and put everything together in one big index; Solr will give you the least trouble with that, it's what Solr likes to do best. Move to shards certainly if your index is so large that moving to shards will give you the performance advantage you need, that's what they're for; be very cautious moving to shards for other challenges that 'one big index' is giving you that you're thinking shards will solve. Shards is, as I understand it, _not_ intended as a general purpose federation function; it's specifically intended to split an index across multiple hosts for performance. Jonathan On 2/16/2011 4:37 PM, Bob Sandiford wrote: Hmmm. Maybe I'm not understanding what you're getting at, Jonathan, when you say 'There is no good way in Solr to run a query across multiple Solr indexes'. What about the 'shards' parameter? That allows searching across multiple cores in the same instance, or shards across multiple instances. There are certainly implications here (like Relevance not being consistent across cores / shards), but it works pretty well for us... Thanks! Bob Sandiford | Lead Software Engineer | SirsiDynix P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com www.sirsidynix.com -Original Message- From: Jonathan Rochkind [mailto:rochk...@jhu.edu] Sent: Wednesday, February 16, 2011 4:09 PM To: solr-user@lucene.apache.org Cc: Thumuluri, Sai Subject: Re: Solr multi cores or not Solr multi-core essentially just lets you run multiple separate, distinct Solr indexes in the same running Solr instance. It does NOT let you run queries across multiple cores at once. The cores are just like completely separate Solr indexes; they are just conveniently running in the same Solr instance. (Which can be easier and more compact to set up than actually setting up separate Solr instances. And they can share some config more easily. And it _may_ have implications on JVM usage, not sure).
There is no good way in Solr to run a query across multiple Solr indexes; whether they are multi-core or single cores in separate Solr instances doesn't matter. Your first approach should be to try and put all the data in one Solr index (one Solr 'core'). Jonathan On 2/16/2011 3:45 PM, Thumuluri, Sai wrote: Hi, I have a need to index multiple applications using Solr; I also have the need to share indexes or run a search query across these application indexes. Is Solr multi-core the way to go? My server config is 2 virtual CPUs @ 1.8 GHz and has about 32GB of memory. What is the recommendation? Thanks, Sai Thumuluri
Re: Migration from Solr 1.2 to Solr 1.4
: if you don't have any custom components, you can probably just use : your entire solr home dir as is -- just change the solr.war. (you can't : just copy the data dir though, you need to use the same configs) : : test it out, and note the Upgrading notes in the CHANGES.txt for the : 1.3, 1.4, and 1.4.1 releases for gotchas that you might want to watch : out for. : Thank you for your reply, I've tried to copy the data and configuration : directory without success : : SEVERE: Could not start SOLR. Check solr/home property : java.lang.RuntimeException: org.apache.lucene.index.CorruptIndexException: : Unknown format version: -10 Hmmm... ok, I'm not sure why that would happen. According to the CHANGES.txt, Solr 1.2 used Lucene 2.1 and Solr 1.4.1 used 2.9.3 -- so Solr 1.4 should have been able to read an index created by Solr 1.2. You *could* try upgrading first from 1.2 to 1.3, run an optimize command, and then try upgrading from 1.3 to 1.4 -- but I can't make any assertions that that will work better, since going straight from 1.2 to 1.4 should have worked the same way. When in doubt: reindex. -Hoss
Re: Searching for negative numbers very slow
: This was my first thought but -1 is relatively common but we have other : numbers just as common. I assume that when you say that you mean ...we have other numbers (that are not negative) just as common, (but searching for them is much faster) ? I don't have any insight into why your negative numbers are slower, but FWIW... : Interestingly enough : : fq=uid:-1 : fq=foo:bar : fq=alpha:omega : : is much (4x) slower than : : q=uid:-1 AND foo:bar AND alpha:omega ...this is (in and of itself) not that surprising for any three arbitrary disjoint queries. When a BooleanQuery is a full conjunction like this (all clauses required) it can efficiently skip scoring a lot of documents by looping over the clauses, asking each one for the next doc they match, and then leap frogging the other clauses to that doc. In the case of the three fq params, each query is executed in isolation, and *all* of the matches of each is accounted for. The speed of using distinct fq params in situations like this comes from the reuse after they are in the filterCache -- you can change fq=foo:bar to fq=foo:baz on the next query, and still reuse 2/3 of the work that was done on the first query. Likewise if the next query is fq=uid:-1&fq=foo:bar&fq=alpha:omegabeta then 2/3 of the work is already done again, and if a following query is fq=uid:-1&fq=foo:baz&fq=alpha:omegabeta then all of the work is already done and cached even though that particular request has never been seen by solr. -Hoss
Re: How to use XML parser in DIH for a database?
Does anyone have an example of using this with a SQL Server varchar or XML field?

<dataConfig>
  <dataSource />
  <document>
    <entity name="y" query="select * from y where xid=${x.id}">
      <entity name="x" processor="XPathEntityProcessor" forEach="/the/record/xpath" url="${y.xml_name}">
        <field column="full_name" xpath="/field/xpath" />
      </entity>
    </entity>
  </document>
</dataConfig>

On 2/16/11 2:17 AM, Stefan Matheis matheis.ste...@googlemail.com wrote: What about using http://wiki.apache.org/solr/DataImportHandler#XPathEntityProcessor ? On Wed, Feb 16, 2011 at 10:08 AM, Bill Bell billnb...@gmail.com wrote: I am using DIH. I am trying to take a column in a SQL Server database that returns an XML string and use XPath to get data out of it. I noticed that XPath works with external files; how do I get it to work with a database? I need something like //insur[5][@name='Blue Cross'] Thanks.
Re: score from two cores
A common problem in metasearch engines. It's not intractable. You just have to surface the right statistics into a 'fusion' scorer. - NOT always nice. When are we getting better releases? -- View this message in context: http://lucene.472066.n3.nabble.com/score-from-two-cores-tp2012444p2515617.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Http Connection is hanging while deleteByQuery
Thanks for updating your solution On Tue, Feb 8, 2011 at 8:20 AM, shan2812 shanmugaraja...@gmail.com wrote: Hi, At last the migration to Solr-1.4.1 does solve this issue :-).. Cheers -- View this message in context: http://lucene.472066.n3.nabble.com/Http-Connection-is-hanging-while-deleteByQuery-tp2367405p2451214.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!
Thanks for the response Hoss. Sorry for replying late; I was on a business trip. The server was indexing as well as searching at the same time and it was configured for a Native file lock, could that be the issue? I got another server so moved it to a Master slave configuration with the file lock being single on both machines, and that solved the issue. I would however love to know what caused that error (it's never too late to learn, right???) Thanks, Ravi Kiran On Mon, Feb 7, 2011 at 2:51 PM, Chris Hostetter hossman_luc...@fucit.orgwrote: : While reloading a core I got this following error, when does this : occur ? Prior to this exception I do not see anything wrong in the logs. well, there are really two distinct types of errors in your log... : [#|2011-02-01T13:02:36.697-0500|SEVERE|sun-appserver2.1|org.apache.solr.servlet.SolrDispatchFilter|_ThreadID=25;_ThreadName=httpWorkerThread-9001-5;_RequestID=450f6337-1f5c-42bc-a572-f0924de36b56;|org.apache.lucene.store.LockObtainFailedException: : Lock obtain timed out: NativeFSLock@ : /data/solr/core/solr-data/index/lucene-7dc773a074342fa21d7d5ba09fc80678-write.lock : at org.apache.lucene.store.Lock.obtain(Lock.java:85) : at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1565) : at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1421) ...this is error #1, indicating that for some reason the IndexWriter Solr was trying to create wasn't able to get a Native Filesystem lock on your index directory -- is it possible you have two instances of Solr (or two solr cores) trying to re-use the same data directory? (diagnosing exactly why you got this error also requires knowing what Filesystem you are using).
: [#|2011-02-01T13:02:40.330-0500|SEVERE|sun-appserver2.1|org.apache.solr.update.SolrIndexWriter|_ThreadID=82;_ThreadName=Finalizer;_RequestID=121fac59-7b08-46b9-acaa-5c5462418dc7;|SolrIndexWriter : was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE : LEAK!!!|#] : : [#|2011-02-01T13:02:40.330-0500|SEVERE|sun-appserver2.1|org.apache.solr.update.SolrIndexWriter|_ThreadID=82;_ThreadName=Finalizer;_RequestID=121fac59-7b08-46b9-acaa-5c5462418dc7;|SolrIndexWriter : was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE : LEAK!!!|#] ...these errors are warning you that something very unexpected was discovered when the Garbage Collector tried to clean up the SolrIndexWriter -- it found that the SolrIndexWriter had never been formally closed. In normal operation, this might indicate the existence of a bug in code not managing its resources properly -- and in fact, it does indicate the existence of a bug in that evidently a Lock timed out failure doesn't cause the SolrIndexWriter to be closed -- but in your case it's not really something to be worried about -- it's just a cascading effect of the first error. -Hoss
Validate Query Syntax of Solr Request Before Sending
Hi, I wonder if it is possible to let the user build up a Solr query and have it validated by some Java API before sending it to Solr. Is there a parser that could help with that? I would like to help the user build a valid query as she types by showing messages like The query is not valid, or perhaps even more advanced: The parentheses are not balanced. Maybe one day it would also be possible to analyse the semantics of the query, like: This query has a built-in inconsistency because the two dates you have specified require documents to be both before AND after these dates. But this is far future... Regards, Christian Sonne Jensen -- View this message in context: http://lucene.472066.n3.nabble.com/Validate-Query-Syntax-of-Solr-Request-Before-Sending-tp2515797p2515797.html Sent from the Solr - User mailing list archive at Nabble.com.
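A full syntax check would mean running the string through Lucene's QueryParser and catching ParseException (which requires the Lucene jar). As a self-contained sketch of just the simpler parenthesis-balance hint mentioned above (class and method names made up):

```java
public class QueryChecker {
    // Returns true when every '(' has a matching ')' in order,
    // ignoring parentheses inside "quoted phrases".
    public static boolean parenthesesBalanced(String query) {
        int depth = 0;
        boolean inPhrase = false;
        for (char c : query.toCharArray()) {
            if (c == '"') inPhrase = !inPhrase;
            if (inPhrase) continue;              // skip chars inside a phrase
            if (c == '(') depth++;
            if (c == ')' && --depth < 0) return false; // ')' before any '('
        }
        return depth == 0;
    }
}
```

The same shape of check could drive the as-you-type message the post describes: run it on every keystroke and show "parentheses are not balanced" when it returns false.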
Re: How to use XML parser in DIH for a database?
Use a FieldReaderDataSource for reading a field from the database and then use XPathEntityProcessor. The FieldReaderDataSource will give you the stream that the XPath entity processor needs. Below is an example DIH configuration:

<?xml version="1.0"?>
<dataConfig>
  <dataSource type="JdbcDataSource" driver="oracle.jdbc.driver.OracleDriver" url="jdbc:oracle:thin:@localhost:1521:xe" user="user" password="password" name="ds"/>
  <dataSource name="fieldSource" type="FieldReaderDataSource" />
  <document>
    <entity name="clobxml" dataSource="ds" query="select * from tableXX" transformer="ClobTransformer">
      <field column="ID" name="id" />
      <field column="SUPPLIER_APPROVALS" name="supplier_approvals" clob="true"/>
      <entity name="xmlread" dataSource="fieldSource" processor="XPathEntityProcessor" forEach="/suppliers/supplier" dataField="clobxml.SUPPLIER_APPROVALS" onError="continue">
        <field column="supplier_name" xpath="/suppliers/supplier/name" />
        <field column="supplier_id" xpath="/suppliers/supplier/id" />
      </entity>
    </entity>
  </document>
</dataConfig>

- Thanx: Grijesh http://lucidimagination.com -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-use-XML-parser-in-DIH-for-a-database-tp2508015p2515910.html Sent from the Solr - User mailing list archive at Nabble.com.
Is facet could be used for Analytics
Hello all, We need to build an analytics kind of application. Initially we planned to aggregate the results and add them to a database, or use an ETL tool. I have the idea of using facet search instead, and I just want to know others' opinions on this. We require results in the fashion below: the top 3 results in each column.

Top users     Country        Page accessed
UserA (100)   India (1000)   /Articles/abc (200)
UserB (100)   US (500)       /Articles/xyz (200)
UserC (100)   Russia (200)   /Articles/aaa (100)

When clicking on a particular user, the results should be grouped for that user:

Top users     Country        Page accessed
UserA (100)   India (100)    /Articles/abc (55)
              US (50)        /Articles/xyz (25)
                             /Articles/aaa (10)

This is just an example. I think facet search will help to solve this kind of issue. Regards, Ganesh
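For the facet approach, one request with a facet.field per column and facet.limit=3 returns the top-3 counts for every column at once, and the per-user drill-down is the same request plus an fq on that user. A minimal sketch of assembling such a query string, assuming hypothetical field names user, country, and page:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class FacetQueryBuilder {

    /** Builds the query-string portion of a faceted request; field names are hypothetical. */
    public static String buildParams(String q, String... facetFields) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", q);
        params.put("rows", "0");            // counts only, no documents needed
        params.put("facet", "true");
        params.put("facet.limit", "3");     // the "top 3 per column" requirement
        StringBuilder sb = new StringBuilder();
        params.forEach((k, v) -> sb.append(enc(k)).append('=').append(enc(v)).append('&'));
        for (String f : facetFields) sb.append("facet.field=").append(enc(f)).append('&');
        sb.setLength(sb.length() - 1);      // drop the trailing '&'
        return sb.toString();
    }

    private static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Drilling down on one user would add fq=user:UserA to the same request.
        System.out.println(buildParams("*:*", "user", "country", "page"));
    }
}
```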
Re: Is facet could be used for Analytics
I think facet search is good for your requirement. Also, what about the Result Grouping feature of Solr? - Thanx: Grijesh http://lucidimagination.com -- View this message in context: http://lucene.472066.n3.nabble.com/Is-facet-could-be-used-for-Analytics-tp2515938p2515959.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Searching for negative numbers very slow
Is it my imagination, or has this exact email been on the list already?

Dennis Gearon Signature Warning: It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others' mistakes, so you do not have to make them yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036' EARTH has a Right To Life, otherwise we all die.

From: Chris Hostetter hossman_luc...@fucit.org To: solr-user@lucene.apache.org Cc: yo...@lucidimagination.com Sent: Wed, February 16, 2011 6:20:28 PM Subject: Re: Searching for negative numbers very slow

: This was my first thought, but -1 is relatively common and we have other
: numbers just as common.

I assume that when you say that, you mean "...we have other numbers (that are not negative) just as common (but searching for them is much faster)"? I don't have any insight into why your negative numbers are slower, but FWIW...

: Interestingly enough
:
: fq=uid:-1
: fq=foo:bar
: fq=alpha:omega
:
: is much (4x) slower than
:
: q=uid:-1 AND foo:bar AND alpha:omega

...this is (in and of itself) not that surprising for any three arbitrary disjoint queries. When a BooleanQuery is a full conjunction like this (all clauses required), it can efficiently skip scoring a lot of documents by looping over the clauses, asking each one for the next doc it matches, and then leap-frogging the other clauses to that doc. In the case of the three fq params, each query is executed in isolation, and *all* of the matches of each are accounted for. The speed of using distinct fq params in situations like this comes from the reuse after they are in the filterCache -- you can change fq=foo:bar to fq=foo:baz on the next query, and still reuse 2/3 of the work that was done on the first query.

Likewise, if the next query is fq=uid:-1&fq=foo:bar&fq=alpha:omegabeta then 2/3 of the work is already done again, and if a following query is fq=uid:-1&fq=foo:baz&fq=alpha:omegabeta then all of the work is already done and cached, even though that particular request has never been seen by Solr. -Hoss
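The leap-frogging of required clauses described here can be sketched over sorted doc-ID lists: take the first clause's current doc as a candidate and skip every other clause forward to it; any clause that overshoots supplies a new, larger candidate. This is only an illustration of the skipping idea, not Lucene's scorer code, and the postings arrays are invented:

```java
import java.util.ArrayList;
import java.util.List;

public class LeapFrogIntersect {

    /** Intersects sorted doc-ID lists by leap-frogging lagging lists past the candidate. */
    public static List<Integer> intersect(int[]... postings) {
        List<Integer> hits = new ArrayList<>();
        int[] pos = new int[postings.length];       // one cursor per required clause
        outer:
        while (pos[0] < postings[0].length) {
            int candidate = postings[0][pos[0]];    // clause 0 proposes a doc
            for (int i = 1; i < postings.length; i++) {
                // leap-frog this clause forward to the first doc >= candidate
                while (pos[i] < postings[i].length && postings[i][pos[i]] < candidate) pos[i]++;
                if (pos[i] >= postings[i].length) break outer;  // clause exhausted: done
                if (postings[i][pos[i]] > candidate) {
                    // overshoot: the larger doc becomes the new candidate, so
                    // skip clause 0 forward to it and start the round over
                    int target = postings[i][pos[i]];
                    while (pos[0] < postings[0].length && postings[0][pos[0]] < target) pos[0]++;
                    continue outer;
                }
            }
            hits.add(candidate);                    // every clause matched this doc
            pos[0]++;
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] uid   = {2, 5, 9, 14};
        int[] foo   = {1, 5, 9, 20};
        int[] alpha = {5, 9, 14};
        System.out.println(intersect(uid, foo, alpha));   // [5, 9]
    }
}
```

Most docs in the three lists are never compared individually, which is why the single conjunctive q can beat three uncached fq params that each enumerate all of their matches.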