Re: Use case for storing positions and offsets in index?
Term positions in the index are used for phrase queries and span queries. There is a separate concept called term vectors that maintains positions as well. It is most useful for highlighting - you want to know exactly where a term started and ended.

-- Jack Krupansky

-----Original Message-----
From: KnightRider
Sent: Tuesday, May 07, 2013 12:58 PM
To: solr-user@lucene.apache.org
Subject: Use case for storing positions and offsets in index?

Can someone please tell me the use case for storing term positions and offsets in the index? I am trying to understand the difference between storing positions/offsets vs. indexing positions/offsets.

Thanks
-K'Rider

--
View this message in context: http://lucene.472066.n3.nabble.com/Use-case-for-storing-positions-and-offsets-in-index-tp4061376.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Use case for storing positions and offsets in index?
Consider further that term vector data and highlighting become very useful if you highlight externally to Solr. That is to say, you have the data stored externally and wish to re-parse positions of terms (especially synonyms) from the source material. This is a not-too-uncommon technique used for extremely large articles, where storing the data in the Lucene index might be repetitive.

On May 8, 2013, at 11:04 PM, Jack Krupansky j...@basetechnology.com wrote:
> Term positions in the index are used for phrase queries and span queries. [...]
Portability of Solr index
I have built a Solr index on Windows 7 Enterprise, 64-bit, and copied the index to CentOS release 6.2, 32-bit. The index is readable and the application is able to load data from it on Linux, but there are a few fields on which fq queries don't work on Linux, while the same fq queries work on Windows. I am in a situation where I have to prepare the index on Windows and port it to Linux, so I need the index to be portable. The only thing not working is the fq queries.

Inside the BlockTreeTermsReader seekExact API, I enabled debugging and System.out statements:

scanToTermLeaf: block fp=1705107 prefix=0 nextEnt=0 (of 167)
target=1RD0JIHMr9aw4RPPuS0DVzB2tKf38FfjKaEg7HsYDd7EtAOpE9FYvvj5ryB7679r4KNnlIazevPo h7qabtLhXw==
[31 52 44 30 4a 49 48 4d 72 39 61 77 34 52 50 50 75 53 30 44 56 7a 42 32 74 4b 66 33 38 46 66 6a 4b 61 45 67 37 48 73 59 44 64 37 45 74 41 4f 70 45 39 46 59 76 76 6a 35 72 79 42 37 36 37 39 72 34 4b 4e 6e 6c 49 61 7a 65 76 50 6f d a 68 37 71 61 62 74 4c 68 58 77 3d 3d]
term= []

This is a term query, and these are the target bytes to match. As per the algorithm, it runs through the terms and tries to match; the 6th term is an exact match except for a few bytes:

cycle: term 6 (of 167)
suffix=1RD0JIHMr9aw4RPPuS0DVzB2tKf38FfjKaEg7HsYDd7EtAOpE9FYvvj5ryB7679r4KNnlIazevPo h7qabtLhXw==
[31 52 44 30 4a 49 48 4d 72 39 61 77 34 52 50 50 75 53 30 44 56 7a 42 32 74 4b 66 33 38 46 66 6a 4b 61 45 67 37 48 73 59 44 64 37 45 74 41 4f 70 45 39 46 59 76 76 6a 35 72 79 42 37 36 37 39 72 34 4b 4e 6e 6c 49 61 7a 65 76 50 6f a 68 37 71 61 62 74 4c 68 58 77 3d 3d]
Prefix:=0 Suffix:=89 target.offset:=0 target.length:=90 targetLimit:=89

From the first section: 50 6f d a 68 37. From the second section: 50 6f a 68 37. The test scenario is that the index is built on Linux and I am testing the index through the Solr API on a Windows machine.
-- View this message in context: http://lucene.472066.n3.nabble.com/Portability-of-Solr-index-tp4061794.html Sent from the Solr - User mailing list archive at Nabble.com.
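The two hex dumps in the post above differ only by a single 0x0d (carriage-return) byte before the 0x0a, which is the classic Windows CRLF vs. Unix LF difference. A minimal sketch of that observation, using the exact byte values from the dumps:

```python
# The term fragment from the Windows-built index contains CR+LF (0d 0a);
# the Linux-built fragment contains only LF (0a). Otherwise the bytes match.
win_fragment = bytes.fromhex("506f0d0a6837")  # ...Po\r\nh7... (first dump)
lin_fragment = bytes.fromhex("506f0a6837")    # ...Po\nh7...   (second dump)

assert win_fragment != lin_fragment
# Normalizing CRLF to LF makes the fragments identical, which suggests the
# source data's line endings (not the index format) caused the term mismatch.
assert win_fragment.replace(b"\r\n", b"\n") == lin_fragment
print("fragments match after newline normalization")
```

This supports normalizing line endings in the source data before indexing, rather than suspecting the index files themselves (which are platform-independent).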
Need solr query help
We are doing spatial search with the following logic: a) there are shops in a city, each providing home delivery; b) each shop has a different max_delivery_distance. Now suppose someone is searching from point P1 with radius R. The user wants the shops that can deliver to him (the distance d1 between P1 and shop s1 should be less than that shop's max_delivery_distance, say md1). How can I implement this with a Solr spatial query?
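One way this is commonly sketched in Solr (assuming each shop document stores its location in a spatial field, here called store, and its radius in a numeric field, here called max_delivery_distance - both field names are placeholders, and the pt/d values are examples):

```
&fq={!geofilt sfield=store pt=28.6,77.2 d=10}
&fq={!frange l=0}sub(max_delivery_distance,geodist(store,28.6,77.2))
```

The first fq applies the user's radius R (10 km here); the second keeps documents where max_delivery_distance - geodist(...) >= 0, i.e. the shop's own delivery radius covers the distance to P1. geodist(sfield,lat,lon) and the {!frange} parser are standard Solr spatial/function-query features, but verify the field names and units against your schema.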
More Like This and Caching
Hi all, Could anybody explain which Solr cache (e.g. queryResultCache, documentCache, fieldCache, etc.) can be used by the More Like This handler? One of my colleagues had previously suggested that the More Like This handler does not take advantage of any of the Solr caches. However, if I issue two identical MLT requests to the same Solr instance, the second request will execute much faster than the first request (for example, the first request will execute in 200ms and the second request will execute in 20ms). This makes me believe that at least one of the Solr caches is being used by the More Like This handler. I think the documentCache is the cache that is most likely being used, but would you be able to confirm? As information, I am currently using Solr version 3.6.1. Kind regards, Giammarco Schisani
Re: Re: Re: Re: Shard update error when using DIH
Thank you all, guys. Your advice worked great and I don't see any errors in the Solr logs anymore.

Best, Alex

Monday 29 April 2013, you wrote:
On 29 April 2013 14:55, heaven wrote:
> Got these errors after switching the field type to long:
> crm-test: org.apache.solr.common.SolrException: org.apache.solr.common.SolrException: Unknown fieldtype 'long' specified on field _version_

You have probably edited your schema. The default one has

<fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>

towards the top of the file.

Regards, Gora

--
View this message in context: http://lucene.472066.n3.nabble.com/Shard-update-error-when-using-DIH-tp4035502p4061812.html
Sent from the Solr - User mailing list archive at Nabble.com.
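For reference, in the stock Solr 4 example schema the _version_ field that the error message complains about is declared alongside that long type roughly as follows (check your own schema.xml rather than copying this verbatim):

```
<field name="_version_" type="long" indexed="true" stored="true"/>
```

If either the long fieldType or this field definition is removed from an edited schema, SolrCloud/DIH updates will fail with the "Unknown fieldtype 'long' specified on field _version_" error quoted above.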
filter result by numFound in Result Grouping
Hello list,

In one of our searches that uses Result Grouping, we need to filter the results to only those groups that have more than one document in the group, or more specifically to groups that have exactly two documents. Is it possible in some way?

Thank you
Re: Solr 4.3 fails in startup when dataimporthandler declaration is included in solrconfig.xml
My question was: When you move the DIH libs to Solr's classloader (e.g. instanceDir/lib, referred to from solrconfig.xml) and remove solr.war from tomcat/lib, what error msg do you then get? Also make sure to delete the old tomcat/webapps/solr folder, just to make sure you're starting from scratch.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

9. mai 2013 kl. 01:54 skrev William Pierce evalsi...@hotmail.com:

The reason I placed the solr.war in tomcat/lib was -- I guess -- because that's the way I had always done it since 1.3 days. Our Tomcat instance(s) run nothing other than Solr, so that seemed as good a place as any. The DIH jars that I placed in tomcat/lib are: solr-dataimporthandler-4.3.0.jar and solr-dataimporthandler-extras-4.3.0.jar. Are there any dependent jars that also need to be added that I am unaware of?

On the specific errors - I get the stack trace noted in the first email that began this thread, but repeated here for convenience:

ERROR - 2013-05-08 10:43:48.185; org.apache.solr.core.CoreContainer; Unable to create core: collection1
org.apache.solr.common.SolrException: org/apache/solr/util/plugin/SolrCoreAware
    at org.apache.solr.core.SolrCore.init(SolrCore.java:821)
    at org.apache.solr.core.SolrCore.init(SolrCore.java:618)
    at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:949)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:984)
    at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:597)
    at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:592)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.NoClassDefFoundError: org/apache/solr/util/plugin/SolrCoreAware
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(Unknown Source)
    at java.security.SecureClassLoader.defineClass(Unknown Source)
    at java.net.URLClassLoader.defineClass(Unknown Source)
    at java.net.URLClassLoader.access$100(Unknown Source)
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Unknown Source)
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1700)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at java.net.FactoryURLClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at java.net.FactoryURLClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Unknown Source)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:448)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:396)
    at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:518)
    at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:592)
    at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:154)
    at org.apache.solr.core.SolrCore.init(SolrCore.java:758)
    ... 13 more
Caused by: java.lang.ClassNotFoundException: org.apache.solr.util.plugin.SolrCoreAware
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    ... 40 more
ERROR - 2013-05-08 10:43:48.189; org.apache.solr.common.SolrException; null:org.apache.solr.common.SolrException: Unable to create core: collection1
    at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:1450)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:993)
    at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:597)
    at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:592)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at
ColrCloud: IOException occured when talking to server at
Hi, I'm observing lots of these errors with SolrCloud. Here are the instructions I am using to run the services:

zookeeper:
1: cd /opt/zookeeper/
2: sudo bin/zkServer.sh start zoo1.cfg
3: sudo bin/zkServer.sh start zoo2.cfg
4: sudo bin/zkServer.sh start zoo3.cfg

shards:
1: cd /opt/solr-cluster/shard1/
   sudo su solr -c "java -Xmx4096M -DzkHost=localhost:2181,localhost:2182,localhost:2183 -Dbootstrap_confdir=./solr/conf -Dcollection.configName=Carmen -DnumShards=2 -jar start.jar etc/jetty.xml etc/jetty-logging.xml"
2: cd ../shard2/
   sudo su solr -c "java -Xmx4096M -DzkHost=localhost:2181,localhost:2182,localhost:2183 -jar start.jar etc/jetty.xml etc/jetty-logging.xml"

replicas:
1: cd ../replica1/
   sudo su solr -c "java -Xmx4096M -DzkHost=localhost:2181,localhost:2182,localhost:2183 -jar start.jar etc/jetty.xml etc/jetty-logging.xml"
2: cd ../replica2/
   sudo su solr -c "java -Xmx4096M -DzkHost=localhost:2181,localhost:2182,localhost:2183 -jar start.jar etc/jetty.xml etc/jetty-logging.xml"

zoo1.cfg:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/zookeeper/data/1
# the port at which the clients will connect
clientPort=2181
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

zoo2.cfg and zoo3.cfg are the same except for dataDir and clientPort, respectively.

Also, very often I see org.apache.solr.common.SolrException: No registered leader was found, and lots of other errors. I just updated jetty.xml, set org.eclipse.jetty.server.Request.maxFormContentSize to 10MB and restarted the cluster; half of the errors are gone, but this one about the IOException is still here.
I am re-indexing a few models (a Rails application); they have from 1,000,000 to 20,000,000 records. For indexing I have a queue (MongoDB) and a few workers which process it in batches of 200-500 records. All Solr and ZooKeeper instances are launched on the same server: 2 Intel Xeon processors, 8 cores total, 32 GB of memory and fast RAID storage. Please help me figure out what could be the reason for those errors and how I can fix them. Please tell me if I can provide more information about the server setup, logs, errors, etc.

Best, Alex

Topology: http://lucene.472066.n3.nabble.com/file/n4061831/Topology.png
Shard 1: http://lucene.472066.n3.nabble.com/file/n4061831/Shard1.png
Replica 1: http://lucene.472066.n3.nabble.com/file/n4061831/Replica1.png
Shard 2: http://lucene.472066.n3.nabble.com/file/n4061831/Shard2.png
Replica 2: http://lucene.472066.n3.nabble.com/file/n4061831/Replica2.png

--
View this message in context: http://lucene.472066.n3.nabble.com/ColrCloud-IOException-occured-when-talking-to-server-at-tp4061831.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr 4.2 rollback not working
So for all current versions of Solr, rollback will not work for SolrCloud? Will this change in the future, or will rollback always be unsupported for SolrCloud? This did catch me by surprise. Should the SolrJ documentation be updated to reflect this behavior?

http://lucene.apache.org/solr/4_3_0/solr-solrj/org/apache/solr/client/solrj/SolrServer.html#rollback%28%29

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-4-2-rollback-not-working-tp4060393p4061834.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: ColrCloud: IOException occured when talking to server at
Forgot to mention: Solr is 4.2 and ZooKeeper 3.4.5. I do not do manual commits; I prefer a softCommit every second and an autoCommit every 3 minutes.

The problem happened again: lots of errors in the logs and no description. The cluster state changed; on shard 2 the replica became the leader, and the former leader went into recovering mode. The error happened when:
1. Shard1 tried to forward an update to Shard2, and this was the initial error from Shard2: ClusterState says we are the leader, but locally we don't think so
2. Shard2 forwarded the update to Replica2 and got: org.apache.solr.common.SolrException: Request says it is coming from leader, but we are the leader

Please see the attachments:
Topology: http://lucene.472066.n3.nabble.com/file/n4061839/Topology_new.png
Shard1: http://lucene.472066.n3.nabble.com/file/n4061839/Shard1_new.png
Replica1: http://lucene.472066.n3.nabble.com/file/n4061839/Replica1_new.png
Shard2: http://lucene.472066.n3.nabble.com/file/n4061839/Shard2_new.png
Replica2: http://lucene.472066.n3.nabble.com/file/n4061839/Replica2_new.png

All the errors in the screenshots appear each time the server load gets higher. As soon as I started a few more queue workers, the load got higher and the cluster became unstable. So I have doubts about reliability. Could any docs be lost during these errors, or should I just ignore them? I understand that 4 Solr instances and 3 ZooKeeper instances could be too many for a single machine; there could be not enough resources, etc. But it still should not cause anything like this. The worst scenario should be a timeout error, when Solr is not responding; my queue processors can handle that and resend the request after a while.

--
View this message in context: http://lucene.472066.n3.nabble.com/ColrCloud-IOException-occured-when-talking-to-server-at-tp4061831p4061839.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: ColrCloud: IOException occured when talking to server at
Zookeeper log:

2013-05-09 03:03:07,177 [myid:3] - WARN [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2183:Follower@118] - Got zxid 0x20001 expected 0x1
2013-05-09 03:36:52,918 [myid:3] - ERROR [CommitProcessor:3:NIOServerCnxn@180] - Unexpected Exception:
java.nio.channels.CancelledKeyException
    at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
    at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
    at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:153)
    at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1076)
    at org.apache.zookeeper.server.NIOServerCnxn.process(NIOServerCnxn.java:1113)
    at org.apache.zookeeper.server.DataTree.setWatches(DataTree.java:1327)
    at org.apache.zookeeper.server.ZKDatabase.setWatches(ZKDatabase.java:384)
    at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:304)
    at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)
2013-05-09 03:36:52,928 [myid:3] - ERROR [CommitProcessor:3:NIOServerCnxn@180] - Unexpected Exception:
java.nio.channels.CancelledKeyException
    at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
    at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
    at org.apache.zookeeper.server.NIOServerCnxn.s[trace interrupted in the paste by the next entry]
2013-05-09 04:26:04,790 [myid:2] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@349] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x23e88bdaf81, likely client has closed socket
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
    at java.lang.Thread.run(Thread.java:679)
[remainder of an interleaved CancelledKeyException trace:]
    at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
    at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
    at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:153)
    at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1076)
    at org.apache.zookeeper.server.NIOServerCnxn.process(NIOServerCnxn.java:1113)
    at org.apache.zookeeper.server.WatchManager.triggerWatch(WatchManager.java:120)
    at org.apache.zookeeper.server.WatchManager.triggerWatch(WatchManager.java:92)
    at org.apache.zookeeper.server.DataTree.setData(DataTree.java:620)
    at org.apache.zookeeper.server.DataTree.processTxn(DataTree.java:807)
    at org.apache.zookeeper.server.ZKDatabase.processTxn(ZKDatabase.java:329)
    at org.apache.zookeeper.server.ZooKeeperServer.processTxn(ZooKeeperServer.java:965)
    at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:116)
    at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)
2013-05-09 04:27:04,002 [myid:3] - ERROR [CommitProcessor:3:NIOServerCnxn@180] - Unexpected Exception:
java.nio.channels.CancelledKeyException
    at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
    at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
    at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:153)
    at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1076)
    at org.apache.zookeeper.server.NIOServerCnxn.process(NIOServerCnxn.java:1113)
    at org.apache.zookeeper.server.WatchManager.triggerWatch(WatchManager.java:120)
    at org.apache.zookeeper.server.WatchManager.triggerWatch(WatchManager.java:92)
    at org.apache.zookeeper.server.DataTree.deleteNode(DataTree.java:591)
    at org.apache.zookeeper.server.DataTree.killSession(DataTree.java:966)
    at org.apache.zookeeper.server.DataTree.processTxn(DataTree.java:818)
    at org.apache.zookeeper.server.ZKDatabase.processTxn(ZKDatabase.java:329)
    at org.apache.zookeeper.server.ZooKeeperServer.processTxn(ZooKeeperServer.java:965)
    at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:116)
    at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)
2013-05-09 04:36:00,485 [myid:3] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2183:NIOServerCnxn@349] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
Re: Solr 4.2 rollback not working
At the least it should throw an exception if you try rollback with SolrCloud - though now there is discussion about removing it entirely. But yes, it's not supported and there are no real plans to support it.

- Mark

On May 9, 2013, at 7:21 AM, mark12345 marks1900-pos...@yahoo.com.au wrote:
> So for all current versions of Solr, rollback will not work for SolrCloud? [...]
Re: ColrCloud: IOException occured when talking to server at
I can confirm this led to data loss. I have 1,217,427 records in the database and only 1,217,216 indexed, which means that Solr gave a successful response and then did not add some documents to the index. It seems like SolrCloud is not a production-ready solution; it would be good if there were a warning about that in the Solr wiki.

--
View this message in context: http://lucene.472066.n3.nabble.com/ColrCloud-IOException-occured-when-talking-to-server-at-tp4061831p4061847.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr 4.3 fails in startup when dataimporthandler declaration is included in solrconfig.xml
I got this to work (thanks, Jan, and all). It turns out that the DIH jars need to be included explicitly, either by specifying them in solrconfig.xml or by placing them in a default path under solr.home. I placed these jars in instanceDir/lib and it worked. Previously I had reported it as not working - this was because I had mistakenly left a copy of the jars under tomcat/lib.

Bill

-----Original Message-----
From: Jan Høydahl
Sent: Thursday, May 09, 2013 2:58 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 4.3 fails in startup when dataimporthandler declaration is included in solrconfig.xml

> My question was: When you move DIH libs to Solr's classloader (e.g. instanceDir/lib and refer from solrconfig.xml), and remove solr.war from tomcat/lib, what error msg do you then get? [...]
Re: Portability of Solr index
What is the query/term you are looking for? I wonder if the difference is due to newline treatment on different platforms. Regards, Alex.
Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)

On Thu, May 9, 2013 at 1:49 AM, mukesh katariya mukesh.katar...@e-zest.in wrote:
I have built a SOLR index on Windows 7 Enterprise, 64 Bit. I copy the index to CentOS release 6.2, 32 Bit OS. The index is readable and the application is able to load data from the index on Linux, but there are a few fields on which FQ queries don't work on Linux; the same FQ queries work on Windows. I have a situation where I have to prepare the index on Windows and port it to Linux, so I need the index to be portable. The only thing which is not working is the FQ queries. Inside the BlockTreeTermsReader seekExact API, I have enabled debugging and System.out statements:

scanToTermLeaf: block fp=1705107 prefix=0 nextEnt=0 (of 167) target=1RD0JIHMr9aw4RPPuS0DVzB2tKf38FfjKaEg7HsYDd7EtAOpE9FYvvj5ryB7679r4KNnlIazevPo h7qabtLhXw== [31 52 44 30 4a 49 48 4d 72 39 61 77 34 52 50 50 75 53 30 44 56 7a 42 32 74 4b 66 33 38 46 66 6a 4b 61 45 67 37 48 73 59 44 64 37 45 74 41 4f 70 45 39 46 59 76 76 6a 35 72 79 42 37 36 37 39 72 34 4b 4e 6e 6c 49 61 7a 65 76 50 6f d a 68 37 71 61 62 74 4c 68 58 77 3d 3d] term= []

This is a term query, and target bytes to match. As per the algorithm it runs through the terms and tries to match; now the 6th term is an exact match, but there is a problem of a few bytes:

cycle: term 6 (of 167) suffix=1RD0JIHMr9aw4RPPuS0DVzB2tKf38FfjKaEg7HsYDd7EtAOpE9FYvvj5ryB7679r4KNnlIazevPo h7qabtLhXw== [31 52 44 30 4a 49 48 4d 72 39 61 77 34 52 50 50 75 53 30 44 56 7a 42 32 74 4b 66 33 38 46 66 6a 4b 61 45 67 37 48 73 59 44 64 37 45 74 41 4f 70 45 39 46 59 76 76 6a 35 72 79 42 37 36 37 39 72 34 4b 4e 6e 6c 49 61 7a 65 76 50 6f a 68 37 71 61 62 74 4c 68 58 77 3d 3d]

Prefix:=0 Suffix:=89 target.offset:=0 target.length :=90 targetLimit :=89

From the first comment: 50 6f d a 68 37. From the second comment: 50 6f a 68 37. The test scenario is the index is built on Linux and I am testing the index through the Solr API on a Windows machine. -- View this message in context: http://lucene.472066.n3.nabble.com/Portability-of-Solr-index-tp4061783.html Sent from the Solr - User mailing list archive at Nabble.com.
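The two hex dumps above differ only around the embedded line break (50 6f d a 68 37 vs. 50 6f a 68 37), i.e. CRLF vs. LF. An illustrative sketch (not from the thread; byte values abbreviated) of why the exact byte comparison inside the terms dictionary can never match; the usual fix is to normalize newlines in field values before indexing:

```python
# Hypothetical sketch: the indexed term embeds a line break. A Windows-built
# index carries CRLF (\r\n -> 0d 0a); the Linux-side query term carries only
# LF (\n -> 0a), so seekExact's byte-for-byte comparison fails.
windows_term = b"evPo\r\nh7qabtLhXw=="  # ...50 6f 0d 0a 68 37... (CRLF)
linux_term = b"evPo\nh7qabtLhXw=="      # ...50 6f 0a 68 37...    (LF only)

print(windows_term == linux_term)  # False: the terms differ by one byte
print(len(windows_term) - len(linux_term))  # 1
```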
Fuzzy searching documents over multiple fields using Solr
Not sure if this has ever come up (or perhaps it has even been implemented without me knowing), but I'm interested in doing fuzzy search over multiple fields using Solr. What I mean is the ability to return documents based on some 'distance calculation' without documents having to match 100% of the query. Use case: a user is searching for a TV with a couple of filters selected. No TV matches all filters. How do you come up with a bunch of suggestions that match the selected filters as closely as possible? The hard part is to determine what 'closely' means in this context, etc. This relates to (approximate) nearest neighbor, kd-trees, etc. Has anyone ever tried to do something similar? Any plugins, etc.? Or reasons Solr/Lucene would/wouldn't be the correct system to build on? Thanks
Re: Fuzzy searching documents over multiple fields using Solr
A simple OR boolean query will boost documents that have more matches. You can also selectively boost individual OR terms to control importance. And do an AND for the required terms, like tv. -- Jack Krupansky
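A minimal sketch of that query shape: AND (require) the product type, OR the soft filters, and boost the more important ones. The field names and boost values here are invented for illustration, not taken from any real schema.

```python
# Build a Lucene-syntax query string: "tv" is mandatory (+), the soft
# filters are optional OR clauses, each with a caret boost so documents
# matching more (and more important) filters score higher.
soft_filters = {"size:42": 2.0, "brand:sony": 1.5, "type:led": 1.0}
optional = " ".join(f"{clause}^{boost}" for clause, boost in soft_filters.items())
query = f"+category:tv ({optional})"
print(query)  # +category:tv (size:42^2.0 brand:sony^1.5 type:led^1.0)
```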
4.3 logging setup
On all prior index versions I set up my logging via the logging.properties file in /usr/local/tomcat/conf; it looked like this:

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

handlers = 1catalina.org.apache.juli.FileHandler, 2localhost.org.apache.juli.FileHandler, 3manager.org.apache.juli.FileHandler, 4host-manager.org.apache.juli.FileHandler, java.util.logging.ConsoleHandler

.handlers = 1catalina.org.apache.juli.FileHandler, java.util.logging.ConsoleHandler

# Handler specific properties.
# Describes specific configuration info for Handlers.
1catalina.org.apache.juli.FileHandler.level = WARNING
1catalina.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
1catalina.org.apache.juli.FileHandler.prefix = catalina.
2localhost.org.apache.juli.FileHandler.level = FINE
2localhost.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
2localhost.org.apache.juli.FileHandler.prefix = localhost.
3manager.org.apache.juli.FileHandler.level = FINE
3manager.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
3manager.org.apache.juli.FileHandler.prefix = manager.
4host-manager.org.apache.juli.FileHandler.level = FINE
4host-manager.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
4host-manager.org.apache.juli.FileHandler.prefix = host-manager.
java.util.logging.ConsoleHandler.level = FINE
java.util.logging.ConsoleHandler.formatter = java.util.logging.SimpleFormatter

# Facility specific properties.
# Provides extra control for each logger.
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].level = INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].handlers = 2localhost.org.apache.juli.FileHandler
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/manager].level = INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/manager].handlers = 3manager.org.apache.juli.FileHandler
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/host-manager].level = INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/host-manager].handlers = 4host-manager.org.apache.juli.FileHandler

# For example, set the org.apache.catalina.util.LifecycleBase logger to log
# each component that extends LifecycleBase changing state:
#org.apache.catalina.util.LifecycleBase.level = FINE

# To see debug messages in TldLocationsCache, uncomment the following line:
#org.apache.jasper.compiler.TldLocationsCache.level = FINE

After upgrading to 4.3 today the files defined aren't being logged to. I know things have changed for logging w/ 4.3, but how can I get it set up like it was before?
Re: Fuzzy searching documents over multiple fields using Solr
I didn't mention it but I'd like individual fields to contribute to the overall score on a continuum instead of 1 (match) and 0 (no match), which will lead to more fine-grained scoring. A contrived example: all other things equal, a 40-inch TV should score higher than a 38-inch TV when searching for a 42-inch TV, based on some distance modeling on the 'size' field (e.g. score(42,40) = 0.6 and score(42,38) = 0.4). Other qualitative fields may be modeled in the same way (e.g. restaurants with field 'price' with values 'budget', 'mid-range', 'expensive', ...). Any way to incorporate this?
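A hedged sketch of that continuum idea, using the same shape as Solr's recip(x,m,a,b) = a/(m*x+b) function query. The constants here are arbitrary; only the ordering (closer sizes score higher) matters for the example in the thread.

```python
# Map |target - actual| to a score in (0, 1]: identical sizes score a/b,
# and the score decays smoothly as the distance grows.
def size_score(target: float, actual: float, a: float = 10.0, b: float = 10.0) -> float:
    return a / (abs(target - actual) + b)

# Searching for a 42" TV: a 40" set outranks a 38" set, which outranks a 30".
assert size_score(42, 40) > size_score(42, 38) > size_score(42, 30)
print(round(size_score(42, 40), 3), round(size_score(42, 38), 3))  # 0.833 0.714
```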
Re: Use case for storing positions and offsets in index?
Thanks Jack & Jason - Thanks -K'Rider
Grouping search results by field returning all search results for a given query
Hi, I'm using Solr to maintain an index of items that belong to different companies. I want the search results to be returned in a way that is fair to all companies, thus I wish to group the results such that each company has 1 item in each group, and the groups of results should be returned sorted by score. Example:
--
20 companies, first 100 results
results 1-20 - (company1 highest-score item, company2 highest-score item, etc.)
results 21-40 - (company1 second-highest-score item, company2 second-highest-score item, etc.)
...
--
I'm trying to use the field collapsing feature but I have only been able to create the first group of results by using group.limit=1, group.field=companyid. If I raise the group.limit value, I would be violating the 'fairness rule' because more than one result per company would be returned in the first group of results. Can I achieve the desired search result using Solr, or do I have to look at other options? Thank you, Luis Guerrero
Re: Fuzzy searching documents over multiple fields using Solr
You can use function queries to boost documents as well. Sorry, but it can get messy to figure out. See: http://wiki.apache.org/solr/FunctionQuery See also the edismax bf parameter: http://wiki.apache.org/solr/ExtendedDisMax#bf_.28Boost_Function.2C_additive.29 -- Jack Krupansky
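A hedged sketch of wiring a distance-based boost into edismax via the bf parameter mentioned above. recip, abs, and sub are standard Solr function queries; the "size" field and the constants (1, 10, 10) are assumptions for illustration only.

```python
# Build an edismax request whose additive boost function peaks when
# size == 42 and decays as the field value moves away from it:
# recip(x,m,a,b) = a/(m*x+b), with x = abs(sub(size,42)).
from urllib.parse import urlencode

params = {
    "defType": "edismax",
    "q": "category:tv",
    "bf": "recip(abs(sub(size,42)),1,10,10)",  # additive boost, max at size=42
}
print(urlencode(params))
```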
Re: Grouping search results by field returning all search results for a given query
Luis, I am presuming you do not have an overarching grouping value here…and simply wish to show a standard search result that shows 1 item per company. You should be able to accomplish your second page of desired items (the second item from each of your 20 represented companies) by using the group.offset parameter. This will shift the position in the returned array of documents to the value provided. Thus: group.limit=1&group.field=companyid&group.offset=1 …would return the second item in each companyid group matching your current query. Jason
RE: More Like This and Caching
I'm not the expert here, but perhaps what you're noticing is actually the OS's disk cache. The actual solr index isn't cached by solr, but as you read the blocks off disk the OS disk cache probably did cache those blocks for you. On the 2nd run the index blocks were read out of memory. There was a very extensive discussion on this list not long back titled: Re: SolrCloud loadbalancing, replication, and failover look that thread up and you'll get a lot of in-depth on the topic. David -Original Message- From: Giammarco Schisani [mailto:giamma...@schisani.com] Sent: Thursday, May 09, 2013 2:59 PM To: solr-user@lucene.apache.org Subject: More Like This and Caching Hi all, Could anybody explain which Solr cache (e.g. queryResultCache, documentCache, fieldCache, etc.) can be used by the More Like This handler? One of my colleagues had previously suggested that the More Like This handler does not take advantage of any of the Solr caches. However, if I issue two identical MLT requests to the same Solr instance, the second request will execute much faster than the first request (for example, the first request will execute in 200ms and the second request will execute in 20ms). This makes me believe that at least one of the Solr caches is being used by the More Like This handler. I think the documentCache is the cache that is most likely being used, but would you be able to confirm? As information, I am currently using Solr version 3.6.1. Kind regards, Giammarco Schisani
Re: 4.3 logging setup
From: http://lucene.apache.org/solr/4_3_0/changes/Changes.html#4.3.0.upgrading_from_solr_4.2.0 Slf4j/logging jars are no longer included in the Solr webapp. All logging jars are now in example/lib/ext. Changing logging impls is now as easy as updating the jars in this folder with those necessary for the logging impl you would like. If you are using another webapp container, these jars will need to go in the corresponding location for that container. In conjunction, the dist-excl-slf4j and dist-war-excl-slf4j build targets have been removed since they are redundant. See the Slf4j documentation, SOLR-3706, and SOLR-4651 for more details. It should just require you provide your preferred logging jars within an appropriate classpath.
Re: More Like This and Caching
Purely from empirical observation, both the documentCache and queryResultCache are being populated and reused in reloads of a simple MLT search. You can see in the cache inserts how much extra-curricular activity is happening to populate the MLT data by how many inserts and lookups occur on the first load. (Lifted right out of the MLT wiki http://wiki.apache.org/solr/MoreLikeThis )
http://localhost:8983/solr/select?q=apache&mlt=true&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&fl=id,score
There is no activity in the filterCache, fieldCache, or fieldValueCache - and that makes plenty of sense.
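An illustration only (not Solr code) of the effect being observed in the thread: a repeated identical request is an order of magnitude faster because the second lookup is served from a cache rather than recomputed from disk.

```python
# Simulate a document cache: the first fetch "hits disk", the second
# identical fetch is answered from the cache without touching disk at all.
from functools import lru_cache

disk_reads = []

@lru_cache(maxsize=128)
def fetch_doc(doc_id: int) -> str:
    disk_reads.append(doc_id)  # stands in for an index/disk read
    return f"doc-{doc_id}"

fetch_doc(7)
fetch_doc(7)               # cache hit: no second "disk read"
print(len(disk_reads))     # 1
```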
Re: 4.3 logging setup
Thanks for responding. My issue is I've never changed anything w/ logging; I have always used the built-in Juli. I've never messed w/ any jar files, just had to edit the logging.properties file. I don't know where I would get the jars for Juli or where to put them, if that is what is needed. I had read what you posted before, I just can't make any sense of it. Thanks
RE: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
I have a similar problem. With 5 shards, querying 500K rows fails, but 400K is fine. Querying individual shards for 1.5 million rows works. All Solr instances are v4.2.1 and running on separate Ubuntu VMs. It is not random; it can always be reproduced by adding rows=50 to a query where numFound is 500K. Is this a configuration issue, where some setting can be increased?
Re: 4.3 logging setup
If you nab the jars in example/lib/ext and place them within the appropriate folder in Tomcat (and this will somewhat depend on which version of Tomcat you are using…let's presume tomcat/lib as a brute-force approach) you should be back in business.
Re: 4.3 logging setup
Hi, First of all, to set up logging using Log4J (which is really better than JULI), copy all the jars from Jetty's lib/ext over to Tomcat's lib folder; see instructions here: http://wiki.apache.org/solr/SolrLogging#Solr_4.3_and_above. You can place your log4j.properties in tomcat/lib as well so it will be read automatically. Now when you start your Tomcat, you will find a file tomcat/logs/solr.log in a nicer format than before, with one log entry per line instead of two, and automatic log file rotation and cleaning. However, if you would like to switch to Java Util logging, do the following:
1. Download slf4j version 1.6.6 (since that's what we use): http://www.slf4j.org/dist/slf4j-1.6.6.zip
2. Unpack, and pull out the file slf4j-jdk14-1.6.6.jar
3. Remove tomcat/lib/slf4j-log4j12-1.6.6.jar and copy slf4j-jdk14-1.6.6.jar to tomcat/lib instead
4. Use your old logging.properties (either place it on the classpath or point to it with a startup opt)
-- Jan Høydahl, search solution architect, Cominvent AS - www.cominvent.com
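For reference, a minimal log4j.properties along these lines might look like the sketch below. The file location, pattern, and rotation settings are assumptions to adapt, not shipped defaults:

```properties
# Hypothetical minimal config: route everything at INFO and above into a
# rotating tomcat/logs/solr.log, one entry per line.
log4j.rootLogger=INFO, file
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File=${catalina.base}/logs/solr.log
log4j.appender.file.MaxFileSize=10MB
log4j.appender.file.MaxBackupIndex=9
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss.SSS} %-5p (%t) [%c{1}] %m%n
```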
RE: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
Adding the original message. Thank you, Sergiu
From: Ahmet Arslan iori...@yahoo.com Subject: Invalid version (expected 2, but 60) or the data in not in 'javabin' format Date: Mon, 21 Jan 2013 22:35:10 GMT
Hi, I was hitting the following exception when doing distributed search. I am faceting on an int field named contentID. For some queries it was giving this error. For some queries it just works fine.
localhost:8080/solr/kanu/select/?shards=localhost:8080/solr/rega,localhost:8080/solr/kanu&indent=true&q=karar&start=0&rows=15&hl=false&wt=xml&facet=true&facet.limit=-1&facet.sort=false&json.nl=arrarr&fq=isXml:false&mm=100%&facet.field=contentID&f.contentID.facet.mincount=2
The same search URL works fine for the cores (kanu and rega) individually. Plus, if I use the rega core as the base search URL it works too, e.g. localhost:8080/solr/rega/select/?shards=localhost:8080... I see that the rega core has lots of unique values for the contentID field. So my conclusion is: if a shard response is too big, this happens. This is a bad usage of faceting and I will remove faceting on that field since it was added accidentally. I still want to share the stack traces since the error message is somewhat misleading.
Jan 21, 2013 10:36:53 PM org.apache.solr.common.SolrException log
SEVERE: null:org.apache.solr.common.SolrException: java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:300)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1701)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:109)
at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:41)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:385)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:182)
at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:166)
at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:133)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
... 1 more
When I add shards.tolerant=true exception becomes: Jan 21, 2013 10:51:51
Re: 4.3 logging setup
I had already copied those jars over and gotten the app to start (it wouldn't without them). I was able to configure slf4j/log4j logging using the log4j.properties in the /lib folder to start logging, but I don't want to switch. I have alerts set on the wording that the JULI logging puts out, but everything I've tried to get it to work has failed. I have older indexes (4.2 and under) running on the server that are still able to log correctly; it is just 4.3. I am obviously missing something. Thanks
Re: 4.3 logging setup
I've been looking into this a little bit. Tomcat's juli is an Apache reimplementation of java.util.logging. Solr uses SLF4J, but before 4.3, Solr's slf4j was bound to java.util.logging ... which I would bet was being intercepted by Tomcat and sent through the juli config. With 4.3, SLF4J is bound to log4j by default. If you stick with this binding, then you need to configure log4j instead of juli. Richard, you could go back to java.util.logging (the way earlier versions had it) with this procedure, and this will probably restore the ability to configure logging with juli.
- Delete the following jars from Solr's example lib/ext:
-- jul-to-slf4j-1.6.6.jar
-- log4j-1.2.16.jar
-- slf4j-log4j12-1.6.6.jar
- Download slf4j version 1.6.6 from their website.
- Copy the following jars from the download into lib/ext:
-- log4j-over-slf4j-1.6.6.jar
-- slf4j-jdk14-1.6.6.jar
- Copy all jars in lib/ext to tomcat's lib directory.
http://www.slf4j.org/dist/ http://www.slf4j.org
Alternatively, you could copy the jars from lib/ext to a directory in your classpath, or add Solr's lib/ext to your classpath. If you want to upgrade to the newest slf4j, you can, you'll just have to use the new version for all slf4j jars.
Please let me know whether this worked for you so we can get a proper procedure up on the wiki. Thanks, Shawn
Re: Grouping search results by field returning all search results for a given query
Thank you for the prompt reply, Jason. The group.offset parameter is working for me; now I can iterate through all items for each company. The problem I'm having right now is pagination. Is there a way this can be implemented out of the box with Solr? Before, I was using group.main=true for easy pagination of results, but it seems like I'll have to ditch that and use the standard grouping format returned by Solr for the group.offset parameter to be useful. Since all groups don't have the same number of items, I'll have to carefully calculate the results that should be returned for each page of 20 items and probably make several Solr calls per page rendered. On Thu, May 9, 2013 at 1:07 PM, Jason Hellman jhell...@innoventsolutions.com wrote: Luis, I am presuming you do not have an overarching grouping value here…and simply wish to show a standard search result that shows 1 item per company. You should be able to accomplish your second page of desired items (the second item from each of your 20 represented companies) by using the group.offset parameter. This will shift the position in the returned array of documents to the value provided. Thus: group.limit=1&group.field=companyid&group.offset=1 …would return the second item in each companyid group matching your current query. Jason On May 9, 2013, at 10:30 AM, Luis Carlos Guerrero Covo lcguerreroc...@gmail.com wrote: Hi, I'm using solr to maintain an index of items that belong to different companies. I want the search results to be returned in a way that is fair to all companies, thus I wish to group the results such that each company has 1 item in each group, and the groups of results should be returned sorted by score. example: -- 20 companies first 100 results 1-20 results - (company1 highest score item, company2 highest score item, etc..) 20-40 results - (company1 second highest score item, company 2 second highest score item, etc..) ... 
-- I'm trying to use the field collapsing feature but I have only been able to create the first group of results by using group.limit=1&group.field=companyid. If I raise the group.limit value, I would be violating the 'fairness rule' because more than one result of a company would be returned in the first group of results. Can I achieve the desired search result using SOLR, or do I have to look at other options? thank you, Luis Guerrero -- Luis Carlos Guerrero Covo M.S. Computer Engineering (57) 3183542047
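As a minimal sketch of the request discussed in this thread (the parameter values come from the messages above; the `q` value and the idea of assembling the query string in Python are illustrative, not something the posters used), the grouped request can be built like this:

```python
from urllib.parse import urlencode

# One item per company, with group.offset selecting which item within each
# group: 0 = highest-scoring item per company, 1 = second-highest, and so on.
params = urlencode({
    "q": "*:*",               # placeholder query
    "group": "true",
    "group.field": "companyid",
    "group.limit": "1",
    "group.offset": "1",      # second item from each company
})
print(params)
```

Appending this string to the collection's /select URL would return the second-ranked item from each companyid group.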
Re: Grouping search results by field returning all search results for a given query
I would think pagination is resolved by obtaining the numFound value for your returned groups. If you have numFound=6 then each page of 20 items (one item per company) would imply a total of 6 pages. You'll have to arbitrate for the variance here…but it would seem to me you need as many pages as the highest value in the numFound field for all groups. This shouldn't require requerying but will definitely require a little intelligence on the web app to handle the groups that are less than the largest size. Hope that's useful! On May 9, 2013, at 12:23 PM, Luis Carlos Guerrero Covo lcguerreroc...@gmail.com wrote: Thank you for the prompt reply jason. The group.offset parameter is working for me, now I can iterate through all items for each company. The problem I'm having right now is pagination. Is there a way how this can be implemented out of the box with solr? Before I was using the group.main=true for easy pagination of results, but it seems like I'll have to ditch that and use the standard grouping format returned by solr for the group.offset parameter to be useful. Since all groups don't have the same number of items, I'll have to carefully calculate the results that should be returned for each page of 20 items and probably make several solr calls per page rendered. On Thu, May 9, 2013 at 1:07 PM, Jason Hellman jhell...@innoventsolutions.com wrote: Luis, I am presuming you do not have an overarching grouping value here…and simply wish to show a standard search result that shows 1 item per company. You should be able to accomplish your second page of desired items (the second item from each of your 20 represented companies) by using the group.offset parameter. This will shift the position in the returned array of documents to the value provided. Thus: group.limit=1&group.field=companyid&group.offset=1 …would return the second item in each companyid group matching your current query. 
Jason On May 9, 2013, at 10:30 AM, Luis Carlos Guerrero Covo lcguerreroc...@gmail.com wrote: Hi, I'm using solr to maintain an index of items that belong to different companies. I want the search results to be returned in a way that is fair to all companies, thus I wish to group the results such that each company has 1 item in each group, and the groups of results should be returned sorted by score. example: -- 20 companies first 100 results 1-20 results - (company1 highest score item, company2 highest score item, etc..) 20-40 results - (company1 second highest score item, company 2 second highest score item, etc..) ... -- I'm trying to use the field collapsing feature but I have only been able to create the first group of results by using group.limit=1&group.field=companyid. If I raise the group.limit value, I would be violating the 'fairness rule' because more than one result of a company would be returned in the first group of results. Can I achieve the desired search result using SOLR, or do I have to look at other options? thank you, Luis Guerrero -- Luis Carlos Guerrero Covo M.S. Computer Engineering (57) 3183542047
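The page-count bookkeeping Jason describes can be sketched like this (the per-group numFound values below are made up for illustration; in practice they would be read from the grouped response):

```python
# Pages needed = the largest numFound across all groups, since each "page"
# shows one item per company via group.offset.
group_num_found = {"company1": 6, "company2": 3, "company3": 1}

total_pages = max(group_num_found.values())

def items_on_page(page):
    """Companies that still have an item at this offset (page = group.offset, zero-based)."""
    return [c for c, n in group_num_found.items() if n > page]

print(total_pages)
print(items_on_page(2))
```

With these counts, six pages are needed, and by page 2 only company1 and company2 still contribute an item; the web app has to tolerate the shrinking page, as Jason notes.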
Re: 4.3 logging setup
These are the files I have in my /lib folder: slf4j-api-1.6.6, log4j-1.2.16, jul-to-slf4j-1.6.6, jcl-over-slf4j-1.6.6, slf4j-jdk14-1.6.6, log4j-over-slf4j-1.6.6. Currently everything seems to be logging like before. After I followed the instructions in Jan's post, replacing slf4j-log4j12-1.6.6.jar with slf4j-jdk14-1.6.6.jar, it all started working. Shawn, I then removed everything as you instructed and put in just log4j-over-slf4j-1.6.6.jar and slf4j-jdk14-1.6.6.jar, but the index showed an error and wouldn't start. So that is why I have those 6 files in there now; I'm not sure whether the log4j-over-slf4j-1.6.6.jar file is needed or not. Let me know if you need me to test anything else. Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/4-3-logging-setup-tp4061875p4061922.html Sent from the Solr - User mailing list archive at Nabble.com.
Is a Distributed Search Cached Only by the Node That Runs the Query?
I have Solr 4.2.1 and run it as SolrCloud. When I do a search on SolrCloud like this: ip_of_node_1:8983/solr/select?q=*:*&rows=1 and then check the admin page, I see this: I have 5 GB Java heap; 616.32 MB is dark gray and 3.13 GB is gray. Before my search it was something like 150 MB dark gray and 500 MB gray. I understand that when I do a search like that, fields are cached. However, when I look at the other SolrCloud nodes' admin pages there are no differences. Why is that query cached only by the node that I run it on?
Re: 4.3 logging setup
On 5/9/2013 1:41 PM, richardg wrote: These are the files I have in my /lib folder: slf4j-api-1.6.6 log4j-1.2.16 jul-to-slf4j-1.6.6 jcl-over-slf4j-1.6.6 slf4j-jdk14-1.6.6 log4j-over-slf4j-1.6.6 Currently everything seems to be logging like before. After I followed the instructions in Jan's post replacing slf4j-log4j12-1.6.6.jar with this slf4j-jdk14-1.6.6.jar it all started working. Shawn I then removed everything as you instructed and put in just log4j-over-slf4j-1.6.6.jar and slf4j-jdk14-1.6.6.jar but the index showed an error and wouldn't start. So that is why I have those 6 files in there now, I'm not sure if log4j-over-slf4j-1.6.6.jar this file is needed or not. Let me know if you need me to test anything else. You're on the right track. Your list just has two files that shouldn't be there - log4j-1.2.16 and jul-to-slf4j-1.6.6. They are probably not causing any real problems, but they might in the future. Remove those and you will have the exact list I was looking for. If that doesn't work, use a paste website (pastie.org and others) to send a log showing the errors you get. Thanks, Shawn
Is the CoreAdmin RENAME method atomic?
We need to implement a locking mechanism for a full-reindexing SOLR server pool. We could use a database or ZooKeeper as our locking mechanism, but that's a lot of work. Could Solr do it? I noticed the core admin RENAME function (http://wiki.apache.org/solr/CoreAdmin#RENAME). Is this a synchronous atomic operation? What I'm thinking is we create a Solr core named 'lock', and any process that wants to obtain a Solr server from the pool tries to rename the 'lock' core to, say, 'lock.someuniqueid'. If it fails, then it tries another server in the pool or waits a bit. If it succeeds, it reindexes its data and then renames 'lock.someuniqueid' back to 'lock' to return the server to the pool. -- View this message in context: http://lucene.472066.n3.nabble.com/Is-the-CoreAdmin-RENAME-method-atomic-tp4061944.html Sent from the Solr - User mailing list archive at Nabble.com.
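A sketch of the proposed lock flow, assuming CoreAdmin RENAME behaves atomically (which is exactly what this message is asking, and is not confirmed here). Only the URL construction is shown; the host and the 'worker42' id are hypothetical:

```python
from urllib.parse import urlencode

def rename_url(solr_base, core, new_name):
    """Build a CoreAdmin RENAME request URL (see wiki.apache.org/solr/CoreAdmin#RENAME)."""
    params = urlencode({"action": "RENAME", "core": core, "other": new_name, "wt": "json"})
    return f"{solr_base}/admin/cores?{params}"

def acquire_url(solr_base, unique_id):
    # Try to grab the pool slot by renaming the 'lock' core to 'lock.<id>'.
    return rename_url(solr_base, "lock", f"lock.{unique_id}")

def release_url(solr_base, unique_id):
    # Return the server to the pool by renaming 'lock.<id>' back to 'lock'.
    return rename_url(solr_base, f"lock.{unique_id}", "lock")

print(acquire_url("http://localhost:8983/solr", "worker42"))
```

A process would issue the acquire URL, treat an error response as "lock held, try another server", and issue the release URL after reindexing.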
Re: Frequent OOM - (Unknown source in logs).
We ended up using a Solr 4.0 (now 4.2) without the cloud option. And it seems to be holding good. -- View this message in context: http://lucene.472066.n3.nabble.com/Frequent-OOM-Unknown-source-in-logs-tp4029361p4061945.html Sent from the Solr - User mailing list archive at Nabble.com.
SolrCloud Sorting Results By Relevance
When I make a search on Solr 4.2.1 running as SolrCloud I get: <result name="response" numFound="18720" start="0" maxScore="1.2672108"> The first result has this boost: <float name="boost">1.3693064</float> The second one has: <float name="boost">1.7501166</float> and the third one: <float name="boost">1.0387472</float> Here is the default schema for Nutch: http://svn.apache.org/viewvc/nutch/tags/release-2.1/conf/schema-solr4.xml?revision=1388536&view=markup Am I missing something, or are results already sorted by relevance by Solr?
Apache Whirr for SolrCloud with external Zookeeper
Hi Folks; I have tested Solr 4.2.1 as SolrCloud and I plan to use 4.3.1, when it is ready, in my pre-production environment. I want to learn whether anybody uses Apache Whirr for SolrCloud with an external Zookeeper ensemble? What are folks using for this kind of purpose?
Status of EDisMax
Hi, what is the current status of the Extended DisMax Query Parser? The release notes for Solr 3.1 say it was experimental at that time (two years back). The current wiki page for EDisMax does not contain any such statement. We recently ran into the issue described in SOLR-2649 (using q.op=AND) which I think is a very fundamental defect making it unusable at least in our case. Thanks, André
Negative Boosting at Recent Versions of Solr?
I know that whilst Lucene allows negative boosts, Solr does not. However, did that change with newer versions of Solr (I use Solr 4.2.1), or is it still the same?
Re: Apache Whirr for SolrCloud with external Zookeeper
I've never encountered anyone using Whirr to launch Solr even though that's possible - http://issues.apache.org/jira/browse/WHIRR-465 Otis -- Solr ElasticSearch Support http://sematext.com/ On Thu, May 9, 2013 at 5:28 PM, Furkan KAMACI furkankam...@gmail.com wrote: Hi Folks; I have tested Solr 4.2.1 as SolrCloud and I think to use 4.3.1 when it is ready at my pre-production environment. I want to learn that does anybody uses Apache Whirr for SolrCloud with external Zookeeper ensemble? What folks are using for such kind of purposes?
Re: Apache Whirr for SolrCloud with external Zookeeper
I saw that ticket and wanted to ask the mailing list. I want to give it a try and send feedback to the list. What do folks use for this kind of purpose? 2013/5/10 Otis Gospodnetic otis.gospodne...@gmail.com I've never encountered anyone using Whirr to launch Solr even though that's possible - http://issues.apache.org/jira/browse/WHIRR-465 Otis -- Solr ElasticSearch Support http://sematext.com/ On Thu, May 9, 2013 at 5:28 PM, Furkan KAMACI furkankam...@gmail.com wrote: Hi Folks; I have tested Solr 4.2.1 as SolrCloud and I think to use 4.3.1 when it is ready at my pre-production environment. I want to learn that does anybody uses Apache Whirr for SolrCloud with external Zookeeper ensemble? What folks are using for such kind of purposes?
Re: Negative Boosting at Recent Versions of Solr?
Solr does support both additive and multiplicative boosts. Although Solr doesn't support negative multiplicative boosts on query terms, it does support fractional multiplicative boosts (e.g., 0.25), which do allow you to de-boost a term. The boosts for individual query terms and for the edismax qf parameter cannot be negative, but can be fractional. The edismax bf parameter gives a function query that provides an additive boost, which could be negative. The edismax boost parameter gives a function query that provides a multiplicative boost - which could be negative - so it's not absolutely true that Solr doesn't support negative boosts. -- Jack Krupansky -Original Message- From: Furkan KAMACI Sent: Thursday, May 09, 2013 6:08 PM To: solr-user@lucene.apache.org Subject: Negative Boosting at Recent Versions of Solr? I know that whilst Lucene allows negative boosts, Solr does not. However did it change with newer versions of Solr (I use Solr 4.2.1) or still same?
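A hedged illustration of the parameters Jack lists; the field names and values below ("title", "body", "age_in_days", the phrase being de-boosted) are made up for the example, not taken from any real schema:

```python
from urllib.parse import urlencode

# Fractional (de-boosting) term boost, fractional qf boosts, and an additive
# bf function query that can evaluate to a negative number.
params = urlencode({
    "defType": "edismax",
    "q": 'solr "old docs"^0.25',    # fractional term boost de-emphasizes the phrase
    "qf": "title^2.0 body^0.5",     # qf boosts: non-negative, fractions allowed
    "bf": "sub(0,age_in_days)",     # additive boost; can be negative
})
print(params)
```

The `boost` parameter (multiplicative function query) would be passed the same way when a true negative multiplicative effect is wanted.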
Re: Index compatibility between Solr releases.
Solr strives to stay backwards-compatible for one major revision, so 4.x should be able to work with 3.x indexes. One caution though - well, actually two. 1) If you have a master/slave setup, upgrade the _slaves_ first. If you upgrade a master first and it merges segments, then the slaves won't be able to read the 4.x format. 2) Make backups first <G>... BTW, when new segments are written, they should be written in 4.x format. So I've heard of people doing the migration, then forcing an optimize just to bring all the segments up to the 4.x format. Best Erick On Tue, May 7, 2013 at 3:28 PM, Skand Gupta skandsgu...@gmail.com wrote: We have a fairly large (in the order of 10s of TB) indices built using Solr 3.5. We are considering migrating to Solr 4.3 and was wondering what the policy is on maintaining backward compatibility of the indices? Will 4.3 work with my 3.5 indexes? Because of the large data size, I would ideally like to move new data to 4.3 and gradually re-index all the 3.5 indices. Thanks, - Skand.
Re: Index corrupted detection from http get command.
There's no way to do this that I know of. There's the checkindex tool, but it's fairly expensive resource-wise and there's no HTTP command to do it. Best Erick On Tue, May 7, 2013 at 8:04 PM, Michel Dion diom...@gmail.com wrote: Hello, I'm looking for a way to detect Solr index corruption using an HTTP GET command. I've looked at the /admin/ping and /admin/luke request handlers but am not sure if their status provides guarantees that everything is all right. The idea is to be able to tell a load balancer to put a given solr instance out of rotation if its index is corrupted. Thanks Michel
Re: transientCacheSize doesn't seem to have any effect, except on startup
I'm slammed with stuff and have to leave for vacation Saturday morning so I'll be going silent for a while, sorry Best Erick On Wed, May 8, 2013 at 11:27 AM, didier deshommes dfdes...@gmail.com wrote: Any idea on this? I still cannot get the combination of transient cores and transientCacheSize to work as I think it should: give me the ability to create a large number of cores and automatically load and unload them for me based on a limit that I set. If anyone else is using this feature and it is working for you, let me know how you got it working! On Fri, May 3, 2013 at 2:11 PM, didier deshommes dfdes...@gmail.com wrote: On Fri, May 3, 2013 at 11:18 AM, Erick Erickson erickerick...@gmail.comwrote: The cores aren't loaded (or at least shouldn't be) for getting the status. The _names_ of the cores should be returned, but those are (supposed) to be retrieved from a list rather than loaded cores. So are you sure that's not what you are seeing? How are you determining whether the cores are actually loaded or not? I'm looking at the output of: $ curl "http://localhost:8983/solr/admin/cores?wt=json&action=status" Cores that are loaded have a startTime and upTime value. Cores that are unloaded don't appear in the output at all. For example, I created 3 transient cores with transientCacheSize=2. When I asked for a list of all cores, all 3 cores were returned. I explicitly unloaded 1 core and got back 2 cores when I asked for the list again. It would be nice if cores had an isTransient and an isCurrentlyLoaded value so that one could see exactly which cores are loaded. That said, it's perfectly possible that the status command is doing something we didn't anticipate, but I took a quick look at the code (got to rush to a plane) and CoreAdminHandler _appears_ to be just returning whatever info it can about an unloaded core for status. I _think_ you'll get more info if the core has ever been loaded though, even though if it's been removed from the transient cache. 
Ditto for the create action. So let's figure out whether you're really seeing loaded cores or not, and then raise a JIRA if so... Thanks for reporting! Erick On Thu, May 2, 2013 at 1:27 PM, didier deshommes dfdes...@gmail.com wrote: Hi, I've been very interested in the transient core feature of solr to manage a large number of cores. I'm especially interested in this use case, that the wiki lists at http://wiki.apache.org/solr/LotsOfCores (looks to be down now): loadOnStartup=false transient=true: This is really the use-case. There are a large number of cores in your system that are short-duration use. You want Solr to load them as necessary, but unload them when the cache gets full on an LRU basis. I'm creating 10 transient cores via core admin like so: $ curl "http://localhost:8983/solr/admin/cores?wt=json&action=CREATE&name=new_core2&instanceDir=collection1/&dataDir=new_core2&transient=true&loadOnStartup=false" and have transientCacheSize=2 in my solr.xml file, which I take means I should have at most 2 transient cores loaded at any time. The problem is that these cores are still loaded when I ask solr to list cores: $ curl "http://localhost:8983/solr/admin/cores?wt=json&action=status" From the explanation in the wiki, it looks like solr would manage loading and unloading transient cores for me without having to worry about them, but this is not what's happening. The situation is different when I restart solr; it does the right thing by loading the maximum cores set by transientCacheSize. When I add more cores, the old behavior happens again, where all created transient cores are loaded in solr. I'm using the development branch lucene_solr_4_3 to run my example. I can open a jira if need be.
Re: Apache Whirr for SolrCloud with external Zookeeper
Great, let us know how it works for you. Blog post? Otis Solr ElasticSearch Support http://sematext.com/ On May 9, 2013 6:30 PM, Furkan KAMACI furkankam...@gmail.com wrote: I saw that ticket and wanted to ask it to mail list. I want to give it a try and feedback to mail list. What folks use for such kind of purposes? 2013/5/10 Otis Gospodnetic otis.gospodne...@gmail.com I've never encountered anyone using Whirr to launch Solr even though that's possible - http://issues.apache.org/jira/browse/WHIRR-465 Otis -- Solr ElasticSearch Support http://sematext.com/ On Thu, May 9, 2013 at 5:28 PM, Furkan KAMACI furkankam...@gmail.com wrote: Hi Folks; I have tested Solr 4.2.1 as SolrCloud and I think to use 4.3.1 when it is ready at my pre-production environment. I want to learn that does anybody uses Apache Whirr for SolrCloud with external Zookeeper ensemble? What folks are using for such kind of purposes?
Re: SolrCloud: IOException occured when talking to server at
On 5/9/2013 7:31 AM, heaven wrote: Can confirm this leads to data loss. I have 1217427 records in the database and only 1217216 indexed. Which does mean that Solr gave a successful response and then did not add some documents to the index. Seems like SolrCloud is not a production-ready solution; it would be good if there was a warning in the Solr wiki about that.

You've got some kind of underlying problem here. Here are my guesses about what that might be:
- An improperly configured Linux firewall and/or SELinux is enabled.
- The hardware is already overtaxed by other software.
- Your zkClientTimeout value is extremely small.
- Your GC pauses are large.
- You're running into an open file limit.

Here's what you could do to resolve each of these:
- Disable the firewall and selinux, reboot.
- Stop other software.
- The example zkClientTimeout is 15 seconds. Try 30-60.
- See http://wiki.apache.org/solr/SolrPerformanceProblems for some GC ideas.
- Increase the file and process limits. For most versions of Linux, in /etc/security/limits.conf:

solr hard nproc 6144
solr soft nproc 4096
solr hard nofile 65536
solr soft nofile 49152

These numbers should be sufficient for deployments considerably larger than yours. SolrCloud is not only production ready, it's being used by many, many people for extremely large indexes. My own SolrCloud deployment is fairly small with only 1.5 million docs, but it's extremely stable. I also have a somewhat large (77 million docs) non-cloud deployment. Are you running 4.2.1? I feel fairly certain based on your screenshots that you are not running 4.3, but I can't tell which version you are running. There are some bugs in the 4.3 release; a 4.3.1 will be released soon. If you had planned to upgrade, you should wait for 4.3.1 or 4.4. NB, and something you might already know: when talking about production-ready, you can't run everything on the same server. You need at least three - two of them can run Solr and zookeeper, and the third runs zookeeper. 
This single-server setup is fine for a proof-of-concept. Thanks, Shawn
Re: SolrCloud Sorting Results By Relevance
Hits are sorted by relevance score by default. You are listing boost. Otis Solr ElasticSearch Support http://sematext.com/ On May 9, 2013 5:16 PM, Furkan KAMACI furkankam...@gmail.com wrote: When I make a search at Solr 4.2.1 that runs as SolrCloud I get: <result name="response" numFound="18720" start="0" maxScore="1.2672108"> First one has that boost: <float name="boost">1.3693064</float> Second one has that: <float name="boost">1.7501166</float> and third one: <float name="boost">1.0387472</float> Here is default schema for Nutch: http://svn.apache.org/viewvc/nutch/tags/release-2.1/conf/schema-solr4.xml?revision=1388536&view=markup Do I miss something or result are already sorted by relevance by Solr?
Re: Status of EDisMax
Didn't check that issue, but edismax is not experimental any more - most solr users use it. Otis Solr ElasticSearch Support http://sematext.com/ On May 9, 2013 5:36 PM, André Widhani andre.widh...@digicol.de wrote: Hi, what is the current status of the Extended DisMax Query Parser? The release notes for Solr 3.1 say it was experimental at that time (two years back). The current wiki page for EDisMax does not contain any such statement. We recently ran into the issue described in SOLR-2649 (using q.op=AND) which I think is a very fundamental defect making it unusable at least in our case. Thanks, André
Re: Is a Distributed Search Cached Only by the Node That Runs the Query?
You are looking at the JVM heap but attributing it to caching only. Not quite right...there are other things in that JVM heap. Otis Solr ElasticSearch Support http://sematext.com/ On May 9, 2013 3:55 PM, Furkan KAMACI furkankam...@gmail.com wrote: I have Solr 4.2.1 and run them as SolrCloud. When I do a search on SolrCloud as like that: ip_of_node_1:8983/solr/select?q=*:*&rows=1 and when I check admin page I see that: I have 5 GB Java Heap. 616.32 MB is dark gray, 3.13 GB is gray. Before my search it was something like: 150 MB dark gray, 500 MB gray. I understand that when I do a search like that, fields are cached. However when I look at other SolrCloud nodes' admin pages there are no differences. Why that query is cached only by the node that I run that query on?
Re: More Like This and Caching
This is correct: the document cache for previously read docs regardless of which query read them, and the query result cache for repeat queries. Plus the OS cache for the actual index files. Otis Solr ElasticSearch Support http://sematext.com/ On May 9, 2013 2:32 PM, Jason Hellman jhell...@innoventsolutions.com wrote: Purely from empirical observation, both the DocumentCache and QueryResultCache are being populated and reused in reloads of a simple MLT search. You can see in the cache inserts how much extra-curricular activity is happening to populate the MLT data by how many inserts and lookups occur on the first load. (lifted right out of the MLT wiki http://wiki.apache.org/solr/MoreLikeThis) http://localhost:8983/solr/select?q=apache&mlt=true&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&fl=id,score There is no activity in the filterCache, fieldCache, or fieldValueCache - and that makes plenty of sense. On May 9, 2013, at 11:12 AM, David Parks davidpark...@yahoo.com wrote: I'm not the expert here, but perhaps what you're noticing is actually the OS's disk cache. The actual solr index isn't cached by solr, but as you read the blocks off disk the OS disk cache probably did cache those blocks for you. On the 2nd run the index blocks were read out of memory. There was a very extensive discussion on this list not long back titled Re: SolrCloud loadbalancing, replication, and failover; look that thread up and you'll get a lot of in-depth on the topic. David -Original Message- From: Giammarco Schisani [mailto:giamma...@schisani.com] Sent: Thursday, May 09, 2013 2:59 PM To: solr-user@lucene.apache.org Subject: More Like This and Caching Hi all, Could anybody explain which Solr cache (e.g. queryResultCache, documentCache, fieldCache, etc.) can be used by the More Like This handler? One of my colleagues had previously suggested that the More Like This handler does not take advantage of any of the Solr caches. 
However, if I issue two identical MLT requests to the same Solr instance, the second request will execute much faster than the first request (for example, the first request will execute in 200ms and the second request will execute in 20ms). This makes me believe that at least one of the Solr caches is being used by the More Like This handler. I think the documentCache is the cache that is most likely being used, but would you be able to confirm? As information, I am currently using Solr version 3.6.1. Kind regards, Giammarco Schisani
Re: Per Shard Replication Factor
Could these just be different collections? Then sharding and replication are independent, and you can reduce the replication factor as the index ages. Otis Solr ElasticSearch Support http://sematext.com/ On May 9, 2013 1:43 AM, Steven Bower smb-apa...@alcyon.net wrote: Is it currently possible to have per-shard replication factor? A bit of background on the use case... If you are hashing content to shards by a known factor (lets say date ranges, 12 shards, 1 per month) it might be the case that most of your search traffic would be directed to one particular shard (eg. the current month shard) and having increased query capacity in that shard would be useful... this could be extended to many use cases such as data hashed by organization, type, etc. Thanks, steve
Re: 4.3 logging setup
I've updated the WIKI: http://wiki.apache.org/solr/SolrLogging#Switching_from_Log4J_logging_back_to_Java-util_logging -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On May 9, 2013, at 9:57 PM, Shawn Heisey s...@elyograg.org wrote: On 5/9/2013 1:41 PM, richardg wrote: These are the files I have in my /lib folder: slf4j-api-1.6.6 log4j-1.2.16 jul-to-slf4j-1.6.6 jcl-over-slf4j-1.6.6 slf4j-jdk14-1.6.6 log4j-over-slf4j-1.6.6 Currently everything seems to be logging like before. After I followed the instructions in Jan's post replacing slf4j-log4j12-1.6.6.jar with this slf4j-jdk14-1.6.6.jar it all started working. Shawn I then removed everything as you instructed and put in just log4j-over-slf4j-1.6.6.jar and slf4j-jdk14-1.6.6.jar but the index showed an error and wouldn't start. So that is why I have those 6 files in there now, I'm not sure if log4j-over-slf4j-1.6.6.jar this file is needed or not. Let me know if you need me to test anything else. You're on the right track. Your list just has two files that shouldn't be there - log4j-1.2.16 and jul-to-slf4j-1.6.6. They are probably not causing any real problems, but they might in the future. Remove those and you will have the exact list I was looking for. If that doesn't work, use a paste website (pastie.org and others) to send a log showing the errors you get. Thanks, Shawn
Re: SOLR Error: Document is missing mandatory uniqueKey field
Here is the stack trace:

DEBUG - 2013-05-09 18:53:06.411; org.apache.solr.update.processor.LogUpdateProcessor; PRE_UPDATE add{,id=(null)} {wt=javabin&version=2}
DEBUG - 2013-05-09 18:53:06.411; org.apache.solr.update.processor.LogUpdateProcessor; PRE_UPDATE FINISH {wt=javabin&version=2}
INFO - 2013-05-09 18:53:06.412; org.apache.solr.update.processor.LogUpdateProcessor; [orderitemsStage] webapp=/solr path=/update params={wt=javabin&version=2} {add=[488653_0_0_141_388 (1434610076088270848), 488653_0_0_141_388 (1434610076090368000), 488653_0_0_141_388 (1434610076091416576), 488653_0_0_141_388 (1434610076091416577), 488653_0_0_141_388 (1434610076092465152), 488653_0_0_141_388 (1434610076093513728), 488653_0_0_141_388 (1434610076094562304), 488653_0_0_141_388 (1434610076094562305), 488653_0_0_141_388 (1434610076095610880), 488653_0_0_141_388 (1434610076096659456), ... (4031 adds)]} 0 2790
ERROR - 2013-05-09 18:53:06.412; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: Document is missing mandatory uniqueKey field: orderItemKey
    at org.apache.solr.update.AddUpdateCommand.getIndexedId(AddUpdateCommand.java:88)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:517)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:396)
    at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
    at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:246)
    at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:173)
    at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:656)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:359)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:936)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1004)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)

-- View this message in context: http://lucene.472066.n3.nabble.com/SOLR-Error-Document-is-missing-mandatory-uniqueKey-field-tp4062177p4062178.html Sent from the Solr - User mailing list archive at Nabble.com.
SOLR Error: Document is missing mandatory uniqueKey field
I repeatedly get this error while adding documents to SOLR using SolrJ: "Document is missing mandatory uniqueKey field: orderItemKey". This field is defined as the uniqueKey in the document schema. I've made sure that I'm passing this field from Java by logging it upfront. As suggested somewhere, I've tried upgrading from 4.0 to 4.3, and also set the field to required=false. Please help me debug or find a resolution to this problem.

--
View this message in context: http://lucene.472066.n3.nabble.com/SOLR-Error-Document-is-missing-mandatory-uniqueKey-field-tp4062177.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: dataimport handler
It does not work anymore in 4.x. ${dih.last_index_time} does work, but the entity version does not.

Bill

On Tue, May 7, 2013 at 4:19 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: Using ${dih.entity_name.last_index_time} should work. Make sure you put it in quotes in your query.

On Tue, May 7, 2013 at 12:07 PM, Eric Myers emy...@nabancard.com wrote: In the data import handler I have multiple entities. Each one generates a date in dataimport.properties, i.e. entityname.last_index_time. How do I reference the specific entity's time in my delta queries?

Thanks
Eric

-- Regards, Shalin Shekhar Mangar.
-- Bill Bell billnb...@gmail.com cell 720-256-8076
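For reference, a delta-import setup that sticks to the global ${dih.last_index_time} property (which the posters agree does interpolate) looks roughly like the sketch below. The entity, table, and column names are illustrative, not taken from the thread; note the single quotes around the property reference, per Shalin's advice.

```xml
<!-- data-config.xml sketch: entity, table, and column names are illustrative -->
<entity name="orders" pk="id"
        query="SELECT id, status FROM orders"
        deltaQuery="SELECT id FROM orders
                    WHERE last_modified &gt; '${dih.last_index_time}'"
        deltaImportQuery="SELECT id, status FROM orders
                          WHERE id = '${dih.delta.id}'"/>
```

With multiple entities sharing one dataimport.properties, the global timestamp is the conservative choice: it is the time of the last import as a whole, so a delta query using it may re-fetch some rows already indexed by another entity's run, but it will not miss any.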
Re: SOLR Error: Document is missing mandatory uniqueKey field
On 5/9/2013 7:44 PM, zaheer.java wrote: I repeatedly get this error while adding documents to SOLR using SolrJ: "Document is missing mandatory uniqueKey field: orderItemKey". This field is defined as the uniqueKey in the document schema. I've made sure that I'm passing this field from Java by logging it upfront. As suggested somewhere, I've tried upgrading from 4.0 to 4.3, and also set the field to required=false.

If you have a uniqueKey defined in your schema, then every document must define that field, or you'll get exactly the error message you're seeing. That's the entire point of a uniqueKey. It is pretty much the same concept as a primary key on a database table.

There is one main difference between a uniqueKey and a DB primary key: a database will prevent you from inserting a record with the same ID as an existing record, whereas Solr uses the key to allow easy reindexing. Sending a document with the same ID as an existing document will cause Solr to delete the old one before inserting the new one.

Certain Solr features, notably distributed search, require a uniqueKey; SolrCloud uses distributed search, so it requires one as well. If you're not using features that require a uniqueKey, and you don't need Solr to delete duplicate documents, then you can remove it from your schema. It's not recommended, but it should work.

Thanks,
Shawn
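For context, the schema wiring being discussed looks roughly like this. The field type and attributes below are typical values, not taken from the thread; only the orderItemKey name comes from the error message.

```xml
<!-- schema.xml sketch: the field goes inside <fields>, the uniqueKey at top level -->
<field name="orderItemKey" type="string" indexed="true" stored="true" required="true"/>

<uniqueKey>orderItemKey</uniqueKey>
```

With this in place, any add request whose document lacks an orderItemKey value is rejected with exactly the exception in the stack trace above, regardless of the required attribute on the field itself.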
Re: SOLR guidance required
On 5/9/2013 9:41 PM, Kamal Palei wrote: I hope there must be some mechanism by which I can associate salary, experience, age etc. with a resume document during indexing. And when I search for resumes I can apply all the filters accordingly, retrieve 100 records, and straight away show those 100 records to the user without doing any MySQL query. Please let me know if this is feasible. If so, kindly give me some pointers on how to do it.

If you define fields for these values in your schema, then you can send filter queries to restrict the search. Solr will filter non-matching documents out and only return the results that match your requirements. Some examples of the filter queries you can use are below. You can add more than one of these; they will be ANDed together.

fq=age:[21 TO 45]
fq=experience:[2 TO *]
fq=salaryReq:[* TO 55000]

If you're using a Solr API (for Java, PHP, etc.) rather than constructing a URL to send directly to Solr, then the API will have a mechanism for adding filters to your query.

One caveat: unless you can write code that will automatically extract this information from a resume and/or application, you will need someone doing data entry to drive the indexing, or you will need prospective employees to fill out a computerized form with their application.

Thanks,
Shawn
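If you go the raw-URL route rather than an API, the request carrying these filters can be assembled with nothing but the JDK. In the sketch below the host, the core name ("resumes"), and the q string are assumptions; the fq values are the ones from the reply above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class ResumeQueryUrl {

    // URL-encode one parameter value (URLEncoder uses '+' for spaces).
    static String enc(String s) throws UnsupportedEncodingException {
        return URLEncoder.encode(s, "UTF-8");
    }

    // Build the full select URL; each fq is a separate parameter,
    // and Solr ANDs all fq parameters together.
    public static String buildUrl() throws UnsupportedEncodingException {
        StringBuilder url =
            new StringBuilder("http://localhost:8983/solr/resumes/select");
        url.append("?q=").append(enc("java developer"));
        String[] filters = {
            "age:[21 TO 45]",
            "experience:[2 TO *]",
            "salaryReq:[* TO 55000]"
        };
        for (String fq : filters) {
            url.append("&fq=").append(enc(fq));
        }
        url.append("&rows=100");
        return url.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildUrl());
    }
}
```

A client API such as SolrJ does the same thing for you; in 4.x, SolrQuery.addFilterQuery("age:[21 TO 45]") adds an fq parameter equivalent to the hand-built one here.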
RE: Is the CoreAdmin RENAME method atomic?
Find the discussion titled "Indexing off the production servers" from just a week ago in this same forum; there is a significant discussion of this feature that you will probably want to review.

-Original Message-
From: Lan [mailto:dung@gmail.com]
Sent: Friday, May 10, 2013 3:42 AM
To: solr-user@lucene.apache.org
Subject: Is the CoreAdmin RENAME method atomic?

We need to implement a locking mechanism for a full-reindexing SOLR server pool. We could use a database or ZooKeeper as our locking mechanism, but that's a lot of work. Could Solr do it? I noticed the CoreAdmin RENAME function (http://wiki.apache.org/solr/CoreAdmin#RENAME). Is this a synchronous, atomic operation?

What I'm thinking is that we create a Solr core named 'lock', and any process that wants to obtain a Solr server from the pool tries to rename the 'lock' core to, say, 'lock.someuniqueid'. If it fails, it tries another server in the pool or waits a bit. If it succeeds, it reindexes its data and then renames 'lock.someuniqueid' back to 'lock' to return the server to the pool.

--
View this message in context: http://lucene.472066.n3.nabble.com/Is-the-CoreAdmin-RENAME-method-atomic-tp4061944.html
Sent from the Solr - User mailing list archive at Nabble.com.
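For reference, the RENAME call under discussion is a single HTTP request to the CoreAdmin handler; the host and port below are the usual defaults, not taken from the thread.

```
http://localhost:8983/solr/admin/cores?action=RENAME&core=lock&other=lock.someuniqueid
```

Whether two processes racing on this request can both see success is exactly the atomicity question being asked, so the scheme should not be relied on until that is confirmed for the Solr version in use.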