Re: SPLITSHARD not working in SOLR-4.4.0
Sorry, I misunderstood. That NPE can only happen if the uniqueKey is not defined. The code already checks for reader.fields() returning null. On Wed, Oct 16, 2013 at 11:22 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: Just to be clear, you had a required uniqueKey defined in the schema before you indexed any document, is that correct? It is possible to get an NPE on that line if there is an empty segment or if there are documents but no fields! I'm curious to understand how you ended up with an index like that. On Wed, Oct 16, 2013 at 11:01 AM, RadhaJayalakshmi rlakshminaraya...@inautix.co.in wrote: Thanks for the response!! Yes, I have defined a unique key in the schema... Still it is throwing the same error.. Is SPLITSHARD a new feature that is under development in solr 4.4? Has anyone been able to split shards using SPLITSHARD successfully? -- View this message in context: http://lucene.472066.n3.nabble.com/SPLITSHARD-not-working-in-SOLR-4-4-0-tp4095623p4095789.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Shalin Shekhar Mangar.
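Shalin's diagnosis points at the schema: defining an id-like field is not enough, the schema must also declare it as the uniqueKey. A minimal schema.xml sketch (the field name and type here are illustrative):

```xml
<!-- schema.xml: the field itself... -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<!-- ...and the declaration that SPLITSHARD (and the NPE check above) relies on -->
<uniqueKey>id</uniqueKey>
```

Without the `<uniqueKey>` element, the field exists but Solr has no designated unique key, which matches the NPE scenario described above.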
Re: limiting deep pagination
I just wonder: don't you implement a custom API that interacts with Solr and limits such kinds of requests? (I know that you are asking about how to do that in Solr, but I handle such situations in my custom search APIs and want to learn what fellows do) On Wednesday, 9 October 2013, Michael Sokolov msoko...@safaribooksonline.com wrote: On 10/8/13 6:51 PM, Peter Keegan wrote: Is there a way to configure Solr 'defaults/appends/invariants' such that the product of the 'start' and 'rows' parameters doesn't exceed a given value? This would be to prevent deep pagination. Or would this require a custom requestHandler? Peter Just wondering -- isn't it the sum that you should be concerned about rather than the product? Actually I think what we usually do is limit both independently, with slightly different concerns, since, e.g., start=1, rows=1000 causes memory problems if you have large fields in your results, whereas start=1000, rows=1 may not actually be a problem -Mike
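For what it's worth, the "limit both independently" approach Michael describes is easy to sketch in the custom-API layer mentioned above. A minimal Python sketch; the function name and the two thresholds are invented for illustration, not Solr settings:

```python
# Sketch of a pagination guard for a custom search API in front of Solr,
# limiting start and rows independently as suggested above. Both limits
# are invented example values, not Solr defaults.
MAX_START = 10_000   # how deep paging may go
MAX_ROWS = 100       # how large a single page may be

def clamp_paging(start: int, rows: int) -> tuple[int, int]:
    """Force (start, rows) into the allowed window; reject negatives."""
    if start < 0 or rows < 0:
        raise ValueError("start and rows must be non-negative")
    return min(start, MAX_START), min(rows, MAX_ROWS)
```

Clamping (rather than rejecting) keeps well-behaved clients working while capping the per-request cost; a stricter API could raise instead of clamping.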
Re: Regarding Solr Cloud issue...
I sometimes also get null ranges when doing collections/cores API actions (CREATE and/or UNLOAD, etc.). In 4.4.0 that was not easily fixed because zkCli had problems with the putfile command, but in 4.5.0 it works OK. All you have to do is download clusterstate.json from ZK (get /clusterstate.json), fix the ranges to appropriate values and upload the file back to ZK with zkCli. But why those null ranges happen at all is beyond me :) Primoz From: Shalin Shekhar Mangar shalinman...@gmail.com To: solr-user@lucene.apache.org Date: 16.10.2013 07:37 Subject: Re: Regarding Solr Cloud issue... I'm sorry, I am not able to reproduce this issue. I started 5 solr-4.4 instances. I copied the example directory into example1, example2, example3 and example4: cd example; java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -DnumShards=1 -jar start.jar cd example1; java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar cd example2; java -Djetty.port=7575 -DzkHost=localhost:9983 -jar start.jar cd example3; java -Djetty.port=7576 -DzkHost=localhost:9983 -jar start.jar cd example4; java -Djetty.port=7577 -DzkHost=localhost:9983 -jar start.jar After that I invoked: http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection51&numShards=5&replicationFactor=1 I can see all shards having non-null ranges in clusterstate. On Tue, Oct 15, 2013 at 8:47 PM, Chris christu...@gmail.com wrote: Hi Shalin, Thank you for your quick reply. I appreciate all the help. I started the solr cloud servers first... with 5 nodes. Then I issued a command like below to create the shards - http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=5&replicationFactor=1 http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&replicationFactor=4 Please advise. Regards, Chris On Tue, Oct 15, 2013 at 8:07 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: How did you create these shards? 
Can you tell us how to reproduce the issue? Any shard in a collection with the compositeId router should never have null ranges. On Tue, Oct 15, 2013 at 7:07 PM, Chris christu...@gmail.com wrote: Hi, I am using solr 4.4 as cloud. While creating shards I see that the last shard has a range of null. I am not sure if this is a bug. I am stuck with having a null value for the range in clusterstate.json (attached below): "shard5":{"range":null, "state":"active", "replicas":{"core_node1":{"state":"active", "core":"Web_shard5_replica1", "node_name":"domain-name.com:1981_solr", "base_url":"http://domain-name.com:1981/solr", "leader":"true"}}, "router":"compositeId"}, I tried to use the zookeeper cli to change this, but it was not able to. I tried to locate this file, but didn't find it anywhere. Can you please let me know how do I change the range from null to something meaningful? I have the range that I need, so if I can find the file, maybe I can change it manually. My next question is - can we have a catch-all for ranges, I mean if things don't match any other range then insert in this shard.. is this possible? Kindly advise. Chris -- Regards, Shalin Shekhar Mangar.
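On "I have the range that I need": a healthy collection covers the whole 32-bit hash space with contiguous ranges, one per shard. As a rough sanity check for the values that should replace the null, the sketch below partitions the space into n equal slices the way compositeId routing lays them out (this is an approximation, not Solr's CompositeIdRouter code, and remainder handling for n that does not divide 2**32 may differ; verify against a working cluster before hand-editing clusterstate.json):

```python
# Sketch of how the compositeId router's default shard ranges are laid out:
# the signed 32-bit hash space is split into n contiguous, equal slices,
# starting at 0x80000000 (INT_MIN) and ending at 0x7fffffff (INT_MAX).
def shard_ranges(n: int) -> list[str]:
    """Return n hash ranges formatted like clusterstate.json ('80000000-ffffffff')."""
    start = -(1 << 31)          # INT_MIN, where Solr's full range begins
    end = (1 << 31) - 1         # INT_MAX, where it ends
    step = (1 << 32) // n       # slice width in the 2**32-wide space
    ranges, s = [], start
    for i in range(n):
        e = end if i == n - 1 else s + step - 1   # last slice absorbs any remainder
        ranges.append("%x-%x" % (s & 0xffffffff, e & 0xffffffff))
        s = e + 1
    return ranges
```

For example, a two-shard collection comes out as `80000000-ffffffff` and `0-7fffffff`, which is the shape a non-broken clusterstate shows.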
Re: Cores with lot of folders with prefix index.XXXXXXX
I will certainly try, but give me some time :) Primoz From: Shalin Shekhar Mangar shalinman...@gmail.com To: solr-user@lucene.apache.org Date: 16.10.2013 07:05 Subject: Re: Cores with lot of folders with prefix index.XXX I think that's an acceptable strategy. Can you put up a patch? On Tue, Oct 15, 2013 at 2:32 PM, primoz.sk...@policija.si wrote: I have a question for the developers of Solr regarding the issue of left-over index folders when replication fails. Could this issue be resolved quickly if, when replication starts, Solr creates a flag file in the index.XXX folder, and when replication ends (and commits) this file is deleted? In that case, if a server is restarted (or on a schedule) it could quickly scan all the index.XXX folders and delete those (maybe not the last one, or those relevant to the index.properties file) that still *contain* a flag file and are thus unfinished and uncommitted. I have not really looked at the code yet, so I may have a different view on the workings of replication. Would the solution I described at least address this issue? Best regards, Primoz From: primoz.sk...@policija.si To: solr-user@lucene.apache.org Date: 11.10.2013 12:46 Subject: Re: Cores with lot of folders with prefix index.XXX Thanks, I guess I was wrong after all in my last post. Primož From: Shalin Shekhar Mangar shalinman...@gmail.com To: solr-user@lucene.apache.org Date: 11.10.2013 12:43 Subject: Re: Cores with lot of folders with prefix index.XXX There are open issues related to extra index.XXX folders lying around if replication/recovery fails. See https://issues.apache.org/jira/browse/SOLR-4506 On Fri, Oct 11, 2013 at 4:06 PM, Yago Riveiro yago.rive...@gmail.com wrote: The thread that you point to is about master/slave replication. Is this issue valid in a SolrCloud context? I checked index.properties and indeed the variable index=index.X points to a folder; can the others be deleted without any scary side effects? 
-- Yago Riveiro Sent with Sparrow (http://www.sparrowmailapp.com/?sig) On Friday, October 11, 2013 at 11:31 AM, primoz.sk...@policija.si wrote: Do you have a lot of failed replications? Maybe those folders have something to do with this (please see the last answer at http://stackoverflow.com/questions/3145192/why-does-my-solr-slave-index-keep-growing ). If your disk space is valuable, check the index.properties file under the data folder and try to determine which folders can be safely deleted. Primož From: Yago Riveiro yago.rive...@gmail.com To: solr-user@lucene.apache.org Date: 11.10.2013 12:13 Subject: Re: Cores with lot of folders with prefix index.XXX I have SSDs, therefore my space is like gold; I can't have 30% of my space wasted in failed replications, or replications that are not cleaned up. The question for me is whether this is normal behaviour or a bug. If it is normal behaviour, I am in trouble, because an SSD with more than 512G is expensive. -- Yago Riveiro Sent with Sparrow (http://www.sparrowmailapp.com/?sig) On Friday, October 11, 2013 at 11:03 AM, primoz.sk...@policija.si wrote: I think this is connected to replications being made? I also have quite some of them but currently I am not worried :) -- Regards, Shalin Shekhar Mangar.
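Since, as noted in the thread, index.properties contains an index=index.XXX line naming the live index folder, the "which folders can be safely deleted" question can at least be answered with a dry-run scan. A hedged Python sketch (it only lists candidates; review the output before deleting anything, especially on a replica that may still be recovering):

```python
import os, re

def stale_index_dirs(data_dir: str) -> list[str]:
    """List index* folders under data_dir that index.properties does not
    reference. Dry run only -- it deletes nothing."""
    current = "index"  # the default directory when no index.properties exists
    props = os.path.join(data_dir, "index.properties")
    if os.path.exists(props):
        with open(props) as f:
            for line in f:
                m = re.match(r"index=(\S+)", line.strip())
                if m:
                    current = m.group(1)   # e.g. "index.20131011120000"
    return sorted(
        d for d in os.listdir(data_dir)
        if d.startswith("index") and d != current
        and os.path.isdir(os.path.join(data_dir, d))
    )
```

This mirrors the manual procedure from the Stack Overflow answer: trust index.properties, treat everything else as a leftover candidate.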
Re: Regarding Solr Cloud issue...
Chris, can you post your complete clusterstate.json? Do all shards have a null range? Also, did you issue any core admin CREATE commands apart from the create collection API? Primoz, I was able to reproduce this, but by doing an illegal operation. Suppose I create a collection with numShards=5 and then I issue a core admin create command such as: http://localhost:8983/solr/admin/cores?action=CREATE&name=xyz&collection=mycollection51&shard=shard6 Then a shard6 is added to the collection with a null range. This is a bug because we should never allow such a core admin create to succeed anyway. I'll open an issue. -- Regards, Shalin Shekhar Mangar.
Re: Regarding Solr Cloud issue...
If I am not mistaken, the only way to create a new shard in a collection in 4.4.0 was to use the cores API. That worked fine for me until I used *other* cores API commands. Those usually produced null ranges. In 4.5.0 this is fixed with the newly added commands (createshard etc.) in the collections API, right? Primoz
Re: SPLITSHARD not working in SOLR-4.4.0
Shalin, It is working for me. As you rightly pointed out, I had defined the UNIQUE_KEY field in the schema but forgot to mention this field in the uniqueKey declaration. After I added this, it started working. Another question I have with regard to SPLITSHARD is that we are not able to control on which tomcat nodes the split shards should be created. While creating a collection, we can mention createNodeSet to set our preference of tomcat nodes on which the collection's slices should be created, but I don't find that feature in the SPLITSHARD API. Would you know whether this is a limitation in solr 4.4, or is there any other means by which we can achieve this? -- View this message in context: http://lucene.472066.n3.nabble.com/SPLITSHARD-not-working-in-SOLR-4-4-0-tp4095623p4095809.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: [Indexing XML files in Solr with DataImportHandler]
It is not indexing; it says there are no files indexed. -- View this message in context: http://lucene.472066.n3.nabble.com/Indexing-XML-files-in-Solr-with-DataImportHandler-tp4095628p4095811.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: [Indexing XML files in Solr with DataImportHandler]
On 16 October 2013 13:06, kujta1 kujtim.rahm...@gmail.com wrote: it is not indexing, it is saying there are no files indexed If you expect answers on the mailing list it might be best to provide details here. From a quick glance at Stackoverflow, it looks like you need a FileListEntityProcessor. Searching Google turns up many examples of using a FileDataSource, e.g., see: http://java.dzone.com/news/data-import-handler-%E2%80%93-import Regards, Gora
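Gora's pointer can be made concrete. A hedged sketch of a DIH data-config.xml that walks a directory with FileListEntityProcessor and parses each matched file with XPathEntityProcessor; the baseDir, forEach, and xpath values are illustrative and must match your own file layout and XML structure:

```xml
<dataConfig>
  <dataSource type="FileDataSource"/>
  <document>
    <!-- Outer entity lists the files; it emits no documents itself
         (rootEntity="false") and needs no data source of its own. -->
    <entity name="files" processor="FileListEntityProcessor"
            baseDir="/path/to/xml" fileName=".*\.xml"
            recursive="true" rootEntity="false" dataSource="null">
      <!-- Inner entity parses each file found above. -->
      <entity name="doc" processor="XPathEntityProcessor"
              url="${files.fileAbsolutePath}" forEach="/docs/doc">
        <field column="id" xpath="/docs/doc/id"/>
        <field column="title" xpath="/docs/doc/title"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

"No files indexed" with a setup like this usually means fileName/baseDir matched nothing or forEach does not match the XML's actual root path, so those are the first things to check.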
Re: Debugging update request
Thanks Erick! The version is 4.4.0. I'm posting 100k-doc batches every 30-40 sec from each indexing client, and sometimes two or more clients post in a very small timeframe. That's when I think the deadlock happens. I'll try to replicate the problem and check the thread dump. -- View this message in context: http://lucene.472066.n3.nabble.com/Debugging-update-request-tp4095619p4095821.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Debugging update request
I ran an import last night, and this morning my cloud wouldn't accept updates. I'm running the latest 4.6 snapshot. I was importing with the latest solrj snapshot, and using java bin transport with CloudSolrServer. The cluster had indexed ~1.3 million docs before no further updates were accepted; querying was still working. I'll run jstack shortly and provide the results.
Re: SPLITSHARD not working in SOLR-4.4.0
Thanks for clearing that up. The way it is implemented, shard splitting must create the leaders of sub-shards on the same node as the leader of the parent shard. The locations of the other replicas of the sub-shards are chosen at random. Split shard doesn't support a createNodeSet parameter yet, but it'd make for a nice improvement. Can you please open a jira issue? -- Regards, Shalin Shekhar Mangar.
Re: Regarding Solr Cloud issue...
If the initial collection was created with a numShards parameter (and hence the compositeId router) then there was no way to create a new logical shard. You can add replicas with the core admin API, but only to shards that already exist. A new logical shard can only be created by splitting an existing one. The createshard API also has the same limitation -- it cannot create a shard for a collection with the compositeId router. It is supposed to be used for collections with custom sharding (i.e. the implicit router). In such collections, there is no concept of a hash range, and routing is done explicitly by the user using the shards parameter in the request or by sending the request to the target core/node directly. So, in summary, attempting to add a new logical shard to a collection with the compositeId router via CoreAdmin APIs is wrong, unsupported and should be disallowed. Adding replicas to existing logical shards is okay though. -- Regards, Shalin Shekhar Mangar.
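The distinction above comes down to this: with the compositeId router every document hash must fall inside exactly one shard's range, which is why a null range breaks routing. A toy sketch of that range lookup (Solr actually derives the hash from the doc id with MurmurHash3; here the hash is an input so the sketch stays self-contained):

```python
# Toy lookup of which shard owns a given 32-bit document hash. ranges maps
# shard name -> (lo, hi) as unsigned 32-bit ints, the way they print in
# clusterstate.json. Raises if no range covers the hash -- exactly the
# failure mode that a null range creates.
def route(hash32: int, ranges: dict[str, tuple[int, int]]) -> str:
    def signed(x: int) -> int:
        # clusterstate prints signed 32-bit ints as unsigned hex; compare signed
        return x - (1 << 32) if x >= (1 << 31) else x
    h = signed(hash32)
    for shard, (lo, hi) in ranges.items():
        if signed(lo) <= h <= signed(hi):
            return shard
    raise LookupError("no shard covers hash %x (hole in hash ranges)" % hash32)
```

With the implicit router this lookup simply never happens, which is why those collections can have shards without ranges.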
RE: ClusteringComponent under Tomcat 7
Hi, If I recall correctly this problem relates to the class loader path. Make sure that the ./lib (solr home, where you've replaced the jars) is not also part of the Tomcat class loader path. (In other words, solr and Tomcat cannot share the same ./lib directories.) -Ariel

-----Original Message-----
From: ravi koshal [mailto:ravikosha...@gmail.com]
Sent: Tuesday, October 15, 2013 10:10 AM
To: solr-user@lucene.apache.org
Subject: Re: ClusteringComponent under Tomcat 7

Hi Lieberman, I am facing the same issue. Were you able to resolve this? I am able to see the solr home, but the cores do not appear. My stack trace is as follows:

org.apache.solr.common.SolrException: Error Instantiating SearchComponent, solr.clustering.ClusteringComponent failed to instantiate org.apache.solr.handler.component.SearchComponent
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:834)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:625)
    at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:524)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:559)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:249)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:241)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.solr.common.SolrException: Error Instantiating SearchComponent, solr.clustering.ClusteringComponent failed to instantiate org.apache.solr.handler.component.SearchComponent
    at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:547)
    at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:582)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2128)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2122)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2155)
    at org.apache.solr.core.SolrCore.loadSearchComponents(SolrCore.java:1177)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:762)
    ... 11 more
Caused by: java.lang.ClassCastException: class org.apache.solr.handler.clustering.ClusteringComponent
    at java.lang.Class.asSubclass(Unknown Source)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:443)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:381)
    at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:526)

Lieberman, Ariel Ariel.Lieberman at verint.com writes: Hi, I'm trying to run Solr 4.3 (and 4.4) with -Dsolr.clustering.enabled=true. I've copied all relevant jars to the ./lib directory under the instance. With jetty it runs OK! But under Tomcat I receive the error (exception) below. Any idea/help? Thanks, -Ariel

org.apache.solr.common.SolrException: Error Instantiating SearchComponent, solr.clustering.ClusteringComponent failed to instantiate org.apache.solr.handler.component.SearchComponent
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:835)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:629)
    at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:622)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:657)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:364)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:356)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.solr.common.SolrException: Error Instantiating SearchComponent, solr.clustering.ClusteringComponent failed to instantiate org.apache.solr.handler.component.SearchComponent
    at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:551)
    at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:586)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2173)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2167)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2200)
    at org.apache.solr.core.SolrCore.loadSearchComponents(SolrCore.java:1231)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:766)
    ... 13 more
Caused by:
req info : SOLRJ and TermVector
Hi, can I access TermVector information using solrj? Thx, elfu
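Yes -- enable the TermVectorComponent in solrconfig.xml and pass tv=true on the request; in SolrJ the data comes back under the "termVectors" key of QueryResponse.getResponse() as a NamedList that, as far as I know, you walk yourself in 4.x (there is no typed accessor). The same NamedList structure serialises to flat name/value pair lists in JSON output; a small Python sketch of decoding that pair form, with invented sample data, assuming the flat json.nl style:

```python
def namedlist_to_dict(pairs):
    """Convert Solr's flat NamedList JSON form -- [name1, val1, name2, val2, ...]
    -- into a dict, recursing into nested pair lists."""
    out = {}
    for i in range(0, len(pairs), 2):
        name, val = pairs[i], pairs[i + 1]
        out[name] = namedlist_to_dict(val) if isinstance(val, list) else val
    return out
```

The same even-index/odd-index walk applies to SolrJ's NamedList via getName(i)/getVal(i), just in Java instead.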
Re: Regarding Solr Cloud issue...
Yap, you are right - I only created extra replicas with cores API. For a new shard I had to use split shard command. My apologies. Primož From: Shalin Shekhar Mangar shalinman...@gmail.com To: solr-user@lucene.apache.org Date: 16.10.2013 10:45 Subject:Re: Regarding Solr Cloud issue... If the initial collection was created with a numShards parameter (and hence compositeId router then there was no way to create a new logical shard. You can add replicas with the core admin API but only to shards that already exist. A new logical shard can only be created by splitting an existing one. The createshard API also has the same limitation -- it cannot create a shard for a collection with compositeId router. It is supposed to be used for collections with custom sharding (i.e. implicit router). In such collections, there is no concept of a hash range and routing is done explicitly by the user using the shards parameter in the request or by sending the request to the target core/node directly. So, in summary, attempting to add a new logical shard to a collection with compositeId router via CoreAdmin APIs is wrong, unsupported and should be disallowed. Adding replicas to existing logical shards is okay though. On Wed, Oct 16, 2013 at 12:56 PM, primoz.sk...@policija.si wrote: If I am not mistaken the only way to create a new shard from a collection in 4.4.0 was to use cores API. That worked fine for me until I used *other* cores API commands. Those usually produced null ranges. In 4.5.0 this is fixed with newly added commands createshard etc. to the collections API, right? Primoz From: Shalin Shekhar Mangar shalinman...@gmail.com To: solr-user@lucene.apache.org Date: 16.10.2013 09:06 Subject:Re: Regarding Solr Cloud issue... Chris, can you post your complete clusterstate.json? Do all shards have a null range? Also, did you issue any core admin CREATE commands apart from the create collection api. Primoz, I was able to reproduce this but by doing an illegal operation. 
Suppose I create a collection with numShards=5 and then I issue a core admin create command such as: http://localhost:8983/solr/admin/cores?action=CREATE&name=xyz&collection=mycollection51&shard=shard6 Then a shard6 is added to the collection with a null range. This is a bug because we should never allow such a core admin create to succeed anyway. I'll open an issue.
On Wed, Oct 16, 2013 at 11:49 AM, primoz.sk...@policija.si wrote: I sometimes also get null ranges when doing collections/cores API actions CREATE or/and UNLOAD, etc. In 4.4.0 that was not easily fixed because zkCli had problems with the putfile command, but in 4.5.0 it works OK. All you have to do is download clusterstate.json from ZK (get /clusterstate.json), fix the ranges to appropriate values, and upload the file back to ZK with zkCli. But why those null ranges happen at all is beyond me :) Primoz
From: Shalin Shekhar Mangar shalinman...@gmail.com To: solr-user@lucene.apache.org Date: 16.10.2013 07:37 Subject: Re: Regarding Solr Cloud issue...
I'm sorry, I am not able to reproduce this issue. I started 5 solr-4.4 instances. I copied the example directory into example1, example2, example3 and example4:
cd example; java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -DnumShards=1 -jar start.jar
cd example1; java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar
cd example2; java -Djetty.port=7575 -DzkHost=localhost:9983 -jar start.jar
cd example3; java -Djetty.port=7576 -DzkHost=localhost:9983 -jar start.jar
cd example4; java -Djetty.port=7577 -DzkHost=localhost:9983 -jar start.jar
After that I invoked: http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection51&numShards=5&replicationFactor=1 I can see all shards having non-null ranges in clusterstate.
On Tue, Oct 15, 2013 at 8:47 PM, Chris christu...@gmail.com wrote: Hi Shalin, Thank you for your quick reply. I appreciate all the help. I started the solr cloud servers first... with 5 nodes.
Then I issued a command like below to create the shards - http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=5&replicationFactor=1 http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&replicationFactor=4 Please advise. Regards, Chris
On Tue, Oct 15, 2013 at 8:07 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: How did you create these shards? Can you tell us how to reproduce the issue? Any shard in a collection with the compositeId router should never have null ranges.
On Tue, Oct 15, 2013 at 7:07 PM, Chris christu...@gmail.com wrote: Hi, I am using solr 4.4 as cloud. While creating shards I
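A Collections API CREATE call is just an HTTP GET with `&`-separated parameters; the sketch below (plain JDK, host and collection name illustrative) assembles one programmatically, which avoids the separator mangling that mail archives tend to introduce:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class CollectionCreateUrl {
    // Joins Collections API parameters with "&". Host, port and names
    // here are placeholders, not values from the thread.
    static String buildCreateUrl(String host, String name,
                                 int numShards, int replicationFactor) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("action", "CREATE");
        params.put("name", name);
        params.put("numShards", String.valueOf(numShards));
        params.put("replicationFactor", String.valueOf(replicationFactor));
        return host + "/solr/admin/collections?"
                + params.entrySet().stream()
                        .map(e -> e.getKey() + "=" + e.getValue())
                        .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        System.out.println(buildCreateUrl("http://localhost:8983",
                "mycollection", 5, 1));
    }
}
```

Real parameter values should of course be URL-encoded if they contain reserved characters.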
Local Solr and Webserver-Solr act differently (AND treated like OR)
Hello Solr experts, I am currently having a strange issue with my Solr queries. I am running a small PHP/MySQL website that uses Solr for faster text searches in name lists, movie titles, etc. Recently I noticed that the results in my local development environment differ from those on my webserver. Both use the 100% same MySQL database with identical Solr queries for data import. This is a sample query: http://localhost:8080/solr/select/?q=title%3A%28into+AND+the+AND+wild*%29&version=2.2&start=0&rows=1000&indent=on&fl=titleid It is autogenerated by a PHP script and 100% identical locally and on my webserver. My local Solr gives me the expected results: all entries that have the words into AND the AND wild* in them. But my webserver acts as if I was looking for into OR the OR wild*, even though the query is the same (as shown above). That's why I get useless (too many) results on the webserver side. I don't know what could be the issue. I have tried to check the config files, but I don't really know what to look for, so it is overwhelming to search through this big file without knowing. What could be the problem, where can I check/find it, and how can I solve it? In case additional information is needed, please let me know. Thank you! (Excuse my poor English, please. It's not my mother language.)
Re: Debugging update request
I got the trace from jstack. I found references to semaphore but not sure if this is what you meant. Here's the trace: http://pastebin.com/15QKAz7U -- View this message in context: http://lucene.472066.n3.nabble.com/Debugging-update-request-tp4095619p4095847.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Debugging update request
Here is my jstack output... Lots of blocked threads. http://pastebin.com/1ktjBYbf On 16 October 2013 10:28, michael.boom my_sky...@yahoo.com wrote: I got the trace from jstack. I found references to semaphore but not sure if this is what you meant. Here's the trace: http://pastebin.com/15QKAz7U -- View this message in context: http://lucene.472066.n3.nabble.com/Debugging-update-request-tp4095619p4095847.html Sent from the Solr - User mailing list archive at Nabble.com.
Boosting a field with defType:dismax -- No results at all
Hi there, I want to boost a field, see below. If I add the defType:dismax I don't get any results at all anymore. What am I doing wrong? Regards, Uwe

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="omitHeader">true</str>
    <str name="df">text</str>
    <str name="q.op">AND</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.count">1</str>
    <str name="spellcheck.maxResultsForSuggest">100</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">true</str>
    <str name="spellcheck.maxCollations">1</str>
    <str name="defType">dismax</str>
    <str name="qf">SignalImpl.baureihe^1011 text^0.1</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>

-- View this message in context: http://lucene.472066.n3.nabble.com/Boosting-a-field-with-defType-dismax-No-results-at-all-tp4095850.html Sent from the Solr - User mailing list archive at Nabble.com.
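Two things worth checking here (assumptions on my part, not a confirmed diagnosis): dismax tends to return nothing if a field listed in `qf` does not exist or is not indexed, and with `q.op=AND` every query term must match in at least one `qf` field (for dismax, `q.op=AND` is equivalent to `mm=100%`). A relaxed sketch of the relevant defaults for debugging:

```xml
<str name="defType">dismax</str>
<!-- make the match requirement explicit and loose while debugging;
     q.op=AND is equivalent to mm=100% for dismax -->
<str name="mm">1</str>
<!-- every field listed here must exist in schema.xml and be indexed -->
<str name="qf">SignalImpl.baureihe^1011 text^0.1</str>
```

If results come back with `mm=1`, the match requirement was the problem; if they still don't, check the `qf` field names against the schema and look at `debugQuery=true` output.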
Timeout Errors while using Collections API
Hi, My setup is: Zookeeper ensemble - running with 3 nodes. Tomcats - 9 Tomcat instances are brought up, registering with zookeeper. Steps: 1) I uploaded the solr configuration like db_data_config, solrconfig, schema xmls into zookeeper 2) Now I am trying to create a collection with the collections API like below: http://miadevuser001.albridge.com:7021/solr/admin/collections?action=CREATE&name=Schwab_InvACC_Coll&numShards=1&replicationFactor=2&createNodeSet=localhost:7034_solr,localhost:7036_solr&collection.configName=InvestorAccountDomainConfig When I execute this command, I get the following error:
<response><lst name="responseHeader"><int name="status">500</int><int name="QTime">60015</int></lst><lst name="error"><str name="msg">createcollection the collection time out:60s</str><str name="trace">org.apache.solr.common.SolrException: createcollection the collection time out:60s
at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:175)
at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:156)
at org.apache.solr.handler.admin.CollectionsHandler.handleCreateAction(CollectionsHandler.java:290)
at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:112)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:611)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:218)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:947)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1009)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
</str><int name="code">500</int></lst></response>
Now, after I got this error, I am not able to do any operation on these instances with the collections API. It repeatedly gives the same timeout error. This setup was working fine 5 minutes back; suddenly it started throwing these exceptions. Any ideas please? -- View this message in context: http://lucene.472066.n3.nabble.com/Timeout-Errors-while-using-Collections-API-tp4095852.html Sent from the Solr - User mailing list archive at Nabble.com.
how does solr load plugins?
Hi, I wrote a plugin to index content reusing our DAO layer, which is developed using Spring. What I am doing now is putting the plugin jar and all the other jars the DAO layer depends on into the shared lib folder under solr home. In the log I can see all the jars are loaded through SolrResourceLoader, like: INFO - 2013-10-16 16:25:30.611; org.apache.solr.core.SolrResourceLoader; Adding 'file:/D:/apache-tomcat-7.0.42/solr/lib/spring-tx-3.1.0.RELEASE.jar' to classloader Then I initialize the Spring context using: ApplicationContext context = new FileSystemXmlApplicationContext("/solr/spring/solr-plugin-bean-test.xml"); Then Spring complains: INFO - 2013-10-16 16:33:57.432; org.springframework.context.support.AbstractApplicationContext; Refreshing org.springframework.context.support.FileSystemXmlApplicationContext@e582a85: startup date [Wed Oct 16 16:33:57 CST 2013]; root of context hierarchy INFO - 2013-10-16 16:33:57.491; org.springframework.beans.factory.xml.XmlBeanDefinitionReader; Loading XML bean definitions from file [D:\apache-tomcat-7.0.42\solr\spring\solr-plugin-bean-test.xml] ERROR - 2013-10-16 16:33:59.944; com.test.search.solr.spring.AppicationContextWrapper; Configuration problem: Unable to locate Spring NamespaceHandler for XML schema namespace [http://www.springframework.org/schema/context] Offending resource: file [D:\apache-tomcat-7.0.42\solr\spring\solr-plugin-bean-test.xml] The Spring context requires spring-tx-3.1.xsd, which does exist in spring-tx-3.1.0.RELEASE.jar under the org\springframework\transaction\config\ package, but the program can't find it even though it could load Spring classes successfully. The following won't work either: ApplicationContext context = new ClassPathXmlApplicationContext("classpath:spring/solr-plugin-bean-test.xml"); // the solr-plugin-bean-test.xml is packaged in plugin.jar as well.
But when I put all the jars under TOMCAT_HOME/webapp/solr/WEB-INF/lib and use ApplicationContext context = new ClassPathXmlApplicationContext("classpath:spring/solr-plugin-bean-test.xml"); everything works fine: I can initialize the Spring context and load DAO beans to read data and then write it to the solr index. But isn't modifying solr.war a bad practice? It seems SolrResourceLoader only loads classes from plugin jars, but these jars are NOT on the classpath. Please correct me if I am wrong. Is there any way to use resources in plugin jars, such as configuration files? BTW, is there any difference between SolrResourceLoader and the Tomcat webapp classLoader? -- All the best, Liu Bo
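Assuming the root cause is that Spring resolves XSDs (via `META-INF/spring.schemas`) through the *thread context* classloader, which in a webapp points at Tomcat's webapp classloader rather than at SolrResourceLoader's plugin classloader, one commonly suggested workaround is to swap the context classloader while the ApplicationContext is built. This is a sketch under that assumption, demonstrated with a plain `URLClassLoader` instead of real Solr/Spring classes:

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.function.Supplier;

public class ContextClassLoaderSwap {
    // Runs `work` with `loader` installed as the thread context classloader,
    // restoring the previous loader afterwards. In the real plugin, `work`
    // would be `() -> new FileSystemXmlApplicationContext(...)` and `loader`
    // would be the plugin classloader (e.g. getClass().getClassLoader()).
    static <T> T withContextClassLoader(ClassLoader loader, Supplier<T> work) {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            return work.get();
        } finally {
            current.setContextClassLoader(previous); // always restore
        }
    }

    public static void main(String[] args) {
        // Stand-in for the plugin classloader (no extra jars needed here).
        ClassLoader plugin = new URLClassLoader(new URL[0],
                ContextClassLoaderSwap.class.getClassLoader());
        String seenInside = withContextClassLoader(plugin,
                () -> Thread.currentThread().getContextClassLoader()
                        .getClass().getSimpleName());
        System.out.println(seenInside);
        System.out.println(Thread.currentThread().getContextClassLoader() == plugin);
    }
}
```

Whether this actually fixes the NamespaceHandler lookup depends on the Spring version's resource-resolution path, so treat it as something to try, not a guaranteed fix.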
SolrCloud Query Balancing
I have set up a SolrCloud system with 3 shards, replicationFactor=3, on 3 machines, along with 3 Zookeeper instances. My web application makes queries to Solr specifying the hostname of one of the machines, so that machine always gets the request while the other ones just serve as an aid. I would like to set up a load balancer to fix that, balancing the queries across all machines, and maybe do the same while indexing. Would this be a good practice? Any recommended tools for doing that? Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-Balancing-tp4095854.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: SolrCloud Query Balancing
If your web application is Java-based / uses SolrJ, use a CloudSolrServer instance with the zkHost string. It will take care of load balancing when querying and indexing, and handle routing if a node goes down. On 16 October 2013 10:52, michael.boom my_sky...@yahoo.com wrote: [...]
Re: SolrCloud Query Balancing
Thanks! I've read a lil' bit about that, but my app is php-based so I'm afraid I can't use that. -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-Balancing-tp4095854p4095857.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Different document types in different collections OR same collection without sharing fields?
Can some expert users please leave a comment on this? On Sun, Oct 6, 2013 at 2:54 AM, user 01 user...@gmail.com wrote: Using a single-node Solr instance, I need to search for, let's say, electronics items & grocery items. But I never want to search both of them together. When I search for electronics I don't expect a grocery item, ever, & vice versa. Should I be defining both these document types within a single schema.xml, or should I use a different collection for each of these two (maintaining separate schema.xml & solrconfig.xml for each)? I believe that if I add both to a single collection, without sharing fields between these two document types, I should be equally good as separating them into two collections (in terms of performance & all), as their indexes/filter caches would be totally independent of each other when they don't share fields? Also posted at SO: http://stackoverflow.com/q/19202882/530153
Re: SolrCloud Query Balancing
What you could do (and what we do) is to have a simple proxy in front of your Solr instances. We, for example, run with Nginx in front of all of our Tomcats, and use Nginx's upstream capabilities as a simple load balancer for our SolrCloud cluster. http://wiki.nginx.org/HttpUpstreamModule I'm sure other web servers have similar modules. On 16/10/2013 at 12.08, michael.boom my_sky...@yahoo.com wrote: Thanks! I've read a lil' bit about that, but my app is php-based so I'm afraid I can't use that. -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-Balancing-tp4095854p4095857.html Sent from the Solr - User mailing list archive at Nabble.com.
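A minimal sketch of the Nginx approach described above (hostnames, ports and paths are placeholders; tune timeouts and health checks for a real deployment):

```nginx
# Round-robin load balancing over three Solr nodes.
upstream solrcloud {
    server solr1.example.com:8983;
    server solr2.example.com:8983;
    server solr3.example.com:8983;
}

server {
    listen 80;
    location /solr/ {
        proxy_pass http://solrcloud;
    }
}
```

The PHP application then points at the proxy host instead of a single Solr node, so queries (and, if desired, updates) are spread across all machines.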
Re: Local Solr and Webserver-Solr act differently (AND treated like OR)
What does the debug output from debugQuery=true say between the two? On Oct 16, 2013, at 5:16, Stavros Delisavas stav...@delisavas.de wrote: [...]
Solr Copy field append values ?
Hi, My schema is like this; external_id is a multivalued field: <copyField source="upc" dest="external_id"/> I want to know: will the values of upc be appended to the existing values of external_id, or override them? For example, if I send a document having the values upc:131 and external_id:423 for indexing in Solr with the above-mentioned schema, what will the value of the external_id field be: 131, or 131 and 423? Thanks, Vishal -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Copy-field-append-values-tp4095862.html Sent from the Solr - User mailing list archive at Nabble.com.
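For what it's worth: copyField adds the copied value to the destination field at index time rather than overwriting it, so with a multiValued destination both values end up in the field. A minimal schema sketch (field types assumed for illustration):

```xml
<field name="upc" type="string" indexed="true" stored="true"/>
<field name="external_id" type="string" indexed="true" stored="true"
       multiValued="true"/>
<copyField source="upc" dest="external_id"/>
<!-- indexing upc=131 together with external_id=423 leaves external_id
     holding both values -->
```

If `external_id` were single-valued, the same document would typically fail to index with a multiple-values-for-a-single-valued-field error instead.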
Re: Different document types in different collections OR same collection without sharing fields?
Hi, Please refer to the link below for clarification on fields having null values: http://stackoverflow.com/questions/7332122/solr-what-are-the-default-values-for-fields-which-does-not-have-a-default-value Logically, it is better to have different collections for different domain data. Having 2 collections will improve the overall performance. Currently I am holding 2 collections for different domain data. It eases importing data and re-indexing. Regards, Shrikanth On Wed, Oct 16, 2013 at 3:48 PM, user 01 user...@gmail.com wrote: [...]
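If the two document types do go into separate indexes on a single non-cloud Solr 4.x node, they can be hosted as two cores in one instance, each with its own conf/ directory. A legacy-style solr.xml sketch (core names and paths are illustrative):

```xml
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <!-- one core per domain; each instanceDir has its own
         conf/schema.xml and conf/solrconfig.xml -->
    <core name="electronics" instanceDir="electronics"/>
    <core name="grocery" instanceDir="grocery"/>
  </cores>
</solr>
```

Queries then target /solr/electronics/select or /solr/grocery/select, so the two indexes and their caches stay fully independent.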
Re: Regarding Solr Cloud issue...
Hi, Please find the clusterstate.json below. I have created a dev environment on one of my servers so that you can see the issue live - http://64.251.14.47:1984/solr/ Also, there seems to be something wrong in zookeeper: when we try to add documents using solrj, it works fine as long as the insert load is not high, but once we start doing many inserts, it throws a lot of errors... I am doing something like -
CloudSolrServer solrCoreCloud = new CloudSolrServer(cloudURL);
solrCoreCloud.setDefaultCollection("Image");
UpdateResponse up = solrCoreCloud.addBean(resultItem);
UpdateResponse upr = solrCoreCloud.commit();
clusterstate.json ---
{"collection1":{
    "shards":{
      "shard2":{"range":"b333-e665","state":"active","replicas":{"core_node4":{"state":"active","core":"collection1","node_name":"64.251.14.47:1984_solr","base_url":"http://64.251.14.47:1984/solr","leader":"true"}}},
      "shard3":{"range":"e666-1998","state":"active","replicas":{"core_node5":{"state":"active","core":"collection1","node_name":"64.251.14.47:1985_solr","base_url":"http://64.251.14.47:1985/solr","leader":"true"}}},
      "shard4":{"range":"1999-4ccb","state":"active","replicas":{
          "core_node2":{"state":"active","core":"collection1","node_name":"64.251.14.47:1982_solr","base_url":"http://64.251.14.47:1982/solr"},
          "core_node6":{"state":"active","core":"collection1","node_name":"64.251.14.47:1981_solr","base_url":"http://64.251.14.47:1981/solr","leader":"true"}}},
      "shard5":{"range":"4ccc-7fff","state":"active","replicas":{"core_node3":{"state":"active","core":"collection1","node_name":"64.251.14.47:1983_solr","base_url":"http://64.251.14.47:1983/solr","leader":"true"}}}},
    "router":"compositeId"},
  "Web":{
    "shards":{
      "shard1":{"range":"8000-b332","state":"active","replicas":{"core_node2":{"state":"active","core":"Web_shard1_replica1","node_name":"64.251.14.47:1983_solr","base_url":"http://64.251.14.47:1983/solr","leader":"true"}}},
      "shard2":{"range":"b333-e665","state":"active","replicas":{"core_node3":{"state":"active","core":"Web_shard2_replica1","node_name":"64.251.14.47:1984_solr","base_url":"http://64.251.14.47:1984/solr","leader":"true"}}},
      "shard3":{"range":"e666-1998","state":"active","replicas":{"core_node4":{"state":"active","core":"Web_shard3_replica1","node_name":"64.251.14.47:1982_solr","base_url":"http://64.251.14.47:1982/solr","leader":"true"}}},
      "shard4":{"range":"1999-4ccb","state":"active","replicas":{"core_node5":{"state":"active","core":"Web_shard4_replica1","node_name":"64.251.14.47:1985_solr","base_url":"http://64.251.14.47:1985/solr","leader":"true"}}},
      "shard5":{"range":null,"state":"active","replicas":{"core_node1":{"state":"active","core":"Web_shard5_replica1","node_name":"64.251.14.47:1981_solr","base_url":"http://64.251.14.47:1981/solr","leader":"true"}}}},
    "router":"compositeId"},
  "Image":{
    "shards":{
      "shard1":{"range":"8000-b332","state":"active","replicas":{"core_node1":{"state":"active","core":"Image_shard1_replica1","node_name":"64.251.14.47:1983_solr","base_url":"http://64.251.14.47:1983/solr","leader":"true"}}},
      "shard2":{"range":"b333-e665","state":"active","replicas":{"core_node2":{"state":"active","core":"Image_shard2_replica1","node_name":"64.251.14.47:1985_solr","base_url":"http://64.251.14.47:1985/solr","leader":"true"}}},
      "shard3":{"range":"e666-1998","state":"active","replicas":{"core_node3":{"state":"active","core":"Image_shard3_replica1","node_name":"64.251.14.47:1984_solr","base_url":"http://64.251.14.47:1984/solr","leader":"true"}}},
      "shard4":{"range":"1999-4ccb","state":"active","replicas":{"core_node5":{"state":"active","core":"Image_shard4_replica1","node_name":"64.251.14.47:1982_solr","base_url":"http://64.251.14.47:1982/solr","leader":"true"}}},
      "shard5":{"range":null,"state":"active","replicas":{"core_node4":{"state":"active","core":"Image_shard5_replica1","node_name":"64.251.14.47:1981_solr",
Re: Local Solr and Webserver-Solr act differently (AND treated like OR)
My local Solr gives me: http://pastebin.com/Q6d9dFmZ and my webserver this: http://pastebin.com/q87WEjVA I copied only the first few hundred lines (of more than 8000) because the webserver output was too big even for pastebin. On 16.10.2013 12:27, Erik Hatcher wrote: [...]
Re: Concurent indexing
Run jstack on the solr process (standard with Java) and look for the word semaphore. You should see your servers blocked on this in the Solr code. That'll pretty much nail it. There's an open JIRA to fix the underlying cause, see SOLR-5232, but that's currently slated for 4.6, which won't be cut for a while. Also, there's a patch that will fix this as a side effect, assuming you're using SolrJ: see SOLR-4816, which is available in 4.5. Best, Erick On Tue, Oct 15, 2013 at 1:33 PM, michael.boom my_sky...@yahoo.com wrote: Here's some of Solr's last words (log content before it stopped accepting updates), maybe someone can help me interpret it: http://pastebin.com/mv7fH62H -- View this message in context: http://lucene.472066.n3.nabble.com/Concurent-indexing-tp4095409p4095642.html Sent from the Solr - User mailing list archive at Nabble.com.
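The jstack suggestion above can be illustrated in-process: a thread blocked in `Semaphore.acquire()` shows up in a thread dump parked on `java.util.concurrent.Semaphore$NonfairSync`, which is the string to look for. This sketch is illustrative only and does not use Solr's actual classes:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.concurrent.Semaphore;

public class SemaphoreDumpSketch {
    // Scans a programmatic thread dump for threads whose wait lock mentions
    // "Semaphore" -- the same signature you would grep for in jstack output.
    static boolean anyThreadParkedOnSemaphore() {
        for (ThreadInfo info : ManagementFactory.getThreadMXBean()
                .dumpAllThreads(false, false)) {
            if (info != null && info.getLockName() != null
                    && info.getLockName().contains("Semaphore")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        Semaphore gate = new Semaphore(0); // stand-in for Solr's update semaphore
        Thread updater = new Thread(gate::acquireUninterruptibly, "blocked-updater");
        updater.start();
        while (updater.getState() != Thread.State.WAITING) {
            Thread.sleep(10); // wait until the thread has actually parked
        }
        System.out.println(anyThreadParkedOnSemaphore());
        gate.release(); // unblock so the JVM can exit
        updater.join();
    }
}
```

Against a live Solr, the equivalent is simply `jstack <pid> | grep -i semaphore`.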
Re: Regarding Solr Cloud issue...
Oops, the actual URL is http://64.251.14.47:1981/solr/ Also, another issue that needs to be raised: the creation of cores from the core admin section of the GUI doesn't really work well; it creates files, but then they do not work (again, I am using 4.4). On Wed, Oct 16, 2013 at 4:12 PM, Chris christu...@gmail.com wrote: [...]
Re: Regarding Solr Cloud issue...
Also, is there any easy way of upgrading to 4.5 without having to change most of my plugin configuration files? On Wed, Oct 16, 2013 at 4:18 PM, Chris christu...@gmail.com wrote: [...]
Re: Concurent indexing
Hi Erick, here is a paste from the other thread (debugging update request) with my input, as I am seeing errors too: I ran an import last night, and this morning my cloud wouldn't accept updates. I'm running the latest 4.6 snapshot. I was importing with the latest SolrJ snapshot, using javabin transport with CloudSolrServer. The cluster had indexed ~1.3 million docs before no further updates were accepted; querying was still working. I'll run jstack shortly and provide the results. Here is my jstack output... lots of blocked threads. http://pastebin.com/1ktjBYbf On 16 October 2013 11:46, Erick Erickson erickerick...@gmail.com wrote: Run jstack on the Solr process (standard with Java) and look for the word "semaphore". You should see your servers blocked on this in the Solr code. That'll pretty much nail it. There's an open JIRA to fix the underlying cause, SOLR-5232, but that's currently slated for 4.6, which won't be cut for a while. Also, there's a patch that will fix this as a side effect, assuming you're using SolrJ: SOLR-4816, which is available in 4.5. Best, Erick On Tue, Oct 15, 2013 at 1:33 PM, michael.boom my_sky...@yahoo.com wrote: Here's some of Solr's last words (log content before it stopped accepting updates); maybe someone can help me interpret that. http://pastebin.com/mv7fH62H -- View this message in context: http://lucene.472066.n3.nabble.com/Concurent-indexing-tp4095409p4095642.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Different document types in different collections OR same collection without sharing fields?
@Shrikanth: how do you manage the multiple, redundant configurations (isn't that what they are)? I thought the indexes would be separate when fields aren't shared. I don't need to import any data or re-index, if those are the only benefits of separate collections; I just index when a request comes in / a new item is added to the DB. On Wed, Oct 16, 2013 at 4:12 PM, shrikanth k jconsult.s...@gmail.com wrote: Hi, Please refer to the link below for clarification on fields having a null value: http://stackoverflow.com/questions/7332122/solr-what-are-the-default-values-for-fields-which-does-not-have-a-default-value Logically it is better to have different collections for different domain data. Having 2 collections will improve the overall performance. Currently I am holding 2 collections for different domain data. It eases importing data and re-indexing. Regards, Shrikanth On Wed, Oct 16, 2013 at 3:48 PM, user 01 user...@gmail.com wrote: Can some expert users please leave a comment on this? On Sun, Oct 6, 2013 at 2:54 AM, user 01 user...@gmail.com wrote: Using a single-node Solr instance, I need to search for, let's say, electronics items and grocery items, but I never want to search both of them together. When I search for electronics I don't expect a grocery item, and vice versa. Should I be defining both these document types within a single schema.xml, or should I use a different collection for each of the two (maintaining separate schema.xml and solrconfig.xml for each)? I believe that if I add both to a single collection, without sharing fields between the two document types, I should be equally well off as with separating them into two collections (in terms of performance and all), as their indexes/filter caches would be totally independent of each other when they don't share fields? Also posted at SO: http://stackoverflow.com/q/19202882/530153 --
Re: Regarding Solr Cloud issue...
Also, another issue that needs to be raised is the creation of cores from the core admin section of the GUI. It doesn't really work well: it creates files, but then they do not work (again, I am using 4.4). From my experience the core admin section of the GUI does not work well in the SolrCloud domain. If I am not mistaken this was somehow fixed in 4.5.0, which behaves much better. I would use only HTTP requests (cores and collections API) with SolrCloud, and would use the GUI only for viewing the state of the cluster and cores. Primoz
Re: req info : SOLRJ and TermVector
(13/10/16 17:47), elfu wrote: hi, can I access TermVector information using solrj? There is TermVectorComponent to get term vector info: http://wiki.apache.org/solr/TermVectorComponent So yes, you can access it using SolrJ. koji -- http://soleami.com/blog/automatically-acquiring-synonym-knowledge-from-wikipedia.html
RE: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID
Hi Otis, Did you get a chance to look into the logs? Please let me know if you need more information. Thank you. Regards, Bharat Akkinepalli -Original Message- From: Akkinepalli, Bharat (ELS-CON) [mailto:b.akkinepa...@elsevier.com] Sent: Friday, October 11, 2013 2:16 PM To: solr-user@lucene.apache.org Subject: RE: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID Hi Otis, Thanks for the response. The log files can be found here. Master log: http://pastebin.com/DPLKMPcF Slave log: http://pastebin.com/DX9sV6Jx One more point worth mentioning here is that when we issue the commit with expungeDeletes=true, the delete-by-id replication is successful, i.e. http://localhost:8983/solr/annotation/update?commit=true&expungeDeletes=true Regards, Bharat Akkinepalli -Original Message- From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com] Sent: Wednesday, October 09, 2013 6:35 PM To: solr-user@lucene.apache.org Subject: Re: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID Bharat, Can you look at the logs on the Master when you issue the delete and the subsequent commits and share that? Otis -- Solr & ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm On Tue, Oct 8, 2013 at 3:57 PM, Akkinepalli, Bharat (ELS-CON) b.akkinepa...@elsevier.com wrote: Hi, We have recently migrated from Solr 3.6 to Solr 4.4. We are using the Master/Slave configuration in Solr 4.4 (not SolrCloud). We have noticed the following behavior/defect.

Configuration:
===
1. Hard commit and soft commit are disabled in the configuration (we control the commits from the application).
2. We have 1 master and 2 slaves configured, and the pollInterval is configured to 10 minutes.
3. The master is configured with replicateAfter set to commit and startup.

Steps to reproduce the problem:
==
1. Delete a document in Solr (using delete by id). URL - http://localhost:8983/solr/annotation/update with body <delete><id>change.me</id></delete>
2. Issue a commit on the master (http://localhost:8983/solr/annotation/update?commit=true).
3. The replication of the DELETE WILL NOT happen. The master and slaves have the same index version.
4. If we issue another commit on the master, we see that it replicates fine.

Request you to please confirm if this is a known issue. Thank you. Regards, Bharat Akkinepalli
Re: Find documents that are composed of % words
Hi Shahzad, Personally I am of the same opinion as the others who have replied: you are better off going back to your clients at this stage, with all the newfound info/data points. Further, to the questions that you put to me directly: 1) For option 1, as indicated earlier, you have to compute myfieldwordcount outside of Solr and push it in as any other field. As far as I know, there is no filter that will do this for you out of the box. 2) For option 2, you would want to take a look at: http://wiki.apache.org/solr/SolrRelevancyFAQ#index-time_boosts Related links: Function Query: http://wiki.apache.org/solr/FunctionQuery#norm Norms: http://lucene.apache.org/core/3_5_0/api/all/org/apache/lucene/search/Similarity.html#computeNorm(java.lang.String, org.apache.lucene.index.FieldInvertState) Changes to schema: http://wiki.apache.org/solr/SchemaXml#Common_field_options (omitNorms option) For a field with the default boost (= 1), norm = lengthNorm (approximately 1/sqrt(numTerms)). The norm is multiplied in twice in the query to divide the score (approximately) by numTerms. Hope that helps. Regards, Aloke On Fri, Oct 11, 2013 at 5:36 PM, shahzad73 shahzad...@yahoo.com wrote: Aloke Ghoshal, I'm trying to work out your equation. I am using the standard schema provided by Nutch for Solr and am not aware of how to calculate myfieldwordcount in the first query; no idea where this count will come from. Is there any filter that will store the number of tokens generated for a specific field and store it as another field? That way we could use it. Also not sure what norm does in the second equation; I tried to find information on this online and did not find any yet. Please explain. Shahzad -- View this message in context: http://lucene.472066.n3.nabble.com/Find-documents-that-are-composed-of-words-tp4094264p4094955.html Sent from the Solr - User mailing list archive at Nabble.com.
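Spelling out Aloke's arithmetic (this assumes the Lucene 3.x default similarity referenced above, with the default field boost of 1; the stored norm is lossily encoded, hence the approximations):

```latex
\mathrm{norm}(f) \;=\; \mathrm{boost}(f)\times\mathrm{lengthNorm}(f)
\;\approx\; 1 \times \frac{1}{\sqrt{\mathrm{numTerms}(f)}}
```

So multiplying the norm into the score twice scales the score by norm^2, which is approximately 1/numTerms - i.e. the division by the field's word count that the thread is trying to achieve.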
Re: Concurent indexing
Here's another jstack http://pastebin.com/8JiQc3rb On 16 October 2013 11:53, Chris Geeringh geeri...@gmail.com wrote: Hi Erick, here is a paste from the other thread (debugging update request) with my input, as I am seeing errors too: I ran an import last night, and this morning my cloud wouldn't accept updates. I'm running the latest 4.6 snapshot. I was importing with the latest SolrJ snapshot, using javabin transport with CloudSolrServer. The cluster had indexed ~1.3 million docs before no further updates were accepted; querying was still working. I'll run jstack shortly and provide the results. Here is my jstack output... lots of blocked threads. http://pastebin.com/1ktjBYbf On 16 October 2013 11:46, Erick Erickson erickerick...@gmail.com wrote: Run jstack on the Solr process (standard with Java) and look for the word "semaphore". You should see your servers blocked on this in the Solr code. That'll pretty much nail it. There's an open JIRA to fix the underlying cause, SOLR-5232, but that's currently slated for 4.6, which won't be cut for a while. Also, there's a patch that will fix this as a side effect, assuming you're using SolrJ: SOLR-4816, which is available in 4.5. Best, Erick On Tue, Oct 15, 2013 at 1:33 PM, michael.boom my_sky...@yahoo.com wrote: Here's some of Solr's last words (log content before it stopped accepting updates); maybe someone can help me interpret that. http://pastebin.com/mv7fH62H -- View this message in context: http://lucene.472066.n3.nabble.com/Concurent-indexing-tp4095409p4095642.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Switching indexes
Shawn, It all makes sense; I'm just dealing with production servers here, so I'm trying to be very careful (shutting down one node at a time is OK, I just don't want to do something catastrophic). OK, so I should use that aliasing feature. On index1 I have: core1, core1new, core2. On index2 and index3 I have: core1, core2. If I do the alias command on index1 and have core1 alias core1new: 1) Will that then get rid of the existing core1 and have the core1new data be used for queries? 2) Will that change make the core1 instances on index2 and index3 update to have the core1new data? Thanks again! -- Chris On Tue, Oct 15, 2013 at 7:30 PM, Shawn Heisey s...@elyograg.org wrote: On 10/15/2013 2:17 PM, Christopher Gross wrote: I have 3 Solr nodes (and 5 ZK nodes). For #1, would I have to do that on all of them? For #2, I'm not getting the auto-replication between node 1 and nodes 2 & 3 for my new index. I have 2 indexes -- just call them index and indexbk (bk being the backup containing the full data set) -- up and running on one node. If I were to do a swap (via the Core Admin page), would that push the changes for indexbk over to the other two nodes? Would I need to do that switch on the leader, or could that be done on one of the other nodes? For #1, I don't know how you want to handle your sharding and/or replication. I would assume that you probably have numShards=1 and replicationFactor=3, but I could be wrong. At any rate, where the collection lives is an implementation detail that's up to you. SolrCloud keeps track of all your collections, whether they are on one server or all servers. Typically you can send requests (queries, API calls, etc.) that deal with entire collections to any node in your cluster and they will be handled correctly. If you need to deal with a specific core, that call needs to go to the correct node.
For #2, when you create a core and want it to be a replica of something that already exists, you need to give it a name that's not in use on your cluster, such as index2_shard1_replica3. You also tell it what collection it's part of, which for my example would be index2. Then you tell it what shard it will contain - shard1, shard2, etc. Here's an example of a CREATE call: http://server:port/solr/admin/cores?action=CREATE&name=index2_shard1_replica3&collection=index2&shard=shard1 For the rest of your message: core swapping and SolrCloud do NOT get along. If you are using SolrCloud, CoreAdmin features like that need to disappear from your toolset. Attempting a core swap will make bad things (tm) happen. Collection aliasing is the way in SolrCloud to do what used to be done with swapping. You have collections named index1, index2, index3, etc., and you keep an alias called just "index" that points to one of those collections, so that you don't have to change your application - you just repoint the alias, and all the application queries going to "index" will go to the correct place. I hope I haven't made things more confusing for you! Thanks, Shawn
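The aliasing Shawn describes is a single Collections API call; a sketch, assuming the alias is to be named "index" and should point at the collection index2 (CREATEALIAS with the name and collections parameters is the standard form, host and port as in Shawn's CREATE example):

http://server:port/solr/admin/collections?action=CREATEALIAS&name=index&collections=index2

Repointing later is the same call with a different collections value; application queries against "index" then hit the new collection with no client-side change.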
Re: Regarding Solr Cloud issue...
Oh great, thanks Primoz. Is there any simple way to do the upgrade to 4.5 without having to change my configurations? Update a few jar files, etc.? On Wed, Oct 16, 2013 at 4:58 PM, primoz.sk...@policija.si wrote: Also, another issue that needs to be raised is the creation of cores from the core admin section of the GUI. It doesn't really work well: it creates files, but then they do not work (again, I am using 4.4). From my experience the core admin section of the GUI does not work well in the SolrCloud domain. If I am not mistaken this was somehow fixed in 4.5.0, which behaves much better. I would use only HTTP requests (cores and collections API) with SolrCloud, and would use the GUI only for viewing the state of the cluster and cores. Primoz
Error when i want to create a CORE
I installed Solr 4.5 on Windows and launch the example with the Jetty web server. I have no problem with the collection1 core. But when I want to create my own core, the server sends me this error: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not load config file C:\Documents and Settings\r.lucas\Bureau\Moteur\solr-4.5.0\example\solr\index1\solrconfig.xml Could you help, please? -- View this message in context: http://lucene.472066.n3.nabble.com/Error-when-i-want-to-create-a-CORE-tp4095894.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Regarding Solr Cloud issue...
Hm, good question. I haven't really done any upgrading yet, because I just reinstall and reindex everything. I would replace jars with the new ones (if needed - check release notes for version 4.4.0 and 4.5.0 where all the versions of external tools [tika, maven, etc.] are stated) and deploy the updated WAR file to servlet container. Primoz From: Chris christu...@gmail.com To: solr-user solr-user@lucene.apache.org Date: 16.10.2013 14:30 Subject:Re: Regarding Solr Cloud issue... oh great. Thanks Primoz. is there any simple way to do the upgrade to 4.5 without having to change my configurations? update a few jar files etc? On Wed, Oct 16, 2013 at 4:58 PM, primoz.sk...@policija.si wrote: Also, another issue that needs to be raised is the creation of cores from the core admin section of the gui, doesnt really work well, it creates files but then they do not work (again i am using 4.4) From my experience core admin section of the GUI does not work well in SolrCloud domain. If I am not mistaken this was somehow fixed in 4.5.0 which acts much better. I would use only HTTP requests (cores and collections API) with SolrCloud and would use GUI only for viewing the state of cluster and cores. Primoz
Re: Error when i want to create a CORE
Can you try with a directory path that contains *no* spaces. Primoz From: raige regis...@gmail.com To: solr-user@lucene.apache.org Date: 16.10.2013 14:46 Subject:Error when i want to create a CORE I install the version solr 4.5 on windows. I launch with Jetty web server the example. I have no problem with collection 1 core. But, when i want to create my core, the server send me this error : * org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not load config file C:\Documents and Settings\r.lucas\Bureau\Moteur\solr-4.5.0\example\solr\index1\solrconfig.xml* could you help please -- View this message in context: http://lucene.472066.n3.nabble.com/Error-when-i-want-to-create-a-CORE-tp4095894.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Boosting a field with defType:dismax -- No results at all
Get rid of the newlines before and after the value of the qf parameter. -- Jack Krupansky -Original Message- From: uwe72 Sent: Wednesday, October 16, 2013 5:36 AM To: solr-user@lucene.apache.org Subject: Boosting a field with defType:dismax -- No results at all Hi there, I want to boost a field, see below. If I add defType:dismax I don't get any results at all anymore. What am I doing wrong? Regards, Uwe

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="omitHeader">true</str>
    <str name="df">text</str>
    <str name="q.op">AND</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.count">1</str>
    <str name="spellcheck.maxResultsForSuggest">100</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">true</str>
    <str name="spellcheck.maxCollations">1</str>
    <str name="defType">dismax</str>
    <str name="qf">
      SignalImpl.baureihe^1011 text^0.1
    </str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>

-- View this message in context: http://lucene.472066.n3.nabble.com/Boosting-a-field-with-defType-dismax-No-results-at-all-tp4095850.html Sent from the Solr - User mailing list archive at Nabble.com.
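In other words, per Jack's diagnosis the only change needed is to collapse the qf value onto a single line, with the field list unchanged from the posted handler:

```xml
<str name="qf">SignalImpl.baureihe^1011 text^0.1</str>
```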
Re: Error when i want to create a CORE
Assuming that you are using the Admin UI: the instanceDir must already exist (in your case index1). Inside it there should be a conf/ directory holding the configuration files. In the config field, insert only the file name (like solrconfig.xml), which should be found in the conf/ directory. -- View this message in context: http://lucene.472066.n3.nabble.com/Error-when-i-want-to-create-a-CORE-tp4095894p4095900.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Copy field append values ?
Appended. -- Jack Krupansky -Original Message- From: vishgupt Sent: Wednesday, October 16, 2013 6:25 AM To: solr-user@lucene.apache.org Subject: Solr Copy field append values ? Hi, My schema is like this; external_id is a multivalued field: <copyField source="upc" dest="external_id" /> I want to know whether the values of upc will be appended to the existing values of external_id or override them. For example, if I send a document having the values upc:131 and external_id:423 for indexing in Solr with the above-mentioned schema, what will the value of the external_id field be - 131, or 131,423? Thanks, Vishal -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Copy-field-append-values-tp4095862.html Sent from the Solr - User mailing list archive at Nabble.com.
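For the append behavior Jack confirms, the destination must be declared multiValued; a schema fragment roughly matching the question (field types here are illustrative, not from the original post):

```xml
<field name="upc" type="string" indexed="true" stored="true"/>
<field name="external_id" type="string" indexed="true" stored="true" multiValued="true"/>
<copyField source="upc" dest="external_id"/>
```

With this schema, a document sent with upc:131 and external_id:423 ends up with external_id holding both 423 and 131: copyField adds a value, it never overwrites. If external_id were single-valued, indexing would instead fail with a multiple-values error.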
AW: Boosting a field with defType:dismax -- No results at all
Perfect!!! THANKS A LOT. That was the mistake. From: Jack Krupansky-2 [via Lucene] [mailto:ml-node+s472066n409590...@n3.nabble.com] Sent: Wednesday, 16 October 2013 14:55 To: uwe72 Subject: Re: Boosting a field with defType:dismax -- No results at all Get rid of the newlines before and after the value of the qf parameter. -- Jack Krupansky -Original Message- From: uwe72 Sent: Wednesday, October 16, 2013 5:36 AM To: [hidden email] Subject: Boosting a field with defType:dismax -- No results at all Hi there, I want to boost a field, see below. If I add defType:dismax I don't get any results at all anymore. What am I doing wrong? Regards, Uwe

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="omitHeader">true</str>
    <str name="df">text</str>
    <str name="q.op">AND</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.count">1</str>
    <str name="spellcheck.maxResultsForSuggest">100</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">true</str>
    <str name="spellcheck.maxCollations">1</str>
    <str name="defType">dismax</str>
    <str name="qf">
      SignalImpl.baureihe^1011 text^0.1
    </str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>

-- View this message in context: http://lucene.472066.n3.nabble.com/Boosting-a-field-with-defType-dismax-No-results-at-all-tp4095850p4095906.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Regarding Solr Cloud issue...
Very well, I will try the same. Maybe an auto-update tool should also be put in the pipeline... just a thought. On Wed, Oct 16, 2013 at 6:20 PM, primoz.sk...@policija.si wrote: Hm, good question. I haven't really done any upgrading yet, because I just reinstall and reindex everything. I would replace the jars with the new ones (if needed - check the release notes for versions 4.4.0 and 4.5.0, where all the versions of external tools [tika, maven, etc.] are stated) and deploy the updated WAR file to the servlet container. Primoz From: Chris christu...@gmail.com To: solr-user solr-user@lucene.apache.org Date: 16.10.2013 14:30 Subject: Re: Regarding Solr Cloud issue... Oh great, thanks Primoz. Is there any simple way to do the upgrade to 4.5 without having to change my configurations? Update a few jar files, etc.? On Wed, Oct 16, 2013 at 4:58 PM, primoz.sk...@policija.si wrote: Also, another issue that needs to be raised is the creation of cores from the core admin section of the GUI. It doesn't really work well: it creates files, but then they do not work (again, I am using 4.4). From my experience the core admin section of the GUI does not work well in the SolrCloud domain. If I am not mistaken this was somehow fixed in 4.5.0, which behaves much better. I would use only HTTP requests (cores and collections API) with SolrCloud, and would use the GUI only for viewing the state of the cluster and cores. Primoz
Re: SolrCloud Query Balancing
Thanks! Could you provide some examples or details of the configuration you use? I think this solution would suit me also. -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-Balancing-tp4095854p4095910.html Sent from the Solr - User mailing list archive at Nabble.com.
How to retrieve the query for a boolean keyword?
Dear all, I am using SolrJ as a client for indexing and searching documents on the Solr server. My question: how can I find out which query clause matched for a boolean keyword? For example, I have this query: text:("vacuna" AND "esteve news") OR text:("vacuna") OR text:("esteve news") and I am searching in: text -- Esteve news: Obtener una vacuna para frenar el... Solr returns: <em>Esteve news</em>: obtener una <em>vacuna</em> para frenar el ... That is OK. My question is: can I know from Solr that the highlighted results <em>Esteve news</em> and <em>vacuna</em> were produced by the query clause with the AND operator? Is it possible to retrieve this with SolrJ? Thanks a lot in advance, Sil, *Tecnologías y SaaS para el análisis de marcas comerciales.*
Re: SolrCloud Query Balancing
On 10/16/2013 3:52 AM, michael.boom wrote: I have setup a SolrCloud system with: 3 shards, replicationFactor=3 on 3 machines along with 3 Zookeeper instances. My web application makes queries to Solr specifying the hostname of one of the machines. So that machine will always get the request and the other ones will just serve as an aid. So I would like to setup a load balancer that would fix that, balancing the queries to all machines. Maybe doing the same while indexing. SolrCloud actually handles load balancing for you. You'll find that when you send requests to one server, they are actually being re-directed across the entire cloud, unless you include a distrib=false parameter on the request, but that would also limit the search to one shard, which is probably not what you want. The only thing that you don't get with a non-Java client is redundancy. If you can't build in failover capability yourself, which is a very advanced programming technique, then you need a load balancer. For my large non-Cloud Solr install, I use haproxy as a load balancer. Most of the time, it doesn't actually balance the load, just makes sure that Solr is always reachable even if part of it goes down. The haproxy program is simple and easy to use, but performs extremely well. I've got a pacemaker cluster making sure that the shared IP address, haproxy, and other homegrown utility applications related to Solr are only running on one machine. Thanks, Shawn
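A minimal haproxy sketch of the setup Shawn describes (the IPs, names, and timeouts here are made-up placeholders; the health check assumes the stock /solr/admin/ping PingRequestHandler is enabled on each node):

```
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend solr_front
    bind *:8983
    default_backend solr_nodes

backend solr_nodes
    balance roundrobin
    option httpchk GET /solr/admin/ping
    server solr1 10.0.0.1:8983 check
    server solr2 10.0.0.2:8983 check
    server solr3 10.0.0.3:8983 check
```

With the check option, haproxy stops routing to a node whose ping handler fails, which gives the redundancy discussed above even when the client library cannot fail over on its own.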
Re: SolrCloud Query Balancing
I did not actually realize this; I apologize for my previous reply! Haproxy would definitely be the right choice then for the poster's setup, for redundancy. On 16/10/2013, at 15.53, Shawn Heisey s...@elyograg.org wrote: On 10/16/2013 3:52 AM, michael.boom wrote: I have setup a SolrCloud system with: 3 shards, replicationFactor=3 on 3 machines along with 3 Zookeeper instances. My web application makes queries to Solr specifying the hostname of one of the machines. So that machine will always get the request and the other ones will just serve as an aid. So I would like to setup a load balancer that would fix that, balancing the queries to all machines. Maybe doing the same while indexing. SolrCloud actually handles load balancing for you. You'll find that when you send requests to one server, they are actually being re-directed across the entire cloud, unless you include a distrib=false parameter on the request, but that would also limit the search to one shard, which is probably not what you want. The only thing that you don't get with a non-Java client is redundancy. If you can't build in failover capability yourself, which is a very advanced programming technique, then you need a load balancer. For my large non-Cloud Solr install, I use haproxy as a load balancer. Most of the time, it doesn't actually balance the load, just makes sure that Solr is always reachable even if part of it goes down. The haproxy program is simple and easy to use, but performs extremely well. I've got a pacemaker cluster making sure that the shared IP address, haproxy, and other homegrown utility applications related to Solr are only running on one machine. Thanks, Shawn
howto increase indexing speed?
I have a small Solr setup - not even on a physical machine, but a VMware virtual machine with a single CPU - that reads data from a database using DIH. The machine has no physical disks attached but stores data on a NetApp NAS. Currently this machine indexes 320 documents/sec; not bad, but we plan to double the index and we would like to keep nearly the same rate. Doing some basic checks during indexing, I found with iostat that disk usage is nearly 8% and the source database is running fine; instead, the virtual CPU is 95% busy running Solr. Now I could quite easily add another virtual CPU to the Solr box, but as far as I know this won't help, because DIH doesn't work in parallel. Am I wrong? What would you do? Rewrite the feeding process, dropping DIH and using SolrJ to feed data in parallel? Or would you keep DIH and switch to a sharded configuration? Thank you for any hints, Giovanni
Re: howto increase indexing speed?
I think DIH uses only one CPU core per instance. IMHO 300 docs/sec is quite good. If you would like to use more CPU cores you need to use SolrJ - or maybe more than one DIH instance and more Solr cores, of course. Primoz From: Giovanni Bricconi giovanni.bricc...@banzai.it To: solr-user solr-user@lucene.apache.org Date: 16.10.2013 16:25 Subject: howto increase indexing speed? I have a small Solr setup - not even on a physical machine, but a VMware virtual machine with a single CPU - that reads data from a database using DIH. The machine has no physical disks attached but stores data on a NetApp NAS. Currently this machine indexes 320 documents/sec; not bad, but we plan to double the index and we would like to keep nearly the same rate. Doing some basic checks during indexing, I found with iostat that disk usage is nearly 8% and the source database is running fine; instead, the virtual CPU is 95% busy running Solr. Now I could quite easily add another virtual CPU to the Solr box, but as far as I know this won't help, because DIH doesn't work in parallel. Am I wrong? What would you do? Rewrite the feeding process, dropping DIH and using SolrJ to feed data in parallel? Or would you keep DIH and switch to a sharded configuration? Thank you for any hints, Giovanni
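The "use SolrJ with more cores" suggestion boils down to feeding batches from several threads. A self-contained skeleton of that threading (sendBatch is a stand-in for a real SolrJ call such as server.add(batchOfDocs) against a ConcurrentUpdateSolrServer; here it only counts documents so the sketch runs without a Solr instance):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ParallelFeeder {

    static final AtomicInteger indexed = new AtomicInteger();

    // Stand-in for a SolrJ call such as server.add(batchOfDocs);
    // it only counts docs so the skeleton is runnable on its own.
    static void sendBatch(int batchSize) {
        indexed.addAndGet(batchSize);
    }

    // Feed totalDocs in batches of batchSize using the given number of
    // writer threads; returns how many documents were "indexed".
    static int feed(int totalDocs, int batchSize, int threads) {
        indexed.set(0);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int sent = 0; sent < totalDocs; sent += batchSize) {
            final int n = Math.min(batchSize, totalDocs - sent);
            pool.submit(() -> sendBatch(n));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return indexed.get();
    }

    public static void main(String[] args) {
        System.out.println(feed(10_000, 500, 4)); // prints 10000
    }
}
```

With real SolrJ the commit would be issued once at the end (or handled by autoCommit), and the thread count tuned to the number of CPU cores on the indexing box.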
Re: howto increase indexing speed?
You might consider local disks. I once ran Solr with the indexes on an NFS-mounted volume and the slowdown was severe. wunder On Oct 16, 2013, at 7:40 AM, primoz.sk...@policija.si wrote: I think DIH uses only one core per instance. IMHO 300 doc/sec is quite good. If you would like to use more cores you need to use solrj. Or maybe more than one DIH and more cores of course. Primoz From: Giovanni Bricconi giovanni.bricc...@banzai.it To: solr-user solr-user@lucene.apache.org Date: 16.10.2013 16:25 Subject:howto increase indexing speed? I have a small solr setup, not even on a physical machine but a vmware virtual machine with a single cpu that reads data using DIH from a database. The machine has no phisical disks attached but stores data on a netapp nas. Currently this machine indexes 320 documents/sec, not bad but we plan to double the index and we would like to keep nearly the same. Doing some basic checks during the indexing I have found with iostat that the usage of the disks is nearly 8% and the source database is running fine, instead the virtual cpu is 95% running on solr. Now I can quite easily add another virtual cpu to the solr box, but as far as I know this won't help because DIH doesn't work in parallel. Am I wrong? What would you do? Rewrite the feeding process quitting dih and using solrj to feed data in parallel? Would you instead keep DIH and switch to a sharded configuration? Thank you for any hints Giovanni -- Walter Underwood wun...@wunderwood.org
Re: prepareCommit vs Commit
Thanks Shalin. Will post it there too. - Phani Chaitanya -- View this message in context: http://lucene.472066.n3.nabble.com/prepareCommit-vs-Commit-tp4095545p4095916.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID
The only delete I see in the master logs is: INFO - 2013-10-11 14:06:54.793; org.apache.solr.update.processor.LogUpdateProcessor; [annotation] webapp=/solr path=/update params={} {delete=[change.me(-1448623278425899008)]} 0 60 When you commit, we have the following: INFO - 2013-10-11 14:07:03.809; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false} INFO - 2013-10-11 14:07:03.813; org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. Skipping IW.commit. That suggests that the id you are trying to delete never existed in the first place and hence there was nothing to commit. Hence replication was not triggered. Am I missing something? On Wed, Oct 16, 2013 at 5:06 PM, Akkinepalli, Bharat (ELS-CON) b.akkinepa...@elsevier.com wrote: Hi Otis, Did you get a chance to look into the logs. Please let me know if you need more information. Thank you. Regards, Bharat Akkinepalli -Original Message- From: Akkinepalli, Bharat (ELS-CON) [mailto:b.akkinepa...@elsevier.com] Sent: Friday, October 11, 2013 2:16 PM To: solr-user@lucene.apache.org Subject: RE: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID Hi Otis, Thanks for the response. The log files can be found here. MasterLog : http://pastebin.com/DPLKMPcF Slave Log: http://pastebin.com/DX9sV6Jx One more point worth mentioning here is that when we issue the commit with expungeDeletes=true, then the delete by id replication is successful. i.e. 
http://localhost:8983/solr/annotation/update?commit=true&expungeDeletes=true Regards, Bharat Akkinepalli -Original Message- From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com] Sent: Wednesday, October 09, 2013 6:35 PM To: solr-user@lucene.apache.org Subject: Re: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID Bharat, Can you look at the logs on the Master when you issue the delete and the subsequent commits and share that? Otis -- Solr & ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm On Tue, Oct 8, 2013 at 3:57 PM, Akkinepalli, Bharat (ELS-CON) b.akkinepa...@elsevier.com wrote: Hi, We have recently migrated from Solr 3.6 to Solr 4.4. We are using the Master/Slave configuration in Solr 4.4 (not Solr Cloud). We have noticed the following behavior/defect. Configuration: === 1. The Hard Commit and Soft Commit are disabled in the configuration (we control the commits from the application) 2. We have 1 Master and 2 Slaves configured and the pollInterval is configured to 10 Minutes. 3. The Master is configured to have the replicateAfter as commit and startup Steps to reproduce the problem: == 1. Delete a document in Solr (using delete by id). URL - http://localhost:8983/solr/annotation/update with body as <delete><id>change.me</id></delete> 2. Issue a commit in Master (http://localhost:8983/solr/annotation/update?commit=true). 3. The replication of the DELETE WILL NOT happen. The master and slave have the same index version. 4. If we try to issue another commit in Master, we see that it replicates fine. Request you to please confirm if this is a known issue. Thank you. Regards, Bharat Akkinepalli -- Regards, Shalin Shekhar Mangar.
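The requests exchanged in this thread are simple enough to build programmatically. A small sketch (helper names are illustrative) of the delete-by-id XML body and the commit URL, including the `expungeDeletes=true` variant that made the delete replicate:

```python
from urllib.parse import urlencode
from xml.sax.saxutils import escape

# Core update URL used throughout this thread.
BASE = "http://localhost:8983/solr/annotation/update"

def delete_by_id_body(doc_id):
    """XML body posted to /update to delete one document by its uniqueKey."""
    return "<delete><id>%s</id></delete>" % escape(doc_id)

def commit_url(expunge_deletes=False):
    """Commit request; expungeDeletes=true was the workaround reported above."""
    params = {"commit": "true"}
    if expunge_deletes:
        params["expungeDeletes"] = "true"
    return BASE + "?" + urlencode(params)
```

POSTing `delete_by_id_body("change.me")` and then GETting `commit_url()` reproduces the sequence described in the report.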
Re: AW: Boosting a field with defType:dismax -- No results at all
We have just one more problem: when we search explicitly, like *:* or partNumber:A32783627, we still don't get any results. What are we doing wrong here? -- View this message in context: http://lucene.472066.n3.nabble.com/Boosting-a-field-with-defType-dismax-No-results-at-all-tp4095850p4095918.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Switching indexes
Garth, I think I get what you're saying, but I want to make sure. I have 3 servers (index1, index2, index3), with Solr living on port 8080. Each of those has 3 cores loaded with data: core1 (old version) core1new (new version) core2 (unrelated to core1) If I wanted to make it so that queries to core1 are really going to core1new, I'd run: http://index1:8080/solr/admin/cores?action=CREATEALIAS&name=core1&collections=core1new&shard=shard1 Correct? -- Chris On Wed, Oct 16, 2013 at 9:02 AM, Garth Grimm garthgr...@averyranchconsulting.com wrote: The alias applies to the entire cloud, not a single core. So you'd have your indexing application point to a collection alias named 'index'. And that alias would point to core1. You'd have your query applications point to a collection alias named 'query', and that would point to core1, as well. Then use the Collection API to create core1new across the entire cloud. Then update the 'index' alias to point to core1new. Feed documents in, run warm-up scripts, run smoke tests, etc., etc. When you're ready, point the 'query' alias to core1new. You're now running completely on core1new, and can use the Collection API to delete core1 from the cloud. Or keep it around as a backup to which you can restore simply by changing 'query' alias. -Original Message- From: Christopher Gross [mailto:cogr...@gmail.com] Sent: Wednesday, October 16, 2013 7:05 AM To: solr-user Subject: Re: Switching indexes Shawn, It all makes sense, I'm just dealing with production servers here so I'm trying to be very careful (shutting down one node at a time is OK, just don't want to do something catastrophic.) OK, so I should use that aliasing feature. On index1 I have: core1 core1new core2 On index2 and index3 I have: core1 core2 If I do the alias command on index1 and have core1 alias core1new: 1) Will that then get rid of the existing core1 and have core1new data be used for queries? 
2) Will that change make core1 instances on index2 and index3 update to have core1new data? Thanks again! -- Chris On Tue, Oct 15, 2013 at 7:30 PM, Shawn Heisey s...@elyograg.org wrote: On 10/15/2013 2:17 PM, Christopher Gross wrote: I have 3 Solr nodes (and 5 ZK nodes). For #1, would I have to do that on all of them? For #2, I'm not getting the auto-replication between node 1 and nodes 2 & 3 for my new index. I have 2 indexes -- just call them index and indexbk (bk being the backup containing the full data set) up and running on one node. If I were to do a swap (via the Core Admin page), would that push the changes for indexbk over to the other two nodes? Would I need to do that switch on the leader, or could that be done on one of the other nodes? For #1, I don't know how you want to handle your sharding and/or replication. I would assume that you probably have numShards=1 and replicationFactor=3, but I could be wrong. At any rate, where the collection lives is an implementation detail that's up to you. SolrCloud keeps track of all your collections, whether they are on one server or all servers. Typically you can send requests (queries, API calls, etc) that deal with entire collections to any node in your cluster and they will be handled correctly. If you need to deal with a specific core, that call needs to go to the correct node. For #2, when you create a core and want it to be a replica of something that already exists, you need to give it a name that's not in use on your cluster, such as index2_shard1_replica3. You also tell it what collection it's part of, which for my example, would probably be index2. Then you tell it what shard it will contain. That will be shard1, shard2, etc. Here's an example of a CREATE call: http://server:port/solr/admin/cores?action=CREATE&name=index2_shard1_replica3&collection=index2&shard=shard1 For the rest of your message: Core swapping and SolrCloud do NOT get along. 
If you are using SolrCloud, CoreAdmin features like that need to disappear from your toolset. Attempting a core swap will make bad things (tm) happen. Collection aliasing is the way in SolrCloud that you can now do what used to be done with swapping. You have collections named index1, index2, index3, etc ... and you keep an alias called just index that points to one of those other collections, so that you don't have to change your application - you just repoint the alias and all the application queries going to index will go to the correct place. I hope I haven't made things more confusing for you! Thanks, Shawn
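The alias-repointing approach described above comes down to one Collections API call. A small helper (illustrative only, not a real client API) that builds it with proper URL encoding; the `&` separators between parameters are easy to lose when these URLs get pasted into mail:

```python
from urllib.parse import urlencode

def createalias_url(base, name, *collections):
    """Build a Collections API CREATEALIAS call: point alias `name`
    at one or more existing collections."""
    params = {"action": "CREATEALIAS",
              "name": name,
              "collections": ",".join(collections)}
    return base + "/solr/admin/collections?" + urlencode(params)
```

Repointing the alias is then just issuing the same call again with a different target collection; applications that query the alias never need to change.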
RE: Switching indexes
I'd suggest using the Collections API: http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=alias&collections=collection1,collection2... See the Collections Aliases section of http://wiki.apache.org/solr/SolrCloud. BTW, once you make the aliases, Zookeeper will have entries in /aliases.json that will tell you what aliases are defined and what they point to. -Original Message- From: Christopher Gross [mailto:cogr...@gmail.com] Sent: Wednesday, October 16, 2013 10:44 AM To: solr-user Subject: Re: Switching indexes [quoted messages trimmed]
RE: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID
Hi Shalin, I am not sure why the log says No uncommitted changes. The data is available in Solr at the time I perform the delete. Please find below the steps I performed: Inserted a document in master (with id=change.me.1). Issued a commit on master. Triggered replication on the slave. Ensured that the document replicated successfully. Issued a delete by ID. Issued a commit on master. Replication did NOT happen. The logs are as follows: Master - http://pastebin.com/265CtCEp Slave - http://pastebin.com/Qx0xLwmK Regards, Bharat Akkinepalli. -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Wednesday, October 16, 2013 11:28 AM To: solr-user@lucene.apache.org Subject: Re: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID [quoted messages trimmed]
Re: Regarding Solr Cloud issue...
On 10/16/2013 4:51 AM, Chris wrote: Also, is there any easy way upgrading to 4.5 without having to change most of my plugins configuration files? Upgrading is something that should be done carefully. If you can, it's always recommended that you try it out on dev hardware with your real index data beforehand, so you can deal with any problems that arise without causing problems for your production cluster. Upgrading SolrCloud is particularly tricky, because for a while you will be running different versions on different machines in your cluster. If you're using your own custom software to go with Solr, or you're using third-party plugins that aren't included in the Solr download, upgrading might take more effort than usual. Also, if you are doing anything in your config/schema that changes the format of the Lucene index, you may find that it can't be upgraded without completely rebuilding the index. Examples of this are changing the postings format or docValues format. This is a very nasty complication with SolrCloud, because those configurations affect the entire cluster. In that case, the whole index may need to be rebuilt without custom formats before upgrading is attempted. If you don't have any of the complications mentioned in the preceding paragraph, upgrading is usually a very simple process: *) Shut down Solr. *) Delete the extracted WAR file directory. *) Replace solr.war with the new war from dist/ in the download. **) Usually it must actually be named solr.war, which means renaming it. *) Delete and replace other jars copied from the download. *) Change luceneMatchVersion in all solrconfig.xml files. ** *) Start Solr back up. ** With SolrCloud, you can't actually change the luceneMatchVersion until all of your servers have been upgraded. A full reindex is strongly recommended. With SolrCloud, it normally needs to wait until all servers are upgraded. In situations where it won't work at all without a reindex, upgrading SolrCloud can be very challenging. 
It's strongly recommended that you look over CHANGES.txt and compare the new example config/schema with the example from the old version, to see if there are any changes that you might want to incorporate into your own config. As with luceneMatchVersion, if you're running SolrCloud, those changes might need to wait until you're fully upgraded. Side note: When upgrading to a new minor version, config changes aren't normally required. They will usually be required when upgrading major versions, such as 3.x to 4.x. If you *do* have custom plugins that aren't included in the Solr download, you may have to recompile them for the new version, or wait for the vendor to create a new version before you upgrade. This is only the tip of the iceberg, but a lot of the rest of it depends greatly on your configurations. Thanks, Shawn
AW: Boosting a field with defType:dismax -- No results at all
We have just one more problem: when we search explicitly, like *:* or partNumber:A32783627, we still don't get any results. What are we doing wrong here? -- View this message in context: http://lucene.472066.n3.nabble.com/Boosting-a-field-with-defType-dismax-No-results-at-all-tp4095850p4095927.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: SolrCloud Query Balancing
On 10/16/2013 8:01 AM, Henrik Ossipoff Hansen wrote: I did not actually realize this, I apologize for my previous reply! Haproxy would definitely be the right choice then for the posters setup for redundancy. Any load balancer software, or even an appliance load balancer like those made by F5, would probably work. I don't think there's anything wrong with nginx. I've never used it, but I've heard it mentioned often in a load balancer context, so it's probably great software. The original poster should use whatever they are comfortable with, and if they have no experience with any particular solution, they can ask advice from people who have used one or more of the possibilities. Never be afraid to offer advice. I've been wrong plenty of times in what I've posted on this list, and I've learned a TON because of it. Thanks, Shawn
Re: Switching indexes
On 10/16/2013 9:44 AM, Christopher Gross wrote: Garth, I think I get what you're saying, but I want to make sure. I have 3 servers (index1, index2, index3), with Solr living on port 8080. Each of those has 3 cores loaded with data: core1 (old version) core1new (new version) core2 (unrelated to core1) If I wanted to make it so that queries to core1 are really going to core1new, I'd run: http://index1:8080/solr/admin/cores?action=CREATEALIAS&name=core1&collections=core1new&shard=shard1 Alias is a *Collections* API concept, not a CoreAdmin API concept. One question is this: Do you have a *collection* named core1, or just a *core* named core1? I'm pretty sure that it's possible on a SolrCloud system to have cores that are not participating in the cloud infrastructure. Collections are made up of shards. Shards have replicas. Each replica is a core. I'd like to see whether you have configurations loaded into zookeeper. In the admin UI, click on Cloud, then Tree. Click the arrow to the left of /configs to open it. If you see folders underneath /configs, then you do have at least one configuration in zookeeper, and you will have the name(s) they are using. You can also click the arrow next to /collections and see whether you have any collections. The Cloud->Graph page shows you a visual representation of your cloud. Let us know what you find. If you have anything there, I can give you some API URL calls that will hopefully fully illustrate what I'm saying. Thanks, Shawn
Re: AW: Boosting a field with defType:dismax -- No results at all
dismax doesn't support wildcard, fuzzy, or fielded terms. edismax does. My e-book details the differences between the query parsers. -- Jack Krupansky -Original Message- From: uwe72 Sent: Wednesday, October 16, 2013 12:26 PM To: solr-user@lucene.apache.org Subject: AW: Boosting a field with defType:dismax -- No results at all [quoted message trimmed]
Re: How to retrieve the query for a boolean keyword?
I believe it is not possible. But you can easily split this into two query statements. First one: text:(“vacuna” AND “esteve news”) and the second: (text:(“vacuna”) OR text:(“esteve news”)) AND -text:(“vacuna” AND “esteve news”) The minus (-) excludes all entries of the first statement. This is important to ensure that you don't get entries twice. So the first will contain all entries with both words, and the second query all remaining entries that contain exactly ONE of those words. I hope this helps. On 16.10.2013 15:49, Silvia Suárez wrote: Dear all, I am using solrj as client for indexing and searching documents on the solr server My question: How to retrieve the query for a boolean keyword? For example: I have this query: text:(“vacuna” AND “esteve news”) OR text:(“vacuna”) OR text:(“esteve news”) And searching in: text -- Esteve news: Obtener una vacuna para frenar el... Solr returns: <em>Esteve news</em>: obtener una <em>vacuna</em> para frenar el ... It is ok. My question is: Can I know with solr that the results <em>Esteve news</em> <em>vacuna</em> are provided by the query with the AND operator? Is it possible to retrieve this with solrj? Thanks a lot in advance, Sil
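The two-statement split suggested above can be generated mechanically. A small sketch (helper name is illustrative) that builds both query strings so the second one excludes everything matched by the first:

```python
def split_phrase_query(field, a, b):
    """Build two mutually exclusive Solr query statements: one matching
    documents containing BOTH phrases, and one matching documents that
    contain exactly ONE of them (the first query is negated in the second)."""
    both = '%s:("%s" AND "%s")' % (field, a, b)
    one_only = '(%s:("%s") OR %s:("%s")) AND -%s' % (field, a, field, b, both)
    return both, one_only
```

Running the two queries separately tells you which hits came from the AND branch and which from a single-phrase branch, which is the information the highlighted response alone does not carry.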
Re: field title_ngram was indexed without position data; cannot run PhraseQuery
Hello, Thank you all for your help. There was indeed a property which was not set right in schema.xml: omitTermFreqAndPositions=true After changing it to false phrase lookup started working OK. Thanks, M On 10/15/13 12:01 PM, Jack Krupansky wrote: Show us the field and field type from your schema. Likely you are omitting position info for the field, and the field type has autoGeneratePhraseQueries=true - the ngram analyzer generates a sequence of terms for a single source term and then the query parser generates a PhraseQuery for that sequence, but that requires position info in the index but you have omitted them. That's one theory. So, if that theory is correct, either retain position info by getting rid of the omit, or remove the autoGeneratePhraseQueries. -- Jack Krupansky -Original Message- From: Jason Hellman Sent: Tuesday, October 15, 2013 11:19 AM To: solr-user@lucene.apache.org Subject: Re: field title_ngram was indexed without position data; cannot run PhraseQuery If you consider what n-grams do this should make sense to you. Consider the following piece of data: White iPod If the field is fed through a bigram filter (n-gram with size of 2) the resulting token stream would appear as such: wh hi it te ip po od The usual use of n-grams is to match those partial tokens, essentially giving you a great deal of power in creating non-wildcard partial matches. How you use this is up to your imagination, but one easy use is in partial matches in autosuggest features. I can't speak for the intent behind the way it's coded, but it makes a great deal of sense to me that positional data would be seen as unnecessary since the intent of n-grams typically doesn't collide with phrase searches. If you need both behaviors it's far better to use copyField and have one field dedicated to standard tokenization and token filters, and another field for n-grams. I hope that's useful to you. 
On Oct 15, 2013, at 6:14 AM, MC videm...@gmail.com wrote: Hello, Could someone explain (or perhaps provide a documentation link) what does the following error mean: field title_ngram was indexed without position data; cannot run PhraseQuery I'll do some more searching online, I was just wondering if anyone has encountered this error before, and what the possible solution might be. I've recently upgraded my version of solr from 3.6.0 to 4.5.0, I'm not sure if this has any bearing or not. Thanks, M
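The bigram token stream in Jason's "White iPod" example can be reproduced with a few lines. A toy analyzer sketch (not Lucene's actual NGramTokenFilter, just the same idea): lowercase, split on whitespace, then emit character n-grams per token, with no grams spanning the token boundary:

```python
def char_ngrams(token, n=2):
    """All character n-grams of a single token."""
    return [token[i:i + n] for i in range(len(token) - n + 1)]

def analyze(text, n=2):
    """Lowercase, whitespace-tokenize, then n-gram each token separately."""
    grams = []
    for tok in text.lower().split():
        grams.extend(char_ngrams(tok, n))
    return grams
```

This makes it easy to see why position data is usually pointless for such a field: the grams are fragments, not words, so a phrase query over them rarely means anything.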
Re: Switching indexes
Ok, so I think I was confusing the terminology (still in a 3.X mindset I guess.) From the Cloud->Tree, I do see that I have collections for what I was calling core1, core2, etc. So, to redo the above, Servers: index1, index2, index3 Collections: (on each) coll1, coll2 Collection (core?) on index1: coll1new Each Collection has 1 shard (too small to make sharding worthwhile). So should I run something like this: http://index1:8080/solr/admin/collections?action=CREATEALIAS&name=coll1&collections=coll1new Or will I need coll1new to be on each of the index1, index2 and index3 instances of Solr? -- Chris On Wed, Oct 16, 2013 at 12:40 PM, Shawn Heisey s...@elyograg.org wrote: [quoted message trimmed]
Re: AW: Boosting a field with defType:dismax -- No results at all
Works like this? <str name="defType">edismax</str> <str name="qf">SignalImpl.baureihe^1011 text^0.1</str> Another option: how about just giving the desired fields a high boost factor while adding the field to the document, using solr? Can this work? -- View this message in context: http://lucene.472066.n3.nabble.com/Boosting-a-field-with-defType-dismax-No-results-at-all-tp4095850p4095938.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Switching indexes
On 10/16/2013 11:51 AM, Christopher Gross wrote: Ok, so I think I was confusing the terminology (still in a 3.X mindset I guess.) From the Cloud-Tree, I do see that I have collections for what I was calling core1, core2, etc. So, to redo the above, Servers: index1, index2, index3 Collections: (on each) coll1, coll2 Collection (core?) on index1: coll1new Each Collection has 1 shard (too small to make sharding worthwhile). So should I run something like this: http://index1:8080/solr/admin/collections?action=CREATEALIASname=coll1collections=col11new Or will I need coll1new to be on each of the index1, index2 and index3 instances of Solr? I don't think you can create an alias if a collection already exists with that name - so having a collection named core1 means you wouldn't want an alias named core1. I could be wrong, but just to keep things clean, I wouldn't recommend it, even if it's possible. That CREATEALIAS command will only work if coll1new shows up in /collections and shows green on the cloud graph. If it does, and you're using an alias name that doesn't already exist as a collection, then you're good. Whether coll1new is living on one server, two servers, or all three servers doesn't matter for CREATEALIAS, or for most other collection-related topics. Any query or update can be sent to any server in the cloud and it will be routed to the correct place according to the clusterstate. Where things live and how many replicas there are *does* matter for a discussion about redundancy. Generally speaking, you're going to want your shards to have at least two replicas, so that if a Solr instance goes down, or is taken down for maintenance, your cloud remains fully operational. In your situation, you probably want three replicas - so each collection lives on all three servers. 
So my general advice: Decide what name you want your application to use, make sure none of your existing collections are using that name, and set up an alias with that name pointing to whichever collection is current. Then change your application configurations or code to point at the alias instead of directly at the collection. When you want to do your reindex, first create a new collection using the collections API. Index to that new collection. When it's ready to go, use CREATEALIAS to update the alias, and your application will start using the new index. Thanks, Shawn
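Shawn's reindex-and-swap workflow boils down to three HTTP calls against the Collections API. The collection names and parameters below are illustrative (and assume the old collection named coll1 has already been removed, per the naming caveat above):

```text
# 1. Create the new collection (one shard, three replicas in this setup):
http://index1:8080/solr/admin/collections?action=CREATE&name=coll1-20131016&numShards=1&replicationFactor=3

# 2. Index into coll1-20131016, then point the alias at it:
http://index1:8080/solr/admin/collections?action=CREATEALIAS&name=coll1&collections=coll1-20131016

# 3. Once queries have moved over, drop the previous dated collection:
http://index1:8080/solr/admin/collections?action=DELETE&name=coll1-20130901
```

On the next reindex, only steps 1-3 repeat with a new date suffix; the application keeps pointing at the coll1 alias the whole time.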
Re: Switching indexes
Thanks Shawn, the explanations help bring me forward to the SolrCloud mentality. So it sounds like going forward that I should have a more complicated name (ex: coll1-20131015) aliased to coll1, to make it easier to switch in the future. Now, if I already have an index (copied from one location to another), it sounds like I should just remove my existing (bad/old data) coll1, create the replicated one (calling it coll1-date), then alias coll1 to that one. This type of information would have been awesome to know before I got started, but I can make do with what I've got going now. Thanks again! -- Chris

On Wed, Oct 16, 2013 at 2:40 PM, Shawn Heisey s...@elyograg.org wrote: On 10/16/2013 11:51 AM, Christopher Gross wrote: Ok, so I think I was confusing the terminology (still in a 3.X mindset I guess.) From the Cloud-Tree, I do see that I have collections for what I was calling core1, core2, etc. So, to redo the above, Servers: index1, index2, index3 Collections: (on each) coll1, coll2 Collection (core?) on index1: coll1new Each Collection has 1 shard (too small to make sharding worthwhile). So should I run something like this: http://index1:8080/solr/admin/collections?action=CREATEALIAS&name=coll1&collections=coll1new Or will I need coll1new to be on each of the index1, index2 and index3 instances of Solr? I don't think you can create an alias if a collection already exists with that name - so having a collection named core1 means you wouldn't want an alias named core1. I could be wrong, but just to keep things clean, I wouldn't recommend it, even if it's possible. That CREATEALIAS command will only work if coll1new shows up in /collections and shows green on the cloud graph. If it does, and you're using an alias name that doesn't already exist as a collection, then you're good. Whether coll1new is living on one server, two servers, or all three servers doesn't matter for CREATEALIAS, or for most other collection-related topics.
Any query or update can be sent to any server in the cloud and it will be routed to the correct place according to the clusterstate. Where things live and how many replicas there are *does* matter for a discussion about redundancy. Generally speaking, you're going to want your shards to have at least two replicas, so that if a Solr instance goes down, or is taken down for maintenance, your cloud remains fully operational. In your situation, you probably want three replicas - so each collection lives on all three servers. So my general advice: Decide what name you want your application to use, make sure none of your existing collections are using that name, and set up an alias with that name pointing to whichever collection is current. Then change your application configurations or code to point at the alias instead of directly at the collection. When you want to do your reindex, first create a new collection using the collections API. Index to that new collection. When it's ready to go, use CREATEALIAS to update the alias, and your application will start using the new index. Thanks, Shawn
Re: Local Solr and Webserver-Solr act differently (AND treated like OR)
On 10/16/2013 4:46 AM, Stavros Delisavas wrote: My local solr gives me: http://pastebin.com/Q6d9dFmZ and my webserver this: http://pastebin.com/q87WEjVA I copied only the first few hundred lines (of more than 8000) because the webserver output was too big even for pastebin. On 16.10.2013 12:27, Erik Hatcher wrote: What does the debug output from debugQuery=true say between the two? What's really needed here is the first part of the debug section, which has rawquerystring, querystring, parsedquery, and parsedquery_toString. The info from your local solr has this part, but what you pasted from the webserver one didn't include those parts, because it's further down than the first few hundred lines. Thanks, Shawn
Re: Local Solr and Webserver-Solr act differently (AND treated like OR)
Okay I understand, here's the rawquerystring. It was at about line 3000:

<lst name="debug">
  <str name="rawquerystring">title:(into AND the AND wild*)</str>
  <str name="querystring">title:(into AND the AND wild*)</str>
  <str name="parsedquery">+title:wild*</str>
  <str name="parsedquery_toString">+title:wild*</str>

At this place the debug output DOES differ from the one on my local system. But I don't understand why... This is the local debug output:

<lst name="debug">
  <str name="rawquerystring">title:(into AND the AND wild*)</str>
  <str name="querystring">title:(into AND the AND wild*)</str>
  <str name="parsedquery">+title:into +title:the +title:wild*</str>
  <str name="parsedquery_toString">+title:into +title:the +title:wild*</str>

Why is that? Any ideas? On 16.10.2013 21:03, Shawn Heisey wrote: On 10/16/2013 4:46 AM, Stavros Delisavas wrote: My local solr gives me: http://pastebin.com/Q6d9dFmZ and my webserver this: http://pastebin.com/q87WEjVA I copied only the first few hundred lines (of more than 8000) because the webserver output was too big even for pastebin. On 16.10.2013 12:27, Erik Hatcher wrote: What does the debug output from debugQuery=true say between the two? What's really needed here is the first part of the debug section, which has rawquerystring, querystring, parsedquery, and parsedquery_toString. The info from your local solr has this part, but what you pasted from the webserver one didn't include those parts, because it's further down than the first few hundred lines. Thanks, Shawn
Re: Local Solr and Webserver-Solr act differently (AND treated like OR)
So, the stopwords.txt file is different between the two systems - the first has stop words but the second does not. Did you expect stop words to be removed, or not? -- Jack Krupansky -Original Message- From: Stavros Delsiavas Sent: Wednesday, October 16, 2013 5:02 PM To: solr-user@lucene.apache.org Subject: Re: Local Solr and Webserver-Solr act differently (AND treated like OR) Okay I understand, here's the rawquerystring. It was at about line 3000:

<lst name="debug">
  <str name="rawquerystring">title:(into AND the AND wild*)</str>
  <str name="querystring">title:(into AND the AND wild*)</str>
  <str name="parsedquery">+title:wild*</str>
  <str name="parsedquery_toString">+title:wild*</str>

At this place the debug output DOES differ from the one on my local system. But I don't understand why... This is the local debug output:

<lst name="debug">
  <str name="rawquerystring">title:(into AND the AND wild*)</str>
  <str name="querystring">title:(into AND the AND wild*)</str>
  <str name="parsedquery">+title:into +title:the +title:wild*</str>
  <str name="parsedquery_toString">+title:into +title:the +title:wild*</str>

Why is that? Any ideas? On 16.10.2013 21:03, Shawn Heisey wrote: On 10/16/2013 4:46 AM, Stavros Delisavas wrote: My local solr gives me: http://pastebin.com/Q6d9dFmZ and my webserver this: http://pastebin.com/q87WEjVA I copied only the first few hundred lines (of more than 8000) because the webserver output was too big even for pastebin. On 16.10.2013 12:27, Erik Hatcher wrote: What does the debug output from debugQuery=true say between the two? What's really needed here is the first part of the debug section, which has rawquerystring, querystring, parsedquery, and parsedquery_toString. The info from your local solr has this part, but what you pasted from the webserver one didn't include those parts, because it's further down than the first few hundred lines. Thanks, Shawn
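The difference in parsedquery is exactly what a populated stopwords.txt produces: a StopFilter in the analysis chain drops "into" and "the" before the query is built, leaving only +title:wild*. A crude stand-in for that effect (plain Python, not Solr's actual analysis code):

```python
def analyze(terms, stopwords):
    # Rough stand-in for a StopFilter: drop any term that
    # appears in the stopword set, keep the rest in order.
    return [t for t in terms if t.lower() not in stopwords]

query_terms = ["into", "the", "wild*"]

# Webserver schema: stopwords.txt populated with English stop words
with_stops = analyze(query_terms, {"a", "an", "and", "the", "into", "of"})
# Local schema: empty stopwords.txt
without_stops = analyze(query_terms, set())

print(with_stops)     # ['wild*']
print(without_stops)  # ['into', 'the', 'wild*']
```

This is why one server parses the query to +title:wild* and the other to +title:into +title:the +title:wild*.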
SolrCloud Performance Issue
Hi, I'm in the process of transitioning to SolrCloud from a conventional Master-Slave model. I'm using Solr 4.4 and have set up 2 shards with 1 replica each, plus a 3-node ZooKeeper ensemble. All the nodes are running on AWS EC2 instances. The shards are on m1.xlarge and share a ZooKeeper instance (mounted on a separate volume). 6 GB of memory is allocated to each Solr instance, and there are around 10 million documents in the index. With the previous standalone model, queries averaged around 100 ms. The SolrCloud query response has been abysmal so far: response times are over 1000 ms, often reaching 2000 ms. I expected some increase due to additional servers, network latency, etc., but this difference is really baffling. The hardware is similar in both cases, except that a couple of the SolrCloud nodes are sharing ZooKeeper as well. m1.xlarge I/O is high, so that shouldn't be a bottleneck either. The other difference from the old setup is that I'm using the new CloudSolrServer class with the 3 ZooKeeper references for load balancing, but I don't think it has any major impact, as queries executed from the Solr admin query panel confirm the slowness.
Here are some of my configuration settings:

<autoCommit>
  <maxTime>3</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>
<maxBooleanClauses>1024</maxBooleanClauses>
<filterCache class="solr.FastLRUCache" size="16384" initialSize="4096" autowarmCount="4096"/>
<queryResultCache class="solr.LRUCache" size="16384" initialSize="8192" autowarmCount="4096"/>
<documentCache class="solr.LRUCache" size="32768" initialSize="16384" autowarmCount="0"/>
<fieldValueCache class="solr.FastLRUCache" size="16384" autowarmCount="8192" showItems="4096"/>
<enableLazyFieldLoading>true</enableLazyFieldLoading>
<queryResultWindowSize>200</queryResultWindowSize>
<queryResultMaxDocsCached>400</queryResultMaxDocsCached>
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">line</str></lst>
    <lst><str name="q">xref</str></lst>
    <lst><str name="q">draw</str></lst>
  </arr>
</listener>
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">line</str></lst>
    <lst><str name="q">draw</str></lst>
    <lst><str name="q">line</str><str name="fq">language:english</str></lst>
    <lst><str name="q">line</str><str name="fq">Source2:documentation</str></lst>
    <lst><str name="q">line</str><str name="fq">Source2:CloudHelp</str></lst>
    <lst><str name="q">draw</str><str name="fq">language:english</str></lst>
    <lst><str name="q">draw</str><str name="fq">Source2:documentation</str></lst>
    <lst><str name="q">draw</str><str name="fq">Source2:CloudHelp</str></lst>
  </arr>
</listener>
<maxWarmingSearchers>2</maxWarmingSearchers>

The custom request handler:

<requestHandler name="/adskcloudhelp" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="wt">velocity</str>
    <str name="v.template">browse</str>
    <str name="v.contentType">text/html;charset=UTF-8</str>
    <str name="v.layout">layout</str>
    <str name="v.channel">cloudhelp</str>
    <str name="defType">edismax</str>
    <str name="q.alt">*:*</str>
    <str name="rows">15</str>
    <str name="fl">id,url,Description,Source2,text,filetype,title,LastUpdateDate,PublishDate,ViewCount,TotalMessageCount,Solution,LastPostAuthor,Author,Duration,AuthorUrl,ThumbnailUrl,TopicId,score</str>
    <str name="qf">text^1.5 title^2 IndexTerm^.9 keywords^1.2 ADSKCommandSrch^2 ADSKContextId^1</str>
    <str name="bq">Source2:CloudHelp^3 Source2:youtube^0.85</str>
    <str name="bf">recip(ms(NOW,PublishDate),3.16e-11,1,1)^2.0</str>
    <str name="df">text</str>
    <str name="facet">on</str>
    <str name="facet.mincount">1</str>
    <str name="facet.limit">100</str>
    <str name="facet.field">language</str>
    <str name="facet.field">Source2</str>
    <str name="facet.field">DocumentationBook</str>
    <str name="facet.field">ADSKProductDisplay</str>
    <str name="facet.field">audience</str>
    <str name="hl">true</str>
    <str name="hl.fl">text title</str>
    <str name="f.text.hl.fragsize">250</str>
    <str name="f.text.hl.alternateField">ShortDesc</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">default</str>
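As a side note on the bf above: Solr's recip(x,m,a,b) evaluates to a/(m*x+b), and with m=3.16e-11 (roughly 1 divided by the milliseconds in a year) and x=ms(NOW,PublishDate), a year-old document gets about half the freshness boost of a brand-new one. A quick sanity check in plain Python (not Solr code):

```python
def recip(x, m, a, b):
    # Solr's recip function query: a / (m*x + b)
    return a / (m * x + b)

MS_PER_YEAR = 365.25 * 24 * 3600 * 1000  # milliseconds in a year

# 3.16e-11 is approximately 1 / MS_PER_YEAR, so the boost
# halves about one year after PublishDate.
fresh = recip(0, 3.16e-11, 1, 1)               # just published
year_old = recip(MS_PER_YEAR, 3.16e-11, 1, 1)  # one year old

print(round(fresh, 3), round(year_old, 3))  # → 1.0 0.501
```

This only checks the shape of the decay; the ^2.0 on the bf then scales the whole curve.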
Solr - Read sort data from external source
Hello, I am trying to write some code to read rank data from an external db. I saw an example done using a database - http://sujitpal.blogspot.com/2011/05/custom-sorting-in-solr-using-external.html - where they fetch the whole database during index searcher creation and cache it. But is there any way to pass a parameter or choose a different database in the FieldComparator based on the query? Let's say I want to pass versions: the sort order in version 1 will be different than the sort order in v2. Or if I use ExternalFileField, is there a way to load a different file based on a query parameter? Regards -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Read-sort-data-from-external-source-tp4095970.html Sent from the Solr - User mailing list archive at Nabble.com.
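For context on the ExternalFileField route: the external file lives in the index data directory and holds one key=value line per document, and Solr reloads it on commit/reload rather than per request, so true per-query file switching isn't available out of the box. One common workaround is to keep one field (and file) per version, e.g. rank_v1 and rank_v2 (hypothetical names), and let the client pick which field to sort on. A small sketch of a parser for that file layout, assuming the simple key=value format:

```python
def parse_external_file(lines):
    # Parse ExternalFileField-style "docKey=floatValue" lines into a
    # dict, skipping blank and malformed entries (Solr is similarly
    # lenient about lines it cannot parse).
    values = {}
    for line in lines:
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, _, raw = line.rpartition("=")
        try:
            values[key] = float(raw)
        except ValueError:
            continue
    return values

sample = ["doc1=5.0", "doc2=0.25", "", "badline"]
print(parse_external_file(sample))  # → {'doc1': 5.0, 'doc2': 0.25}
```

With one file per version, a request could then choose the sort at query time, e.g. sort=rank_v1 desc versus sort=rank_v2 desc (field names illustrative).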
Skipping caches on a /select
Hey guys, I am debugging some /select queries on my Solr tier and would like to see if there is a way to tell Solr to skip the caches on a given /select query if it happens to ALREADY be in the cache. Live queries are being inserted into and read from the caches, but I want my debug queries to bypass the cache entirely. I do know about the cache=false param (which causes the results of a select not to be INSERTED into the cache), but what I am looking for instead is a way to tell Solr not to read the cache at all, even if there actually is a cached result for my query. Is there a way to do this (without disabling my caches in solrconfig.xml), or is this a feature request? Thanks! Tim Vaillancourt
Re: SolrCloud on SSL
Not important, but I'm also curious why you would want SSL on Solr (adds overhead, complexity, harder to troubleshoot, etc)? To avoid the overhead, could you put Solr on a separate VLAN (with ACLs to client servers)? Cheers, Tim On 12 October 2013 17:30, Shawn Heisey s...@elyograg.org wrote: On 10/11/2013 9:38 AM, Christopher Gross wrote: On Fri, Oct 11, 2013 at 11:08 AM, Shawn Heisey s...@elyograg.org wrote: On 10/11/2013 8:17 AM, Christopher Gross wrote: Is there a spot in a Solr configuration that I can set this up to use HTTPS? From what I can tell, not yet. https://issues.apache.org/jira/browse/SOLR-3854 https://issues.apache.org/jira/browse/SOLR-4407 https://issues.apache.org/jira/browse/SOLR-4470 Dang. Christopher, I was just looking through Solr source code for a completely different issue, and it seems that there *IS* a way to do this in your configuration. If you were to use "https://hostname" or "https://ipaddress" as the host parameter in your solr.xml file on each machine, it should do what you want. The parameter is described here, but not the behavior that I have discovered: http://wiki.apache.org/solr/SolrCloud#SolrCloud_Instance_Params Boring details: In the org.apache.solr.cloud package, there is a ZkController class. The getHostAddress method is where I discovered that you can do this. If you could try this out and confirm that it works, I will get the wiki page updated and look into the Solr reference guide as well. Thanks, Shawn
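If the host-parameter trick Shawn describes does work, it would presumably look something like this in each node's solr.xml (legacy Solr 4.x format; unverified, per Shawn's own request for confirmation):

```xml
<solr persistent="true">
  <!-- Scheme is carried in the host value itself; hostPort/hostContext as usual -->
  <cores adminPath="/admin/cores"
         host="https://solr1.example.com"
         hostPort="8443"
         hostContext="solr">
    <!-- core definitions here -->
  </cores>
</solr>
```

The hostname and port here are placeholders; the point is only that the host attribute would include the https:// scheme.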
Re: Skipping caches on a /select
On Wed, Oct 16, 2013 at 6:18 PM, Tim Vaillancourt t...@elementspace.com wrote: I am debugging some /select queries on my Solr tier and would like to see if there is a way to tell Solr to skip the caches on a given /select query if it happens to ALREADY be in the cache. Live queries are being inserted and read from the caches, but I want my debug queries to bypass the cache entirely. I do know about the cache=false param (that causes the results of a select to not be INSERTED in to the cache), but what I am looking for instead is a way to tell Solr to not read the cache at all, even if there actually is a cached result for my query. Yeah, cache=false for q or fq should already not use the cache at all (read or write). -Yonik
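Per Yonik's answer, cache=false can be attached as a local param to the main query or to an individual fq, so a debug request can bypass the caches without any solrconfig.xml change:

```text
# Bypass the queryResultCache for the main query:
q={!cache=false}ipod

# Bypass the filterCache for one filter only:
fq={!cache=false}price:[100 TO 200]
```

The field name and values are illustrative; the {!cache=false} local-param prefix is the relevant part.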
Re: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID
Thanks Bharat. This is a bug. I've opened LUCENE-5289. https://issues.apache.org/jira/browse/LUCENE-5289 On Wed, Oct 16, 2013 at 9:35 PM, Akkinepalli, Bharat (ELS-CON) b.akkinepa...@elsevier.com wrote: Hi Shalin, I am not sure why the log specifies No uncommitted changes appear. The data is available in Solr at the time I perform a delete. please find the below steps I have performed: Inserted a document in master (with id= change.me.1) issued a commit on master Triggered replication on slave Ensured that the document is replicated successfully. Issued a delete by ID. Issued a commit on master Replication did NOT happen. The logs are as follows: Master - http://pastebin.com/265CtCEp Slave - http://pastebin.com/Qx0xLwmK Regards, Bharat Akkinepalli. -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Wednesday, October 16, 2013 11:28 AM To: solr-user@lucene.apache.org Subject: Re: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID The only delete I see in the master logs is: INFO - 2013-10-11 14:06:54.793; org.apache.solr.update.processor.LogUpdateProcessor; [annotation] webapp=/solr path=/update params={} {delete=[change.me(-1448623278425899008)]} 0 60 When you commit, we have the following: INFO - 2013-10-11 14:07:03.809; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false} INFO - 2013-10-11 14:07:03.813; org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. Skipping IW.commit. That suggests that the id you are trying to delete never existed in the first place and hence there was nothing to commit. Hence replication was not triggered. Am I missing something? On Wed, Oct 16, 2013 at 5:06 PM, Akkinepalli, Bharat (ELS-CON) b.akkinepa...@elsevier.com wrote: Hi Otis, Did you get a chance to look into the logs. 
Please let me know if you need more information. Thank you. Regards, Bharat Akkinepalli -Original Message- From: Akkinepalli, Bharat (ELS-CON) [mailto:b.akkinepa...@elsevier.com] Sent: Friday, October 11, 2013 2:16 PM To: solr-user@lucene.apache.org Subject: RE: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID Hi Otis, Thanks for the response. The log files can be found here. Master Log: http://pastebin.com/DPLKMPcF Slave Log: http://pastebin.com/DX9sV6Jx One more point worth mentioning here is that when we issue the commit with expungeDeletes=true, the delete by id replication is successful, i.e. http://localhost:8983/solr/annotation/update?commit=true&expungeDeletes=true Regards, Bharat Akkinepalli -Original Message- From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com] Sent: Wednesday, October 09, 2013 6:35 PM To: solr-user@lucene.apache.org Subject: Re: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID Bharat, Can you look at the logs on the Master when you issue the delete and the subsequent commits and share that? Otis -- Solr ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm On Tue, Oct 8, 2013 at 3:57 PM, Akkinepalli, Bharat (ELS-CON) b.akkinepa...@elsevier.com wrote: Hi, We have recently migrated from Solr 3.6 to Solr 4.4. We are using the Master/Slave configuration in Solr 4.4 (not Solr Cloud). We have noticed the following behavior/defect. Configuration: === 1. The Hard Commit and Soft Commit are disabled in the configuration (we control the commits from the application) 2. We have 1 Master and 2 Slaves configured and the pollInterval is configured to 10 Minutes. 3. The Master is configured to have the replicateAfter as commit and startup Steps to reproduce the problem: == 1. Delete a document in Solr (using delete by id).
URL - http://localhost:8983/solr/annotation/update with body as <delete><id>change.me</id></delete> 2. Issue a commit in Master ( http://localhost:8983/solr/annotation/update?commit=true). 3. The replication of the DELETE WILL NOT happen. The master and slave have the same index version. 4. If we try to issue another commit in Master, we see that it replicates fine. Request you to please confirm if this is a known issue. Thank you. Regards, Bharat Akkinepalli -- Regards, Shalin Shekhar Mangar.