Hi Shawn,

I made the changes in my schema.xml and uploaded the configuration from one of my servers; it is now visible on all the other servers (I confirmed this by checking the admin interface).
*My SolrCloud architecture:*

I have two collections, mcat and intent, using an external ZooKeeper ensemble of 3 nodes.

*On Machine 1*
I created mcat_shard1 of mcat in core 1 and intent_shard1 of intent in core 2.

*On Machine 2*
I created mcat_shard2 of mcat in core 1 and intent_shard2 of intent in core 2.

*On Machine 3*
I created the replica of mcat_shard1 in core 1
I created the replica of mcat_shard2 in core 2
I created the replica of intent_shard1 in core 3
I created the replica of intent_shard2 in core 4

To apply the changes, I reloaded my collection using the URL:

http://localhost:4567/solr/admin/collections?action=RELOAD&name=intent

After the reload, the two shards on one of my machines (Machine 2) went into recovery mode, and I cannot work out what is wrong.

*Please see the Solr logs of that machine:*

INFO - 2015-03-11 12:12:41.630; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/cores params={coreNodeName=core_node2&onlyIfLeaderActive=true&state=recovering&nodeName=192.168.5.81:4567_solr&action=PREPRECOVERY&checkLive=true&core=mcat_shard2_replica_core&wt=javabin&onlyIfLeader=true&version=2} status=400 QTime=1
ERROR - 2015-03-11 12:12:41.643; org.apache.solr.common.SolrException; Error while trying to recover. core=mcat_core:java.util.concurrent.ExecutionException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://127.0.1.1:4567/solr: I was asked to wait on state recovering for null in null on 192.168.5.81:4567_solr but I still do not see the requested state. I see state: null live:false leader from ZK: Not available due to: java.lang.NullPointerException
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:188)
    at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:597)
    at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:369)
    at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://127.0.1.1:4567/solr: I was asked to wait on state recovering for null in null on 192.168.5.81:4567_solr but I still do not see the requested state. I see state: null live:false leader from ZK: Not available due to: java.lang.NullPointerException
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:558)
    at org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:249)
    at org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:245)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
ERROR - 2015-03-11 12:12:41.644; org.apache.solr.cloud.RecoveryStrategy; Recovery failed - trying again... (6) core=mcat_core
INFO - 2015-03-11 12:12:41.644; org.apache.solr.cloud.RecoveryStrategy; Wait 60.0 seconds before trying to recover again (7)
INFO - 2015-03-11 12:12:42.665; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 3)
INFO - 2015-03-11 12:12:43.355; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 3)

I restarted my complete cluster, but the problem is still present. Please help.

*Here are the screenshot URLs:*
http://i.imgur.com/QFdg89S.png
http://i.imgur.com/tS0yTNh.png

With Regards
Aman Tandon

On Wed, Mar 11, 2015 at 2:02 PM, Aman Tandon <amantandon...@gmail.com> wrote:

> Thanks Shawn.
>
>> except that you should reload the collection, which
>> will reload all cores for that collection
>
> So i could reload a collection via Collection API's
>
> http://localhost:8983/solr/admin/collections?action=RELOAD&name=newCollection
>
> right?
>
> With Regards
> Aman Tandon
>
> On Wed, Mar 11, 2015 at 1:48 PM, Shawn Heisey <apa...@elyograg.org> wrote:
>
>> On 3/11/2015 12:43 AM, Aman Tandon wrote:
>> > Thanks Nitin for replying, isn't it will be costly operation to restart all
>> > nodes.
>> >
>> > What i am doing in this is uploading the configurations again to zookeeper
>> > and then reloading my core. And it is working well. So am i missing
>> > something?
>>
>> Yes, that is enough, except that you should reload the collection, which
>> will reload all cores for that collection. Fully restarting all Solr
>> instances is not required for most changes to a collection config.
>>
>> There is an exception - when you are adding config that uses new jars
>> added to ${solr.solr.home}/lib after Solr startup. In that situation, a
>> restart would be required so that the new jars get loaded. It is
>> strongly recommended that you use an external zookeeper and that you do
>> a rolling restart, where you restart one node, wait for the cloud graph
>> in the admin UI to show 100% green, then restart the next node.
>>
>> Nitin's suggestion shows the zkcli.sh script starting with "sudo" which
>> runs the command as root. This is not necessary. As long as the
>> contents of the example directory (or server directory in 5.0) is
>> accessible to a normal user and the script is marked executable for that
>> user, no special permissions are required.
>>
>> Thanks,
>> Shawn
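
For reference, the reload-then-check steps discussed in this thread could be sketched roughly as below. This is only an illustration, not something taken from Solr itself: the helper names `reload_url` and `non_active_replicas` are hypothetical, the sample data is made up to resemble the situation described above, and the nested collection -> shards -> replicas layout is an assumption about how clusterstate.json is structured. Nothing here contacts a Solr server.

```python
from urllib.parse import urlencode


def reload_url(base_url, collection):
    """Build a Collections API RELOAD URL (hypothetical helper)."""
    query = urlencode({"action": "RELOAD", "name": collection})
    return f"{base_url.rstrip('/')}/admin/collections?{query}"


def non_active_replicas(cluster_state):
    """List (collection, shard, replica, state) for replicas that are not active.

    Assumes a dict shaped like clusterstate.json:
    collection -> "shards" -> shard -> "replicas" -> replica -> "state".
    """
    problems = []
    for coll_name, coll in cluster_state.items():
        for shard_name, shard in coll.get("shards", {}).items():
            for replica_name, replica in shard.get("replicas", {}).items():
                state = replica.get("state")
                if state != "active":
                    problems.append((coll_name, shard_name, replica_name, state))
    return problems


if __name__ == "__main__":
    # Same base URL and collection name as in the message above.
    print(reload_url("http://localhost:4567/solr", "intent"))

    # Made-up sample state for illustration only.
    sample_state = {
        "mcat": {
            "shards": {
                "shard1": {"replicas": {"core_node1": {"state": "active"}}},
                "shard2": {"replicas": {"core_node2": {"state": "recovering"}}},
            }
        }
    }
    print(non_active_replicas(sample_state))
```

An empty list from `non_active_replicas` would correspond to the "100% green" cloud graph mentioned in the quoted reply; any entries point at the replicas worth looking at in the logs first.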