Missed the list in my last reply:

This used to work properly - I'm guessing that the zk layout refactoring right 
before 4.0 broke it. We likely need a JIRA issue, a fix, and a test. 

Mark

On Nov 14, 2012, at 6:43 AM, Gilles Comeau <gilles.com...@polecat.co> wrote:

> Hi all,
> 
> I just wanted to make the simplest repro of this issue, which I now think 
> might be related to the decision made in: 
> https://issues.apache.org/jira/browse/SOLR-3080. Is this the expected 
> behaviour?
> 
> 1.    Download SOLR 4 production and extract.
> 2.    Replace solr.xml in apache-solr-4.0.0/example/solr/solr.xml with:
> 
> <?xml version="1.0" encoding="UTF-8" ?>
> <solr persistent="true">
>  <cores adminPath="/admin/cores" defaultCoreName="collection1" 
> host="${host:}" hostPort="${jetty.port:}" hostContext="${hostContext:}" 
> zkClientTimeout="${zkClientTimeout:15000}">
>    <core shard="shard1" instanceDir="collection1/" name="collection1" 
> collection="polecat"/>
>    <core shard="shard1" instanceDir="collection2/" name="collection2" 
> collection="polecat"/>
>    <core schema="schema.xml" shard="core3" instanceDir="core3/" name="core3" 
> config="solrconfig.xml" collection="polecat" dataDir="data"/>
>  </cores>
> </solr>
> 
> 3.    Start solr with: java -Dbootstrap_confdir=./solr/collection1/conf 
> -Dcollection.configName=myconf -DzkRun -Dsolrcloud.skip.autorecovery=true  
> -jar start.jar
>       (skip.autorecovery is used because the shards don't exist previously)
> 
> Then run this:
>       Sanity query:  
> http://localhost:8983/solr/polecat/select?q=*%3A*&wt=xml&distrib=true
>       Remove the core: 
> http://localhost:8983/solr/admin/cores?action=UNLOAD&core=core3&deleteIndex=true
>       Error query: 
> http://localhost:8983/solr/polecat/select?q=*%3A*&wt=xml&distrib=true
> 
> With the sanity query we receive 0 records; the error query returns "no servers 
> hosting shard:". And in the clusterstate.json:  "core3":{"replicas":{}}}}
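
For anyone retrying the steps above, here is a minimal sketch of the three requests as a script. The URLs are the ones quoted in the repro; `SOLR_BASE` and the function names `select_url`, `unload_url`, and `run_repro` are made up for illustration and are not part of Solr.

```shell
#!/usr/bin/env bash
# Sketch of the repro above. SOLR_BASE and the function names are
# illustrative assumptions; the URLs are the ones from the steps.
SOLR_BASE="${SOLR_BASE:-http://localhost:8983/solr}"

# Distributed match-all query against a collection.
select_url() {
  printf '%s/%s/select?q=*%%3A*&wt=xml&distrib=true' "$SOLR_BASE" "$1"
}

# CoreAdmin UNLOAD request for one core, deleting its index.
unload_url() {
  printf '%s/admin/cores?action=UNLOAD&core=%s&deleteIndex=true' "$SOLR_BASE" "$1"
}

# Runs the three requests in order. The URLs are quoted so the shell
# does not treat '&' as "run in background".
run_repro() {
  curl -s "$(select_url polecat)"   # sanity query
  curl -s "$(unload_url core3)"     # remove the core
  curl -s "$(select_url polecat)"   # error query
}
```

Nothing is sent until `run_repro` is called, so the file can be sourced first to inspect the URLs it would request.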
> 
> Regards,
> 
> Gilles
> 
> -----Original Message-----
> From: Gilles Comeau [mailto:gilles.com...@polecat.co] 
> Sent: 13 November 2012 16:39
> To: solr-user@lucene.apache.org; markrmil...@gmail.com
> Subject: RE: Removing Shards from Zookeeper - no servers hosting shard
> 
> Sorry, I forgot that pictures don't come through on the list. Here is the same 
> information from clusterstate.json; the shard for the core I unloaded sticks around:  
> "solrexperiment:8080_solr_experiment_02_10_2012":{"replicas":{}}}}
> 
> Do I need a special command to delete the shard or something?  I’ve never 
> seen a command that does that.
> 
> Regards, Gilles
> 
>  "experiment":{
>    "solrexperiment:8080_solr_experiment_master":{"replicas":{"IS-17093:9090_solr_experiment_master":{
>          "shard":"solrexperiment:8080_solr_experiment_master",
>          "roles":null,
>          "state":"active",
>          "core":"experiment_master",
>          "collection":"experiment",
>          "node_name":"IS-17093:9090_solr",
>          "base_url":"http://IS-17093:9090/solr",
>          "leader":"true"}}},
>    "solrexperiment:8080_solr_experiment_01_10_2012":{"replicas":{"IS-17093:9090_solr_01_10_2012_experiment":{
>          "shard":"solrexperiment:8080_solr_experiment_01_10_2012",
>          "roles":null,
>          "state":"active",
>          "core":"01_10_2012_experiment",
>          "collection":"experiment",
>          "node_name":"IS-17093:9090_solr",
>          "base_url":"http://IS-17093:9090/solr",
>          "leader":"true"}}},
>    "solrexperiment:8080_solr_experiment_02_10_2012":{"replicas":{}}}}
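
A quick way to spot leftover shards like the last entry above is to look for shard entries whose replica map is empty. The following is only a rough sketch: `empty_shards` is a hypothetical helper name, it relies on text pattern matching rather than a real JSON parser, and it assumes shard names contain no whitespace or double quotes.

```shell
# Prints the name of every shard whose "replicas" map is empty in the
# clusterstate JSON read from stdin. Pattern matching only, not a JSON
# parser; assumes shard names contain no whitespace or double quotes.
empty_shards() {
  tr -d ' \n\r\t' |
    grep -o '"[^"]*":{"replicas":{}}' |
    sed 's/^"\([^"]*\)".*/\1/'
}
```

Usage would be along the lines of `empty_shards < clusterstate.json`, where the file is a saved copy of the cluster state.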
> 
> 
> From: Gilles Comeau [mailto:gilles.com...@polecat.co]
> Sent: 13 November 2012 16:29
> To: solr-user@lucene.apache.org; markrmil...@gmail.com
> Subject: RE: Removing Shards from Zookeeper - no servers hosting shard
> 
> 
> When I do the unload through the UI, I see the below messages in the solr 
> log.   Nothing in the zookeeper log.
> 
> 
> 
> Then right after I try:  
> http://217.147.83.124:9090/solr/experiment_master/select?q=*%3A*&wt=xml&distrib=true
>   and get  <str name="msg">no servers hosting shard:</str>.   Also, I still 
> see the shard being referenced in the cloud tab in the UI.
> 
> 
> 
> 
> 
> 
> Does this work for anyone else using SOLR 4.0 production with external 
> zookeeper and distributed queries and if so, can you let me know exactly what 
> versions and steps you take to not get this error? ☺   Anyone else have any 
> problems getting this to work?
> 
> 
> 
> 
> My setup is pretty basic:  Local external zookeeper  3.3.6, solr 4.0 with 
> three cores seen above.
> 
> 
> 
> Regards, Gilles
> 
> 
> 
> INFO: [02_10_2012_experiment]  CLOSING SolrCore org.apache.solr.core.SolrCore@11e3c2c6
> 
> 13-Nov-2012 16:19:13 org.apache.solr.core.SolrCore closeSearcher
> 
> INFO: [02_10_2012_experiment] Closing main searcher on request.
> 
> 13-Nov-2012 16:19:13 org.apache.solr.search.SolrIndexSearcher close
> 
> FINE: Closing Searcher@7cd47880 main
> 
>        
> fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=7,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
> 
>        
> filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=1,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
> 
>        
> queryResultCache{lookups=4,hits=3,hitratio=0.75,inserts=2,evictions=0,size=2,warmupTime=0,cumulative_lookups=4,cumulative_hits=3,cumulative_hitratio=0.75,cumulative_inserts=1,cumulative_evictions=0}
> 
>        
> documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
> 
> 13-Nov-2012 16:19:13 org.apache.solr.core.CachingDirectoryFactory close
> 
> FINE: Closing: CachedDir<<org.apache.lucene.store.MMapDirectory@/solr2/cores/02_10_2012/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@717757ad;refCount=1;path=/solr2/cores/02_10_2012/data/index;done=false>>
> 
> 13-Nov-2012 16:19:13 org.apache.solr.update.DirectUpdateHandler2 close
> 
> INFO: closing DirectUpdateHandler2{commits=0,autocommits=0,soft 
> autocommits=0,optimizes=0,rollbacks=0,expungeDeletes=0,docsPending=0,adds=0,deletesById=0,deletesByQuery=0,errors=0,cumulative_adds=0,cumulative_deletesById=0,cumulative_deletesByQuery=0,cumulative_errors=0}
> 
> 13-Nov-2012 16:19:13 org.apache.solr.update.DefaultSolrCoreState decref
> 
> INFO: SolrCoreState ref count has reached 0 - closing IndexWriter
> 
> 13-Nov-2012 16:19:13 org.apache.solr.update.DefaultSolrCoreState decref
> 
> INFO: Closing SolrCoreState - canceling any ongoing recovery
> 
> 13-Nov-2012 16:19:13 org.apache.solr.core.CoreContainer persistFile
> 
> INFO: Persisting cores config to /solr2/solr.xml
> 
> 13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal
> 
> FINE: null solr/cores/@adminPath=/admin/cores
> 
> 13-Nov-2012 16:19:13 org.apache.solr.core.Config getNode
> 
> FINE: null missing optional solr/cores/@shareSchema
> 
> 13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal
> 
> FINE: null solr/cores/@hostPort=9090
> 
> 13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal
> 
> FINE: null solr/cores/@zkClientTimeout=10000
> 
> 13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal
> 
> FINE: null solr/cores/@hostContext=solr
> 
> 13-Nov-2012 16:19:13 org.apache.solr.core.Config getNode
> 
> FINE: null missing optional solr/cores/@leaderVoteWait
> 
> 13-Nov-2012 16:19:13 org.apache.solr.core.SolrXMLSerializer persistFile
> 
> INFO: Persisting cores config to /solr2/solr.xml
> 
> 13-Nov-2012 16:19:13 org.apache.solr.common.cloud.ZkStateReader 
> updateClusterState
> 
> INFO: Updating cloud state from ZooKeeper...
> 
> 13-Nov-2012 16:19:13 org.apache.solr.common.cloud.ZkStateReader$2 process
> 
> INFO: A cluster state change has occurred - updating...
> 
> 
> 
> -----Original Message-----
> From: Mark Miller [mailto:markrmil...@gmail.com]
> Sent: 13 November 2012 14:13
> To: solr-user@lucene.apache.org
> Subject: Re: Removing Shards from Zookeeper - no servers hosting shard
> 
> 
> 
> Odd...the unload command should be enough...
> 
> 
> 
> On Tue, Nov 13, 2012 at 5:26 AM, Gilles Comeau 
> <gilles.com...@polecat.co> wrote:
> 
>> Hi all,
> 
>> 
> 
>> We've just updated to SOLR 4.0 production and Zookeeper 3.3.6 from SOLR 4.0 
>> development version circa November 2011.  We keep 6 months of data online in 
>> our primary cluster, and archive off old stuff to a slower disk archive 
>> cluster.   We used to remove SOLR cores with the following code, but 
>> everything has changed in Zookeeper now.
> 
>> 
> 
>> Old code to remove cores from Zookeeper:
>> 
>> # Unload the core (URL quoted so the shell does not split it at '&')
>> curl "http://127.0.0.1:8080/solr/admin/cores?action=UNLOAD&core=${SHARD}"
>> 
>>        echo "Removing indexes from all Zookeeper hosts"
>>        for (( i=0; i<${#ZK_HOSTS[*]}; i++ ))
>>        do
>>                # Delete the replica node first, then the shard node that held it
>>                $JAVA -cp .:/apps/zookeeper-3.3.5/zookeeper-3.3.5.jar:/apps/zookeeper-3.3.5/lib/jline-0.9.94.jar:/apps/zookeeper-3.3.5/lib/log4j-1.2.15.jar \
>>                        org.apache.zookeeper.ZooKeeperMain -server "${ZK_HOSTS[$i]}" delete \
>>                        "/collections/polecat/shards/solrenglish:8080_solr_$SHARD/$HOSTNAME:8080_solr_$SHARD"
>>                $JAVA -cp .:/apps/zookeeper-3.3.5/zookeeper-3.3.5.jar:/apps/zookeeper-3.3.5/lib/jline-0.9.94.jar:/apps/zookeeper-3.3.5/lib/log4j-1.2.15.jar \
>>                        org.apache.zookeeper.ZooKeeperMain -server "${ZK_HOSTS[$i]}" delete \
>>                        "/collections/polecat/shards/solrenglish:8080_solr_$SHARD"
>>        done
>> 
>> # Reload the master core so it picks up the change
>> curl "http://solrmaster01:8080/solr/admin/cores?action=RELOAD&core=master"
> 
>> 
> 
>> Now that we have migrated, I have tried removing cores from Zookeeper by 
>> removing the stuff for the unloaded core in "leaders" and "leader_elect", 
>> but for some reason SOLR keeps sending the requests to the shard, and I end 
>> up with the "no servers hosting shard" error.
> 
>> 
> 
>> Does anyone know how to remove a SOLR core from a SOLR server and have 
>> Zookeeper updated, and have distributed queries still work?   The only thing 
>> I know how to do now is stop tomcat, stop zookeeper, clear out the data 
>> directory and then restart both.   This isn't really ideal for a process I'd 
>> like to have running each night, and surely it is something others have hit.  
>> I've tried google searching, and what I find is references to the bug where 
>> solr notifies zookeeper on core unloads which is marked as fixed, and people 
>> talking about how it doesn't work but if you run reloads on each core, it 
>> will work.  (That also doesn't work when I do it.)
> 
>> 
> 
>> Regards,
> 
>> 
> 
>> Gilles Comeau
> 
> 
> 
> 
> 
> 
> 
> --
> 
> - Mark
