To expand on that, the Collections API DELETEREPLICA command is available
in Solr >= 4.6, but will not have the ability to wipe the disk until Solr
4.10.
Note that whether or not it deletes anything from disk, DELETEREPLICA will
remove that replica from your cluster state in ZK, so even in 4.10,
rebooting the node will NOT cause it to copy the data from the remaining
replica. You'd need to explicitly ADDREPLICA (Solr >= 4.8) to get it
participating again. On the plus side, you could do this without
restarting any servers.
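
For reference, the calls would look roughly like this (collection, shard,
and replica/core_node names are placeholders -- pull the real ones from
clusterstate.json or the Cloud UI):

  curl 'http://host:8983/solr/admin/collections?action=DELETEREPLICA&collection=mycollection&shard=shard5&replica=core_node12'

  curl 'http://host:8983/solr/admin/collections?action=ADDREPLICA&collection=mycollection&shard=shard5&node=somehost:8983_solr'

The node parameter on ADDREPLICA is optional; if you leave it off, Solr
should pick a node for you.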

The CoreAdmin UNLOAD command (which I think DELETEREPLICA uses under the
hood) has been available and able to wipe the disk since Solr 4.0. It
looks like specifying "deleteIndex=true" might essentially do what you're
currently doing. I'm not sure if you'd still need a restart.
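
As a rough sketch of that (the core name is a placeholder for whatever the
replica's core is actually called on that box):

  curl 'http://host:8983/solr/admin/cores?action=UNLOAD&core=mycollection_shard5_replica2&deleteIndex=true'

If memory serves, there are also deleteDataDir=true and
deleteInstanceDir=true options if you want more than just the index
directory gone.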


It's odd to me that one replica would use more disk space than the other,
though; that implies a replication issue, which in turn means you probably
don't have any assurance that deleting the node with the bigger index
isn't losing unique documents.
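
One quick sanity check there: hit each replica's core directly with
distrib=false and compare numFound (hostnames and core names below are
made up, swap in your own):

  curl 'http://host1:8983/solr/mycollection_shard5_replica1/select?q=*:*&rows=0&distrib=false'
  curl 'http://host2:8983/solr/mycollection_shard5_replica2/select?q=*:*&rows=0&distrib=false'

If the counts match, the extra space is probably just un-merged deletes or
stray segment files; if they don't, the bigger index may have documents
the other replica never got.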



On 6/30/14, 7:10 PM, "Anshum Gupta" <ans...@anshumgupta.net> wrote:

>You should use the DELETEREPLICA Collections API:
>https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api9
>
>As of the last release, I don't think it deletes the index directory
>but I remember there was a JIRA for the same.
>For now you could perhaps use this API and follow it up with manually
>deleting the directory after that. This should help you maintain the
>sanity of the SolrCloud state.
>
>
>On Mon, Jun 30, 2014 at 8:45 PM, tomasv <dadk...@gmail.com> wrote:
>> Hello All,
>> (I'm a newbie, so if my terminology is incorrect or my concepts are
>>wrong,
>> please point me in the right direction)(This is the first of several
>> questions to come)
>>
>> I've inherited a SOLR 4 cloud installation and we're having some issues
>>with
>> disk space on one of our shards.
>>
>> We currently have 64 servers serving a collection. The collection is
>>managed
>> by a zookeeper instance. There are two servers for each shard (32
>>replicated
>> shards).
>>
>> We have a service that is constantly running and inserting new records
>>into
>> our collection as we get new data to be indexed.
>>
>> One of our shards is growing (on disk)  disproportionately  quickly.
>>When
>> the disk gets full, we start getting 500-series errors from the SOLR
>>system
>> and our websites start to fail.
>>
>> Currently, when we start seeing these errors, and IT sees that the disk
>>is
>> full on this particular server, the folks in IT delete the /data
>>directory
>> and restart the server (linux based). This has the effect of causing the
>> shard to reboot and re-load itself from its paired partner.
>>
>> But I would expect that there is a more elegant way to recover from this
>> event.
>>
>> Can anyone point me to a strategy that may be used in an instance such
>>as
>> this? Should we be taking steps to save the indexed information prior to
>> restarting the server (more on this in a separate question). Should we
>>be
>> backing up something (anything) prior to the restart?
>>
>> (I'm still going through the SOLR wiki; so if the answer is there a
>>link is
>> appreciated).
>>
>> Thanks!
>>
>>
>>
>>
>>
>> --
>> View this message in context:
>>http://lucene.472066.n3.nabble.com/Strategy-for-removing-an-active-shard-from-zookeeper-tp4144892.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>
>
>
>-- 
>
>Anshum Gupta
>http://www.anshumgupta.net
