Re: How to build SolrCloud collection from instance dirs after ZK is lost?

Erick Erickson Sat, 20 Jun 2020 03:30:52 -0700

There’s a special createNodeSet value, EMPTY, that creates the collection
without adding any replicas. And by replicas I’m including leaders here.
You can then ADDREPLICA to place each one exactly where you please.
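
As a rough sketch of that sequence, the snippet below only builds the Collections API URLs (it sends nothing). The base URL, collection name, config name and node names are placeholders invented for illustration; CREATE with createNodeSet=EMPTY and ADDREPLICA with shard/node are the calls described above.

```python
from urllib.parse import urlencode


def collections_api_url(base_url, params):
    """Build a Collections API URL; nothing is sent to Solr here."""
    return base_url + "/admin/collections?" + urlencode(params)


def empty_create_then_add(base_url, collection, config_name, placement):
    """Return the URLs to call, in order.

    placement maps shard name -> node name (hypothetical values such as
    'host1:8983_solr'); one ADDREPLICA is issued per shard.
    """
    urls = [
        collections_api_url(base_url, {
            "action": "CREATE",
            "name": collection,
            "numShards": len(placement),
            "collection.configName": config_name,
            # EMPTY: create the collection metadata with no replicas at all
            "createNodeSet": "EMPTY",
        })
    ]
    for shard, node in placement.items():
        urls.append(collections_api_url(base_url, {
            "action": "ADDREPLICA",
            "collection": collection,
            "shard": shard,
            "node": node,
        }))
    return urls


if __name__ == "__main__":
    # All names below are made up for the example.
    for url in empty_create_then_add(
        "http://localhost:8983/solr",
        "recovered",
        "myconf",
        {"shard1": "host1:8983_solr", "shard2": "host2:8983_solr"},
    ):
        print(url)
```

Printing the URLs first makes it easy to review the intended placement before running anything against a live cluster.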


Using a defined nodeset as you outline will work too. If it somehow messes
up you can always MOVEREPLICA…
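
A minimal sketch of the defined-nodeset variant, again only constructing URLs. It assumes one replica per shard and that the node list is given in the order the surviving shard dirs are laid out; the node names, collection name and replica name are hypothetical, and the resulting placement should be checked before moving any data.

```python
from urllib.parse import urlencode


def api_url(base_url, params):
    """Build a Collections API URL without sending it."""
    return base_url + "/admin/collections?" + urlencode(params)


def create_on_nodeset(base_url, collection, nodes):
    """CREATE with an explicit, unshuffled node list (one shard per node)."""
    return api_url(base_url, {
        "action": "CREATE",
        "name": collection,
        "numShards": len(nodes),
        "replicationFactor": 1,
        "createNodeSet": ",".join(nodes),
        # keep the node order as given instead of shuffling it
        "createNodeSet.shuffle": "false",
    })


def move_replica(base_url, collection, replica, target_node):
    """MOVEREPLICA for a replica that ended up on the wrong node."""
    return api_url(base_url, {
        "action": "MOVEREPLICA",
        "collection": collection,
        "replica": replica,
        "targetNode": target_node,
    })


if __name__ == "__main__":
    base = "http://localhost:8983/solr"  # placeholder
    print(create_on_nodeset(base, "recovered", ["host1:8983_solr", "host2:8983_solr"]))
    print(move_replica(base, "recovered", "core_node2", "host1:8983_solr"))
```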

The thing to remember is that Zookeeper keeps all the metadata and it
needs to be coordinated with what’s in the core.properties file on disk.
Reconstructing that data can be done manually, but that’s horribly
error-prone; best to let Solr do that.
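
Before recreating anything, it can help to take stock of what the surviving core.properties files say. The sketch below is a deliberately simplified reader (plain key=value lines only, not the full Java properties syntax), and the keys it reports (name, collection, shard, coreNodeName) are an assumption about what a SolrCloud core.properties typically contains. It is meant for inspection only, not for writing metadata back.

```python
from pathlib import Path

# Keys assumed to be present in a SolrCloud core.properties; adjust as needed.
KEYS_OF_INTEREST = ("name", "collection", "shard", "coreNodeName")


def parse_core_properties(text):
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def inventory(solr_home):
    """Map each core directory under solr_home to its shard-related properties."""
    found = {}
    for path in sorted(Path(solr_home).rglob("core.properties")):
        props = parse_core_properties(path.read_text(encoding="utf-8"))
        found[str(path.parent)] = {k: props.get(k) for k in KEYS_OF_INTEREST}
    return found
```

The result tells you which old core directory holds which shard, which is the mapping needed when deciding where each new replica should go.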

The indexes, however, have no notion of anything outside the replica. 
Zookeeper is unaware of their existence and vice versa. So what you’re
really doing is recreating the metadata (including core.properties) then
overlaying the index.
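
The overlay step might look roughly like the sketch below. It assumes the default layout where a core keeps its index under data/index, that Solr is stopped on the node while files are moved, and that both directories are on the same host; the function name and the index.empty backup name are made up for illustration.

```python
import shutil
from pathlib import Path


def overlay_index(old_core_dir, new_core_dir):
    """Move the old core's index into the newly created core's data dir.

    Assumes <core>/data/index on both sides and that Solr is not running.
    The empty index Solr created for the new core is kept as 'index.empty'
    rather than deleted.
    """
    old_index = Path(old_core_dir) / "data" / "index"
    new_data = Path(new_core_dir) / "data"
    new_index = new_data / "index"

    if not old_index.is_dir():
        raise FileNotFoundError(f"no index directory at {old_index}")

    new_data.mkdir(parents=True, exist_ok=True)
    if new_index.exists():
        # set the freshly created empty index aside instead of removing it
        new_index.rename(new_data / "index.empty")

    shutil.move(str(old_index), str(new_index))
    return new_index
```

Only the index directory is moved; the core.properties written by Solr for the new core stays in place, so the metadata remains whatever Solr and Zookeeper agreed on.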

As far as reconstructing this by including more data in core.properties,
people have been talking about “ZK as the source of truth”, which is
something of the opposite. If we keep metadata in both places, then 
keeping them synchronized is a pain. And, frankly, I’m lukewarm to
the whole notion of reconstructing the collection automatically.
Backing up critical data (in this case Zookeeper) is part of any 
operations playbook… but that’s a debate for another day ;)

Good luck!
Erick

> On Jun 20, 2020, at 3:37 AM, Mikhail Khludnev <m...@apache.org> wrote:
> 
> Hello, Colleagues.
> Thank you for sharing your experience and ideas. 
> I'm aiming for the simplest scenario with default routing (is it 
> 'composite'?), so I assume that if I set the same number of shards, the hash 
> ranges will match. I plan to create a new collection with replicationFactor=1 
> and shuffle=false, and to specify the nodeset in the same order as the 
> remaining shard dirs are distributed across nodes. That lets me collocate old 
> and new cores and just exec mv. Alternatively, replicas can be placed on 
> specific nodes explicitly.
> 
> On Sat, Jun 20, 2020 at 1:59 AM Shawn Heisey <apa...@elyograg.org> wrote:
> On 6/18/2020 1:35 AM, Mikhail Khludnev wrote:
> > I'm challenged with cluster recovery. Think about total failure: ZK 
> > state is lost, however instanceDirs survived since they are mounted via 
> > EBS. Let's say collection is read/only and/or it doesn't have 
> > replicas, just leaders.
> > Is there a way to create a new empty collection and say, hey here's 
> > shard1 instance, shard2 instance is there etc?
> > 
> > Customer says that the old version of solr does it automatically: when 
> > empty zk is connected, collection's shards just appear there. Right now 
> > due to https://issues.apache.org/jira/browse/SOLR-12066 ("Cleanup deleted 
> > core when node start"), if instances with data dirs connect to an empty ZK, 
> > it just wipes the dirs away.
> 
> I think that SOLR-12066 was a mistake.  See SOLR-13396, which is linked 
> to SOLR-12066.  There are some interesting ideas outlined in SOLR-13396.
> 
> There is info in the clusterstate that is currently not recorded 
> anywhere but zookeeper, making it impossible to fully reconstruct a 
> collection from existing cores when ZK data is lost.
> 
> A quick look at the cloud example on version 8.5.1 tells me that for 
> such reconstruction to be possible, in addition to what it currently 
> contains, core.properties would need to record the shard hash range, the 
> router, maxShardsPerNode, and autoAddReplicas.  And there may be other 
> things related to features that the cloud example does not use.
> 
> If both properties and clusterstate in ZK are available, any mismatches 
> between the two should generate a WARN log, and ZK info should probably 
> be preferred over properties.  A Collections API action should probably 
> be created to force mismatches back into agreement.
> 
> Alternately, the new info could be recorded in a new file, with 
> cloud.properties being one possibility for the filename.  I can think of 
> reasons to prefer this approach, but I worry about the stability of 
> adding a whole new file to the config mechanisms.
> 
> If the capability does not already exist, I think there should be some 
> combination of Collections API actions that will allow somebody to 
> manually reconstruct the collection clusterstate in ZK.
> 
> Side note:  While playing with examples on 8.5.1 so I could be accurate 
> on this message, I discovered that the "Files" tab in the admin UI has 
> issues, in both cloud and standalone mode.  The following screenshot has 
> some red lines added to problems I found.  Subdirectories do not work 
> correctly, the column for filenames is not wide enough for the example 
> configs, and the filenames do not have mouseover expansion which would 
> be an alternate way to deal with really long filenames.
> 
> https://www.dropbox.com/s/4lm3uad2uv53630/SolrAdminFilesTabProblems.png?dl=0
> 
> That's probably worthy of an issue, but I don't want to open one without 
> discussion.
> 
> Thanks,
> Shawn
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
> 
> 
> 
> -- 
> Sincerely yours
> Mikhail Khludnev


