Hi David,

The parent metadata persists only until the sub-shards become active. Actually, the logic to make the sub-shards active depends on knowing when all 'sibling' sub-shards' replicas have recovered successfully. We store the parent to make that easier to look up. Once all replicas of all sub-shards have recovered, the shard states are updated. The 'updateshardstate' command also removes the 'parent' key from the sub-shards while switching them to 'active'.
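To illustrate the check described above, here is a minimal Python sketch over a clusterstate-like dictionary. It is not the actual Solr code: the function name `activate_recovered_subshards`, the dictionary layout, and the 'construction' and 'recovering' state strings are assumptions made for illustration only.

```python
def activate_recovered_subshards(shards):
    """Activate sibling sub-shards once every replica of every sibling is active.

    `shards` is a hypothetical mapping of shard name -> shard properties,
    loosely modelled on a clusterstate entry. It is modified in place.
    Returns the sorted names of the sub-shards that were switched to 'active'.
    """
    # Group not-yet-active sub-shards by the parent they were split from.
    siblings = {}
    for name, shard in shards.items():
        parent = shard.get("parent")
        if parent is not None and shard.get("state") != "active":
            siblings.setdefault(parent, []).append(name)

    activated = []
    for names in siblings.values():
        # Collect the replicas of all siblings that share this parent.
        replicas = [
            replica
            for name in names
            for replica in shards[name].get("replicas", {}).values()
        ]
        # Require at least one replica, and all of them recovered.
        if not replicas or any(r.get("state") != "active" for r in replicas):
            continue
        for name in names:
            shards[name]["state"] = "active"
            # The 'parent' key is only needed until activation.
            shards[name].pop("parent", None)
            activated.append(name)
    return sorted(activated)
```

With this sketch, a sub-shard keeps its 'parent' key as long as any sibling still has a replica that has not recovered, and loses it in the same step that marks it 'active'.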
If you're seeing the 'parent' key on an 'active' sub-shard then it may be a bug. Please paste your clusterstate and I'll look into why it was left over.

On Mon, Feb 3, 2014 at 10:19 AM, David Smiley (@MITRE.org) <dsmi...@mitre.org> wrote:
> I think I figured this out; I hope people find this useful.
>
> It may not be possible to declare what the hash ranges are when you create
> the collection, but you *can* do so when you split via the 'ranges'
> parameter, which is a comma-delimited list. So this means you can create a
> new collection with one shard and then immediately split it to the desired
> ranges to line up with those of your backup. I also observed that if you
> create a collection and then split every shard (in 2), it will result in an
> equivalent collection to one that was created with twice as many shards to
> begin with. I hoped that was so and verified the ranges end up being the
> same both ways.
>
> The only thing that seems like it may be benign but not 100% certain is that
> if you split a shard, the new shards have a 'parent' reference to the name
> of the shard they were split from, even if you delete that parent shard
> (since it's not needed anymore; it becomes inactive). I'm not sure why this
> metadata is recorded because, at least after the split, I can't see why it's
> pertinent to anything.
>
> ~ David
>
>
> David Smiley (@MITRE.org) wrote
>> Hi,
>>
>> I'm attempting to come up with a SolrCloud restore / clone process to
>> either recover to a known good state or clone the environment for
>> experimentation. At the moment my process involves either creating a new
>> zookeeper environment or at least deleting the existing Collection so that
>> I can create a new one. This works; I use the Core API; the first command
>> defines the collection parameters, and I invoke it once for each replica.
>> I don't use the Collection API because I don't want SolrCloud to go off
>> trying to create all the replicas -- I know where each one is pre-positioned.
>>
>> What I'm concerned about is what happens once I start wanting to use Shard
>> splitting, *especially* if I don't want to split all shards because shards
>> are uneven due to custom routing (e.g. id:"customer!myid"). In this case
>> I don't know how to create the collection with the hash ranges post-shard
>> split. Solr doesn't have an API for me to explicitly say what the hash
>> ranges should be on each shard (to match up with a backup). And I'm
>> concerned about undocumented pitfalls that may exist in manually
>> constructing a clusterstate.json, as another approach.
>>
>> Any ideas?
>>
>> ~ David
>
>
> -----
> Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Clone-or-Restore-Solrcloud-tp4114773p4114983.html
> Sent from the Solr - User mailing list archive at Nabble.com.

--
Regards,
Shalin Shekhar Mangar.