bq: If I changed the routing strategy back to composite (which it should be). is it ok?
I sincerely doubt it. The docs have already been routed to the wrong place (actually, I'm not sure how it worked at all). You can't get them redistributed simply by changing the definition in ZooKeeper; they're _already_ in the wrong place. I'd tear down the corrupted data center and rebuild the collection. Here "tear down" means: delete all the affected collections and start over.

On the plus side, if you can get a window during which you are _not_ indexing, you can copy the indexes from one of your good data centers to the new one. Do it like this:

- Stop indexing.
- Set up the new collection in the corrupted data center. It's important that it have _exactly_ the same number of shards as the DC you're going to transfer _from_. Also, make it leader-only, i.e. exactly 1 replica per shard.
- Copy the indexes over from the good data center to the corresponding shards. Here "corresponding" means that the source and destination have the same hash range, which you can see from state.json (or clusterstate.json if you're on an earlier format). NOTE: there are two ways to do this:
-- Just do file copies: scp, hand-carry CDs, whatever. Solr should be offline in the target data center.
-- Use the replication API to issue a "fetchindex" command. This works even in cloud mode; all the target Solr instance needs is access to a URL it can pull from. Solr of course needs to be running in this case.
- Bring up Solr in the target data center and verify it's working.
- Use the Collections API to ADDREPLICA on the target system until you build out the collection with the number of replicas you want.
- Start indexing to the target data center.

The bit about shutting off indexing is a safety measure; it guarantees that the indexes are consistent. If you can't shut indexing down during the transfer, you'll need to index docs to the newly-rebuilt cluster in some manner that guarantees the two DCs will have the same docs eventually.
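To check that the hash ranges line up, you can use CLUSTERSTATUS rather than reading state.json by hand. A sketch, with hypothetical host and collection names you'd substitute for your own:

```shell
# Hypothetical hostname/collection name -- substitute your own.
GOOD_DC="http://solr-dc1.example.com:8983"
COLLECTION="mycollection"

# CLUSTERSTATUS reports each shard's hash range, e.g. "range":"80000000-b332ffff".
# Compare these between the source and target collections before copying indexes.
STATUS_URL="${GOOD_DC}/solr/admin/collections?action=CLUSTERSTATUS&collection=${COLLECTION}"
echo "$STATUS_URL"
# curl "$STATUS_URL"   # run against a live cluster
```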
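For the fetchindex route, the call is made against the target core's replication handler, pointing masterUrl at the matching core in the good DC. A sketch, with hypothetical hosts and core names -- the core on each side must cover the same hash range:

```shell
# Hypothetical hosts/core name -- substitute your own.
SOURCE="http://solr-dc1.example.com:8983/solr"   # good DC
TARGET="http://solr-dc3.example.com:8983/solr"   # rebuilt DC
CORE="mycollection_shard1_replica1"              # same hash range on both sides

# Ask the target core to pull its index from the matching source core.
FETCH_URL="${TARGET}/${CORE}/replication?command=fetchindex&masterUrl=${SOURCE}/${CORE}"
echo "$FETCH_URL"
# curl "$FETCH_URL"   # target Solr must be running for this to work
```

Repeat once per shard, matching cores by hash range, not by name.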
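And the ADDREPLICA step, once the leader-only collection is verified, looks like this -- again a sketch with hypothetical names, run once per replica you want to add:

```shell
# Hypothetical host/collection/node names -- substitute your own.
TARGET="http://solr-dc3.example.com:8983"
COLLECTION="mycollection"
NODE="solr-dc3-node2.example.com:8983_solr"   # node to place the new replica on

# Add one replica of shard1 on the named node; repeat per shard/replica.
ADD_URL="${TARGET}/solr/admin/collections?action=ADDREPLICA&collection=${COLLECTION}&shard=shard1&node=${NODE}"
echo "$ADD_URL"
# curl "$ADD_URL"
```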
Best,
Erick

On Tue, Mar 14, 2017 at 3:26 PM, vbindal <vineetbin...@gmail.com> wrote:
> I think I dint explain properly.
>
> I have 3 data centers each with its own SOLR cloud.
>
> My original strategy was composite routing but when one data center went
> down and we brought it back, somehow the routing strategy on this changed to
> implicit (Other 2 DC still have composit and they are working absolutely
> fine).
>
> This might be the reason for the data corruption on that DS because the
> routing strategy got changed.
>
> If I changed the routing strategy back to composite (which it should be). is
> it ok? Do I need to do anything more than simply changing the strategy in
> the clusterState.json?
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Inconsistent-numFound-in-SC-when-querying-core-directly-tp4105009p4325001.html
> Sent from the Solr - User mailing list archive at Nabble.com.