Also, you have to track two sets of batches, failures, and retries.  --wunder
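
To make that bookkeeping concrete, here is a minimal sketch of what dual delivery might look like on the client side. It is not anyone's actual loader: the names `Delivery`, `deliver_batch`, and `post_csv`, the retry limit, and the URLs are all hypothetical, and the sender assumes the target accepts a CSV body via HTTP POST with an `application/csv` content type.

```python
import urllib.request
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Delivery:
    """Per-data-center record for one batch (hypothetical structure)."""
    attempts: int = 0
    delivered: bool = False
    errors: List[str] = field(default_factory=list)


def post_csv(url: str, payload: bytes) -> None:
    """Assumed sender: POST a CSV body to an update URL.

    urlopen raises on connection errors and on HTTP 4xx/5xx responses.
    """
    req = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/csv"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60):
        pass


def deliver_batch(
    payload: bytes,
    targets: Dict[str, str],
    send: Callable[[str, bytes], None] = post_csv,
    max_attempts: int = 3,
) -> Dict[str, Delivery]:
    """Send one batch to every data center, retrying each independently."""
    status = {name: Delivery() for name in targets}
    for name, url in targets.items():
        record = status[name]
        while not record.delivered and record.attempts < max_attempts:
            record.attempts += 1
            try:
                send(url, payload)
                record.delivered = True
            except Exception as exc:  # keep the failure for later inspection
                record.errors.append(str(exc))
    return status
```

Even in this small form, every batch carries one status record per data center, and a batch that lands in one cluster but not the other still has to be reconciled somewhere.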


On Jun 25, 2013, at 11:30 AM, Kevin Osborn wrote:

> Jason,
> 
> My initial reluctance to index directly to both data centers is that we
> are doing a lot of bulk loading through the CSV handler. We never get just
> one document at a time; the data comes in large batch updates. And now we
> would have to send the batch updates twice.
> 
> That is not to say that we won't go this way. But I am exploring other
> solutions as well.
> 
> 
> On Tue, Jun 25, 2013 at 11:21 AM, Jason Hellman <
> jhell...@innoventsolutions.com> wrote:
> 
>> Kevin,
>> 
>> I can imagine this working if you treat your second data center as a pure
>> slave of your SolrCloud cluster.  I haven't tried it, but I don't see why
>> the solrconfig.xml can't identify a node as a master, allowing you to call
>> any of your cores in the cluster to replicate out.  That being said, this
>> idea doesn't give you a SolrCloud cluster in the second data center…just a
>> slave that could be a repeater.
>> 
>> You say that sending the data in both directions is not ideal, but it works
>> and is conceptually very simple.  What is the reasoning behind wanting to
>> get away from that approach?
>> 
>> Jason
>> 
>> On Jun 25, 2013, at 10:07 AM, Kevin Osborn <kevin.osb...@cbsi.com> wrote:
>> 
>>> We are going to have two datacenters, each with their own SolrCloud and
>>> ZooKeeper quorums. The end result will be that they should be replicas of
>>> each other.
>>> 
>>> One method that has been mentioned is that we should add documents to each
>>> cluster separately. For various reasons, this may not be ideal for us.
>>> Instead, we are playing around with the idea of always indexing to one
>>> datacenter and then having that replicate to the other datacenter. And
>>> this is where I am having some trouble figuring out how to proceed.
>>> 
>>> The nice thing about SolrCloud is that there are no masters and slaves.
>>> Each node is equal, has the same configs, etc. But in this case, I want to
>>> have a node in one datacenter poll for changes in another datacenter.
>>> Before SolrCloud, I would have used master/slave replication. But in the
>>> SolrCloud world, I am not sure how to configure this setup.
>>> 
>>> Or are there any better ideas on how to use replication to push or pull
>>> data from one datacenter to another?
>>> 
>>> In my case, NRT is not a requirement. And I will also be dealing with
>>> about 3 collections and 5 or 6 shards.
>>> 
>>> Thanks.
>>> 
>>> --
>>> *KEVIN OSBORN*
>>> LEAD SOFTWARE ENGINEER
>>> CNET Content Solutions
>>> OFFICE 949.399.8714
>>> CELL 949.310.4677      SKYPE osbornk
>>> 5 Park Plaza, Suite 600, Irvine, CA 92614
>> 
>> 
> 
> 

--
Walter Underwood
wun...@wunderwood.org


