Re: Data Replication

vaibhav thapliyal Sun, 16 Oct 2016 10:59:09 -0700

I don't think either of these contributes much to load balancing. HDFS
replication is mostly a safeguard against single points of failure within
a Hadoop cluster. Data center replication, on the other hand, keeps your
data available from another Accumulo instance if an entire cluster becomes
unavailable.
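
To make the difference concrete, here is a rough sketch of where each kind
of replication gets configured. Treat it as an illustration, not a tested
setup: the peer name "peer", the instance name "accumulo_peer", the
ZooKeeper hosts, the table name "my_table" and the remote table id are all
placeholders I made up, and the Accumulo property names are as I recall
them from the replication chapter linked further down the thread, so
please check them against that document.

HDFS replication is a block-level setting in hdfs-site.xml:

    <!-- number of copies HDFS keeps of each block within one cluster -->
    <property>
      <name>dfs.replication</name>
      <value>3</value>
    </property>

Data center replication is set up from the Accumulo shell on the primary
instance, per peer and per table:

    # define the remote instance (placeholder name and ZooKeeper quorum)
    config -s replication.peer.peer=org.apache.accumulo.tserver.replication.AccumuloReplicaSystem,accumulo_peer,zk1:2181,zk2:2181
    # credentials used to connect to the peer (placeholders)
    config -s replication.peer.user.peer=replication_user
    config -s replication.peer.password.peer=replication_password
    # turn replication on for one table and point it at the remote table id
    config -t my_table -s table.replication=true
    config -t my_table -s table.replication.target.peer=2

The first setting only affects how many copies of each file block live
inside a single cluster. The second group ships a table's data to a
separate Accumulo instance.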


On 16 October 2016 at 21:02, Yamini Joshi <yamini.1...@gmail.com> wrote:

> In other words, what helps in load balancing? HDFS replication or Data
> center replication?
>
> Best regards,
> Yamini Joshi
>
> On Sat, Oct 15, 2016 at 10:44 PM, Yamini Joshi <yamini.1...@gmail.com>
> wrote:
>
>> So HDFS is for durability while replication is for availability? I'm
>> assuming that the client is unaware of the replicated instance and queries
>> the DB with no knowledge of which instance/table will return the result.
>>
>> Best regards,
>> Yamini Joshi
>>
>> On Thu, Oct 13, 2016 at 11:46 AM, Josh Elser <josh.el...@gmail.com>
>> wrote:
>>
>>> I'm not familiar with MongoDB. Perhaps someone else can confirm this for
>>> you.
>>>
>>> Yamini Joshi wrote:
>>>
>>>> So, can I say that if I have a table split across nodes (i.e. num
>>>> tablets > 1) and HDFS replication in my system, it is sort of equivalent
>>>> to a sharded and replicated mongo architecture?
>>>>
>>>> Best regards,
>>>> Yamini Joshi
>>>>
>>>> On Thu, Oct 13, 2016 at 11:06 AM, Josh Elser <josh.el...@gmail.com
>>>> <mailto:josh.el...@gmail.com>> wrote:
>>>>
>>>>     The Accumulo (Data Center) Replication feature is for having
>>>>     multiple active Accumulo clusters all containing the same data.
>>>>
>>>>     HDFS provides replication as a means for durability of the data it
>>>>     is storing. The files that Accumulo creates on one HDFS instance are
>>>>     replicated by HDFS. This does not help if your entire cluster becomes
>>>>     unavailable. That is what the data center replication Accumulo
>>>>     feature solves.
>>>>
>>>>     While both can be called "replication", they serve very different
>>>>     purposes.
>>>>
>>>>
>>>>     Yamini Joshi wrote:
>>>>
>>>>         Hello
>>>>
>>>>         I was going through some Accumulo docs and found out about
>>>>         replication.
>>>>         To enable replication, one needs to make some config settings as
>>>>         described in
>>>>         <https://github.com/apache/accumulo/blob/master/docs/src/main/asciidoc/chapters/replication.txt>.
>>>>         I cannot seem to grasp the difference between this replication
>>>>         conf and
>>>>         the replication on HDFS level. What exactly is the use case for
>>>>         replication? Are the replicated instances visible to the
>>>>         clients?
>>>>
>>>>         Best regards,
>>>>         Yamini Joshi
>>>>
>>>>
>>>>
>>
>
