I think neither of these would contribute much to load balancing. HDFS replication is mostly a safeguard against single points of failure within a Hadoop cluster. Data center replication, on the other hand, keeps your data available in a second Accumulo instance if the entire primary cluster becomes unavailable.
On 16 October 2016 at 21:02, Yamini Joshi <yamini.1...@gmail.com> wrote:

> In other words, what helps in load balancing? HDFS replication or Data
> center replication?
>
> Best regards,
> Yamini Joshi
>
> On Sat, Oct 15, 2016 at 10:44 PM, Yamini Joshi <yamini.1...@gmail.com>
> wrote:
>
>> So HDFS is for durability while replication is for availability? I'm
>> assuming that the client is unaware of the replicated instance and queries
>> the DB with no knowledge of which instance/table will return the result.
>>
>> Best regards,
>> Yamini Joshi
>>
>> On Thu, Oct 13, 2016 at 11:46 AM, Josh Elser <josh.el...@gmail.com>
>> wrote:
>>
>>> I'm not familiar with MongoDB. Perhaps someone else can confirm this for
>>> you.
>>>
>>> Yamini Joshi wrote:
>>>
>>>> So, can I say that if I have a table split across nodes (i.e. num
>>>> tablets > 1) and HDFS replication in my system, it is sort of equivalent
>>>> to a sharded and replicated mongo architecture?
>>>>
>>>> Best regards,
>>>> Yamini Joshi
>>>>
>>>> On Thu, Oct 13, 2016 at 11:06 AM, Josh Elser <josh.el...@gmail.com>
>>>> wrote:
>>>>
>>>>     The Accumulo (Data Center) Replication feature is for having
>>>>     multiple active Accumulo clusters all containing the same data.
>>>>
>>>>     HDFS provides replication as a means for durability of the data it
>>>>     is storing. The files that Accumulo creates on one HDFS instance are
>>>>     replicated by HDFS. This does not help if your entire cluster becomes
>>>>     unavailable. That is what the data center replication Accumulo
>>>>     feature solves.
>>>>
>>>>     While both can be called "replication", they serve very different
>>>>     purposes.
>>>>
>>>>     Yamini Joshi wrote:
>>>>
>>>>         Hello
>>>>
>>>>         I was going through some Accumulo docs and found out about
>>>>         replication. To enable replication, one needs to make some
>>>>         config settings as described in
>>>>         https://github.com/apache/accumulo/blob/master/docs/src/main/asciidoc/chapters/replication.txt.
>>>>         I cannot seem to grasp the difference between this replication
>>>>         conf and the replication on HDFS level. What exactly is the use
>>>>         case for replication? Are the replicated instances visible to
>>>>         the clients?
>>>>
>>>>         Best regards,
>>>>         Yamini Joshi