So HDFS replication is for durability, while Accumulo (data center) replication is for availability? I'm assuming that the client is unaware of the replicated instance and queries the DB with no knowledge of which instance/table will return the result.

Best regards,
Yamini Joshi

On Thu, Oct 13, 2016 at 11:46 AM, Josh Elser <josh.el...@gmail.com> wrote:

> I'm not familiar with MongoDB. Perhaps someone else can confirm this for
> you.
>
> Yamini Joshi wrote:
>
>> So, can I say that if I have a table split across nodes (i.e. num
>> tablets > 1) and HDFS replication in my system, it is sort of equivalent
>> to a sharded and replicated mongo architecture?
>>
>> Best regards,
>> Yamini Joshi
>>
>> On Thu, Oct 13, 2016 at 11:06 AM, Josh Elser <josh.el...@gmail.com> wrote:
>>
>>     The Accumulo (Data Center) Replication feature is for having
>>     multiple active Accumulo clusters all containing the same data.
>>
>>     HDFS provides replication as a means for durability of the data it
>>     is storing. The files that Accumulo creates on one HDFS instance are
>>     replicated by HDFS. This does not help if your entire cluster becomes
>>     unavailable. That is what the data center replication Accumulo
>>     feature solves.
>>
>>     While both can be called "replication", they serve very different
>>     purposes.
>>
>>     Yamini Joshi wrote:
>>
>>         Hello
>>
>>         I was going through some Accumulo docs and found out about
>>         replication. To enable replication, one needs to make some
>>         config settings as described in
>>         https://github.com/apache/accumulo/blob/master/docs/src/main/asciidoc/chapters/replication.txt.
>>         I cannot seem to grasp the difference between this replication
>>         conf and the replication on HDFS level. What exactly is the use
>>         case for replication? Are the replicated instances visible to
>>         the clients?
>>
>>         Best regards,
>>         Yamini Joshi