BTW, Srivas, I could find only a single "countless" example of a 'hadoop fs -rmr' horror story, and that one in the form of a hypothetical question (and not on this list ;) http://is.gd/55KD1E
Just for the sake of full disclosure, of course.

Enjoy,
  Cos

On Tue, May 22, 2012 at 09:45PM, M. C. Srivas wrote:
> On Tue, May 22, 2012 at 12:08 AM, Martinus Martinus
> <martinus...@gmail.com> wrote:
>
> > Hi Todd,
> >
> > Thanks for your answer. Will that have the same capability as the
> > commercial M5 of MapR: http://www.mapr.com/products/why-mapr ?
> >
> > Thanks.
> >
>
> Hi Martinus, some major differences in HA between MapR's M5 and Apache
> Hadoop:
>
> 1. With M5, any node can become master at any time. It is a fully
> active-active system. You can create a fully bomb-proof cluster, such
> that in a 20-node cluster, you can configure it to survive even if 19 of
> the 20 nodes are lost. With Apache, it is a 1-1 active-passive system.
>
> 2. M5 does not require an NFS filer in the backend. Apache Hadoop requires
> a Netapp or similar NFS filer to assist in saving the NN data, even in its
> HA configuration. Note that for true HA, the Netapp or similar also will
> need to be HA.
>
> 3. M5 has full HA for the Job-Tracker as well.
>
> Of course, HA is only a small part of the total business continuity story.
> Full recovery in the face of any kind of failure is critical:
>
> With M5:
>
> - If there is a complete cluster crash and reboot (eg, a full
> power-failure of the entire cluster), M5 will recover in 5-10 minutes, and
> submitted jobs will resume from where they were.
>
> - With snapshots, if you upgrade your software and it corrupts data, M5
> provides snapshots to help you recover. The number of times I've seen
> someone running "hadoop fs -rmr /" accidentally and asking for help on
> this mailing list is beyond counting. With M5, it is completely
> recoverable.
>
> - Full disaster-recovery across clusters by mirroring.
>
> Hope that clarifies some of the differences.
>
> > On Tue, May 22, 2012 at 2:26 PM, Todd Lipcon <t...@cloudera.com> wrote:
> >
> >> Hi Martinus,
> >>
> >> Hadoop HA is available in Hadoop 2.0.0. This release is currently
> >> being voted on in the community.
> >>
> >> You can read more here:
> >>
> >> http://www.cloudera.com/blog/2012/03/high-availability-for-the-hadoop-distributed-file-system-hdfs/
> >>
> >> -Todd
> >>
> >> On Mon, May 21, 2012 at 11:24 PM, Martinus Martinus
> >> <martinus...@gmail.com> wrote:
> >> > Hi,
> >> >
> >> > Is there any hadoop HA distribution out there?
> >> >
> >> > Thanks.
> >>
> >> --
> >> Todd Lipcon
> >> Software Engineer, Cloudera