Thanks, my answers are inline too.

On Mon, May 17, 2010 at 9:50 PM, Jonathan Gray <jg...@facebook.com> wrote:
> Answers inline.
>
>> <snip />
>> * We will go live from January 2011, in that time frame should we
>> develop using 0.21-SNAPSHOT or should we stick to 0.20.x? Ideally I
>> would not want to go ahead with a snapshot in production and also
>> would not want to make an upgrade within few months (because of some
>> problems noticed in the mailing list regarding upgrade and I am a bit
>> skeptical about it in general).
>
> If you need data durability (no data loss under node failure) then you have
> no choice but to go with 0.21 once it is released. This is not supported on
> the 0.20 line.
>
> There are a number of organizations who will be going live into production on
> 0.21 in Q3 2010. You can be sure that there will be a very well tested and
> stable 0.21 release by January 2011.
>
Extremely great! Then we are on the right track working with 0.21.

>> * Transaction was a contrib module of HBase but it seems recently
>> removed from the 0.21-SNAPSHOT. In light of it what would be the way
>> to achieve transaction?
>
> It is still available but is being moved to GitHub. You can still use it, it
> has just been moved out of the core code.
>
>> * NN was (if I am not mistaken) a SPoF, I also learnt that its
>> supposed to be fixed in 0.21, is that in trunk already?
>
> I'm not sure where you heard this was fixed in 0.21 as my understanding is
> that it is not fixed in 0.21.
>
> There is work being done at Facebook (and I believe parallel work being done
> elsewhere) to add a true backup NameNode. Once stabilized this will be
> released and available to the public though it may not be put into an
> official Hadoop release in an 0.21 timeframe.
>

I see. In that case I misunderstood the statement on Slide #10 of
"ApacheCon2009: Practical HBase": 'Removed SPoF, multi-master w/
automatic failover (ZK)'. Which master was this statement referring to?
Great to learn that it's being worked on.

>> * What kind of data loss should we design to?
>
> On 0.21, you should not have data loss. We are doing a lot of testing on
> this to ensure stability and durability.
>

This is extremely good news; it seems both of my primary concerns are
being addressed!

>> * Is there any professional service provider who could help us train
>> for deployment, help optimize and in case we need emergency provide
>> service? (P.S. I contacted Cloudera via email 11 days back and still
>> waiting for a reply, may be they are not interested any alternate
>> would do great!)
>
> Cloudera should be able to help. They're active on this list but perhaps
> don't want to use this forum to sell services. I would ping them again off
> the list.
>

Hmm, OK, I will try them again.

Thanks a lot!

Imran

>>
>> I eagerly hope for some help and guideline on these queries.
>>
>> --
>> Imran M Yousuf
>> Entrepreneur & Software Engineer
>> Smart IT Engineering
>> Dhaka, Bangladesh
>> Email: im...@smartitengineering.com
>> Blog: http://imyousuf-tech.blogs.smartitengineering.com/
>> Mobile: +880-1711402557
>

--
Imran M Yousuf
Entrepreneur & Software Engineer
Smart IT Engineering
Dhaka, Bangladesh
Email: im...@smartitengineering.com
Blog: http://imyousuf-tech.blogs.smartitengineering.com/
Mobile: +880-1711402557