Answers inline.

> -----Original Message-----
> From: Imran M Yousuf [mailto:imyou...@gmail.com]
> Sent: Monday, May 17, 2010 8:14 AM
> To: hbase-user@hadoop.apache.org
> Subject: Availability Transaction and data integrity
> 
> Hi,
> 
> Currently we are designing an architecture for an Accounting SaaS and
> e-commerce website. As both of them will store financial data,
> transactions, redundancy, HA and data integrity are very important. As I
> am not a master of HBase architecture and implementation, I am eagerly
> waiting for your comments on the following:
> 
> * We will go live in January 2011; in that time frame should we
> develop using 0.21-SNAPSHOT or should we stick to 0.20.x? Ideally I
> would not want to go ahead with a snapshot in production, and also
> would not want to make an upgrade within a few months (because of some
> problems noticed on the mailing list regarding upgrades, and I am a bit
> skeptical about it in general).

If you need data durability (no data loss under node failure), then you have no 
choice but to go with 0.21 once it is released.  Durability is not supported on 
the 0.20 line.

There are a number of organizations who will be going live into production on 
0.21 in Q3 2010.  You can be sure that there will be a very well tested and 
stable 0.21 release by January 2011.
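For context, the durability work in the 0.21 line depends on running HBase on an 
HDFS build that supports sync/append, so the write-ahead log can actually be 
flushed to disk.  As an illustrative sketch (exact property support varies by 
Hadoop build, so treat this as an assumption to verify against your 
distribution), a cluster would typically enable it in hdfs-site.xml like so:

```xml
<!-- hdfs-site.xml: enable sync/append support so the HBase
     write-ahead log can be durably flushed to HDFS.
     Whether this property is honored depends on the specific
     Hadoop build you deploy; check your distribution's docs. -->
<property>
  <name>dfs.support.append</name>
  <value>true</value>
</property>
```

Note this needs to be set on the HDFS side before HBase's WAL sync gives you the 
no-data-loss behavior discussed above.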

> * Transactions were a contrib module of HBase, but it seems that module
> was recently removed from 0.21-SNAPSHOT. In light of that, what would
> be the way to achieve transactions?

It is still available, but it is being moved to GitHub.  You can still use it; 
it has just been moved out of the core code.

> * The NameNode was (if I am not mistaken) a SPOF. I also learnt that
> it's supposed to be fixed in 0.21; is that in trunk already?

I'm not sure where you heard this was fixed in 0.21; my understanding is that 
it is not.

There is work being done at Facebook (and, I believe, parallel work being done 
elsewhere) to add a true backup NameNode.  Once stabilized, this will be 
released and available to the public, though it may not make it into an 
official Hadoop release in the 0.21 timeframe.

> * What kind of data loss should we design for?

On 0.21, you should not have data loss.  We are doing a lot of testing on this 
to ensure stability and durability.

> * Is there any professional service provider who could train us for
> deployment, help us optimize, and provide support in case of an
> emergency? (P.S. I contacted Cloudera via email 11 days back and am
> still waiting for a reply; maybe they are not interested. Any
> alternative would do great!)

Cloudera should be able to help.  They're active on this list but perhaps don't 
want to use this forum to sell services.  I would ping them again off-list.

> 
> I eagerly hope for some help and guidance on these queries.
> 
> --
> Imran M Yousuf
> Entrepreneur & Software Engineer
> Smart IT Engineering
> Dhaka, Bangladesh
> Email: im...@smartitengineering.com
> Blog: http://imyousuf-tech.blogs.smartitengineering.com/
> Mobile: +880-1711402557
