The answer here is actually not so simple. Durable sync was only added to HDFS with HDFS-744, and I have not yet gotten around to making the matching changes in HBase.
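
To make "durable sync" concrete, here is a minimal local-file sketch in Java. It is only an analogy, not HDFS or HBase code: the class and method names (FlushVsSync, writeWithoutSync, writeWithSync) are made up for illustration. The first method returns once the operating system has accepted the bytes, while the second also asks the OS to force them to the storage device, which is the extra guarantee a durable sync is meant to give.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; a local-file analogy only.
public class FlushVsSync {

    // Appends the record and returns once the OS has accepted the bytes.
    // They may still sit in the OS page cache, so a power loss can drop them.
    static void writeWithoutSync(Path file, String record) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            writeFully(ch, record);
        }
    }

    // Appends the record and then asks the OS to force content and metadata
    // to the storage device before returning.
    static void writeWithSync(Path file, String record) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            writeFully(ch, record);
            ch.force(true);
        }
    }

    // A single write() call may be partial, so loop until the buffer is drained.
    private static void writeFully(FileChannel ch, String record) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(record.getBytes(StandardCharsets.UTF_8));
        while (buf.hasRemaining()) {
            ch.write(buf);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("wal-sketch", ".log");
        writeWithoutSync(file, "edit-1\n");
        writeWithSync(file, "edit-2\n");
        System.out.print(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        Files.delete(file);
    }
}
```

Both calls leave the same bytes in the file during normal operation; they differ only in what is guaranteed to survive a sudden power loss.
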
Before HDFS-744 there was only hflush, which guarantees that the data has reached all replica datanodes (3 by default), but not that the data has actually been synced to disk. If you now hard shut down three or more machines of your cluster at the same time (by pulling the power cord, for example), you can lose data.

I blogged about this here: http://hadoop-hbase.blogspot.com/2012/05/hbase-hdfs-and-durable-sync.html

-- Lars

________________________________
From: Panshul Gupta <[email protected]>
To: [email protected]
Sent: Thursday, January 10, 2013 7:18 AM
Subject: persistence in Hbase

Hello,

I was wondering about the following: I have data stored in HBase tables on my 10 node cluster. I switch off (power down) my cluster. When I power up my cluster again and start the HDFS and Hadoop daemons, will HBase still have my old data persisted in the form I left it? Or will I have to re-import all the data?

Thank you for the help.

--
Regards,
Panshul.
http://about.me/panshulgupta
