At the HBase vs Accumulo deathmatch the other night, Todd explained that
HBase's write-ahead log lives in HDFS and benefits somewhat from that. He
neglected to mention that for years, until HDFS append() was available, HBase
just LOST data while Accumulo didn't... but he was talking about the current
state of affairs so, whatever.

The question now is, does it make any sense to look at HDFS as a place to store
Accumulo's write-ahead log? I remember that Bigtable used two log write streams
(each of which is transparently replicated by GFS, its counterpart to HDFS) and
switched between them to avoid performance hiccups, so the log write path does
sound like a critical part of the overall performance. Such a big change would
probably belong in 1.6 or later... But there may be reasons to never use HDFS
and to always use a separately maintained subsystem.

Anyone care to lay out the arguments for staying with a separate subsystem? I
think we know the arguments for using HDFS.

Aaron
