I call that wild idea and raise you. I had the random idea once to log an audit trail to the WAL in addition to edits (all of the query side stuff, plus exceptional conditions and important metrics) and then hand off the rolled WALs to some periodic MapReduce process for reduction into long term storage, perhaps with correlation. Like Chukwa, sort of. Half an audit trail -- the write side, the mutations -- is already there in the WAL in chronological order, and this may not be an unreasonable way to handle audit trails from 100+ or dare I say it 1000+ region servers while trying to stick within the Hadoop stack, rather than picking up the complexities of some other external component (Scribe, some syslog collector, etc.). Just stack up the WALs in HDFS and process them at the end of the day or something like that.
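
Roughly, the reduction step could look like the sketch below. It is plain Python rather than an actual MapReduce job, and everything in it is made up for illustration: the WalEntry record, its kind field, and reduce_audit_trail are hypothetical names, and the real WAL has no such structure today. It only shows edits and audit events sharing one log and being regrouped per region server afterwards.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical record kinds; the existing WAL only carries edits.
EDIT, AUDIT = "edit", "audit"


@dataclass
class WalEntry:
    seq: int       # position in the log, i.e. chronological order
    kind: str      # EDIT (a mutation) or AUDIT (query, error, metric)
    server: str    # region server that wrote the entry
    payload: dict  # free-form details, assumed for illustration


def reduce_audit_trail(rolled_wals):
    """Group entries from many rolled WALs by server, ordered by seq."""
    trail = defaultdict(list)
    for wal in rolled_wals:
        for entry in wal:
            trail[entry.server].append(entry)
    for entries in trail.values():
        entries.sort(key=lambda e: e.seq)
    return dict(trail)
```

Feeding it the day's rolled WALs gives one ordered list per region server, with mutations and audit events interleaved, which is about what would go into long term storage.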
Anyway the bloom soon comes off the rose... I mean idea... The trouble of course is doubling or tripling (or more) the size of the WAL, with follow-on negative impacts on write path performance: more frequent rolling, more data to sync, a need to append data to files in HDFS even if only serving queries, etc. However, if Avro is both fast and has good support for nested structures with optional fields, and we could come up with some scheme where some marker indicates a field should get the last previous value seen (as opposed to just being null), then it might not be so crazy.

   - Andy

________________________________
From: stack <st...@duboce.net>
To: hbase-dev@hadoop.apache.org
Sent: Tue, December 15, 2009 8:32:56 PM
Subject: Re: Avro for WAL serialization format?

What do you see as advantage Jeff? I suppose it'd be more compact than current Writable-based serialization. Current HBase WAL is a SequenceFile. We'd have to move away from that?

Thanks,
St.Ack

On Tue, Dec 15, 2009 at 7:46 PM, Jeff Hammerbacher <ham...@cloudera.com> wrote:

> Hey,
>
> Inspired by Drizzle's use of Protobufs for their transaction log format
> (e.g.
> http://jpipes.com/index.php?/archives/299-Drizzle-Replication-The-Transaction-Log.html
> ):
> how crazy would it be to try out Avro's binary format for the HBase WAL?
>
> Thanks,
> Jeff