[ https://issues.apache.org/jira/browse/HBASE-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794366#action_12794366 ]
Andrew Purtell commented on HBASE-2055:
---------------------------------------

Sorry, above I meant SYNC_INTERVAL, not SYNC_SIZE.

Also it looks like the DataFileWriter as implemented for AVRO-160 will hold up to SYNC_INTERVAL bytes in a buffer before writing out the block. We want to hsync after a group of related commits in the WAL whether SYNC_INTERVAL is reached or not, but also have the stream marked with a sync marker at each SYNC_INTERVAL. This is basically what my v3 or v4 patch does. It also writes a copy of the schema just after the sync marker so we have an opportunity to resynchronize a reader on each block regardless of how many previous blocks are corrupt (perhaps all).

> Serialize WAL as Avro records
> -----------------------------
>
>                 Key: HBASE-2055
>                 URL: https://issues.apache.org/jira/browse/HBASE-2055
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: Andrew Purtell
>            Priority: Minor
>         Attachments: HBASE-2055-v2.patch, HBASE-2055-v3.patch, HBASE-2055-v4.patch, HBASE-2055.patch, jackson-core-asl-1.0.1.jar, jackson-mapper-asl-1.0.1.jar, paranamer-1.5.jar, TEST-org.apache.hadoop.hbase.regionserver.wal.TestHLog.txt.gz, TEST-org.apache.hadoop.hbase.regionserver.wal.TestLogRolling.txt.gz, TEST-org.apache.hadoop.hbase.TestFullLogReconstruction.txt.gz, test-site.patch
>
>
> There was some advocacy of using Avro for serialization of HBase WAL records up on hbase-...@. Idea is Hadoop core is getting away from Writables and Avro is the blessed replacement.
> I think we have this criteria for its use:
> 1) Performance of writing Avro records is no worse than that for writing Writables into a SequenceFile.
> 2) Space consumed by Avro serialization is no worse than that of Writables
> 3) File format is amenable to appends (cannot require valid trailers, etc.)
> I'll put up a patch so we can try it out.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
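
For readers following the thread, the block layout described in the comment above (flush at the end of each commit group even if SYNC_INTERVAL has not been reached, start every block with a sync marker, and repeat the schema right after the marker) can be illustrated with a toy sketch. This is not the HBASE-2055 patch and not Avro's DataFileWriter API: `BlockWriter`, `end_group`, `read_blocks`, the length-prefixed framing, and the tiny `SYNC_INTERVAL` value are all hypothetical names and choices made only for this illustration. The `flush()` call merely stands in for an hsync on a real stream.

```python
import io
import os
import struct

SYNC_INTERVAL = 64  # hypothetical threshold in bytes, kept tiny for the demo


class BlockWriter:
    """Toy writer. Each block is: marker | len+schema | len+payload."""

    def __init__(self, out, schema: bytes, marker: bytes):
        self.out = out
        self.schema = schema
        self.marker = marker
        self.buf = []
        self.buffered = 0
        self.blocks_written = 0

    def append(self, record: bytes):
        self.buf.append(record)
        self.buffered += len(record)
        if self.buffered >= SYNC_INTERVAL:  # block is full: write it out
            self._write_block()

    def end_group(self):
        """Call after a group of related commits; flushes even a short block."""
        if self.buf:
            self._write_block()
        self.out.flush()  # stand-in for hsync on a real filesystem stream

    def _write_block(self):
        payload = b"".join(struct.pack(">I", len(r)) + r for r in self.buf)
        self.out.write(self.marker)
        self.out.write(struct.pack(">I", len(self.schema)) + self.schema)
        self.out.write(struct.pack(">I", len(payload)) + payload)
        self.buf, self.buffered = [], 0
        self.blocks_written += 1


def read_blocks(data: bytes, marker: bytes):
    """Yield (schema, records) per readable block, resyncing on the marker."""
    pos = data.find(marker)
    while pos != -1:
        try:
            p = pos + len(marker)
            (slen,) = struct.unpack_from(">I", data, p)
            p += 4
            schema = data[p:p + slen]
            p += slen
            (plen,) = struct.unpack_from(">I", data, p)
            p += 4
            payload = data[p:p + plen]
            if len(schema) != slen or len(payload) != plen:
                raise ValueError("truncated block")
            records, q = [], 0
            while q < plen:
                (rlen,) = struct.unpack_from(">I", payload, q)
                q += 4
                rec = payload[q:q + rlen]
                if len(rec) != rlen:
                    raise ValueError("truncated record")
                records.append(rec)
                q += rlen
            yield schema, records
            next_from = p + plen
        except (struct.error, ValueError):
            next_from = pos + 1  # unreadable block: hunt for the next marker
        pos = data.find(marker, next_from)


marker = os.urandom(16)
out = io.BytesIO()
w = BlockWriter(out, b'{"type":"bytes"}', marker)
w.append(b"put row1")
w.append(b"put row2")
w.end_group()                         # short block, written anyway
first_len = out.tell()
w.append(b"delete row1")
w.end_group()

data = bytearray(out.getvalue())
data[:first_len] = bytes(first_len)   # simulate a fully corrupt first block
recovered = list(read_blocks(bytes(data), marker))
```

In this run `recovered` holds only the second block, together with its own copy of the schema, which is the resynchronization property the comment is after. The sketch has no checksums, so it only detects corruption that destroys a marker or truncates a block.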