[ 
https://issues.apache.org/jira/browse/HBASE-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-2055:
----------------------------------

    Attachment: HBASE-2055-v3.patch

v3 writes a sync marker every 64K; each marker includes a copy of the schema. 

The file is initialized with a sync marker. The reader scans from the start of 
the file until it finds a valid sync marker and then reads in the schema. 

This is a fair amount of overhead (1 byte per record, plus 1K per 64K of 
data), but it means edits from corrupt logs can be partially recovered.
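
To make the recovery idea concrete, here is a minimal sketch in Python. It is 
not the code in the patch and not Avro's actual container format. The marker 
bytes, the 1-byte record tag, the length-prefixed framing, and the names 
SYNC, RECORD_TAG, sync_block, write_log and read_log are all assumptions made 
up for illustration. It only shows how a reader could scan for a sync marker, 
pick up the schema copy stored with it, and skip a damaged region by jumping 
to the next marker.

```python
import struct

# Everything below is a hypothetical layout, not the format used by the patch.
SYNC = b"\xde\xad\xbe\xef" * 4   # assumed 16-byte sync marker
RECORD_TAG = b"\x01"             # assumed 1-byte tag in front of every record
SYNC_INTERVAL = 64 * 1024        # write a sync block about every 64K of data


def sync_block(schema: bytes) -> bytes:
    """Sync marker followed by a length-prefixed copy of the schema."""
    return SYNC + struct.pack(">I", len(schema)) + schema


def write_log(records, schema: bytes, interval: int = SYNC_INTERVAL) -> bytes:
    """Frame records, starting with a sync block and repeating it periodically."""
    out = bytearray(sync_block(schema))  # the file is initialized with a marker
    since_sync = 0
    for rec in records:
        if since_sync >= interval:
            out += sync_block(schema)
            since_sync = 0
        framed = RECORD_TAG + struct.pack(">I", len(rec)) + rec
        out += framed
        since_sync += len(framed)
    return bytes(out)


def _framed_end(data: bytes, off: int):
    """End offset of the length-prefixed field at off, or None if truncated."""
    if off + 4 > len(data):
        return None
    (n,) = struct.unpack_from(">I", data, off)
    end = off + 4 + n
    return end if end <= len(data) else None


def read_log(data: bytes):
    """Return (schema, records), skipping anything that cannot be parsed."""
    schema, records = None, []
    pos = data.find(SYNC)                # scan from the start for a marker
    while 0 <= pos < len(data):
        if data.startswith(SYNC, pos):
            body = pos + len(SYNC)
            end = _framed_end(data, body)
            if end is not None:
                schema = data[body + 4:end]   # schema copy stored with marker
                pos = end
                continue
        elif data[pos:pos + 1] == RECORD_TAG:
            end = _framed_end(data, pos + 1)
            if end is not None:
                records.append(data[pos + 5:end])
                pos = end
                continue
        # Unparseable bytes: give up on this region and resync at next marker.
        pos = data.find(SYNC, pos + 1)
    return schema, records
```

In this sketch a damaged record costs every edit up to the next marker, so 
the marker interval bounds how much is lost per corruption. Because each 
marker carries the schema, the reader does not depend on the file header 
having survived.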

> Serialize WAL as Avro records
> -----------------------------
>
>                 Key: HBASE-2055
>                 URL: https://issues.apache.org/jira/browse/HBASE-2055
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: Andrew Purtell
>            Priority: Minor
>         Attachments: HBASE-2055-v2.patch, HBASE-2055-v3.patch, 
> HBASE-2055.patch, jackson-core-asl-1.0.1.jar, jackson-mapper-asl-1.0.1.jar, 
> paranamer-1.5.jar, 
> TEST-org.apache.hadoop.hbase.regionserver.wal.TestHLog.txt.gz, 
> TEST-org.apache.hadoop.hbase.regionserver.wal.TestLogRolling.txt.gz, 
> TEST-org.apache.hadoop.hbase.TestFullLogReconstruction.txt.gz, test-site.patch
>
>
> There was some advocacy of using Avro for serialization of HBase WAL records 
> up on hbase-...@. The idea is that Hadoop core is moving away from Writables 
> and Avro is the blessed replacement. 
> I think we have these criteria for its use:
> 1) Performance of writing Avro records is no worse than that for writing 
> Writables into a SequenceFile.
> 2) Space consumed by Avro serialization is no worse than that of Writables.
> 3) File format is amenable to appends (cannot require valid trailers, etc.)
> I'll put up a patch so we can try it out. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
