[ 
https://issues.apache.org/jira/browse/HDFS-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3023:
------------------------------

    Attachment: hdfs-3023-HDFS-1623.txt

The attached patch implements optimizations 1-3 from the issue description. I 
didn't do optimization #4, since it's a bit more complicated and will only 
really help with files that are several blocks long. I'd like to leave it for 
future work.

The patch is against the HA branch, since the branch is soon to be merged and 
the code around OP_ADD, etc., differs a bit there. Rather than do the work 
twice, I figured I'd just work on the HA branch.

I'll do a round of benchmarks on this patch tomorrow.
                
> Optimize entries in edits log for persistBlocks calls
> -----------------------------------------------------
>
>                 Key: HDFS-3023
>                 URL: https://issues.apache.org/jira/browse/HDFS-3023
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node, performance
>    Affects Versions: HA branch (HDFS-1623), 0.23.2
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-3023-HDFS-1623.txt
>
>
> One of the performance issues noticed in the HA branch is due to the much 
> larger edit logs, now that we are writing OP_ADD transactions to the edit log 
> on every block allocation. We can condense these calls down in several ways:
> 1) use variable-length integers for the block list length, size, and genstamp 
> (most of these end up fitting in far less than 8 bytes)
> 2) use delta-coding for the genstamp and block size for any blocks after the 
> first block (most blocks will be the same size and only slightly higher 
> genstamps)
> 3) introduce a new OP_UPDATE_BLOCKS transaction that doesn't re-serialize 
> metadata information like lease owner, permissions, etc
> 4) allow OP_UPDATE_BLOCKS to only re-serialize the blocks that have changed 
> for a given transaction
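
To make items 1 and 2 concrete, here is a minimal Java sketch of a block list written with variable-length integers and delta-coding. It is an illustration only, not the code in the attached patch: the class name BlockListCodec, the {numBytes, genStamp} array layout, and the zigzag step (used so that a shorter final block, which yields a negative size delta, still encodes compactly) are all assumptions made for this example. Block IDs are left out for brevity.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical helper for illustration; not part of HDFS.
class BlockListCodec {

    // Zigzag-map the value so small negative numbers stay small,
    // then emit 7 bits per byte, low bits first, high bit = "more follows".
    static void writeVLong(DataOutputStream out, long value) throws IOException {
        long v = (value << 1) ^ (value >> 63);
        while ((v & ~0x7FL) != 0) {
            out.writeByte((int) ((v & 0x7F) | 0x80));
            v >>>= 7;
        }
        out.writeByte((int) v);
    }

    static long readVLong(DataInputStream in) throws IOException {
        long v = 0;
        int shift = 0;
        while (true) {
            int b = in.readUnsignedByte();
            v |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break;
            }
            shift += 7;
        }
        return (v >>> 1) ^ -(v & 1); // undo the zigzag mapping
    }

    // Each row is {numBytes, genStamp}. The first block is coded as a delta
    // from zero; every later block is coded as a delta from its predecessor.
    static byte[] encode(long[][] blocks) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        writeVLong(out, blocks.length);
        long prevSize = 0;
        long prevStamp = 0;
        for (long[] b : blocks) {
            writeVLong(out, b[0] - prevSize);
            writeVLong(out, b[1] - prevStamp);
            prevSize = b[0];
            prevStamp = b[1];
        }
        out.flush();
        return buf.toByteArray();
    }

    static long[][] decode(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int n = (int) readVLong(in);
        long[][] blocks = new long[n][2];
        long prevSize = 0;
        long prevStamp = 0;
        for (int i = 0; i < n; i++) {
            prevSize += readVLong(in);
            prevStamp += readVLong(in);
            blocks[i][0] = prevSize;
            blocks[i][1] = prevStamp;
        }
        return blocks;
    }

    public static void main(String[] args) throws IOException {
        long blockSize = 128L * 1024 * 1024;
        long[][] blocks = {{blockSize, 1001}, {blockSize, 1002}, {blockSize, 1003}};
        int fixedWidth = 4 + blocks.length * 16; // int count + two 8-byte longs per block
        System.out.println("fixed-width bytes: " + fixedWidth);
        System.out.println("varint+delta bytes: " + encode(blocks).length);
    }
}
```

In this sketch, every block after the first costs two bytes when its size matches the previous block and its genstamp is one higher, compared with sixteen bytes for two fixed-width longs.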

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
