[
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228976#comment-13228976
]
stack commented on HBASE-4608:
------------------------------
bq. But we don't know if the current dictionary compression API is general
enough to cover the new compression type.
Agree that we don't know what the future will bring. Not going to try.
bq. But the last paragraph above hinges on the scenario of keeping the same WAL
version when new compression type is added.
Yes, that's one possible scenario. There are others where we need to change the
version. We can deal with those when we get there.
bq. Suppose we find a way to improve dictionary compression after the
integration of this JIRA. Would WAL version increase or stay at 1 ?
If the API doesn't change, there is no need to up the global file version. We
could just add a new, improved dictionary compression type.
If we need to change the API, then we'll need to change the global version. At
the same time we might add some other facility that has nothing to do with
compression -- say, we might decide to intersperse markers for when we flush or
compact. We'd likely bump the version by one point only, though. This new
version would indicate that the WAL can now do the extended compression API AND
includes flush and compaction markers. We could bump the version once per
feature added, but that buys us nothing; it's the version we ship that counts,
the accumulation of features since the last time we shipped.
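To make the versioning argument concrete, here is a rough sketch in Java. It is not taken from any of the attached patches: the class, the enum constants, and version 2 itself are made-up names for illustration. It only shows the rule described above: a new compression type that fits the existing API stays at version 1, while an API change and an unrelated feature that ship together share a single bump to version 2.

```java
import java.util.EnumSet;

/** Hypothetical sketch; none of these names come from the HBASE-4608 patches. */
final class WalVersionSketch {

    /** Assumed compression types; only EXTENDED needs a changed API. */
    enum CompressionType { NONE, DICTIONARY, IMPROVED_DICTIONARY, EXTENDED }

    /** Assumed features a shipped WAL version can imply. */
    enum Feature { DICTIONARY_COMPRESSION_API, EXTENDED_COMPRESSION_API, FLUSH_AND_COMPACTION_MARKERS }

    /** Lowest global file version able to carry the given compression type. */
    static int requiredVersion(CompressionType type) {
        switch (type) {
            case EXTENDED:
                return 2; // API changed, so the global version must change
            default:
                return 1; // same API, so a new type needs no version bump
        }
    }

    /** Everything that accumulated up to a shipped version. */
    static EnumSet<Feature> featuresOf(int version) {
        EnumSet<Feature> features = EnumSet.noneOf(Feature.class);
        if (version >= 1) {
            features.add(Feature.DICTIONARY_COMPRESSION_API);
        }
        if (version >= 2) {
            // Two unrelated features, one bump: only the shipped version counts.
            features.add(Feature.EXTENDED_COMPRESSION_API);
            features.add(Feature.FLUSH_AND_COMPACTION_MARKERS);
        }
        return features;
    }

    /** A reader can handle any file whose version is not newer than its own. */
    static boolean canRead(int fileVersion, int readerVersion) {
        return fileVersion <= readerVersion;
    }

    public static void main(String[] args) {
        for (CompressionType type : CompressionType.values()) {
            System.out.println(type + " needs WAL version " + requiredVersion(type));
        }
        System.out.println("version 2 implies: " + featuresOf(2));
    }
}
```

With this shape a reader compares one integer against the file header and never has to reason about per-feature version numbers.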
> HLog Compression
> ----------------
>
> Key: HBASE-4608
> URL: https://issues.apache.org/jira/browse/HBASE-4608
> Project: HBase
> Issue Type: New Feature
> Reporter: Li Pi
> Assignee: stack
> Fix For: 0.94.0
>
> Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt,
> 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt,
> 4608v18.txt, 4608v23.txt, 4608v24.txt, 4608v25.txt, 4608v5.txt, 4608v6.txt,
> 4608v7.txt, 4608v8fixed.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends
> across different datanodes. We can speed up this process by compressing the
> HLog. Current plan involves using a dictionary to compress table name, region
> id, cf name, and possibly other bits of repeated data. Also, HLog format may
> be changed in other ways to produce a smaller HLog.
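For readers new to the idea in the description above, the following is a minimal Java sketch of dictionary compression for repeated fields such as table name, region id, and cf name. It is an illustration under assumptions, not the format used by the attached patches: the class name, the -1 marker, and the two-byte index are all invented here, and the sketch ignores dictionary eviction and fields longer than 65535 bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of dictionary compression; not the HBASE-4608 format. */
final class DictionarySketch {

    /** Marker meaning "the literal bytes follow and get the next index". */
    static final short NOT_IN_DICTIONARY = -1;

    private static void writeShort(ByteArrayOutputStream out, int value) {
        out.write(value >> 8); // high byte first (big-endian)
        out.write(value);      // low byte
    }

    /** First sighting of a field is written in full; repeats cost two bytes. */
    static byte[] encode(List<String> fields) {
        Map<String, Short> dictionary = new HashMap<>();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String field : fields) {
            Short index = dictionary.get(field);
            if (index != null) {
                writeShort(out, index);
            } else {
                byte[] raw = field.getBytes(StandardCharsets.UTF_8);
                writeShort(out, NOT_IN_DICTIONARY);
                writeShort(out, raw.length);
                out.write(raw, 0, raw.length);
                dictionary.put(field, (short) dictionary.size());
            }
        }
        return out.toByteArray();
    }

    /** Rebuilds the same dictionary while reading, so it is never stored. */
    static List<String> decode(byte[] encoded) {
        List<String> dictionary = new ArrayList<>();
        List<String> fields = new ArrayList<>();
        ByteBuffer in = ByteBuffer.wrap(encoded);
        while (in.hasRemaining()) {
            short index = in.getShort();
            if (index == NOT_IN_DICTIONARY) {
                byte[] raw = new byte[in.getShort() & 0xFFFF];
                in.get(raw);
                String field = new String(raw, StandardCharsets.UTF_8);
                dictionary.add(field);
                fields.add(field);
            } else {
                fields.add(dictionary.get(index));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList(
            "usertable", "region-1", "cf",
            "usertable", "region-1", "cf");
        byte[] encoded = encode(fields);
        System.out.println(encoded.length + " bytes, round trip ok: "
            + decode(encoded).equals(fields));
    }
}
```

Because the writer and the reader build the dictionary in the same order, only the first occurrence of each value travels over the wire in full, which is where the savings on replicated appends would come from.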