[
https://issues.apache.org/jira/browse/HBASE-5674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242115#comment-13242115
]
He Yongqiang commented on HBASE-5674:
-------------------------------------
Okay. Now I need to publicly admit my lack of a sense of humor. :)
Here is the real problem:
In our use case, the space the data occupies *really* matters. We need to find
everything we can do to bring the size down as much as possible.
We do not want to bring in LZMA or bzip2 compression, as
they are really slow. In my simple test, 41MB of data was reduced to 32MB
after I rewrote the HBase Long timestamp to zero. The 8-byte Long timestamp is
heavy because it is a binary system timestamp, which makes it very hard to
compress (MemstoreTS is also a Long timestamp, but it causes no problem
because it will eventually be zero). And if you look at how the data is used,
most applications never use the timestamp when it is system
generated (not specified by the application). The reason to make this
configurable is that some applications do specify it. In that case,
HBase cannot modify that data. But the many other applications
that do not care about this data should not suffer this problem if data size
really matters to them.
I think this could benefit other community members, as they may hit this problem
when they want to decrease their data size.
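The effect described in this comment can be reproduced on synthetic data. The sketch below is only an illustration, not the test that produced the 41MB and 32MB figures: the record layout, the helper names `build_records` and `compressed_size`, the starting timestamp, and the use of zlib as the codec are all assumptions made for the example.

```python
import random
import struct
import zlib


def build_records(n, zero_timestamps, seed=0):
    """Build n toy records: 12-byte row key + 8-byte timestamp + 1-byte value.

    The layout is hypothetical and much simpler than a real HBase KeyValue.
    """
    rng = random.Random(seed)
    ts = 1_333_000_000_000  # assumed starting point, milliseconds since epoch
    out = bytearray()
    for i in range(n):
        ts += rng.randint(0, 50)  # writes arrive a few milliseconds apart
        stamp = 0 if zero_timestamps else ts
        out += b"row-%08d" % i          # sequential row key
        out += struct.pack(">q", stamp)  # 8-byte big-endian long
        out += b"v"                      # tiny value
    return bytes(out)


def compressed_size(data):
    """Size in bytes after zlib compression (stand-in for the real codec)."""
    return len(zlib.compress(data, 6))


raw_stamped = build_records(10000, zero_timestamps=False)
raw_zeroed = build_records(10000, zero_timestamps=True)
stamped = compressed_size(raw_stamped)
zeroed = compressed_size(raw_zeroed)

print("raw bytes (both cases):", len(raw_stamped))
print("compressed with system timestamps:", stamped)
print("compressed with zeroed timestamps:", zeroed)
```

Both inputs have the same raw size; only the compressed sizes differ, because the low-order bytes of a millisecond timestamp change unpredictably from record to record while a run of zero bytes compresses to almost nothing. The actual savings on real HBase data depend on the codec, the block size, and the key and value contents.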
> add support in HBase to overwrite hbase timestamp to a version number during
> major compaction
> ---------------------------------------------------------------------------------------------
>
> Key: HBASE-5674
> URL: https://issues.apache.org/jira/browse/HBASE-5674
> Project: HBase
> Issue Type: Improvement
> Reporter: He Yongqiang
> Assignee: He Yongqiang
>
> Right now, a millisecond-level timestamp is attached to every record.
> In our case, we only need a version number (mostly it will just be zero).
> A millisecond timestamp is too heavy to carry. We should add support to
> overwrite it with zero during major compaction.
> KVs before major compaction will keep using the system timestamp. This
> should be configurable, so that we do not break anything when the HBase
> timestamp is specified by the application.