[ 
https://issues.apache.org/jira/browse/HBASE-5674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242115#comment-13242115
 ] 

He Yongqiang commented on HBASE-5674:
-------------------------------------

Okay. Now I need to publicly admit my lack of a sense of humor. :)

Here is the real problem:
In our use case, the space the data occupies *really* matters. We need to find 
everything we can do to bring the size down as much as possible. 
Clearly we do not want to bring in LZMA or bzip2 compression, as they are 
really slow. In my simple test, 41MB of data was reduced to 32MB after I 
rewrote the HBase Long timestamp to zero. The 8-byte Long timestamp is heavy 
because it is a binary system timestamp, which makes it very hard to compress 
(MemstoreTS is also a Long timestamp, but there is no problem with it, as it 
will be zero eventually). And if you look at how that data is used, most 
applications do not use it at all when it is system generated (not specified 
by the application). A good reason to make this configurable is that some 
applications do specify the timestamp. In that case, HBase cannot modify that 
data. But the many other applications that do not care about this data should 
not suffer from this problem if data size really matters to them. 
I think this could benefit other community members, as they may hit this 
problem when they want to decrease their data size. 
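
To illustrate why a system timestamp compresses poorly, here is a minimal 
sketch in Java. It is not HBase code: the record layout (12-byte row key, 
8-byte timestamp, 4-byte value), the class name, and the helper names are all 
assumptions made for this example, and it uses java.util.zip.Deflater rather 
than any codec HBase is configured with. It only shows the general effect of 
replacing a changing 8-byte value with zero.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Random;
import java.util.zip.Deflater;

class TimestampCompressionSketch {
    // Hypothetical record layout (NOT the HBase KeyValue format):
    // 12-byte ASCII row key, 8-byte long timestamp, 4-byte int value.
    static final int RECORD_SIZE = 12 + 8 + 4;

    // Builds `count` synthetic records; timestamps are either simulated
    // millisecond clock readings or rewritten to zero.
    public static byte[] buildRecords(int count, boolean zeroTimestamps) {
        Random rng = new Random(42);   // fixed seed: both variants see the same clock
        long clock = 1333000000000L;   // arbitrary starting point in epoch millis
        ByteBuffer buf = ByteBuffer.allocate(count * RECORD_SIZE);
        for (int i = 0; i < count; i++) {
            clock += rng.nextInt(1000); // simulated gap between writes
            String rowKey = String.format(Locale.ROOT, "row-%08d", i);
            buf.put(rowKey.getBytes(StandardCharsets.US_ASCII));
            buf.putLong(zeroTimestamps ? 0L : clock);
            buf.putInt(i % 16);
        }
        return buf.array();
    }

    // Returns the number of bytes produced by DEFLATE at the default level.
    public static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] chunk = new byte[8192];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(chunk);
        }
        deflater.end();
        return total;
    }

    public static void main(String[] args) {
        byte[] withTimestamps = buildRecords(100_000, false);
        byte[] zeroed = buildRecords(100_000, true);
        System.out.println("raw bytes (both variants): " + withTimestamps.length);
        System.out.println("deflated, system timestamps: " + deflatedSize(withTimestamps));
        System.out.println("deflated, zeroed timestamps: " + deflatedSize(zeroed));
    }
}
```

Both variants have the same raw size, so any difference in the printed 
compressed sizes comes from the timestamp bytes alone. The numbers depend on 
the synthetic data and should not be read as a prediction for a real table.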


                
> add support in HBase to overwrite hbase timestamp to a version number during 
> major compaction
> ---------------------------------------------------------------------------------------------
>
>                 Key: HBASE-5674
>                 URL: https://issues.apache.org/jira/browse/HBASE-5674
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>
> Right now, a millisecond-level timestamp is attached to every record. 
> In our case, we only need a version number (mostly it will just be zero). 
> A millisecond timestamp is too heavy to carry. We should add support to 
> overwrite it with zero during major compaction. 
> KVs that have not yet been through major compaction will keep using the 
> system timestamp. This should be configurable, so that we do not break 
> cases where the HBase timestamp is specified by the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira