[ https://issues.apache.org/jira/browse/HBASE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900895#comment-15900895 ]

Allan Yang edited comment on HBASE-17757 at 3/8/17 8:25 AM:
------------------------------------------------------------

{quote}
In case where both DBE and compression in use, the size we track will be after 
compression also. And what we keep in cache is uncompressed blocks (By default. 
There is config to keep compressed also).. So the math will go wrong there?
{quote}
Sorry, I don't quite catch your question. As far as I know, compression happens 
after we finish writing a block, so it is hard to unify the blocksize after 
compression (we don't know when to finish a block, since we don't know its size 
after compression). On the other hand, unifying the blocksize only works if the 
encoding happens 'on the fly', so unifying the blocksize for encoding algorithms 
like prefix-tree is not possible either.
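
To illustrate that point, here is a minimal sketch (hypothetical interfaces, not 
the actual HFile writer API): unifying the block size is only possible when the 
encoder can report its output size right after each cell is appended.
{code:java}
// Hypothetical sketch: cutting blocks at a unified encoded size only works
// when the encoder reports its output size "on the fly". With compression,
// or with a whole-block encoder like prefix-tree, the final size is only
// known after the block is finished, so this check cannot be made.
import org.apache.hadoop.hbase.Cell;

interface OnTheFlyEncoder {
  /** Encode one cell into the current block and return bytes appended. */
  int encode(Cell cell);
  /** Encoded bytes written so far for the current block. */
  int encodedSize();
}

class UnifiedSizeBlockWriter {
  private final OnTheFlyEncoder encoder;
  private final int targetBlockSize; // e.g. 64 * 1024

  UnifiedSizeBlockWriter(OnTheFlyEncoder encoder, int targetBlockSize) {
    this.encoder = encoder;
    this.targetBlockSize = targetBlockSize;
  }

  void append(Cell cell) {
    encoder.encode(cell);
    // The block can only be cut at a unified size because the encoded size
    // is known immediately after each cell.
    if (encoder.encodedSize() >= targetBlockSize) {
      finishCurrentBlock();
    }
  }

  private void finishCurrentBlock() {
    // flush the encoded bytes, reset the encoder, start a new block, ...
  }
}
{code}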

{quote}
How abt thinking the block size limit to be a hard limit than a soft one?
{quote}
It is hard to do so. As far as I know, a single row must remain in one 
block (correct me if I'm wrong), so if a single row's size is bigger than the 
blocksize, then that single block's size will go beyond our limit.
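
A simplified sketch of why the limit stays soft (hypothetical helper names, not 
the actual HFileWriterImpl code): the size check can only run between rows, so a 
row bigger than the configured blocksize still lands in a single oversized block.
{code:java}
// Simplified sketch: the block-size check only runs at row boundaries, and
// a row is never split across blocks, so the configured blocksize is a soft
// limit. A single row larger than blockSize produces one oversized block.
import java.util.List;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;

class SoftLimitWriter {
  private final int blockSize;      // configured target, e.g. 64 * 1024
  private int currentBlockSize = 0; // bytes in the block being written

  SoftLimitWriter(int blockSize) {
    this.blockSize = blockSize;
  }

  void writeRow(List<Cell> cellsOfOneRow) {
    for (Cell cell : cellsOfOneRow) {
      // Every cell of the row goes into the current block, no matter what.
      currentBlockSize += CellUtil.estimatedSerializedSizeOf(cell);
    }
    // Only now, after the whole row is in, can we decide to close the block.
    // If this single row was already bigger than blockSize, the block has
    // already overshot the limit.
    if (currentBlockSize >= blockSize) {
      finishCurrentBlock();
    }
  }

  private void finishCurrentBlock() {
    currentBlockSize = 0; // flush and start a new block (omitted)
  }
}
{code}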



> Unify blocksize after encoding to decrease memory fragment 
> -----------------------------------------------------------
>
>                 Key: HBASE-17757
>                 URL: https://issues.apache.org/jira/browse/HBASE-17757
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>         Attachments: HBASE-17757.patch
>
>
> Usually, we store encoded (uncompressed) blocks in the blockcache/BucketCache. 
> Though we have set the blocksize, after encoding the block size varies. Varied 
> block sizes cause memory fragmentation, which eventually results in more full 
> GCs. In order to relieve the memory fragmentation, this issue adjusts the 
> encoded block to a unified size.
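
For reference, a rough sketch of the idea (not the attached HBASE-17757.patch; 
the fixed bucket sizes here are an assumption): pad the encoded block up to the 
nearest of a few fixed sizes before caching, so the cache only ever allocates a 
handful of distinct sizes.
{code:java}
// Hypothetical illustration of "unify blocksize after encoding": round the
// varied encoded-block size up to one of a few fixed sizes before caching,
// so the blockcache/BucketCache only sees a handful of allocation sizes and
// memory fragmentation (and eventually full GC pressure) is reduced.
import java.nio.ByteBuffer;

public final class UnifiedBlockSizer {
  // Assumed unified sizes; the real values would come from configuration.
  private static final int[] UNIFIED_SIZES = {16 * 1024, 32 * 1024, 64 * 1024, 128 * 1024};

  public static ByteBuffer padToUnifiedSize(ByteBuffer encodedBlock) {
    int actual = encodedBlock.remaining();
    for (int unified : UNIFIED_SIZES) {
      if (actual <= unified) {
        // Allocate the unified size; the limit still marks the real data end,
        // but the backing allocation now has one of a few fixed capacities.
        ByteBuffer padded = ByteBuffer.allocate(unified);
        padded.put(encodedBlock.duplicate());
        padded.flip();
        return padded;
      }
    }
    // Bigger than the largest unified size (e.g. a huge single row): keep as-is.
    return encodedBlock;
  }
}
{code}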



