[ https://issues.apache.org/jira/browse/HBASE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900895#comment-15900895 ]
Allan Yang edited comment on HBASE-17757 at 3/8/17 8:25 AM:
------------------------------------------------------------

{quote}
In the case where both DBE and compression are in use, the size we track will be the size after compression. And what we keep in the cache is uncompressed blocks (by default; there is a config to keep compressed blocks too). So the math will go wrong there?
{quote}

Sorry, I don't quite catch your question. As far as I know, compression happens after a block is finished being written. So it is hard to unify the block size after compression (we don't know when to finish a block, since we don't know its size after compression). On the other hand, unifying the block size only works if the encoding happens on the fly, so unifying the block size for encoding algorithms like prefix-tree is not possible either.

{quote}
How about treating the block size limit as a hard limit rather than a soft one?
{quote}

It is hard to do so. As far as I know, a single row must remain in one block (correct me if I'm wrong), so if a single row's size is bigger than the block size, that block's size will exceed our limit.

> Unify blocksize after encoding to decrease memory fragment
> -----------------------------------------------------------
>
>                 Key: HBASE-17757
>                 URL: https://issues.apache.org/jira/browse/HBASE-17757
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>     Attachments: HBASE-17757.patch
>
>
> Usually, we store encoded (uncompressed) blocks in the blockcache/bucketcache. Though we have set a block size, after encoding the block size varies. Varied block sizes cause memory fragmentation, which ultimately results in more full GCs. To relieve the memory fragmentation, this issue adjusts encoded blocks to a unified size.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
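To illustrate the idea discussed above, here is a minimal sketch of how an encoded block could be rounded up to a unified size before caching, so that cache allocations come in only a few distinct sizes. This is not the HBASE-17757 patch itself; `UNIFIED_SIZE` and `padToUnifiedSize` are hypothetical names chosen for the example, and zero-padding is just one possible strategy.

```java
import java.util.Arrays;

public class BlockPadding {
    // Hypothetical unified size (64 KB): every cached block buffer
    // becomes a multiple of this, so the heap sees far fewer
    // distinct allocation sizes and fragments less.
    static final int UNIFIED_SIZE = 64 * 1024;

    // Round the encoded block's length up to the next multiple of
    // UNIFIED_SIZE, padding the tail with zeros.
    static byte[] padToUnifiedSize(byte[] encodedBlock) {
        int padded = ((encodedBlock.length + UNIFIED_SIZE - 1) / UNIFIED_SIZE) * UNIFIED_SIZE;
        return Arrays.copyOf(encodedBlock, padded);
    }

    public static void main(String[] args) {
        byte[] block = new byte[50_000]; // an encoded block a bit under 64 KB
        byte[] padded = padToUnifiedSize(block);
        System.out.println(padded.length); // 65536
    }
}
```

Note that this only works when the block size is known at write time, which is why (as the comment argues) the same trick cannot be applied after compression: the compressed size is not known until the block is already closed.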