[
https://issues.apache.org/jira/browse/HBASE-16594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15476744#comment-15476744
]
binlijin edited comment on HBASE-16594 at 9/12/16 2:10 AM:
-----------------------------------------------------------
Performance on a single RegionServer:
{code}
BlockSize=8K DATA_BLOCK_ENCODING => 'NONE' (CPU 4/42) 37k
BlockSize=16K DATA_BLOCK_ENCODING => 'NONE' (CPU 3/41) 41k
BlockSize=32K DATA_BLOCK_ENCODING => 'NONE' (CPU 3/45) 43k
BlockSize=64K DATA_BLOCK_ENCODING => 'NONE' (CPU 3/46) 36k
BlockSize=32K DATA_BLOCK_ENCODING => 'ROW_INDEX_V1' (CPU 4/45) 45k
BlockSize=32K DATA_BLOCK_ENCODING => 'ROW_INDEX_V2' (CPU 4/48) 64k
(CPU 4/42) means System CPU 4%, User CPU 42%.
{code}
> ROW_INDEX_V2 DBE
> ----------------
>
> Key: HBASE-16594
> URL: https://issues.apache.org/jira/browse/HBASE-16594
> Project: HBase
> Issue Type: Sub-task
> Components: Performance
> Reporter: binlijin
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16594-master_v1.patch, HBASE-16594-master_v2.patch
>
>
> See HBASE-16213, ROW_INDEX_V1 DataBlockEncoding.
> ROW_INDEX_V1 is the first version and has no storage optimization.
> ROW_INDEX_V2 adds storage optimization: it stores every row only once, and
> stores the column family only once per HFileBlock.
> ROW_INDEX_V1 is:
> /**
> * Store the cells, followed by every row's start offset, so we can binary
> * search to a row's cells.
> *
> * Format:
> * flat cells
> * integer: number of rows
> * integer: row0's offset
> * integer: row1's offset
> * ....
> * integer: dataSize
> *
> */
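
The ROW_INDEX_V1 layout above can be illustrated with a minimal sketch. This is not HBase's encoder: the class name `RowIndexV1Sketch`, its methods, and the simplified cell layout (a length-prefixed row key followed by a length-prefixed value) are all assumptions made for illustration. Only the trailer (number of rows, one offset per row, dataSize) follows the format described in the issue.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the ROW_INDEX_V1 layout; not the HBase implementation.
final class RowIndexV1Sketch {

    /** Encodes {row, value} pairs that are already sorted by row. */
    static byte[] encode(String[][] cells) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        List<Integer> rowOffsets = new ArrayList<>();
        String previousRow = null;
        for (String[] cell : cells) {
            if (!cell[0].equals(previousRow)) {
                rowOffsets.add(out.size());   // first cell of a new row
                previousRow = cell[0];
            }
            writeField(out, cell[0]);         // V1 repeats the row key in every cell
            writeField(out, cell[1]);
        }
        int dataSize = out.size();            // end of the flat cells
        writeInt(out, rowOffsets.size());     // integer: number of rows
        for (int offset : rowOffsets) {
            writeInt(out, offset);            // integer: rowN's offset
        }
        writeInt(out, dataSize);              // integer: dataSize
        return out.toByteArray();
    }

    /** Binary searches the row index; returns the row's start offset or -1. */
    static int seekRow(byte[] block, String row) {
        ByteBuffer buf = ByteBuffer.wrap(block);
        int dataSize = buf.getInt(block.length - 4);
        int numRows = buf.getInt(dataSize);
        int low = 0;
        int high = numRows - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            int offset = buf.getInt(dataSize + 4 + 4 * mid);
            // Assumption: String ordering stands in for HBase's byte-wise row comparison.
            int cmp = readField(buf, offset).compareTo(row);
            if (cmp == 0) {
                return offset;
            }
            if (cmp < 0) {
                low = mid + 1;
            } else {
                high = mid - 1;
            }
        }
        return -1;
    }

    private static void writeInt(ByteArrayOutputStream out, int v) {
        out.write(ByteBuffer.allocate(4).putInt(v).array(), 0, 4);
    }

    private static void writeField(ByteArrayOutputStream out, String s) {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        writeInt(out, b.length);
        out.write(b, 0, b.length);
    }

    private static String readField(ByteBuffer buf, int offset) {
        int len = buf.getInt(offset);
        return new String(buf.array(), offset + 4, len, StandardCharsets.UTF_8);
    }
}
```

Reading dataSize from the last four bytes locates the index, so a seek touches O(log n) row keys instead of scanning the flat cells.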
> ROW_INDEX_V2 is:
> * row1 qualifier timestamp type value tag
> * qualifier timestamp type value tag
> * qualifier timestamp type value tag
> * row2 qualifier timestamp type value tag
> * row3 qualifier timestamp type value tag
> * qualifier timestamp type value tag
> * ....
> * integer: number of rows
> * integer: row0's offset
> * integer: row1's offset
> * ....
> * column family
> * integer: dataSize
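
A comparable sketch for the ROW_INDEX_V2 layout, under the same caveats: `RowIndexV2Sketch` and its methods are hypothetical, timestamp, type and tags are omitted, and every field is length-prefixed, which is an assumption rather than something the issue specifies. It shows the two savings described above: the row key is written once per row, and the column family once per block.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the ROW_INDEX_V2 layout; not the HBase implementation.
final class RowIndexV2Sketch {

    /** Encodes {row, qualifier, value} triples sorted by row, all in one family. */
    static byte[] encode(String family, String[][] cells) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        List<Integer> rowOffsets = new ArrayList<>();
        String previousRow = null;
        for (String[] cell : cells) {
            if (!cell[0].equals(previousRow)) {
                rowOffsets.add(out.size());
                writeField(out, cell[0]);     // row key stored once per row
                previousRow = cell[0];
            }
            writeField(out, cell[1]);         // qualifier
            writeField(out, cell[2]);         // value
        }
        int dataSize = out.size();
        writeInt(out, rowOffsets.size());     // integer: number of rows
        for (int offset : rowOffsets) {
            writeInt(out, offset);            // integer: rowN's offset
        }
        writeField(out, family);              // column family stored once per block
        writeInt(out, dataSize);              // integer: dataSize
        return out.toByteArray();
    }

    /** Decodes the block into "row/family:qualifier=value" strings. */
    static List<String> decode(byte[] block) {
        ByteBuffer buf = ByteBuffer.wrap(block);
        int dataSize = buf.getInt(block.length - 4);
        int numRows = buf.getInt(dataSize);
        int indexStart = dataSize + 4;
        String family = readField(buf, indexStart + 4 * numRows);
        List<String> cells = new ArrayList<>();
        for (int i = 0; i < numRows; i++) {
            int pos = buf.getInt(indexStart + 4 * i);
            // A row's cells end where the next row starts, or at dataSize.
            int end = (i + 1 < numRows) ? buf.getInt(indexStart + 4 * (i + 1)) : dataSize;
            String row = readField(buf, pos);
            pos += 4 + buf.getInt(pos);
            while (pos < end) {
                String qualifier = readField(buf, pos);
                pos += 4 + buf.getInt(pos);
                String value = readField(buf, pos);
                pos += 4 + buf.getInt(pos);
                cells.add(row + "/" + family + ":" + qualifier + "=" + value);
            }
        }
        return cells;
    }

    private static void writeInt(ByteArrayOutputStream out, int v) {
        out.write(ByteBuffer.allocate(4).putInt(v).array(), 0, 4);
    }

    private static void writeField(ByteArrayOutputStream out, String s) {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        writeInt(out, b.length);
        out.write(b, 0, b.length);
    }

    private static String readField(ByteBuffer buf, int offset) {
        int len = buf.getInt(offset);
        return new String(buf.array(), offset + 4, len, StandardCharsets.UTF_8);
    }
}
```

In this sketch the number of cells in a row is implied by the gap between consecutive row offsets, so no per-row cell count is stored.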