[jira] [Commented] (HBASE-19506) Support variable sized chunks from ChunkCreator

Anastasia Braginsky (JIRA) Wed, 03 Jan 2018 02:11:27 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-19506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309415#comment-16309415
 ]


Anastasia Braginsky commented on HBASE-19506:
---------------------------------------------

We are now coming back to this idea. Looking deeper into the details: the size 
of a cell-representation is 20 bytes and the chunk size is 2MB (2097152 bytes), 
so one chunk can hold the representations of about 104857 cells. 

How many cells are inserted before an in-memory flush depends heavily on the 
workload. However, to get a rough average, let's say the cell size is 1KB and 
we flush in-memory every 12.8MB (10% of 128MB); then 12.8MB/1KB = 12.8K, i.e. 
about 12800 cells are written (in this case). The index chunk of such a segment 
is therefore only about 12% full.

After that, every 5 immutable segments in the pipeline are compacted, so 5 
under-utilized index chunks are released and one index chunk with about 64000 
cell-representations (5 x 12800) is allocated, which is about 60% of capacity. 
So it looks like there is indeed some under-utilization of index chunks; 
however, there are at most 5 index chunks per memstore, so the impact may not 
be very significant.
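
As a rough cross-check of the arithmetic above, here is a minimal sketch in 
Java. The class and method names are hypothetical (they are not part of HBase), 
the 12.8MB and 1KB figures are taken in decimal units as in the estimate above, 
and the post-compaction count assumes that compaction drops no cells, so it is 
an upper bound.

```java
// Back-of-the-envelope check of index-chunk utilization.
// Hypothetical helper, not HBase code; constants are the figures quoted above.
final class IndexChunkEstimate {
    static final long CHUNK_SIZE_BYTES = 2L * 1024 * 1024; // 2MB chunk = 2097152 bytes
    static final long CELL_REP_BYTES = 20;                  // one cell-representation

    // Number of whole cell-representations that fit into a chunk.
    static long cellsPerChunk(long chunkBytes) {
        return chunkBytes / CELL_REP_BYTES;
    }

    // Number of cells written between two in-memory flushes.
    static long cellsPerFlush(long flushBytes, long cellBytes) {
        return flushBytes / cellBytes;
    }

    // Fraction of the chunk's capacity used by the given number of cells.
    static double utilization(long cells, long chunkBytes) {
        return (double) cells / cellsPerChunk(chunkBytes);
    }

    public static void main(String[] args) {
        long capacity = cellsPerChunk(CHUNK_SIZE_BYTES);      // 104857
        long perFlush = cellsPerFlush(12_800_000L, 1_000L);   // 12800
        long afterCompaction = 5 * perFlush;                  // 64000 (upper bound)

        System.out.println("capacity per chunk:  " + capacity);
        System.out.println("cells per flush:     " + perFlush
            + " (utilization " + utilization(perFlush, CHUNK_SIZE_BYTES) + ")");
        System.out.println("cells after compact: " + afterCompaction
            + " (utilization " + utilization(afterCompaction, CHUNK_SIZE_BYTES) + ")");
    }
}
```

Under these assumptions a freshly flushed segment uses roughly 12% of its index 
chunk, and the compacted segment roughly 61%.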

As for a solution, we suggest creating another pool for "small" chunks in 
ChunkCreator, say chunks of 256KB. This means we will also need to define a 
new type of chunk. It is very important to avoid on-demand allocation, though: 
this "small-chunks" pool can be pre-allocated and its chunks can be reused.
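
To make the proposal concrete, here is a minimal sketch of a pre-allocated pool 
of fixed-size small chunks. It is not the ChunkCreator implementation: the 
class SmallChunkPool, its methods and the pool size are all hypothetical, and 
plain java.nio.ByteBuffer stands in for the real chunk type. Note that a 256KB 
chunk holds 262144/20 = 13107 cell-representations, so the 12800 cells of the 
example above would fit into one small chunk.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: all chunks are allocated once, up front, and recycled.
final class SmallChunkPool {
    private final int chunkSize;
    private final BlockingQueue<ByteBuffer> free;

    SmallChunkPool(int chunkSize, int poolSize, boolean offheap) {
        this.chunkSize = chunkSize;
        this.free = new ArrayBlockingQueue<>(poolSize);
        // Pre-allocate every chunk so that no allocation happens on the write path.
        for (int i = 0; i < poolSize; i++) {
            free.add(offheap ? ByteBuffer.allocateDirect(chunkSize)
                             : ByteBuffer.allocate(chunkSize));
        }
    }

    // Returns a pooled chunk, or null when the pool is exhausted.
    // It never allocates on demand; the caller decides how to fall back.
    ByteBuffer getChunk() {
        return free.poll();
    }

    // Puts a chunk back for reuse. Buffers of a different size are rejected.
    // This sketch does not guard against returning the same buffer twice.
    boolean putBack(ByteBuffer chunk) {
        if (chunk == null || chunk.capacity() != chunkSize) {
            return false;
        }
        chunk.clear(); // reset position and limit; contents are not zeroed
        return free.offer(chunk);
    }

    int available() {
        return free.size();
    }

    public static void main(String[] args) {
        SmallChunkPool pool = new SmallChunkPool(256 * 1024, 4, true);
        ByteBuffer indexChunk = pool.getChunk();
        System.out.println("free chunks after get: " + pool.available());
        pool.putBack(indexChunk);
        System.out.println("free chunks after put: " + pool.available());
    }
}
```

Returning null on exhaustion keeps the "no on-demand allocation" rule explicit 
in the sketch; a real design could instead fall back to a regular 2MB chunk.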

  

> Support variable sized chunks from ChunkCreator
> -----------------------------------------------
>
>                 Key: HBASE-19506
>                 URL: https://issues.apache.org/jira/browse/HBASE-19506
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Anastasia Braginsky
>
> When a CellChunkMap is created, it allocates a special index chunk (or 
> chunks) where the array of cell-representations is stored. When the number of 
> cell-representations is small, it is preferable to allocate a chunk smaller 
> than the default size, which is 2MB.
> On the other hand, those "non-standard size" chunks cannot be used in the 
> pool, and on-demand allocations off-heap are costly. So this JIRA is about 
> investigating the trade-off between memory usage and the final performance. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
