[ 
https://issues.apache.org/jira/browse/HBASE-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15209746#comment-15209746
 ] 

Anoop Sam John commented on HBASE-15493:
----------------------------------------

[~ram_krish]  When a Mutation comes to the server, we know the total #cells in it, 
but not the per-family count. We don't PB that info, and maybe we should not either.
Yes, it would be best if we could have a new kind of Mutation only for the server. At 
the server we have 2 advantages: we know the #cells in a mutation (at least the total), 
and all the cells are in order, so we don't need a TreeMap at least. If you look at 
a Mutation, the heap size overhead is really big. But we expose Mutation in 
our CPs, and so the FamilyMap, which is a Map. We can discuss more on how to reduce 
these. For a major version we may even be able to break some things on the CP 
side.
[~vrodionov]
When I say addImmutable(byte[] fam, List<Cell>), I am also thinking the user should 
call it only once per family. 
So if all cells are passed together, we may have to split them into the family map and 
again create new Lists. My idea was to reuse the List created by the user. And yes, we 
say it should be used only when the user is ready to hand over the List as immutable.
So they might do something like
{code}
List<Cell> f1Cells = new ArrayList<>(3);
f1Cells.add(CellUtil.createCell(...));
f1Cells.add(CellUtil.createCell(...));
f1Cells.add(CellUtil.createCell(...)); // 3 cells in f1
put.addImmutable(f1, f1Cells);

List<Cell> f2Cells = new ArrayList<>(2);
f2Cells.add(CellUtil.createCell(...));
f2Cells.add(CellUtil.createCell(...)); // 2 cells in f2
put.addImmutable(f2, f2Cells);
{code}
If it is one API taking the cells of all families, they can save 2 lines. But within 
the mutation we need to create a List for each of the CFs, and should we use size = 5?
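
To make that trade-off concrete, here is a minimal sketch of what a single call taking the cells of all families would have to do internally. SimpleCell and splitByFamily are hypothetical stand-ins for illustration, not HBase APIs, and the family is a String here rather than a byte[]. The per-family count is not known up front, so every new list gets the same caller-supplied capacity guess.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FamilySplitSketch {
  /** Hypothetical stand-in for an HBase Cell: family and qualifier only. */
  static final class SimpleCell {
    final String family;
    final String qualifier;

    SimpleCell(String family, String qualifier) {
      this.family = family;
      this.qualifier = qualifier;
    }
  }

  /**
   * Splits one flat list of cells into per-family lists.
   * Each new list is allocated with perFamilyCapacity because the real
   * per-family count is unknown until the whole input has been scanned.
   */
  static Map<String, List<SimpleCell>> splitByFamily(
      List<SimpleCell> cells, int perFamilyCapacity) {
    Map<String, List<SimpleCell>> familyMap = new TreeMap<>();
    for (SimpleCell cell : cells) {
      List<SimpleCell> list = familyMap.get(cell.family);
      if (list == null) {
        // New list per family: the user's original list cannot be reused.
        list = new ArrayList<>(perFamilyCapacity);
        familyMap.put(cell.family, list);
      }
      list.add(cell);
    }
    return familyMap;
  }

  public static void main(String[] args) {
    List<SimpleCell> all = Arrays.asList(
        new SimpleCell("f1", "a"), new SimpleCell("f1", "b"),
        new SimpleCell("f1", "c"),
        new SimpleCell("f2", "a"), new SimpleCell("f2", "b"));
    // Using the total (5) as the guess over-allocates both lists.
    Map<String, List<SimpleCell>> byFamily = splitByFamily(all, all.size());
    System.out.println(byFamily.get("f1").size() + " cells in f1, "
        + byFamily.get("f2").size() + " cells in f2");
  }
}
```

With the per-family variant the user sizes each list exactly (3 and 2) and the mutation can keep the same list objects; with the single-call variant the mutation allocates fresh lists and has to guess their capacity.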

> Default ArrayList size may not be optimal for Mutation
> ------------------------------------------------------
>
>                 Key: HBASE-15493
>                 URL: https://issues.apache.org/jira/browse/HBASE-15493
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client, regionserver
>    Affects Versions: 2.0.0
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>             Fix For: 2.0.0
>
>         Attachments: HBASE-15493-v1.patch, HBASE-15493-v2.patch
>
>
> {code}
>   List<Cell> getCellList(byte[] family) {
>     List<Cell> list = this.familyMap.get(family);
>     if (list == null) {
>       list = new ArrayList<Cell>();
>     }
>     return list;
>   }
> {code}
> Creates a list with the default capacity of 10, which is up to 80 bytes per column 
> family in the mutation object. 
> Suggested:
> {code}
>   List<Cell> getCellList(byte[] family) {
>     List<Cell> list = this.familyMap.get(family);
>     if (list == null) {
>       list = new ArrayList<Cell>(CELL_LIST_INITIAL_CAPACITY);
>     }
>     return list;
>   }
> {code}
> CELL_LIST_INITIAL_CAPACITY = 2 in the patch; this is debatable. For a mutation 
> where every CF has 1 cell, this gives a decent reduction in the memory allocation 
> rate on both client and server during a write workload: ~2%. Not a big number, 
> but as I have already said, memory optimization will include many small steps.
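
The "up to 80 bytes" figure in the description can be checked with a back-of-the-envelope calculation. This is a rough sketch that counts only the reference slots of the ArrayList backing array; the 4-byte and 8-byte reference sizes are assumptions about the JVM (with and without compressed references), and object and array headers are left out. The class and method names are hypothetical.

```java
public class CellListFootprintSketch {
  /** Bytes taken by the reference slots of a backing array (headers excluded). */
  static long slotBytes(int capacity, int bytesPerReference) {
    return (long) capacity * bytesPerReference;
  }

  public static void main(String[] args) {
    int defaultCapacity = 10; // ArrayList's default capacity
    int patchedCapacity = 2;  // CELL_LIST_INITIAL_CAPACITY in the patch
    // Assumed reference sizes: 4 bytes (compressed) and 8 bytes (uncompressed).
    for (int refSize : new int[] {4, 8}) {
      long before = slotBytes(defaultCapacity, refSize);
      long after = slotBytes(patchedCapacity, refSize);
      System.out.println(refSize + "-byte refs: " + before + " -> " + after
          + " bytes per column family");
    }
  }
}
```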



