Xiang Li commented on HBASE-18555:

Hi Mike Drob, 
bq. This seems like the perfect use case for putIfAbsent
Yes, it is, but could be better. Map#putIfAbsent() is like
 V v = map.get(key);
 if (v == null)
     v = map.put(key, value);

 return v;
When the key is associated with a value, it performs get(). When the key is not 
associated with a value, it performs get() and then put(). familyMap is 
TreeMap, which is Red-Black tree. The time complexity of get() and put() is 
quite low, but it is still O(log(n)).
So there could be a better way than putIfAbsent(), like the patch does: only 
put (key=family, value=list) right after the list is allocated.

> Remove redundant familyMap.put() from Put#addColumn() and addImmutable()
> ------------------------------------------------------------------------
>                 Key: HBASE-18555
>                 URL: https://issues.apache.org/jira/browse/HBASE-18555
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client
>            Reporter: Xiang Li
>            Assignee: Xiang Li
>            Priority: Minor
>         Attachments: HBASE-18555.master.000.patch
> In Put#addColumn() and addImmutable(), after getting the cell list of the 
> given family and add the cell into the list, the code puts (key=family, 
> value=list) into familyMap.
> In addColumn(), it is like
> {code}
>     List<Cell> list = getCellList(family);
>     KeyValue kv = createPutKeyValue(family, qualifier, ts, value);
>     list.add(kv);
>     familyMap.put(CellUtil.cloneFamily(kv), list); // <-- here
>     return this;
> {code}
> In addImmutable(), it is like
> {code}
>     List<Cell> list = getCellList(family);
>     KeyValue kv = createPutKeyValue(family, qualifier, ts, value, tag);
>     list.add(kv);
>     familyMap.put(family, list); // <-- here
>     return this;
> {code}
> I think those put() for Map only take effect when getCellList(family) returns 
> a new allocated ArrayList. When the list for a family already exist, put() 
> for Map will update the value to the reference of the list, but actually, the 
> reference of the list is not changed.
> Those put() do not do any harm in terms of the correctness when they are 
> here. But it could be removed to improve the performance. familyMap searches 
> for key and set the new value and return the old value. Those operation take 
> some time but actually are not needed. 
> The put() could be moved into Mutation#getCellList(family)

This message was sent by Atlassian JIRA

Reply via email to