[
https://issues.apache.org/jira/browse/HBASE-18555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121889#comment-16121889
]
Xiang Li edited comment on HBASE-18555 at 8/10/17 4:41 PM:
---
Hi Mike Drob,
bq. This seems like the perfect use case for putIfAbsent
Yes, it is, but could be better. Map#putIfAbsent() is like
{code}
V v = map.get(key);
if (v == null)
v = map.put(key, value);
return v;
{code}
When the key is associated with a value, it performs get(). When the key is not
associated with a value, it performs get() and then put(). familyMap is
TreeMap, which is Red-Black tree. The time complexity of get() and put() is
quite low, but it is still O(logn).
So there could be a better way than putIfAbsent(), like the patch does: only
put (key=family, value=list) right after the list is allocated.
was (Author: water):
Hi Mike Drob,
bq. This seems like the perfect use case for putIfAbsent
Yes, it is, but could be better. Map#putIfAbsent() is like
{code}
V v = map.get(key);
if (v == null)
v = map.put(key, value);
return v;
{code}
When the key is associated with a value, it performs get(). When the key is not
associated with a value, it performs get() and then put(). familyMap is
TreeMap, which is Red-Black tree. The time complexity of get() and put() is
quite low, but it is still O(log(n)).
So there could be a better way than putIfAbsent(), like the patch does: only
put (key=family, value=list) right after the list is allocated.
> Remove redundant familyMap.put() from Put#addColumn() and addImmutable()
>
>
> Key: HBASE-18555
> URL: https://issues.apache.org/jira/browse/HBASE-18555
> Project: HBase
> Issue Type: Improvement
> Components: Client
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Attachments: HBASE-18555.master.000.patch
>
>
> In Put#addColumn() and addImmutable(), after getting the cell list of the
> given family and add the cell into the list, the code puts (key=family,
> value=list) into familyMap.
> In addColumn(), it is like
> {code}
> List list = getCellList(family);
> KeyValue kv = createPutKeyValue(family, qualifier, ts, value);
> list.add(kv);
> familyMap.put(CellUtil.cloneFamily(kv), list); // <-- here
> return this;
> {code}
> In addImmutable(), it is like
> {code}
> List list = getCellList(family);
> KeyValue kv = createPutKeyValue(family, qualifier, ts, value, tag);
> list.add(kv);
> familyMap.put(family, list); // <-- here
> return this;
> {code}
> I think those put() for Map only take effect when getCellList(family) returns
> a new allocated ArrayList. When the list for a family already exist, put()
> for Map will update the value to the reference of the list, but actually, the
> reference of the list is not changed.
> Those put() do not do any harm in terms of the correctness when they are
> here. But it could be removed to improve the performance. familyMap searches
> for key and set the new value and return the old value. Those operation take
> some time but actually are not needed.
> The put() could be moved into Mutation#getCellList(family)
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)