[jira] [Comment Edited] (HBASE-18555) Remove redundant familyMap.put() from Put#addColumn() and addImmutable()

2017-08-10 Thread Xiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121889#comment-16121889
 ] 

Xiang Li edited comment on HBASE-18555 at 8/10/17 4:41 PM:
---

Hi Mike Drob, 
bq. This seems like the perfect use case for putIfAbsent
Yes, it is, but could be better. Map#putIfAbsent() is like
{code}
 V v = map.get(key);
 if (v == null)
 v = map.put(key, value);

 return v;
{code}
When the key is associated with a value, it performs get(). When the key is not 
associated with a value, it performs get() and then put(). familyMap is 
TreeMap, which is Red-Black tree. The time complexity of get() and put() is 
quite low, but it is still O(logn).
So there could be a better way than putIfAbsent(), like the patch does: only 
put (key=family, value=list) right after the list is allocated.


was (Author: water):
Hi Mike Drob, 
bq. This seems like the perfect use case for putIfAbsent
Yes, it is, but could be better. Map#putIfAbsent() is like
{code}
 V v = map.get(key);
 if (v == null)
 v = map.put(key, value);

 return v;
{code}
When the key is associated with a value, it performs get(). When the key is not 
associated with a value, it performs get() and then put(). familyMap is 
TreeMap, which is Red-Black tree. The time complexity of get() and put() is 
quite low, but it is still O(log(n)).
So there could be a better way than putIfAbsent(), like the patch does: only 
put (key=family, value=list) right after the list is allocated.

> Remove redundant familyMap.put() from Put#addColumn() and addImmutable()
> 
>
> Key: HBASE-18555
> URL: https://issues.apache.org/jira/browse/HBASE-18555
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Attachments: HBASE-18555.master.000.patch
>
>
> In Put#addColumn() and addImmutable(), after getting the cell list of the 
> given family and add the cell into the list, the code puts (key=family, 
> value=list) into familyMap.
> In addColumn(), it is like
> {code}
> List list = getCellList(family);
> KeyValue kv = createPutKeyValue(family, qualifier, ts, value);
> list.add(kv);
> familyMap.put(CellUtil.cloneFamily(kv), list); // <-- here
> return this;
> {code}
> In addImmutable(), it is like
> {code}
> List list = getCellList(family);
> KeyValue kv = createPutKeyValue(family, qualifier, ts, value, tag);
> list.add(kv);
> familyMap.put(family, list); // <-- here
> return this;
> {code}
> I think those put() for Map only take effect when getCellList(family) returns 
> a new allocated ArrayList. When the list for a family already exist, put() 
> for Map will update the value to the reference of the list, but actually, the 
> reference of the list is not changed.
> Those put() do not do any harm in terms of the correctness when they are 
> here. But it could be removed to improve the performance. familyMap searches 
> for key and set the new value and return the old value. Those operation take 
> some time but actually are not needed. 
> The put() could be moved into Mutation#getCellList(family)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HBASE-18555) Remove redundant familyMap.put() from Put#addColumn() and addImmutable()

2017-08-10 Thread Xiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121836#comment-16121836
 ] 

Xiang Li edited comment on HBASE-18555 at 8/10/17 4:13 PM:
---

Upload a very first patch v000 to 
(1) Remove familyMap.put() from Put#addColumn() and addImmutable()
(2) Move familyMap.put() into Mutation#getCellList(), perform put() after the 
cell list is allocated


was (Author: water):
Upload a very first patch to 
(1) Remove familyMap.put() from Put#addColumn() and addImmutable()
(2) Move familyMap.put() into Mutation#getCellList(), perform put() after the 
cell list is allocated

> Remove redundant familyMap.put() from Put#addColumn() and addImmutable()
> 
>
> Key: HBASE-18555
> URL: https://issues.apache.org/jira/browse/HBASE-18555
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Attachments: HBASE-18555.master.000.patch
>
>
> In Put#addColumn() and addImmutable(), after getting the cell list of the 
> given family and add the cell into the list, the code puts (key=family, 
> value=list) into familyMap.
> In addColumn(), it is like
> {code}
> List list = getCellList(family);
> KeyValue kv = createPutKeyValue(family, qualifier, ts, value);
> list.add(kv);
> familyMap.put(CellUtil.cloneFamily(kv), list); // <-- here
> return this;
> {code}
> In addImmutable(), it is like
> {code}
> List list = getCellList(family);
> KeyValue kv = createPutKeyValue(family, qualifier, ts, value, tag);
> list.add(kv);
> familyMap.put(family, list); // <-- here
> return this;
> {code}
> I think those put() for Map only take effect when getCellList(family) returns 
> a new allocated ArrayList. When the list for a family already exist, put() 
> for Map will update the value to the reference of the list, but actually, the 
> reference of the list is not changed.
> Those put() do not do any harm in terms of the correctness when they are 
> here. But it could be removed to improve the performance. familyMap searches 
> for key and set the new value and return the old value. Those operation take 
> some time but actually are not needed. 
> The put() could be moved into Mutation#getCellList(family)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)