[ 
https://issues.apache.org/jira/browse/TEZ-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281403#comment-17281403
 ] 

David Mollitor commented on TEZ-4275:
-------------------------------------

Also, Google recommends using this Interner over the JDK's own String 
intern implementation (see the link below).  Tez currently uses 
{{hadoop-common}} for String interning, which uses JDK facilities under 
the covers.  I also propose adding an implementation for Tez that uses 
Google Guava.

https://github.com/google/guava/blob/master/guava/src/com/google/common/collect/Interner.java#L28-L30
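
As a rough sketch of what such a Tez-side implementation might look like: the class name {{GuavaStringInterner}} and the choice to pass nulls through are assumptions for illustration, not existing Tez code. Only {{Interners.newWeakInterner()}} and {{Interner.intern()}} are Guava API.

```java
import com.google.common.collect.Interner;
import com.google.common.collect.Interners;

/**
 * Hypothetical Tez-side helper (name and shape are assumptions) that
 * delegates String interning to a Guava weak interner.
 */
final class GuavaStringInterner {

  // One shared weak interner: canonical instances can be garbage
  // collected once nothing else references them.
  private static final Interner<String> INTERNER = Interners.newWeakInterner();

  private GuavaStringInterner() {
  }

  /** Returns the canonical instance for {@code s}; null is passed through. */
  static String intern(String s) {
    // Assumption: callers may hand in null, so guard it here instead of
    // passing it on to the Guava interner.
    return (s == null) ? null : INTERNER.intern(s);
  }

  public static void main(String[] args) {
    String a = new String("vertex-1");
    String b = new String("vertex-1");
    // Equal content, distinct objects before interning.
    System.out.println(a == b);
    // Both resolve to the same canonical instance after interning.
    System.out.println(intern(a) == intern(b));
  }
}
```

Call sites would then use {{GuavaStringInterner.intern(value)}} wherever they intern strings today; whether that replaces or merely wraps the {{hadoop-common}} helper is left open here.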

> Use Google Guava Intern Facility
> --------------------------------
>
>                 Key: TEZ-4275
>                 URL: https://issues.apache.org/jira/browse/TEZ-4275
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Major
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Google Guava has a pretty cool Interner facility.
>  
>  * More memory efficient than the current offering.  The map holds a weak key 
> and a static dummy value (the current implementation uses a weak value)
>  * The current implementation has a single lock around the entire data 
> structure.  Guava splits its data structure into segments (default: 4) for 
> better concurrency
>  * All the other thoughtful stuff Google has added into this feature



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
