[
https://issues.apache.org/jira/browse/TEZ-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281403#comment-17281403
]
David Mollitor commented on TEZ-4275:
-------------------------------------
Also, I see here that Google recommends using Guava's Interner over the JDK's
own String intern implementation. Tez currently uses {{hadoop-common}} for
String interning, which, under the covers, uses JDK facilities. I also propose
adding an implementation for Tez that uses Google Guava.
https://github.com/google/guava/blob/master/guava/src/com/google/common/collect/Interner.java#L28-L30
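
For illustration, a Guava-backed interner for Tez might look roughly like the sketch below. This is only a sketch under stated assumptions: the class name {{GuavaStringInterner}} is hypothetical and not taken from the Tez code base, and the concurrency level of 4 is the default cited in the issue description, spelled out here only to make the segmenting visible.

```java
import com.google.common.collect.Interner;
import com.google.common.collect.Interners;

/** Hypothetical helper; the name is illustrative and not part of Tez. */
public final class GuavaStringInterner {

  // Weak interner: an entry can be garbage collected once nothing else
  // holds a strong reference to the canonical instance.
  // concurrencyLevel(4) is the default cited in the issue description;
  // it is set explicitly here only for clarity.
  private static final Interner<String> WEAK_INTERNER =
      Interners.newBuilder().weak().concurrencyLevel(4).build();

  private GuavaStringInterner() {}

  /** Returns the canonical instance for sample, or null if sample is null. */
  public static String intern(String sample) {
    return (sample == null) ? null : WEAK_INTERNER.intern(sample);
  }

  public static void main(String[] args) {
    // Two equal but distinct String objects.
    String a = new String("vertex-1");
    String b = new String("vertex-1");
    System.out.println(a == b);                  // false
    System.out.println(intern(a) == intern(b));  // true
  }
}
```

Callers would go through {{intern(...)}} wherever they intern strings today, so the backing implementation could be swapped without touching call sites.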
> Use Google Guava Intern Facility
> --------------------------------
>
> Key: TEZ-4275
> URL: https://issues.apache.org/jira/browse/TEZ-4275
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Major
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Google Guava has a pretty cool Interner facility.
>
> * More memory efficient than the current offering. The map holds a weak key
> and a static dummy value (the current implementation uses a weak value)
> * The current implementation has a single lock around the entire data
> structure. Guava splits its data structure into segments (default: 4) for
> better concurrency
> * All the other thoughtful stuff Google has added into this feature