[
https://issues.apache.org/jira/browse/LUCENE-1308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mark Miller closed LUCENE-1308.
-------------------------------
Resolution: Duplicate
> Remove String.intern() from Field.java to increase performance and lower
> contention
> -----------------------------------------------------------------------------------
>
> Key: LUCENE-1308
> URL: https://issues.apache.org/jira/browse/LUCENE-1308
> Project: Lucene - Java
> Issue Type: Improvement
> Affects Versions: 2.3.2
> Reporter: Rene Schwietzke
> Attachments: yad.zip
>
>
> Right now, *document.Field is interning all field names. While this makes
> sense because it lowers the overall memory consumption, the method intern()
> of String is known to be difficult to handle.
> 1) it is a native call and therefore slower than anything on the Java level
> 2) the String pool is part of the perm space and not of the general heap, so
> its size is more restricted and needs extra VM params to be managed
> 3) Some VMs show GC problems with strings in the string pool
> Suggested solution is a WeakHashMap instead, that takes care of unifying the
> String instances and at the same time keeping the pool in the heap space and
> releasing the String when it is no longer needed. For extra performance in a
> concurrent environment, a ConcurrentHashMap-like implementation of a weak
> hashmap is recommended, because we mostly read from the pool.
> We saw a 10% improvement in throughput and response time of our application
> and the application is not only doing searches (we read a lot of documents
> from the result). So a single measurement test case could show even more
> improvement in single and concurrent usage.
> The Cache:
> /** Cache to replace the expensive String.intern() call with the java version
> */
> private final static Map<String, WeakReference<String>> unifiedStringsCache =
> Collections.synchronizedMap(new WeakHashMap<String,
> WeakReference<String>>(109));
> The access to it, instead of this.name = name.intern();
> // unify the strings, but do not use the expensive String.intern() version
> // which is not "weak enough", uses the perm space and is a native call
> String unifiedName = null;
> WeakReference<String> ref = unifiedStringsCache.get(name);
> if (ref != null)
> {
>     unifiedName = ref.get();
> }
> if (unifiedName == null)
> {
>     unifiedStringsCache.put(name, new WeakReference<String>(name));
>     unifiedName = name;
> }
> this.name = unifiedName;
> I guess it is sufficient to have almost all field names interned, so I
> skipped the additional synchronization around the access and take the risk
> that only 99.99% :) of all field names are interned.
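For reference, the pooling idea described in the report can be sketched as a small self-contained class. This is only an illustration under stated assumptions: the class name FieldNamePool and the method unify are hypothetical and not part of Lucene, and unlike the snippet in the report this variant locks around the whole lookup-and-insert step, so every caller gets the same canonical instance at the cost of some contention.

```java
import java.lang.ref.WeakReference;
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical helper (not part of Lucene): a heap-based pool that unifies equal strings. */
final class FieldNamePool {
    // WeakHashMap holds its keys weakly. The value must be weak as well,
    // because a strong reference to the same String would keep the key alive forever.
    private static final Map<String, WeakReference<String>> POOL =
        Collections.synchronizedMap(new WeakHashMap<String, WeakReference<String>>(109));

    private FieldNamePool() {}

    /** Returns one canonical instance for all strings equal to {@code name}. */
    static String unify(String name) {
        // Collections.synchronizedMap locks on the returned map itself, so holding
        // that lock makes the get-then-put sequence atomic.
        synchronized (POOL) {
            WeakReference<String> ref = POOL.get(name);
            String unified = (ref != null) ? ref.get() : null;
            if (unified == null) {
                // First sighting (or the old instance was collected): register this one.
                POOL.put(name, new WeakReference<String>(name));
                unified = name;
            }
            return unified;
        }
    }
}
```

In a constructor this would stand in for the intern() call, for example this.name = FieldNamePool.unify(name). Whether the extra lock is worth it depends on the workload; the report deliberately skips it and accepts that a few names may not be unified.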