[ https://issues.apache.org/jira/browse/LUCENE-1308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Miller closed LUCENE-1308.
-------------------------------

    Resolution: Duplicate

> Remove String.intern() from Field.java to increase performance and lower contention
> -----------------------------------------------------------------------------------
>
>                 Key: LUCENE-1308
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1308
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.3.2
>            Reporter: Rene Schwietzke
>         Attachments: yad.zip
>
>
> Right now, *document.Field is interning all field names. While this makes
> sense because it lowers the overall memory consumption, the method intern()
> of String is known to be difficult to handle.
> 1) It is a native call and therefore slower than anything on the Java level.
> 2) The String pool is part of the perm space and not of the general heap, so
> its size is more restricted and needs extra VM params to be managed.
> 3) Some VMs show GC problems with strings in the string pool.
>
> The suggested solution is a WeakHashMap instead, which takes care of unifying
> the String instances while keeping the pool in the heap space and releasing
> each String when it is no longer needed. For extra performance in a
> concurrent environment, a ConcurrentHashMap-like implementation of a weak
> hashmap is recommended, because we mostly read from the pool.
>
> We saw a 10% improvement in throughput and response time of our application,
> and the application is not only doing searches (we read a lot of documents
> from the result). So a test case that measures only this change could show
> even more improvement in single and concurrent usage.
>
> The Cache:
>
>     // needs java.lang.ref.WeakReference and
>     // java.util.Collections, java.util.Map, java.util.WeakHashMap
>
>     /** Cache to replace the expensive String.intern() call with the java version */
>     private final static Map<String, WeakReference<String>> unifiedStringsCache =
>         Collections.synchronizedMap(new WeakHashMap<String, WeakReference<String>>(109));
>
> The access to it, instead of this.name = name.intern();
>
>     // unify the strings, but do not use the expensive String.intern() version
>     // which is not "weak enough", uses the perm space and is a native call
>     String unifiedName = null;
>     WeakReference<String> ref = unifiedStringsCache.get(name);
>     if (ref != null)
>     {
>         unifiedName = ref.get();
>     }
>     if (unifiedName == null)
>     {
>         // get() and put() are each synchronized, but the pair is not atomic
>         unifiedStringsCache.put(name, new WeakReference<String>(name));
>         unifiedName = name;
>     }
>     this.name = unifiedName;
>
> I guess it is sufficient to have almost all field names interned, so I
> skipped the additional synchronization around the access and take the risk
> that only 99.99% :) of all field names are interned.
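
For reference, the pool from the issue can be put into a self-contained form. This is only a sketch under assumptions: the class name `FieldNamePool` and the method `unify` are hypothetical and exist neither in Lucene nor in the attached patch. Unlike the snippet in the issue, it holds one lock across the lookup and the insert, so concurrent callers always agree on a single instance; that gives up the "99.99%" shortcut in exchange for some extra contention.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical helper (not part of Lucene): a heap-based, weakly referenced string pool. */
final class FieldNamePool {
    // WeakHashMap holds its keys weakly. The value has to be weak as well,
    // because it refers to the same String as the key; a strong value would
    // keep its own key reachable and the entry would never be released.
    private static final Map<String, WeakReference<String>> POOL =
        new WeakHashMap<String, WeakReference<String>>(109);

    private FieldNamePool() {}

    /** Returns the pooled instance equal to name, registering name itself if none exists. */
    static String unify(String name) {
        synchronized (POOL) {
            WeakReference<String> ref = POOL.get(name);
            String canonical = (ref != null) ? ref.get() : null;
            if (canonical == null) {
                // First sighting, or the previous instance was already collected.
                POOL.put(name, new WeakReference<String>(name));
                canonical = name;
            }
            return canonical;
        }
    }
}
```

With such a helper, the constructor in Field would assign `this.name = FieldNamePool.unify(name);` in place of `this.name = name.intern();`. One caveat: the pool is separate from the JVM's own intern table, so a pooled name is not guaranteed to be the same instance as an equal string literal, and any code that compares field names with `==` against literals would need care.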