https://issues.apache.org/bugzilla/show_bug.cgi?id=51400
             Bug #: 51400
           Summary: Use of "new String(byte[] b, String enc)" hits Sun JVM bottleneck
           Product: Tomcat 6
           Version: 6.0.32
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Catalina
        AssignedTo: dev@tomcat.apache.org
        ReportedBy: dengb...@evernote.com
    Classification: Unclassified

Created attachment 27186
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=27186
Patch with optimizations

We're using Tomcat 6 for a high-volume, high-concurrency service (Evernote).
At times, we've seen a performance slowdown within the service, which we've
traced to a concurrency flaw within the JVM code that translates named
encodings (e.g. "utf-8") into Charsets. This shows up as a number of stuck
threads trying to convert a byte array to a String or vice versa, for example:

   java.lang.Thread.State: BLOCKED (on object monitor)
        at sun.nio.cs.FastCharsetProvider.charsetForName(Unknown Source)
        - waiting to lock <0x00007ff3b4cc85b0> (a sun.nio.cs.StandardCharsets)
        at java.nio.charset.Charset.lookup2(Unknown Source)
        at java.nio.charset.Charset.lookup(Unknown Source)
        at java.nio.charset.Charset.isSupported(Unknown Source)
        at java.lang.StringCoding.lookupCharset(Unknown Source)
        at java.lang.StringCoding.decode(Unknown Source)
        at java.lang.String.<init>(Unknown Source)
        at org.apache.tomcat.util.buf.ByteChunk.toStringInternal(ByteChunk.java:499)
        at org.apache.tomcat.util.buf.StringCache.toString(StringCache.java:315)
        at org.apache.tomcat.util.buf.ByteChunk.toString(ByteChunk.java:492)
        at org.apache.tomcat.util.buf.MessageBytes.toString(MessageBytes.java:213)
        at org.apache.tomcat.util.http.MimeHeaders.getHeader(MimeHeaders.java:319)
        at org.apache.coyote.Request.getHeader(Request.java:330)
        at org.apache.catalina.connector.Request.getHeader(Request.java:1854)
        at org.apache.catalina.connector.RequestFacade.getHeader(RequestFacade.java:643)

This isn't a true deadlock, since each thread will eventually finish, but it
can significantly affect concurrency if there are a number of threads making
heavy use of:

   new String(byte[] b, String encoding)
   String.getBytes()
   String.getBytes(String encoding)

This is, unfortunately, a known bottleneck within the JVM:

http://blog.inuus.com/vox/2008/05/the-mysteries-of-java-character-set-performance.html
http://halfbottle.blogspot.com/2009/07/charset-continued-i-wrote-about.html
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6790402

To avoid this bottleneck in the JVM, we've patched our server to use the
explicit Charset object for String encoding rather than the name of the
charset, and then added a ConcurrentHashMap<String, Charset> to look up
charsets by encoding name.

I've attached a patch with our fixes on 6.0.32.

Just as a random FYI: the same issue hits MySQL's Java connector, so we'd
occasionally see Tomcat and MySQL fighting over this same JVM chokepoint:

http://bugs.mysql.com/bug.php?id=61105
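
For readers without the attachment open, the approach described above (cache
Charset objects in a ConcurrentHashMap<String, Charset> and pass the Charset
itself to String) might look roughly like the sketch below. This is not the
attached patch: the class name CharsetCache and its methods are hypothetical,
the lower-casing of the key is an assumption, and the String overloads that
take a Charset need Java 6 or later. Note also that Charset.forName() throws
unchecked exceptions for unknown names, where the String-name overloads throw
the checked UnsupportedEncodingException.

```java
import java.nio.charset.Charset;
import java.util.Locale;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper, not part of Tomcat or of the attached patch.
final class CharsetCache {

    // Encoding name (lower-cased) -> Charset. Reads are lock-free, so hot
    // paths avoid the synchronized lookup seen in the stack trace above.
    private static final ConcurrentMap<String, Charset> CACHE =
            new ConcurrentHashMap<String, Charset>();

    private CharsetCache() {
    }

    // Resolve an encoding name once through the JVM, then serve it from the map.
    static Charset forName(String encoding) {
        String key = encoding.toLowerCase(Locale.ENGLISH);
        Charset cs = CACHE.get(key);
        if (cs == null) {
            // Slow path: may contend on the JVM lock, but only until cached.
            // Throws IllegalCharsetNameException / UnsupportedCharsetException.
            cs = Charset.forName(encoding);
            Charset prev = CACHE.putIfAbsent(key, cs);
            if (prev != null) {
                cs = prev;
            }
        }
        return cs;
    }

    // Stand-in for new String(byte[] b, int off, int len, String encoding).
    static String decode(byte[] b, int off, int len, String encoding) {
        return new String(b, off, len, forName(encoding));
    }

    // Stand-in for String.getBytes(String encoding).
    static byte[] encode(String s, String encoding) {
        return s.getBytes(forName(encoding));
    }
}
```

A call site such as new String(buf, start, len, enc) would then become
CharsetCache.decode(buf, start, len, enc). The no-argument String.getBytes()
uses the platform default charset and is not covered by this sketch.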