https://issues.apache.org/bugzilla/show_bug.cgi?id=51400

             Bug #: 51400
           Summary: Use of "new String(byte[] b, String enc)" hits Sun JVM
                    bottleneck
           Product: Tomcat 6
           Version: 6.0.32
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Catalina
        AssignedTo: dev@tomcat.apache.org
        ReportedBy: dengb...@evernote.com
    Classification: Unclassified


Created attachment 27186
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=27186
Patch with optimizations

We're using Tomcat 6 for a high-volume, high-concurrency service (Evernote). 
At times, we've seen a performance slowdown within the service, which we've
traced to a concurrency flaw within the JVM code that translates named
encodings (e.g. "utf-8") into Charsets.  This shows up as a number of threads
blocked while trying to convert a byte array to a String or vice versa, e.g.:

  java.lang.Thread.State: BLOCKED (on object monitor)
       at sun.nio.cs.FastCharsetProvider.charsetForName(Unknown Source)
       - waiting to lock <0x00007ff3b4cc85b0> (a sun.nio.cs.StandardCharsets)
       at java.nio.charset.Charset.lookup2(Unknown Source)
       at java.nio.charset.Charset.lookup(Unknown Source)
       at java.nio.charset.Charset.isSupported(Unknown Source)
       at java.lang.StringCoding.lookupCharset(Unknown Source)
       at java.lang.StringCoding.decode(Unknown Source)
       at java.lang.String.<init>(Unknown Source)
       at org.apache.tomcat.util.buf.ByteChunk.toStringInternal(ByteChunk.java:499)
       at org.apache.tomcat.util.buf.StringCache.toString(StringCache.java:315)
       at org.apache.tomcat.util.buf.ByteChunk.toString(ByteChunk.java:492)
       at org.apache.tomcat.util.buf.MessageBytes.toString(MessageBytes.java:213)
       at org.apache.tomcat.util.http.MimeHeaders.getHeader(MimeHeaders.java:319)
       at org.apache.coyote.Request.getHeader(Request.java:330)
       at org.apache.catalina.connector.Request.getHeader(Request.java:1854)
       at org.apache.catalina.connector.RequestFacade.getHeader(RequestFacade.java:643)

This isn't a true deadlock, since each thread will eventually finish, but it
can significantly affect concurrency if there are a number of threads making
heavy use of:
   new String(byte[] b, String encoding)
   String.getBytes()
   String.getBytes(String encoding)

This is, unfortunately, a known bottleneck within the JVM:
http://blog.inuus.com/vox/2008/05/the-mysteries-of-java-character-set-performance.html
http://halfbottle.blogspot.com/2009/07/charset-continued-i-wrote-about.html
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6790402


To avoid this bottleneck in the JVM, we've patched our server to pass an
explicit Charset object for String encoding and decoding rather than the name
of the charset, and added a ConcurrentHashMap<String, Charset> to look up
Charsets by encoding name.
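
For anyone who wants the general shape of the change without opening the
attachment, here is a minimal sketch of the idea. It is not the attached patch:
the class name CharsetCache and its methods are hypothetical, and the only JDK
calls it relies on are Charset.forName, new String(byte[], int, int, Charset)
and String.getBytes(Charset), all available since Java 6.

```java
import java.nio.charset.Charset;
import java.util.Locale;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper, for illustration only (not the code in the patch).
public class CharsetCache {

    // Encoding name (lower-cased) -> resolved Charset.
    private static final ConcurrentMap<String, Charset> CACHE =
            new ConcurrentHashMap<String, Charset>();

    // Resolves an encoding name to a Charset, going through the JVM's
    // synchronized lookup only the first time a given name is seen.
    public static Charset getCharset(String encoding) {
        String key = encoding.toLowerCase(Locale.ENGLISH);
        Charset charset = CACHE.get(key);
        if (charset == null) {
            // May throw the unchecked UnsupportedCharsetException or
            // IllegalCharsetNameException for a bad name.
            charset = Charset.forName(encoding);
            Charset previous = CACHE.putIfAbsent(key, charset);
            if (previous != null) {
                charset = previous;
            }
        }
        return charset;
    }

    // Replacement for: new String(bytes, off, len, encodingName)
    public static String decode(byte[] bytes, int off, int len, String encoding) {
        return new String(bytes, off, len, getCharset(encoding));
    }

    // Replacement for: s.getBytes(encodingName)
    public static byte[] encode(String s, String encoding) {
        return s.getBytes(getCharset(encoding));
    }
}
```

One behavioural difference to keep in mind: the String-name overloads throw the
checked UnsupportedEncodingException, whereas Charset.forName throws unchecked
exceptions, so callers that relied on catching the checked exception need to be
adjusted.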

I've attached a patch with our fixes against 6.0.32.

Just as a random FYI - the same issue hits MySQL's Java connector, so we'd
occasionally see Tomcat and MySQL fighting over this same JVM chokepoint: 
http://bugs.mysql.com/bug.php?id=61105
