DO NOT REPLY [Bug 51400] Use of "new String(byte[] b, String enc)" hits Sun JVM bottleneck

bugzilla Thu, 23 Jun 2011 07:09:17 -0700

https://issues.apache.org/bugzilla/show_bug.cgi?id=51400


--- Comment #9 from Konstantin Preißer <prei...@web.de> 2011-06-23 14:08:49 UTC ---
Hi,

Would caching charset misses be a good idea if the encoding strings can also
come from external sources?

For example, if a client makes a POST request to a Servlet and sends this
header: 

Content-Type: application/x-www-form-urlencoded; charset=this-is-a-non-existing-charset

and the Servlet calls HttpServletRequest.getParameter(...), then
o.a.tomcat.util.buf.B2CConverter.getCharset(String) will be called with a value
of "this-is-a-non-existing-charset". If a client made tons of requests with
random, invalid charset strings and these misses were added to a List that is
never pruned, couldn't that lead to a memory leak?

However, there is a static method, Charset.availableCharsets(), which returns a
SortedMap<String, Charset> of all charsets available in the current JVM. Maybe
this map could be used to build a Map of all available charsets (the aliases
returned by each Charset's aliases() method would also have to be added)? Then
lookups for missing charsets would also be fast.
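
A minimal sketch of that idea, assuming a one-time build at startup. The class
and method names (CharsetMapSketch, buildCharsetMap) are hypothetical and not
taken from Tomcat. Keys are stored exactly as the JVM reports them:

```java
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not Tomcat code: pre-populates a lookup table with
// every charset the JVM knows, keyed by canonical name and by each alias.
final class CharsetMapSketch {

    static Map<String, Charset> buildCharsetMap() {
        Map<String, Charset> map = new HashMap<>();
        // availableCharsets() maps canonical names to Charset objects.
        for (Charset charset : Charset.availableCharsets().values()) {
            map.put(charset.name(), charset);
            // aliases() is an instance method returning this charset's aliases.
            for (String alias : charset.aliases()) {
                map.put(alias, charset);
            }
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, Charset> map = buildCharsetMap();
        // A known charset is found; an unknown one is a plain map miss (null),
        // so nothing has to be stored for invalid names.
        System.out.println(map.get("UTF-8"));
        System.out.println(map.get("this-is-a-non-existing-charset"));
    }
}
```

Because the map is filled once and never grows, invalid names sent by clients
cannot increase its size.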

Also, I think B2CConverter.getCharset() should convert the encoding string to a
single case (lower or upper) before the lookup in the Map, to avoid multiple
entries ("uTF-8", "UtF-8", etc.).
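
A rough sketch of such a normalized lookup. The names here
(NormalizedCharsetLookup, normalize) are hypothetical, and the table holds only
UTF-8 to keep the example short. Locale.ENGLISH is used because a
default-locale toLowerCase() can map "I" differently, for example under a
Turkish locale:

```java
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not Tomcat code: keys are lower-cased on insert and
// on lookup, so all case variants of a name share one entry.
final class NormalizedCharsetLookup {

    private static final Map<String, Charset> CHARSETS = new HashMap<>();

    static {
        // Illustration only: a real table would hold every available charset.
        Charset utf8 = Charset.forName("UTF-8");
        CHARSETS.put(normalize(utf8.name()), utf8);
    }

    // Fixed locale keeps the result independent of the JVM's default locale.
    static String normalize(String enc) {
        return enc.toLowerCase(Locale.ENGLISH);
    }

    // Returns null for unknown (or null) encoding names.
    static Charset getCharset(String enc) {
        if (enc == null) {
            return null;
        }
        return CHARSETS.get(normalize(enc));
    }

    public static void main(String[] args) {
        System.out.println(getCharset("uTF-8"));
        System.out.println(getCharset("UtF-8"));
        System.out.println(getCharset("this-is-a-non-existing-charset"));
    }
}
```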

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org
