https://issues.apache.org/bugzilla/show_bug.cgi?id=51400
--- Comment #12 from Christopher Schultz <ch...@christopherschultz.net> 2011-06-23 20:02:11 UTC ---

> > I suppose it's a fairly small set of encodings, but with little benefit,
> > there's no reason IMO to pre-populate.
>
> You're right; however, if I read the reports correctly, this is true only if
> charsets with valid names are used for the lookup. Every time there is a
> lookup for a non-existing Charset, the JVM-synchronized Charset.lookup() is
> called. Probably to speed this up, Konstantin Kolinko suggested caching
> charset misses.

Duh. I hadn't thought of spurious lookups causing their own synchronization
disasters. Perhaps the invalid-charset cache could be limited in some way: MRU
caches are easy to build with the standard Java library.

> If a list of all available charsets, including their aliases, were
> pre-populated, missing charsets could also be determined quickly.

True: if the encoding is not supported by the JVM, then it's invalid no matter
what. In that case, case normalization is probably a good thing to do: if it's
not in the cache (after normalization), then it's not valid... no reason to
ever call Charset.lookup() after startup.

> Well, on my Windows machine the longest alias (not canonical name) of a
> charset is "Extended_UNIX_Code_Packed_Format_for_Japanese", which consists of
> 39 mutable characters.

Wow.

> The current (trunk) implementation in
> o.a.tomcat.util.buf.B2CConverter.getCharset() does not normalize the name, so
> a client could send requests with 2^39 permutations in a Content-Type header
> (which would make 49 TiB of Charset strings) ;-)

My math might be wrong, too, but I believe that's only 512GiB if names are
1-byte-per-char, but I think Java does 2-bytes-per-char, so it's 1TiB. You're
right, though: that's pretty huge.

+1 to case normalization.
+1 to LUT pre-population.
-1 to LUT miss caching: it's totally unnecessary given the above.
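For illustration, the combination voted for above (case normalization plus a pre-populated lookup table) might look roughly like the sketch below. The class and method names (CharsetLookupTable, find, normalize) are hypothetical and this is not the actual B2CConverter code; it only assumes the standard Charset.availableCharsets() and Charset.aliases() calls.

```java
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not Tomcat's B2CConverter.
final class CharsetLookupTable {

    // Lower-cased canonical names and aliases mapped to their Charset.
    // Filled once in the static initializer and only read afterwards,
    // so lookups need no locking.
    private static final Map<String, Charset> BY_NAME = new HashMap<>();

    static {
        for (Charset charset : Charset.availableCharsets().values()) {
            // put() lets a canonical name win over an alias registered earlier.
            BY_NAME.put(normalize(charset.name()), charset);
            for (String alias : charset.aliases()) {
                // putIfAbsent() keeps an alias from displacing an existing entry.
                BY_NAME.putIfAbsent(normalize(alias), charset);
            }
        }
    }

    private CharsetLookupTable() {
    }

    private static String normalize(String name) {
        // A fixed locale avoids locale-specific case rules (e.g. Turkish dotless i).
        return name.toLowerCase(Locale.ENGLISH);
    }

    /** Returns the matching Charset, or null if this JVM does not provide it. */
    static Charset find(String name) {
        if (name == null) {
            return null;
        }
        return BY_NAME.get(normalize(name));
    }

    public static void main(String[] args) {
        System.out.println(find("uTf-8"));
        System.out.println(find("no-such-charset"));
    }
}
```

With a table like this, an unknown or oddly-cased name is answered by a single map read, so neither valid nor invalid names reach the synchronized JVM lookup after startup, and the table size is bounded by what the JVM ships rather than by what clients send.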
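For reference, the size-limited cache mentioned above (keep the most recently used entries, evict the least recently used one) can be sketched with java.util.LinkedHashMap in access order. BoundedMissCache and its methods are hypothetical names, and the sketch assumes a simple synchronized wrapper is acceptable for the miss path.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache of charset names already known to be invalid.
final class BoundedMissCache {

    private final Map<String, Boolean> misses;

    BoundedMissCache(final int maxEntries) {
        // accessOrder=true moves an entry to the "most recent" end on get()/put();
        // removeEldestEntry() then drops the least recently used entry on overflow.
        // In access order even get() reorders the map, hence the synchronized wrapper.
        this.misses = Collections.synchronizedMap(
                new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
                    @Override
                    protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                        return size() > maxEntries;
                    }
                });
    }

    void recordMiss(String name) {
        misses.put(name, Boolean.TRUE);
    }

    boolean isKnownMiss(String name) {
        // get() rather than containsKey(): only get() refreshes the access order.
        return misses.get(name) != null;
    }

    int entryCount() {
        return misses.size();
    }

    public static void main(String[] args) {
        BoundedMissCache cache = new BoundedMissCache(2);
        cache.recordMiss("x-bogus-1");
        cache.recordMiss("x-bogus-2");
        cache.recordMiss("x-bogus-3");
        System.out.println(cache.entryCount());
    }
}
```

Note that such a cache still takes a lock of its own on every miss, which is one more reason a pre-populated, read-only table makes it unnecessary.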
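As a rough sanity check of the figures traded above, the sketch below counts the case variants of the quoted alias and estimates the raw character storage. It assumes every variant is kept as a full 45-character string and ignores per-object overhead; PermutationEstimate is a hypothetical name. Under those assumptions, 2^39 is the number of variants (so 512GiB corresponds to one byte per variant, not per character), while whole strings at two bytes per char come to 45 TiB, roughly 49 TB in decimal units, which appears to be where the quoted 49 figure comes from.

```java
// Hypothetical back-of-the-envelope calculation; not part of Tomcat.
final class PermutationEstimate {

    static final String ALIAS = "Extended_UNIX_Code_Packed_Format_for_Japanese";

    // Only letters can change case; the underscores are fixed.
    static int letterCount(String s) {
        int letters = 0;
        for (int i = 0; i < s.length(); i++) {
            if (Character.isLetter(s.charAt(i))) {
                letters++;
            }
        }
        return letters;
    }

    // Each letter is independently upper or lower case: 2^letters variants.
    static long caseVariants(String s) {
        return 1L << letterCount(s);
    }

    // Raw character data only: variants * characters per name * bytes per char.
    static long rawCharBytes(String s, int bytesPerChar) {
        return caseVariants(s) * s.length() * bytesPerChar;
    }

    public static void main(String[] args) {
        System.out.println("letters:  " + letterCount(ALIAS));
        System.out.println("variants: " + caseVariants(ALIAS));
        System.out.println("TiB at 2 bytes/char: " + (rawCharBytes(ALIAS, 2) >> 40));
    }
}
```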