One of the aims of the proposed cookie changes [1] was to deal with the HTML 5 changes that mean UTF-8 can appear in cookie headers.
This has some potentially large implications for Tomcat. Currently, Tomcat handles cookies as MessageBytes, processing everything in bytes and only converting to String when necessary. This is largely possible because of the assumption that everything is ASCII. Introduce UTF-8 and processing everything in bytes gets a whole lot harder. You essentially have to decode the bytes as UTF-8 to ensure that you have valid data, at which point why not just use Strings anyway?

I am currently leaning towards removing a lot of the current cookie header caching and recycling and doing something along the following lines:

- Lazy parsing as currently (but unless cookie based session tracking is disabled this is going to run on every request)
- Convert headers to Strings, decoding them as UTF-8
- Parse them with a new parser along the lines of o.a.t.u.http.parser
- Have that parser return an array of javax.servlet.http.Cookie objects
- Pass those to the app if/when requested

In terms of handling RFC6265 and RFC2109, my plan is to have two parsers, share as much code as possible, and switch between them based on the cookie header, with the expectation that 99.9% of cookies will be parsed by the RFC6265 parser. We could add some options to this switching to enable other parsers (e.g. a Netscape parser) to be used.

I'd also like to keep the current cookie parsing implementation for now. We can add an option to switch between the current and the new parsing. Until we are happy with the new parsing, the current implementation will remain the default. Once we are happy with it, we can change the default.

Thoughts?

Mark

[1] https://wiki.apache.org/tomcat/Cookies
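
For illustration, the decode-then-switch flow outlined above could be sketched roughly as follows. This is not Tomcat code. The class CookieParseSketch, the SimpleCookie stand-in for javax.servlet.http.Cookie, and the $Version check used to pick the RFC 2109 path are all assumptions made for the example, and splitting on ';' ignores quoted values that contain a semicolon.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class CookieParseSketch {

    /** Hypothetical stand-in for javax.servlet.http.Cookie. */
    public static final class SimpleCookie {
        public final String name;
        public final String value;

        public SimpleCookie(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    /** Strictly decodes the header bytes as UTF-8; malformed input throws. */
    public static String decodeUtf8(byte[] headerBytes) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(headerBytes))
                .toString();
    }

    /** Assumed heuristic: an RFC 2109 Cookie header starts with $Version. */
    public static boolean looksLikeRfc2109(String header) {
        return header.trim().startsWith("$Version");
    }

    /** Decodes once, picks a parsing mode, and returns the cookies found. */
    public static List<SimpleCookie> parse(byte[] headerBytes) throws CharacterCodingException {
        String header = decodeUtf8(headerBytes);
        boolean legacy = looksLikeRfc2109(header);
        List<SimpleCookie> cookies = new ArrayList<>();

        // Simplification: a real parser must not split inside quoted values.
        for (String pair : header.split(";")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // no '=' or nothing before it
            }
            String name = pair.substring(0, eq).trim();
            String value = pair.substring(eq + 1).trim();
            if (name.isEmpty()) {
                continue;
            }
            if (legacy) {
                if (name.startsWith("$")) {
                    continue; // skip $Version, $Path, $Domain attributes
                }
                // RFC 2109 values may be quoted; strip one pair of quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
            }
            cookies.add(new SimpleCookie(name, value));
        }
        return cookies;
    }
}
```

The shared loop with a small legacy branch mirrors the "two parsers, share as much code as possible" idea. A real implementation would more likely use separate parser classes behind a common interface.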