How exactly should we handle I18N in Freenet?

The HTTP spec says that text/html defaults to the charset "ISO-8859-1". To prevent ambiguity in the filter, we need to set the charset explicitly in the Content-Type header that we send back to the browser. The first question is whether this will force the browser to use that charset, or whether, IE-style, it will autodetect anyway and use whatever encoding it thinks the content looks like.

Ideally we would always send an explicit charset, and allow any charset to be specified as long as java.io.InputStreamReader knows about it, since that means we can filter it. The problem is that the browser may try to read the content as a different charset. So either we assume that the browser will honour an explicit charset in the Content-Type field, or we have to add autodetection code for every conceivable charset, starting with the UTF-16 patch I recently hacked up.

So what do we do?

-- 
Matthew Toseland
toad at amphibian.dyndns.org
amphibian at users.sourceforge.net
Freenet/Coldstore open source hacker.
Employed full time by Freenet Project Inc. from 11/9/02 to 11/1/03
http://freenetproject.org/
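
For illustration, the "always send an explicit charset, but only one Java can decode" option could be sketched roughly as below. This is not the actual Freenet filter code: the class name CharsetPolicy, its method names, and the choice to fall back to ISO-8859-1 when the declared charset is unknown are all assumptions made for this example. It also leaves open the main question of whether the browser honours the header.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical helper, not part of Freenet: decides which charset to
// declare in Content-Type and decodes the data with that same charset.
public class CharsetPolicy {
    // HTTP default for text/html, used here as the fallback (an assumption).
    static final String DEFAULT_CHARSET = "ISO-8859-1";

    // True if this JVM can decode the named charset, i.e. an
    // InputStreamReader can be built for it and the filter can read it.
    static boolean canFilter(String name) {
        try {
            return name != null && Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // Thrown for syntactically illegal charset names.
            return false;
        }
    }

    // Builds the Content-Type value with an explicit charset, falling back
    // to the default when the declared charset cannot be decoded.
    static String contentType(String declared) {
        String charset = canFilter(declared) ? declared : DEFAULT_CHARSET;
        return "text/html; charset=" + charset;
    }

    // Decodes the raw bytes with the charset we are about to declare,
    // so the filter sees the same characters the browser is told to expect.
    static String decode(byte[] data, String charset) {
        Charset cs = Charset.forName(charset);
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(data), cs)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[4096];
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(contentType("UTF-16"));
        System.out.println(contentType("no-such-charset"));
        System.out.println(decode(new byte[] {(byte) 0xE9}, DEFAULT_CHARSET));
    }
}
```

The point of the sketch is that the charset used for filtering and the charset declared to the browser come from the same decision, so the two cannot disagree unless the browser ignores the header and autodetects.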
