On 12/07/2012 04:26 PM, Vincent Massol wrote:
> Hi,
>
> On Dec 7, 2012, at 9:59 PM, Sergiu Dumitriu <[email protected]> wrote:
>
>> Hi devs,
>>
>> We've moved more and more toward an UTF-8-only application, and XWiki
>> has only been tested with this configuration for several years.
>>
>> I propose that we require UTF-8 for a valid, supported installation.
>> This means:
>> - JVM encoding (-Dfile.encoding=UTF8)
>> - Container default URL encoding (Tomcat has ISO-8859-1 by default)
>> - Database encoding (MySql is still configured with latin1 on some distros)
>>
>> There's one big site to update on our side: xwiki.org.
>>
>> Here's my +1. This is a move toward a future web, since more and more
>> standards require (or at least assume as a default) UTF-8.
>>
>>
>>
>> After thinking a bit more, it would make sense to require a valid
>> Unicode encoding, including UTF-16, which is preferable in countries
>> that don't use a latin alphabet. However, XWiki doesn't currently work
>> under 16-bit encodings at all.
>
> For XWiki 4.x I'm -1 since it's a big change and we don't want to break our
> users that currently use 4.x with ISO8859-1 for example
>
> For XWiki 5.x I'm not sure.
>
> To be able to answer I need to understand more. For example what currently
> doesn't work with any encoding the user wants to use? Shouldn't we just be
> transparent and use whatever encoding is specified and not hardcode anything?

Non-ASCII-compatible encodings don't work at least because of the way we
read components.txt.

For other encodings, the problem is that you never know when something
will break. Things may appear to work, but enter a non-ASCII character and:
- the data might be discarded
- the database might throw an exception -- if a non-transactional
  database engine is used, then all future access to the document or
  history or object properties will fail
- the document might become inaccessible since the URL is decoded
  incorrectly
- The PDF export might break
- Escaped XML entities might appear in the browser instead of characters
(just from the top of my head)

This isn't a proposal to change our rules, it's a proposal to make
explicit what we've been doing anyway. There have been many issues and
emails where our answer starts with "make sure your <component> encoding
is set to UTF-8".

On xwiki.org users often try to use non-ASCII characters, and that
doesn't work, and so we might be losing potential users if they assume
that XWiki simply doesn't support their language.

-- 
Sergiu Dumitriu
http://purl.org/net/sergiu
_______________________________________________
devs mailing list
[email protected]
http://lists.xwiki.org/mailman/listinfo/devs
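
To make the failure modes listed above concrete, here is a minimal,
language-neutral sketch of what an encoding mismatch does to a single
non-ASCII page name. It is not XWiki code: the helper names
(decode_url_segment, store_and_load, lossy_store) and the sample page
names are made up for illustration, and the "container" and "database"
are only simulated with the Python standard library.

```python
from urllib.parse import quote, unquote


def decode_url_segment(segment, container_encoding):
    """Percent-decode a URL path segment using the given charset.

    Hypothetical stand-in for a servlet container decoding request URLs.
    """
    return unquote(segment, encoding=container_encoding)


def store_and_load(text, write_encoding, read_encoding):
    """Encode text with one charset, then decode the same bytes with another.

    Hypothetical stand-in for a layer that writes and a layer that reads
    with different default encodings.
    """
    return text.encode(write_encoding).decode(read_encoding)


def lossy_store(text, storage_encoding):
    """Encode text for a store that cannot represent every character.

    Unrepresentable characters are replaced with '?', i.e. silently lost.
    """
    return text.encode(storage_encoding, errors="replace")


page_name = "Café"

# A browser sends the page name percent-encoded as UTF-8.
encoded = quote(page_name, encoding="utf-8")

# Matching encodings: the name survives the round trip.
print(decode_url_segment(encoded, "utf-8"))

# ISO-8859-1 decoding of UTF-8 bytes: a different name comes out,
# so a lookup for the original document would miss.
print(decode_url_segment(encoded, "iso-8859-1"))

# Same mismatch between a writer and a reader of stored bytes.
print(store_and_load(page_name, "utf-8", "iso-8859-1"))

# Characters outside the storage charset are discarded.
print(lossy_store("日本語", "iso-8859-1"))
```

With every layer set to the same encoding, the first case is the only one
that can occur, which is the argument for requiring UTF-8 in the JVM, the
container, and the database together.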

