On 23.11.2010 08:55, Maho NAKATA wrote:
Hi Bernd,
How do we know where we use invalid non-UTF-8 characters? It's not immediately clear to me.
thanks, Nakata Maho
Hi Maho!
When you are in an environment that has UTF-8 encoding set (e.g. a Unix shell with LANG=en_US.UTF-8 in the environment), list your files. If you see special characters that do not belong to your language, such as just a square, or something like \234 displayed in place of a character, then you know that someone has committed those files using a different encoding, e.g. BIG-5.
Another simple test is to try importing the whole CVS archive into an SVN repository with LANG=en_US.UTF-8 or similar set. The "svn import" command will complain about any non-UTF-8 characters in that case.
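The manual check described in this message (list the files and look for names that do not display properly) can also be automated. The following is only a minimal sketch in Python, not a tool used by the project: it walks a checked-out tree using byte paths and reports every file or directory name that does not decode as UTF-8. The function names `is_valid_utf8` and `find_non_utf8_names` are made up for this illustration, and the script assumes a Unix-like system where filenames are stored as raw bytes.

```python
import os
import sys


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def find_non_utf8_names(root: bytes):
    """Yield the full path (as bytes) of each entry whose name is not valid UTF-8.

    Passing a bytes root makes os.walk return raw byte names, so no
    locale-dependent decoding happens before the check.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if not is_valid_utf8(name):
                yield os.path.join(dirpath, name)


if __name__ == "__main__":
    # Hypothetical usage: python check_names.py path/to/checkout
    start = os.fsencode(sys.argv[1] if len(sys.argv) > 1 else ".")
    for path in find_non_utf8_names(start):
        # Show invalid bytes as \xNN escapes so the output is always printable.
        print(path.decode("utf-8", errors="backslashreplace"))
```

Run against a working copy, this should print one line per suspicious name, with the offending bytes shown as escapes. A name it reports still has to be checked by someone who knows which encoding was originally intended; the script only says that the name is not UTF-8.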
From: Bernd Eilers <bernd.eil...@oracle.com>
Subject: [native-lang] native language webcontent developers please review your filenames for invalid non UTF-8 characters
Date: Mon, 22 Nov 2010 17:49:40 +0100
Hi native language communities, and especially the WebContent developers among you!
I recently stumbled over a few files in OpenOffice.org's webcontent whose filenames contain invalid non-UTF-8 characters.
The character encoding to be used for webcontent checked into OpenOffice.org's webcontent CVS repository is UTF-8. Please make sure to use a UTF-8 locale when checking in files with non-US-ASCII characters. For example, if you are in France and are using some Unix OS, set LANG=fr_FR.UTF-8 and not LANG=fr_FR.ISO8859-15. GUI CVS clients used on Windows often allow you to specify the encoding explicitly.
Filenames in other encodings will not work and, what is even worse, they create a big problem when moving OpenOffice.org to the new Kenai-based infrastructure. While CVS does not care much about invalid characters in filenames, Subversion, which will be used on the new infrastructure, treats those filenames as erroneous and as a result will not import the project at all.
Could the webcontent developers of the native language projects please review their webcontent and change anything that is currently not UTF-8 compliant. That means not only copying the broken files to new, valid names but also deleting the broken filenames from the CVS repository!
For example, there are 2 broken directory names in fr/www/Documentation/Gallery starting with the letters "fl" and then some non-UTF-8 encoded character.
Kind regards,
Bernd Eilers
--
<http://www.oracle.com/>
Bernd Eilers | Software Engineer
Phone: +49 40 23 646 967
ORACLE Deutschland B.V. & Co. KG | Nagelsweg 55 | 20097 Hamburg
ORACLE Deutschland B.V. & Co. KG
Head office: Riesstr. 25, D-80992 München
Registration court: Amtsgericht München, HRA 95603
General partner: ORACLE Deutschland Verwaltung B.V.
Rijnzathe 6, 3454PV De Meern, Netherlands
Commercial register of the Chamber of Commerce Midden-Niederlande, No. 30143697
Managing directors: Jürgen Kunz, Marcel van de Molen, Alexander van der Ven
<http://www.oracle.com/commitment> Oracle is committed to developing practices and products that help protect the environment