Bug#389065: [debiandoc-sgml-pkgs] Bug#389065: debiandoc-sgml: [INTL:ru] Russian UTF-8 locale
tags 389065 +pending
thanks

Hi Yuri,

On Sat, Sep 23, 2006 at 09:52:58PM +0400, Yuri Kozlov wrote:
> The attached diff contains the result of running
> 'iconv -f koi8-r -t utf-8' on the Locale/ru_RU.KOI8-R/* files, with a
> small translation improvement.

I committed your patch. Nevertheless, I wonder whether you want to
integrate the small translation improvement into the KOI8-R encoded files
as well. This includes:

diff -ru ru_RU.KOI8-R/HTML ru_RU.UTF-8/HTML
--- ru_RU.KOI8-R/HTML 2006-11-12 18:18:56.0 +0100
+++ ru_RU.UTF-8/HTML 2006-11-12 18:17:44.0 +0100
  'abstract' => 'Аннотация',
- 'copyright notice' => 'Заметка об авторских правах',
+ 'copyright notice' => 'Сведения об авторских правах',
  'contents' => 'Содержание',
  'chapter' => sub { return "Глава $_[0]" },
  'appendix' => sub { return "Приложение $_[0]" },
@@ -17,7 +17,7 @@
  'paragraph' => sub { return "раздел $_[0]" },
  'subparagraph' => sub { return "раздел $_[0]" },
  'footnotes' => 'Сноски',
- 'comments' => 'Comments',
+ 'comments' => 'Комментарии',
  'next' => 'вперед',
  'previous' => 'назад',
  );
diff -ru ru_RU.KOI8-R/LaTeX ru_RU.UTF-8/LaTeX
--- ru_RU.KOI8-R/LaTeX 2006-11-12 18:18:56.0 +0100
+++ ru_RU.UTF-8/LaTeX 2006-11-12 18:17:44.0 +0100
  'abstract' => 'Аннотация',
- 'copyright notice' => 'Замечания об авторских правах',
+ 'copyright notice' => 'Сведения об авторских правах',
+ 'after begin document' => '\\renewcommand{\\vpageref}[1]{на стр. \\pageref{#1}}',
+ 'pdfhyperref' => 'unicode'
  );
diff -ru ru_RU.KOI8-R/Text ru_RU.UTF-8/Text
--- ru_RU.KOI8-R/Text 2006-11-12 18:18:57.0 +0100
+++ ru_RU.UTF-8/Text 2006-11-12 18:17:44.0 +0100
  'abstract' => 'Аннотация',
- 'copyright notice' => 'Замечания об авторских правах',
+ 'copyright notice' => 'Сведения об авторских правах',
  'contents' => 'Содержание',
  'chapter' => sub { return "Глава $_[0]" },
  'appendix' => sub { return "Приложение $_[0]" },
diff -ru ru_RU.KOI8-R/TextOV ru_RU.UTF-8/TextOV
--- ru_RU.KOI8-R/TextOV 2006-11-12 18:18:57.0 +0100
+++ ru_RU.UTF-8/TextOV 2006-11-12 18:17:44.0 +0100
  'abstract' => 'Аннотация',
- 'copyright notice' => 'Заметка об авторских правах',
+ 'copyright notice' => 'Сведения об авторских правах',
  'contents' => 'Содержание',
  'chapter' => sub { return "Глава $_[0]" },
  'appendix' => sub { return "Приложение $_[0]" },

> As 'etch' uses UTF-8 encoding for Russian by default, it would be nice
> to have ru_RU.UTF-8 locale support. This diff does not change the
> 'system' alias 'ru'.

Should I make ru_RU.UTF-8 the default locale for ru once I have verified
that it supports all currently available Russian documents? I think this
would be a good idea.

Jens
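For readers unfamiliar with the conversion step mentioned above, here is a minimal Python sketch of what 'iconv -f koi8-r -t utf-8' does to one locale string. It is only an illustration: the function name `recode` is hypothetical, and the patch itself was produced with iconv, not with this code.

```python
def recode(data: bytes, source: str = "koi8_r", target: str = "utf-8") -> bytes:
    """Decode bytes from the source encoding and re-encode them in the target."""
    return data.decode(source).encode(target)


# 'Аннотация' ('abstract') as it would be stored in a KOI8-R locale file.
koi8_bytes = "Аннотация".encode("koi8_r")
utf8_bytes = recode(koi8_bytes)

# KOI8-R uses one byte per Cyrillic letter, UTF-8 uses two.
print(len(koi8_bytes), len(utf8_bytes))
```

The text is unchanged by the conversion; only its byte representation differs, which is why the UTF-8 locale files can be generated mechanically from the KOI8-R ones.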
Bug#389065: [debiandoc-sgml-pkgs] Bug#389065: debiandoc-sgml: [INTL:ru] Russian UTF-8 locale
2006/11/12, Jens Seidel [EMAIL PROTECTED]:
> I committed your patch. Nevertheless, I wonder whether you want to
> integrate the small translation improvement into the KOI8-R encoded files
> as well. This includes:

Yes, of course.

> > As 'etch' uses UTF-8 encoding for Russian by default, it would be nice
> > to have ru_RU.UTF-8 locale support. This diff does not change the
> > 'system' alias 'ru'.
>
> Should I make ru_RU.UTF-8 the default locale for ru once I have verified
> that it supports all currently available Russian documents? I think this
> would be a good idea.

As long as it does not break anything. If I can help with something, mail me.

-- 
Regards,
Yuri Kozlov
Bug#389065: [debiandoc-sgml-pkgs] Bug#389065: debiandoc-sgml: [INTL:ru] Russian UTF-8 locale
On Sun, Sep 24, 2006 at 07:49:04PM +0900, Osamu Aoki wrote:
> On Sat, Sep 23, 2006 at 09:06:20PM +0200, Jens Seidel wrote:
> > yep, I agree that UTF-8 should be supported for a wider range of
>
> In theory, it is a good direction.
>
> Changing the encoding to UTF-8 should be a simple recoding for HTML and
> plain text, but... PS and PDF are real work, since the tool chain (LaTeX)
> seems to be using the good old local encoding. If you can address both
> formats, please propose a fix. If the users of a particular encoding do
> not mind breaking the PS/PDF build script, they can change to UTF-8 now.

Russian, Ukrainian, Vietnamese and maybe other languages work well in
UTF-8, even for PS and PDF. (I noticed that the dash in PDF files looks a
little strange (thick but short), but that's not very important.) I tested
Russian with APT HOWTO and Debian Reference without problems.

Since Etch will be more UTF-8 centric, a UTF-8 default would be useful,
right? There are a few problems related to this:

All packages containing Russian documents would FTBFS. A simple recoding
of the document (and for Debian Reference also of the *.ent file and
bin/getdocdate) needs to be done, but only in the package, not in DDP CVS,
since the build host still runs Sarge. On the other hand, the DDP build
can be deactivated until Etch.

It's also possible to use ru_RU.KOI8-R as the locale in the build script.
But this would create filenames of the form document.ru_RU.KOI8-R.html
instead of document.ru.html (if option -c of debiandoc2html is used).

> But ... I think Japanese, Chinese, ... and possibly Russian may need a
> good deal of work. I really do not have time to do it. A practical
> solution is to make it work with both styles. Any well-thought-out
> action is welcome.

I suggest we wait a few more days to test Asian languages together with
UTF-8, as proposed in the RC bug report. I will test these too. But we
should first decide whether we all agree that UTF-8 would be a good idea.
After this I could also post to debian-doc and provide help.

(Documentation packages could break the freeze to update translations, so
*now* is a good chance to switch to UTF-8. After the Etch release it would
be harder.)

Jens
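The filename concern above can be illustrated with a small Python sketch. It assumes, as the message describes, that the output name carries the locale exactly as it was passed, while a short alias such as 'ru' is resolved separately to a full locale. The table `ALIASES` and the functions `resolve_locale` and `output_name` are hypothetical and are not taken from debiandoc2html.

```python
# Hypothetical alias table: short language code -> full locale name.
ALIASES = {"ru": "ru_RU.KOI8-R"}


def resolve_locale(name, aliases=ALIASES):
    """Return the full locale for a short alias, or the name unchanged."""
    return aliases.get(name, name)


def output_name(basename, locale_arg, ext="html"):
    """Build the output file name from the locale exactly as it was passed."""
    return "{}.{}.{}".format(basename, locale_arg, ext)


# Passing the alias keeps the short suffix; passing the full locale does not.
print(output_name("document", "ru"))
print(output_name("document", "ru_RU.KOI8-R"))
print(resolve_locale("ru"))
```

Under these assumptions, pointing the alias 'ru' at ru_RU.UTF-8 changes the encoding that is used without changing the generated file names, whereas naming the full locale in a build script changes the file names.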
Bug#389065: [debiandoc-sgml-pkgs] Bug#389065: debiandoc-sgml: [INTL:ru] Russian UTF-8 locale
On Sat, Sep 23, 2006 at 09:06:20PM +0200, Jens Seidel wrote:
> Hi Yuri,
>
> On Sat, Sep 23, 2006 at 09:52:58PM +0400, Yuri Kozlov wrote:
> > Hello.
> >
> > The attached diff contains the result of running
> > 'iconv -f koi8-r -t utf-8' on the Locale/ru_RU.KOI8-R/* files, with a
> > small translation improvement.
> >
> > As 'etch' uses UTF-8 encoding for Russian by default, it would be nice
> > to have ru_RU.UTF-8 locale support. This diff does not change the
> > 'system' alias 'ru'.
> >
> > I have built Russian HTML and PDF files from the release-notes sources
> > and it works.
>
> yep, I agree that UTF-8 should be supported for a wider range of
> languages. I think a general solution which starts iconv to transform
> the encoding of existing strings to UTF-8 (or other encodings) would be
> more useful and reduce duplicated information. If I do not provide such
> a patch in the near future, we should really apply your patch for now :-)

In theory, it is a good direction.

> Many thanks to Eugeniy for his UTF-8 patch (#366992). Indeed, without
> his great help UTF-8 support would still be missing.

Changing the encoding to UTF-8 should be a simple recoding for HTML and
plain text, but... PS and PDF are real work, since the tool chain (LaTeX)
seems to be using the good old local encoding. If you can address both
formats, please propose a fix. If the users of a particular encoding do
not mind breaking the PS/PDF build script, they can change to UTF-8 now.

But ... I think Japanese, Chinese, ... and possibly Russian may need a
good deal of work. I really do not have time to do it. A practical
solution is to make it work with both styles. Any well-thought-out action
is welcome.

Osamu
Bug#389065: [debiandoc-sgml-pkgs] Bug#389065: debiandoc-sgml: [INTL:ru] Russian UTF-8 locale
Hi Yuri,

On Sat, Sep 23, 2006 at 09:52:58PM +0400, Yuri Kozlov wrote:
> Hello.
>
> The attached diff contains the result of running
> 'iconv -f koi8-r -t utf-8' on the Locale/ru_RU.KOI8-R/* files, with a
> small translation improvement.
>
> As 'etch' uses UTF-8 encoding for Russian by default, it would be nice
> to have ru_RU.UTF-8 locale support. This diff does not change the
> 'system' alias 'ru'.
>
> I have built Russian HTML and PDF files from the release-notes sources
> and it works.

yep, I agree that UTF-8 should be supported for a wider range of
languages. I think a general solution which starts iconv to transform the
encoding of existing strings to UTF-8 (or other encodings) would be more
useful and reduce duplicated information. If I do not provide such a patch
in the near future, we should really apply your patch for now :-)

Many thanks to Eugeniy for his UTF-8 patch (#366992). Indeed, without his
great help UTF-8 support would still be missing.

Jens
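The "general solution" idea above, keeping one copy of the strings and transcoding on demand instead of maintaining a locale directory per encoding, might look roughly like the following Python sketch. It uses Python's built-in codecs in place of an iconv call, and the names `RU_STRINGS` and `strings_for` are hypothetical. It is not the patch discussed in this thread.

```python
# Single master copy of the translated strings, held as Unicode text.
RU_STRINGS = {
    "abstract": "Аннотация",
    "contents": "Содержание",
    "footnotes": "Сноски",
}


def strings_for(encoding, strings=RU_STRINGS):
    """Return the string table encoded for the requested output encoding."""
    return {key: value.encode(encoding) for key, value in strings.items()}


# Both tables are derived from the same source, so a translation fix
# only has to be made once.
koi8 = strings_for("koi8_r")
utf8 = strings_for("utf-8")
print(len(koi8["footnotes"]), len(utf8["footnotes"]))
```

With this layout, a translation improvement such as the 'copyright notice' change would reach the KOI8-R and UTF-8 variants at the same time, which is the duplication the proposal aims to remove.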