Re: [O] [PATCH] Table continuation strings
Hello,

Yasushi SHOJI ya...@atmark-techno.com writes:

> Right. It is doable, but for Japanese I don't think anyone wants to do
> it, or at least not in ordinary usage, IMO.

OK.

> Ok, I've checked what I can. It seems to be working, at least for me.
> Let's patch up the `org-export-dictionary' to see if it breaks for
> others.

Sure.

> Here is a patch to convert all Japanese entries from :utf-8 to
> :default.

I applied it. Thank you.

Regards,

--
Nicolas Goaziou
Re: [O] [PATCH] Table continuation strings
Hello,

Yasushi SHOJI yasushi.sh...@atmark-techno.com writes:

> And here is a patch for the rest of Japanese translation strings.

Applied. Thank you.

Regards,

--
Nicolas Goaziou
Re: [O] [PATCH] Table continuation strings
Hi,

At Mon, 23 Dec 2013 10:09:44 +0100,
Nicolas Goaziou wrote:

> There's a limitation: if you use Latin1 characters (e.g. when you
> write in French), you cannot export to text/ascii anymore. So, if, for
> some reason, you really need to export to ascii only, but still need
> to write in french, you have to be careful not to use any of these
> Latin1 characters, in particular in translated strings.
>
> Similarly, Japanese :ascii entries could be written using romaji.
> I don't know to what extent it is useful, though.

Right. It is doable, but for Japanese I don't think anyone wants to do
it, or at least not in ordinary usage, IMO.

>> I'm checking exporters I use, including plain text and html, but it
>> doesn't seem to go wrong. But I really need some help for other
>> back-ends. I'll post a patch for testing if anyone's interested.
>
> Good idea. You can also set entries to :default and provide
> a different :latex value, if required.

Ok, I've checked what I can. It seems to be working, at least for me.
Let's patch up the `org-export-dictionary' to see if it breaks for
others.

Here is a patch to convert all Japanese entries from :utf-8 to
:default. You can apply it with `git am --scissors'.

-- >8 -- cut here -- >8 --
Subject: [PATCH 1/2] ox: Convert Japanese translation from utf-8 to default

* lisp/ox.el (org-export-dictionary): Convert all Japanese translation
  from utf-8 to default.

There shouldn't be much need for exporters and users to worry about
the coding system of the final output. If one wants to export
a Japanese document, he should already have the document in
a Japanese-capable coding system. In that case, Emacs should be able
to handle the coding system conversion from the translation table to
the designated file coding system.

There are two cases which I think don't work:

 - all words in the document are written in romaji, and one wants
   romaji translations

 - the document is written in a language whose coding system does not
   support the Japanese character set, i.e. English or French, and one
   wants to use Japanese for non-content strings, i.e. the TOC.

These cases are rare enough that we can ignore them for now.
---
 lisp/ox.el | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/lisp/ox.el b/lisp/ox.el
index 2160826..592cc79 100644
--- a/lisp/ox.el
+++ b/lisp/ox.el
@@ -5331,7 +5331,7 @@ them.
      (hu :default "Szerz&otilde;")
      (is :html "H&ouml;fundur")
      (it :default "Autore")
-     (ja :html "&#33879;&#32773;" :utf-8 "著者")
+     (ja :default "著者" :html "&#33879;&#32773;")
      (nl :default "Auteur")
      (no :default "Forfatter")
      (nb :default "Forfatter")
@@ -5347,7 +5347,7 @@ them.
      (es :default "Continúa de la página anterior")
      (fr :default "Suite de la page précédente")
      (it :default "Continua da pagina precedente")
-     (ja :utf-8 "前ページからの続き")
+     (ja :default "前ページからの続き")
      (nl :default "Vervolg van vorige pagina")
      (pt :default "Continuação da página anterior")
      (ru :html "(&#1055;&#1088;&#1086;&#1076;&#1086;&#1083;&#1078;&#1077;&#1085;&#1080;&#1077;)"
@@ -5357,7 +5357,7 @@ them.
      (es :default "Continúa en la siguiente página")
      (fr :default "Suite page suivante")
      (it :default "Continua alla pagina successiva")
-     (ja :utf-8 "次ページに続く")
+     (ja :default "次ページに続く")
      (nl :default "Vervolg op volgende pagina")
      (pt :default "Continua na página seguinte")
      (ru :html "(&#1055;&#1088;&#1086;&#1076;&#1086;&#1083;&#1078;&#1077;&#1085;&#1080;&#1077; &#1089;&#1083;&#1077;&#1076;&#1091;&#1077;&#1090;)"
@@ -5374,7 +5374,7 @@ them.
      (hu :html "D&aacute;tum")
      (is :default "Dagsetning")
      (it :default "Data")
-     (ja :html "&#26085;&#20184;" :utf-8 "日付")
+     (ja :default "日付" :html "&#26085;&#20184;")
      (nl :default "Datum")
      (no :default "Dato")
      (nb :default "Dato")
@@ -5403,7 +5403,7 @@ them.
      (de :default "Abbildung")
      (es :default "Figura")
      (et :default "Joonis")
-     (ja :html "&#22259;" :utf-8 "図")
+     (ja :default "図" :html "&#22259;")
      (no :default "Illustrasjon")
      (nb :default "Illustrasjon")
      (nn :default "Illustrasjon")
@@ -5416,7 +5416,7 @@ them.
      (es :default "Figura %d:")
      (et :default "Joonis %d:")
      (fr :default "Figure %d :" :html "Figure&nbsp;%d&nbsp;:")
-     (ja :html "&#22259;%d: " :utf-8 "図%d: ")
+     (ja :default "図%d: " :html "&#22259;%d: ")
      (no :default "Illustrasjon %d")
      (nb :default "Illustrasjon %d")
      (nn :default "Illustrasjon %d")
@@ -5436,7 +5436,7 @@ them.
      (hu :html "L&aacute;bjegyzet")
      (is :html "Aftanm&aacute;lsgreinar")
      (it :html "Note a pi&egrave; di pagina")
-     (ja :html "&#33050;&#27880;" :utf-8 "脚注")
+     (ja :default "脚注" :html "&#33050;&#27880;")
      (nl :default "Voetnoten")
      (no :default "Fotnoter")
      (nb :default "Fotnoter")
@@ -5497,7 +5497,7 @@ them.
      (es :default "Tabla")
      (et :default "Tabel")
      (fr :default "Tableau")
-     (ja :html "&#34920;" :utf-8 "表")
+     (ja :default "表" :html "&#34920;")
      (ru :html "&#1058;&#1072;&#1073;&#1083;&#1080;&#1094;&#1072;"
          :utf-8 "Таблица")
Re: [O] [PATCH] Table continuation strings
Hi,

At Thu, 02 Jan 2014 17:15:17 +0900,
Yasushi SHOJI wrote:

> At Mon, 23 Dec 2013 10:09:44 +0100,
> Nicolas Goaziou wrote:
>
>> There's a limitation: if you use Latin1 characters (e.g. when you
>> write in French), you cannot export to text/ascii anymore. So, if,
>> for some reason, you really need to export to ascii only, but still
>> need to write in french, you have to be careful not to use any of
>> these Latin1 characters, in particular in translated strings.
>>
>> Similarly, Japanese :ascii entries could be written using romaji.
>> I don't know to what extent it is useful, though.
>
> Right. It is doable, but for Japanese I don't think anyone wants to
> do it, or at least not in ordinary usage, IMO.
>
>>> I'm checking exporters I use, including plain text and html, but it
>>> doesn't seem to go wrong. But I really need some help for other
>>> back-ends. I'll post a patch for testing if anyone's interested.
>>
>> Good idea. You can also set entries to :default and provide
>> a different :latex value, if required.
>
> Ok, I've checked what I can. It seems to be working, at least for me.
> Let's patch up the `org-export-dictionary' to see if it breaks for
> others.
>
> Here is a patch to convert all Japanese entries from :utf-8 to
> :default. You can apply it with `git am --scissors'.

And here is a patch for the rest of Japanese translation strings.

-- >8 -- cut here -- >8 --
Subject: [PATCH 2/2] ox: Add new Japanese translation strings

* lisp/ox.el (org-export-dictionary): Add new Japanese translation
  strings.

A few strings in `org-export-dictionary' didn't have Japanese
translations, so I just added them.
---
 lisp/ox.el | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/lisp/ox.el b/lisp/ox.el
index 592cc79..ed7afe5 100644
--- a/lisp/ox.el
+++ b/lisp/ox.el
@@ -5391,6 +5391,7 @@ them.
      (es :html "Ecuaci&oacute;n" :default "Ecuación")
      (et :html "V&#245;rrand" :utf-8 "Võrrand")
      (fr :ascii "Equation" :default "Équation")
+     (ja :default "方程式")
      (no :default "Ligning")
      (nb :default "Ligning")
      (nn :default "Likning")
@@ -5454,6 +5455,7 @@ them.
      (es :default "Indice de Listados de programas")
      (et :default "Loendite nimekiri")
      (fr :default "Liste des programmes")
+     (ja :default "ソースコード目次")
      (no :default "Dataprogrammer")
      (nb :default "Dataprogrammer")
      (ru :html "&#1057;&#1087;&#1080;&#1089;&#1086;&#1082; &#1088;&#1072;&#1089;&#1087;&#1077;&#1095;&#1072;&#1090;&#1086;&#1082;"
@@ -5465,6 +5467,7 @@ them.
      (es :default "Indice de tablas")
      (et :default "Tabelite nimekiri")
      (fr :default "Liste des tableaux")
+     (ja :default "表目次")
      (no :default "Tabeller")
      (nb :default "Tabeller")
      (nn :default "Tabeller")
@@ -5478,6 +5481,7 @@ them.
      (es :default "Listado de programa %d")
      (et :default "Loend %d")
      (fr :default "Programme %d :" :html "Programme&nbsp;%d&nbsp;:")
+     (ja :default "ソースコード%d:")
      (no :default "Dataprogram %d")
      (nb :default "Dataprogram %d")
      (ru :html "&#1056;&#1072;&#1089;&#1087;&#1077;&#1095;&#1072;&#1090;&#1082;&#1072; %d.:"
@@ -5489,6 +5493,7 @@ them.
      (es :default "vea seccion %s")
      (et :html "Vaata peat&#252;kki %s" :utf-8 "Vaata peatükki %s")
      (fr :default "cf. section %s")
+     (ja :default "セクション %s を参照")
      (ru :html "&#1057;&#1084;. &#1088;&#1072;&#1079;&#1076;&#1077;&#1083; %s"
          :utf-8 "См. раздел %s")
      (zh-CN :html "&#21442;&#35265;&#31532;%s&#33410;" :utf-8 "参见第%s节"))
@@ -5545,6 +5550,7 @@ them.
      (es :default "referencia desconocida")
      (et :default "Tundmatu viide")
      (fr :ascii "Destination inconnue" :default "Référence inconnue")
+     (ja :default "不明な参照先")
      (ru :html "&#1053;&#1077;&#1080;&#1079;&#1074;&#1077;&#1089;&#1090;&#1085;&#1072;&#1103; &#1089;&#1089;&#1099;&#1083;&#1082;&#1072;"
          :utf-8 "Неизвестная ссылка")
      (zh-CN :html "&#26410;&#30693;&#24341;&#29992;" :utf-8 "未知引用")))
--
1.8.5.2

--
yashi
Re: [O] [PATCH] Table continuation strings
Hello,

Yasushi SHOJI ya...@atmark-techno.com writes:

> That means that whenever your-choice-of-coding-system can handle the
> characters for the translation string, meaning that the coding system
> has code points for all of the characters of the translation string
> and Emacs can convert between them, it is free to use any character
> for the output, right?
>
> If one wants to use French, she sets the current buffer coding system
> to any coding system which can handle French and sets the language
> option to fr. In that case, her/his org buffer should already have
> French characters in it, so there is no need for the translation
> string to be strictly ASCII only when you export with plain / ascii,
> no?

There's a limitation: if you use Latin1 characters (e.g. when you write
in French), you cannot export to text/ascii anymore. So, if, for some
reason, you really need to export to ascii only, but still need to
write in french, you have to be careful not to use any of these Latin1
characters, in particular in translated strings.

Similarly, Japanese :ascii entries could be written using romaji.
I don't know to what extent it is useful, though.

> I'm checking exporters I use, including plain text and html, but it
> doesn't seem to go wrong. But I really need some help for other
> back-ends. I'll post a patch for testing if anyone's interested.

Good idea. You can also set entries to :default and provide a different
:latex value, if required.

Regards,

--
Nicolas Goaziou
Re: [O] [PATCH] Table continuation strings
Hello,

Yasushi SHOJI ya...@atmark-techno.com writes:

> Ah, OK. Those coding keys are for the back-ends to select proper
> strings, not for the string encoding.

This is also related to string encoding. You will get garbage if you
insert a string containing characters outside the encoding you use to
save the file, won't you?

> Then, is there any restriction with HTML back-ends? Why does it need
> numeric character references instead of just plain characters, if the
> coding system is not a concern?

See above. You may want to save your html file in a different encoding
than UTF-8. IIUC, numeric character references are more generic.

> Correct me if I'm wrong. My understanding is as follows:
>
> - All translation strings are in the `emacs-internal' coding system,
>   since they are defined in .el.
>
> - An org file ready to be exported has a coding system specific to the
>   buffer, i.e. utf-8, iso-latin-1, euc-jp, etc.

Correct.

> - Org export back-ends get strings for the back-ends from the
>   translation table when appropriate. At that time Emacs converts the
>   string's encoding to match the buffer encoding (or does Emacs
>   convert all encoding when it writes to file?).

The latter. The output is concatenated into a single string, which is
then inserted in the target buffer (and saved to a file, if needed).

> - Back-ends use `org-export-coding-system' if set, otherwise they use
>   the current buffer coding system.

Some back-ends also use their own variable
(e.g. `org-html-coding-system').

> If my understanding is ok, all entries of Japanese translation should
> have :default instead of :utf-8.

:default instead of :utf-8 means Org will use these translations also
for LaTeX, HTML and ASCII export. If you think that is correct, then we
can switch to :default, indeed.

Regards,

--
Nicolas Goaziou
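[Editorial sketch] The "garbage" concern raised in this message can be checked in Emacs before exporting. The following is a minimal sketch, not code from Org: the helper name `my-demo-encodable-p' is hypothetical, and it only assumes the built-in function `unencodable-char-position'.

```emacs-lisp
;; Hypothetical helper, for illustration only; not part of Org.
(defun my-demo-encodable-p (string coding-system)
  "Return non-nil if CODING-SYSTEM can encode every character of STRING."
  ;; `unencodable-char-position' returns nil when nothing in the
  ;; given range is un-encodable, so negate its result.
  (null (unencodable-char-position 0 (length string)
                                   coding-system nil string)))

;; A Japanese translation string fits a Japanese coding system ...
(my-demo-encodable-p "前ページからの続き" 'euc-jp)
;; ... but not Latin-1, which is where garbage would appear.
(my-demo-encodable-p "前ページからの続き" 'iso-latin-1)
;; The French string cannot be written as pure ASCII either.
(my-demo-encodable-p "Suite de la page précédente" 'us-ascii)
```

The first call should return non-nil and the other two nil, which mirrors the point above: a translation string is only safe when the coding system used to save the file has code points for all of its characters.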
Re: [O] [PATCH] Table continuation strings
Hi Nicolas,

At Sun, 22 Dec 2013 09:20:57 +0100,
Nicolas Goaziou wrote:

> Yasushi SHOJI ya...@atmark-techno.com writes:
>
>> Ah, OK. Those coding keys are for the back-ends to select proper
>> strings, not for the string encoding.
>
> This is also related to string encoding. You will get garbage if you
> insert a string containing characters outside the encoding you use to
> save the file, won't you?

Right. However, as you described below, the output file's encoding is
not determined by the language option, but by the current buffer
coding system, org-export-coding-system, or a back-end specific
variable, i.e. org-html-coding-system.

That means that whenever your-choice-of-coding-system can handle the
characters for the translation string, meaning that the coding system
has code points for all of the characters of the translation string
and Emacs can convert between them, it is free to use any character
for the output, right?

If one wants to use French, she sets the current buffer coding system
to any coding system which can handle French and sets the language
option to fr. In that case, her/his org buffer should already have
French characters in it, so there is no need for the translation
string to be strictly ASCII only when you export with plain / ascii,
no?

I just don't see any use case. I must have missed something here.
Please enlighten me.

BTW, here is a part of a quick test I've done.

| source | lang | exporter    | o-e-c-s | o-h-c-s | target buffer                       | target file                         |
|--------+------+-------------+---------+---------+-------------------------------------+-------------------------------------|
| euc-jp | ja   | plain/ascii | nil     | -       | euc-jp                              | euc-jp                              |
| euc-jp | ja   | plain/utf-8 | nil     | -       | euc-jp                              | euc-jp                              |
| euc-jp | ja   | plain/ascii | utf-8   | -       | euc-jp                              | utf-8                               |
| euc-jp | ja   | plain/utf-8 | utf-8   | -       | euc-jp                              | utf-8                               |
| euc-jp | ja   | html        | nil     | utf-8   | euc-jp w/ charset=utf-8             | utf-8                               |
| euc-jp | ja   | html        | nil     | euc-jp  | euc-jp w/ charset=euc-jp            | euc-jp w/ charset=euc-jp            |
|--------+------+-------------+---------+---------+-------------------------------------+-------------------------------------|
| euc-jp | fr   | plain/ascii | nil     | -       | euc-jp w/ fr trans                  | euc-jp w/ fr translation            |
| euc-jp | fr   | plain/utf-8 | nil     | -       | euc-jp w/ fr trans utf-8 decoration | euc-jp w/ fr trans utf-8 decoration |

All major encodings for Japanese (euc-jp, iso2022, shift-jis, and
utf-8) can handle the current translation strings without problem. So
I'm assuming that encodings for other languages must have some
problem.

>> Then, is there any restriction with HTML back-ends? Why does it need
>> numeric character references instead of just plain characters, if
>> the coding system is not a concern?
>
> See above. You may want to save your html file in a different
> encoding than UTF-8. IIUC, numeric character references are more
> generic.

I agree that numeric references are more generic. As I've just
checked, HTML even allows us to put characters outside of the current
content charset with numeric references!

# italian text exported as html with ja language option. even if
# html has iso-8859-1 as charset, web browser shows japanese chars.

>> If my understanding is ok, all entries of Japanese translation
>> should have :default instead of :utf-8.
>
> :default instead of :utf-8 means Org will use these translations also
> for LaTeX, HTML and ASCII export. If you think that is correct, then
> we can switch to :default, indeed.

Since I don't use LaTeX, I have no idea about it. I hope some LaTeX
user can help me here.

I'm checking exporters I use, including plain text and html, but it
doesn't seem to go wrong. But I really need some help for other
back-ends. I'll post a patch for testing if anyone's interested.

Thanks,

--
yashi
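[Editorial sketch] The numeric character references discussed in this message are just the decimal code points of the characters, which is why they survive any content charset. A minimal sketch of the conversion, assuming a hypothetical helper name `my-demo-to-ncr' (not part of Org):

```emacs-lisp
;; Hypothetical helper, for illustration only; not part of Org.
(defun my-demo-to-ncr (string)
  "Return STRING with each non-ASCII character as a decimal HTML reference."
  (mapconcat (lambda (char)
               (if (< char 128)
                   (string char)           ; keep ASCII characters as-is
                 (format "&#%d;" char)))   ; e.g. 著 (U+8457) -> &#33879;
             string
             ""))

(my-demo-to-ncr "著者")   ; => "&#33879;&#32773;"
```

The result matches the :html value of the Japanese "Author" entry in `org-export-dictionary'. Because the output is pure ASCII, it can be saved under iso-8859-1 or any other ASCII-compatible charset and still display as Japanese in a browser.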
Re: [O] [PATCH] Table continuation strings
Hello,

Yasushi SHOJI ya...@atmark-techno.com writes:

> The thing I don't understand is the reason all Japanese entries have
> `:utf-8'. Would you kindly enlighten me on the relationship among the
> following:
>
>  - translation coding key (i.e. :utf-8, :default, :html)
>  - your current buffer coding system
>  - `buffer-file-coding-system' and friends

Coding keys are related to export back-ends. Therefore the :latex
entry will be used for `latex' export, :html for `html' export,
`:utf-8' for both text (utf-8) and odt export, and so on.

As its name suggests, the :default key is used as a fallback value
when no appropriate property is found. It makes for a handy shortcut
when some strings are identical.

Coding system is a different thing. When `org-export-coding-system' is
non-nil, it will be used as the coding system for output (note that
some export back-ends override this behaviour). Otherwise, output will
have the same encoding as the source buffer.

> BTW, 前ページから続く should be 前ページからの続き

I applied your suggestion. Thank you.

Regards,

--
Nicolas Goaziou
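[Editorial sketch] The key selection and :default fallback described in this message can be sketched as follows. This is a toy model, not the code in ox.el: `my-demo-dictionary' and `my-demo-translate' are hypothetical names, and the table holds only a few entries copied from the patches in this thread.

```emacs-lisp
;; Hypothetical miniature of `org-export-dictionary', for illustration.
(defvar my-demo-dictionary
  '(("Figure"
     (fr :default "Figure")
     (ja :default "図" :html "&#22259;"))
    ("Equation"
     (fr :ascii "Equation" :default "Équation")
     (ja :default "方程式")))
  "Toy translation table of the form (STRING (LANG KEY VALUE ...) ...).")

(defun my-demo-translate (s key lang)
  "Return the translation of S for LANG, preferring KEY.
Fall back to the :default value, then to S itself."
  (let ((plist (cdr (assq lang (cdr (assoc s my-demo-dictionary))))))
    (or (plist-get plist key)
        (plist-get plist :default)
        s)))

(my-demo-translate "Figure" :html 'ja)     ; => "&#22259;" (back-end key wins)
(my-demo-translate "Figure" :latex 'ja)    ; => "図" (falls back to :default)
(my-demo-translate "Equation" :ascii 'fr)  ; => "Equation"
(my-demo-translate "Figure" :utf-8 'de)    ; => "Figure" (no entry at all)
```

Under this model an entry that only has a :utf-8 value is invisible to a back-end asking for :latex or :html, which is why moving the Japanese strings to :default makes them available everywhere.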
Re: [O] [PATCH] Table continuation strings
Hi,

At Sat, 21 Dec 2013 10:05:35 +0100,
Nicolas Goaziou wrote:

> Yasushi SHOJI ya...@atmark-techno.com writes:
>
>> The thing I don't understand is the reason all Japanese entries have
>> `:utf-8'. Would you kindly enlighten me on the relationship among
>> the following:
>>
>>  - translation coding key (i.e. :utf-8, :default, :html)
>>  - your current buffer coding system
>>  - `buffer-file-coding-system' and friends
>
> Coding keys are related to export back-ends. Therefore the :latex
> entry will be used for `latex' export, :html for `html' export,
> `:utf-8' for both text (utf-8) and odt export, and so on.
>
> As its name suggests, the :default key is used as a fallback value
> when no appropriate property is found. It makes for a handy shortcut
> when some strings are identical.

Ah, OK. Those coding keys are for the back-ends to select proper
strings, not for the string encoding.

Then, is there any restriction with HTML back-ends? Why does it need
numeric character references instead of just plain characters, if the
coding system is not a concern?

> Coding system is a different thing. When `org-export-coding-system'
> is non-nil, it will be used as the coding system for output (note
> that some export back-ends override this behaviour). Otherwise,
> output will have the same encoding as the source buffer.

Correct me if I'm wrong. My understanding is as follows:

 - All translation strings are in the `emacs-internal' coding system,
   since they are defined in .el.

 - An org file ready to be exported has a coding system specific to
   the buffer, i.e. utf-8, iso-latin-1, euc-jp, etc.

 - Org export back-ends get strings for the back-ends from the
   translation table when appropriate. At that time Emacs converts the
   string's encoding to match the buffer encoding (or does Emacs
   convert all encoding when it writes to file?).

 - Back-ends use `org-export-coding-system' if set, otherwise they use
   the current buffer coding system.

If my understanding is ok, all entries of Japanese translation should
have :default instead of :utf-8.

Thanks,

--
yashi
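[Editorial sketch] The last point in this message, how the output coding system is picked, can be modelled in a few lines. This is a simplified sketch under the assumptions stated in the thread, not the actual ox.el logic; `my-demo-output-coding-system' is a hypothetical name, and back-end specific variables such as `org-html-coding-system' are deliberately ignored.

```emacs-lisp
;; Hypothetical helper, for illustration only; not part of Org.
(defun my-demo-output-coding-system ()
  "Return the coding system a generic export would be written with.
Prefer `org-export-coding-system' when it is bound and non-nil;
otherwise use the current buffer's file coding system."
  (or (bound-and-true-p org-export-coding-system)
      buffer-file-coding-system))

;; In a buffer saved as euc-jp, the result is `euc-jp' unless
;; `org-export-coding-system' has been set to something else.
(with-temp-buffer
  (setq-local buffer-file-coding-system 'euc-jp)
  (my-demo-output-coding-system))
```

Note that the language option plays no part here: the translation string is chosen by key and language, and the coding system is chosen independently when the result is written.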
Re: [O] [PATCH] Table continuation strings
Hi,

At Wed, 30 Oct 2013 11:15:36 +0100,
Nicolas Goaziou wrote:

> t...@tsdye.com (Thomas S. Dye) writes:
>
>> Patch includes table continuation strings for several languages.
>> Translations all from the internet. Caveat emptor.
>
> Applied. Thank you.
>
>> +     (ja :utf-8 "前ページから続く")
>
> [...]
>
>> +     (ja :utf-8 "次ページに続く")
>
> These will not be very helpful, though, as `latex' back-end (the only
> one to use this string so far) relies on :latex or :default
> properties, never on :utf-8.

I'm not a latex user but a Japanese speaker who'd like to use those
translation tables with other back-ends.

The thing I don't understand is the reason all Japanese entries have
`:utf-8'. Would you kindly enlighten me on the relationship among the
following:

 - translation coding key (i.e. :utf-8, :default, :html)
 - your current buffer coding system
 - `buffer-file-coding-system' and friends

BTW, 前ページから続く should be 前ページからの続き

Thanks,

--
yashi
Re: [O] [PATCH] Table continuation strings
Hi Tom,

t...@tsdye.com writes:

> Patch includes table continuation strings for several languages.
> Translations all from the internet. Caveat emptor.

The German strings are fine.

Best regards

--
Michael Strey
http://www.strey.biz
Re: [O] [PATCH] Table continuation strings
Hello,

t...@tsdye.com (Thomas S. Dye) writes:

> Patch includes table continuation strings for several languages.
> Translations all from the internet. Caveat emptor.

Applied. Thank you.

> +     (ja :utf-8 "前ページから続く")

[...]

> +     (ja :utf-8 "次ページに続く")

These will not be very helpful, though, as `latex' back-end (the only
one to use this string so far) relies on :latex or :default properties,
never on :utf-8.

Regards,

--
Nicolas Goaziou
Re: [O] [PATCH] Table continuation strings
Nicolas Goaziou n.goaz...@gmail.com writes:

> Hello,
>
> t...@tsdye.com (Thomas S. Dye) writes:
>
>> Patch includes table continuation strings for several languages.
>> Translations all from the internet. Caveat emptor.
>
> Applied. Thank you.
>
>> +     (ja :utf-8 "前ページから続く")
>
> [...]
>
>> +     (ja :utf-8 "次ページに続く")
>
> These will not be very helpful, though, as `latex' back-end (the only
> one to use this string so far) relies on :latex or :default
> properties, never on :utf-8.

We'll need a Japanese-speaking LaTeX user to chime in here. I've never
typeset Japanese in LaTeX and don't speak or read the language.

All the best,
Tom

--
T.S. Dye  Colleagues, Archaeologists
735 Bishop St, Suite 315, Honolulu, HI 96813
Tel: 808-529-0866, Fax: 808-529-0884
http://www.tsdye.com
[O] [PATCH] Table continuation strings
Aloha all,

Patch includes table continuation strings for several languages.
Translations all from the internet. Caveat emptor.

All the best,
Tom

From 0c551e51f5eff759957a415d7d29a830b43631d2 Mon Sep 17 00:00:00 2001
From: Thomas Dye t...@tsdye.com
Date: Tue, 29 Oct 2013 14:39:48 -1000
Subject: [PATCH] Table continuation strings for some languages

---
 lisp/ox.el | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/lisp/ox.el b/lisp/ox.el
index 141abc4..11a7510 100644
--- a/lisp/ox.el
+++ b/lisp/ox.el
@@ -5277,9 +5277,21 @@ them.
      (zh-CN :html "&#20316;&#32773;" :utf-8 "作者")
      (zh-TW :html "&#20316;&#32773;" :utf-8 "作者"))
     ("Continued from previous page"
-     (fr :default "Suite de la page précédente"))
+     (de :default "Fortsetzung von vorheriger Seite")
+     (es :default "Continúa de la página anterior")
+     (fr :default "Suite de la page précédente")
+     (it :default "Continua da pagina precedente")
+     (ja :utf-8 "前ページから続く")
+     (nl :default "Vervolg van vorige pagina")
+     (pt :default "Continuação da página anterior"))
     ("Continued on next page"
-     (fr :default "Suite page suivante"))
+     (de :default "Fortsetzung nächste Seite")
+     (es :default "Continúa en la siguiente página")
+     (fr :default "Suite page suivante")
+     (it :default "Continua alla pagina successiva")
+     (ja :utf-8 "次ページに続く")
+     (nl :default "Vervolg op volgende pagina")
+     (pt :default "Continua na página seguinte"))
     ("Date"
      (ca :default "Data")
      (cs :default "Datum")
--
1.8.3.3

--
T.S. Dye  Colleagues, Archaeologists
735 Bishop St, Suite 315, Honolulu, HI 96813
Tel: 808-529-0866, Fax: 808-529-0884
http://www.tsdye.com