On Thu, 4 Jul 2013 22:22:40 +0800 Marguerite Su wrote:

Hi Marguerite,

> >> It is possible that some kind guys writing some perl scripts to
> >> extract the translator credits in every po, generate a new xml, and
> >> integrate it into our existing documentation?

Attached is a shell script that, sort of, does what you want, based on the existing data.

The biggest problem with your .po files is the inconsistency of the translator data. Some translator credits are added as comments at the beginning of the file, whereas others are listed in the "translator-credits" section. Combinations of both exist, too. The "translator-credits" section contains the data in a number of variants, which makes it difficult to parse (there are even two different commas!). Individual authors also entered their credits differently from file to file (there are, for example, nine different credits from yourself ;-)). Interestingly, there is almost no occurrence of multi-line data in the "translator-credits" section (a fact I made use of).

The inconsistency of the data also makes the msggrep output useless for your purpose. It typically looks like this:

> msggrep -K -e 'translator-credits' po/zypper.xml.zh_CN.po
#
# Translators:
# <[email protected]>, 2013.
# Guo Yunhe <[email protected]>, 2013.
msgid ""
msgstr ""
"Project-Id-Version: opensuse-manuals\n"
"POT-Creation-Date: 2013-03-17 02:02+0800\n"
"PO-Revision-Date: 2013-03-02 03:36+0000\n"
"Last-Translator: guoyunhebrave <[email protected]>\n"
"Language-Team: Chinese (China) (http://www.transifex.com/projects/p/opensuse-"
"manuals/language/zh_CN/)\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"

#. Put one translator per line, in the form of NAME <EMAIL>, YEAR1, YEAR2
#: zypper.xml:0(None)
msgid "translator-credits"
msgstr "Sign Guo Yunhe <[email protected]>"

Parsing that output is not much different from parsing the complete .po file, so I did not make use of it.

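Because the credits almost always sit on a single msgstr line, pulling them out of a .po file needs little more than awk. What follows is a minimal sketch of that idea, not the attached script; the function name credits_line is made up for illustration, and multi-line msgstr values are deliberately not handled.

```shell
# Hypothetical helper, not part of extract-translators.sh.
# Prints the single-line msgstr that directly follows
# msgid "translator-credits" in the .po file given as $1.
credits_line() {
    awk '
        /^msgid "translator-credits"$/ { found = 1; next }
        found && /^msgstr "/ {
            sub(/^msgstr "/, "")   # drop the keyword and the opening quote
            sub(/"$/, "")          # drop the closing quote
            print
            exit
        }
        { found = 0 }              # any other line resets the state
    ' "$1"
}
```

Run from the po/ directory, a loop such as `for f in *.po; do credits_line "$f"; done` would print one raw credits line per file.
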
My script focuses on the "translator-credits" section (ignoring the comments) and tries to unify the existing variants. It is just a quick hack that lets you make use of the data without too much manual work. This version assumes that multiple entries occur on one line and are separated by ";" - that is the case in all but three files. It converts all usable data to

<member>2012, 玛丽苏 <ulink url="mailto:[email protected]"/></member>

which allows you to just cut and paste it into a <simplelist>. The output can easily be changed by adjusting the script. Errors and missing data are reported on STDERR, which will hopefully help with fixing the data.

To run it, go to the po/ directory and run the script like this:

extract-translators.sh 2>translator_errors.txt | sort -u

That will put the data on STDOUT (sorted, with duplicates removed) and the errors into translator_errors.txt.

For better results, the translator data needs to be cleaned up (at the very least, the comments at the top of the files need to be moved to the "translator-credits" section). And as long as there is no way or tool that lets you properly extract the "translator-credits" section from the .po file, I suggest putting the data on one line, separated by a RECORD_DELIMITER (see line 4 in the script).

Hope this helps.

--
Regards

Frank

Frank Sundermeyer, Technical Writer, Documentation
SUSE Linux Products GmbH, Maxfeldstr. 5, D-90409 Nuernberg
Tel: +49-911-74053-0, Fax: +49-911-7417755; http://www.opensuse.org/
SUSE Linux Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer,
HRB 16746 (AG Nürnberg)
"Reality is always controlled by the people who are most insane" Dogbert

extract-translators.sh
Description: application/shellscript
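
The conversion described in the mail (split the credits line on ";", put the year first, wrap the result in <member>) could be sketched roughly as follows. This is not the attached script. The function name credits_to_members is hypothetical, and the sketch assumes every usable entry follows the NAME <EMAIL>, YEAR1, YEAR2 form given in the .po comment; anything else is reported on STDERR.

```shell
# Hypothetical sketch, not the attached extract-translators.sh.
# Reads raw credits lines on stdin. Entries are assumed to be
# separated by ";" and to look like: NAME <EMAIL>, YEAR[, YEAR...]
credits_to_members() {
    tr ';' '\n' |
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
    while IFS= read -r entry; do
        [ -n "$entry" ] || continue
        # Reorder to "YEARS, NAME <ulink .../>"; prints nothing if
        # the entry does not match the assumed form.
        member=$(printf '%s\n' "$entry" |
            sed -n 's|^\([^<>]*[^<>[:space:]]\)[[:space:]]*<\([^<>@]*@[^<>]*\)>,[[:space:]]*\([0-9][0-9, ]*\)\.\{0,1\}$|<member>\3, \1 <ulink url="mailto:\2"/></member>|p')
        if [ -n "$member" ]; then
            printf '%s\n' "$member"
        else
            # Unusable data goes to STDERR so it can be fixed by hand.
            printf 'unparsable entry: %s\n' "$entry" >&2
        fi
    done
}
```

Piping the raw credits lines through `credits_to_members 2>translator_errors.txt | sort -u` would then give the same split between usable data and error report that the mail describes.
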
