Follow-up Comment #3, bug #11294 (project wesnoth):
= "Spec" on the node info extraction =
The goal is to present the translator not only with the text to be
translated, but also with some useful info about that text. A prime example,
though not the only one, is the speaker of a piece of text in a dialog.
For lack of better (or any, honestly) knowledge of WML, I'll call
[foo]...[/foo] a node, bar=... a variable, and all variables and child nodes
of a given node siblings. Then, by their merit for translation, I divide
variables into extractable, informational, and uninteresting. With this setup,
we want to collect, for each extractable (which becomes the msgid field of a
PO entry), all name=value pairs of its sibling informational variables: the
node info (formatted as an auto-comment field of a PO entry).
An example WML file:
1| # File kazoom.cfg
2| [node1]
3| unint_foo=...
4| inform_bar=value_bar
5| extract_baz=_"Translate this"
6| [subnode1]
7| unint_pong=...
8| extract_ding=_"This to translate as well"
9| inform_bong=value_bong
10| extract_gong=_"Also translate this"
11| [/subnode1]
12| inform_qwyx=value_qwyx
13| unint_quak=...
14| extract_qorq=_"Also translate this"
15| [/node1]
should result in PO file entries:
#. node1: inform_bar=value_bar, inform_qwyx=value_qwyx
#: kazoom.cfg:5
msgid "Translate this"
msgstr ""
#. subnode1: inform_bong=value_bong
#: kazoom.cfg:8
msgid "This to translate as well"
msgstr ""
#. subnode1: inform_bong=value_bong
#. node1: inform_bar=value_bar, inform_qwyx=value_qwyx
#: kazoom.cfg:10 kazoom.cfg:14
msgid "Also translate this"
msgstr ""
Comments:
* For the extractable extract_baz, the informational siblings within node1
are inform_bar and inform_qwyx, so both are extracted into the node info
#. node1: ... of the "Translate this" PO entry.
* Extractable extract_ding belongs to subnode1 and has only inform_bong as an
informational sibling. Hence that is the only node info for the "This to
translate as well" PO entry.
* Extractables extract_gong and extract_qorq belong to different nodes, but
have the same text value, "Also translate this". All PO entries must be
unique, therefore the #: comment (source reference) of the last PO entry
above mentions two source lines, and the two respective node infos are
stacked one above the other.
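
The merging described in the last comment could be sketched roughly as
follows. This is only an illustration, not the actual extractor: the function
name to_po and the (filename, lineno, msgid, node_info) record layout are
made up for the example, and the string escaping is minimal.

```python
def to_po(records):
    """Merge records into PO text, one entry per distinct msgid.

    records: iterable of (filename, lineno, msgid, node_info_or_None),
    assumed to be already in order of appearance (hypothetical layout).
    """
    entries = {}  # msgid -> collected comments; dicts keep insertion order
    for filename, lineno, msgid, info in records:
        entry = entries.setdefault(msgid, {"refs": [], "infos": []})
        entry["refs"].append("%s:%d" % (filename, lineno))
        if info:
            entry["infos"].append(info)

    blocks = []
    for msgid, entry in entries.items():
        # Node infos are stacked as auto-comments, one per line.
        lines = ["#. " + info for info in entry["infos"]]
        # All source references share a single "#:" line.
        lines.append("#: " + " ".join(entry["refs"]))
        escaped = (msgid.replace("\\", "\\\\")
                        .replace('"', '\\"')
                        .replace("\n", "\\n"))
        lines.append('msgid "%s"' % escaped)
        lines.append('msgstr ""')
        blocks.append("\n".join(lines))
    # PO entries are conventionally separated by a blank line.
    return "\n\n".join(blocks) + "\n"
```

Fed the four extractables of kazoom.cfg in order, this yields three entries,
the last one carrying both node infos and both source references.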
Extractable variables are recognized by their value: an underscore preceding
a quoted literal. To differentiate between informational and uninteresting
variables, the informational ones are explicitly listed (at present:
speaker|role|description|condition|type|race).
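
A minimal sketch of this three-way classification, for illustration only. The
function name classify and the regular expressions are assumptions of the
example; it looks at one-line name=value pairs and does not attempt multi-line
strings, escaped quotes, or macros.

```python
import re

# The informational names listed in the text above.
INFORMATIONAL = {"speaker", "role", "description", "condition", "type", "race"}

# name=_"text" on a single line (an underscore preceding a quoted literal).
EXTRACTABLE_RE = re.compile(r'^\s*(\w+)\s*=\s*_\s*"(.*)"\s*$')
# Any other one-line name=value pair.
PLAIN_RE = re.compile(r'^\s*(\w+)\s*=\s*(.*?)\s*$')


def classify(line):
    """Return (kind, name, value) for a variable line, or None otherwise."""
    m = EXTRACTABLE_RE.match(line)
    if m:
        return ("extractable", m.group(1), m.group(2))
    m = PLAIN_RE.match(line)
    if m:
        name, value = m.groups()
        kind = "informational" if name in INFORMATIONAL else "uninteresting"
        return (kind, name, value)
    return None  # a tag, a comment, or something this sketch does not handle
```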
Source references as well as node info comments should in general be stated
within the PO entry in the order in which the messages appear in the WML
file. This is not a hard requirement, but it should come naturally from the
implementation.
The implementation challenge I encountered was parsing the WML. Since a
sibling informational variable for a given extractable may occur either
before or after it, possibly with subnodes in between, a node stack must be
maintained. This wouldn't be a problem for WML as simple as in the example,
but the stuff "in the wild" is pretty hairy to my untrained eyes :)
A full-fledged Python WML parser would be peachy for this purpose.
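
To make the node stack idea concrete, here is a rough sketch that copes only
with WML as simple as the example (one tag or one name=value pair per line,
properly nested). It is nowhere near a real WML parser; the function name
extract and the is_informational callback are made up for the illustration.

```python
import re

OPEN_RE = re.compile(r'^\s*\[(\w+)\]\s*$')
CLOSE_RE = re.compile(r'^\s*\[/(\w+)\]\s*$')
ATTR_RE = re.compile(r'^\s*(\w+)\s*=\s*(.*?)\s*$')
TEXT_RE = re.compile(r'^_\s*"(.*)"$')


def extract(lines, is_informational):
    """Return [(lineno, text, node_info_or_None)] in order of appearance."""
    stack = []  # currently open nodes, innermost last
    found = []
    for lineno, line in enumerate(lines, start=1):
        m = OPEN_RE.match(line)
        if m:
            stack.append({"name": m.group(1), "info": [], "texts": []})
            continue
        m = CLOSE_RE.match(line)
        if m:
            if not stack or stack[-1]["name"] != m.group(1):
                raise ValueError("line %d: unexpected [/%s]" % (lineno, m.group(1)))
            node = stack.pop()
            # Only at the closing tag are all siblings known, including the
            # informational ones that came after the extractable.
            info = None
            if node["info"]:
                info = node["name"] + ": " + ", ".join(node["info"])
            found.extend((n, text, info) for n, text in node["texts"])
            continue
        m = ATTR_RE.match(line)
        if not m:
            continue  # comments, blank lines, anything unrecognized
        name, value = m.groups()
        text = TEXT_RE.match(value)
        if text and stack:
            stack[-1]["texts"].append((lineno, text.group(1)))
        elif text:
            found.append((lineno, text.group(1), None))  # top-level text
        elif stack and is_informational(name):
            stack[-1]["info"].append(name + "=" + value)
    if stack:
        raise ValueError("unclosed [%s]" % stack[-1]["name"])
    # Nodes close inner-first, so restore the order of appearance.
    return sorted(found, key=lambda record: record[0])
```

For the kazoom.cfg example one would call it with something like
is_informational=lambda name: name.startswith("inform_"); for real WML the
callback would test membership in the explicit list of informational names.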
As a robustness measure, since node info is nice to have but not absolutely
critical, the extractor should not exit on seemingly invalid WML. Instead, it
should just stop collecting node info downstream of the error point, warn
about it, and continue collecting only translatable text. Also, since
typically many WML files are extracted in one run into a single PO file, node
info collection should recover when switching to the next WML file after the
erroneous one.
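
One possible shape for this behaviour, as a sketch under the same "simple
WML" assumptions (one tag or name=value pair per line). The names
extract_file and extract_all are hypothetical. What to do with texts of nodes
still open at the error point is not specified above; here they are kept
without node info.

```python
import re
import warnings

TAG_RE = re.compile(r'^\s*\[(/?)(\w+)\]\s*$')
ATTR_RE = re.compile(r'^\s*(\w+)\s*=\s*(.*?)\s*$')
TEXT_RE = re.compile(r'^_\s*"(.*)"$')
INFORMATIONAL = {"speaker", "role", "description", "condition", "type", "race"}


def extract_file(filename, lines):
    """Return [(lineno, text, node_info_or_None)]; never raises on bad nesting."""
    stack, found = [], []
    tracking = True  # False once node info has been given up for this file

    def give_up(lineno, why):
        nonlocal tracking
        warnings.warn("%s:%d: %s; node info disabled for the rest of the file"
                      % (filename, lineno, why))
        for node in stack:  # texts of still-open nodes are kept, without info
            found.extend((n, text, None) for n, text in node["texts"])
        stack.clear()
        tracking = False

    lineno = 0
    for lineno, line in enumerate(lines, start=1):
        tag = TAG_RE.match(line)
        if tag and tracking:
            closing, name = tag.groups()
            if not closing:
                stack.append({"name": name, "info": [], "texts": []})
            elif stack and stack[-1]["name"] == name:
                node = stack.pop()
                info = None
                if node["info"]:
                    info = node["name"] + ": " + ", ".join(node["info"])
                found.extend((n, text, info) for n, text in node["texts"])
            else:
                give_up(lineno, "unexpected [/%s]" % name)
            continue
        attr = ATTR_RE.match(line)
        if not attr:
            continue
        name, value = attr.groups()
        text = TEXT_RE.match(value)
        if text and tracking and stack:
            stack[-1]["texts"].append((lineno, text.group(1)))
        elif text:
            found.append((lineno, text.group(1), None))  # text is always kept
        elif tracking and stack and name in INFORMATIONAL:
            stack[-1]["info"].append(name + "=" + value)
    if stack:
        give_up(lineno, "unclosed [%s]" % stack[-1]["name"])
    return sorted(found, key=lambda record: record[0])


def extract_all(files):
    """files: dict of filename -> list of lines. All state is per file, so
    node info collection starts afresh with each new file."""
    return {name: extract_file(name, lines) for name, lines in files.items()}
```

Keeping the stack and the tracking flag local to the per-file function is
what makes the recovery on the next file automatic.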
_______________________________________________________
Reply to this item at:
<http://gna.org/bugs/?11294>