[Summary: There are serious semantic problems with all "=for :lang en"-like notations, and I don't want to let it into perlpodspec until I'm sure I can identify the least-bad of the many ways to do it.]
At 05:40 PM 2001-10-01 +0300, Jarkko Hietaniemi wrote:
>[...] Moreover, I tried searching the new perlpod and perlpodspec, but
>couldn't find much mention of the i18n/l10n issues (what if somebody
>wants to write pods in Italian?)  (Maybe I just didn't find the
>discussion.) [...]

Well, you can just write in Italian.  "Ecco la funzione foo()!"  As long as you save in Unicode or Latin-1 you should be fine.

However, if you want to write a part in Italian that is a content-alternate with equivalent text in English, such that English-speakers see only the English part and Italian-speakers see only the Italian part, then THAT opens up a whole can of worms.

I've been thinking about those worms lately.  They are slimy and crawly!

The message http:[EMAIL PROTECTED]/msg00383.html is what got me started thinking about it in the first place.  To paraphrase it, you could write something like this:

[newlines elided for concision]

  =over
  =begin :lang en
  =item function($filename)
  This is a function.
  =end :lang en
  =begin :lang fr
  =item function($nom_de_fichier)
  C'est une fonction.
  =end :lang fr
  ...

This introduces a whole level of complexity into Pod parsing and processing, because:

* No longer can the first (non-=pod non-=cut) command after =over be used to say which of the four kinds of over-regions this is.  (Yes, that's just a problem with one command, but when you have only a half-dozen commands in the language, every one of them counts.)

* It's not backwards compatible.  Pulling English things off into "=begin :lang en"..."=end :lang en" regions (or "=begin :lang-en"..."=end :lang-en" or however it's expressed -- maybe I prefer :lang-en) looks to all existing processors as if all the content just up and disappeared, and that's REALLY bad.  It's not "degrading gracefully", to reuse a term from HTML standards.  (I can think of a way that's backwards compatible AND degrades gracefully, but it makes some of the other problems actually worse.)
  I think this means that I don't want this in the current perlpod/perlpodspec drafts, because the current drafts are basically best-current-practices documents, and the few new(-seeming) things that they introduce are backwards-compatible.  Really, I don't like breaking things!

* If/when you assemble a doctree, you can no longer say that the parent of every 'item' node is an 'over' node (or whatever you call an =over...=back region).  In general, this could make for pretty crazy trees, whereas before the doctrees pretty much made good sense.

For example, consider:

[newlines elided for concision]

  =head1 Foo
  Bar
  =over
  =item * Stuff
  =item * Thing
  =back

That assembles to probably something like this:

  * head1
    * "Foo"
  * p
    * "Bar"
  * over-bullet
    * item-bullet
      * "Stuff"
    * item-bullet
      * "Thing"

Nice and straightforward.  And if someone wants to convert this to a format where headings can't stand on their own, but have to bracket the text that they're a heading for, then it only takes a /little/ doing.  (Like: look for all the right siblings of each "head1" node, stopping just before the next "head1" node.)

However, consider:

[newlines elided for concision]

  =begin :lang fr
  =head1 Les Fonctions Publiques
  Ces fonctions sont la pour vous!
  =end :lang fr
  =begin :lang en
  =head1 Public Functions
  These functions are for you!
  =end :lang en
  =over
  =begin :lang fr
  =item * Des trucs
  =end :lang fr
  =begin :lang en
  =item * Stuff
  =end :lang en
  =begin :lang fr
  =item * Les choses
  =end :lang fr
  =begin :lang en
  =item * Things
  =end :lang en
  =back

The tree for that looks like this:

  * for (lang fr)
    * head1
      * "Les Fonctions Publiques"
    * p
      * "Ces fonctions sont la pour vous!"
  * for (lang en)
    * head1
      * "Public Functions"
    * p
      * "These functions are for you!"
  * over-bullet
    * for (lang fr)
      * item-bullet
        * "Des trucs"
    * for (lang en)
      * item-bullet
        * "Stuff"
    * for (lang fr)
      * item-bullet
        * "Les choses"
    * for (lang en)
      * item-bullet
        * "Things"

I think that makes for a rather harder-to-traverse doctree -- and our example operation of "find the nodes governed by this =head1 node" has now become really much more difficult: each head1 now sits inside a "for" node, so the nodes it governs are no longer simply its right siblings.

Moreover, as you process these things (whether in a doctree or in an event model), you can no longer get away with treating for/begin...end regions by asking "is the target tag 'rtf'?  Because I'm an rtf formatter!" and then, if so, processing the nodes under it, and otherwise pruning.  Once you introduce a "=for lang langtag", you instead have to see whether the target is "lang", and if so, see whether the langtag is that of any of the languages you've been told to process for (which maybe MUST be listed in the VERSION section?  Or even before that?).  (BTW, I /think/ that a ":lang-fr" syntax instead solves some of these problems.)

Now, it could be that the relationship of these things to doctree models can be ironed out by saying that to get a doctree, you /have/ to say what targets (like "rtf, lang-en, private") you're formatting for.  I strongly resist that idea, partly because it carries the stink of preprocessors (which are bad news for just about any language they touch, /especially/ markup languages) and also because it really destroys one of my long-term goals (and Larry's too, unless he changed his mind since 1999), which is that Pod should be a notational variant of (and losslessly expressible as) a subset of XML.

And currently it IS!  Pod::PXML, while still experimental (and probably requiring minor revision in light of perlpodspec), makes it so.  Except possibly for <!-- comments --> in XML, you can convert POD to XML and back without losing any information.  And that is true /only/ because you can take a Pod document and get THE doctree, not just A doctree for my target-set.
(And you can take a PXML document's doctree, and make /the/ Pod doctree from it.)

An alternate approach is that there are two kinds of trees: one is THE doctree, and has all these things like piles of "for" elements and whatnot, as with the bigger tree shown above; and then you say "now, make this a happy simple doctree by throwing out anything that's not in my target-set, which is qw~lang-en rtf~".  And then the simplifier goes thru and promotes some nodes, and destroys others, so that what you're left with is one of the N-possible handy simplified doctrees, like this (for lang-en):

  * head1
    * "Public Functions"
  * p
    * "These functions are for you!"
  * over-bullet
    * item-bullet
      * "Stuff"
    * item-bullet
      * "Things"

However, this dichotomy between The Doctree for a document and a usable doctree for your target-set is still upsetting.  But maybe there's no way around it (or something very much like it).

I think it'll take a lot more thought (and me sitting in diners and scribbling lots of code on the back of napkins), and I don't want to prematurely add it to the current perlpodspec until I'm sure of how it needs to happen.

For the moment, I want the current perlpodspec (once I tidy up a few loose ends people have commented on) to basically be backwards-compatible and a best-current-practices document.

--
Sean M. Burke  [EMAIL PROTECTED]  http://www.spinn.net/~sburke/
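
A rough sketch of the simplifying pass described above, in Python rather than Perl for brevity.  Everything named here is an assumption made for illustration: the Node class, the simplify() and governed_by() functions, and the choice to spell targets as "lang-en"/"lang-fr" are not taken from perlpodspec, Pod::PXML, or any existing Pod module.  The idea is only that "for" regions whose target is in the target-set get their children promoted into the parent, every other "for" region is destroyed, and afterwards the plain right-sibling scan for "what does this head1 govern" works again.

```python
from dataclasses import dataclass, field
from typing import List, Set, Union


@dataclass
class Node:
    """Hypothetical doctree node; 'target' is only used when kind == 'for'."""
    kind: str
    children: List[Union["Node", str]] = field(default_factory=list)
    target: str = ""


def simplify(children, targets: Set[str]):
    """Build a simplified child list for one target-set from THE doctree."""
    out = []
    for child in children:
        if isinstance(child, str):
            out.append(child)                      # text passes through untouched
        elif child.kind == "for":
            if child.target in targets:
                # Promote: splice the region's children into the parent.
                out.extend(simplify(child.children, targets))
            # Otherwise destroy: the whole region is pruned.
        else:
            out.append(Node(child.kind, simplify(child.children, targets)))
    return out


def governed_by(siblings, index: int):
    """Right siblings of the head1 at siblings[index], up to the next head1."""
    governed = []
    for node in siblings[index + 1:]:
        if isinstance(node, Node) and node.kind == "head1":
            break
        governed.append(node)
    return governed


# THE doctree from the bilingual example, with assumed "lang-xx" target names.
full = [
    Node("for", [Node("head1", ["Les Fonctions Publiques"]),
                 Node("p", ["Ces fonctions sont la pour vous!"])], "lang-fr"),
    Node("for", [Node("head1", ["Public Functions"]),
                 Node("p", ["These functions are for you!"])], "lang-en"),
    Node("over-bullet", [
        Node("for", [Node("item-bullet", ["Des trucs"])], "lang-fr"),
        Node("for", [Node("item-bullet", ["Stuff"])], "lang-en"),
        Node("for", [Node("item-bullet", ["Les choses"])], "lang-fr"),
        Node("for", [Node("item-bullet", ["Things"])], "lang-en"),
    ]),
]

# One of the N possible simplified doctrees: the one for qw~lang-en rtf~.
english = simplify(full, {"lang-en", "rtf"})
section = governed_by(english, 0)
```

Note that simplify() returns a new tree and leaves the full one alone, so THE doctree stays available for lossless round-tripping while each formatter works from its own simplified copy.  That is exactly the two-trees dichotomy I said was upsetting; the sketch doesn't make it go away, it only shows how small the simplifier itself might be.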
