[Summary: There are serious semantic problems with all "=for :lang en"-like
notations, and I don't want to let them into perlpodspec until I'm sure I
can identify the least-bad of the many ways to do it.]

At 05:40 PM 2001-10-01 +0300, Jarkko Hietaniemi wrote:
>[...] Moreover, I tried searching the new perlpod and perlpodspec, but
>couldn't find much mention of the i18n/l10n issues (what if somebody
>wants to write pods in Italian?)  (Maybe I just didn't find the
>discussion.) [...]

Well, you can just write in Italian.  "Ecco la funzione foo()!"  As long as
you save in Unicode or Latin-1 you should be fine.

However, if you want to write a part in Italian that is a content-alternate
with equivalent text in English, such that English-speakers see only the
English part, and Italian-speakers see only the Italian part, then THAT
opens up a whole can of worms.  I've been thinking about those worms
lately.  They are slimy and crawly!

The message
  http:[EMAIL PROTECTED]/msg00383.html
is what got me started thinking about it in the first place.

To paraphrase it, you could write something like this:

[newlines elided for concision]
=over
=begin :lang en
=item function($filename)
This is a function.
=end :lang en
=begin :lang fr
=item function($nom_de_fichier)
C'est une fonction.
=end :lang fr
...

This introduces a whole level of complexity into Pod parsing and
processing, because:

* No longer can the first (non-=pod non-=cut) command after =over be used
to say which of the four kinds of over-regions this is.
(Yes, that's just a problem with one command, but when you have only a
half-dozen commands in the language, every one of them counts.)

* It's not backwards compatible.  Pulling English things off into "=begin
:lang en"..."=end :lang en" regions (or "=begin :lang-en"..."=end :lang-en"
or however it's expressed -- maybe I prefer :lang-en) looks to all existing
processors as if all the content just up and disappeared, and that's REALLY
bad.  It's not "degrading gracefully", to reuse a term from HTML standards.
(I can think of a way that's backwards compatible AND degrades gracefully,
but it makes some of the other problems actually worse.)

I think this means that I don't want this in the current
perlpod/perlpodspec drafts, because the current drafts are basically
best-current-practices documents, and the few new(-seeming) things that they
introduce are backwards-compatible.  Really, I don't like breaking things!

* If/when you assemble a doctree, you can no longer say that the parent of
every 'item' node is an 'over' node (or whatever you call an =over...=back
region).  In general, this could make for pretty crazy trees, whereas
before the doctrees pretty much made good sense.  For example, consider:

[newlines elided for concision]
=head1 Foo
Bar
=over 
=item *
Stuff
=item *
Thing
=back

That assembles to probably something like this:

  * head1
     * "Foo"
  * p
     * "Bar"
  * over-bullet
     * item-bullet
        * "Stuff"
     * item-bullet
        * "Thing"

Nice and straightforward.  And if someone wants to convert this to a format
where headings can't stand on their own, but have to bracket the text that
they're a heading for, then it only takes a /little/ doing.  (Like: look
for all the right siblings of each "head1" node, stopping just before the
next "head1" node.)

However, consider:

[newlines elided for concision]
=begin :lang fr
=head1 Les Fonctions Publiques
Ces fonctions sont la pour vous!
=end :lang fr
=begin :lang en
=head1 Public Functions
These functions are for you!
=end :lang en
=over

=begin :lang fr
=item *
Des trucs
=end :lang fr
=begin :lang en
=item *
Stuff
=end :lang en

=begin :lang fr
=item *
Les choses
=end :lang fr
=begin :lang en
=item *
Things
=end :lang en

=back

The tree for that looks like this:

  * for (lang fr)
     * head1
        * "Les Fonctions Publiques"
     * p
        * "Ces fonctions sont la pour vous!"
  * for (lang en)
     * head1
        * "Public Functions"
     * p
        * "These functions are for you!"
  * over-bullet
     * for (lang fr)
        * item-bullet
           * "Des trucs"
     * for (lang en)
        * item-bullet
           * "Stuff"
     * for (lang fr)
        * item-bullet
           * "Les choses"
     * for (lang en)
        * item-bullet
           * "Things"

I think that makes for a rather harder-to-traverse doctree -- and our
example operation of "find the nodes governed by this =head1 node" has now
become really much more difficult.

Moreover, as you process these things (whether in a doctree or in an event
model), you can no longer get away with treating for/begin...end regions by
asking "is this region's target tag 'rtf'?  Because I'm an rtf formatter!",
then processing the nodes under it if so, and pruning otherwise.  Once you
introduce a "=for lang langtag", you instead have to see whether the target
is "lang", and if so, see whether the langtag is one of the languages you've
been told to process for (which maybe MUST be listed in the VERSION
section?  Or even before that?).
(BTW, I /think/ that a ":lang-fr" syntax instead solves some of these
problems.)
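
The difference between the two checks can be sketched in Python like this.
The function names, the handling of the leading colon, and the idea of
passing the language as a separate argument are all assumptions for the
sake of the example, not anything perlpodspec defines.

```python
def keep_by_target(target, accepted):
    """Old-style check: keep the region iff its target tag is accepted."""
    return target.lstrip(":") in accepted

def keep_with_lang(target, argument, accepted, languages):
    """Check needed once a separate "lang" target carries a language tag."""
    name = target.lstrip(":")
    if name == "lang":
        # A second lookup, against a second list, just for this one target.
        return argument in languages
    return name in accepted

# An rtf formatter looking at an rtf region: one set-membership test.
old_style = keep_by_target("rtf", {"rtf"})

# "=begin :lang fr" and "=begin :lang en" for a formatter told to do English.
french_region = keep_with_lang(":lang", "fr", {"rtf"}, {"en"})
english_region = keep_with_lang(":lang", "en", {"rtf"}, {"en"})

# With a fused ":lang-en" tag, the old single test works unchanged.
fused_tag = keep_by_target(":lang-en", {"rtf", "lang-en"})
```

Under these assumptions, the fused tag lets a language be treated as just
one more entry in the formatter's list of accepted targets.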


Now, it could be that the relationship of these things to doctree models
can be ironed out by saying that to get a doctree, you /have/ to say what
targets (like "rtf, lang-en, private") you're formatting for.  I strongly
resist that idea, partly because it carries the stink of preprocessors (which
are bad news for just about any language they touch, /especially/ markup
languages) and also because it really destroys one of my long-term goals
(and Larry's too, unless he changed his mind since 1999), which is that Pod
should be a notational variant of (and losslessly expressible as) a subset
of XML.  And currently it IS!  Pod::PXML, while still experimental (and
probably requiring minor revision in light of perlpodspec), makes it so.
Except possibly for <!-- comments --> in XML, you can convert POD to XML
and back without losing any information.  And that is true /only/ because
you can take a Pod document and get THE doctree, not just A doctree for my
target-set.  (And you can take a PXML document's doctree, and make /the/
Pod doctree from it.)
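
The round-trip property can be sketched in Python with the standard
xml.etree.ElementTree module.  The element names and the (name, children)
tuple encoding are assumptions for illustration; this is not the format
that Pod::PXML actually uses.

```python
import xml.etree.ElementTree as ET

def to_xml(node):
    """Turn a (name, children) node into an Element; strings become text."""
    name, children = node
    elem = ET.Element(name)
    last = None
    for child in children:
        if isinstance(child, str):
            if last is None:
                elem.text = (elem.text or "") + child
            else:
                last.tail = (last.tail or "") + child
        else:
            last = to_xml(child)
            elem.append(last)
    return elem

def from_xml(elem):
    """Inverse of to_xml: rebuild the (name, children) node."""
    children = []
    if elem.text:
        children.append(elem.text)
    for sub in elem:
        children.append(from_xml(sub))
        if sub.tail:
            children.append(sub.tail)
    return (elem.tag, children)

# One doctree for the whole document, with no target-set involved.
doc = ("pod", [
    ("head1", ["Foo"]),
    ("p", ["Bar"]),
    ("over-bullet", [
        ("item-bullet", ["Stuff"]),
        ("item-bullet", ["Thing"]),
    ]),
])

xml_text = ET.tostring(to_xml(doc), encoding="unicode")
round_tripped = from_xml(ET.fromstring(xml_text))
```

The round trip only works because there is a single tree to serialize; if
the tree depended on a target-set, the XML would capture just one of the
possible views.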

An alternate approach is that there are two kinds of trees:  one is THE
doctree, and has all these things like piles of "for" elements and whatnot,
as with the bigger tree shown above; and then you say "now, make this a
happy simple doctree by throwing out anything that's not in my target-set,
which is qw~lang-en rtf~".  And then the simplifier goes thru and promotes
some nodes, and destroys others, so that what you're left with is one of
the N-possible handy simplified doctrees, like this (for lang-en):

  * head1
     * "Public Functions"
  * p
     * "These functions are for you!"
  * over-bullet
     * item-bullet
        * "Stuff"
     * item-bullet
        * "Things"
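
A minimal Python sketch of such a simplifier follows.  The dict-based node
encoding, the el() and region() helpers, and the fused "lang-en" style
target names are assumptions made for this example.

```python
def el(name, *children):
    """Ordinary doctree node (hypothetical encoding)."""
    return {"name": name, "children": list(children)}

def region(target, *children):
    """A for/begin...end region aimed at one target."""
    return {"name": "for", "target": target, "children": list(children)}

def simplify(nodes, targets):
    """Promote the children of wanted regions; destroy unwanted regions."""
    out = []
    for node in nodes:
        if isinstance(node, str):
            out.append(node)
        elif node["name"] == "for":
            if node["target"] in targets:
                out.extend(simplify(node["children"], targets))  # promote
            # otherwise the whole region is dropped
        else:
            out.append(el(node["name"], *simplify(node["children"], targets)))
    return out

full_tree = [
    region("lang-fr", el("head1", "Les Fonctions Publiques"),
                      el("p", "Ces fonctions sont la pour vous!")),
    region("lang-en", el("head1", "Public Functions"),
                      el("p", "These functions are for you!")),
    el("over-bullet",
       region("lang-fr", el("item-bullet", "Des trucs")),
       region("lang-en", el("item-bullet", "Stuff")),
       region("lang-fr", el("item-bullet", "Les choses")),
       region("lang-en", el("item-bullet", "Things"))),
]

english = simplify(full_tree, {"lang-en", "rtf"})
```

Here english has the same shape as the simplified tree shown above, and
full_tree is left untouched, so both kinds of tree exist side by side.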

However, this dichotomy between The Doctree for a document and a usable
doctree for your target-set is still upsetting.
But maybe there's no way around it (or something very much like it).

I think it'll take a lot more thought (and me sitting in diners and
scribbling lots of code on the back of napkins), and I don't want to
prematurely add it to current perlpodspec until I'm sure of how it needs to
happen.  For the moment, I want the current perlpodspec (once I tidy up a
few loose ends people have commented on) to basically be
backwards-compatible and a best-current-practices document.


--
Sean M. Burke  [EMAIL PROTECTED]  http://www.spinn.net/~sburke/
