[EMAIL PROTECTED] wrote:
> I suggest we write a special language hyphenation file for URLs -- it is
> not a natural language, but it is one nevertheless, with its own lexical
> rules.

Interestingly, I just had the same thought.

> (Can someone provide me with a pointer to the pertinent spec?)
  http://www.rfc-editor.org/rfc/rfc2396.txt

> Stylesheets like DocBook's can take advantage of this by specifying the new
> language code, something like x-url. This approach can also be used with
> programming languages or other similar stuff, and it has already been
> proven to work with languages that can produce very long words (Mr.
> Pietschmann and the xml:lang='de' people should agree with me ;-).
> However, the hyphen would not be a good choice as the character to use at
> the break point: a better choice would be to use ellipses (...) in the
> preceding AND in the following line. Can this be achieved?

Certain problems here:
- There are quite a few places where the length of the language
   name is hardwired to 2 (or 5 if using a location). This doesn't
   mean "x-url" won't work, but I'd rather check before making
   promises.
- The hyphenation character is, well, a *character*, and it is
   appended to the first part of the hyphenated word. This is
   hardwired. I don't recommend hacking around there; the code is
   already very brittle and will be rewritten in HEAD anyway.
   Furthermore, "..." should be used with care because dots can
   occur in URLs and play a noticeable role, especially in the host
   part. I'd settle for a zero-width space or, perhaps, a backslash.
   The hyphenation character can be explicitly specified with the
   hyphenation-character property, and the spec mandates the hyphen
   char U+2010 as default (I think FOP uses a dash, but so what).
   There is a field in the hyphenation XML file for a language-specific
   hyphenation character, but I think it's ignored.
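
To illustrate the zero-width-space idea outside of FOP, here is a
minimal Python sketch. It is not FOP code: the function name, the set
of delimiters, and the `marker` parameter are all assumptions made for
illustration. It inserts a break opportunity after selected URL
delimiters and deliberately leaves dots alone.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE: invisible, but permits a line break


def add_break_opportunities(url, break_after="/?&=", marker=ZWSP):
    """Return `url` with `marker` inserted after each delimiter run.

    Hypothetical helper: the delimiter set is an assumption, not taken
    from RFC 2396 or from FOP. Dots are never used as break points.
    """
    out = []
    for i, ch in enumerate(url):
        out.append(ch)
        is_last = i + 1 == len(url)
        # Break only after the final delimiter of a run, so "//" stays intact.
        if ch in break_after and not is_last and url[i + 1] not in break_after:
            out.append(marker)
    return "".join(out)


# A visible marker makes the chosen break points easy to inspect.
print(add_break_opportunities("http://example.org/a/b", marker="|"))
```

Because the marker is only ever inserted, stripping it again gives back
the original URL, which a visible character such as "..." or a
backslash would not guarantee for a reader copying the text.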

> I can write such a hyphenation file if you people agree this is a sensible
> solution.
That would be interesting.

For everybody else interested: FOP uses the hyphenation
algorithm from TeX, which is described in "The TeXbook",
appendix H.
The TeX source of this book can be downloaded from a variety
of places; just type "texbook.tex" into Google.
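
For readers who want a feel for the pattern scheme, here is a rough
Python sketch of pattern-based break finding in the style of the TeX
algorithm. It is a simplification, not FOP's implementation. The three
URL patterns are made up for illustration. The sketch also uses a space
instead of TeX's "." as the word-boundary marker, because dots are
meaningful in URLs, and it ignores minimum prefix and suffix lengths.
Digits in a pattern are weights for the gap at that position; where
patterns overlap the highest weight wins, and an odd result permits a
break.

```python
import re

BOUNDARY = " "  # assumption: TeX uses ".", which would clash with URL dots


def parse_pattern(pattern):
    """Split a pattern such as '2.html' into its letters and gap weights."""
    letters = re.sub(r"\d", "", pattern)
    weights = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            weights[pos] = int(ch)  # weight of the gap before letters[pos]
        else:
            pos += 1
    return letters, weights


def break_points(word, patterns):
    """Return the indices i at which word[:i] | word[i:] is an allowed break."""
    padded = BOUNDARY + word + BOUNDARY
    levels = [0] * (len(padded) + 1)  # levels[i]: gap before padded[i]
    for letters, weights in map(parse_pattern, patterns):
        start = padded.find(letters)
        while start != -1:
            for offset, weight in enumerate(weights):
                levels[start + offset] = max(levels[start + offset], weight)
            start = padded.find(letters, start + 1)
    # Only gaps strictly inside the word count; odd levels allow a break.
    return [i - 1 for i in range(2, len(padded) - 1) if levels[i] % 2 == 1]


# Hypothetical patterns: break after "/", break before ".",
# but never before ".html" (the even weight 2 overrides the odd 1).
URL_PATTERNS = ["/1", "1.", "2.html"]
print(break_points("example.org/docs/index.html", URL_PATTERNS))
```

Since digits serve as weights in this notation, the toy patterns cannot
match digits that occur in a URL; a real x-url pattern file would have
to deal with that somehow.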

J.Pietschmann


