Karen Lease wrote:
> That's a good question. As far as I've seen there's nothing official in 
> the specification, so it's up to each implementation to handle this. I 
> believe it falls in the category of things which could be defined by the 
> "user agent" idea in FOP and then used by the line-breaking algorithm. 
> In the redesign branch of FOP, the line-breaking code assumes it has a 
> list of legal line-break characters; clearly this needs to depend on the 
> script, which isn't yet the case.

There seems to be a general consensus that an FO processor
has to be a Unicode-compliant application, even though the
spec appears to carefully avoid making this explicit. This
would mean implementing the Unicode TR14 line breaking
algorithm, which is basically script independent (it
inherently handles *all* scripts), apart from cases which
are explicitly left to "a higher application layer" (which
I'd recommend reading as: "let the user insert soft hyphens
and zero width spaces if he wants to have breaks").
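
To illustrate the "let the user insert soft hyphens and zero width
spaces" reading, here is a minimal sketch in Java. It is not the TR14
algorithm and not FOP code: the class and method names are made up,
and it only reports a break opportunity after a run of plain spaces,
after a soft hyphen (U+00AD) and after a zero width space (U+200B).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of FOP: finds only the break
// opportunities that the author has marked explicitly.
public class ExplicitBreaks {
    static final char SOFT_HYPHEN = '\u00AD';
    static final char ZERO_WIDTH_SPACE = '\u200B';

    /**
     * Returns the offsets at which a new line may start. An offset i
     * means "a break is allowed between char i-1 and char i".
     */
    static List<Integer> breakOpportunities(String text) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < text.length() - 1; i++) {
            char c = text.charAt(i);
            char next = text.charAt(i + 1);
            // Break after the last space of a run, never inside the run.
            boolean afterSpaces = c == ' ' && next != ' ';
            if (afterSpaces || c == SOFT_HYPHEN || c == ZERO_WIDTH_SPACE) {
                result.add(i + 1);
            }
        }
        return result;
    }
}
```

For the input "hy", soft hyphen, "phen zero", zero width space,
"width" this yields one offset after each of the three marked
positions and none inside the words themselves.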

TR14 needs a table of certain character properties to be present,
which can be derived from the official database. We probably
need such a table for other purposes too, for example writing
direction and word separation. I fear that an implementation which
is not too heavyweight could be tricky. However, there is a
report from the Unicode consortium which covers exactly this
topic; I just haven't had time to read it yet.
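
To make the table idea a bit more concrete, here is a rough sketch in
Java of one lightweight layout: sorted code point ranges searched with
a binary search. It is only an illustration under some assumptions:
the input lines follow the "code;class # comment" or
"first..last;class # comment" layout used by the LineBreak.txt data
file, they are already sorted and non-overlapping, and anything not
listed is reported as class XX (unknown). LineBreakTable and classOf
are hypothetical names, not existing FOP classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a compact line break class lookup table.
public class LineBreakTable {
    private final int[] starts;     // first code point of each range
    private final int[] ends;       // last code point of each range
    private final String[] classes; // line break class of each range

    /** Builds the table from sorted, non-overlapping data lines. */
    public LineBreakTable(List<String> dataLines) {
        List<int[]> ranges = new ArrayList<>();
        List<String> names = new ArrayList<>();
        for (String line : dataLines) {
            // Drop the trailing comment and skip blank lines.
            int hash = line.indexOf('#');
            String body = (hash >= 0 ? line.substring(0, hash) : line).trim();
            if (body.isEmpty()) {
                continue;
            }
            String[] fields = body.split(";");
            String[] range = fields[0].trim().split("\\.\\.");
            int start = Integer.parseInt(range[0], 16);
            int end = range.length > 1 ? Integer.parseInt(range[1], 16) : start;
            ranges.add(new int[] {start, end});
            names.add(fields[1].trim());
        }
        starts = new int[ranges.size()];
        ends = new int[ranges.size()];
        classes = new String[ranges.size()];
        for (int i = 0; i < ranges.size(); i++) {
            starts[i] = ranges.get(i)[0];
            ends[i] = ranges.get(i)[1];
            classes[i] = names.get(i);
        }
    }

    /** Returns the class for a code point, or "XX" if it is not listed. */
    public String classOf(int codePoint) {
        int lo = 0;
        int hi = starts.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (codePoint < starts[mid]) {
                hi = mid - 1;
            } else if (codePoint > ends[mid]) {
                lo = mid + 1;
            } else {
                return classes[mid];
            }
        }
        return "XX";
    }
}
```

Because runs of characters with the same class collapse into a single
range, three parallel arrays like this stay far smaller than one entry
per code point, at the cost of a logarithmic lookup. The same layout
would presumably work for other properties such as writing direction.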

