Just a quick heads up that I finally took the plunge to add UAX#14 line 
breaking to FOP. This is based on code donated by Joerg quite some time 
ago, on which I did some work in October 2005. This was documented 
on the list at the time.

One of the major stumbling blocks in progressing this was the conflict 
between the recursive / nested getNextKnuthElement calls and the need 
to do the UAX#14 line breaking processing across inline boundaries.

In the end I decided, in the interest of making at least some progress 
in this area, to not attempt the 'all singing all dancing solution', 
but to simply apply this to the TextLayoutManager only. Yes, that gives 
us only limited new functionality, but hopefully it's still an 
improvement. Also, the code is based on the Unicode 4.1 standard and 
not 5.0, but that can be fixed later.
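
To illustrate the limitation, here is a toy sketch (not the actual FOP 
code, and not the real UAX#14 pair table). The class names, the 
three-class classification and the single break rule are all made up 
for illustration. It only shows why analysing each text run on its own 
cannot see a break opportunity that sits exactly on an inline boundary.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy example; names and rules are assumptions, not FOP code.
final class PerRunBreakSketch {

    // Toy classes loosely named after UAX#14 classes; NOT the real set.
    enum Cls { AL, SP, CL }

    static Cls classify(char c) {
        if (c == ' ') {
            return Cls.SP;
        }
        if (c == ')' || c == ',') {
            return Cls.CL;
        }
        return Cls.AL;
    }

    // Returns every index i such that a break is allowed before text.charAt(i).
    // Toy rule: a break is allowed only between a space and a following letter.
    static List<Integer> breakOpportunities(String text) {
        List<Integer> result = new ArrayList<>();
        for (int i = 1; i < text.length(); i++) {
            Cls before = classify(text.charAt(i - 1));
            Cls after = classify(text.charAt(i));
            if (before == Cls.SP && after == Cls.AL) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Whole paragraph text analysed at once: the pair (space, 't') is visible.
        System.out.println(breakOpportunities("one two"));
        // The same text split at an inline boundary and analysed per run:
        // neither run contains the (space, letter) pair, so nothing is found.
        System.out.println(breakOpportunities("one "));
        System.out.println(breakOpportunities("two"));
    }
}
```

In this toy the pair that decides the break straddles the two runs, so 
a per-run analysis needs some extra handling at run edges, which is the 
part the 'all singing all dancing' solution would have to address.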

It's looking OK so far and most of the layout engine tests pass. The 
change consists of a new package org.apache.fop.text.linebreak 
containing two classes and changes to the TextLayoutManager. Nothing 
else has been touched so far.

It's not ready for a commit yet, but hopefully will be in a few days.

The question that arises is whether this should go into the planned 
release, or whether that is too risky and I should either hold the 
commit until the release is out or do it in a branch.

Another issue is that one of the two new files is actually generated by 
a little Java program (also from Joerg) from Unicode data files. While 
it would be a 'nice to have' for this generation to be integrated into 
the FOP build, I would initially commit the generated file into the 
repository. To integrate the generation into the build we would either 
need to have the Unicode data files in the Apache repository (not sure 
about licensing issues here) or the build would need to fetch those 
files, causing an external dependency which usually is a hassle for 
people behind corporate firewalls etc. That's why I propose to apply 
the KISS principle initially.
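
For anyone curious what such a generator has to read, here is a minimal 
sketch of parsing one line of a Unicode data file. This is not Joerg's 
program. It assumes the 'code;class # comment' and 
'start..end;class # comment' line shapes used by LineBreak.txt in the 
Unicode Character Database, and the class and method names are 
hypothetical.

```java
// Hypothetical sketch; not the generator referred to above.
final class LineBreakDataSketch {

    // One parsed entry: an inclusive code point range and its line break class.
    static final class Range {
        final int start;
        final int end;
        final String cls;

        Range(int start, int end, String cls) {
            this.start = start;
            this.end = end;
            this.cls = cls;
        }
    }

    // Parses a single data line; returns null for blank or comment-only lines.
    static Range parseLine(String line) {
        int hash = line.indexOf('#');
        String data = (hash >= 0 ? line.substring(0, hash) : line).trim();
        if (data.isEmpty()) {
            return null;
        }
        String[] fields = data.split(";");
        String[] bounds = fields[0].trim().split("\\.\\.");
        int start = Integer.parseInt(bounds[0], 16);
        // A single code point is treated as a range of length one.
        int end = bounds.length > 1 ? Integer.parseInt(bounds[1], 16) : start;
        return new Range(start, end, fields[1].trim());
    }

    public static void main(String[] args) {
        Range r = parseLine("0041..005A;AL # LATIN CAPITAL LETTER A..Z");
        System.out.println(Integer.toHexString(r.start) + ".."
                + Integer.toHexString(r.end) + " " + r.cls);
    }
}
```

A generator would run something like this over the whole file and then 
write the resulting table out as Java source, which is the file I 
propose to commit in generated form.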

