On Tue, 1 Nov 2005 01:33 am, [EMAIL PROTECTED] wrote:
> Hi all,
>
>         Just an FYI, Batik also currently has an implementation of
> the Unicode TR14 word breaking alg.
> (org.apache.batik.gvt.flow.TextLineBreak).
>
>         As far as performance is concerned it should be fairly fast
> as it is mostly just table based.
>
Thomas, thanks for the pointer. (Note to self: I need to become more
aware of what's in the Batik code base. Feeble excuse: Joerg didn't
seem to know either.)
I had a look at the Batik code: it is the same algorithm Joerg wrote
(not surprising, as UAX#14 actually contains real C code), with very
similar internal data structures. The data structures are hard-coded
rather than generated from the Unicode text files. The API is
different; in particular, it relies on Batik-specific types being
passed across, not just plain Strings (but this could probably be
handled by a wrapper).
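
To make the wrapper idea concrete, here is a rough sketch of a
String-only facade with a toy table-driven implementation behind it.
Everything in it is an assumption for illustration: the names
LineBreakFinder and ToyPairTableBreaker are hypothetical, and the
three-class table is made up and far smaller than the real UAX#14
pair table. It does not use or reproduce the Batik or FOP code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical String-based facade; not an existing Batik or FOP interface. */
interface LineBreakFinder {
    /** Returns the indices in text before which a line break is allowed. */
    List<Integer> findBreakOpportunities(String text);
}

/** Toy table-driven breaker with three made-up classes (not the UAX#14 tables). */
final class ToyPairTableBreaker implements LineBreakFinder {
    private static final int AL = 0; // anything that is not a space or hyphen
    private static final int SP = 1; // space
    private static final int HY = 2; // hyphen-minus

    // ALLOWED[before][after]: may a line break between these two classes?
    private static final boolean[][] ALLOWED = {
        /* before AL */ {false, false, false},
        /* before SP */ {true,  false, true},
        /* before HY */ {true,  false, false},
    };

    private static int classify(char c) {
        if (c == ' ') return SP;
        if (c == '-') return HY;
        return AL;
    }

    @Override
    public List<Integer> findBreakOpportunities(String text) {
        List<Integer> breaks = new ArrayList<>();
        // Look at each adjacent pair of characters and consult the table.
        for (int i = 1; i < text.length(); i++) {
            int before = classify(text.charAt(i - 1));
            int after = classify(text.charAt(i));
            if (ALLOWED[before][after]) {
                breaks.add(i);
            }
        }
        return breaks;
    }
}
```

A wrapper around the Batik class would implement the same interface
and convert the String into whatever types Batik expects; that part is
left out here because it depends on the actual Batik API.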

This probably strengthens the argument for making all of this part of
XMLGraphics Common....grumble...grumble...

My main reason for hesitation with the XMLGraphics Common approach is
simple manpower. We need to set up the infrastructure (Subversion,
mailing lists, web site, etc.) and then maintain it. We would
basically be publishing APIs that are currently internal to Batik and
FOP, with all the resultant support headaches. For example, I would
not like to see my time diluted at the moment by having to discuss API
needs outside of FOP/Batik. Actually, I am reluctant to even dive into
the Batik code base at the moment. FOP is complicated enough to digest.

Hmmm... not sure where to go from here.

Manuel
