To be clearer: I propose that we replace FOP's implementation of UAX14 with ICU's line break iterator, and that ICU become a standard dependency of FOP.
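
For anyone unfamiliar with the API, here is a rough sketch of what a call site could look like. It is an illustration only, not code from FOP or from any branch: the class name `LineBreakSketch` and the method `breakOpportunities` are made up for this example. It uses the JDK's `java.text.BreakIterator` so that it runs without extra jars. ICU4J's `com.ibm.icu.text.BreakIterator` offers the same `getLineInstance`, `setText`, `first`, `next`, and `DONE` members, so the assumption is that changing the import would be enough to run it against ICU.

```java
import java.text.BreakIterator; // assumption: swap for com.ibm.icu.text.BreakIterator to use ICU4J
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of FOP or ICU.
class LineBreakSketch {

    // Returns the offsets in `text` at which a line break is permitted,
    // including the end-of-text boundary but excluding offset 0.
    public static List<Integer> breakOpportunities(String text, Locale locale) {
        BreakIterator it = BreakIterator.getLineInstance(locale);
        it.setText(text);
        List<Integer> offsets = new ArrayList<>();
        // first() returns the start boundary; next() walks forward until DONE.
        for (int pos = it.first(); pos != BreakIterator.DONE; pos = it.next()) {
            if (pos != 0) {
                offsets.add(pos);
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        System.out.println(breakOpportunities("Hello brave new world", Locale.ROOT));
    }
}
```

The point of the sketch is only that the iterator hands back break opportunities as plain offsets, which is the information a layout engine needs from a UAX14 implementation. How that would map onto the existing `org.apache.fop.text.linebreak` callers is not shown here.
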
However, before taking a decision on this, allow me to create a branch (on github) that actually makes this change so that folks can evaluate it. Is that a reasonable approach?

On Tue, Jun 18, 2013 at 6:04 PM, Glenn Adams <[email protected]> wrote:

> My position is that it is costing us in interoperability (I mean lack
> thereof) by failing to use ICU. I don't see any issue about size.
>
>
> On Tue, Jun 18, 2013 at 6:00 PM, Vincent Hennebert
> <[email protected]> wrote:
>
>> On 18/06/13 06:46, Glenn Adams wrote:
>> > Is there a reason FOP doesn't use ICU for determining line break
>> > boundaries? The FOP implementation of UAX14
>> > (org.apache.fop.text.linebreak) seems to be out of date and
>> > basically unmaintained. According to [1], a number of Apache
>> > projects are using it, including PDFBox, Xalan, and Xerces.
>>
>> I think the main reason in the past has been the size of the ICU4J jar
>> compared to FOP’s own jar:
>> http://markmail.org/thread/krkqlircefpuxlse
>>
>> I guess the topic could be revisited today. We could consider adding it
>> as an optional dependency, or acknowledge that full Unicode support is
>> taken for granted nowadays and use it by default.
>>
>> >
>> > [1] http://site.icu-project.org/#TOC-Apache-Projects
>> >
>>
>> Vincent
>>
>
