On Mon, May 27, 2019 at 01:56:29PM +0200, Tino Didriksen wrote:
> The PR https://github.com/apertium/apertium/pull/47 wants to add a direct
> dependency on ICU. I am in favour of this, but figured it should be brought
> up on the list.
>
> Reasoning:
> - HFST and CG-3 both require ICU, and ICU has been the official Unicode
> library for 3 years now.
> - lttoolbox requires libxml2, and libxml2 requires ICU - so Apertium
> already has a transitive dependency on ICU.
> - Language development requires libxml2-utils to get xmllint, which again
> transitively requires ICU.

I think at least HFST and libxml2 have configurable ICU support that can
be turned off with an acceptable loss of functionality.

> So we might as well embrace ICU entirely - also in other parts of lttoolbox
> and the wider Apertium tools.

I would agree. In the past one could have argued that new dependencies
make things harder to install, and ICU was not the easiest library to work
with, but with current packaging that is not such a big concern. I think
ICU is probably still quite big and slow, but we could immediately make
use of it in a few places, such as the OOV tokenisation problems we've seen
in recent issues, and I think that outweighs the cost.

-- 
Doktor Tommi A Pirinen, Computational Linguist,
<https://flammie.github.io/purplemonkeydishwasher/>,
Universität Hamburg, Hamburger Zentrum für Sprachkorpora <http://hzsk.de>.
CLARIN-D Entwickler. President of ACL SIGUR SIG for Uralic languages
<http://gtweb.uit.no/sigur/>.
I tend to follow inline-posting style in desktop e-mail messages.
_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff