Created https://issues.apache.org/jira/browse/JOSHUA-326 for this.
On Wed, Dec 21, 2016 at 19:38, Matt Post <p...@cs.jhu.edu> wrote:

> 7 → master is indeed the plan, as soon as we ship 6.1.
>
> matt
>
> > On Dec 21, 2016, at 1:25 PM, Tommaso Teofili <tommaso.teof...@gmail.com> wrote:
> >
> > On Wed, Dec 21, 2016 at 16:00, Matt Post <p...@cs.jhu.edu> wrote:
> >
> >> Sure, that'd be nice to do. I'd love to get rid of the Perl scripts. Are
> >> you just throwing out an idea or are you interested in doing this?
> >
> > I'd be happy to do it. If Joern can help out, that would of course be
> > much appreciated.
> >
> >> I think the way to go would be to set this up on a branch (off 7), and
> >> then I could test it on some languages.
> >
> > Sure, and hopefully branch 7 becomes our new master soon after the 6.1
> > release.
> >
> > Regards,
> > Tommaso
> >
> >>> On Dec 21, 2016, at 5:33 AM, Tommaso Teofili <tommaso.teof...@gmail.com> wrote:
> >>>
> >>> Hi all,
> >>>
> >>> I was talking to Joern (Apache OpenNLP committer) recently, and the
> >>> idea came up that we could use OpenNLP for the data preprocessing
> >>> phase in Joshua, to handle tokenization, sentence detection, etc.
> >>> As I was reading through our doc [1], this is currently done with
> >>> dedicated scripts; we could make that part pluggable (with a default
> >>> simple Java implementation) and allow more fine-grained control over
> >>> it using libraries like OpenNLP.
> >>>
> >>> What would people think?
> >>>
> >>> Regards,
> >>> Tommaso
> >>>
> >>> [1] : https://cwiki.apache.org/confluence/display/JOSHUA/Project+Ideas
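
For illustration only, the "pluggable, with a default simple Java implementation" idea from the original proposal could be sketched roughly as below. The names `Preprocessor` and `SimplePreprocessor` are hypothetical and are not existing Joshua classes, and the splitting rules are deliberately naive placeholders rather than what the current scripts do. An OpenNLP-backed variant would presumably be a second class behind the same interface.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical extension point for preprocessing; not an existing Joshua API. */
interface Preprocessor {
    List<String> splitSentences(String text);
    List<String> tokenize(String sentence);
}

/** Naive default implementation with no external dependencies (assumption). */
final class SimplePreprocessor implements Preprocessor {

    @Override
    public List<String> splitSentences(String text) {
        List<String> sentences = new ArrayList<>();
        // Break after '.', '!' or '?' when followed by whitespace.
        for (String s : text.trim().split("(?<=[.!?])\\s+")) {
            if (!s.isEmpty()) {
                sentences.add(s);
            }
        }
        return sentences;
    }

    @Override
    public List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        // Surround each ASCII punctuation mark with spaces, then split on whitespace.
        String spaced = sentence.replaceAll("(\\p{Punct})", " $1 ").trim();
        for (String t : spaced.split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }
}
```

A caller would depend only on `Preprocessor`, so the implementation could be chosen by configuration without touching the rest of the pipeline.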