2011/9/12 Stefane Fermigier <[email protected]>:
>
> On Sep 12, 2011, at 10:48 AM, Rupert Westenthaler wrote:
>
>> On Mon, Sep 12, 2011 at 10:34 AM, Reto Bachmann-Gmür <[email protected]> wrote:
>>> Add them to svn? The files aren't that big. A similar issue but maybe
>>> a bit harder to solve is the downloaded dbpedia data, I don't think
>>> the released version should depend on third party servers for
>>> compiling.
>>>
>> The reason why the OpenNLP models are not yet hosted @apache.org is
>> because of licenses issues.
>
> Which are ? What's the license on the NLP models ? If they are on
> SourceForge, they should me open source.

Those models are statistically derived from copyrighted material that is
available to NLP researchers under a restrictive license "for research
purpose only". Hence the license of such derived work is somewhat "gray".

It would be better to have models trained on an explicitly annotated
corpus that is freely redistributable for any purpose. That's why I
started the pignlproc project to build models from Wikipedia and
contacted the OpenNLP developers to collaborate on this. They started an
effort in that direction, but so far nobody has had enough time to finish
building & testing models of good enough quality.

> If they are under a license incompatible with apache.org, OK, but nothing
> prevents the IKS project from hosting open source stuff, right ?

+1 for mirroring the models on an IKS server.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
