Re: Getting our first release out

Jörn Kottmann Tue, 01 Feb 2011 08:08:17 -0800

On 2/1/11 4:57 PM, Grant Ingersoll wrote:

Your timing is great, as I was just about to suggest the same thing.


On Feb 1, 2011, at 6:51 AM, Jörn Kottmann wrote:

Hi all,

I would like to go ahead and get our first release out. The release is
backward compatible with the models we had over at SourceForge,
which means we do not need to release new models right now.

The logic to train most of the models is already included in OpenNLP,
so our users can train the models themselves or even mix in
their own data.

To release the models at Apache we have to go through a series of legal
issues, which I believe should not postpone our first release for
weeks or months.

Can you summarize the issues here?  The last thread is mountainous.  To some
extent, there is no time like the present to address the legal issues.  The ASF
has legal counsel; if you can summarize what we do to make the models and what
the concerns are, we can take it over to legal-discuss@ and start working on
it.  It may not be as big a deal as one might think.

The concern is that our models are trained on various closed or free
corpora, almost all of which have different licenses. We would have to
discuss whether the model trained on each corpus may be distributed
under AL 2.0.

I believe in most cases we do not violate any copyright, because
statistics about a text are not protected by its copyright. We would,
for example, generate bigram or trigram features over the whole corpus.
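
To make "statistics about a text" concrete, here is a toy sketch of counting
bigrams and trigrams over a token sequence. It is only an illustration, not
OpenNLP's actual feature generation; the helper name `ngram_counts` and the
sample sentence are made up for this example.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a list of tokens (hypothetical helper)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Made-up sample text standing in for a training corpus.
corpus = "the cat sat on the mat and the cat slept".split()

bigrams = ngram_counts(corpus, 2)   # pairs of adjacent tokens
trigrams = ngram_counts(corpus, 3)  # triples of adjacent tokens

# Only aggregate counts are kept, not the running text itself.
print(bigrams.most_common(3))
print(trigrams.most_common(3))
```

Counts like these record how often token sequences occur across the whole
corpus rather than reproducing any passage of it.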

In my opinion we at least need to provide a list of corpora and licenses
to start a discussion over at the legal list. Digging that out will
alone take some time, at least for the English training data.

We also have training data where we are unsure about the license.

On the other hand, we gain nothing from doing it now as part of the
release.

In my opinion we should try to get the process started, maybe put
together a wiki page, and as soon as we have all the information the
legal people need, we start talking to them.


Jörn
