Nicolás Reynolds wrote:
Hi,
this package, nltk-data, got caught by our filters because its license says:
Parts of NTLK-Data are distributed under various licenses,
as documented in their respective README files.
See: /usr/share/nltk/data/corpora/ -- [0]
so we went checking those licenses, and found that the package is composed of
a bunch of data sets with very different licensing, very few of them free.
the licenses include copyright plus redistribution allowed, cc-nc, bsd-like,
only one gpl, and several don't even have a license, just the author name.
though, these are collections of words... I don't know if copyright applies to
them.
I picked a random package [1]. The license attribute says "May be used
for non-commercial purposes.". The README in the zip says
"redistribution permitted". The non-commercial clause means that it's
non-free. It looks like structured data to me, so copyrightable.
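
For anyone who wants to repeat this check on the other packages, here is
a minimal sketch in Python. It assumes each package index file is XML
with a single top-level element carrying a "license" attribute, as
brown.xml [1] appears to. The sample XML (apart from the license text
quoted above), the keyword list and the helper names are all
hypothetical, and a keyword match is only a hint that a human should
read the license, not a legal test.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modeled on a package index entry. Only the license
# text comes from the message above; the other attributes are assumed.
SAMPLE_XML = """<package id="brown" name="Brown Corpus"
    license="May be used for non-commercial purposes." />"""

# Illustrative keywords only; this is not a legal test.
NONFREE_HINTS = ("non-commercial", "noncommercial", "cc-nc", "research only")


def license_of(xml_text):
    """Return the license attribute of the root element, or None if absent."""
    root = ET.fromstring(xml_text)
    return root.get("license")


def needs_review(license_text):
    """Flag a missing license, or one containing a non-free hint."""
    if not license_text:
        return True
    lowered = license_text.lower()
    return any(hint in lowered for hint in NONFREE_HINTS)


license_text = license_of(SAMPLE_XML)
verdict = "needs review" if needs_review(license_text) else "no hint found"
print(license_text, "->", verdict)
```

A package with no license attribute at all is flagged too, since several
of the data sets only carry an author name.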
also, nltk (the software) has a method for downloading this data directly from
the website... this qualifies as recommending unfree stuff, right?
From the description on the website [2] I'd say yes, non-free.
[1] http://nltk.googlecode.com/svn/trunk/nltk_data/packages/corpora/brown.xml
[2] http://www.nltk.org/data