On the one hand, there are tight limits on what you can put in the source tree.
On the other hand, in many legal cases, there is no such thing as GPL data. It is quite possible that the objectionable properties of the GPL will be deemed irrelevant.

You would want to open a ticket on LEGAL on JIRA and specify the copyright information, the license, and exactly what was going to happen to the data.

On Fri, Dec 3, 2010 at 7:39 AM, Jörn Kottmann <[email protected]> wrote:
> On 12/3/10 1:29 PM, Thilo Goetz wrote:
>>
>> On 12/3/2010 13:14, Jörn Kottmann wrote:
>> [...]
>>>
>>> Having room for additional things like a corpus project could be nice,
>>> because people who are interested in the source code might not
>>> want to check out the corpus (if we ever get it created).
>>
>> So you can either make it convenient to check out
>> everything, or to check out individual stuff. When
>> I check out an open source project I tend to get
>> everything and then later figure out if there is
>> stuff in there that I don't need. Maybe most people
>> are different, no idea.
>
> Do you know if it would be possible to check in GPL training?
> We cannot distribute it, but we could distribute models built
> from it.
>
> There is a private training data repository, but it takes ages
> to check it out, so that might be something we should not
> stuff into the trunk which contains the source code.
>
> Jörn
>
