Thanks for the info, Ted. I also had the same interpretation but decided to contact the folks at Wikipedia just to make sure. Below is their response.
I am not an attorney, but if I am reading it correctly, I *think* we should be able to include it in the project and add the attribution to the NOTICE/LICENSE. Do you know if this is something we would be required to get an okay for from Apache Legal?

Re: [Ticket#2013010310007005] Creative Commons License and Apache (was: checking in wiki)

Dear Chen Pei,

Thank you for your email. Our response follows your message.

01/03/2013 16:35 - Chen Pei wrote:

> Dear Licensing@Wikipedia,
> We have an incubator project on the Apache Software Foundation and had
> a question about reusing content from Wikipedia.
> We built a Lucene index with 5000 wikipedia articles relating to
> medicine. Each article is modified by reducing it to a list of words and
> their counts in that article. Would this term count transformation be okay
> under the Wikipedia license for inclusion inside an ASL 2.0 project? Is it
> considered new work or do we need a specific license for this purpose?
>
> Email discussion thread on Apache:
> http://markmail.org/search/+list:org.apache.legal.discuss#query:%20list%3Aorg.apache.legal.discuss+page:1+mid:ocui6ty64xesc4b4+state:results
>
> Thanks,
> Pei
>

In principle, all text in Wikipedia is subject to the Creative Commons Attribution-ShareAlike License (CC-BY-SA) and may be used free of charge for any purpose. Reading more about the license should help explain it in simpler terms: <https://creativecommons.org/licenses/by-sa/3.0/>

Images and other media files may be subject to other licenses, which can be seen upon clicking on the desired image or file.

A specific permission for reusing the content is not necessary, as long as the re-user observes the license conditions. CC-BY-SA allows commercial use.
The only thing that needs to be done is attribution ('BY'), which can simply be a link to the history page of an article <https://en.wikipedia.org/wiki/Help:Page_history>, and re-releasing the content under similar licenses <https://en.wikipedia.org/wiki/Share-alike> ('SA').

For more information please see: <https://en.wikipedia.org/wiki/Wikipedia:Copyrights> or <https://commons.wikimedia.org/wiki/Commons:Reuse>.

Please note: Neither the Wikimedia Foundation, nor the authors of articles on Wikimedia sites, nor the volunteers answering mail to this address provide legal advice. It is your responsibility, if you intend to reuse content from Wikimedia sites, to determine how the licenses of the content that we host apply to your intended uses.

Yours sincerely,

--
Wikipedia - https://en.wikipedia.org
---
Disclaimer: all mail to this address is answered by volunteers, and responses are not to be considered an official statement of the Wikimedia Foundation. For official correspondence, please contact the Wikimedia Foundation by certified mail at the address listed on https://www.wikimediafoundation.org

From: Ted Dunning [mailto:[email protected]]
Sent: Wednesday, January 02, 2013 7:38 PM
To: [email protected]
Cc: [email protected]
Subject: Re: Creative Commons License (was: checking in wiki)

One non-legally-binding precedent is the RCV1 corpus, where Reuters agreed that "Summaries, analyses and interpretations of the linguistic properties of the information may be derived and published provided it is not possible to reconstruct the Data from the summary."

This was part of an agreement, so it is not a legally binding precedent, but it does indicate that at least one fairly strict copyright interpreter was OK with the term count transformation.
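
For readers unfamiliar with the "term count transformation" discussed in this thread, here is a minimal Python sketch of the idea. It is an illustration only, not the project's actual Lucene indexing code: the function name `term_counts` and the simple regex tokenizer are assumptions made for this example. It shows why word order cannot be recovered from the counts alone, which is the property the Reuters wording above turns on.

```python
import re
from collections import Counter


def term_counts(text):
    """Reduce a text to a mapping of lowercase word -> occurrence count.

    Hypothetical helper for illustration; the tokenizer (runs of ASCII
    letters, digits, and apostrophes) is an assumption, not Lucene's analyzer.
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(tokens)


# Two different texts built from the same words in a different order.
first = term_counts("The heart pumps blood. Blood returns to the heart.")
second = term_counts("Blood returns to the heart. The heart pumps blood.")

# The counts are identical, so the original sentence order is not
# recoverable from the counts alone.
print(first == second)
print(first["blood"], first["heart"], first["pumps"])
```

Whether a given index is this lossy depends on how it is built; an index that also stores term positions would retain more of the original text than plain counts do.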
On Wed, Jan 2, 2013 at 4:33 PM, Benson Margulies <[email protected]> wrote:

On Wed, Jan 2, 2013 at 2:30 PM, Tim Miller <[email protected]> wrote:
> The license is share alike 3.0, the reason we need advice is because we are
> using a modified/derived version (the clause in the legal FAQ starts
> "Unmodified media..."). Specifically, we built a lucene index with 5000
> wikipedia articles relating to medicine. Each article is modified by
> reducing it to a list of words and their counts in that article. Is there some
> advice on whether this sort of modification is allowable or whether it
> disqualifies?

A language model derived from a corpus is not necessarily a derived work of the corpus. Opinions vary. Some would tell you that it's a new work entirely, and you own it. Others would tell you that you need a specific license from the original content owners.

> Tim
>
> On 01/02/2013 11:28 AM, Jörn Kottmann wrote:
>>
>> Hello,
>>
>> it depends on which CC license the material is licensed under.
>>
>> The legal FAQ clarifies it for some of them:
>> http://www.apache.org/legal/resolved.html
>>
>> For Creative Commons Share Alike 2.5/3.0 it says:
>> "Unmodified media under the Creative Commons Attribution-Share Alike 2.5
>> and
>> Creative Commons Attribution-Share Alike 3.0 licenses may be included in
>> Apache products, subject to the licenses attribution clauses which may
>> require LICENSE/NOTICE/README changes. ...."
>>
>> Is that the license wikipedia is licensed under?
>>
>> Jörn
>>
>> On 01/02/2013 05:10 PM, Chen, Pei wrote:
>>>
>>> Hi,
>>> We would like to check in some derived features/models from Wikipedia
>>> into the src code base and would like to double check - are Creative Commons
>>> Licenses compatible with ASL 2.0?
>>> http://creativecommons.org/licenses/by-sa/3.0/
>>> We couldn't find it in the approved 3rd party list:
>>> http://www.apache.org/legal/3party.html
>>>
>>> Thanks,
>>> Pei
>>>
>>>
>>> -----Original Message-----
>>> From: Tim Miller
>>> [mailto:[email protected]]
>>> Sent: Monday, December 31, 2012 3:22 PM
>>> To: [email protected]
>>> Subject: checking in wiki
>>>
>>> Hi team,
>>> I'm just about ready to check in the wikipedia small index and the new
>>> coref features and models that take advantage of them, and I want to verify
>>> what changes we need to make to the license/notice to allow this in the next
>>> release. The NOTICE section has the dependent software included -- is it
>>> sufficient to add something like this:
>>>
>>> This product includes contents adapted from the English-language
>>> Wikipedia (en.wikipedia.org) developed under
>>> the Creative Commons
>>> Attribution-ShareAlike 3.0 License
>>> (http://creativecommons.org/licenses/by-sa/3.0/).
>>>
>>> Thanks
>>> Tim
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> [email protected]
> For additional commands, e-mail:
> [email protected]
