Hi Luis,

There are many useful ways to cut the data. Even raw statistics can be interesting: number of lines of code under each license; number of independent foundations/projects that have adopted each license; types of software under each license; etc. I'd like to know which licenses are used by government agencies, for-profit software companies, and non-profits. Most useful would be a way of listing "large" or "important" projects and the licenses they use, as long as the list of such projects is broad and comprehensive.
I have no idea how Black Duck or others calculate their statistics, nor what is included in their samples, so the lack of methodological openness is more of a problem than the availability of "statistics". I hope that OSI can address these questions as scientists would, rather than as religious zealots for one sect or another.

Regarding the classification of licenses, I think it is most important to categorize licenses in the same business-related terms used to describe business models. So you need to identify which licenses ignore or have antiquated provisions regarding patents, and why that might matter; which licenses require reciprocity; whether that reciprocity includes use by third parties over a network or whether it is a "strong" or "weak" reciprocity; which licenses contain defensive suspension provisions (patent only or copyright also) that require due diligence before reliance on that software; which licenses are definitely incompatible with each other for derivative work purposes; which licenses are approved for use by the US or other governments; which contain attribution requirements beyond a subset of basic requirements; which contain jurisdiction or governing law provisions; etc. Of course, OSI should identify licenses that have been superseded or withdrawn by the author.

Good luck doing this with scientific precision.

/Larry

Lawrence Rosen
Rosenlaw & Einschlag, a technology law firm (www.rosenlaw.com)
3001 King Ranch Rd., Ukiah, CA 95482
Office: 707-485-1242

-----Original Message-----
From: Luis Villa [mailto:l...@tieguy.org]
Sent: Sunday, December 09, 2012 10:47 AM
To: Karl Fogel; License Discuss
Subject: Re: [License-discuss] objective criteria for license evaluation

I'm a little surprised at how quiet this thread has been, especially since I know some members of this list have been calling for objective criteria for a while. So let me restate the question to broaden it a bit.
If you had a *blue-sky dream* what objective information would you look at? For example, if you had the resources to scan huge numbers of code repositories, what numbers would you look for?

* ranking by LoC under each license
* ranking by "projects" under each license
* ... ?

Similarly, if you could declare objective criteria for textual license analysis and had the time/resources to read all of them, what would those criteria be? e.g.,

* has/has not been retired by the author
* has/has not been obsoleted by a new license published by the same author
* has/doesn't have an explicit patent grant
* ... ?

These examples assume quantitative measures of adoption, the text, and the explicit actions of the author are the only things about a license that can actually be measured, but I am probably thinking small; other examples welcome.

[As a reminder, this is not a purely theoretical exercise: I agree with many on this list that a license process based on more objective criteria would be a good thing, and this thread is an effort to explore that issue and start thinking about what such a list might look like.]

Luis

On Thu, Dec 6, 2012 at 3:35 PM, Karl Fogel <kfo...@red-bean.com> wrote:
> Matthew Flaschen <matthew.flasc...@gatech.edu> writes:
>>On 12/05/2012 10:23 AM, Karl Fogel wrote:
>>> Luis Villa <l...@tieguy.org> writes:
>>>> Anyone else have other suggestions for objective criteria we could
>>>> use? I know some folks here have been thinking about this issue for
>>>> some time.
>>>
>>> Number of "forks" of software under a given license on GitHub,
>>> adjusted for license popularity across GitHub? (And the equivalent
>>> calculation for other sites, where possible.)
>>
>>That could be misleading, depending on what we want to measure. There
>>are a lot of forks doing real work (either true forks, or those that
>>do ongoing pull requests to keep synced).
>>
>>However, there are also people that fork and make one or two changes,
>>or none at all.
>>There's nothing wrong with that, it just might not be
>>a meaningful metric for this purpose.
>
> Of course. I meant that as a direction to look in, not as a literal
> suggestion of methodology. By number of forks at GitHub, I meant
> "look at the forks, using some kind of intelligent criteria,
> statistical methods, etc".
>
> This is non-trivial work, of course. Which is why it is so hard to
> get good stats on license popularity and why the notion is rife with
> fundamental definitional questions.
> _______________________________________________
> License-discuss mailing list
> License-discuss@opensource.org
> http://projects.opensource.org/cgi-bin/mailman/listinfo/license-discuss