Re: [License-discuss] objective criteria for license evaluation
Hello license-discuss,

As a software developer interested in raising awareness of open licensing, building a community around an Open Source project, and educating myself and everyone involved to understand and choose their open licenses, I very much welcome this discussion. I admit that over the last couple of years I have been looking for slightly more guidance on the OSI pages than is currently available. Please don't take that as criticism; it is simply a need for pointers.

If I may, I'd like to share a few thoughts from this user-side experience. I think the OSI pages could greatly help if they contained hints or assistance, in particular for the following.

On 12/10/12, Lawrence Rosen lro...@rosenlaw.com wrote:
> Regarding the classification of licenses, I think it is most important to categorize licenses in the same business-related terminology that relates to business models. So you need to identify which licenses ignore or have antiquated provisions regarding patents, and why that might matter; which licenses require reciprocity; whether that reciprocity includes use by third parties over a network or whether it is a strong or weak reciprocity;

I quote this for the reciprocity criterion first and foremost. I think it's essential, not least for developers looking for a license, and for developers and communities who want to understand open licenses, their effects, and their goals. For an educational purpose.

My own (poor) attempt at it has been a simplified and easy-to-understand approach (IMHO): permissive licenses (require no reciprocity; BSD, MIT, Apache) - weak copyleft (MPL; with LGPL more towards the next 'step') - strong copyleft (GPL) - strong copyleft extended (AGPL). Additionally, I think OSL is worth a place; again for informative or educational purposes, IMHO.

> which licenses are definitely incompatible with each other for derivative work purposes;

This is another important question: one needs to know, or be able to find out easily, about definite incompatibilities.
As expected, I have personally addressed it by researching the licenses, the license stewards' statements, and projects' statements where needed. IMHO a matrix or listing of at least some license incompatibilities would be very useful.

Other criteria discussed in this thread could also be useful, for sure. However, at least the ones above (including the patent position, with a simple explanation if possible, for the many who are unaware of potential issues) are in my experience very much needed. They shape the landscape of Open Source licenses, and categorization by them would greatly help in understanding at least the basics of this landscape.

___
License-discuss mailing list
License-discuss@opensource.org
http://projects.opensource.org/cgi-bin/mailman/listinfo/license-discuss
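The reciprocity ladder described in this message can be written down as a small lookup table. The following is only a sketch: the enum and function names are hypothetical, the assignments simply repeat the examples given in the message (with LGPL filed under weak copyleft although the message places it closer to the next step), and OSL is left out because the message does not say where it belongs.

```python
from enum import Enum


class Reciprocity(Enum):
    """Categories from the message, ordered from least to most reciprocal."""
    PERMISSIVE = 0        # requires no reciprocity
    WEAK_COPYLEFT = 1
    STRONG_COPYLEFT = 2
    NETWORK_COPYLEFT = 3  # "strong copyleft extended" in the message


# Assignments repeat the examples in the message; they are not a legal analysis.
RECIPROCITY = {
    "BSD": Reciprocity.PERMISSIVE,
    "MIT": Reciprocity.PERMISSIVE,
    "Apache": Reciprocity.PERMISSIVE,
    "MPL": Reciprocity.WEAK_COPYLEFT,
    "LGPL": Reciprocity.WEAK_COPYLEFT,  # the message puts it "towards the next step"
    "GPL": Reciprocity.STRONG_COPYLEFT,
    "AGPL": Reciprocity.NETWORK_COPYLEFT,
}


def licenses_in(category):
    """Return the license names filed under one category, sorted by name."""
    return sorted(name for name, cat in RECIPROCITY.items() if cat is category)
```

Calling `licenses_in(Reciprocity.PERMISSIVE)` lists the permissive examples, which is the kind of at-a-glance grouping the message asks the OSI pages to offer.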
Re: [License-discuss] objective criteria for license evaluation
On Mon, Dec 10, 2012 at 10:57:10AM +, Gervase Markham wrote:
> On 09/12/12 18:46, Luis Villa wrote:
>> So let me restate the question to broaden it a bit. If you had a *blue-sky dream* what subjective information would you look at? For example, if you had the resources to scan huge numbers of code repositories, what numbers would you look for?
>> * ranking by LoC under each license
>> * ranking by projects under each license
>> * ... ?
>
> If we are blue-sky dreaming, then I would like to rank by _useful_, unique lines of code under each license. Useful in the sense that some half-finished barely-compiling "my first Windows CD player" on Sourceforge counts for nothing, whereas jQuery counts for a lot. Unique, in the sense that I shouldn't be able to game the stats by going to github and forking every project with my preferred license.

I can also imagine other metrics of license popularity. Download statistics are problematic but it is the usual metric for distro popularity. One might be able to measure the size of contributor and user communities (numbers of committers, numbers of unique patch authors for a given release, subscriptions to mailing lists...?).

> [...]
> I think there is also a place for "lawyers generally think it's vague and has sub-optimal word choice", which might apply to e.g. Artistic v1.

I think that's highly problematic. I really don't think one can successfully attempt to measure consensus among lawyers regarding specific open source licenses. You could probably find enough lawyers to criticize features of any number of OSI-approved licenses, and there is also the problem (to which the GPL family is especially vulnerable for historical reasons) of 'popular' licenses being scrutinized for flaws more severely than less widely-used licenses.
As for 'suboptimal word choice' that seems unavoidably subjective, and probably can be legitimately applied to every single OSI-approved license, including all of the ones assumed to be the most popular, and probably every software license that's ever been drafted.

- RF
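The "useful, unique lines of code" ranking quoted above could be computed roughly as follows. This is a sketch under loud assumptions: the project records are invented, fork detection is reduced to a hypothetical `fork_of` field, and a plain size cutoff stands in for "useful", which is a crude proxy rather than anything proposed as final.

```python
from collections import defaultdict

# Invented records: (project, license, lines_of_code, fork_of).
PROJECTS = [
    ("jquery", "MIT", 40000, None),
    ("jquery-fork", "MIT", 40000, "jquery"),  # fork: must not count twice
    ("cd-player", "GPL-2.0", 300, None),       # too small to count as "useful"
    ("bigtool", "GPL-2.0", 25000, None),
]

MIN_LOC = 1000  # assumed cutoff standing in for "useful"


def rank_by_unique_loc(projects, min_loc=MIN_LOC):
    """Sum LoC per license, skipping forks and projects below the cutoff."""
    totals = defaultdict(int)
    for _name, license_id, loc, fork_of in projects:
        if fork_of is not None:   # "unique": ignore forked copies
            continue
        if loc < min_loc:         # "useful": ignore tiny projects
            continue
        totals[license_id] += loc
    # Largest total first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

With the invented data, the fork and the small project drop out, so only the two substantial originals contribute to the ranking.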
Re: [License-discuss] objective criteria for license evaluation
On Mon, Dec 10, 2012 at 2:57 AM, Gervase Markham g...@mozilla.org wrote:
> On 09/12/12 18:46, Luis Villa wrote:
>> So let me restate the question to broaden it a bit. If you had a *blue-sky dream* what subjective information would you look at?

By the way, I think this was probably obvious from the rest of the email, but I meant *objective* here.

>> For example, if you had the resources to scan huge numbers of code repositories, what numbers would you look for?
>> * ranking by LoC under each license
>> * ranking by projects under each license
>> * ... ?
>
> If we are blue-sky dreaming, then I would like to rank by _useful_, unique lines of code under each license. Useful in the sense that some half-finished barely-compiling "my first Windows CD player" on Sourceforge counts for nothing, whereas jQuery counts for a lot. Unique, in the sense that I shouldn't be able to game the stats by going to github and forking every project with my preferred license.

How to define useful objectively? Size is the obvious, plausibly-obtainable proxy here for useful - projects over X LOC or something like that. I suppose if you had a custom crawler that had knowledge of git/svn/cvs/etc., you could do "projects over 5 committers" or "projects with over 100 commits" or something along those lines. Richard suggests community size, which would be great but is probably not computable, no matter how many people/how much money you throw at it.

It may be that in practice, objective information has to be stored in the same revision control system the relevant license information is stored in. Otherwise you're not talking about something that can be crawled/computed - you're talking about something that requires human intervention, which even if it is objective still limits your sample size.

>> Similarly, if you could declare objective criteria for textual license analysis and had the time/resources to read all of them, what would those criteria be? e.g.,
>> * has/has not been retired by the author
>
> This is important; however some licenses such as the HPND have no identified author, but yet are deprecated.

Deprecated by *who*? :) (Note that we don't even have a deprecated category right now; we've only gotten as far as "redundant with more popular licenses".)

>> * has/has not been obsoleted by a new license published by the same author
>
> - one can imagine a license which has been obsoleted by its author but is still in wide use, and even specifically chosen over newer versions (e.g. GPL 2)
>
>> * has/doesn't have an explicit patent grant
>
> - I am of the view that even if the OSI finds it impossible politically to recommend specific licenses, it should try and get to a place where it can recommend license features - with an explicit patent grant being in pole position.

Any others?

>> * ... ?
>
> I think there is also a place for "lawyers generally think it's vague and has sub-optimal word choice", which might apply to e.g. Artistic v1.

As Richard points out, it is very hard to imagine how to make this objective, but I'd encourage folks to think creatively about it.

> * Plays well with other popular licenses. We now have a "can use in" progression which goes: MIT/BSD - Apache 2 - MPL 2 - LGPL 3 - GPL 3 (- AGPL 3) (Those GPL numbers could be 2 rather than 3 if there was a warning about the Apache2/GPL2 incompatibility which the FSF asserts.) If your code doesn't slot somewhere into that ecosystem, you are (IMO) significantly reducing the likelihood of it gaining widespread use, all other things being equal.

I like the intuition here, but I'd like to push us to think about more objective criteria: what does it mean to play nicely? Presumably compatible, but who determines compatibility? What does it mean? Can that be determined objectively? Plays nicely with what other popular licenses? EPL is popular, for example.
Luis
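The "can use in" progression quoted in this message can be modelled as an ordered list, where code under an earlier entry can be used in a project under a later one. This is a sketch only: the function name is hypothetical, the ordering is copied from the quoted text rather than derived from any legal analysis, and licenses outside the list (EPL, for instance) get no answer at all, which is exactly the gap the message points at.

```python
# Ordering copied from the quoted progression; not a legal determination.
PROGRESSION = ["MIT/BSD", "Apache 2", "MPL 2", "LGPL 3", "GPL 3", "AGPL 3"]


def can_use_in(source, target):
    """True if code under `source` can be used in a `target` project.

    Returns None when either license is not on the progression,
    because the list says nothing about it.
    """
    if source not in PROGRESSION or target not in PROGRESSION:
        return None
    return PROGRESSION.index(source) <= PROGRESSION.index(target)
```

For example, `can_use_in("MIT/BSD", "GPL 3")` is true and `can_use_in("GPL 3", "Apache 2")` is false, while `can_use_in("EPL", "GPL 3")` returns `None`.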
Re: [License-discuss] objective criteria for license evaluation
On 10/12/12 17:23, Luis Villa wrote:
> How to define useful objectively? Size is the obvious, plausibly-obtainable proxy here for useful - projects over X LOC or something like that. I suppose if you had a custom crawler that had knowledge of git/svn/cvs/etc., you could do "projects over 5 committers" or "projects with over 100 commits" or something along those lines. Richard suggests community size, which would be great but is probably not computable, no matter how many people/how much money you throw at it.

Perhaps we could have multiple criteria - either size, or being used in N other projects. If there were some way of detecting that. Some modern SCMs now allow you to explicitly pull in other repos; perhaps that could be detected.

>> This is important; however some licenses such as the HPND have no identified author, but yet are deprecated.
>
> Deprecated by *who*? :) (Note that we don't even have a deprecated category right now; we've only gotten as far as "redundant with more popular licenses".)

Well, http://opensource.org/licenses/HPND says: "This License has been voluntarily deprecated by its author." :-P

>> * has/doesn't have an explicit patent grant - I am of the view that even if the OSI finds it impossible politically to recommend specific licenses, it should try and get to a place where it can recommend license features - with an explicit patent grant being in pole position.
>
> Any others?

Nothing so concrete. One would want the license to have been drafted with international concerns in mind, especially if it did not have choice-of-law. But that's much harder to spot.

> As Richard points out, it is very hard to imagine how to make this objective, but I'd encourage folks to think creatively about it.

Richard's point is a fair one :-)

> I like the intuition here, but I'd like to push us to think about more objective criteria: what does it mean to play nicely? Presumably compatible, but who determines compatibility? What does it mean? Can that be determined objectively?

A good question.

What is compatibility? It is a non-transitive relation, such that X is compatible with Y if code from license X can be used in a project with license Y. (If we want to pick a better term than compatible, I wouldn't object.)

Who determines compatibility? Aside from the well-known disagreement about Apache 2 and GPL 2, I'm not sure (perhaps I'm naive!) that there is much disagreement about compatibility as defined above, for popular X and Y.

Gerv
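Since compatibility is defined above as a relation between pairs of licenses, its properties can be checked mechanically once someone writes the pairs down. The sketch below assumes the relation is given as a set of (X, Y) pairs meaning "code under X can be used in a Y project". The sample pairs are taken from the "can use in" progression quoted earlier in the thread and are illustrative only; what the checks report depends entirely on which pairs are supplied.

```python
def is_symmetric(relation):
    """True if every (x, y) in the relation also has (y, x)."""
    return all((y, x) in relation for (x, y) in relation)


def is_transitive(relation):
    """True if (x, y) and (y, z) in the relation always imply (x, z)."""
    return all(
        (x, z) in relation
        for (x, y1) in relation
        for (y2, z) in relation
        if y1 == y2
    )


# Illustrative pairs only, not a legal determination.
SAMPLE = {
    ("MIT/BSD", "Apache 2"),
    ("Apache 2", "GPL 3"),
    ("MIT/BSD", "GPL 3"),
}
```

On these pairs the relation is one-directional (`is_symmetric(SAMPLE)` is false). Dropping the `("MIT/BSD", "GPL 3")` pair makes `is_transitive` fail, which shows how a published matrix could be checked for gaps.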
Re: [License-discuss] objective criteria for license evaluation
I'm a little surprised at how quiet this thread has been, especially since I know some members of this list have been calling for objective criteria for a while.

So let me restate the question to broaden it a bit. If you had a *blue-sky dream* what subjective information would you look at? For example, if you had the resources to scan huge numbers of code repositories, what numbers would you look for?

* ranking by LoC under each license
* ranking by projects under each license
* ... ?

Similarly, if you could declare objective criteria for textual license analysis and had the time/resources to read all of them, what would those criteria be? e.g.,

* has/has not been retired by the author
* has/has not been obsoleted by a new license published by the same author
* has/doesn't have an explicit patent grant
* ... ?

These examples assume quantitative measures of adoption, the text, and the explicit actions of the author are the only things about a license that can actually be measured, but I am probably thinking small - other examples welcome.

[As a reminder, this is not a purely theoretical exercise - I agree with many on this list that a license process based on more objective criteria would be a good thing, and this thread is an effort to explore that issue and start thinking about what such a list might look like.]

Luis

On Thu, Dec 6, 2012 at 3:35 PM, Karl Fogel kfo...@red-bean.com wrote:
> Matthew Flaschen matthew.flasc...@gatech.edu writes:
>> On 12/05/2012 10:23 AM, Karl Fogel wrote:
>>> Luis Villa l...@tieguy.org writes:
>>>> Anyone else have other suggestions for objective criteria we could use? I know some folks here have been thinking about this issue for some time.
>>>
>>> Number of forks of software under a given license on GitHub, adjusted for license popularity across GitHub? (And the equivalent calculation for other sites, where possible.)
>>
>> That could be misleading, depending on what we want to measure. There are a lot of forks doing real work (either true forks, or those that do ongoing pull requests to keep synced). However, there are also people that fork and make one or two changes, or none at all. There's nothing wrong with that, it just might not be a meaningful metric for this purpose.
>
> Of course. I meant that as a direction to look in, not as a literal suggestion of methodology. By number of forks at GitHub, I meant look at the forks, using some kind of intelligent criteria, statistical methods, etc. This is non-trivial work, of course. Which is why it is so hard to get good stats on license popularity and why the notion is rife with fundamental definitional questions.
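One possible reading of "number of forks under a given license, adjusted for license popularity" is forks per repository under that license. The sketch below takes that reading as an assumption; the counts are invented, and the caveat raised in the quoted text (many forks carry no real work) is not addressed here.

```python
def forks_per_repo(repo_counts, fork_counts):
    """Divide each license's fork count by its repository count.

    Licenses with zero repositories are skipped to avoid dividing by zero.
    """
    return {
        license_id: fork_counts.get(license_id, 0) / repos
        for license_id, repos in repo_counts.items()
        if repos > 0
    }


# Invented numbers, for illustration only.
REPOS = {"MIT": 200, "GPL-3.0": 100}
FORKS = {"MIT": 600, "GPL-3.0": 500}
```

With these invented counts MIT has more forks in total, but GPL-3.0 has more forks per repository, which is the kind of difference the adjustment is meant to expose.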
Re: [License-discuss] objective criteria for license evaluation
Hi Luis,

There are many useful ways to cut the data. Even raw statistics on number of lines of code under each license; number of independent foundations/projects that have adopted each license; types of software under each license; etc. can be interesting. I'd like to know which licenses are used by government agencies; for-profit software companies; non-profits. Most useful would be a way of listing large or important projects and the licenses they use, as long as the list of such projects is broad and comprehensive.

I have no idea how Black Duck or others calculate their statistics nor what is included in their samples, so the lack of methodological openness is more of a problem than the availability of statistics. I hope that OSI can address these questions as scientists would, rather than as religious zealots for one sect or another.

Regarding the classification of licenses, I think it is most important to categorize licenses in the same business-related terminology that relates to business models. So you need to identify which licenses ignore or have antiquated provisions regarding patents, and why that might matter; which licenses require reciprocity; whether that reciprocity includes use by third parties over a network or whether it is a strong or weak reciprocity; which licenses contain defensive suspension provisions (patent only or copyright also) that require due diligence before reliance on that software; which licenses are definitely incompatible with each other for derivative work purposes; which licenses are approved for use by the US or other governments; which contain attribution requirements beyond a subset of basic requirements; which contain jurisdiction or governing law provisions; etc.

Of course, OSI should identify licenses that have been superseded or withdrawn by the author.

Good luck doing this with scientific precision.
/Larry

Lawrence Rosen
Rosenlaw & Einschlag, a technology law firm (www.rosenlaw.com)
3001 King Ranch Rd., Ukiah, CA 95482
Office: 707-485-1242
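The classification criteria listed in the message above could be recorded per license as a small structured record and then filtered. The following is a sketch: the class, field names, and allowed values are hypothetical, and the three profiles are placeholders with made-up values, not statements about any real license.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LicenseProfile:
    """One row per license; fields mirror criteria named in the message."""
    name: str
    explicit_patent_grant: bool
    reciprocity: str                     # assumed values: "none", "weak", "strong"
    network_reciprocity: bool            # reciprocity triggered by network use
    defensive_suspension: Optional[str]  # None, "patent", or "patent+copyright"
    governing_law_clause: bool
    superseded_or_withdrawn: bool


def select(profiles, **required):
    """Return names of profiles whose fields equal all required values."""
    return [
        profile.name
        for profile in profiles
        if all(getattr(profile, field) == value for field, value in required.items())
    ]


# Placeholder data with made-up values.
PROFILES = [
    LicenseProfile("Example-A", True, "none", False, "patent", False, False),
    LicenseProfile("Example-B", False, "strong", True, None, True, False),
    LicenseProfile("Example-C", True, "weak", False, "patent", True, True),
]
```

A query such as `select(PROFILES, explicit_patent_grant=True, superseded_or_withdrawn=False)` then answers questions like "which current licenses have an explicit patent grant" directly from the table.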