Re: [License-discuss] objective criteria for license evaluation

2012-12-30 Thread Engel Nyst
Hello license-discuss,

As a software developer interested in raising awareness of open
licensing, building a community around an Open Source project, and
helping myself and everyone involved to understand and choose open
licenses, I very much welcome this discussion. I admit that, over the
last couple of years, I have been looking for slightly more guidance
on the OSI pages than is currently available. Please don't take that
as criticism; it is meant only as a request for pointers.

If I may share a few thoughts from this user-side experience: I think
the OSI pages could help greatly if they contained hints or assistance,
in particular on the following points.

On 12/10/12, Lawrence Rosen lro...@rosenlaw.com wrote:
> Regarding the classification of licenses, I think it is most important to
> categorize licenses in the same business-related terminology that relates
> to business models. So you need to identify which licenses ignore or have
> antiquated provisions regarding patents, and why that might matter; which
> licenses require reciprocity; whether that reciprocity includes use by
> third parties over a network or whether it is a strong or weak reciprocity;

I quote this first and foremost for the reciprocity criterion. I think
it is essential, not least for developers looking for a license, and
for developers and communities trying to understand open licenses,
their effects, and their goals; in short, for educational purposes.

My own (poor) attempt at it has been a simplified and easy-to-understand
approach (IMHO):
permissive licenses (require no reciprocity; BSD, MIT, Apache) - weak
copyleft (MPL; with LGPL more towards the next 'step') - strong
copyleft (GPL) - strong copyleft extended (AGPL).

Additionally, I think the OSL is worth a place, again for informative
or educational purposes IMHO.
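
The reciprocity progression above can be written down as an ordered scale. The following is only a sketch of that idea in Python: the `Reciprocity` enum, the `CATEGORY` table and the `licenses_in` helper are names made up for illustration, and the placement of each license simply mirrors the list above rather than any official OSI classification.

```python
from enum import IntEnum


class Reciprocity(IntEnum):
    """Hypothetical scale; a higher value means a stronger reciprocity requirement."""
    PERMISSIVE = 0        # no reciprocity required
    WEAK_COPYLEFT = 1
    STRONG_COPYLEFT = 2
    NETWORK_COPYLEFT = 3  # "strong copyleft extended"


# Placement follows the list above; it is an illustration, not an official classification.
CATEGORY = {
    "BSD": Reciprocity.PERMISSIVE,
    "MIT": Reciprocity.PERMISSIVE,
    "Apache": Reciprocity.PERMISSIVE,
    "MPL": Reciprocity.WEAK_COPYLEFT,
    "LGPL": Reciprocity.WEAK_COPYLEFT,  # described above as closer to the next step
    "GPL": Reciprocity.STRONG_COPYLEFT,
    "AGPL": Reciprocity.NETWORK_COPYLEFT,
}


def licenses_in(category):
    """Return the license names placed in the given category, sorted by name."""
    return sorted(name for name, cat in CATEGORY.items() if cat == category)
```

Because the enum is ordered, a comparison such as `CATEGORY["AGPL"] > CATEGORY["GPL"]` expresses "requires more reciprocity than" directly.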

> which licenses are definitely incompatible with each other for derivative
> work purposes;

This is another important question: one needs to be able to find out
easily about definite incompatibilities. As might be expected, I have
personally addressed it by researching the licenses, license stewards'
statements, and projects' statements where needed. IMHO a matrix or
listing of at least some license incompatibilities would be very
useful.


Other criteria discussed in this thread could also be useful, for
sure. However, at least the ones above (including the patent position,
with a simple explanation if possible, for the many who are unaware of
potential issues) are in my experience very much needed. They shape
the landscape of Open Source licenses, and categorization by them would
greatly help people understand at least the basics of this landscape.
___
License-discuss mailing list
License-discuss@opensource.org
http://projects.opensource.org/cgi-bin/mailman/listinfo/license-discuss


Re: [License-discuss] objective criteria for license evaluation

2012-12-10 Thread Richard Fontana
On Mon, Dec 10, 2012 at 10:57:10AM +, Gervase Markham wrote:
> On 09/12/12 18:46, Luis Villa wrote:
>> So let me restate the question to broaden it a bit. If you had a
>> *blue-sky dream* what subjective information would you look at?
>>
>> For example, if you had the resources to scan huge numbers of code
>> repositories, what numbers would you look for?
>>
>> * ranking by LoC under each license
>> * ranking by projects under each license
>> * ... ?
>
> If we are blue-sky dreaming, then I would like to rank by _useful_,
> unique lines of code under each license. Useful in the sense that
> some half-finished barely-compiling "my first Windows CD player" on
> Sourceforge counts for nothing, whereas jQuery counts for a lot.
> Unique, in the sense that I shouldn't be able to game the stats by
> going to github and forking every project with my preferred license.

I can also imagine other metrics of license popularity. Download
statistics are problematic, but they are the usual metric for distro
popularity. One might be able to measure the size of contributor and
user communities (numbers of committers, numbers of unique patch
authors for a given release, subscriptions to mailing lists...?).
 
[...]
> I think there is also a place for "lawyers generally think it's
> vague and has sub-optimal word choice", which might apply to e.g.
> Artistic v1.

I think that's highly problematic. I really don't think one can
successfully attempt to measure consensus among lawyers regarding
specific open source licenses. You could probably find enough lawyers
to criticize features of any number of OSI-approved licenses, and
there is also the problem (to which the GPL family is especially
vulnerable for historical reasons) of 'popular' licenses being
scrutinized for flaws more severely than less widely-used licenses. 

As for 'suboptimal word choice' that seems unavoidably subjective, and
probably can be legitimately applied to every single OSI-approved
license, including all of the ones assumed to be the most popular, and
probably every software license that's ever been drafted. 

- RF



Re: [License-discuss] objective criteria for license evaluation

2012-12-10 Thread Luis Villa
On Mon, Dec 10, 2012 at 2:57 AM, Gervase Markham g...@mozilla.org wrote:
> On 09/12/12 18:46, Luis Villa wrote:
>
>> So let me restate the question to broaden it a bit. If you had a
>> *blue-sky dream* what subjective information would you look at?

By the way, I think this was probably obvious from the rest of the
email, but I meant *objective* here.

>> For example, if you had the resources to scan huge numbers of code
>> repositories, what numbers would you look for?
>>
>> * ranking by LoC under each license
>> * ranking by projects under each license
>> * ... ?
>
> If we are blue-sky dreaming, then I would like to rank by _useful_, unique
> lines of code under each license. Useful in the sense that some
> half-finished barely-compiling "my first Windows CD player" on Sourceforge
> counts for nothing, whereas jQuery counts for a lot. Unique, in the sense
> that I shouldn't be able to game the stats by going to github and forking
> every project with my preferred license.

How to define "useful" objectively? Size is the obvious,
plausibly-obtainable proxy here for "useful"- "projects over X LOC" or
something like that. I suppose if you had a custom crawler that had
knowledge of git/svn/cvs/etc., you could do "projects over 5
committers" or "projects with over 100 commits" or something along
those lines. Richard suggests community size, which would be great but
is probably not computable, no matter how many people/how much money
you throw at it.
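
As a rough illustration of the size-based proxy just described, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `RepoStats` record, the idea that a crawler has already produced these numbers, the placeholder value for "X LOC", and the choice to accept a project if it passes any one threshold.

```python
from dataclasses import dataclass


@dataclass
class RepoStats:
    """Hypothetical per-project numbers a crawler might collect."""
    name: str
    license_id: str
    lines_of_code: int
    committers: int
    commits: int


def is_useful(repo, min_loc=10_000, min_committers=5, min_commits=100):
    """Proxy for "useful": the project passes at least one threshold.

    The committer and commit thresholds echo the examples in the text;
    min_loc stands in for the unspecified "X LOC" and is an arbitrary placeholder.
    """
    return (
        repo.lines_of_code > min_loc
        or repo.committers > min_committers
        or repo.commits > min_commits
    )


def useful_projects_per_license(repos):
    """Count projects that pass the proxy, grouped by license."""
    counts = {}
    for repo in repos:
        if is_useful(repo):
            counts[repo.license_id] = counts.get(repo.license_id, 0) + 1
    return counts
```

Requiring all thresholds at once instead of any one of them would be an equally defensible reading; the point is only that each test is computable from repository metadata alone.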

It may be that in practice, objective information has to be stored in
the same revision control system the relevant license information is
stored in. Otherwise you're not talking about something that can be
crawled/computed- you're talking about something that requires human
intervention, which even if it is objective still limits your sample
size.

>> Similarly, if you could declare objective criteria for textual license
>> analysis and had the time/resources to read all of them, what would
>> those criteria be? e.g.,
>>
>> * has/has not been retired by the author
>
> This is important; however some licenses such as the HPND have no identified
> author, but yet are deprecated.

Deprecated by *who*? :) (Note that we don't even have a "deprecated"
category right now; we've only gotten as far as "redundant with more
popular licenses".)

>> * has/has not been obsoleted by a new license published by the same author
>
> - one can imagine a license which has been obsoleted by its author but is
> still in wide use, and even specifically chosen over newer versions (e.g.
> GPL 2)
>
>> * has/doesn't have an explicit patent grant
>
> - I am of the view that even if the OSI finds it impossible politically to
> recommend specific licenses, it should try and get to a place where it can
> recommend license features - with an explicit patent grant being in pole
> position.

Any others?

>> * ... ?
>
> I think there is also a place for "lawyers generally think it's vague and
> has sub-optimal word choice", which might apply to e.g. Artistic v1.

As Richard points out, it is very hard to imagine how to make this
objective, but I'd encourage folks to think creatively about it.

> * "Plays well with other popular licenses". We now have a "can use in"
> progression which goes:
>
> MIT/BSD - Apache 2 - MPL 2 - LGPL 3 - GPL 3 (- AGPL 3)
>
> (Those GPL numbers could be 2 rather than 3 if there was a warning about the
> Apache2/GPL2 incompatibility which the FSF asserts.)
>
> If your code doesn't slot somewhere into that ecosystem, you are (IMO)
> significantly reducing the likelihood of it gaining widespread use, all
> other things being equal.

I like the intuition here, but I'd like to push us to think about more
objective criteria: what does it mean to "play nicely"? Presumably
"compatible", but who determines compatibility? What does it mean? Can
that be determined objectively?

Plays nicely with what other popular licenses? EPL is popular, for example.

Luis


Re: [License-discuss] objective criteria for license evaluation

2012-12-10 Thread Gervase Markham

On 10/12/12 17:23, Luis Villa wrote:

> How to define "useful" objectively? Size is the obvious,
> plausibly-obtainable proxy here for "useful"- "projects over X LOC" or
> something like that. I suppose if you had a custom crawler that had
> knowledge of git/svn/cvs/etc., you could do "projects over 5
> committers" or "projects with over 100 commits" or something along
> those lines. Richard suggests community size, which would be great but
> is probably not computable, no matter how many people/how much money
> you throw at it.


Perhaps we could have multiple criteria - either size, or being used in 
 N other projects. If there were some way of detecting that. Some 
modern SCMs now allow you to explicitly pull in other repos; perhaps 
that could be detected.
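
To make the "used in N other projects" criterion concrete, here is a small Python sketch. It assumes, hypothetically, that a crawler has already extracted pairs of (project, included project) from whatever mechanism the SCM offers for pulling in other repositories; the function names and the "at least n" reading of the threshold are illustrative choices, not an agreed method.

```python
from collections import defaultdict


def dependents_count(includes):
    """Count how many distinct other projects pull in each project.

    `includes` is an iterable of (project, included_project) pairs, for
    example as might be extracted from submodule-style declarations.
    """
    dependents = defaultdict(set)
    for project, included in includes:
        if project != included:  # ignore self-references
            dependents[included].add(project)
    return {name: len(users) for name, users in dependents.items()}


def widely_used(includes, n):
    """Return the projects used in at least n other projects, sorted by name."""
    counts = dependents_count(includes)
    return sorted(name for name, count in counts.items() if count >= n)
```

Using a set per project means a repository that pulls in the same dependency twice is only counted once, which helps a little against gaming the numbers.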



>> This is important; however some licenses such as the HPND have no identified
>> author, but yet are deprecated.


> Deprecated by *who*? :) (Note that we don't even have a "deprecated"
> category right now; we've only gotten as far as "redundant with more
> popular licenses".)


Well, http://opensource.org/licenses/HPND says:

"This License has been voluntarily deprecated by its author."

:-P


>>> * has/doesn't have an explicit patent grant
>>
>> - I am of the view that even if the OSI finds it impossible politically to
>> recommend specific licenses, it should try and get to a place where it can
>> recommend license features - with an explicit patent grant being in pole
>> position.


> Any others?


Nothing so concrete. One would want the license to have been drafted 
with international concerns in mind, especially if it did not have 
choice-of-law. But that's much harder to spot.



> As Richard points out, it is very hard to imagine how to make this
> objective, but I'd encourage folks to think creatively about it.


Richard's point is a fair one :-)


> I like the intuition here, but I'd like to push us to think about more
> objective criteria: what does it mean to "play nicely"? Presumably
> "compatible", but who determines compatibility? What does it mean? Can
> that be determined objectively?


A good question. What is compatibility? It is a non-transitive relation,
such that X is compatible with Y if code from license X can be used in a
project with license Y. (If we want to pick a better term than
"compatible", I wouldn't object.)


Who determines compatibility? Aside from the well-known disagreement 
about Apache 2 and GPL 2, I'm not sure (perhaps I'm naive!) that there 
is much disagreement about compatibility as defined above, for popular X 
and Y.
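
The definition above describes a directed relation, and for the licenses in the progression quoted earlier in this thread it can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PROGRESSION` list and the `can_use_in` function are hypothetical names, the identifiers are shorthand for the licenses named in the thread (BSD is left out for brevity), and nothing here is a legal determination.

```python
# Left-to-right order of the "can use in" progression from this thread.
PROGRESSION = ["MIT", "Apache-2.0", "MPL-2.0", "LGPL-3.0", "GPL-3.0", "AGPL-3.0"]


def can_use_in(x, y):
    """True if code under license x can be used in a project under license y.

    Only answers for licenses that are part of the progression; anything
    else raises ValueError instead of guessing.
    """
    for name in (x, y):
        if name not in PROGRESSION:
            raise ValueError(f"{name!r} is not part of the progression")
    return PROGRESSION.index(x) <= PROGRESSION.index(y)
```

The relation is one-way: `can_use_in("MIT", "GPL-3.0")` holds while `can_use_in("GPL-3.0", "MIT")` does not. A license outside the list, such as the EPL, is deliberately rejected rather than answered, since the sketch makes no claim beyond this one progression.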


Gerv


Re: [License-discuss] objective criteria for license evaluation

2012-12-09 Thread Luis Villa
I'm a little surprised at how quiet this thread has been, especially
since I know some members of this list have been calling for objective
criteria for a while.

So let me restate the question to broaden it a bit. If you had a
*blue-sky dream* what subjective information would you look at?

For example, if you had the resources to scan huge numbers of code
repositories, what numbers would you look for?

* ranking by LoC under each license
* ranking by projects under each license
* ... ?

Similarly, if you could declare objective criteria for textual license
analysis and had the time/resources to read all of them, what would
those criteria be? e.g.,

* has/has not been retired by the author
* has/has not been obsoleted by a new license published by the same author
* has/doesn't have an explicit patent grant
* ... ?

These examples assume quantitative measures of adoption, the text, and
the explicit actions of the author are the only things about a license
that can actually be measured, but I am probably thinking small- other
examples welcome.

[As a reminder, this is not a purely theoretical exercise- I agree
with many on this list that a license process based on more objective
criteria would be a good thing, and this thread is an effort to
explore that issue and start thinking about what such a list might
look like.]

Luis

On Thu, Dec 6, 2012 at 3:35 PM, Karl Fogel kfo...@red-bean.com wrote:
> Matthew Flaschen matthew.flasc...@gatech.edu writes:
>> On 12/05/2012 10:23 AM, Karl Fogel wrote:
>>> Luis Villa l...@tieguy.org writes:
>>>> Anyone else have other suggestions for objective criteria we could
>>>> use? I know some folks here have been thinking about this issue for
>>>> some time.
>>>
>>> Number of forks of software under a given license on GitHub, adjusted
>>> for license popularity across GitHub?  (And the equivalent calculation
>>> for other sites, where possible.)
>>
>> That could be misleading, depending on what we want to measure.  There
>> are a lot of forks doing real work (either true forks, or those that do
>> ongoing pull requests to keep synced).
>>
>> However, there are also people that fork and make one or two changes, or
>> none at all.  There's nothing wrong with that, it just might not be a
>> meaningful metric for this purpose.
>
> Of course.  I meant that as a direction to look in, not as a literal
> suggestion of methodology.  By "number of forks at GitHub", I meant "look
> at the forks, using some kind of intelligent criteria, statistical
> methods, etc."
>
> This is non-trivial work, of course.  Which is why it is so hard to get
> good stats on license popularity and why the notion is rife with
> fundamental definitional questions.
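
One possible reading of the fork-based metric quoted above, sketched in Python. All of it is assumption: the input format, the `min_commits_ahead` cut-off used to skip forks with few or no changes, and the choice of "forks per repository" as the adjustment for license popularity.

```python
def forks_per_repo(repos, min_commits_ahead=3):
    """Average number of "active" forks per repository, grouped by license.

    `repos` is a list of (license_id, fork_commits_ahead) pairs, where
    fork_commits_ahead lists, for each fork, how many commits it has that
    the parent lacks. Forks below min_commits_ahead are ignored, so forks
    with only one or two changes, or none at all, do not count.
    Dividing by the number of repositories adjusts for how popular the
    license is in the first place.
    """
    totals = {}  # license_id -> [active_forks, repo_count]
    for license_id, fork_commits_ahead in repos:
        active = sum(1 for ahead in fork_commits_ahead if ahead >= min_commits_ahead)
        entry = totals.setdefault(license_id, [0, 0])
        entry[0] += active
        entry[1] += 1
    return {lic: forks / count for lic, (forks, count) in totals.items()}
```

Whether a fixed commit cut-off is an "intelligent criterion" is exactly the kind of definitional question raised above; the threshold is a parameter so that it can be varied.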


Re: [License-discuss] objective criteria for license evaluation

2012-12-09 Thread Lawrence Rosen
Hi Luis,

There are many useful ways to cut the data. Even raw statistics on number of
lines of code under each license; number of independent foundations/projects
that have adopted each license; types of software under each license; etc.
can be interesting. I'd like to know which licenses are used by government
agencies; for-profit software companies; non-profits. Most useful would be a
way of listing large or important projects and the licenses they use, as
long as the list of such projects is broad and comprehensive. 

I have no idea how Black Duck or others calculate their statistics nor what
is included in their samples, so the lack of methodological openness is more
of a problem than the availability of statistics. I hope that OSI can
address these questions as scientists would, rather than as religious
zealots for one sect or another.

Regarding the classification of licenses, I think it is most important to
categorize licenses in the same business-related terminology that relates to
business models. So you need to identify which licenses ignore or have
antiquated provisions regarding patents, and why that might matter; which
licenses require reciprocity; whether that reciprocity includes use by third
parties over a network or whether it is a strong or weak reciprocity;
which licenses contain defensive suspension provisions (patent only or
copyright also) that require due diligence before reliance on that software;
which licenses are definitely incompatible with each other for derivative
work purposes; which licenses are approved for use by the US or other
governments; which contain attribution requirements beyond a subset of basic
requirements; which contain jurisdiction or governing law provisions; etc.
Of course, OSI should identify licenses that have been superseded or
withdrawn by the author.
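
A classification along these lines could be recorded as one structured entry per license. The Python sketch below is hypothetical throughout: the `LicenseProfile` record, its field names and the example values are invented for illustration and cover only some of the criteria listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LicenseProfile:
    """Hypothetical record of criteria named in this thread; field names are illustrative."""
    name: str
    explicit_patent_grant: Optional[bool] = None   # None means "not yet assessed"
    reciprocity: Optional[str] = None              # e.g. "none", "weak", "strong", "network"
    defensive_suspension: Optional[str] = None     # e.g. "patent", "patent+copyright"
    governing_law_clause: Optional[bool] = None
    superseded_or_withdrawn: Optional[bool] = None


def unassessed_fields(profile):
    """List the criteria that nobody has filled in yet for this license."""
    return [field for field, value in vars(profile).items() if value is None]
```

Keeping "not yet assessed" distinct from "no" matters here, because an empty cell in such a table would otherwise read as a finding about the license.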

Good luck doing this with scientific precision.

/Larry 

Lawrence Rosen
Rosenlaw & Einschlag, a technology law firm (www.rosenlaw.com)
3001 King Ranch Rd., Ukiah, CA 95482
Office: 707-485-1242


 ___
 License-discuss mailing list
 License