On Dec 10, 2009, at 6:58 PM, Peter Murray-Rust wrote:
> But, for the record:
>  
> CML HAS a licence (Artistic) specified in the pom.xml file associated with 
> the project.

 == The existence of "LICENSE.txt"

The file LICENSE.txt is in the same directory as pom.xml and says:

   CML Schema is distributed under a Creative Commons license,
   allowing redistribution but NOT derivative works. This is
   to ensure that the schema does not mutate.

There are two main CC licenses which do not allow derivative works. These are:

  http://creativecommons.org/licenses/by-nd/2.0/
    "Attribution-No Derivative Works 2.0 Generic"

  http://creativecommons.org/licenses/by-nc-nd/3.0/
    "Attribution-Noncommercial-No Derivative Works 3.0"

It is not possible to tell which of those is meant. Under the worst case 
scenario, it means the latter, and would clearly not be open. This must be 
clarified for it to be useable.

The Creative Commons site recommends against using its licenses for source 
code, and suggests the public domain, BSD, GPL, or LGPL instead. As the XSD 
file is a form of source code (it is input to a validator), the two listed CC 
licenses are likely a poor choice, or the use of an XSD file as the normative 
document is a poor choice. For one, it prevents translation to other schema 
formats, or to validating XML parsers.

  == Does LICENSE.txt take precedence over pom.xml?

Given the name "LICENSE.txt" I think it's fair to say that it has precedence 
over a file named "pom.xml".

The pom.xml file does not mention the extra qualifications from LICENSE.txt 
("Any distribution must acknowledge the origins and also include copies of the 
JUMBO source", "You may not claim that a modified version is a compliant CML 
system and may not assert that it reads or writes CML.").  Someone searching 
for a license should be expected to consider that the pom.xml is an 
approximation to the license for the entire package, perhaps limited by some 
schema definition which does not allow the full details.

I searched for and found that pom.xml is part of Maven. The license section is 
at
  http://maven.apache.org/pom.html#Licenses
and it's clear to see that the schema indeed is not powerful enough to handle 
license statements which apply only to a part of the software.
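
A minimal Python sketch of that limitation, assuming a made-up pom.xml fragment 
(the SAMPLE_POM text, the read_licenses helper, and the comment string are all 
hypothetical; only the name/url/distribution/comments layout comes from the 
Maven POM reference linked above). Each license entry carries those four 
free-text fields and nothing that scopes it to one part of the package:

```python
import xml.etree.ElementTree as ET

# Hypothetical pom.xml fragment for illustration; this is NOT the actual
# JUMBO pom.xml. Only the <licenses> element layout follows the Maven docs.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <licenses>
    <license>
      <name>Artistic License</name>
      <url>http://www.opensource.org/licenses/artistic-license.php</url>
      <distribution>repo</distribution>
      <comments>Illustrative entry only</comments>
    </license>
  </licenses>
</project>
"""

POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# The four child elements a Maven <license> entry may carry.
LICENSE_FIELDS = ("name", "url", "distribution", "comments")


def read_licenses(pom_text):
    """Return one dict per <license> entry found in the POM text."""
    root = ET.fromstring(pom_text)
    found = []
    for lic in root.findall("m:licenses/m:license", POM_NS):
        entry = {}
        for field in LICENSE_FIELDS:
            # findtext returns None when the child element is absent.
            entry[field] = lic.findtext("m:" + field, default=None,
                                        namespaces=POM_NS)
        found.append(entry)
    return found
```

Calling read_licenses(SAMPLE_POM) gives a one-element list. Extra terms such as 
those in LICENSE.txt could at best be squeezed into the free-text comments 
field, which is why the pom.xml entry reads as an approximation.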


  == Suppose JUMBO is licensed under an unmodified "Artistic License"
        (and not the "Artistic License 2.0" version)

Let's suppose though that I take CML to also be distributed under the "Artistic 
License". The link in the pom.xml file is to:

  http://www.opensource.org/licenses/artistic-license.php

which says it's to the outdated license, and people should upgrade to the 2.0 
license.

Let's suppose the license is really the outdated "Artistic License" (1.0) and 
not the 2.0 license. The FSF quite clearly says:

   http://www.fsf.org/licensing/licenses/index_html#ArtisticLicense
   We cannot say that this is a free software license because it
   is too vague; some passages are too clever for their own good,
   and their meaning is not clear. We urge you to avoid using it,
   except as part of the disjunctive license of Perl.

and puts it under the category "The following licenses do not qualify as free 
software licenses." I see that Debian does allow the Artistic License under its 
Free Software Guidelines:
  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=381729
but acceptance does not imply compatibility with the LGPL or GPL.

Note also that every package which distributes Java binaries (CDK, JChemPaint, 
and others) is in violation of this 1.0 license because they do not distribute 
the source to JUMBO. Few of them even list using JUMBO. This is in the process 
of changing, due to my making a nuisance of myself.

I doubt at this point that any of the maintainers of those projects are willing 
to make a decision contrary to the FSF, although asking the licensing wizards 
of Debian may help clarify things.

  == Suppose JUMBO is distributed under "Artistic License 2.0"
                  Is the 2.0 license compatible with the LGPL?

Let's suppose the license really should be 2.0 (and in private email you have 
said that is the case). That means CDK, etc. must either adhere to this strange 
clause:

    (8) You are permitted to link Modified and Standard Versions
    with other works, to embed the Package in a larger work of
    your own, or to build stand-alone binary or bytecode versions
    of applications that include the Package, and Distribute the
    result without restriction, provided the result does not
    expose a direct interface to the Package.

which Bioclipse, I'm told, does not do, or use clause 4(c)(ii) and relicense the 
source using any license which

    permits the licensee to freely copy, modify and redistribute
    the Modified Version using the same licensing terms that apply
    to the copy that the licensee received, and requires that the
    Source form of the Modified Version, and of any works derived
    from it, be made freely available in that license fees are
    prohibited but Distributor Fees are allowed. Distribution of
    Compiled Forms of the Standard Version or Modified Versions
    without the Source

Larry Wall meant for this to be compatible with the GPL and the FSF agrees that 
it is compatible, but it's not obvious to me that a possible relicense includes 
the LGPL. The binary distribution (which CDK etc. do) is a work derived from 
JUMBO, which means it must include a copyright which allows free access to 
copy, modify, and redistribute the source code for that work, in its entirety. 
This would seem to allow the GPL but not the LGPL, which has a more limited 
intent.

When combined with clause 8, where the original goal was to prohibit embedding 
JUMBO in another package, I think it's fair to say the intent is to discourage 
the use of the LGPL over the GPL.

I am not a lawyer, but as someone reviewing the license before deciding to 
include it in my package I would be wary. Then again, I'm a BSD license fan so I'm 
already wary.

  == Remember, LICENSE.txt prohibits modified versions from claiming CML support

The LICENSE.txt file is clearly meant to override the terms of the Artistic 
License, saying:

  You may not claim that a modified version is a compliant CML
  system and may not assert that it reads or writes CML.

Bioclipse, BTW, distributes "jar/jumbo-with-fix-by-jonalv.jar" which is a 
patched form of JUMBO so clearly Bioclipse is not allowed to claim CML support.

I believe this text also means that any derived license must also contain this 
override, so a relicense to the LGPL, if allowed, must also enforce this claim. 
I believe that doing so is allowed by the LGPL because the LGPL says

    Activities other than copying, distribution and modification are
    not covered by this License; they are outside its scope.

I would not guarantee that this clause and the LGPL are compatible. I strongly 
suggest it is not allowed by Debian, as this would correspond roughly to misuse 
of the invariant sections in the GNU FDL.


I have brought this issue up with several of the projects which use JUMBO, and 
pointed out that they are in technical violation of the Artistic License (or of 
the 2.0 form). One such is Egon, and I'm sure he's wondering how this will be 
resolved.

  == Possible resolutions to the licensing conflict

The simplest would be to relicense everything under the LGPL, and without the 
prohibition against saying that modified code reads or writes CML. (Or relicense 
under the BSD, or put into the public domain.) This would be clearly compatible 
with the existing LGPL projects which use JUMBO. This is acceptable to me, and 
would have no downstream consequences.

Next simplest would be to remove the clause and say that LGPL is a valid 
relicensing of clause 4(c)(ii). This would be the disjunctive form of (Artistic 
License 2.0 | LGPL). This is acceptable to me and would have no downstream 
consequences - everyone would relicense it under the LGPL.

Next next simplest would be to leave the clause in and clarify that LGPL is 
allowed. However, it would still mean that Bioclipse, which uses a patched 
library, could not say that it reads/writes CML files. There are obvious 
downstream consequences, especially if packaged with something like Debian.


Do bear in mind that if the software can be used under the LGPL then anyone is 
free to take the CML xsd schema and alter it under the terms of the LGPL. That 
is something which you've several times said was not your intent, which means 
the XSD, when used as code, is not free or open source software. I personally 
don't consider it worrisome.


  == The JUMBO license must be clarified, or the package must
             be removed from CDK, JChemPaint, Bioclipse, etc.

>  That is a valid thing to do. It's not very convenient for some people. When 
> I find time I'll do soimething.

First, you are correct. You do not need to do anything at all.

I will make a very legal point. As free and open source software developers 
(and commercial as well), we must be firm believers in strong copyright. The 
free software movement could not exist without it.

Currently, *every* LGPL-based project which includes JUMBO and CML is in 
violation of the stated "Artistic License", which the FSF clearly categorizes 
as "not free." They also do not distribute the JUMBO source, which is required. 
That's easily fixed, and some of them are reconciling this specific lack.

Many of them are in violation of the "Artistic License 2.0", either because 
they expose part of the JUMBO or CML API (violating section (8)), or haven't 
relicensed the software to some form which allows that. I'm not certain that 
the 2.0 license is compatible with the LGPL, and neither am I certain that the 
restrictions you want for the XSD file can be passed over to an LGPL file.

At least one package (Bioclipse) includes a patched version of JUMBO, which 
means it may no longer claim that it supports CML.

So long as you and Rzepa make no changes to the license, then most of the 
downstream packages which use JUMBO should feel obligated to remove JUMBO from 
their binary distributions, because leaving it in would violate the explicit 
license requirements.

You can say the license is otherwise, but please be specific as to which 
license you are actually releasing this under (version number, and even better 
would be the actual license text rather than an ambiguous URL and incomplete 
license name), and help resolve these conflicts I've highlighted; and hopefully 
also change the license files so it is not ambiguous to others in the future. 
Adding an explicit copyright statement to the distribution would also help (see 
below).

If the downstream packages do not remove JUMBO support, then it's either 
because the maintainers don't feel there is a license violation (if you, dear 
reader, are one such maintainer, please tell me why that's the case) or they 
understand that you or Rzepa, as copyright holders, would not sue them or 
otherwise take action. That latter case should be extremely distasteful to 
anyone who believes in a strong copyright and clear licenses, and places a 
minor but unexpected risk on further downstream users which must at least be 
acknowledged prominently in the license section of the help and documentation.


  == Is the existing license clear?

> If someone wishes to help I'd be delighted. CML copyright is Henry Rzepa and 
> Peter Murray-Rust. They wrote it and they've published it. You may not like 
> the licence and you may not like the authorship but it's clear.

It is clear?

It's clear that there are two different documents in the distribution which 
assert a license, and that these two documents say different things. One is a 
subset of the other.

It's clear that the stated "Artistic License" does not count as "free software" 
according to the FSF, which also says that it is not compatible with the GPL 
and by inference, neither with the LGPL. (All evidence says the license is 
"Artistic License" and NOT "Artistic License 2.0".)

It's clear that the clause in LICENSE.txt about using a Creative Commons 
license without derivations can refer to two different licenses, one of which 
prohibits commercial use.


It's clear that I'm a persistent nit-picker.

I would like to help. Relicense the entirety of JUMBO under the LGPL, without 
additional constraints on how modifications prohibit using the name "CML", and 
it would be fine for everyone.


  == JUMBO lacks an explicit copyright notice

Now that you bring up copyright ownership, I see that nothing inside the JUMBO 
distribution states who owns the copyright. The only relevant mention of 
'copyright' or 'copy right' or '(c)' or '&copy' I could find (from searching in 
jumbo-5.5-b1) was:

  Morgan.java: * @author pm286 Copyright P.Murray-Rust, 29-May-2005 Artistic license

"Rzepa" exists in pom.xml as a "developer" along with many others, and in the 
"LICENSE.txt" file along with your name. The latter just needs the magic text 
"copyright by" before those two names to make me happy. A year would be nice, 
but is not required.
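
A search like the one described above can be sketched in Python as follows. 
This is an assumption about how such a scan might be done, not the command 
actually used; the find_copyright_mentions helper is hypothetical, and the 
pattern simply covers the four spellings listed ('copyright', 'copy right', 
'(c)', '&copy'), ignoring case:

```python
import os
import re

# The four spellings mentioned in the text, matched case-insensitively.
COPYRIGHT_RE = re.compile(r"copyright|copy right|\(c\)|&copy", re.IGNORECASE)


def find_copyright_mentions(root_dir):
    """Yield (path, line_number, text) for each line mentioning copyright."""
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                # errors="replace" keeps binary or oddly encoded files
                # from aborting the scan.
                with open(path, encoding="utf-8", errors="replace") as handle:
                    for line_number, line in enumerate(handle, start=1):
                        if COPYRIGHT_RE.search(line):
                            yield path, line_number, line.strip()
            except OSError:
                # Unreadable file: skip it and keep going.
                continue
```

Running list(find_copyright_mentions("jumbo-5.5-b1")) on an unpacked 
distribution would list every candidate line. The '(c)' pattern also matches 
ordinary code such as a call written f(c), so the hits still need to be read 
by eye to decide which ones are relevant.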

The Artistic License does require an explicit copyright notice, saying:

     "Copyright Holder" means the individual(s) or organization(s)
      named in the copyright notice for the entire Package.


  == Why don't we need a license for SMILES, MDL's CT formats, PDB, ... ?

> In contrast there is no license for Daylight SMILES specification that I know 
> of. There is no licence for ctfile.pdf (the online documentation of MDL 
> files). There's no licence for most biological data specs - PDB, etc. files 
> are covered by Community Norms.

That's because they are not needed. I suspect this is a first-thought reaction 
rather than a considered opinion, as otherwise it would imply you don't know 
the purpose of a license. I will explain, for clarity's sake and for those 
reading along.

Licenses are only needed to give rights that are not otherwise allowed under 
copyright. (Or to restrict those rights, but that's a different point I'll get 
to in a bit.)

  == How to get the original SMILES paper

The SMILES specification was first made in JCICS (1988) and is also available 
in a more recent book chapter also by Weininger. Additional documentation is 
available online for your review, and if you email Daylight and ask for a 
written copy of the tutorial books, which includes the online documentation, 
there's a good chance they'll send you a copy.

The copyright to the JCICS paper is owned by the ACS, which makes the paper 
available for purchase on a non-discriminatory basis. They are partners with 
Copyright Clearance Center, which has standard rules to "secure permission for 
reuse of material from ACS Journals, whether that reuse is in a book/textbook, 
journal/magazine, newspaper/newsletter, classroom, thesis/dissertation, to make 
photocopies, or to order reprints." However, RightsLink does not appear to work 
for that article, which is 

  DOI: 10.1021/ci00057a005
and available at
  
http://pubs.acs.org/doi/abs/10.1021/ci00057a005?prevSearch=%255Bauthor%253A%2BWeininger%255D%2BAND%2B%255Btitle%253A%2BSMILES%255D&searchHistoryKey=

The paper may be downloaded from the ACS through a variety of ways. One way is 
through temporary access (US$30 for 48 hours), which would allow getting the 
PDF and printing it out.

Additional copies of the article exist at a number of sites, including one 
which is about 30 minutes walking distance from where I am right now. I would 
have to wait until opening hours, but I would be able to view it at no cost, 
take notes, and bring my laptop with me to implement code while in the library, 
if I wanted to.

You being at Cambridge have no doubt much easier access to it than I, and know 
how to get a copy.

  == Rights under copyright law

Once you have a copy, you are free to implement something which can parse and 
create SMILES strings. Nothing in the ACS copyright prevents you from doing 
that. There is no code or text which you need to incorporate, and the small 
data table of default valences for the organic subset and the list of which 
elements are aromatic clearly fall into fair use.

There is nothing at all in the spec which requires a license, which is why one 
is not needed.

To think otherwise would mean you want all journals which do a copyright 
transfer for a format specification to also do a license transfer.

The same holds for the MDL CT file spec. The PDF file is covered under 
copyright, but there is nothing in an implementation based on the spec which 
would require access beyond that which copyright law already gives.

The same holds for the PDB spec. It's a bit more complicated because it 
requires that some text content be present, but this almost certainly falls 
into the category of not containing "a modicum of originality."

  == Remember, I'm saying that SMILES is an open spec, not that CML is closed!  

Let's compare that to the CML. As far as I am aware, the CML has never been 
published without the stated license, so I don't know if my full rights under 
copyright exist. That is, has the license document restricted me from certain 
rights I would otherwise have without the license?

I could ask a lawyer friend or two of mine. I honestly don't know.

But really, my point is NOT to say that CML is not an open specification. It's 
to say that SMILES and even the MDL CT file formats are no less open than CML.

(To be picky, a third point is that the specification is given as an XSD, which 
happens to also be usable as input to a computer program. If I were to create 
my own XSD from the file, it would almost certainly look similar, and this 
conversion would likely not fall under fair use. There are ways around this so 
it isn't relevant.)

  == JUMBO and CML use in LGPL'ed projects requires a license extension to 
copyright

As a consequence of doing this research I found that JUMBO and CML licenses are 
not well stated, and the LGPL projects which use that software are almost 
certainly in violation of the license agreement. In this case it isn't a 
question of fair use. Copyright law does not by default give these other people 
the right to distribute the JUMBO and CML software and schema and so they must 
have a license in order to do so.

Licenses are only needed to give rights that are not otherwise allowed under 
copyright.

  == Misunderstanding my goal?

> The BO is not a secret society that excludes people - it's a shared vision 
> that will try to improve its approach in response to what happens in the 
> chemical community and the wider world. The normal way to become part is to 
> offer help of some sort.

I don't understand this paragraph. I for one never claimed it was a secret 
society, and I think I have been quite generous with my time in detailing the 
flaws and limitations of admittedly rather obscure but fundamental licensing 
issues. I have also outlined several different ways in which it can be 
resolved, and some of the consequences of doing so.

The Blue Obelisk project does as a matter of course exclude some people. It 
excludes those who disagree with the fundamental goals of Open Source, Open 
Data, Open Standards. There is absolutely no problem with this exclusion.

It also excludes those like me who consider some of its assertions as being too 
much at odds with reality, to the point of making statements which are not at 
all justifiable. The idea that SMILES is a proprietary specification while CML 
is not is by far the biggest. I think I have gone into extreme detail on why 
SMILES is an open specification, and I think I have effectively countered every 
objection raised to my assertion. In doing so I think I also mentioned many 
things which would help refine what an open standard or specification means. 

If I believe in the goals of the Blue Obelisk and want it to succeed then I can 
help by showing that these statements are unfounded, which would make the 
project more approachable and acceptable to others like me. Perhaps were it 
changed enough then I would feel comfortable saying that I am a member, rather 
than an observer and a participant in some of the BO projects.

Best regards,

                                Andrew
                                [email protected]


