On Dec 10, 2009, at 6:58 PM, Peter Murray-Rust wrote:
> But, for the record:
>
> CML HAS a licence (Artistic) specified in the pom.xml file associated with
> the project.

== The existence of "LICENSE.txt"

The file LICENSE.txt is in the same directory as pom.xml and says:

  CML Schema is distributed under a Creative Commons license, allowing redistribution but NOT derivative works. This is to ensure that the schema does not mutate.

There are two main CC licenses which do not allow derivative works. These are:

  http://creativecommons.org/licenses/by-nd/2.0/
    "Attribution-No Derivative Works 2.0 Generic"
  http://creativecommons.org/licenses/by-nc-nd/3.0/
    "Attribution-Noncommercial-No Derivative Works 3.0"

It is not possible to tell which of those is meant. Under the worst case scenario, it means the latter, and would clearly not be open. This must be clarified for it to be useable.

The Creative Commons site recommends that its licenses should not be used for source code, in preference to the public domain, BSD, GPL, or LGPL. As the XSD file is a form of source code (it is input to a validator), the two listed CC licenses are likely a poor choice, or the use of an XSD file as the normative document is a poor choice. For one, it prevents translation to other schema formats, or to validating XML parsers.

== Does LICENSE.txt take precedence over pom.xml?

Given the name "LICENSE.txt" I think it's fair to say that it has precedence over a file named "pom.xml". The pom.xml file does not mention the extra qualifications from LICENSE.txt ("Any distribution must acknowledge the origins and also include copies of the JUMBO source", "You may not claim that a modified version is a compliant CML system and may not assert that it reads or writes CML.").

Someone searching for a license should be expected to consider that the pom.xml is an approximation to the license for the entire package, perhaps limited by some schema definition which does not allow the full details.

I searched for and found that pom.xml is part of Maven.
The license section is at http://maven.apache.org/pom.html#Licenses and it's clear to see that the schema indeed is not powerful enough to handle license statements which apply only to a part of the software.

== Suppose JUMBO is licensed under an unmodified "Artistic License" (and not the "Artistic License 2.0" version)

Let's suppose though that I take CML to also be distributed under the "Artistic License". The link in the pom.xml file is to:

  http://www.opensource.org/licenses/artistic-license.php

which says it's to the outdated license, and people should upgrade to the 2.0 license. Let's suppose the license is really the outdated "Artistic License" (1.0) and not the 2.0 license. The FSF quite clearly says:

  http://www.fsf.org/licensing/licenses/index_html#ArtisticLicense

  We cannot say that this is a free software license because it is too vague; some passages are too clever for their own good, and their meaning is not clear. We urge you to avoid using it, except as part of the disjunctive license of Perl.

and puts it under the category "The following licenses do not qualify as free software licenses."

I see that Debian does allow the Artistic License under its Free Software Guidelines:

  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=381729

but acceptance does not imply compatibility with the LGPL or GPL.

Note also that every package which distributes Java binaries (CDK, JChemPaint, and others) is in violation of this 1.0 license because they do not distribute the source to JUMBO. Few of them even list using JUMBO. This is in the process of changing, due to my making a nuisance of myself.

I doubt at this point that any of the maintainers of those projects are willing to make a decision contrary to the FSF, although asking the licensing wizards of Debian may help clarify things.

== Suppose JUMBO is distributed under "Artistic License 2.0"

Is the 2.0 license compatible with the LGPL?
Let's suppose the license really should be 2.0 (and in private email you have said that is the case). That means CDK, etc. must either adhere to this strange clause:

  (8) You are permitted to link Modified and Standard Versions with other works, to embed the Package in a larger work of your own, or to build stand-alone binary or bytecode versions of applications that include the Package, and Distribute the result without restriction, provided the result does not expose a direct interface to the Package.

which Bioclipse, I'm told, does not do, or use clause 4(c)(ii) and relicense the source using any license which permits the licensee to freely copy, modify and redistribute the Modified Version using the same licensing terms that apply to the copy that the licensee received, and requires that the Source form of the Modified Version, and of any works derived from it, be made freely available in that license fees are prohibited but Distributor Fees are allowed.

  Distribution of Compiled Forms of the Standard Version or Modified Versions without the Source

Larry Wall meant for this to be compatible with the GPL and the FSF agrees that it is compatible, but it's not obvious to me that a possible relicense includes the LGPL. The binary distribution (which CDK etc. do) is a work derived from JUMBO, which means it must include a copyright which allows free access to copy, modify, and redistribute the source code for that work, in its entirety. This would seem to allow the GPL but not the LGPL, which has a more limited intent.

When combined with clause 8, where the original goal was to prohibit embedding JUMBO in another package, I think it's fair to say the intent is to discourage the use of the LGPL over the GPL.

I am not a lawyer, but as someone reviewing the license before deciding to include in my package I would be wary. Then again, I'm a BSD license fan so I'm already wary.

== Remember, LICENSE.txt prohibits modified versions from claiming CML support

The LICENSE.txt file is clearly meant to override the terms of the Artistic License, saying:

  You may not claim that a modified version is a compliant CML system and may not assert that it reads or writes CML.

Bioclipse, BTW, distributes "jar/jumbo-with-fix-by-jonalv.jar" which is a patched form of JUMBO so clearly Bioclipse is not allowed to claim CML support.

I believe this text also means that any derived license must also contain this override, so a relicense to the LGPL, if allowed, must also enforce this claim. I believe that doing so is allowed by the LGPL because the LGPL says

  Activities other than copying, distribution and modification are not covered by this License; they are outside its scope.

I would not guarantee that this clause and the LGPL are compatible. I strongly suggest it is not allowed by Debian, as this would correspond roughly to misuse of the invariant sections in the GNU FDL.

I have brought this issue up with several of the projects which use JUMBO, and pointed out that they are in technical violation of the Artistic License (or of the 2.0 form). One such is Egon, and I'm sure he's wondering how this will be resolved.

== Possible resolutions to the licensing conflict

The simplest would be to relicense everything under the LGPL, and without the prohibition against saying that modified code reads or writes CML. (Or relicense under the BSD, or put into the public domain.) This would be clearly compatible with the existing LGPL projects which use JUMBO. This is acceptable to me, and would have no downstream consequences.

Next simplest would be to remove the clause and say that LGPL is a valid relicensing of clause 4(c)(ii). This would be the disjunctive form of (Artistic License 2.0 | LGPL). This is acceptable to me and would have no downstream consequences - everyone would relicense it under the LGPL.

Next next simplest would be to leave the clause in and clarify that LGPL is allowed. However, it would still mean that Bioclipse, which uses a patched library, could not say that it reads/writes CML files. There are obvious downstream consequences, especially if packaged with something like Debian.

Do bear in mind that if the software can be used under the LGPL then anyone is free to take the CML xsd schema and alter it under the terms of the LGPL. That is something which you've several times said was not your intent, which means the XSD, when used as code, is not free or open source software. I personally don't consider it worrisome.

== The JUMBO license must be clarified, or the package must be removed from CDK, JChemPaint, Bioclipse, etc.

> That is a valid thing to do. It's not very convenient for some people. When
> I find time I'll do soimething.

First, you are correct. You do not need to do anything at all.

I will make a very legal point. As free and open source software developers (and commercial as well), we must be firm believers of strong copyright. The free software movement could not exist without it.

Currently, *every* LGPL-based project which includes JUMBO and CML is in violation of the stated "Artistic License", which the FSF clearly categorizes as "not free." They also do not distribute the JUMBO source, which is required. That's easily fixed, and some of them are reconciling this specific lack.

Many of them are in violation of the "Artistic License 2.0", either because they expose part of the JUMBO or CML API (violating section (8)), or haven't relicensed the software to some form which allows that. I'm not certain that the 2.0 license is compatible with the LGPL, and neither am I certain that the restrictions you want for the XSD file can be passed over to an LGPL file.

At least one package (Bioclipse) includes a patched version of JUMBO, which means it may no longer claim that it supports CML.

So long as you and Rzepa make no changes to the license, then most of the downstream packages which use JUMBO should feel obligated to remove JUMBO from their binary distributions, because leaving it in would violate the explicit license requirements.

You can say the license is otherwise, but please be specific as to what license you are actually releasing this under (version number, and even better would be the actual license text rather than an ambiguous URL and incomplete license name), and help resolve these conflicts I've highlighted; and hopefully also change the license files so it is not ambiguous to others in the future. Adding an explicit copyright statement to the distribution would also help (see below).

If the downstream packages do not remove JUMBO support, then it's either because the maintainers don't feel there is a license violation (if you, dear reader, are one such maintainer, please tell me why that's the case) or they understand that you or Rzepa, as copyright holders, would not sue them or otherwise take action. That latter case should be extremely distasteful to anyone who believes in a strong copyright and clear licenses, and places a minor but unexpected risk on further downstream users which must at least be acknowledged prominently in the license section of the help and documentation.

== Is the existing license clear?

> If someone wishes to help I'd be delighted. CML copyright is Henry Rzepa and
> Peter Murray-Rust. They wrote it and they've published it. You may not like
> the licence and you may not like the authorship but it's clear.

It is clear?

It's clear that there are two different documents in the distribution which assert a license, and that these two documents say different things. One is a subset of the other.

It's clear that the stated "Artistic License" does not count as "free software" according to the FSF, which also says that it is not compatible with the GPL and by inference, neither with the LGPL.
(All evidence says the license is "Artistic License" and NOT "Artistic License 2.0".)

It's clear that the clause in LICENSE.txt about using a Creative Commons license without derivations can refer to two different licenses, one of which prohibits commercial use.

It's clear that I'm a persistent nit-picker.

I would like to help. Relicense the entirety of JUMBO under the LGPL, without additional constraints on how modifications prohibit using the name "CML", and it would be fine for everyone.

== JUMBO lacks an explicit copyright notice

Now that you bring up copyright ownership, I see that nothing inside the JUMBO distribution states who owns the copyright. The only relevant mention of 'copyright' or 'copy right' or '(c)' or '©' I could find (from searching in jumbo-5.5-b1) was:

  Morgan.java: * @author pm286 Copyright P.Murray-Rust, 29-May-2005 Artistic license

"Rzepa" exists in pom.xml as a "developer" along with many others, and in the "LICENSE.txt" file along with your name. The latter just needs the magic text "copyright by" before those two names to make me happy. A year would be nice, but is not required.

The Artistic License does require an explicit copyright notice, saying:

  "Copyright Holder" means the individual(s) or organization(s) named in the copyright notice for the entire Package.

== Why don't we need a license for SMILES, MDL's CT formats, PDB, ... ?

> In contrast there is no license for Daylight SMILES specification that I know
> of. There is no licence for ctfile.pdf (the online documentation of MDL
> files). There's no licence for most biological data specs - PDB, etc. files
> are covered by Community Norms.

That's because they are not needed.

I suspect this is a first-thought reaction rather than a considered opinion, as otherwise it would imply you don't know the purpose of a license. I will explain, for clarity's sake and for those reading along.

Licenses are only needed to give rights that are not otherwise allowed under copyright.
(Or to restrict those rights, but that's a different point I'll get to in a bit.)

== How to get the original SMILES paper

The SMILES specification was first published in JCICS (1988) and is also available in a more recent book chapter also by Weininger. Additional documentation is available online for your review, and if you email Daylight and ask for a written copy of the tutorial books, which includes the online documentation, there's a good chance they'll send you a copy.

The copyright to the JCICS paper is owned by the ACS, which makes the paper available for purchase on a non-discriminatory basis. They are partners with Copyright Clearance Center, which has standard rules to "secure permission for reuse of material from ACS Journals, whether that reuse is in a book/textbook, journal/magazine, newspaper/newsletter, classroom, thesis/dissertation, to make photocopies, or to order reprints." However, RightsLink does not appear to work for that article, which is DOI: 10.1021/ci00057a005 and available at

  http://pubs.acs.org/doi/abs/10.1021/ci00057a005?prevSearch=%255Bauthor%253A%2BWeininger%255D%2BAND%2B%255Btitle%253A%2BSMILES%255D&searchHistoryKey=

The paper may be downloaded from the ACS through a variety of ways. One way is through temporary access (US$30 for 48 hours), which would allow getting the PDF and printing it out.

Additional copies of the article exist at a number of sites, including one which is about 30 minutes walking distance from where I am right now. I would have to wait until opening hours, but I would be able to view it at no cost, take notes, and bring my laptop with me to implement code while in the library, if I wanted to. You being at Cambridge have no doubt much easier access to it than I, and know how to get a copy.

== Rights under copyright law

Once you have a copy, you are free to implement something which can parse and create SMILES strings. Nothing in the ACS copyright prevents you from doing that.
There is no code or text which you need to incorporate, and the small data table of default valences for the organic subset and the list of which elements are aromatic clearly fall into fair use. There is nothing at all in the spec which requires a license, which is why one is not needed. To think otherwise would mean you want all journals which do a copyright transfer for a format specification to also do a license transfer.

The same holds for the MDL CT file spec. The PDF file is covered under copyright, but there is nothing in an implementation based on the spec which would require access beyond what copyright law already gives.

The same holds for the PDB spec. It's a bit more complicated because it requires that some text content be present, but this almost certainly falls into the category of not containing "a modicum of originality."

== Remember, I'm saying that SMILES is an open spec, not that CML is closed!

Let's compare that to the CML. As far as I am aware, the CML has never been published without the stated license, so I don't know if my full rights under copyright exist. That is, has the license document restricted me from certain rights I would otherwise have without the license? I could ask a lawyer friend or two of mine. I honestly don't know.

But really, my point is NOT to say that CML is not an open specification. It's to say that SMILES and even the MDL CT file formats are no less open than CML.

(To be picky, a third point is that the specification is given as an XSD, which happens to also be usable as input to a computer program. If I were to create my own XSD from the file, it would almost certainly look similar, and this conversion would likely not fall under fair use. There are ways around this so it isn't relevant.)

== JUMBO and CML use in LGPL'ed projects requires a license extension to copyright

As a consequence of doing this research I found that JUMBO and CML licenses are not well stated, and the LGPL projects which use that software are almost certainly in violation of the license agreement. In this case it isn't a question of fair use. Copyright law does not by default give these other people the right to distribute the JUMBO and CML software and schema and so they must have a license in order to do so.

Licenses are only needed to give rights that are not otherwise allowed under copyright.

== Misunderstanding my goal?

> The BO is not a secret society that excludes people - it's a shared vision
> that will try to improve its approach in response to what happens in the
> chemical community and the wider world. The normal way to become part is to
> offer help of some sort.

I don't understand this paragraph. I for one never claimed it was a secret society, and I think I have been quite generous with my time in detailing the flaws and limitations of admittedly rather obscure but fundamental licensing issues. I have also outlined several different ways in which it can be resolved, and some of the consequences of doing so.

The Blue Obelisk project does as a matter of course exclude some people. It excludes those who disagree with the fundamental goals of Open Source, Open Data, Open Standards. There is absolutely no problem with this exclusion.

It also excludes those like me who consider some of its assertions as being too much at odds with reality, to the point of making statements which are not at all justifiable. The idea that SMILES is a proprietary specification while CML is not is by far the biggest. I think I have gone into extreme detail on why SMILES is an open specification, and I think I have effectively countered every objection raised to my assertion.
In doing so I think I also mentioned many things which would help refine what an open standard or specification means.

If I believe in the goals of the Blue Obelisk and want it to succeed then I can help by showing that these statements are unfounded, which would make the project more approachable and acceptable to others like me. Perhaps were it changed enough then I would feel comfortable saying that I am a member, rather than an observer and a participant in some of the BO projects.

Best regards,

Andrew
