I will just deal with the specific aspects of SMILES and CML.

SMILES is effectively an open standard. CanonicalSMILES (which I think is
the problem) is not.

The SMILES spec has been published though there are some gray areas - like
how aromaticity and tautomers are calculated. The answer here is "whatever
the current commercial program emits". That makes the definition of
aromaticity and tautomerism not open - it can only be determined by reading
the code and that is not universally available.

CanonicalSMILES is a major problem (if it weren't, I suspect InChI would
not have been developed). The paper describing the canonicalisation differs
seriously from the implementation in the program. Many of us
(including me, several times) have asked Daylight to publish the
canonicalisation algorithm. They have consistently refused. CanonicalSMILES
is therefore only usable by the small subset of Daylight subscribers.

I do not know what Andrew's problem is with CML. It's published. The schema
is on Sourceforge. The JUMBO code which is meant to be a reference is Open.
The Chem4Word code which is released today is destined to be Open. Many
people have contributed to CML and JUMBO. Anyone can raise a problem on the
mailing list. I will do my best endeavour to reply. The only problem seems
to be that Henry Rzepa and I have the final say on what does and what
does not go in. That's the same with W3C (TimBL), Python (Guido), etc. I
don't believe a formal specification can be developed by an indefinite
democracy. The process is carried out in the Open. In practice relatively
little changes in CML so there isn't much traffic. If Andrew wants something
to change I'm happy for all the discussion to be in public.

P

On Thu, Dec 10, 2009 at 4:25 PM, Andrew Dalke <[email protected]> wrote:

> I can say on my part that I wanted clarification of why the Blue Obelisk
> wiki page on Open Standards said that SMILES was not an open standard while
> CML was an open standard.
>
> The continued description of SMILES as a closed standard has been an
> irritation to me, because SMILES seems to be about as far from a closed,
> proprietary standard as anything else which exists in this field.
>
> I've also raised the issue that CML does not include a copyright statement
> or have a definite distribution license attached to it, which makes it hard
> for me to consider it "open".
>
> Because of my talking about that over the last week, I think that CML
> license question will be resolved soon, especially as the lack of license
> may affect downstream projects like the CDK and JChemPaint which include the
> CML schema as part of the distribution.
>
>
>

Many of these problems are due to oversight or overwork. That's where
voluntary help is valuable.


> > The word "open" is more political than descriptive.
>
> I agree. I want to know the political views of Blue Obelisk which gets them
> to say that something is open or not. I have my conjectures, but I would
> rather hear something more concrete from them first.
>
The Blue Obelisk is a set of people who share fairly common views. It's a
meritocracy, in that those who contribute are regarded highly. That includes
code, docs, specs, tutorials, advocacy, anything.

In general, if people ask "why doesn't the BO do X", the reply is likely to be
"good idea, would you like to help?"



-- 
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069
_______________________________________________
Blueobelisk-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/blueobelisk-discuss