On Fri, Feb 19, 2010 at 1:26 PM, Egon Willighagen <
[email protected]> wrote:

> On Fri, Feb 19, 2010 at 2:08 PM, Peter Murray-Rust <[email protected]>
> wrote:
> > GeoffH has made a useful suggestion (on the BO Stack Overflow,
> > http://blueobelisk.stackexchange.com/questions/231/what-formats-fall-into-open-specification)
> > which I'd be happy to take as a starting point.
>
> Likewise...
>
> > -----
> > If [a spec/standard] is freely modifiable, then it's likely to fragment.
> > Look at HTML for an example.
> >
> > OpenSMILES is, perhaps, an interesting example, since there's an open
> > group working on the specification. But the same is true of HTML.
>
> Interestingly... you could argue that OpenSMILES is a fork of SMILES.
> That is, SMILES is sufficiently open to even allow OpenSMILES.
>
> > The phrases should probably be something like:
> >
> > openly shared and distributed
> > open to multiple implementations
> > open to improvement
>
> The latter is to me a rather important part: allowing modification. We
> formalize this by means of licenses in Open Source, and by means of
> waivers and licenses in Open Data. *But* for both we reached the
> consensus that the right for modification should be formalized.
>
This does not mean that there is a universal absolute right to absolute
modification, but that the process of modification should be formalized.
Within that some things will be permitted or encouraged and others will be
forbidden or discouraged.


> > In short, there's a mechanism to clarify implementation details (and/or a
> > set of validation tests), and some sort of community process.
> >
> > I'm still not sure if you can avoid HTML (or PDB or SDF) fragmentation.
> > But I think that's in line with what you're suggesting.
>
> I am not worried about fragmentation. Human habit is not to favor
> small projects and small forks; instead, the best 'fork' wins.
>
>
I take a completely different view here. There were several early
implementations of CML done without my knowledge or the simple courtesy of
contacting me/Henry. There are programs supplied by commercial companies
which "save as CML" and the CML is not conformant. One company simply
wrapped a load of completely random XML in <cml>... </cml> wrappers. Then
people try to read this and it fails and they blame CML. So I believe a right to
modify a spec without consultation, and in any way, is harmful.



> > Calling it "freely modifiable" seems like the wrong tone.
> > ---
>
> How so?
>
> > Instead it is more useful to the authors and the
> > community to express what they feel is important and to iterate toward a
> > rough consensus. In my own case of CML I completely sign up to all
> > Geoff's suggestions. CML is sharable and distributable, there are several
> > coordinated multiple implementations and there is a community process
> > (open mailing list, contributors from outside the authorship). The balance
> > between fragmentation and innovation is a difficult one and can best be
> > managed through open discussion and creation of software artifacts.
>
> This is how it works for Open Source too, and soon Open Data likely
> too. I see no reason why allowing 'modification' would immediately
> cause heavy fragmentation, and loss of momentum; I do not understand
> the source of this fear... did I miss an important example where
> allowing this caused the specification project to die?
>

CML was developed specifically for the process of creating a
machine-readable specification that could enforce validation (as far as I am
aware it's the only one in chemistry). There are several variations in CML
allowing controlled flexibility. They include CMLSpect, CMLComp and CMLLite
to be released at ACS. The former two were developed by consultation between
several groups. They are not a dictatorial imposition. However they do
provide conformance processes. A typical example was JSpecview where RobertL
wished to restrict the type of CML data that the program could accept and
reject. He made the decision, not me. IMO that's a good example of community
process.

"Community process" does not mean an anarchic free-for-all where anyone can
do anything. It means a process where interested members of the community
can have their voice heard and their contributions accepted if appropriate.
It does not necessarily rely on a majority voting system. It certainly
involves work on governance which is hard and time-consuming. It cannot be
controlled by specifying the process in a licence.

It's worth thinking about how Openness in MDL-molfile has evolved. It certainly
was not even public, let alone Open, when it was first deployed (I ran a
meeting in 1983 where this was a significant issue). It's gradually evolved
to a stage where it is widely distributed and widely used - and there are
validators - we wrote one for CML with MDL's sponsorship. In that sense
it's a community process. However as far as I know there are no Open methods
for it to evolve. It's copyright Symyx (and probably trademarked) and Symyx
have the legal right to forbid any modification of the spec. As far as I know
neither MDL nor Symyx has made any official statement about the public
modification of MDL-mol. AFAIK they are disinterestedly benevolent but it's
still their spec, not ours. So I regard MDL as de facto openly accessible
and implementable but not modifiable.

P.


-- 
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069
_______________________________________________
Blueobelisk-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/blueobelisk-discuss
