On Mon, Nov 21, 2016 at 10:47 AM, Patrick Fuller
<patrickful...@gmail.com> wrote:
> For what it's worth, I thought I'd try to add my outside opinion to the
> discussion.
>
> Where I'm coming from - I don't think that performance has ever been the
> motivation behind the JSON format. I view JSON as developer friendly and
> easy to implement, but also as the first thing to abandon if performance
> becomes limiting. Even optimized BSON structures (I think) require at least
> one hash lookup, so it'll never be as efficient as the O(1) you get with a
> flat array filled with reference indices.
>
> In that spirit, my opinion would be that CJSON should focus on developer
> friendliness at the cost of performance. CJSON should make it easier for a
> first-year grad student to write useful code, even if it's obscenely slow
> and/or requires some 2TB HDDs.
>
Thank you for adding your thoughts. If you want to push in that
direction, shouldn't we develop a format that treats atoms and bonds as
objects, with labels, and offers some redundancy in favor of ease of
use? This is largely where CML went with its representation. I can see
the utility, but it isn't the style of format I wanted to work with, and
it feels overly verbose, if very flexible.

Something like (made up and not validated):

{
  "molecule": "myMolecule",
  "inchi": "correctInChI_here",
  "name": "friendlyName",
  "atoms": [
    { "atomicNumber": 4, "atomicSymbol": "Be", "x3": 1.1, "y3": 1.1,
      "z3": 0.0, "label": "a1", "customLabel": "Bob" },
    { "atomicNumber": 6, "atomicSymbol": "C", "x3": 0.0, "y3": 0.0,
      "z3": 0.0, "label": "a2" }
  ],
  "bonds": [ { "label": "b1", "order": 1, "connections": [ "a1", "a2" ] } ],
  "properties": { "propertyKey": "valuePair" }
}

You could condense the x, y, z into a vector of length 3. You can then
just query each atom/bond object, but you still need some API to look
up which atoms a bond connects. At that point I wonder if you might be
better off just developing a little utility code to sit above the
format.
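
A minimal sketch of what such utility code might look like in Python,
assuming the made-up object-style layout above. The helper names
index_by_label and bonded_atoms are hypothetical and not part of any
existing library:

```python
import json

# Made-up document in the object-style layout from this message (not a real spec).
DOC = json.loads("""
{
  "atoms": [
    {"atomicNumber": 4, "atomicSymbol": "Be", "label": "a1"},
    {"atomicNumber": 6, "atomicSymbol": "C", "label": "a2"}
  ],
  "bonds": [{"label": "b1", "order": 1, "connections": ["a1", "a2"]}]
}
""")


def index_by_label(items):
    """Build a label -> object dictionary so lookups avoid a linear scan."""
    return {item["label"]: item for item in items}


def bonded_atoms(doc, bond_label):
    """Return the atom objects joined by the bond with the given label."""
    atoms = index_by_label(doc["atoms"])
    bonds = index_by_label(doc["bonds"])
    return [atoms[label] for label in bonds[bond_label]["connections"]]


symbols = [atom["atomicSymbol"] for atom in bonded_atoms(DOC, "b1")]
print(symbols)  # ['Be', 'C']
```

Each label lookup here is a dictionary (hash) lookup, which is the cost
the quoted message contrasts with direct indexing into a flat array.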

I can see the temptation to use JSON as an interface due to the
Python/JavaScript language support. Having gone down the "everything
is an object" path before, I wanted to pursue a different one, but I
see some utility in the first-year grad student model. For me, I would
focus on two distinct use cases rather than try to put everything into
one approach; it should be relatively simple to move between them.
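
To illustrate moving between the two styles, here is a rough Python
sketch that converts the object-style layout into parallel flat arrays
and back. The flat key names (atomicNumbers, coords, bondIndices,
bondOrders) are assumptions made up for this example, not the actual
CJSON schema:

```python
def to_flat(doc):
    """Convert object-style atoms/bonds into parallel flat arrays."""
    # Map each atom label to its position so bonds can refer to indices.
    position = {atom["label"]: i for i, atom in enumerate(doc["atoms"])}
    return {
        "atomicNumbers": [a["atomicNumber"] for a in doc["atoms"]],
        "coords": [c for a in doc["atoms"] for c in (a["x3"], a["y3"], a["z3"])],
        "bondIndices": [position[l] for b in doc["bonds"] for l in b["connections"]],
        "bondOrders": [b["order"] for b in doc["bonds"]],
    }


def to_objects(flat):
    """Rebuild object-style atoms/bonds, generating labels a1, a2, ... and b1, ..."""
    atoms = []
    for i, number in enumerate(flat["atomicNumbers"]):
        x, y, z = flat["coords"][3 * i : 3 * i + 3]
        atoms.append({"atomicNumber": number, "x3": x, "y3": y, "z3": z,
                      "label": f"a{i + 1}"})
    bonds = []
    for j, order in enumerate(flat["bondOrders"]):
        begin, end = flat["bondIndices"][2 * j : 2 * j + 2]
        bonds.append({"label": f"b{j + 1}", "order": order,
                      "connections": [f"a{begin + 1}", f"a{end + 1}"]})
    return {"atoms": atoms, "bonds": bonds}


doc = {
    "atoms": [
        {"atomicNumber": 4, "x3": 1.1, "y3": 1.1, "z3": 0.0, "label": "a1"},
        {"atomicNumber": 6, "x3": 0.0, "y3": 0.0, "z3": 0.0, "label": "a2"},
    ],
    "bonds": [{"label": "b1", "order": 1, "connections": ["a1", "a2"]}],
}
flat = to_flat(doc)
print(flat["bondIndices"])  # [0, 1]
```

In this sketch the redundant fields (atomicSymbol, customLabel) do not
survive the round trip, so a real converter would need to decide where
that extra information lives.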

I am trying to finish writing up a paper that, among other things,
looks at these two different approaches, but I guess I err on the side
of minimalism. The object style is more amenable to the "use the
format as the API" approach, and if I spent more time in
Python/JavaScript it might be something I would be more inclined to
use. It wouldn't be hard to support both, and part of me would like to
benchmark them and look at the storage implications out of curiosity.

------------------------------------------------------------------------------
_______________________________________________
Avogadro-devel mailing list
Avogadro-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/avogadro-devel
