Hi Marcus,

I completely agree. I think "overly verbose but flexible" is the definition
of JSON, and its verbosity can be frustrating at times. It's also messy in
statically typed languages - particularly typecasting as you unravel highly
nested structures.

Supporting a maximally readable JSON would help out the Python/JS scripting
folks, but will inevitably add some overhead and annoyance to the C++
workflow. Maybe the right answer is that JSON/mongo will never be the right
answer for high performance. If a project gets to a point where you're
profiling file I/O, then maybe JSON should be replaced by e.g. a simple
serialized vector.
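
To make the "simple serialized vector" idea a bit more concrete, here is a rough Python sketch. The record layout (a count header followed by one byte for the atomic number and three doubles per atom) and the function names are made up for illustration. They are not part of CJSON or Avogadro, and the JSON keys are just borrowed from your example.

```python
import json
import struct

# Hypothetical data: each atom is (atomic number, x, y, z).
atoms = [(4, 1.1, 1.1, 0.0), (6, 0.0, 0.0, 0.0)]

# Assumed layout: little-endian, unpadded, 1 unsigned byte + 3 doubles.
RECORD = struct.Struct("<Bddd")
HEADER = struct.Struct("<I")  # atom count as a 4-byte unsigned int


def to_json(atoms):
    """Readable, self-describing form: one object per atom."""
    doc = {
        "atoms": [
            {"atomicNumber": n, "x3": x, "y3": y, "z3": z}
            for n, x, y, z in atoms
        ]
    }
    return json.dumps(doc).encode("utf-8")


def to_flat(atoms):
    """Flat form: count header, then fixed-size records back to back."""
    return HEADER.pack(len(atoms)) + b"".join(RECORD.pack(*a) for a in atoms)


def from_flat(buf):
    """Read the records back; atom i sits at a fixed, computable offset."""
    (count,) = HEADER.unpack_from(buf, 0)
    return [
        RECORD.unpack_from(buf, HEADER.size + i * RECORD.size)
        for i in range(count)
    ]


if __name__ == "__main__":
    print("JSON bytes:", len(to_json(atoms)))
    print("flat bytes:", len(to_flat(atoms)))
    print("round trip ok:", from_flat(to_flat(atoms)) == atoms)
```

The flat form gives up the labels and self-description in exchange for fixed offsets and a smaller payload, which is roughly the trade-off being discussed here.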

Pat

On Mon, Nov 21, 2016 at 10:25 AM, Marcus D. Hanwell <
marcus.hanw...@kitware.com> wrote:

> On Mon, Nov 21, 2016 at 10:47 AM, Patrick Fuller
> <patrickful...@gmail.com> wrote:
> > For what it's worth, I thought I'd try to add my outside opinion to the
> > discussion.
> >
> > Where I'm coming from - I don't think that performance has ever been the
> > motivation behind the JSON format. I view JSON as developer friendly and
> > easy to implement, but also as the first thing to abandon if performance
> > becomes limiting. Even optimized BSON structures (I think) require at
> > least one hash lookup, so it'll never be as efficient as the O(1) you
> > get with a flat array filled with reference indices.
> >
> > In that spirit, my opinion would be that CJSON should focus on developer
> > friendliness at the cost of performance. CJSON should make it easier for
> > a first-year grad student to write useful code, even if it's obscenely
> > slow and/or requires some 2TB HDDs.
> >
> Thank you for adding your thoughts. If you want to push in that
> direction, shouldn't we develop a format that treats atoms and bonds as
> objects, with labels, and offers some redundancy in favor of ease of
> use? This is largely where CML went with its representation. I can see
> the utility, but it isn't the style of format I wanted to work with, and
> it feels overly verbose, if very flexible.
>
> Something like (made up and not validated):
>
> {
>   "molecule": "myMolecule",
>   "inchi": "correctInChI_here",
>   "name": "friendlyName",
>   "atoms": [
>     { "atomicNumber": 4, "atomicSymbol": "Be", "x3": 1.1, "y3": 1.1,
>       "z3": 0.0, "label": "a1", "customLabel": "Bob" },
>     { "atomicNumber": 6, "atomicSymbol": "C", "x3": 0.0, "y3": 0.0,
>       "z3": 0.0, "label": "a2" }
>   ],
>   "bonds": [ { "label": "b1", "order": 1, "connections": [ "a1", "a2" ] } ],
>   "properties": { "propertyKey": "valuePair" }
> }
>
> You could condense the x, y, z into a vector of length 3. You can then
> just query each atom/bond object, but you still need some API to do the
> lookup of bond connections to atoms. At that point I wonder if you
> might be better off just developing a little utility code to sit above
> the format.
>
> I can see the temptation to use JSON as an interface due to the
> Python/JavaScript language support. Having gone down the path of
> "everything is an object" before, I wanted to pursue a different path,
> but I see some utility in the first-year grad student model. I think for
> me I would focus on two distinct use cases rather than try to put
> everything into one approach. It should be relatively simple to move
> between them.
>
> I am trying to finish writing up a paper that, among other things, looks
> at these two different approaches, but I guess I err on the side of
> minimalism. This type of thing is more amenable to the "use the format
> as API" style of approach, and if I spent more time in Python/JavaScript
> it may be something I would be more inclined to use. It wouldn't be hard
> to support both, and part of me would like to benchmark and look at
> storage implications out of curiosity.
>
------------------------------------------------------------------------------
_______________________________________________
Avogadro-devel mailing list
Avogadro-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/avogadro-devel
