On Mon, Nov 21, 2016 at 10:47 AM, Patrick Fuller <patrickful...@gmail.com> wrote:

> For what it's worth, I thought I'd try to add my outside opinion to the
> discussion.
>
> Where I'm coming from - I don't think that performance has ever been the
> motivation behind the JSON format. I view JSON as developer friendly and
> easy to implement, but also as the first thing to abandon if performance
> becomes limiting. Even optimized BSON structures (I think) require at least
> one hash lookup, so it'll never be as efficient as the O(1) you get with a
> flat array filled with reference indices.
>
> In that spirit, my opinion would be that CJSON should focus on developer
> friendliness at the cost of performance. CJSON should make it easier for a
> first-year grad student to write useful code, even if it's obscenely slow
> and/or requires some 2TB HDDs.
>

Thank you for adding your thoughts. If you want to push in that direction, shouldn't we develop a format that treats atoms and bonds as objects, with labels, and offers some redundancy in favor of ease of use? This is largely where CML went with its representation. I can see the utility, but it isn't the style of format I wanted to work with, and it feels overly verbose, if very flexible.
Something like (made up and not validated):

    {
      "molecule": "myMolecule",
      "inchi": "correctInChI_here",
      "name": "friendlyName",
      "atoms": [
        {
          "atomicNumber": 4,
          "atomicSymbol": "Be",
          "x3": 1.1,
          "y3": 1.1,
          "z3": 0.0,
          "label": "a1",
          "customLabel": "Bob"
        },
        {
          "atomicNumber": 6,
          "atomicSymbol": "C",
          "x3": 0.0,
          "y3": 0.0,
          "z3": 0.0,
          "label": "a2"
        }
      ],
      "bonds": [
        {
          "label": "b1",
          "order": 1,
          "connections": [ "a1", "a2" ]
        }
      ],
      "properties": {
        "propertyKey": "valuePair"
      }
    }

You could condense the x, y, z into a vector of length 3. You can then just query each atom/bond object, but you still need some API to look up which atoms a bond connects. At which point I wonder if you might be better off just developing a little utility code to sit above the format.

I can see the temptation to use JSON as an interface due to the Python/JavaScript language support. Having gone down the "everything is an object" path before, I wanted to pursue a different one, but I see some utility in the first-year grad student model. For me, I would focus on two distinct use cases rather than try to put everything into one approach; it should be relatively simple to move between them.

I am trying to finish writing up a paper that, among other things, looks at these two different approaches, but I guess I err on the side of minimalism. This type of thing is more amenable to the "use the format as the API" style of approach, and if I spent more time in Python/JavaScript it may be something I would be more inclined to use. It wouldn't be hard to support both, and part of me would like to benchmark them and look at the storage implications out of curiosity.
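As a rough sketch of the "little utility code" idea, the snippet below resolves a bond's connections to atom objects by label, then converts the object layout into flat arrays with index references. It assumes the made-up field names from the example above; the helper names (index_atoms, bonded_atoms, to_flat) and the flat keys (numbers, coords, bond_index, bond_order) are hypothetical and not taken from any existing CJSON specification.

```python
import json

# Hypothetical document in the made-up object-style layout (not a validated schema).
DOC = json.loads("""
{
  "atoms": [
    {"atomicNumber": 4, "x3": 1.1, "y3": 1.1, "z3": 0.0, "label": "a1"},
    {"atomicNumber": 6, "x3": 0.0, "y3": 0.0, "z3": 0.0, "label": "a2"}
  ],
  "bonds": [
    {"label": "b1", "order": 1, "connections": ["a1", "a2"]}
  ]
}
""")


def index_atoms(doc):
    """Map each atom label to its position in the atoms list."""
    return {atom["label"]: i for i, atom in enumerate(doc["atoms"])}


def bonded_atoms(doc, bond_label):
    """Return the atom objects joined by the bond with the given label."""
    by_label = {atom["label"]: atom for atom in doc["atoms"]}
    for bond in doc["bonds"]:
        if bond["label"] == bond_label:
            return [by_label[name] for name in bond["connections"]]
    raise KeyError(bond_label)


def to_flat(doc):
    """Convert the object layout into flat arrays with index references."""
    idx = index_atoms(doc)
    atoms, bonds = doc["atoms"], doc["bonds"]
    return {
        # One atomic number per atom, in list order.
        "numbers": [a["atomicNumber"] for a in atoms],
        # x, y, z for each atom, concatenated into one flat list.
        "coords": [c for a in atoms for c in (a["x3"], a["y3"], a["z3"])],
        # Two atom indices per bond, replacing the string labels.
        "bond_index": [idx[name] for b in bonds for name in b["connections"]],
        # One bond order per bond.
        "bond_order": [b["order"] for b in bonds],
    }
```

Calling bonded_atoms(DOC, "b1") returns the two atom dictionaries for that bond, and to_flat(DOC) gives the minimal array form, which is one way moving between the two styles could look.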