On Fri, Nov 18, 2016 at 11:11 AM, Boone, Paul <paulbo...@pitt.edu> wrote:
>
> I’m working with Geoff on the Python plugin architecture. There are a couple
> of small changes to the cjson format I’d like to propose. Currently, the cjson
> has structure that is implied but not explicit in the file itself, and this
> forces an adopter of the file to extrapolate the format instead of just
> reading it. Making the structure explicit will make it much easier to use
> the cjson format, as well as making it more intuitive when looking at or
> editing the file directly.
>
> The changes would be:
>
> (1) group atom coordinates by atom:
>
> i.e.:
>  "3d": [
> [1,2,3],
> [1,2,5],
> etc
> ]
>
> instead of:
> "3d": [
> 1,2,3,1,2,5,
> etc
> ]
>
> (2) group bonds by bond:
>
> i.e.:
>
>   "index" :
>       [
>         [0,1],
>         [0,34],
>         [0,35],
> ...
>      ]
>
> instead of:
>
>   "index" :
>       [
>         0,
>         1,
>         0,
>         34,
>         0,
>         35,
>         ...
>     ]
>
> When the file semantically reflects the actual structure, we can just use
> the cjson as-is without doing anything. Currently, where it doesn’t reflect
> the actual structure, I have to write list comprehensions that are not terribly
> intuitive to marshal the structure back and forth.
>
One of the concerns when developing this was storage in BSON, where
each array needs a type, a length, and then the values, so having many
short arrays is not very efficient. The flat layout also makes the
mapping from Avogadro's internal storage very simple: connectivity,
coordinates etc. are stored much more efficiently in a contiguous
block, which is just written out as a JSON array.

I think developing a small amount of C++, Python and JavaScript API to
interface with the raw arrays is reasonable. If people feel very
strongly about using tuples, the extra space in a JSON text block
isn't too much of a concern, but there is existing code that I would
like to continue supporting for at least the short term.
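
As a rough sketch of what such a Python helper layer over the raw
arrays might look like (the names group and flatten are hypothetical,
not existing Avogadro API, and the values are the ones from the
examples quoted above):

```python
def group(flat, width):
    """Split a flat list into consecutive tuples of `width` items."""
    if len(flat) % width != 0:
        raise ValueError(
            "length %d is not a multiple of %d" % (len(flat), width))
    return [tuple(flat[i:i + width]) for i in range(0, len(flat), width)]


def flatten(grouped):
    """Inverse of group(): concatenate the tuples into one flat list."""
    return [value for item in grouped for value in item]


# Flat arrays as they are stored in the file today.
coords_flat = [1, 2, 3, 1, 2, 5]
bonds_flat = [0, 1, 0, 34, 0, 35]

coords = group(coords_flat, 3)  # one (x, y, z) tuple per atom
bonds = group(bonds_flat, 2)    # one (begin, end) tuple per bond
```

With something like this the file keeps its contiguous layout, and
calling flatten() on the grouped data before writing restores the
flat arrays.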

If we change to tuples we should bump the format number, and keep code
to read/write version 0 in the reader/writer to support older files.
There are a number of other changes we are considering too.
Alternatively we could add a new key for tuples and output that for
the Python code, retaining the flat arrays for compatibility with
existing code. It seems like a reasonable convenience function.
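
To make the second option concrete, here is a minimal sketch of
writing a grouped copy under an additional key while leaving the flat
array untouched. The function name and the key name "3d tuples" are
placeholders I made up for illustration; nothing here is decided.

```python
import json


def with_tuple_key(section, key, width, tuple_key):
    """Return a copy of `section` with a grouped copy of section[key]
    stored under `tuple_key`. The original flat array is kept as-is."""
    flat = section[key]
    if len(flat) % width != 0:
        raise ValueError(
            "length %d is not a multiple of %d" % (len(flat), width))
    out = dict(section)
    out[tuple_key] = [flat[i:i + width] for i in range(0, len(flat), width)]
    return out


# Hypothetical fragment holding only the coordinate array.
coords = {"3d": [1, 2, 3, 1, 2, 5]}
augmented = with_tuple_key(coords, "3d", 3, "3d tuples")
text = json.dumps(augmented)
```

Readers that only know the flat key keep working, and the Python
code can use the grouped key directly, at the cost of storing the
data twice.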

------------------------------------------------------------------------------
_______________________________________________
Avogadro-devel mailing list
Avogadro-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/avogadro-devel
