> I specifically wouldn’t worry about space considerations of the sub-arrays, 
> but I don’t ever worry about space for JSON since it just wasn’t intended for 
> that. Right now I think the biggest CJSON file we’re testing with for the 
> python interface is about 117k, which I don’t think of as large. 
...
> - are there really two formats here: (1) a format designed for internal use, 
> ease of importing / exporting straight to/from avogadro internal structures, 
> and optimized for minimal size, likely using BSON and (2) a public format 
> designed for readability and semantic explicitness?


I guess the side question about size is whether compressed CJSON is already 
sufficient for many purposes. (I understand BSON is useful for MongoDB, but I'm 
assuming an "array of arrays" is still possible in BSON.)

The 117 KB file that Paul mentions is a 978-atom nanotube. It gzips to 20 KB.

I checked a few PDB files:
1fha (1361 atoms) = 172 KB CJSON = 46 KB gzipped
5jh9 (~15000 atoms) = 2.6 MB PDB file = 2.1 MB CJSON = 472 KB gzipped CJSON
1rcx (~39000 atoms) = 3.2 MB PDB file = 5.1 MB CJSON (real space) = 1.2 MB gzipped
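
For anyone who wants to repeat this kind of comparison, here is a minimal sketch in Python that measures raw and gzipped sizes for a flat coordinate array versus an array of arrays. The coordinates are random stand-ins rather than a real structure, the "coords" key is only illustrative and not meant as the actual CJSON schema, and encoded_sizes is a hypothetical helper name.

```python
import gzip
import json
import random


def encoded_sizes(obj):
    """Return (raw_bytes, gzipped_bytes) for the compact JSON encoding of obj."""
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# Synthetic stand-in data: random coordinates, not a real molecule.
random.seed(0)
n_atoms = 1000
coords = [
    [round(random.uniform(-50.0, 50.0), 5) for _ in range(3)]
    for _ in range(n_atoms)
]

# Flat layout: [x0, y0, z0, x1, y1, z1, ...]
flat = {"coords": [value for xyz in coords for value in xyz]}
# Tuple layout: [[x0, y0, z0], [x1, y1, z1], ...]
nested = {"coords": coords}

for name, doc in (("flat", flat), ("array of arrays", nested)):
    raw, packed = encoded_sizes(doc)
    print(f"{name}: {raw} bytes raw, {packed} bytes gzipped")
```

In the uncompressed compact encoding, the array-of-arrays layout costs exactly two extra bracket characters per atom. The gzipped difference depends on the data, so it is worth measuring on real files rather than guessing.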

I think the idea of an optional tuple version is a good compromise, combined 
with a format version bump. The C++ code can easily support both variations for 
backwards compatibility.
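
As a rough illustration of what supporting both variations could look like, here is a sketch in Python (not the actual C++ reader). The function name read_coords is hypothetical, and it assumes the tuple version is an array of three-element arrays while the current version is one flat array of numbers.

```python
def read_coords(values):
    """Return a list of (x, y, z) tuples from either coordinate layout.

    Accepts a flat array [x0, y0, z0, x1, ...] or an array of
    three-element arrays [[x0, y0, z0], [x1, y1, z1], ...].
    """
    if not values:
        return []

    if isinstance(values[0], (list, tuple)):
        # Tuple layout: every entry must hold exactly three numbers.
        if any(len(entry) != 3 for entry in values):
            raise ValueError("each coordinate entry must have 3 values")
        return [tuple(float(c) for c in entry) for entry in values]

    # Flat layout: the length must be a multiple of three.
    if len(values) % 3 != 0:
        raise ValueError("flat coordinate array length must be a multiple of 3")
    return [
        tuple(float(c) for c in values[i:i + 3])
        for i in range(0, len(values), 3)
    ]


# Both layouts give the same result.
flat = [0.0, 0.0, 0.0, 1.0, 2.0, 3.0]
nested = [[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]]
assert read_coords(flat) == read_coords(nested)
```

Detecting the layout from the first element keeps the reader independent of the version field, although a real implementation would presumably also check the format version.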

-Geoff
------------------------------------------------------------------------------
_______________________________________________
Avogadro-devel mailing list
Avogadro-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/avogadro-devel
