On Wed, Jan 20, 2016 at 7:42 PM, Dimitri Maziuk <dmaz...@bmrb.wisc.edu>
wrote:

> On 01/20/2016 04:57 PM, Peter S. Shenkin wrote:
> > On Wed, Jan 20, 2016 at 5:33 PM, Dimitri Maziuk <dmaz...@bmrb.wisc.edu>
> > wrote:
>
> >> JSON encodes a single string. That is a problem for sending larger files
> >> over the net, say, an NMR structure of a larger molecule with 100 models
> >> in the file.
> >>
> >
> > That's not a problem, conceptually, because you can have an array of
> > structures.
>
> No, my point was that streaming isn't a part of JSON specification and
> common implementations do not offer it.
>
> https://en.wikipedia.org/wiki/JSON_Streaming

> You can cut one model out of a PDB file (or one structure out of an
> SDF) and the result is a valid file.
>

If each array element were a complete structure, the same would be true
here. A PDB-aware JSON API could wrap a streaming unpacker around a batch
implementation of choice.
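
A minimal sketch of what I mean, assuming the file is one top-level JSON
array whose elements are complete structures. The name iter_json_array is
made up for illustration; the only library call it relies on is
json.JSONDecoder.raw_decode from the Python standard library, which is the
"batch implementation" here. It does only minimal validation and re-parses
an incomplete element on every new chunk, so treat it as a sketch, not a
tuned implementation.

```python
import io
import json


def iter_json_array(stream, chunk_size=65536):
    """Yield the elements of a top-level JSON array one at a time.

    `stream` is any text file-like object. Only the element currently
    being decoded has to fit in memory, not the whole array.
    """
    decoder = json.JSONDecoder()
    buf = ""
    started = False
    while True:
        chunk = stream.read(chunk_size)
        buf += chunk
        while True:
            buf = buf.lstrip()
            if not buf:
                break  # buffer exhausted: read more
            if not started:
                if buf[0] != "[":
                    raise ValueError("expected a top-level JSON array")
                buf, started = buf[1:], True
                continue
            if buf[0] == "]":
                return  # end of the array
            if buf[0] == ",":
                buf = buf[1:]
                continue
            try:
                value, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                break  # element is incomplete: read more
            rest = buf[end:].lstrip()
            if not rest or rest[0] not in ",]":
                break  # value may be cut off mid-number: read more
            yield value
            buf = rest
        if not chunk:
            raise ValueError("input ended before the array was closed")


if __name__ == "__main__":
    text = '[{"model": 1, "atoms": []}, {"model": 2, "atoms": []}]'
    for model in iter_json_array(io.StringIO(text), chunk_size=8):
        print(model["model"])
```

Each yielded element could then be handed to whatever builds a molecule
from one structure, so a 100-model NMR file never has to be held in memory
as a single decoded object.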


> In ASN.1 the length of the value is at the front.


I believe that depends on the encoding, and in any case, streaming ASN.1
decoders are available. But none are freeware, as far as I know.
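
To illustrate the "depends on the encoding" part: as I understand the BER
rules (X.690), the length octets come right after the tag and can be a
short form (one octet, 0 to 127), a long form (first octet gives the number
of length octets that follow), or an indefinite form (0x80), where the
content is instead terminated by end-of-contents octets. DER allows only
the definite forms. The function name below is made up; this is a sketch
of reading just the length field, not a full decoder.

```python
def read_ber_length(data, offset=0):
    """Decode BER length octets starting at data[offset].

    Returns (length, next_offset). length is None for the indefinite
    form, where the content runs until the end-of-contents octets.
    """
    first = data[offset]
    offset += 1
    if first < 0x80:
        return first, offset  # short form: the octet is the length
    if first == 0x80:
        return None, offset  # indefinite form: no length up front
    if first == 0xFF:
        raise ValueError("length octet 0xFF is reserved")
    count = first & 0x7F  # long form: number of length octets
    if offset + count > len(data):
        raise ValueError("truncated length field")
    length = int.from_bytes(data[offset:offset + count], "big")
    return length, offset + count
```

So with the definite forms a reader knows how many bytes to expect before
it sees the value, while with the indefinite form it has to scan for the
terminator.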

> have a file full of "disjoint" single structures, possibly with
> some kind of metadata header. (I haven't touched ASN.1 since school, so
> don't quote me on this.)
>

Yes, I think that's right, though I've not used ASN.1 for a long time
either.

> Oh wait, that sounds exactly like PDB with its REMARKs and MODELs.
>

No, it doesn't, because the problem that I thought we were trying to
address is rather the lack of extensibility, the lack of lower-case, the
fact that different users (even for deposited structures, IIRC) and
different software products overload the available fields differently (like
putting partial charge in the Temperature Factor field) and have violated
the standard by doing necessary but formally disallowed things such as
using multiple CONECT fields to indicate multiple bonds.

Having said all this, it would suffice to write APIs that allow
specification of a dialect (CHARMM, PDB_STD, etc.) and have a convention
for returning all the contents in arrays, dictionaries, what have you,
where the keys reflect the semantics of the dialect (like "partial_charge"
or "T_factor"), and where the unused keys would return NULL.
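
A rough sketch of such an API for a single ATOM record, under stated
assumptions: the column positions are the standard fixed PDB columns, the
dialect name "CHARGE_IN_TFACTOR" and the function name parse_atom_line are
hypothetical, and only a handful of fields are shown. The point is just
that the dialect decides which semantic key columns 61-66 are reported
under, and the other key comes back as None.

```python
# Which semantic key columns 61-66 map to, per dialect.
# "PDB_STD" is the standard meaning; "CHARGE_IN_TFACTOR" is a
# hypothetical dialect name for files that overload the field.
FIELD_61_66 = {
    "PDB_STD": "T_factor",
    "CHARGE_IN_TFACTOR": "partial_charge",
}


def parse_atom_line(line, dialect="PDB_STD"):
    """Parse one ATOM/HETATM line into a dict with dialect-aware keys."""
    if dialect not in FIELD_61_66:
        raise ValueError("unknown dialect: %r" % dialect)
    record = {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain_id": line[21].strip(),
        "res_seq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        # Every semantic key is always present; unused ones stay None.
        "T_factor": None,
        "partial_charge": None,
    }
    record[FIELD_61_66[dialect]] = float(line[60:66])
    return record


# Build an example line with a format string so the columns line up.
example = (
    "ATOM  {:5d} {:<4s}{:1s}{:3s} {:1s}{:4d}{:1s}   "
    "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}"
).format(1, " CA", "", "ALA", "A", 1, "", 11.104, 6.134, -6.504, 1.00, -0.42)

if __name__ == "__main__":
    print(parse_atom_line(example, dialect="PDB_STD"))
    print(parse_atom_line(example, dialect="CHARGE_IN_TFACTOR"))
```

The same line yields T_factor = -0.42 under one dialect and
partial_charge = -0.42 under the other, so callers never have to guess
what the file's author meant by that column.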

So then, a separate question is whether there also needs to be a serialized
format for the resulting object that associated APIs can also read and
write.

'Nuff said. (By me, at least, since I'm not volunteering to do it. :-) )

-P.
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
