Re: [ccp4bb] mmCIF as working format?

Tim Gruene Mon, 05 Aug 2013 06:49:58 -0700

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Ian,


yes, as Andrew also pointed out, I meant to refer to the ratio of xml
vs. cif rather than cif vs. PDB.

My worry was that future versions of programs like refmac, phenix, or
buster-TNT would only write large files (in xml-format) and no more
PDB files, so that I would have to work with xml or cif on my desktop
computer. I am not worried about what format I download from the PDB
itself.

If that's not the case, I am pleased. Even if it is the case, and
that future change is driven by the majority, I guess I will have to
live with it, even though I have objections to most of the replies to
the thread I opened (except on the file size).

Best,
Tim

P.S.: I would hold that bet against you, but I don't think this is the
right place to discuss this.


On 08/05/2013 03:35 PM, Ian Clifton wrote:
> On 05/08/13 09:03, Tim Gruene wrote:
> 
>> reading Gerard Kleywegt's latest announcement on the wwPDB
>> Workshop (1st August) made me wonder whether it is planned to
>> introduce mmCIF as a working format for users, in addition to its
>> use at e.g. the PDB, because I think that would make life
>> unnecessarily complicated.
> 
> There’s nothing to stop you using your /own/ working format—it’s
> easy to extract a simpler file from the full archive file—but the
> archive file obviously has to contain the full set of metadata, and
> to be useful, that metadata has to be easily parsable.
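
As a rough illustration of extracting a simpler file, here is a minimal
Python sketch that pulls the _atom_site loop out of an mmCIF file and
writes one whitespace-separated line per atom. It is not a full CIF
parser: it assumes each atom row sits on a single line and that no value
contains whitespace. The function names atom_site_rows and simple_lines
and the default column selection are made up for this example.

```python
def _unquote(value):
    """Remove one pair of matching surrounding quotes, if present."""
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
        return value[1:-1]
    return value


def atom_site_rows(cif_lines):
    """Yield one dict per row of the _atom_site loop.

    Assumption: every row is on a single line and no value contains
    whitespace, so a plain split() is enough.
    """
    tags = []
    for raw in cif_lines:
        line = raw.strip()
        if line.startswith("_atom_site."):
            # Header part of the loop: collect the column names.
            tags.append(line[len("_atom_site."):].split()[0])
        elif tags:
            # After the header, stop at the first line that is not a data row.
            if not line or line.startswith(("#", "loop_", "_", "data_")):
                break
            values = [_unquote(v) for v in line.split()]
            if len(values) == len(tags):
                yield dict(zip(tags, values))


def simple_lines(cif_lines, columns=("label_atom_id", "label_comp_id",
                                     "auth_asym_id", "auth_seq_id",
                                     "Cartn_x", "Cartn_y", "Cartn_z")):
    """Return one line per atom, suitable for grep or awk."""
    return [" ".join(row[c] for c in columns)
            for row in atom_site_rows(cif_lines)]
```

Called as simple_lines(open(path)), this gives a line-oriented listing
of the chosen columns. Anything beyond coordinates (multi-line values,
other categories) would need a real CIF library.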
> 
> 
>> The example mmCIF file for GroEL is about 7.5 times bigger than
>> its PDB file. I know that disk space is 'cheap' nowadays, but
>> that does not make it fast.
>> 
>> And personally I find mmCIF very awkward to work with, since it
>> is not line-oriented. 'grep', 'awk', 'perl' etc. do not work well
>> on XML-like files. Instead of using mmCIF, one could, e.g.,
>> introduce a free-format PDB format, with placeholders for
>> non-assigned entities, and maybe a line continuation character.
> 
> Are you sure you’re talking about the CIF‐based mmCIF format here,
> not the XML‐based PDBx format? mmCIF shouldn’t be much bigger than
> PDB.
> 
>> If mmCIF is not going to be the working format for MX
>> (refinement) programs I would be happy for a reassurance, and
>> otherwise I would appreciate some comments about the benefits of
>> an XML file format over a line-oriented free format for the
>> scientists that work with structural data. In my opinion, using
>> XML (or mmCIF) for structural information is an attempt by
>> programmers to make themselves more indispensable to scientists,
>> rather than scientifically needed.
> 
> Even when searching the “simple” PDB format, you’re likely to
> encounter problems with line endings. Imagine trying to find all
> files containing PEG: your script must reliably recognise something
> like:
> 
> REMARK 280 CRYSTALLIZATION CONDITIONS: 1.0M LITHIUM SULPHATE, 100MM POLY
> REMARK 280   ETHYLENE GLYCOL
> 
> —in fact this sort of thing is much /easier/ to do, given the
> proper tools, in a format like XML.
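
For the PDB side of this comparison, a minimal Python sketch of such a
search could look like the following. It joins the text of all REMARK 280
records before matching, so a term wrapped across two records is still
found. The names remark_text and mentions_peg are invented for this
example, and joining continuation records with a single space is an
assumption: a word split in the middle would still be missed.

```python
import re


def remark_text(pdb_lines, number=280):
    """Join the free-text part of all REMARK <number> records."""
    prefix = f"REMARK {number:3d}"
    parts = []
    for line in pdb_lines:
        line = line.rstrip("\r\n")
        if line.startswith(prefix):
            parts.append(line[len(prefix):].strip())
    # Assumption: records are wrapped at word boundaries.
    return " ".join(parts)


# "PEG" not followed by a letter (so PEG4000 matches, PEGylated does not),
# or the spelled-out name with any amount of whitespace between the words.
PEG = re.compile(r"\bPEG(?![A-Za-z])|POLY\s*ETHYLENE\s*GLYCOL",
                 re.IGNORECASE)


def mentions_peg(pdb_lines):
    """True if the crystallization remarks mention PEG in some spelling."""
    return PEG.search(remark_text(pdb_lines)) is not None
```

A plain 'grep PEG' or 'grep "POLYETHYLENE GLYCOL"' on the same file
would miss the wrapped record, which is the point of the example.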
> 
> With file formats, the devil is always in the details. If you set
> out to create a “line‐oriented, free format” PDB replacement, and
> you carefully ironed out all the potential ambiguities and awkward
> corner cases, I bet you’d come up with something close to mmCIF.

- -- 
- --
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iD8DBQFR/61NUxlJ7aRr7hoRAvpyAJ4oq9fWcHA657hZNCix7xoK4ktxgQCgrlx2
C+7EqGgVGKo1J3+6tZHMSqk=
=mdO9
-----END PGP SIGNATURE-----
