Hi Ian,

yes, as Andrew also pointed out, I meant to refer to the ratio of XML
vs. CIF rather than CIF vs. PDB. My worry was that future versions of
programs like refmac, phenix, or buster-TNT would write only large files
(in XML format) and no longer PDB files, so that I would have to work
with XML or CIF on my desktop computer. I am not worried about which
format I download from the PDB itself.

If that is not the case, I am pleased. If it is the case, and that
future change is majority-driven, I guess I will have to live with it,
even though I have objections to most of the replies to the thread I
opened (except on the matter of file size).

Best,
Tim

P.S.: I would hold that bet against you, but I don't think this is the
right place to discuss this.

On 08/05/2013 03:35 PM, Ian Clifton wrote:
> On 05/08/13 09:03, Tim Gruene wrote:
>
> >> Reading Gerard Kleywegt's latest announcement on the wwPDB
> >> Workshop (1st August) made me wonder whether it is planned to
> >> introduce mmCIF as a working format for users in addition to
> >> using it at, e.g., the PDB, because I think that would make life
> >> unnecessarily complicated.
>
> There’s nothing to stop you using your /own/ working format: it’s
> easy to extract a simpler file from the full archive file, but the
> archive file obviously has to contain the full set of metadata, and
> to be useful, that metadata has to be easily parsable.
>
> >> The example mmCIF file for GroEL is about 7.5 times bigger than
> >> its PDB file. I know that disk space is 'cheap' nowadays, but
> >> that does not make it fast.
> >>
> >> And personally I find mmCIF very awkward to work with, since it
> >> is not line-oriented. 'grep', 'awk', 'perl' etc. do not work well
> >> on XML-like files. Instead of using mmCIF, one could, e.g.,
> >> introduce a free-format PDB format, with placeholders for
> >> non-assigned entities, and maybe a line continuation character.
>
> Are you sure you’re talking about the CIF‐based mmCIF format here,
> not the XML‐based PDBx format? mmCIF shouldn’t be much bigger than
> PDB.
>
> >> If mmCIF is not going to be the working format for MX
> >> (refinement) programs I would be happy for a reassurance, and
> >> otherwise I would appreciate some comments about the benefits of
> >> an XML file format over a line-oriented free format for the
> >> scientists that work with structural data. In my opinion, using
> >> XML (or mmCIF) for structural information is an attempt by
> >> programmers to make themselves more indispensable to scientists,
> >> rather than scientifically needed.
>
> Even when searching the “simple” PDB format, you’re likely to
> encounter problems with line endings. Imagine trying to find all
> files containing PEG: your script must reliably recognise something
> like:
>
> REMARK 280 CRYSTALLIZATION CONDITIONS: 1.0M LITHIUM SULPHATE, 100MM POLY
> REMARK 280 ETHYLENE GLYCOL
>
> In fact, this sort of thing is much /easier/ to do, given the
> proper tools, in a format like XML.
>
> With file formats, the devil is always in the details. If you set
> out to create a “line‐oriented, free format” PDB replacement, and
> you carefully ironed out all the potential ambiguities and awkward
> corner cases, I bet you’d come up with something close to mmCIF.

-- 
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A
