Re: [ccp4bb] mmCIF as working format?

Jeffrey, Philip D. Wed, 07 Aug 2013 16:55:38 -0700

 Nat Echols wrote:
> Personally, if I need to change a chain ID, I can use Coot or pdbset or many
> other tools.  Writing code for this should only be necessary if you're
> processing large numbers of models, or have a spectacularly misformatted
> PDB file.


Problem.  Coot is bad at the chain label aspect.
Create a PDB file containing residues A1-A20 and X101-X120 (non-overlapping
numbering).
Try to change the chain label of X to A.
I get "WARNING:: CONFLICT: chain id already exists in this molecule"

This is (IMHO) a bizarre feature because this is exactly the sort of thing you 
do when building structures.

Therefore I do one of two things:
1.  Open it in (x)emacs, replace " X " with " A " and Bob's your uncle.
2.  Start Peek2 - that's my interactive program for doing simple and stupid 
things like this.  I type "read test.pdb" and "chain" and Peek2 prompts me at 
perceived chain breaks (change in chain label, CA-CA breaks, ATOM/HETATM 
transitions &c) and then "write test.pdb".   Takes less than 10 seconds.  CCP4i 
would probably still be launching, as would Phenix.
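
For anyone who would rather script option 1 than trust a blanket text replace, here is a minimal sketch. It assumes standard fixed-column PDB records, where the chain ID sits in column 22, and only touches that column. The helper name relabel_chain and the two demo records are made up for illustration; this is not how Coot, pdbset or Peek2 do it.

```python
# Sketch only: relabel a chain in fixed-column PDB records.
# Assumption: standard PDB layout, chain ID in column 22 (string index 21).
# relabel_chain is a hypothetical helper, not part of Coot, pdbset or Peek2.

COORD_RECORDS = ("ATOM  ", "HETATM", "ANISOU", "TER   ")


def relabel_chain(lines, old, new):
    """Return a copy of `lines` with chain ID `old` changed to `new`."""
    if len(old) != 1 or len(new) != 1:
        raise ValueError("PDB chain IDs are a single character")
    out = []
    for line in lines:
        # Only coordinate-type records carry a chain ID in column 22.
        if line.startswith(COORD_RECORDS) and len(line) > 21 and line[21] == old:
            line = line[:21] + new + line[22:]
        out.append(line)
    return out


# Two made-up records, one per chain, just to exercise the function.
demo = [
    "ATOM      1  CA  ALA A   1      11.000  22.000  33.000  1.00 20.00           C",
    "ATOM    150  CA  GLY X 101      14.000  25.000  36.000  1.00 20.00           C",
]
for line in relabel_chain(demo, "X", "A"):
    print(line)
```

To run it on a real file, read the lines with something like open("test.pdb").read().splitlines() and write the result back out.  Note that the sketch does not check for clashing residue numbers in the target chain; with non-overlapping numbering as in the example above that is not an issue, but otherwise it is left to the user.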

The reason I do #1 or #2 is not to be a Luddite, but to do something trivial 
and boring quickly so I can get back to something interesting like building 
structures, or beating subjects to death on CCP4bb.

What's lacking is an interactive, or just plain fast, way of doing the simple
PDB manipulations that we do tons of times when building protein structures.
I've used Peek2 thousands of times for this purpose, which is the only reason
it still exists, because it's a fairly stupid program.  A truly interactive
version of PDBSET would be splendid, but as it stands it always runs in batch
mode.

mmCIF looked promising, apropos emacs, when I looked at the spec page at:
http://www.iucr.org/__data/iucr/cifdic_html/2/cif_mm.dic/Catom_site.html
because the ATOM data there is column-formatted.  Cool.  However, looking at
6LYZ.cif from RCSB's site revealed that the XYZ's were LEFT-justified:
http://www.rcsb.org/pdb/files/6LYZ.cif
which makes me recoil in horror and resolve to use PDB format until someone 
puts a gun to my head.

Really, guys, if you can put multiple successive spaces to the RIGHT of the
number, why didn't you put them to the LEFT of it instead?  Same parsing,
better readability.
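
To make the "same parsing" point concrete, here is a small sketch.  It assumes the reader splits each row on whitespace, and the coordinate values, field width and function names are invented for illustration (they are not taken from 6LYZ.cif).

```python
# Sketch only: left- vs right-justified numeric columns.
# Assumption: rows are parsed by splitting on whitespace.
# The values and the field width of 8 are made up for illustration.

def format_row(values, align, width=8):
    """Format floats in fixed-width fields; align is '<' (left) or '>' (right)."""
    return " ".join(f"{v:{align}{width}.3f}" for v in values)


def parse_row(text):
    """Whitespace-split a row back into floats."""
    return [float(tok) for tok in text.split()]


rows = [[1.5, -12.25, 103.125], [-101.0, 7.75, 0.5]]

print("left-justified:")
for r in rows:
    print(format_row(r, "<"))

print("right-justified:")
for r in rows:
    print(format_row(r, ">"))

# Both layouts parse to identical numbers.
for r in rows:
    assert parse_row(format_row(r, "<")) == parse_row(format_row(r, ">")) == r
```

With right justification the decimal points line up down the column, which is what makes the numbers easy to scan by eye or to edit as a rectangle in emacs; a whitespace-splitting parser cannot tell the two layouts apart.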

Phil Jeffrey
Princeton
(using the vernacular but deathly serious about protein structure)
