RE: [PyMOL] why is PyMOL renaming my residues?

2003-10-03 Thread Warren L. DeLano
> >Also note that by some conventions, "2OA2" in a PDB file really means
> >atom "OA22".
> >
> I didn't know that.  What conventions are those?

Well, PDB atom names are supposed to have the atomic symbol right
justified in the first two columns followed by a remoteness indicator
and then a branching number.  However, that isn't a sufficient number of
fields for all situations, such as when significant symmetry is present
in a system, and so a third field is required.  This numeric field
occupies the first column when the atom symbol itself is only one
character.

With hydrogens, this first numeric field has the additional defined
purpose of enumerating (NMR?) equivalent hydrogens, so you will see
atoms like "2HD" and "3HD" in ARG for instance.

However, in the Amber world, the atomic symbol always comes first.  Thus
a PDB hydrogen 2HH2 becomes HH22 in Amber.  Going from PDB to Amber is
easy, but the reverse is not trivial, since Amber "CD1" remains PDB
"CD1", but Amber "HE2" becomes "2HE" and "HH22" becomes "2HH2".  

Ambiguity occurs when there is a four-letter atom name which is not a
hydrogen.  "OA22" in Amber would thus need to become "2OA2" in order to
comply with the PDB convention of having the atomic symbol
right-justified in the first two columns of the atom name field.
However, that convention is only explicitly enforced for amino acids,
so in theory "OA22" would be legal for a non-amino acid, whereas "2OA2"
is required if the residue is an amino acid.
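
The renaming rules described above can be sketched in a few lines of Python. This is only an illustration of the convention as stated in this message, not PyMOL's or Amber's actual code; the function names `pdb_to_amber` and `amber_to_pdb` are made up for the example, and the reverse direction is a guess that ignores the amino acid versus non-amino acid distinction.

```python
def pdb_to_amber(name: str) -> str:
    """Move a leading digit (PDB style) to the end of the name (Amber style)."""
    name = name.strip()
    if name and name[0].isdigit():
        return name[1:] + name[0]
    return name


def amber_to_pdb(name: str, element: str) -> str:
    """Guess the PDB-style name for an Amber-style atom name.

    Assumption (hypothetical rule for this sketch): the trailing digit is
    moved to the front only for hydrogens and for four-character names.
    Whether a non-hydrogen name such as "OA22" should really be rewritten
    depends on the residue type, which this function does not look at.
    """
    name = name.strip()
    is_hydrogen = element.strip().upper() == "H"
    if name[-1:].isdigit() and (is_hydrogen or len(name) == 4):
        return name[-1] + name[:-1]
    return name


# Examples taken from the text above.
print(pdb_to_amber("2HH2"))       # PDB to Amber is the easy direction
print(amber_to_pdb("CD1", "C"))   # three-letter heavy atom is left alone
print(amber_to_pdb("HE2", "H"))   # hydrogen: digit moves to the front
print(amber_to_pdb("OA22", "O"))  # the ambiguous four-letter case
```

The forward direction needs only the name itself, while the reverse direction needs the element (and, strictly, the residue type) as extra input, which is why it is the harder of the two.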

What a mess!

Cheers,
Warren






--
mailto:war...@delanoscientific.com
Warren L. DeLano, Ph.D.
Principal Scientist
DeLano Scientific LLC
Voice (650)-346-1154 
Fax   (650)-593-4020

> >Furthermore, in your example, they [ATOM IDs] are not unique (a
> >mistake?).
> >
> yup, a mistake.
> 
> >However, I am trying to bend PyMOL around to meet your needs a bit
> >better.
> >
> >Towards this end, I've created a new setting "pdb_retain_ids" which
> >preserves the original PDB serial numbers in the output file.
> >
> >In future PyMOL versions, so long as
> >
> >set retain_order, 1
> >set pdb_retain_ids, 1
> >set pdb_no_end_record, 1
> >
> That's absolutely fantastic.  Right now, I have to use PyMOL, MOE and
> AMBER (sander and carnal are the real problems) on the same systems, and
> anything that makes this less painful is great!  I'll probably set
> retain_order and pdb_retain_ids in my .pymolrc.py and upgrade to the CVS
> version within the next couple of days.
> 
> Thanks!
> 
> -michael
> --
> michael lerner
> 
> 
> 




Re: [PyMOL] why is PyMOL renaming my residues?

2003-10-02 Thread michael lerner

Hi Warren,


PyMOL's PDB handling is an attempt to navigate a minefield of
incompatible standards which exist in the conventions of various
software packages.

Ugh.  I know more about incompatible PDB files than I ever wanted to.  I 
once wrote something much like Andrew Dalke's UPDB (see 
http://biopython.org/scriptcentral/).  Unfortunately, my version ended 
up being very slow, so I didn't get a chance to give it to the biopython 
folks.  I keep meaning to go back and speed it up, but I think they have 
a good PDB parser now.



Since version 0.90, PyMOL's behavior has changed.  Nowadays,
your input data:





Would be returned as:
 





...which does preserve atom names, but not the order.  Note that
white-space in the atom names is not preserved (an inherent limitation
in PyMOL -- significant white-space within identifiers is hell on users
and parsers).  


Conventionally, atom names of length 3 or less are placed in the second
column, not the first -- so your "LMN " atom may cause trouble.

That's fine with me.  I just had "LMN " and " XYZ" in there so that I 
could figure out exactly what PyMOL was changing.



Also note that by some conventions, "2OA2" in a PDB file really means
atom "OA22".


I didn't know that.  What conventions are those?


Furthermore, in your example, they [ATOM IDs] are not unique (a mistake?).


yup, a mistake.


However, I am trying to bend PyMOL around to meet your needs a bit
better.

Towards this end, I've created a new setting "pdb_retain_ids" which
preserves the original PDB serial numbers in the output file.

In future PyMOL versions, so long as

set retain_order, 1
set pdb_retain_ids, 1
set pdb_no_end_record, 1

That's absolutely fantastic.  Right now, I have to use PyMOL, MOE and 
AMBER (sander and carnal are the real problems) on the same systems, and 
anything that makes this less painful is great!  I'll probably set 
retain_order and pdb_retain_ids in my .pymolrc.py and upgrade to the CVS 
version within the next couple of days. 


Thanks!

-michael
--
michael lerner




RE: [PyMOL] why is PyMOL renaming my residues?

2003-09-26 Thread Warren L. DeLano
Michael,

PyMOL's PDB handling is an attempt to navigate a minefield of
incompatible standards which exist in the conventions of various
software packages.  Amber in particular poses significant challenges, as
its PDB files are unusual.

Since version 0.90, PyMOL's behavior has changed.  Nowadays,
your input data:

HETATM 1313 OA22 NAP   164  28.315  61.969  12.250   31.54   O
HETATM 1314 OA23 NAP   164  26.554  62.174  14.275   21.05   O
HETATM 1314 ABCD NAP   164  28.554  64.174  16.275   11.05   O
HETATM 1314 XYZ  NAP   164  30.554  66.174  18.275    1.05   O
HETATM 1314  LMN NAP   164  32.554  68.174  20.275   41.05   O

Would be returned as:

HETATM    1 OA22 NAP   164  28.315  61.969  12.250  0.00 31.54   O
HETATM    2 OA23 NAP   164  26.554  62.174  14.275  0.00 21.05   O
HETATM    3 ABCD NAP   164  28.554  64.174  16.275  0.00 11.05   O
HETATM    4  LMN NAP   164  32.554  68.174  20.275  0.00 41.05   O
HETATM    5  XYZ NAP   164  30.554  66.174  18.275  0.00  1.05   O
END

...which does preserve atom names, but not the order.  Note that
white-space in the atom names is not preserved (an inherent limitation
in PyMOL -- significant white-space within identifiers is hell on users
and parsers).  

Conventionally, atom names of length 3 or less are placed in the second
column, not the first -- so your "LMN " atom may cause trouble.
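
As a rough illustration of that column convention, here is a small Python sketch that pads an atom name into the four-character name field. The helper name `atom_name_field` is hypothetical, and the sketch assumes a one-character element symbol; names built on two-letter element symbols would start in the first column and are not handled here.

```python
def atom_name_field(name: str) -> str:
    """Return the four-character atom name field for a PDB record.

    Assumption for this sketch: names of three characters or fewer start
    in the second column of the field, and four-character names fill it.
    """
    name = name.strip()
    if not 1 <= len(name) <= 4:
        raise ValueError("atom name must be 1 to 4 characters long")
    if len(name) == 4:
        return name
    # Leave the first column blank and pad on the right to four characters.
    return " " + name.ljust(3)


for raw in ("LMN", "XYZ ", "OA22", "CA"):
    print(repr(atom_name_field(raw)))
```

Under this rule both three-letter test names end up in the second column, regardless of which column they occupied in the input.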

Also note that by some conventions, "2OA2" in a PDB file really means
atom "OA22".  PyMOL used to work this way (up to 0.90).  Nowadays
(v0.91+), "2OA2" is treated as "2OA2" unless pdb_literal_names is on, in
which case the old behavior returns.  

Note that in PyMOL, the PDB ATOM IDs are treated more as a property of
the PDB file than of the atoms themselves since TER records also have
unique IDs in that sequence and repeated MODELs sometimes do as well.
Furthermore, in your example, they are not unique (a mistake?).
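
A quick way to spot repeated serial numbers before loading a file is sketched below in Python. It is a standalone check, not part of PyMOL; the function name `duplicate_serials` is made up for the example, and it assumes the serial number sits in columns 7-11 of each ATOM or HETATM record.

```python
from collections import Counter


def duplicate_serials(lines):
    """Return serial numbers used by more than one ATOM/HETATM record."""
    counts = Counter()
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            # Columns 7-11 (1-based) hold the serial number.
            serial = line[6:11].strip()
            if serial:
                counts[serial] += 1
    return sorted(serial for serial, n in counts.items() if n > 1)


# Leading parts of the records from the example input in this thread.
records = [
    "HETATM 1313 OA22 NAP   164",
    "HETATM 1314 OA23 NAP   164",
    "HETATM 1314 ABCD NAP   164",
    "HETATM 1314 XYZ  NAP   164",
    "HETATM 1314  LMN NAP   164",
]
print(duplicate_serials(records))
```

TER records and repeated MODEL blocks are deliberately ignored here, so the check only covers the atom records themselves.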

However, I am trying to bend PyMOL around to meet your needs a bit
better.

Towards this end, I've created a new setting "pdb_retain_ids" which
preserves the original PDB serial numbers in the output file.

In future PyMOL versions, so long as

set retain_order, 1
set pdb_retain_ids, 1
set pdb_no_end_record, 1

The following behavior will be obtainable:

Assuming input of:

HETATM 1313 OA22 NAP   164  28.315  61.969  12.250   31.54   O
HETATM 1314 OA23 NAP   164  26.554  62.174  14.275   21.05   O
HETATM 1315 ABCD NAP   164  28.554  64.174  16.275   11.05   O
HETATM 1316 XYZ  NAP   164  30.554  66.174  18.275    1.05   O
HETATM 1317  LMN NAP   164  32.554  68.174  20.275   41.05   O

You'll get an output of:

HETATM 1313 OA22 NAP   164  28.315  61.969  12.250  0.00 31.54   O
HETATM 1314 OA23 NAP   164  26.554  62.174  14.275  0.00 21.05   O
HETATM 1315 ABCD NAP   164  28.554  64.174  16.275  0.00 11.05   O
HETATM 1316  XYZ NAP   164  30.554  66.174  18.275  0.00  1.05   O
HETATM 1317  LMN NAP   164  32.554  68.174  20.275  0.00 41.05   O

Which is as close as I think PyMOL is going to get...

Cheers,
Warren



--
mailto:war...@delanoscientific.com
Warren L. DeLano, Ph.D.
Principal Scientist
DeLano Scientific LLC
Voice (650)-346-1154 
Fax   (650)-593-4020

> -----Original Message-----
> From: pymol-users-ad...@lists.sourceforge.net [mailto:pymol-users-
> ad...@lists.sourceforge.net] On Behalf Of michael lerner
> Sent: Friday, September 26, 2003 8:44 AM
> To: pymol-users@lists.sourceforge.net
> Subject: [PyMOL] why is PyMOL renaming my residues?
> 
> Hi,
> 
> If I load up a PDB file that looks like this:
> 
> HETATM 1313 OA22 NAP   164  28.315  61.969  12.250   31.54   O
> HETATM 1314 OA23 NAP   164  26.554  62.174  14.275   21.05   O
> HETATM 1314 ABCD NAP   164  28.554  64.174  16.275   11.05   O
> HETATM 1314 XYZ  NAP   164  30.554  66.174  18.275    1.05   O
> HETATM 1314  LMN NAP   164  32.554  68.174  20.275   41.05   O
> 
> and then save it from PyMOL, the resulting PDB file looks like this:
> 
> HETATM    1 2OA2 NAP   164  28.315  61.969  12.250  0.00 31.54   O
> HETATM    2 3OA2 NAP   164  26.554  62.174  14.275  0.00 21.05   O
> HETATM    3 DABC NAP   164  28.554  64.174  16.275  0.00 11.05   O
> HETATM    4  LMN NAP   164  32.554  68.174  20.275  0.00 41.05   O
> HETATM    5  XYZ NAP   164  30.554  66.174  18.275  0.00  1.05   O
> END
> 
> You'll note that OA22 has been renamed to 2OA2, OA23 has been renamed to
> 3OA2 and ABCD has been renamed to DABC.  It looks to me like any atom
> with a four-letter name is getting renamed.
> 
> No .. wait .. it's a little stranger than that .. if I open up the
> second file (the one with DABC) and save *it*, I get this:
> 
> HETATM    1 2OA2 NAP   164  28.315  61.969  12.250  0.00 31.54   O
> HETATM    2 3OA2 NAP   164  26.554  62.174  14.275  0.00 21.05   O
>