Re: [ccp4bb] Non-sequential residue numbering?

Frances C. Bernstein Fri, 19 Sep 2008 11:40:52 -0700

I was at the PDB from 1974 - 1998 and closely involved with
processing entries 15 to ~9000.  We also designed the "PDB
format".  My replies were based on what was done for those 24
years and I cannot address what is currently being done at the PDB.


I do not know if the current PDB staff follows this bulletin
board and I can only suggest that you take this matter up
with the current PDB management, the community, and the PDB
advisory board.

                             Frances

=====================================================
****                Bernstein + Sons
*   *       Information Systems Consultants
****    5 Brewster Lane, Bellport, NY 11713-2803
*   * ***
**** *            Frances C. Bernstein
  *   ***      [EMAIL PROTECTED]
 ***     *
  *   *** 1-631-286-1339    FAX: 1-631-286-1999
=====================================================

On Fri, 19 Sep 2008, Linda Brinen wrote:

I'm actually pleased to read your response and interpretation of what is allowable and why, Frances. However, it's in pretty stark contrast to what I was told about 18 months ago when I struggled (and eventually lost) to preserve a numbering scheme that had a long-standing historical and literature precedent when submitting a new structure to the PDB.
This was a two-domain protein; the first domain, according to historical numbering, had a number plus a letter code to indicate the domain; the second domain, which started again with the number 1, had no letter code. We were told that that was not allowed. We wanted to preserve insertions and deletions as well, but were also strongly discouraged, if not flat out told we could not. While it's not usually prudent to quote offline e-mail exchanges, I'm going to snip pertinent pieces of the discussion (I'm leaving the original spelling errors and text bolding in place) with no indication of the annotator who wrote these guidelines to our group. Here's part of one of the many 'exchanges' that was had:
"I understand your point and that certain close research communities have certain habits and traditions but the PDB serves to the whole community of structural biology, bioinformatics, to many educators, students... In all these cases, the simplest possible numbering of sequences, ideally numbering identical to the numbering used by the UNP sequence database, is far the most useful because easiest to understand. I do not say this because it is in our manuals and help pages but because I have eight years of experience with annotation of all kinds of structures. I would therefore very much like to ask you to reconsider the way how you number your protein, your numbering schema is *interpretation* more than a mere labeling schema. Needles to say, no sequence numbering can satisfy this ambition...from my point of view, especially the jump from 96P back to 1 will cause a lot of confusion and misunderstanding....look at the problem from a standpoint of a general naturalist instead of an narrow protease community"
This left us with a mandated 'start from 1 and number sequentially' format that did exactly the opposite of what you, Frances, correctly mention as important in any numbering scheme: preserve relationships with other proteins. We've had to resort to providing 'translation tables' that identify what people were expecting to see as numbers for active site residues which now have new and nonsensical numbering. Is it the end of the world? Of course not. But neither is it necessarily the best scientific or logical presentation.
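
For illustration only, a 'translation table' of the kind mentioned above could be generated along these lines. This is a minimal Python sketch, not the table that was actually supplied: the function name, the letter "P", and the second-domain length of 215 are assumptions (only the "96P back to 1" jump comes from the quoted exchange), and insertions and deletions are not modelled.

```python
def translation_table(first_domain_len, second_domain_len, letter="P"):
    """Map deposited sequential numbers to historical labels.

    Hypothetical helper: the first domain keeps number + letter
    (1P, 2P, ...); the second domain restarts at 1 with no letter.
    """
    table = {}
    for i in range(1, first_domain_len + 1):
        table[i] = f"{i}{letter}"
    for j in range(1, second_domain_len + 1):
        table[first_domain_len + j] = str(j)
    return table


# 96 is taken from the "96P back to 1" remark; 215 is an invented length.
table = translation_table(96, 215)
for deposited in (1, 96, 97, 121):
    print(deposited, "->", table[deposited])
```

A reader looking for a historically numbered active-site residue would look up its label in the right-hand column to find the deposited number.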
At the risk of inciting a rather....animated...dialogue on this topic, what has your experience been with this kind of thing (i.e., were we just unlucky??) and do current practices make sense and serve the community??
-Linda


Frances C. Bernstein wrote:
All entries list atoms starting at the N-terminus (or 5') so
connectivity goes in the order of the atoms in the file -
obviously with the possibility of unconnected portions
where the density is inadequate.
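
To make the file-order point concrete, here is a small Python sketch. It assumes a hypothetical fusion numbered 754-763 followed by 1-12 and compares the neighbours implied by file order with those implied by sorting on residue number; the function and variable names are illustrative only.

```python
def bonded_pairs(residue_numbers):
    """Pair each residue with the next one in the given order."""
    return list(zip(residue_numbers, residue_numbers[1:]))


# Hypothetical chain: residues 754-763 of one protein, then 1-12 of another,
# listed N-terminus first as they would appear in the file.
file_order = list(range(754, 764)) + list(range(1, 13))

by_file_order = bonded_pairs(file_order)
by_number = bonded_pairs(sorted(file_order))

print((763, 1) in by_file_order)  # True: the fusion junction is kept
print((763, 1) in by_number)      # False: sorting loses the junction
print((12, 754) in by_number)     # True: sorting invents a wrong neighbour
```

Real chain breaks from missing density would still need a geometric check; numbering alone cannot reveal them.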

The entire philosophy of allowing numbering other than 1 - N
had to do with preserving relationships with other proteins.
The most common use relates to having an initial sequence 1 - N
and then a similar sequence from another species with insertions
and/or gaps.  People wanted to be able to talk about the active
site (which was preserved) using the same residue numbers.
Negative numbers came up with additions at the N-terminus.
Offhand, I don't recall why descending numbers were used but
I believe that there is at least one such entry.

                       Frances

On Fri, 19 Sep 2008, Ian Tickle wrote:
But what connectivity would be implied by descending numbers: the order
in the file or the order of the numbering?  I assume the former,
otherwise what would be the point of having descending numbering?  And I
wonder how many programs would baulk at it (or even at ascending
negative numbers?).
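
One plausible way a program could baulk at negative numbers is by splitting ATOM records on whitespace instead of reading fixed columns. The Python sketch below assumes the usual PDB column layout (chain ID in column 22, residue number in columns 23-26, insertion code in column 27); atom_line and parse_residue_field are hypothetical helpers, and nothing here is a claim about any particular program.

```python
def atom_line(chain, res_seq, icode=" "):
    """Build a minimal fixed-column ATOM record (coordinates are dummies)."""
    return (f"ATOM  {1:5d}  CA  ALA {chain}{res_seq:4d}{icode}   "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}")


def parse_residue_field(line):
    """Read chain, residue number and insertion code by column position."""
    return line[21], int(line[22:26]), line[26].strip()


line = atom_line("A", -100)

# Column-based parsing copes with the sign.
print(parse_residue_field(line))   # ('A', -100, '')

# Whitespace splitting fuses the chain ID and the number into one token.
print("A-100" in line.split())     # True
```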

-- Ian
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Frances C. Bernstein
Sent: 19 September 2008 16:44
To: Todd Geders
Cc: [EMAIL PROTECTED]
Subject: Re: [ccp4bb] Non-sequential residue numbering?

As long as each residue within a chain has a unique identifier
(residue number plus insertion code), there is no restriction
on numbering.  The numbers can be in ascending or descending
order, non-sequential, and even negative.
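
The uniqueness condition can be checked mechanically. The following Python sketch assumes the usual fixed-column layout of ATOM/HETATM records (chain ID in column 22, residue number in columns 23-26, insertion code in column 27) and that all atoms of a residue are listed contiguously; the helper names are hypothetical, and this is not the PDB's own validation code.

```python
from collections import Counter


def residue_ids(pdb_lines):
    """Yield (chain, number, insertion code) once per residue, in file order."""
    previous = None
    for line in pdb_lines:
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        key = (line[21], int(line[22:26]), line[26])
        if key != previous:
            previous = key
            yield key


def duplicated_ids(pdb_lines):
    """Return identifiers used by more than one residue."""
    counts = Counter(residue_ids(pdb_lines))
    return [key for key, n in counts.items() if n > 1]


def atom_line(chain, res_seq, icode=" "):
    """Build a minimal fixed-column ATOM record (coordinates are dummies)."""
    return (f"ATOM  {1:5d}  CA  ALA {chain}{res_seq:4d}{icode}   "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}")


# Descending, negative and insertion-coded numbers, all unique.
ok = [atom_line("A", n, c)
      for n, c in [(763, " "), (1, " "), (-2, " "), (10, " "), (10, "A")]]
# Reusing residue 1 later in the same chain breaks uniqueness.
bad = ok + [atom_line("A", 1)]

print(duplicated_ids(ok))   # []
print(duplicated_ids(bad))  # [('A', 1, ' ')]
```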

                        Frances


On Fri, 19 Sep 2008, Todd Geders wrote:
Hello all,

I have a structure from a non-natural fusion of the truncated C-terminus of
one protein with the truncated N-terminus of another.  For the deposition, we
want to keep the numbering as found in the separate proteins.  It looks
something like this:

            1         12
            |          |
....HWVCKDIALLMCFFLEEMSEEP....
  |        |
754      763

At no point is there an overlap in numbering (i.e. the N-terminal residue
number is higher than the C-terminal residue number).

Is this numbering scheme supported by the PDB standard?  Thus far, all of the
software seems to handle it (refmac, Coot, PyMOL, pdb_extract, PDB precheck &
validation, etc).

Can anyone see a reason to not deposit with this non-sequential residue
numbering?

~Todd
--
Linda S. Brinen
Adjunct Assistant Professor
Dept of Cellular & Molecular Pharmacology and
The Sandler Center for Basic Research in Parasitic Diseases
Phone: 415-514-3426 FAX: 415-502-8193
E-mail: [EMAIL PROTECTED]
QB3/Byers Hall 508C
1700 4th Street
University of California
San Francisco, CA 94158-2550
USPS:
UCSF MC 2550
Byers Hall Room 508
1700 4th Street
San Francisco, CA 94158
