*** For details on how to be removed from this list visit the *** *** CCP4 home page http://www.ccp4.ac.uk ***
Hi Linda:

I've been working on an RNA called the "hammerhead ribozyme" that has a
particularly Kafkaesque numbering scheme. Originally I had tried adhering
to it, but found that various programs, including pymol and refmac, simply
couldn't cope. I finally came to my senses and numbered the RNA
1, 2, 3, ..., N. Then I made a table to translate between the sequential
and canonical numbering schemes:

http://xanana.ucsc.edu/hh/rosetta_hh.html

I'm not sure that this is the best way, but at least I know that if other
people want to display or re-refine the structure, it will behave properly.

HTH,

Bill Scott

Linda Brinen wrote:

> This isn't specifically a ccp4-related question, but I'm hoping for
> feedback on a topic that most of us have had to consider. I'm motivated
> to ask the question because I'm currently trying to answer it for myself.
> I should make the disclaimer right off that I'm not looking to start a
> heated debate about PDB guidelines, but am genuinely looking for
> constructive suggestions.
>
> My situation involves a two-domain protein in a somewhat well-studied
> family of molecules. There is a long-standing history of how these are
> numbered, and examples of this can be found in the PDB. The first domain
> can typically be found with a letter-descriptor after the number
> (e.g., 1P, 2P, 3P, ...), with the numbering then resetting to 1, with no
> letter following, for the second domain. All numbering is done relative
> to the original member of this family of proteins, so if there is a gap
> based on sequence alignment to that sequence, the numbering skips.
> Similarly, if there are inserts, the numbering becomes 45a, 45b, 45c,
> etc. Again, lots of precedent for this in the PDB.
>
> BUT, now there is a push from databases for more 'simplification' and
> standardization of numbering, i.e., start from 1 and go sequentially to
> the end.
> Obviously there are arguments to be made for maintaining biologically
> relevant and historically established precedents. But there are
> arguments for the other side as well.
>
> How do you handle the numbering of your protein sequence if there are
> gaps, inserts, different biologically relevant domains? Do you use the
> accepted precedents set by other related structures that have been
> solved or do you simply start from 1 and push on through to your end
> point?
>
> Thanks in advance for any input.
>
> -Linda Brinen
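
For what it's worth, the translation-table idea from the reply above could be sketched roughly as follows. This is only an illustration, assuming the canonical labels are available as an ordered list of strings. The labels are invented (a first domain numbered 1P to 3P, a gap at residue 3 of the second domain, and two insertions after 45) and are not taken from any real structure, and `build_translation_table` is a hypothetical helper, not part of pymol or refmac.

```python
from typing import Dict, List, Tuple


def build_translation_table(
    canonical_labels: List[str],
) -> Tuple[Dict[int, str], Dict[str, int]]:
    """Map sequential numbers (1..N) to canonical labels and back.

    Hypothetical helper: canonical_labels must be listed in chain order,
    and each label must be unique.
    """
    seq_to_canon: Dict[int, str] = {}
    canon_to_seq: Dict[str, int] = {}
    for seq_number, label in enumerate(canonical_labels, start=1):
        if label in canon_to_seq:
            raise ValueError(f"duplicate canonical label: {label}")
        seq_to_canon[seq_number] = label
        canon_to_seq[label] = seq_number
    return seq_to_canon, canon_to_seq


# Invented example labels: first domain 1P-3P, then a second domain that
# restarts at 1, skips residue 3 (alignment gap), and has inserts 45A, 45B.
canonical = ["1P", "2P", "3P", "1", "2", "4", "45", "45A", "45B", "46"]
seq_to_canon, canon_to_seq = build_translation_table(canonical)

print(seq_to_canon[4])      # canonical label of sequential residue 4
print(canon_to_seq["45B"])  # sequential number of canonical residue 45B
```

Keeping both directions of the mapping means the coordinate file can stay sequentially numbered for the programs, while figures and text can still be labelled with the family's canonical numbers.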
