Re: [Rdkit-discuss] Rdkit atom indexing vs indexing in written pdb file

2017-02-01 Thread Susan Leung
Thank you very much Andrew!

Indeed, I did not spot the pattern - how silly of me!


--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Rdkit atom indexing vs indexing in written pdb file

2017-02-01 Thread Andrew Dalke
Dear Susan,

  If I understand what's going on correctly, you have run across the difference 
between 0-based and 1-based indexing. See 
https://en.wikipedia.org/wiki/Zero-based_numbering .

RDKit, like most programming libraries and languages, indexes based on an offset 
from the beginning, so 0 means the beginning, 1 means one after the beginning, 
etc.

This is somewhat like how some buildings use "1" as the first floor above the 
ground, while others regard "1" as the ground floor, which is confusing if you 
are not used to it. (My apartment number says it's on the second floor, while 
the elevator button says I live on floor 3.)

On Feb 1, 2017, at 5:15 PM, Susan Leung wrote:
> I am producing rdkit conformers and writing them to pdb files but am finding 
> the atom indexing in rdkit is different from the written pdb.
  ...
> Here is my code and output (the C=O looks like it's atoms 3,4 in rdkit but 
> 4,5 in the pdb file):
  ...
> In [3]: mol = Chem.MolFromSmiles("CC1=C(C(=O)C)C=CC=C1")
  ...
> In [4]: mol.GetSubstructMatch(Chem.MolFromSmiles('C(=O)'))
> Out[4]: (3, 4)
  ...
>   record_name  atom_number blank_1 atom_name alt_loc residue_name blank_2  \
> 0      HETATM            1                C1                  UNL
> 1      HETATM            2                C2                  UNL
> 2      HETATM            3                C3                  UNL
> 3      HETATM            4                C4                  UNL
> 4      HETATM            5                O1                  UNL
> 5      HETATM            6                C5                  UNL
> 6      HETATM            7                C6                  UNL
> 7      HETATM            8                C7                  UNL
> 8      HETATM            9                C8                  UNL
> 9      HETATM           10                C9                  UNL


If I understand you correctly, then "(3, 4)" as RDKit atom indices corresponds 
to (3+1, 4+1) = (4, 5) as PDB atom numbers. That is, the RDKit indices match 
the left-most column of your table (the 0-based row index), rather than the 
atom_number column.
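
As a minimal sketch of that conversion: the helpers below are hypothetical (they 
are not part of RDKit), and they assume the PDB file lists the atoms in the same 
order as the RDKit molecule and numbers them consecutively starting from 1.

```python
# Hypothetical helpers, not part of RDKit.
# Assumption: PDB atom numbers run 1, 2, 3, ... in the same order as the
# RDKit atoms, so the two numbering schemes differ by exactly one.

def rdkit_to_pdb(indices):
    """Convert 0-based RDKit atom indices to 1-based PDB atom numbers."""
    return tuple(i + 1 for i in indices)


def pdb_to_rdkit(atom_numbers):
    """Convert 1-based PDB atom numbers to 0-based RDKit atom indices."""
    return tuple(n - 1 for n in atom_numbers)


match = (3, 4)  # the substructure match reported in the thread
print(rdkit_to_pdb(match))    # (4, 5)
print(pdb_to_rdkit((4, 5)))   # (3, 4)
```

If the PDB file has been renumbered or reordered by another tool, a plain +1 
offset would no longer be safe and the mapping would have to be read from the 
file itself.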

Cheers,

Andrew
da...@dalkescientific.com


