Re: [Rdkit-discuss] Are atom and bond indexes deterministic?

2018-10-03 Thread Stephen Roughley via Rdkit-discuss
It looks like it should be deterministic, in that it always loops through
the existing non-hydrogen atoms in their internal order, adding H's to each
in turn.

https://github.com/rdkit/rdkit/blob/ffc123a6659705adae33a6f5bf3913d65aa7b54d/Code/GraphMol/AddHs.cpp
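
A minimal sketch of how one might check this empirically, assuming a working RDKit install. The helper names `bond_table` and `with_hs` are made up for illustration. It only compares two runs inside one session, so it supports rather than proves the claim about different sessions, hardware or RDKit versions.

```python
from rdkit import Chem


def bond_table(mol):
    """Return [(bond_idx, begin_atom_idx, end_atom_idx), ...] in bond order."""
    return [(b.GetIdx(), b.GetBeginAtomIdx(), b.GetEndAtomIdx())
            for b in mol.GetBonds()]


def with_hs(smiles):
    """Parse a SMILES string and add explicit hydrogens."""
    return Chem.AddHs(Chem.MolFromSmiles(smiles))


mol = Chem.MolFromSmiles("CCO")
mol_h = with_hs("CCO")

# Heavy atoms keep their original indices; the added H atoms come after them.
n_heavy = mol.GetNumAtoms()
heavy_symbols = [a.GetSymbol() for a in mol.GetAtoms()]
h_symbols = [a.GetSymbol() for a in mol_h.GetAtoms()]
print(h_symbols)

# Repeating the same steps gives the same bond table.
print(bond_table(with_hs("CCO")) == bond_table(mol_h))
```

Storing the output of `bond_table` next to the SMILES, and comparing it again after an RDKit upgrade, would be one cheap way to detect a change in ordering.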

Steve

On Wed, 3 Oct 2018 at 21:23, Peter St. John wrote:

> Ah, well, I suppose the follow-up question is then: does 'AddHs' add
> hydrogens in a deterministic fashion?
> If I have a canonicalized SMILES and do
>
> mol = Chem.MolFromSmiles(SMILES)
> molH = Chem.AddHs(mol)
>
> and then store information about the bonds in molH, should those be
> relatively consistent if I run the same code later?
> My limited experiments seem to indicate they are, but I'm not sure if that
> persists across python sessions or different hardware.
>
> Thanks again!
> -- Peter
>
>
> On Wed, Oct 3, 2018 at 9:53 AM Dmitri Maziuk via Rdkit-discuss <
> rdkit-discuss@lists.sourceforge.net> wrote:
>
>> On Wed, 3 Oct 2018 17:26:24 +0200
>> Greg Landrum wrote:
>>
>> > Yep, good point.
>> > Though you can opt to keep the Hs if you want, that is not the default
>> > behavior.
>>
>> ;) I work for NMR people, we get very attached to our protons.
>>
>> Seriously though, I forget whether it was rdkit or openbabel, but back
>> when I was testing them I managed to read an L-alanine MOL file in and
>> get a D-alanine InChI string out of one of them.
>> --
>> Dmitri Maziuk 
>>
>>
>> ___
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Are atom and bond indexes deterministic?

2018-10-03 Thread Dimitri Maziuk via Rdkit-discuss
On 10/03/2018 03:23 PM, Peter St. John wrote:
> Ah, well, I suppose the follow-up question is then: does 'AddHs' add
> hydrogens in a deterministic fashion?

It should; what's not guaranteed is that it will be the right order.
Obviously, if (using my previous example) L- and D-alanine are the "same
molecule" for your purposes, then it doesn't matter.

If it does matter, then ALATIS (the link I sent earlier) is the best
option that I know of.

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu





Re: [Rdkit-discuss] Are atom and bond indexes deterministic?

2018-10-03 Thread Greg Landrum
Yep, good point.
Though you can opt to keep the Hs if you want, that is not the default
behavior.
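
To illustrate the difference, here is a small sketch assuming a current RDKit install. It parses the same SMILES once with the default settings and once with `removeHs` switched off through `Chem.SmilesParserParams`. The `symbols` helper is made up for the example.

```python
from rdkit import Chem

smiles = "[H]C([H])([H])O"  # methanol with the methyl hydrogens written out

# Default parsing removes the explicit hydrogens, so the atom indices no
# longer match the positions in the input string.
mol_default = Chem.MolFromSmiles(smiles)

# Opting in to keep them: every atom written in the string keeps its position.
params = Chem.SmilesParserParams()
params.removeHs = False
mol_keep = Chem.MolFromSmiles(smiles, params)


def symbols(mol):
    """Element symbols in atom-index order."""
    return [a.GetSymbol() for a in mol.GetAtoms()]


print(symbols(mol_default))
print(symbols(mol_keep))
```

With the hydrogens kept, the carbon is atom 1 rather than atom 0, which is exactly the kind of shift that breaks stored indices.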


On Wed, 3 Oct 2018 at 17:07, Dmitri Maziuk via Rdkit-discuss <
rdkit-discuss@lists.sourceforge.net> wrote:

> On Wed, 3 Oct 2018 06:21:06 +0200
> Greg Landrum wrote:
>
> > The atom ordering in the RDKit molecule created from a SMILES or Mol
> > block will always be the same and will correspond to the ordering of
> > the atoms in the input
>
> ... provided your molecule has no protons and/or you don't removeH/addH in
> the process.
>
> --
> Dmitri Maziuk 
>
>


Re: [Rdkit-discuss] Are atom and bond indexes deterministic?

2018-10-03 Thread Dmitri Maziuk via Rdkit-discuss
On Wed, 3 Oct 2018 06:21:06 +0200
Greg Landrum wrote:

> The atom ordering in the RDKit molecule created from a SMILES or Mol block
> will always be the same and will correspond to the ordering of the atoms
> in the input

... provided your molecule has no protons and/or you don't removeH/addH in the 
process.

-- 
Dmitri Maziuk 




Re: [Rdkit-discuss] Are atom and bond indexes deterministic?

2018-10-02 Thread Greg Landrum
On Tue, Oct 2, 2018 at 10:32 PM Peter St. John wrote:

> If I store a molecule as a SMILES string, along with relevant information
> about different bonds, is it safe to annotate those bond entries by bond
> index?
>

This has already been answered (yes, you can), but to just provide a bit
more detail:
The ordering of atoms/bonds in the RDKit molecule that results from reading
a particular input is deterministic.

The atom ordering in the RDKit molecule created from a SMILES or Mol block
will always be the same and will correspond to the ordering of the atoms
in the input.
The bond ordering in the RDKit molecule created from a SMILES or Mol block
will always be the same. The ordering of bonds from a Mol file corresponds
to the ordering in the input file. The ordering from SMILES is a bit more
complicated.[1]

For the sake of completeness: the ordering of atoms and bonds from any of
the other currently supported input types will also always be the same.

-greg
[1] Here's the story on the ordering of bonds read from SMILES:
- Non-"ring closure" bonds appear in the order in which they appear in the
input SMILES.
- "Ring closure" bonds appear at the end of the set of bonds. Their
ordering is non-trivial to describe, but it is deterministic.
Here's a relatively simple example demonstrating this:
In [6]: m = Chem.MolFromSmiles('C12ON1.F2')

In [7]: m.Debug()
Atoms:
0 6 C chg: 0  deg: 3 exp: 3 imp: 1 hyb: 4 arom?: 0 chi: 0
1 8 O chg: 0  deg: 2 exp: 2 imp: 0 hyb: 4 arom?: 0 chi: 0
2 7 N chg: 0  deg: 2 exp: 2 imp: 1 hyb: 4 arom?: 0 chi: 0
3 9 F chg: 0  deg: 1 exp: 1 imp: 0 hyb: 4 arom?: 0 chi: 0
Bonds:
0 0->1 order: 1 conj?: 0 aromatic?: 0
1 1->2 order: 1 conj?: 0 aromatic?: 0
2 2->0 order: 1 conj?: 0 aromatic?: 0
3 3->0 order: 1 conj?: 0 aromatic?: 0
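
The same check can be run as a plain script. This is only a sketch assuming a working RDKit install: `bond_pairs` is a made-up helper, and the begin/end direction reported for the ring-closure bonds may differ from the session above in other RDKit versions.

```python
from rdkit import Chem


def bond_pairs(smiles):
    """Return the (begin, end) atom indices of each bond, in bond-index order."""
    mol = Chem.MolFromSmiles(smiles)
    return [(b.GetBeginAtomIdx(), b.GetEndAtomIdx()) for b in mol.GetBonds()]


first = bond_pairs("C12ON1.F2")
second = bond_pairs("C12ON1.F2")

for idx, (begin, end) in enumerate(first):
    print(idx, begin, end)

# Parsing the same string twice gives the same bond ordering.
print(first == second)
```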


Re: [Rdkit-discuss] Are atom and bond indexes deterministic?

2018-10-02 Thread Dimitri Maziuk via Rdkit-discuss
On 10/02/2018 03:32 PM, Peter St. John wrote:

> I.e., if I create a new rdkit Molecule with rdkit.Chem.MolFromSmiles(xxx),
> will the bond ordering always be the same? If not, does anyone know a
> robust way of specifying a bond within a molecule as a string-based
> representation?

https://www.nature.com/articles/sdata201773


-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu





Re: [Rdkit-discuss] Are atom and bond indexes deterministic?

2018-10-02 Thread Maciek Wójcikowski
Hi Peter and Nils,

To supplement Nils's comment, I'd like to add that writing the Mol out as
SMILES does not change the order of its atoms or bonds; instead, the
canonical atom ordering is saved in the molecule property
"_smilesAtomOutputOrder". This does not cover bonds, though. The bond order
shouldn't change, but if you wish to be safe it is best to save the two
atom indices instead of the bond idx itself.
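
A minimal sketch of the atom-pair idea, assuming a working RDKit install. The `annotations` dictionary and its contents are hypothetical. The point is only that `GetBondBetweenAtoms` finds a bond from its two atom indices, so the bond index never needs to be stored.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("CC(=O)O")  # acetic acid

# Hypothetical annotation store keyed by a sorted (atom_idx, atom_idx) pair.
# The stored value here is just the bond order, as a stand-in for real data.
annotations = {}
for bond in mol.GetBonds():
    key = tuple(sorted((bond.GetBeginAtomIdx(), bond.GetEndAtomIdx())))
    annotations[key] = bond.GetBondTypeAsDouble()

# Later: re-parse the same SMILES and look each bond up by its two atoms.
mol_again = Chem.MolFromSmiles("CC(=O)O")
for (i, j), value in annotations.items():
    bond = mol_again.GetBondBetweenAtoms(i, j)
    print(i, j, value, bond.GetIdx())
```

This still relies on the atom ordering being stable, which holds as long as the stored SMILES string is re-parsed unchanged.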


Best regards,
Maciek Wójcikowski
mac...@wojcikowski.pl


On Tue, 2 Oct 2018 at 22:57, Nils Weskamp wrote:

> Hi Peter,
>
> to the best of my knowledge: for a given SMILES string, you should
> always end up with the same molecule object.
>
> On the other hand, generation of (canonical / unique) SMILES often
> reorders atoms and bonds (to ensure that the SMILES is unique for a
> given structure). A conversion Molecule -> SMILES -> Molecule could thus
> lead to a different ordering of atoms and bonds and you will have to
> canonicalize your structure before you generate your index. [Or make
> sure that you use non-canonical SMILES.]
>
> Best,
> Nils
>
> On 02.10.2018 at 22:32, Peter St. John wrote:
> > If I store a molecule as a SMILES string, along with relevant
> > information about different bonds, is it safe to annotate those bond
> > entries by bond index?
> >
> > I.e., if I create a new rdkit Molecule with
> > rdkit.Chem.MolFromSmiles(xxx), will the bond ordering always be the
> > same? If not, does anyone know a robust way of specifying a bond
> > within a molecule as a string-based representation?
> >
> > Thanks for the help!
> > -- Peter
> >
>
>


Re: [Rdkit-discuss] Are atom and bond indexes deterministic?

2018-10-02 Thread Peter St. John
Awesome, thanks for the tip!

Connor, that also is a great idea, I didn't know about atom-mapped SMILES
strings. That would definitely be a good method if the indexing algorithm
changes across rdkit versions.
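
For the record, a rough sketch of the atom-map idea, assuming a working RDKit install. Each atom is tagged with its original index plus one, since a map number of 0 means "no map number". The `new_to_original` lookup is made up for the example.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("OCC")
original_symbols = [a.GetSymbol() for a in mol.GetAtoms()]

# Tag every atom with map number = original index + 1.
for atom in mol.GetAtoms():
    atom.SetAtomMapNum(atom.GetIdx() + 1)

mapped_smiles = Chem.MolToSmiles(mol)
print(mapped_smiles)

# After re-parsing, the map numbers recover the original indices even if the
# atoms now sit at different positions.
reparsed = Chem.MolFromSmiles(mapped_smiles)
new_to_original = {a.GetIdx(): a.GetAtomMapNum() - 1
                   for a in reparsed.GetAtoms()}
print(new_to_original)
```

The labels travel inside the string itself, so they do not depend on how any particular version orders atoms when reading or writing.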

Thanks!
-- Peter

On Tue, Oct 2, 2018 at 2:56 PM Nils Weskamp wrote:

> Hi Peter,
>
> to the best of my knowledge: for a given SMILES string, you should
> always end up with the same molecule object.
>
> On the other hand, generation of (canonical / unique) SMILES often
> reorders atoms and bonds (to ensure that the SMILES is unique for a
> given structure). A conversion Molecule -> SMILES -> Molecule could thus
> lead to a different ordering of atoms and bonds and you will have to
> canonicalize your structure before you generate your index. [Or make
> sure that you use non-canonical SMILES.]
>
> Best,
> Nils
>
> On 02.10.2018 at 22:32, Peter St. John wrote:
> > If I store a molecule as a SMILES string, along with relevant
> > information about different bonds, is it safe to annotate those bond
> > entries by bond index?
> >
> > I.e., if I create a new rdkit Molecule with
> > rdkit.Chem.MolFromSmiles(xxx), will the bond ordering always be the
> > same? If not, does anyone know a robust way of specifying a bond
> > within a molecule as a string-based representation?
> >
> > Thanks for the help!
> > -- Peter
> >
>


Re: [Rdkit-discuss] Are atom and bond indexes deterministic?

2018-10-02 Thread Nils Weskamp
Hi Peter,

to the best of my knowledge: for a given SMILES string, you should
always end up with the same molecule object.

On the other hand, generation of (canonical / unique) SMILES often
reorders atoms and bonds (to ensure that the SMILES is unique for a
given structure). A conversion Molecule -> SMILES -> Molecule could thus
lead to a different ordering of atoms and bonds and you will have to
canonicalize your structure before you generate your index. [Or make
sure that you use non-canonical SMILES.]
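
A small sketch of the reordering, assuming a working RDKit install. Ethanol is written with the oxygen first, so the canonical output starts from a different atom. The `atom_symbols` helper is made up for the example.

```python
from rdkit import Chem


def atom_symbols(mol):
    """Element symbols in atom-index order."""
    return [a.GetSymbol() for a in mol.GetAtoms()]


mol = Chem.MolFromSmiles("OCC")        # ethanol, oxygen written first
canonical = Chem.MolToSmiles(mol)      # canonical SMILES output
round_trip = Chem.MolFromSmiles(canonical)

print(atom_symbols(mol))
print(canonical)
print(atom_symbols(round_trip))

# Writing non-canonical SMILES keeps the input atom order for this molecule.
same_order = Chem.MolFromSmiles(Chem.MolToSmiles(mol, canonical=False))
print(atom_symbols(same_order))
```

So an index recorded against `mol` points at a different atom in `round_trip`, unless the SMILES that is stored is the one the indices were generated from.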

Best,
Nils

On 02.10.2018 at 22:32, Peter St. John wrote:
> If I store a molecule as a SMILES string, along with relevant
> information about different bonds, is it safe to annotate those bond
> entries by bond index?
> 
> I.e., if I create a new rdkit Molecule with
> rdkit.Chem.MolFromSmiles(xxx), will the bond ordering always be the
> same? If not, does anyone know a robust way of specifying a bond
> within a molecule as a string-based representation?
> 
> Thanks for the help!
> -- Peter
> 




[Rdkit-discuss] Are atom and bond indexes deterministic?

2018-10-02 Thread Peter St. John
If I store a molecule as a SMILES string, along with relevant information
about different bonds, is it safe to annotate those bond entries by bond
index?

I.e., if I create a new rdkit Molecule with rdkit.Chem.MolFromSmiles(xxx),
will the bond ordering always be the same? If not, does anyone know a
robust way of specifying a bond within a molecule as a string-based
representation?

Thanks for the help!
-- Peter