Re: [Rdkit-discuss] trying to figure out what an rdkit warning means

2020-06-12 Thread Bennion, Brian via Rdkit-discuss
When I looked for InChI options in the RDKit docs, this is what I found:

rdkit.Chem.inchi.MolToInchi(mol, options='', logLevel=None, treatWarningAsError=False)

Returns the standard InChI string for a molecule

Keyword arguments:
  logLevel – the log level used for logging logs and messages from InChI API.
    Set to None to disable the logging completely.
  treatWarningAsError – set to True to raise an exception in case of a
    molecule that generates warning in calling InChI API. The resultant InChI
    string and AuxInfo string as well as the error message are encoded in the
    exception.

Returns: the standard InChI string returned by InChI API for the input molecule
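
One way to use treatWarningAsError is to catch the exception and keep both the InChI string and the message. The following is only a minimal sketch, assuming an RDKit build with InChI support. The helper name inchi_and_message is hypothetical, and since the docs quoted above do not spell out the layout of the exception arguments, the code assumes only that the InChI string comes first and the message last.

```python
def inchi_and_message(mol):
    """Return (inchi, message); message is None if the InChI API raised nothing."""
    # Imported lazily so this helper can live in a module that is also
    # loaded on machines where RDKit is not installed.
    from rdkit.Chem import inchi
    try:
        return inchi.MolToInchi(mol, treatWarningAsError=True), None
    except inchi.InchiReadWriteError as exc:
        # Assumption: the InChI string is the first argument and the
        # warning/error text is the last one.
        return exc.args[0], exc.args[-1]
```

Called on a molecule built with sanitize=False, like the one in this thread, this would be expected to hand back the InChI together with the "Omitted undefined stereo" text instead of only logging it.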

As for viewing the SMILES strings as 2D structures, I have been using a web 
service, openmolecule.org. That engine might be interpreting the SMILES 
string and doing something similar to what the sanitize function in RDKit 
does, if it's not simply using RDKit itself.

Brian


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] trying to figure out what an rdkit warning means

2020-06-12 Thread Greg Landrum
On Thu, Jun 11, 2020 at 4:04 PM Bennion, Brian wrote:

> Thank you for the explanation, Greg. When the SMILES strings are viewed I
> see the E designation for the two trans double bonds. What other double
> bond is missing?
>
>
How do you view the SMILES strings? The way you are currently constructing
the molecule (without sanitization) means that the RDKit doesn't see the
stereochemistry information that's present in them.

> Also, is it possible within RDKit to activate the fixedH layer in the InChI
> creation?

Sure, all of the InChI options can be provided just like you would on the
command line to the InChI executable:

In [54]: m1 = Chem.MolFromSmiles('CC1=CNC=N1')
In [55]: m2 = Chem.MolFromSmiles('CC1=CN=CN1')
In [58]: Chem.MolToInchi(m1,options='/FixedH')
Out[58]: 'InChI=1/C4H6N2/c1-4-2-5-3-6-4/h2-3H,1H3,(H,5,6)/f/h5H'
In [59]: Chem.MolToInchi(m2,options='/FixedH')
Out[59]: 'InChI=1/C4H6N2/c1-4-2-5-3-6-4/h2-3H,1H3,(H,5,6)/f/h6H'
In [60]: Chem.MolToInchi(m1)==Chem.MolToInchi(m2)
Out[60]: True
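
To see exactly where the two FixedH strings in the session above part ways, they can be split into layers and compared position by position. This is only a sketch using the Python standard library: the helper names inchi_layers and differing_layers are made up for illustration, and a plain split on '/' is assumed to be sufficient for strings like these.

```python
from itertools import zip_longest

def inchi_layers(inchi):
    """Return the layers that follow the 'InChI=1' or 'InChI=1S' prefix."""
    _prefix, _, rest = inchi.partition('/')
    return rest.split('/') if rest else []

def differing_layers(inchi_a, inchi_b):
    """List (position, layer_a, layer_b) wherever the two strings differ."""
    pairs = zip_longest(inchi_layers(inchi_a), inchi_layers(inchi_b))
    return [(i, a, b) for i, (a, b) in enumerate(pairs) if a != b]

# The two FixedH InChI strings from the session above
fixed_m1 = 'InChI=1/C4H6N2/c1-4-2-5-3-6-4/h2-3H,1H3,(H,5,6)/f/h5H'
fixed_m2 = 'InChI=1/C4H6N2/c1-4-2-5-3-6-4/h2-3H,1H3,(H,5,6)/f/h6H'
print(differing_layers(fixed_m1, fixed_m2))  # [(4, 'h5H', 'h6H')]
```

Only the hydrogen sublayer after /f differs. The standard InChI strings carry no /f layer, which fits with the last comparison in the session returning True.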


Best,
-greg


Re: [Rdkit-discuss] trying to figure out what an rdkit warning means

2020-06-11 Thread Bennion, Brian via Rdkit-discuss
Thank you for the explanation, Greg. When the SMILES strings are viewed I see 
the E designation for the two trans double bonds. What other double bond is 
missing?

Also, is it possible within RDKit to activate the fixedH layer in the InChI 
creation?

Brian





Re: [Rdkit-discuss] trying to figure out what an rdkit warning means

2020-06-11 Thread Greg Landrum
Hi Brian,

The warning is actually because you have double bonds with unspecified
stereochemistry.
You are skipping sanitization of the molecules. When you do this no
stereochemistry perception is done, so the InChI code is called without any
stereochemistry information and you get the warning.
If you construct the molecule "normally" (i.e. with sanitization) you get
the correct InChI and no warning:

In [57]: m = Chem.MolFromSmiles(r'O=C(/C=C/c1ccccc1)c1ccc(OC/C=C(/CC/C=C(\C)/C)\C)cc1')
In [58]: Chem.MolToInchi(m)
Out[58]: 'InChI=1S/C25H28O2/c1-20(2)8-7-9-21(3)18-19-27-24-15-13-23(14-16-24)25(26)17-12-22-10-5-4-6-11-22/h4-6,8,10-18H,7,9,19H2,1-3H3/b17-12+,21-18+'


If you really want to call the InChI code without sanitizing the molecules
and want the stereochemistry to be correct, you have to do a bit more work:

In [63]: m = Chem.MolFromSmiles(r'O=C(/C=C/c1ccccc1)c1ccc(OC/C=C(/CC/C=C(\C)/C)\C)cc1',sanitize=False)
In [64]: m.UpdatePropertyCache(strict=False)
In [65]: Chem.AssignStereochemistry(m)
In [66]: Chem.MolToInchi(m)
Out[66]: 'InChI=1S/C25H28O2/c1-20(2)8-7-9-21(3)18-19-27-24-15-13-23(14-16-24)25(26)17-12-22-10-5-4-6-11-22/h4-6,8,10-18H,7,9,19H2,1-3H3/b17-12+,21-18+'


Best,
-greg




[Rdkit-discuss] trying to figure out what an rdkit warning means

2020-06-10 Thread Bennion, Brian via Rdkit-discuss
Hello,
Below I show a SMILES string from MOE, the SMILES string calculated by RDKit, 
and the InChI string calculated by RDKit (2020_1).

The warning on conversion to an InChI string is confusing me: after entering 
both SMILES strings into a viewer, I don't see any undefined stereocenter.

O=C(/C=C/c1ccccc1)c1ccc(OC/C=C(/CC/C=C(\C)/C)\C)cc1
CC(C)=CCC/C(C)=C/COc1ccc(C(=O)/C=C/c2ccccc2)cc1
[18:10:42] WARNING: Omitted undefined stereo
InChI=1S/C25H28O2/c1-20(2)8-7-9-21(3)18-19-27-24-15-13-23(14-16-24)25(26)17-12-22-10-5-4-6-11-22/h4-6,8,10-18H,7,9,19H2,1-3H3


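from rdkit import Chem

# line, output, rkditsmiout and count are set up in code not shown here;
# the statement that reads the next line is also not part of this excerpt.
while len(line) != 0:
    # Input lines look like "SMILES","name"
    fields = line.replace('","', ' ').split()
    mol1 = fields[0].replace('"', '')
    mol_name = fields[1]

    try:
        mol = Chem.MolFromSmiles(mol1, sanitize=False)  # , removeHs=False)
    except Exception:
        mol = None
    if mol is None:
        print("mol1 failed:", mol1)
        # write() takes a single string argument
        output.write("mol1 failed: " + mol1 + "\n")
    else:
        rkditsmiout.write('"' + Chem.MolToSmiles(mol, isomericSmiles=True) + '"\n')
        print(Chem.MolToSmiles(mol, isomericSmiles=True))
        rkditsmiout.write('"' + Chem.inchi.MolToInchi(mol) + '"\n')
        print(Chem.inchi.MolToInchi(mol))
    count += 1
    print(count)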
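
The replace/split step in the loop above could alternatively be handled by Python's csv module, which would also cope with names that contain spaces. A minimal sketch, assuming each input line really has the form "SMILES","name" (inferred from the string handling in the posted code, not confirmed in the thread); read_smiles_records and the sample names mol_a and mol_b are hypothetical.

```python
import csv
import io

def read_smiles_records(handle):
    """Yield (smiles, name) pairs from lines such as "SMILES","name"."""
    for row in csv.reader(handle):
        if len(row) < 2:
            continue  # skip blank or incomplete lines
        yield row[0], row[1]

# Stand-in for an open input file; the names are made up for this sketch.
sample = io.StringIO('"CC1=CNC=N1","mol_a"\n\n"CC1=CN=CN1","mol_b"\n')
records = list(read_smiles_records(sample))
print(records)  # [('CC1=CNC=N1', 'mol_a'), ('CC1=CN=CN1', 'mol_b')]
```

Iterating over the reader also removes the need for the while loop and its manual end-of-file check.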
