date:20170111

[Rdkit-discuss] UpdatePropertyCache() after RunReactants

2017-01-11 Thread Curt Fischer

Hi all,

I recently wanted to use RDKit to model the famous copper-catalyzed
cycloaddition of alkynes and azides.

I eventually got things working, kind of, but had two questions.  First, I
was surprised to find that the products of RunReactants don't have updated
property caches.  Is this something I should have expected, or is it a
bug?  If the latter, is it an easy-to-fix bug or a hard-to-fix one?

Second, how can I modify my SMARTS reaction query to avoid duplication of
each product?

Here's some example code, also available at
https://github.com/tentrillion/ipython_notebooks/blob/master/rdkit_smarts_reactions_needs_updating.ipynb

# ---BEGIN CODE-- #
# import rdkit components
from rdkit import rdBase
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import Draw

# use IPythonConsole for pretty drawings
from rdkit.Chem.Draw import IPythonConsole
# IPythonConsole.ipython_useSVG=True  # leave out for github

# for flattening
from itertools import chain

# define reactants
diyne_smiles = 'C#CCC(O)C#C'
azide_smiles = 'CCCN=[N+]=[N-]'

diyne = Chem.MolFromSmiles(diyne_smiles)
azide = Chem.MolFromSmiles(azide_smiles)

# define reaction
copper_click_smarts = (
    '[C:1]#[C:2].[N:3]=[N+:4]=[N-:5]>>[c:1]1[c:2][n-0:3][n-0:4][n-0:5]1'
)
copper_click = AllChem.ReactionFromSmarts(copper_click_smarts)

# run reaction
products_tuples = copper_click.RunReactants((diyne, azide))

# flatten product tuple of tuples into list
products = list(chain(*products_tuples))

# FAILS: mol property caches are not updated
my_error = None
try:
    Draw.MolsToGridImage(products)
except (RuntimeError, ValueError) as e:
    print('FAILED!')
    my_error = e

# this works: force updating
for product in products:
    product.UpdatePropertyCache()

Draw.MolsToGridImage(products)

my_error

products_tuples = copper_click.RunReactants((diyne, azide))
products = list(chain(*products_tuples))
# FAILS: mol property caches are not updated
Draw.MolsToGridImage(products)

# ---END CODE-- #

The stacktrace is:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
in ()
      2 products = list(chain(*products_tuples))
      3 # FAILS: mol property caches are not updated
----> 4 Draw.MolsToGridImage(products)

/Users/curt/anaconda2/lib/python2.7/site-packages/rdkit/Chem/Draw/IPythonConsole.pyc in ShowMols(mols, **kwargs)
    198   else:
    199     fn = Draw.MolsToGridImage
--> 200   res = fn(mols, **kwargs)
    201   if kwargs['useSVG']:
    202     return SVG(res)

/Users/curt/anaconda2/lib/python2.7/site-packages/rdkit/Chem/Draw/__init__.pyc in MolsToGridImage(mols, molsPerRow, subImgSize, legends, highlightAtomLists, useSVG, **kwargs)
    403   else:
    404     return _MolsToGridImage(mols, molsPerRow=molsPerRow, subImgSize=subImgSize, legends=legends,
--> 405                             highlightAtomLists=highlightAtomLists, **kwargs)
    406
    407

/Users/curt/anaconda2/lib/python2.7/site-packages/rdkit/Chem/Draw/__init__.pyc in _MolsToGridImage(mols, molsPerRow, subImgSize, legends, highlightAtomLists, **kwargs)
    344       highlights = highlightAtomLists[i]
    345     if mol is not None:
--> 346       img = _moltoimg(mol, subImgSize, highlights, legends[i], **kwargs)
    347       res.paste(img, (col * subImgSize[0], row * subImgSize[1]))
    348   return res

/Users/curt/anaconda2/lib/python2.7/site-packages/rdkit/Chem/Draw/__init__.pyc in _moltoimg(mol, sz, highlights, legend, **kwargs)
    309   from rdkit.Chem.Draw import rdMolDraw2D
    310   if not hasattr(rdMolDraw2D, 'MolDraw2DCairo'):
--> 311     img = MolToImage(mol, sz, legend=legend, highlightAtoms=highlights, **kwargs)
    312   else:
    313     nmol = rdMolDraw2D.PrepareMolForDrawing(mol, kekulize=kwargs.get('kekulize', True))

/Users/curt/anaconda2/lib/python2.7/site-packages/rdkit/Chem/Draw/__init__.pyc in MolToImage(mol, size, kekulize, wedgeBonds, fitImage, options, canvas, **kwargs)
    112     from rdkit import Chem
    113     mol = Chem.Mol(mol.ToBinary())
--> 114     Chem.Kekulize(mol)
    115
    116   if not mol.GetNumConformers():

ValueError: Sanitization error: Can't kekulize mol.  Unkekulized atoms: 3
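
For anyone reading along, here is a rough sketch of one possible way to handle both questions at once. It is not taken from a reply in this thread: it assumes that a full Chem.SanitizeMol() call on each product is acceptable in place of UpdatePropertyCache(), and it collapses identical products by canonical SMILES instead of changing the reaction SMARTS. The name unique_products is made up for the example.

```python
from itertools import chain

from rdkit import Chem
from rdkit.Chem import AllChem

diyne = Chem.MolFromSmiles('C#CCC(O)C#C')
azide = Chem.MolFromSmiles('CCCN=[N+]=[N-]')
copper_click = AllChem.ReactionFromSmarts(
    '[C:1]#[C:2].[N:3]=[N+:4]=[N-:5]>>[c:1]1[c:2][n-0:3][n-0:4][n-0:5]1'
)

# flatten the tuple of product tuples into one list
products = list(chain(*copper_click.RunReactants((diyne, azide))))

# hypothetical helper dict: canonical SMILES -> first product seen with it
unique_products = {}
for mol in products:
    try:
        # full sanitization; this also fills in the property cache
        Chem.SanitizeMol(mol)
    except ValueError:
        # skip any product that RDKit cannot sanitize
        continue
    # chemically identical products share one canonical SMILES
    unique_products.setdefault(Chem.MolToSmiles(mol), mol)

for smiles in sorted(unique_products):
    print(smiles)
```

Products that differ only in the order in which the query atoms were matched end up under the same key, while genuinely different products (for example regioisomers) stay separate.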
--
Developer Access Program for Intel Xeon Phi Processors
Access to Intel Xeon Phi processor-based developer platforms.
With one year of Intel Parallel Studio XE.
Training and support from Colfax.
Order your platform today. http://sdm.link/xeonphi
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] SD file read error

2017-01-11 Thread Steve O'Hagan


OK, error messages were hidden by IPython for me too.

I used "Knime" to look at the sdf file, and it seems that the errors are 
"real" - polymers, organometallic compounds or completely daft, two 
examples:




The structure in the first column is the input sdf.

Simple workflow was:


"RDKit From Molecule" generated exactly the same 36 "broken" molecules 
as the Python script.


There's also one bad sdf record in the file.

Cheers,
Steve.


On 11/01/2017 20:17, Curt Fischer wrote:

[...]

Re: [Rdkit-discuss] SD file read error

2017-01-11 Thread Curt Fischer

I also got this to run with no problem in a Jupyter notebook.

BUT...I did see the error messages Milinda mentioned in the terminal that
was running the jupyter notebook server.  If I do *from rdkit.Chem.Draw
import IPythonConsole *before running the code, I see all the
errors/warnings in Jupyter.

I think this version of the loop is a bit more informative (best to do with
IPythonConsole disabled):


    from rdkit import Chem
    from rdkit.Chem import Descriptors

    input_file = 'structures.sdf'
    suppl = Chem.SDMolSupplier(input_file)

    low_mass = 50
    high_mass = 1000
    ms = []

    for idx, mol in enumerate(suppl):
        if mol is None:
            # the supplier returns None when RDKit could not build the molecule
            print("No molecule: " + str(idx))
            continue
        try:
            if (mol and
                    round(Descriptors.ExactMolWt(mol), 4) >= low_mass and
                    round(Descriptors.ExactMolWt(mol), 4) <= high_mass):
                ms.append(mol)
        except:
            print("Error: " + str(idx))
            pass



It shows that all the problems are from rdkit failing to generate
molecules, i.e. the try/except isn't doing anything.  (Note it is bad
practice to have a naked *except*).

The first molecule that fails is #491, heparin sulfate.  The molecule can
be imported using *Chem.MolFromInchi()*. This gels nicely with the rdkit
error message for this molecule:

RDKit ERROR: [12:12:56] Unhandled CTAB feature: S group SRU on line: 75.
Molecule skipped.



The problem is thus the line M STY 1 1 SRU in the mol block, which you can
see if you do

    suppl.reset()
    for idx, mol in enumerate(suppl):
        if idx == 491:
            print(suppl.GetItemText(idx))


I don't know enough to pinpoint the precise reason for the error.  And
there are lots more errors to go through to get everything from HMDB into
RDKit, it seems.

Curt

On Wed, Jan 11, 2017 at 11:39 AM, Steve O'Hagan wrote:

> With same code and fresh file download, works fine for me without error.
>
> ms contains 35177 molecules. Perhaps your download was corrupt?
>
>
> On 11/01/2017 18:26, Milinda Samaraweera wrote:
>
> [...]

Re: [Rdkit-discuss] SD file read error

2017-01-11 Thread Jan Holst Jensen

On 2017-01-11 19:26, Milinda Samaraweera wrote:

Dear Experts,

I was trying to read in the attached SD file (downloaded from HMDB) 
and trying to calculate the exact mass of each entry:

[...]

By running the script, I got a barrage of errors as:

[...]

What causes these errors? Is there a way to suppress or solve the errors, 
or a way to stop printing them in the command prompt?

--
Thanks,
Milinda Samaraweera,

Hi Milinda,

The errors are caused by valence errors in the SD file.
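
To get a quick overview of which atoms are affected, the valence messages can be tallied. The following is only a sketch: it assumes the RDKit log output has already been captured into a string (here a few lines copied from the post), and the regular expression and the names log_text and tally are made up for the illustration.

```python
import re
from collections import Counter

# a few log lines copied from the original post
log_text = """\
[13:15:14] Explicit valence for atom # 9 O, 3, is greater than permitted
[13:15:14] ERROR: Could not sanitize molecule ending on line 1993855
[13:15:16] Explicit valence for atom # 46 N, 4, is greater than permitted
[13:15:16] Explicit valence for atom # 16 N, 4, is greater than permitted
[13:15:56] Explicit valence for atom # 3 C, 5, is greater than permitted
"""

# match only the plain "Explicit valence" lines, not the "ERROR:" repeats,
# so that each failing record is counted once
pattern = re.compile(
    r"^\[[\d:]+\] Explicit valence for atom # \d+ (\w+), (\d+), is greater",
    re.MULTILINE,
)

# count (element, valence) pairs
tally = Counter((el, int(val)) for el, val in pattern.findall(log_text))

for (el, val), n in sorted(tally.items()):
    print("{} with valence {}: {} record(s)".format(el, val, n))
```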

> [13:15:14] ERROR: Could not sanitize molecule ending on line 1993855
> [13:15:14] ERROR: Explicit valence for atom # 9 O, 3, is greater than 
permitted

This molecule has a single bond and a double bond to one of the oxygens 
= 3 valences used. Standard valence for uncharged oxygen is 2, so that 
won't work. Given that there is a negatively charged Cl in the molecule 
it is likely that this oxygen should have had a positive charge of one 
assigned. This would make the error go away.

The nitrogens with valence 4 should probably also have a positive charge 
applied - or bond orders adjusted.
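
As a small illustration of the charge argument, here is a sketch that uses SMILES strings as stand-ins for the SD records (the two SMILES are invented for the example and are not taken from the HMDB file). The RDLogger calls only silence the error message for the first parse; they are optional.

```python
from rdkit import Chem, RDLogger

# silence RDKit error messages while parsing the deliberately bad structure
RDLogger.DisableLog('rdApp.error')
# neutral oxygen with one double and one single bond: valence 3, rejected
neutral = Chem.MolFromSmiles('CC(C)=[O]C')
RDLogger.EnableLog('rdApp.error')

# same skeleton with a +1 charge on the oxygen: valence 3 is allowed
charged = Chem.MolFromSmiles('CC(C)=[O+]C')

print(neutral is None)
print(charged is not None)
```

The first parse fails sanitization and returns None, which is the same thing the SD supplier does for the offending records; the charged version is accepted.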

> [13:15:56] ERROR: Could not sanitize molecule ending on line 8391489
> [13:15:56] ERROR: Explicit valence for atom # 3 C, 5, is greater than 
permitted

Pentavalent carbon... hmm... could in principle be fixed with a negative 
charge on the carbon, but where is the counterion ?

I would recommend that you leave the error messages turned on - they do 
tell you that there is something wrong with the input structures. But if 
you really want to turn them off there is a discussion about it here:

Re: [Rdkit-discuss] SD file read error

2017-01-11 Thread Steve O'Hagan


With same code and fresh file download, works fine for me without error.

ms contains 35177 molecules. Perhaps your download was corrupt?

On 11/01/2017 18:26, Milinda Samaraweera wrote:

[...]


Re: [Rdkit-discuss] Operations dependent on Kekulization

2017-01-11 Thread Greg Landrum

Hi Juuso,

Reading aromatic bonds from a mol file is problematic (the mol file spec is
clear that aromatic bond orders are only for use in queries), but ignoring
that for a moment...

There's an RDKit blog post on this very topic:
http://rdkit.blogspot.com/2016/09/avoiding-unnecessary-work-and.html
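
A minimal sketch of what skipping those two steps can look like from Python, assuming partial sanitization via the sanitizeOps flags is what is wanted. A SMILES with aromatic atoms stands in for the mol file here, and the variable names are only for illustration.

```python
from rdkit import Chem

# parse without any sanitization; aromatic flags come straight from the input
mol = Chem.MolFromSmiles('c1ccncc1', sanitize=False)

# run every sanitization step except Kekulize and SetAromaticity
ops = (Chem.SanitizeFlags.SANITIZE_ALL
       ^ Chem.SanitizeFlags.SANITIZE_KEKULIZE
       ^ Chem.SanitizeFlags.SANITIZE_SETAROMATICITY)
Chem.SanitizeMol(mol, sanitizeOps=ops)

# the aromatic flags from the input are left as they were
aromatic_flags = [atom.GetIsAromatic() for atom in mol.GetAtoms()]
print(aromatic_flags)
```

Whether the result is trustworthy depends entirely on the aromatic flags in the input being correct, since nothing re-derives them.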

I hope this helps,
-greg



On Wed, Jan 11, 2017 at 3:36 PM, Juuso Lehtivarjo <
juuso.lehtiva...@gmail.com> wrote:

> Dear RDKit users,
>
> Interpreting the default sanitization order (in
> http://www.rdkit.org/docs/RDKit_Book.html), I assume that
> SetAromaticity must be preceded with Kekulization (otherwise, why to
> Kekulize at all if atoms and bonds are to be marked as aromatic
> anyway). However, I wonder if any of the other sanitization steps, or
> e.g. any stereochemistry-related operations are dependent on
> Kekulization as well? The bottom line question is, if I read in a
> molfile that has reliable aromatic bonds (e.g. it has gone through
> successful default sanitization procedure previously), can I just mark
> the corresponding atoms as aromatic and then skip both Kekulization
> and SetAromaticity steps, without having anything important missing?
>
> Thanks in advance!
> Juuso
>
> ps.
> The underlying reason for the whole mess is that some aromatic-N
> containing molecules require the "sanifix4" correction (from this post
> http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg01901.html),
> which I can't use from C++ side obviously. Wonder if there is any
> better solutions to this?
>

[Rdkit-discuss] SD file read error

2017-01-11 Thread Milinda Samaraweera

Dear Experts,

I was trying to read in the attached SD file (downloaded from HMDB) and
trying to calculate the exact mass of each entry:

 structures.sdf


from rdkit import Chem
from rdkit.Chem import Descriptors

input_file = 'structures.sdf'  # the attached SD file
suppl  = Chem.SDMolSupplier(input_file)

low_mass=50
high_mass=1000

ms = []

for mol in suppl :
    # the supplier yields None for records RDKit cannot parse
    if mol is None: continue

    try:
        if (round(Descriptors.ExactMolWt(mol),4)>=low_mass and
                round(Descriptors.ExactMolWt(mol),4)<=high_mass):
            ms.append(mol)
    except:
        pass

By running the script, I got a barrage of errors as:

[13:15:14] ERROR: Could not sanitize molecule ending on line 1993855
[13:15:14] ERROR: Explicit valence for atom # 9 O, 3, is greater than
permitted
[13:15:14] Explicit valence for atom # 9 O, 3, is greater than permitted
[13:15:14] ERROR: Could not sanitize molecule ending on line 1994014
[13:15:14] ERROR: Explicit valence for atom # 9 O, 3, is greater than
permitted
[13:15:14] Explicit valence for atom # 9 O, 3, is greater than permitted
[13:15:14] ERROR: Could not sanitize molecule ending on line 1996036
[13:15:14] ERROR: Explicit valence for atom # 9 O, 3, is greater than
permitted
[13:15:16] Explicit valence for atom # 46 N, 4, is greater than permitted
[13:15:16] ERROR: Could not sanitize molecule ending on line 2302532
[13:15:16] ERROR: Explicit valence for atom # 46 N, 4, is greater than
permitte
[13:15:16] Explicit valence for atom # 16 N, 4, is greater than permitted
[13:15:16] ERROR: Could not sanitize molecule ending on line 2302918
[13:15:16] ERROR: Explicit valence for atom # 16 N, 4, is greater than
permitte
[13:15:17] Explicit valence for atom # 11 N, 4, is greater than permitted
[13:15:17] ERROR: Could not sanitize molecule ending on line 2556541
[13:15:17] ERROR: Explicit valence for atom # 11 N, 4, is greater than
permitte
[13:15:18]  S group SUP ignored on line 2836416
[13:15:18] Explicit valence for atom # 1 Cl, 4, is greater than permitted
[13:15:18] ERROR: Could not sanitize molecule ending on line 2841449
[13:15:18] ERROR: Explicit valence for atom # 1 Cl, 4, is greater than
permitte
[13:15:19] Warning: conflicting stereochemistry at atom 10 ignored.
[13:15:19] Warning: conflicting stereochemistry at atom 10 ignored.
[13:15:19] Warning: conflicting stereochemistry at atom 17 ignored.
[13:15:19] Warning: conflicting stereochemistry at atom 17 ignored.
[13:15:19] Explicit valence for atom # 3 B, 4, is greater than permitted
[13:15:19] ERROR: Could not sanitize molecule ending on line 3107498
[13:15:19] ERROR: Explicit valence for atom # 3 B, 4, is greater than
permitted
[13:15:19] Warning: conflicting stereochemistry at atom 6 ignored.
[13:15:19] Warning: conflicting stereochemistry at atom 6 ignored.
[13:15:20]  Unhandled CTAB feature: S group SRU on line: 3205922. Molecule
skip
[13:15:20] Explicit valence for atom # 0 Mg, 4, is greater than permitted
[13:15:20] ERROR: Could not sanitize molecule ending on line 3222378
[13:15:20] ERROR: Explicit valence for atom # 0 Mg, 4, is greater than
permitte
[13:15:20] Explicit valence for atom # 2 N, 4, is greater than permitted
[13:15:20] ERROR: Could not sanitize molecule ending on line 3265386
[13:15:20] ERROR: Explicit valence for atom # 2 N, 4, is greater than
permitted
[13:15:20] Explicit valence for atom # 31 N, 4, is greater than permitted
[13:15:20] ERROR: Could not sanitize molecule ending on line 3305754
[13:15:20] ERROR: Explicit valence for atom # 31 N, 4, is greater than
permitte
[13:15:21] Explicit valence for atom # 45 N, 4, is greater than permitted
[13:15:21] ERROR: Could not sanitize molecule ending on line 3437055
[13:15:21] ERROR: Explicit valence for atom # 45 N, 4, is greater than
permitte
[13:15:56] Explicit valence for atom # 3 C, 5, is greater than permitted
[13:15:56] ERROR: Could not sanitize molecule ending on line 8391489
[13:15:56] ERROR: Explicit valence for atom # 3 C, 5, is greater than
permitted

What causes these errors? Is there a way to suppress or solve the errors, or
a way to stop printing them in the command prompt?

-- 
Thanks,
Milinda Samaraweera,

[Rdkit-discuss] Operations dependent on Kekulization

2017-01-11 Thread Juuso Lehtivarjo

Dear RDKit users,

Interpreting the default sanitization order (in
http://www.rdkit.org/docs/RDKit_Book.html), I assume that
SetAromaticity must be preceded by Kekulization (otherwise, why
Kekulize at all if atoms and bonds are to be marked as aromatic
anyway?). However, I wonder whether any of the other sanitization steps, or
e.g. any stereochemistry-related operations, depend on
Kekulization as well. The bottom-line question is: if I read in a
molfile that has reliable aromatic bonds (e.g. it has gone through a
successful default sanitization procedure previously), can I just mark
the corresponding atoms as aromatic and then skip both the Kekulization
and SetAromaticity steps without missing anything important?

Thanks in advance!
Juuso

ps.
The underlying reason for the whole mess is that some molecules containing
aromatic N require the "sanifix4" correction (from this post
http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg01901.html),
which I obviously can't use from the C++ side. I wonder whether there are any
better solutions to this.


Re: [Rdkit-discuss] RDKit conformer handling possible bug?

2017-01-11 Thread Stephen Roughley

Thanks Greg.

For completeness, for anyone trying this from the Java wrappers, the relevant 
methods are

mol.clearConformers();

and

mol.addConformer(conf, true);

Steve

Re: [Rdkit-discuss] RDKit conformer handling possible bug?

2017-01-11 Thread Greg Landrum

On Wed, Jan 11, 2017 at 10:47 AM, Stephen Roughley wrote:

> Thanks Greg.  I think I understand!
>
> Let me just check this to be certain.
>
> If the molecule came from SMILES, it has no conformers, so all appears to
> work as expected.
>

correct


> If the molecule came from MOL/SDF etc, then it has coordinates in the
> input, and thus a conformer.  When I then add a conformer, it has 2
> conformers with id=0, and so I only get the first, original one (or either
> at random?)
>

Correct, it has 2 conformers and you only get the first one, unless you
change the ID to allow you to get the second one.


> , and it looks like it only has 1 conformer? (Or it only has 1 conformer
> still at all)
>

> Any which way, the
>
> mol.RemoveAllConformers()
>
> call will solve it.
>

Yep! And you should also use assignId=True when you call AddConformer.


Re: [Rdkit-discuss] RDKit conformer handling possible bug?

2017-01-11 Thread Stephen Roughley

Thanks Greg.  I think I understand!

Let me just check this to be certain.

If the molecule came from SMILES, it has no conformers, so all appears to work 
as expected.
If the molecule came from MOL/SDF etc, then it has coordinates in the input, 
and thus a conformer.  When I then add a conformer, it has 2 conformers with 
id=0, and so I only get the first, original one (or either at random?), and it 
looks like it only has 1 conformer? (Or it only has 1 conformer still at all)

Any which way, the
mol.RemoveAllConformers()
call will solve it.

Thanks again,

Steve
From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 11 January 2017 07:24
To: Stephen Roughley; RDKit Discuss
Subject: Re: RDKit conformer handling possible bug?

Hi Steve,

[A general request: please send RDKit questions/comments to the rdkit-discuss 
mailing list so that others can see the questions and answers. I'm CC'ing the 
list on my reply because I think it could be of general interest]

For this one, what you have encountered here is a feature, but it's not 
immediately obvious.

There are two things going on here:
1) An RDKit molecule can have 0-N conformers. Chem.MolToMolBlock(), by default, 
picks the one with ID zero. If there are no conformers, then it just outputs 
zeros for the coordinates. Mol.AddConformer() does not replace existing 
conformers, but just adds to the list.
2) By default, Mol.AddConformer() does not change the ID of the conformer it 
adds. So if you add the default conformer (number zero) from one molecule to 
another, you end up with two conformers with ID 0. You can work around this by 
calling Mol.AddConformer() with assignId=True.

Here's a demo:
mol_SDF_H_2=Chem.AddHs(Chem.MolFromMolBlock(molString))
print(mol_SDF_H_2.GetNumConformers())
# output is 1
mol_SDF_H_2.AddConformer(mol_SDF_H.GetConformer(0),assignId=True)
print(mol_SDF_H_2.GetNumConformers())
# output is 2
print([x.GetId() for x in mol_SDF_H_2.GetConformers()])
# output is [0,1]
print(Chem.MolToMolBlock(mol_SDF_H_2,confId=1))
# output is what is expected.

If you want to replace the existing conformer, you can do:
mol.RemoveAllConformers() before you call mol.AddConformer().

Hope this helps,
-greg
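
[Editorial sketch] The two points in the reply above can be illustrated with a
toy model in plain Python. This is not RDKit code: ToyMol and ToyConformer are
hypothetical stand-ins, and the rule used for assign_id (largest existing ID
plus one) is an assumption of the sketch, chosen only because it reproduces the
[0,1] result shown in the demo. It shows why a default add leaves two conformers
with ID 0, and how either assigning an ID or clearing the list first avoids that.

```python
# Toy model only: these classes are hypothetical stand-ins, not RDKit objects.
class ToyConformer:
    def __init__(self, conf_id, label):
        self.conf_id = conf_id
        self.label = label


class ToyMol:
    def __init__(self):
        self.conformers = []

    def add_conformer(self, conf, assign_id=False):
        # Never replaces anything; a copy is appended to the list.
        if assign_id:
            # Assumption of this sketch: new ID = largest existing ID + 1.
            new_id = max([c.conf_id for c in self.conformers], default=-1) + 1
        else:
            new_id = conf.conf_id  # incoming ID is kept, even if already in use
        self.conformers.append(ToyConformer(new_id, conf.label))
        return new_id

    def get_conformer(self, conf_id=0):
        # The first conformer with a matching ID is returned.
        for conf in self.conformers:
            if conf.conf_id == conf_id:
                return conf
        raise ValueError("no conformer with ID %d" % conf_id)

    def remove_all_conformers(self):
        self.conformers = []


def from_mol_block():
    # A molecule read from a mol block already carries one conformer (ID 0).
    mol = ToyMol()
    mol.add_conformer(ToyConformer(0, "input coordinates"))
    return mol


embedded = ToyConformer(0, "embedded coordinates")

# Default add: two conformers share ID 0 and the original one is found first.
clash = from_mol_block()
clash.add_conformer(embedded)
clash_ids = [c.conf_id for c in clash.conformers]
clash_label = clash.get_conformer(0).label

# assign_id=True: the added conformer gets its own ID.
fixed = from_mol_block()
new_id = fixed.add_conformer(embedded, assign_id=True)
fixed_label = fixed.get_conformer(new_id).label

# Clearing first: the added conformer becomes the only one, with ID 0.
replaced = from_mol_block()
replaced.remove_all_conformers()
replaced.add_conformer(embedded)
replaced_label = replaced.get_conformer(0).label

print(clash_ids, clash_label)
print(new_id, fixed_label)
print(replaced_label)
```

In the toy model, as in the thread, a molecule built from SMILES would start
with an empty conformer list, which is why the ID clash never shows up there.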




On Tue, Jan 10, 2017 at 2:58 PM, Stephen Roughley wrote:
from rdkit import Chem
from rdkit.Chem import AllChem

#From SMILES - this works!
mol_SMI = Chem.MolFromSmiles("C")
mol_SMI_H=Chem.AddHs(mol_SMI)
mol_SMI_H_2 = Chem.AddHs(Chem.MolFromSmiles("C"))

ids=AllChem.EmbedMultipleConfs(mol_SMI_H, numConfs=10)
mol_SMI_H_2.AddConformer(mol_SMI_H.GetConformer(0))
print(Chem.MolToMolBlock(mol_SMI_H_2))

## As expected, the conformer is transferred to the vanilla copy
##
##     RDKit          3D
##
##  5  4  0  0  0  0  0  0  0  0999 V2000
##   -0.0221    0.0032    0.0165 C   0  0  0  0  0  0  0  0  0  0  0  0
##   -0.6665    0.8884   -0.1014 H   0  0  0  0  0  0  0  0  0  0  0  0
##   -0.3778   -0.8578   -0.5883 H   0  0  0  0  0  0  0  0  0  0  0  0
##    0.0964   -0.3151    1.0638 H   0  0  0  0  0  0  0  0  0  0  0  0
##    0.9699    0.2812   -0.3906 H   0  0  0  0  0  0  0  0  0  0  0  0
##  1  2  1  0
##  1  3  1  0
##  1  4  1  0
##  1  5  1  0
##M  END

molString="""
  Mrv16b2101101711412D

  1  0  0  0  0  0            999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
M  END
$$
"""

#Now try from MOL Block
mol_SDF=Chem.MolFromMolBlock(molString)
print(Chem.MolToMolBlock(mol_SDF))

## As expected we have a molecule with 1 atom and no conformers
##
##     RDKit          2D
##
##  1  0  0  0  0  0  0  0  0  0999 V2000
##    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
##M  END
##

mol_SDF_H=Chem.AddHs(mol_SDF)
Chem.MolFromMolBlock(molString)
mol_SDF_H_2=Chem.AddHs(Chem.MolFromMolBlock(molString))

ids=AllChem.EmbedMultipleConfs(mol_SDF_H, numConfs=10)
print(Chem.MolToMolBlock(mol_SDF_H,confId=0))

## As expected, we have a molecule with Hs added, and conformers with coords
##
##     RDKit          3D
##
##  5  4  0  0  0  0  0  0  0  0999 V2000
##    0.0260   -0.0121   -0.0153 C   0  0  0  0  0  0  0  0  0  0  0  0
##    0.9804   -0.4982   -0.2778 H   0  0  0  0  0  0  0  0  0  0  0  0
##   -0.0317    1.0327   -0.3649 H   0  0  0  0  0  0  0  0  0  0  0  0
##   -0.1070    0.0168    1.0896 H   0  0  0  0  0  0  0  0  0  0  0  0
##   -0.8677   -0.5393   -0.4316 H   0  0  0  0  0  0  0  0  0  0  0  0
##  1  2  1  0
##  1  3  1  0
##  1  4  1  0
##  1  5  1  0
##M  END

mol_SDF_H_2.AddConformer(mol_SDF_H.GetConformer(0))
print(Chem.MolToMolBlock(mol_SDF_H_2))
## Whoa! This time, we have added a conformation to a vanilla copy
## exactly as we did for SMILES, but no conformer coords are exported:
##
##     RDKit          2D
##
##  5  4  0  0  0  0  0  0  0  0999 V2000
##    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
##    0.0000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
##    0.0000    0.0000    0.0000 H
