Re: [Rdkit-discuss] how to set the stereochemistry of a molecule

2019-10-29 Thread Hongbin Yang
Hello Alfredo,


The tag "CHI_TETRAHEDRAL_CW" means clockwise and "CHI_TETRAHEDRAL_CCW" means 
counter-clockwise, i.e. anti-clockwise (I have no idea why it is coded CCW 
rather than ACW, but whatever ;) ). These chiral tags are related to, but do 
not correspond directly to, the absolute chirality (R/S) defined in organic 
chemistry.


Look at the SMILES you provided and you will find that both of the chiral 
carbons are written as [C@@H]. 
According to the Daylight SMILES theory manual 
(https://www.daylight.com/dayhtml/doc/theory/theory.smiles.html, section 3.3.3): 
> The symbol "@" indicates that the following neighbors are listed 
> anticlockwise (it is a "visual mnemonic" in that the symbol looks like an 
> anticlockwise spiral around a central circle). "@@" indicates that the 
> neighbors are listed clockwise (you guessed it, anti-anti-clockwise).
So the "CHI_TETRAHEDRAL_CW" tags correspond to the way your SMILES is written, 
not to the absolute chirality.


I suggest you read section 3.3.3 to understand the meaning of a "chiral tag"; 
then you will understand why the tags are not the absolute chirality.


Absolute chirality depends on the chiral tag as well as on the CIP ranks of the 
neighbor atoms. That is why the chirality of the two carbons is different 
while their chiral tags are the same.
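
As a minimal sketch of how the tag on one atom could be changed (not necessarily the only way): the atom is looked up by its index with GetAtomWithIdx() and the tag is set with SetChiralTag(). The index 19 comes from the original question; the variable names are only for illustration.

```python
from rdkit import Chem

smiles = ('C[C@@H](C(=O)COc1c(F)c(F)cc(F)c1F)n1cc('
          '[C@@H](NCc2ccc3c(c2)OCO3)c2ncc[nH]2)nn1')
mol = Chem.MolFromSmiles(smiles)

# CIP labels before the change, as {atom index: 'R' or 'S'}.
before = dict(Chem.FindMolChiralCenters(mol))

# The atom index goes into GetAtomWithIdx(); the tag is set on that atom.
atom = mol.GetAtomWithIdx(19)
atom.SetChiralTag(Chem.ChiralType.CHI_TETRAHEDRAL_CCW)

# Recompute the CIP labels so that they reflect the new tag.
Chem.AssignStereochemistry(mol, cleanIt=True, force=True)
after = dict(Chem.FindMolChiralCenters(mol))

print(before)
print(after)
print(Chem.MolToSmiles(mol))
```

If the goal is simply to flip a center whatever its current tag is, atom.InvertChirality() should do the same job without naming the tag explicitly.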


I am not sure if Lukas had the same question?




Best regards,


Hongbin Yang 杨弘宾, Ph.D.
Research: Toxicophore and Chemoinformatics

On 10/29/2019 21:14,Alfredo Quevedo wrote:
Good morning,

I am trying to manually set/change the stereochemistry of a chiral 
center of a molecule, and I can't understand how to set the 
CHI_TETRAHEDRAL_CW/CHI_TETRAHEDRAL_CCW tag on a certain atom. So far my 
script does the following:

-
from rdkit import Chem
from rdkit.Chem import AllChem

molecule='C[C@@H](C(=O)COc1c(F)c(F)cc(F)c1F)n1cc([C@@H](NCc2ccc3c(c2)OCO3)c2ncc[nH]2)nn1'

molecule_smiles=Chem.MolFromSmiles(molecule)
chiralty=Chem.FindMolChiralCenters(molecule_smiles)
centers=len(chiralty)

for i in range(centers):
 current_center=chiralty[i][1]
 print(current_center)
 center_to_change=chiralty[i][0]
 print(center_to_change)


for a in molecule_smiles.GetAtoms():
 print(a.GetChiralTag())



Executing the above-mentioned script, I can see that the molecule has 2 
chiral centers: the first one at index 1, which is S, and the second 
one at index 19, which is R. Both chiral tags are set to 
CHI_TETRAHEDRAL_CW. Now I want to manually change the configuration of 
index 19 to S, so I want to change its tag to CHI_TETRAHEDRAL_CCW.

Which would be the command to set this tag and how is the index indicated?

thanks in advance for the help,

kind regards

Alfredo


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] 2019.09.1 RDKit Release

2019-10-25 Thread Hongbin Yang
Hi Greg,


Great to hear about the release!


But the docs for the Python API do not work in my browser. Every page is 
empty, for example http://rdkit.org/docs/source/rdkit.Chem.AllChem.html


Am I the only one?


Best,


Hongbin Yang 杨弘宾, Ph.D.
Research: Toxicophore and Chemoinformatics

On 10/25/2019 15:20,Greg Landrum wrote:
Dear all,


I'm pleased to announce that the next version of the RDKit - 2019.09 - is 
released. The release notes are below.


The release files are on the github release page:
https://github.com/rdkit/rdkit/releases/tag/Release_2019_09_1



Binaries have been uploaded to anaconda.org (https://anaconda.org/rdkit).
The available conda binaries for this release are:
Linux 64bit: python 3.6, 3.7
Mac OS 64bit: python 3.6, 3.7
Windows 64bit: python 3.6, 3.7


Conda builds of the PostgreSQL cartridge are also available:
Linux 64bit: postgresql 9.6, 10, 11
Mac OS 64bit: postgresql 9.6, 10, 11


I believe that conda-forge will also switch to the new version in the near 
future.


The online version of the documentation at rdkit.org 
(http://rdkit.org/docs/index.html) has been updated.



Some things that will be finished over the next couple of days:
- The conda build scripts will be updated to reflect the new version
- The homebrew script


Thanks to everyone who submitted code, bug reports, and suggestions for this 
release!


Please let me know if you find any problems with the release or have 
suggestions for the next one, which is scheduled for March/April 2020.


Best Regards,
-greg 


# Release_2019.09.1
(Changes relative to Release_2019.03.1)

## Important
- The atomic van der Waals radii used by the RDKit were corrected/updated in
  #2154. This leads to different results when generating conformations,
  molecular volumes, and molecular shapes.

## Backwards incompatible changes
- See the note about atomic van der Waals radii above.
- As part of the enhancements to the MolDraw2D class, we changed the type of
  DrawColour from a tuple to be an actual struct. We also added a 4th element to
  capture alpha values. This should have no effect on Python code (the alpha
  value is optional when providing color tuples), but will require changes to
  C++ and Java/C# code that is using DrawColour.
- When reading Mol blocks, atoms with the symbol "R" are now converted into
  queries that match any atom when doing a substructure search (analogous to "*"
  in SMARTS). The previous behavior was to only match other dummy atoms
- When loading SDF files using PandasTools.LoadSDF(), we now default to
  producing isomeric smiles in pandas tables.  To reproduce the original
  behavior, use isomericSmiles=False in the call to the function.
- The SMARTS generated by the RDKit no longer contains redundant wildcard
  queries. This means the SMARTS strings generated by this release will
  generally be different from those in previous releases, although the results
  produced by the queries should not change.
- The RGroupDecomposition code now removes Hs from output R groups by default.
  To restore the old behavior create an RGroupDecompositionParameters object and
  set removeHydrogensPostMatch to false.
- The default values for some of the new fingerprint generators have been
  changed so that they more closely resemble the original fingerprinting code.
  In particular most fingerprinters no longer do count simulation by default
  and the RDKit fingerprint now sets two bits per feature by default.
- The SMARTS generated for MCS results using the ringMatchesRingOnly or
  completeRingsOnly options now includes ring-membership queries.

## Highlights:
- The substructure matching code is now about 30% faster. This also improves the
  speed of reaction matching and the FMCS code. (#2500)
- A minimal JavaScript wrapper has been added as part of the core release.
  (#2444)
- It's now possible to get information about why molecule sanitization failed.
  (#2587)
- A flexible new molecular hashing scheme has been added. (#2636)
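
As a small illustration of the sanitization-failure reporting mentioned above, the sketch below uses Chem.DetectChemistryProblems() on a deliberately invalid molecule. The example molecule (a neutral nitrogen with four bonds) is an assumption chosen only for demonstration.

```python
from rdkit import Chem

# Parse without sanitization so that the invalid molecule is not rejected.
mol = Chem.MolFromSmiles('CN(C)(C)C', sanitize=False)

# Each entry describes one problem that sanitization would have raised.
problems = Chem.DetectChemistryProblems(mol)
for problem in problems:
    print(problem.GetType(), problem.GetAtomIdx(), problem.Message())
```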

## Acknowledgements:
Patricia Bento, Francois Berenger, Jason Biggs, David Cosgrove, Andrew Dalke,
Thomas Duigou, Eloy Felix, Guillaume Godin, Lester Hedges, Anne Hersey,
Christoph Hillisch, Christopher Ing, Jan Holst Jensen, Gareth Jones, Eisuke
Kawashima, Brian Kelley, Alan Kerstjens, Karl Leswing, Pat Lorton, John
Mayfield, Mike Mazanetz, Dan Nealschneider, Noel O'Boyle, Stephen Roughley,
Roger Sayle, Ricardo Rodriguez Schmidt, Paula Schmiel, Peter St. John, Marvin
Steijaert, Matt Swain, Amol Thakkar, Paolo Tosco, Yi-Shu Tu, Ricardo Vianello,
Marc Wittke, '7FeiW', 'c56pony', 'sirbiscuit'


## Bug Fixes:
  - MCS returning partial rings with completeRingsOnly=True
 (github issue #945 from greglandrum)
  - Alternating canonical SMILES for fused ring with N
 (github issue #1028 from greglandrum)
  - Atom index out of range error
 (github issue #1868 from A-Thakkar)
  - Incorrect cis/trans stereo symbol for conjugated ring
 (github issue #2023 from baoilleach)
  - Hardcoded 

Re: [Rdkit-discuss] parsing reactions for reactants, agents, products

2019-10-22 Thread Hongbin Yang
Hi Benjamin,


The magic code uses a feature of python named "list comprehension". 
https://www.pythonforbeginners.com/basics/list-comprehensions-in-python


It does not read the rxn string directly, but splits the string first. Since 
the reaction string should be `reactants smiles>agents smiles>product smiles`, 
we can get these SMILES strings by "rxn_string.split('>')".
Then for each part, we can use splitter "." to get single molecules. So 
finally, [mols.split('.') for mols in rxn_string.split('>')] becomes 
[[reactant1, reactant2, ..], [agent1, agent2, ..], [product1, product2, ...]]. 
But they are all SMILES strings.
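
To make the splitting step concrete, here is a minimal sketch in plain Python that needs no RDKit. The reaction string is a made-up example (an esterification with acid and water written as agents), not one taken from ASKCOS.

```python
# Hypothetical reaction SMILES: reactants > agents > products
rxn_string = 'CC(=O)O.OCC>[H+].O>CC(=O)OCC'

# First split on '>' into the three roles, then split each role on '.'
parts = [mols.split('.') for mols in rxn_string.split('>')]
reactants, agents, products = parts

print(reactants)  # SMILES strings of the reactants
print(agents)     # SMILES strings of the agents
print(products)   # SMILES strings of the products
```

Note that for a reaction without agents, such as 'A>>B', the middle part is an empty string, so the agents list becomes [''] rather than an empty list.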


mols_from_smiles_list is defined here: 
https://github.com/connorcoley/ASKCOS/blob/master/makeit/utilities/io/draw.py#L16
It just converts a list of SMILES strings into a list of molecules. The only 
RDKit API it uses is "Chem.MolFromSmiles".


The magic code can be translated into:


reactants_smiles, agents_smiles, product_smiles = rxn_string.split('>')
package_results = []
for mols in (reactants_smiles, agents_smiles, product_smiles):
    x = mols.split('.')            # x is a list of SMILES strings
    y = mols_from_smiles_list(x)   # y is a list of molecule objects
    package_results.append(y)
reactants, agents, products = package_results


Written this way, the code is not as cool.


I have no idea about the second question. May I ask where the parameters 
threshold_unmapped_reactant_atoms and move_unmmapped_reactants_to_agents are 
defined?


Best,


Hongbin Yang 杨弘宾, Ph.D.
Research: Toxicophore and Chemoinformatics

On 10/22/2019 13:08,Benjamin Datko wrote:
Hello all,


While reading the source code for ASKCOS 
(https://github.com/connorcoley/ASKCOS/blob/master/makeit/utilities/io/draw.py) 
I noticed this code snippet (line 216 on the GitHub):


reactants, agents, products = [mols_from_smiles_list(x) for x in [mols.split('.') for mols in rxn_string.split('>')]]



When the above code is applied to a SMILES reaction string, the result unpacks 
the reactant, agent, and product mol objects into the respective variables, 
with pretty good accuracy.  The function 'mols_from_smiles_list' essentially 
just applies Chem.MolFromSmiles over a list of SMILES.


I think this code snippet is really cool but I cannot find any documentation on 
how this is working. Searching this mailing list I came across the thread 
(https://sourceforge.net/p/rdkit/mailman/message/36316849/) where this 
operation of labeling reactants, agents, and products seems to be determined by 
the threshold_unmapped_reactant_atoms explained in the quoted text from the 
message (linked above)


Here's what's going on: By default the cartridge code does an extra step after 
reading a reaction from SMILES/SMARTS: it looks at all the reactants and moves 
any that don't have a sufficient fraction of mapped atoms to the agents. We do 
this by default because the reactions that we found "in the wild" often have 
agents, solvents, etc. mixed in with the reactants. The key parameter used 
there is threshold_unmapped_reactant_atoms, which defaults to 0.2.


The only further reading I can find is from Greg's paper 
(https://pubs.acs.org/doi/10.1021/ci5006614). I have two main questions: 


1. Where in the code is this atom mapping being applied? I cannot tell when 
this method is being applied or where the meta data is being saved. Applying 
the code snippet above to a SMILES reaction string results in a list of 
rdkit.Chem.rdchem.Mol objects. I cannot seem to find any static method or 
attributes specifying if it's a reactant, agent, or product when inspecting a 
mol object using help in a python terminal.


2. How can I change the value of the variables 
threshold_unmapped_reactant_atoms and move_unmmapped_reactants_to_agents? I am 
using rdkit version 2019.03.4 in an Anaconda environment. I want to experiment 
changing the mapping threshold.


Very Respectfully,


Benjamin


Re: [Rdkit-discuss] Compatibility with pylint in vscode

2019-10-12 Thread Hongbin Yang
Dear Paolo,


Thank you.


That configuration solves the problem. But after I added it to 
python->linting->Pylint Args, many more errors/warnings appeared, which are 
correct but unwanted. I guess that the "Use Minimal Checkers" setting is 
effectively disabled even though it is still checked. (I don't know why.)


In the end, I decided to uncheck "Whether to lint python files using pylint"  :)


Best,


Hongbin Yang 杨弘宾, Ph.D.
Research: Toxicophore and Chemoinformatics

On 10/12/2019 14:35,Paolo Tosco wrote:
Hi Hongbin,


Try configuring


extension-pkg-whitelist=rdkit 


Then pylint should recognise RDKit methods.


Cheers,
p.



[Rdkit-discuss] Compatibility with pylint in vscode

2019-10-12 Thread Hongbin Yang
Dear RDKit users,


Does anyone use vscode with pylint support? 
In my IDE, it tells me that "Module 'rdkit.Chem' has no 'MolFromSmiles' 
member.", with a red wavy line under "Chem". The conda/python environment is 
correctly configured and the scripts run.


I know that this may be because some modules and functions in RDKit are just 
wrappers around C++ code, so pylint may not recognize them. 


But the red wavy line is really annoying. Are there any suggestions other than 
disabling pylint?


Best regards,


Hongbin Yang 杨弘宾, Ph.D.
Research: Toxicophore and Chemoinformatics


Re: [Rdkit-discuss] Boc Deprotection

2019-10-03 Thread Hongbin Yang
Hi Sean,


You are right that using [#7H:1] could be incorrect.


How about using an explicit hydrogen and then "removing" it?


reaction = AllChem.ReactionFromSmarts('[#7:1]C(=O)OC(C)(C)C>>[#7:1][H]')

reactants_1 = [Chem.MolFromSmiles('CC(C)(C)OC(=O)NC1(C(=O)O)CCN(C(=O)OCC2c3c3-c3c32)CC1')]
# Show the reactant first; the product does not exist yet at this point.
display(Draw._moltoimg(reactants_1[0], (450, 150), [], legend='mol_1'))
products = reaction.RunReactants(reactants_1)
product = products[0][0]
product = AllChem.RemoveHs(product)
display(Draw._moltoimg(product, (450, 150), [], legend='mol_1 deprotected'))
product.UpdatePropertyCache()
display(Draw._moltoimg(product, (450, 150), [], legend='mol_1 updated'))


Then you will get the correct products in both aromatic and aliphatic cases.


Best,


Hongbin Yang 杨弘宾, Ph.D.
Research: Toxicophore and Chemoinformatics



On 10/4/2019 05:04,Sean Stromberg wrote:

Thanks Hongbin,

The problem is that there are two cases aliphatic and aromatic. If I add the 
explicit hydrogen like you suggest then the aliphatic case has the incorrect 
number of hydrogens. My question is how to contain the generality of the 
chemistry in the reaction smarts. Is that possible with the reaction smarts 
syntax, or should I just define two reactions, do a substructure search and 
apply the appropriate reaction for every building block I want to deprotect?

Thanks again,

Sean

 



Re: [Rdkit-discuss] Boc Deprotection

2019-10-02 Thread Hongbin Yang
Hi Sean,


The problem in this case is that in a non-kekulized SMILES, an aromatic 
nitrogen atom bonded to a hydrogen must be written as “[nH]”. The “H” 
is compulsory.


So you can change your reaction into "[#7:1]C(=O)OC(C)(C)C>>[#7H:1]"


Best,


Hongbin Yang 杨弘宾, Ph.D.
Research: Toxicophore and Chemoinformatics



On 10/3/2019 04:53,Sean Stromberg wrote:

Dear Rdkit Community,

I’m trying to learn reaction smarts with rdkit and would appreciate some 
clarification. I’m doing Boc deprotection on a large set of building blocks and 
the way I’ve defined my reaction, the number of hydrogens added after the 
deprotection is always wrong. I normally can call UpdatePropertyCache() to fix 
this, but when it’s an indole that is being deprotected when I call 
UpdatePropertyCache() it raises:

  

ValueError: Sanitization error: Can't kekulize mol.  Unkekulized atoms: 0 1 2 3 4 5 7 8 10

 

I know how to add hydrogens explicitly in the reaction smarts but not with 
reference to the nitrogen’s original bonds. Can I add explicit hydrogens 
conditionally? What is the best way to obtain results from the two reactant 
cases in my example code?

 

from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import Draw  # needed for Draw.MolToImage below

# display() is provided by IPython/Jupyter.
reaction = AllChem.ReactionFromSmarts('[#7:1]C(=O)OC(C)(C)C>>[#7:1]')

# Case 1: aliphatic nitrogen
reactants_1 = [Chem.MolFromSmiles('CC(C)(C)OC(=O)NC1(C(=O)O)CCN(C(=O)OCC2c3c3-c3c32)CC1')]
display(Draw.MolToImage(reactants_1[0], legend='mol_1'))
products = reaction.RunReactants(reactants_1)
product = products[0][0]
display(Draw.MolToImage(product, legend='mol_1 deprotected'))
product.UpdatePropertyCache()
display(Draw.MolToImage(product, legend='mol_1 deprotected and updated'))

# Case 2: aromatic (indole) nitrogen
reactants_2 = [Chem.MolFromSmiles('CC(C)(C)OC(=O)n1cc(C[C@@H](NC(=O)OCC2c3c3-c3c32)C(=O)O)c2c21')]
display(Draw.MolToImage(reactants_2[0], legend='mol_2'))
products = reaction.RunReactants(reactants_2)
product = products[0][0]
display(Draw.MolToImage(product, legend='mol_2 deprotected'))
product.UpdatePropertyCache()
display(Draw.MolToImage(product, legend='mol_2 deprotected and updated'))

 

Thanks for any clarification you can provide!

Sincerely,

Sean Stromberg

 

P.S. Does anyone know why I get so many products when I run these reactions?



[Rdkit-discuss] Re: Non Round-trippable Molecule

2019-09-03 Thread Hongbin Yang
Hi Axel,

A fragment like "c11" defines the bonds between atoms implicitly, and 
the "canonical" SMILES output has redundant ring labels (1 and 2), 
which I think confused the parser and caused the problem.
I have no idea whether it is a bug, or whether this is the true reason.

But try this to solve your problem.

smi = Chem.MolToSmiles(mol, kekuleSmiles=True)

You will get the explicit bonds in the SMILES and it can be read by 
`MolFromSmiles`
COC1:C:C:C2:C:C:1OC1:C:C:C(:C:C:1)/C=C\\C1:C:C(:C(OC):C:C:1CCN(C)C)OC1:C(:C(CCN(C)C):C(OC):C(OC):C:1OC)/C=C\\2
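
A minimal sketch of the same idea as a reusable check, with one variation that is an assumption rather than part of the suggestion above: the molecule is kekulized explicitly (clearing the aromatic flags) before the SMILES is written, so the output uses explicit single and double bonds. The helper name and the small test molecule are hypothetical, and whether this fixes the macrocycle in question has not been verified here.

```python
from rdkit import Chem


def kekule_round_trip(smiles):
    """Write a Kekule SMILES for the input and parse it again."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError('could not parse: ' + smiles)
    # Replace aromatic bonds by explicit single/double bonds.
    Chem.Kekulize(mol, clearAromaticFlags=True)
    kekule_smiles = Chem.MolToSmiles(mol, kekuleSmiles=True)
    # MolFromSmiles returns None if the Kekule SMILES cannot be sanitized.
    return kekule_smiles, Chem.MolFromSmiles(kekule_smiles)


original = 'COc1ccc2[nH]ccc2c1'  # 5-methoxyindole, used only as a small test
kek, mol2 = kekule_round_trip(original)
print(kek)
print(Chem.MolToSmiles(mol2))
```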

Best,
Hongbin

From: Axel Pahl
Sent: September 3, 2019 16:45
To: RDKit Discuss
Subject: [Rdkit-discuss] Non Round-trippable Molecule

Hi,

I know that the RDKit makes no guarantees about being able to round-trip 
(Smiles -> Mol -> Smiles -> Mol) every molecule, but I would like to know if 
there are any recommendations on how to handle such cases.

In my current case the problem seems to lie in different aromatic models for a 
large ring containing oxygen.

This is the code to reproduce the issue and there is also a more illustrative 
Notebook (https://gist.github.com/apahl/06e55f5965cb82bc43d2aafd8ee0d532):



from rdkit.Chem import AllChem as Chem
from rdkit.Chem import Draw
from rdkit.Chem import Descriptors as Desc
from rdkit.Chem.Draw import IPythonConsole

# RDKit can parse the original Smiles into a valid molecule.
# (A raw string keeps the backslashes in the SMILES intact.)
mol = Chem.MolFromSmiles(r"c12c(\C=C/c(ccc3OC)cc3Oc4ccc(cc4)\C=C/c(cc5O1)c(CCN(C)C)cc5OC)c(CCN(C)C)c(OC)c(OC)c2OC")
print(Desc.MolWt(mol))

# And parse it back into a Smiles.
smi = Chem.MolToSmiles(mol)
print(smi)
# -> COc1ccc2cc1-o-c1ccc(cc1)/c=c\c1cc(c(OC)cc1CCN(C)C)-o-c1c(c(CCN(C)C)c(OC)c(OC)c1OC)/c=c\2
# But the Smiles generated by RDKit cannot be parsed back into a valid molecule.
tmp = Chem.MolFromSmiles(smi)
# -> RDKit ERROR: [10:39:41] Can't kekulize mol.  Unkekulized atoms: 2 3 4 5 6 7 9 10 11 12 13 14 15 16 17 18 19 20 23 24 31 32 33 39 42 45 48 49



BTW, the JS Molecule Editor by Peter is also not able to round-trip this 
molecule.

Many thanks in advance,
Axel



[Rdkit-discuss] Re: Generating R-group representation

2019-08-26 Thread Hongbin Yang
Hi Tim,

Greg recently posted a gist on how to generate R-group matrices:
https://sourceforge.net/p/rdkit/mailman/message/36744886/
Does it help?

Hongbin Yang

From: Tim Dudgeon
Sent: August 26, 2019 21:08
To: rdkit-discuss@lists.sourceforge.net
Subject: [Rdkit-discuss] Generating R-group representation

I have a set of molecules that share a common scaffold and differ by 
substitution at a small number of sites (typically one or two).
I'd like to generate a generic R-group molecule that summarises the 
molecules (e.g. showing the scaffold with the sites of substitution as 
R1, R2 etc.).

Finding the MCS of such a set of molecules with RDKit seems straight 
forward, but the output of that is a SMARTS expression for the MCS.
Does anyone have any examples (or hints) of using this to generate a 
R-group representation?

Tim





Re: [Rdkit-discuss] SAR matrices

2019-08-20 Thread Hongbin Yang
Hi Greg,


Very nice demo!


I’d like to ask whether we can set the size of the “elements” in a molecular 
depiction rather than the size of the figure.


It is easy to set the width and height when drawing a compound. But when two 
compounds are drawn at the same figure size, e.g. 200*150, a chemist may see 
them as being drawn at different scales, because in a chemist's mind the size 
of an element (such as a ring, a bond or the font size of an atom symbol) should 
be the same across drawings. So can we make the figure size dynamic and keep 
the element size constant, which means a complex molecule would get a larger 
figure than a simple one?


In the example mentioned by Ken, the TOC graphic in 
https://pubs.acs.org/doi/full/10.1021/ci300206e, the size of the substituents 
might be 100*100 or 100*120, and the scaffolds are about 300*150.


I am not sure whether this thread is the right place to ask, but I think this 
is worth considering when “drawing” such R-group tables.


Best,


Hongbin Yang 杨弘宾, Ph.D.
Research: Toxicophore and Chemoinformatics
Pharmaceutical Science, School of Pharmacy
East China University of Science and Technology 


On 08/20/2019 17:36,Greg Landrum wrote:
I actually had a bit of inspiration while waiting for a connecting flight and 
think I will have a little demo of this ready in a day or so.


-greg


On Tue, 20 Aug 2019 at 03:29, Greg Landrum  wrote:

This is a great problem, but it's certainly not a trivial one.


It's a bit of a triviality, but here's at least a demo of how to draw the R 
groups with the dummies as "attachment points":
https://gist.github.com/greglandrum/f7e310045542ab71447351a8043bbf3f




-greg






Re: [Rdkit-discuss] SAR matrices

2019-08-18 Thread Hongbin Yang
Hi Ken,


I am afraid that you may have underestimated the amount of work involved. At 
least it includes R-group decomposition (and it seems that only two 
substituents are considered in your example), drawing molecules, and rendering 
the “table”.


You can use RDKit to decompose the compounds. For example, use the maximum 
common substructure to set up the scaffold, then identify the attachment 
points by comparing the compounds with the scaffold.
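
A minimal sketch of that first step, assuming the MCS is used directly as the core for RDKit's R-group decomposition. The three substituted benzenes are made-up inputs, and the grid layout itself is not covered here.

```python
from rdkit import Chem
from rdkit.Chem import rdFMCS
from rdkit.Chem.rdRGroupDecomposition import RGroupDecompose

# Hypothetical input set sharing a benzene scaffold.
smiles_list = ['Cc1ccccc1', 'Clc1ccccc1', 'COc1ccccc1']
mols = [Chem.MolFromSmiles(s) for s in smiles_list]

# Step 1: the maximum common substructure serves as the scaffold.
mcs = rdFMCS.FindMCS(mols)
core = Chem.MolFromSmarts(mcs.smartsString)

# Step 2: decompose every molecule into the core plus its R groups.
# Each row is a dict such as {'Core': ..., 'R1': ...} holding SMILES strings.
groups, unmatched = RGroupDecompose([core], mols, asSmiles=True)
for smi, row in zip(smiles_list, groups):
    print(smi, row)
```

The rows could then be pivoted, for example with pandas, so that the R1 values form the row headers and the R2 values the column headers.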


As for drawing molecules, I think RDKit cannot draw substituent groups the way 
ChemDraw does (but I’m not sure).


If you just want to analyse the compounds by R-group decomposition, I believe 
some commercial software such as Schrödinger and StarDrop is pretty good.


And as for the example in the paper by Gupta-Ostermann et al., I guess that 
they drew the figures with ChemDraw and PowerPoint rather than scripts.


Best regards,




Hongbin Yang 杨弘宾, Ph.D.
Research: Toxicophore and Chemoinformatics


On 08/18/2019 20:23,ken wrote:
Hello,



I am trying to build a 2-D R-group grid (or table, or spreadsheet), where the 
row headers contain R1 values and the column headers contain R2 values (or vice 
versa).  Compounds that have given R1 and R2 groups would be represented on the 
table as a filled cell that intersects those R1 and R2. For example, the input 
could be an SD file containing the following three compounds:



The desired output grid from the sd file would look something like this ("Y" 
can be replaced with cell formatting or some other indicator):


The closest thing to this that I have been able to find is the "SAR Matrix" 
(https://f1000research.com/articles/3-113/v2), but the code that was used to 
generate the matrices does not appear to be available.  Does anyone happen to 
have such code or know how I can generate it? I imagine the first step would be 
to perform an R-group decomposition, but I'm not sure what to do from there. 


I started to see if I could build the program from scratch, but then I thought 
that someone must've done this before and I shouldn't needlessly reinvent it.  
I've been (re)learning Python for the past year or so and I think I have a 
pretty good handle on the language, but I wouldn't mind putting said learning 
to the test on a "real" project, so if anyone has a solution that outputs 
something that even vaguely resembles the desired grid/matrix, maybe I can 
modify it to fit my needs.


At some point, I would need the grid to be editable in Word, but I'll cross 
that bridge when I get to it...


Thank you in advance for your help,
Ken


[Rdkit-discuss] RDKit cannot sanitize metal atom like platinum

2019-07-26 Thread Hongbin Yang
Dear all,


I encountered a problem when reading the molecule “Oxaliplatin”.


In DrugBank, the SMILES of Oxaliplatin is 
`[H][N]1([H])[C@@H]2[C@H]2[N]([H])([H])[Pt]11OC(=O)C(=O)O1`. If you use 
Chem.MolFromSmiles without disabling sanitisation, it reports an error, 
"Explicit valence for atom # 0 N, 4, is greater than permitted". In this 
molecule, each nitrogen has four bonds, one of which is a coordination bond. It 
seems that SMILES cannot represent this bond type.
My question is how to read it. Sanitisation is necessary to, e.g., calculate 
fingerprints, so skipping sanitisation is not a good idea.


One alternative is to use an ionic representation. For example, the SMILES of 
Oxaliplatin in PubChem is 
`C1CC[C@H]([C@@H](C1)[NH-])[NH-].C(=O)(C(=O)[O-])[O-].[Pt+4]`. It converts the 
covalent bonds into ionic bonds, which I think is not ideal, but it’s OK.


If this is the only solution, my question is how to transfer the “incorrect” 
SMILES in DrugBank into the “correct” one in PubChem within RDKit.
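
One possible direction, offered only as a sketch: RDKit has a dative bond type (Chem.BondType.DATIVE), so instead of switching to the ionic form, the over-valent N-metal single bonds could be converted into dative bonds before sanitisation. The helper names, the range of atomic numbers treated as transition metals, and the small test molecule are assumptions made for illustration. This does not produce the PubChem-style ionic SMILES.

```python
from rdkit import Chem


def is_transition_metal(atom):
    # Rough assumption: Ti-Cu, Zr-Ag and Hf-Au count as transition metals.
    n = atom.GetAtomicNum()
    return 22 <= n <= 29 or 40 <= n <= 47 or 72 <= n <= 79


def set_dative_bonds(mol, from_atoms=(7, 8)):
    """Turn over-valent N/O-metal single bonds into dative bonds."""
    pt = Chem.GetPeriodicTable()
    rwmol = Chem.RWMol(mol)
    rwmol.UpdatePropertyCache(strict=False)
    # Collect the bonds first, then edit, so valences are read unmodified.
    to_change = []
    for metal in rwmol.GetAtoms():
        if not is_transition_metal(metal):
            continue
        for nbr in metal.GetNeighbors():
            bond = rwmol.GetBondBetweenAtoms(nbr.GetIdx(), metal.GetIdx())
            if (nbr.GetAtomicNum() in from_atoms
                    and nbr.GetTotalValence() > pt.GetDefaultValence(nbr.GetAtomicNum())
                    and bond.GetBondType() == Chem.BondType.SINGLE):
                to_change.append((nbr.GetIdx(), metal.GetIdx()))
    for donor_idx, metal_idx in to_change:
        rwmol.RemoveBond(donor_idx, metal_idx)
        # The dative bond points from the donor atom to the metal.
        rwmol.AddBond(donor_idx, metal_idx, Chem.BondType.DATIVE)
    return rwmol


# Small stand-in molecule: a nitrogen with three carbons bound to platinum.
m = Chem.MolFromSmiles('CN(C)(C)[Pt]', sanitize=False)
m2 = set_dative_bonds(m)
Chem.SanitizeMol(m2)
print(Chem.MolToSmiles(m2))
```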


PS, I thought this would be a common issue, but I could not find anything 
similar in the GitHub issues or the mailing list history. (Maybe it is because 
we always discard organometallic compounds?) 


Best regards,


Hongbin Yang 杨弘宾, Ph.D.
Research: Toxicophore and Chemoinformatics
Pharmaceutical Science, School of Pharmacy
East China University of Science and Technology 



Re: [Rdkit-discuss] Definition of HBA differs from pipeline pilot

2017-06-21 Thread Hongbin Yang







Hi Chris,

Thank you very much for the suggestion. Still, I tend to tell my colleagues to 
use Lipinski's HBA in PP :).
(BTW, http://www.macinchem.org is pretty good. I like the website, and thanks 
for the "advertisement".)

Cheers,

Hongbin Yang

From: Chris Swain
Date: 2017-06-21 14:08
To: rdkit-discuss
Subject: Re: [Rdkit-discuss] Definition of HBA differs from pipeline pilot

Hi,
Many applications have multiple definitions of HBA/D, from simple heteroatom 
counts to sophisticated SMARTS definitions. As long as they are documented and 
referenced, I’d vote for keeping all definitions. It certainly helps if you 
want to go back and try to repeat published work.

Cheers,
Chris

Dr Chris Swain BA MA (Cantab) PhD CChem FRSC
Macs in Chemistry
sw...@mac.com
http://www.macinchem.org







Re: [Rdkit-discuss] Reverse scale of svg coordinates to atom coordinates

2017-06-21 Thread Hongbin Yang







Hi Esben Jannik Bjerrum,

I have studied how to do this, though it may not be the best way. In 
`rdMolDraw2D` there are `getAtomCoords` and `getDrawCoords` in the C++ API:
http://www.rdkit.org/docs/cppapi/classRDKit_1_1MolDraw2D.html#abd327050dfa838543103d1f2c5f28f23
However, these methods are not wrapped for Python, so you may have to add the 
wrappers to the source and compile it manually.

Here is some code I wrote for an older version:

### in `rdkit/Code/GraphMol/MolDraw2D/Wrap` ###
namespace RDKit {
// Thin wrappers that look up coordinates by atom index.
Point2D getDrawCoordsViaId(MolDraw2D &self, int at_num) {
  return self.getDrawCoords(at_num);
}

Point2D getAtomCoordsViaId(MolDraw2D &self, int at_num) {
  return self.getAtomCoords(at_num);
}
}
...
BOOST_PYTHON_MODULE(rdMolDraw2D) {
  ...
  python::class_<RDKit::MolDraw2D, boost::noncopyable>("MolDraw2D", docString.c_str(), python::no_init)
    .def ...)
    .def("getDrawCoords", RDKit::getDrawCoordsViaId, "a")
    .def("getAtomCoords", RDKit::getAtomCoordsViaId, "b")
    ;
  ...

(I don't remember why I added "a" and "b"; they may not be necessary. I am not 
good at C++ or boost-python.)

BTW, I wonder whether it is possible to expose these APIs officially?

Hongbin Yang 

From: Esben Jannik Bjerrum via Rdkit-discuss
Date: 2017-06-21 22:38
To: rdkit-discuss@lists.sourceforge.net
Subject: [Rdkit-discuss] Reverse scale of svg coordinates to atom coordinates

Hi RDkitters,

I'm experimenting a bit with an application with some user interactivity. I 
get the SVG coordinates from the Mol SVG drawing from the user interaction and 
need to get back to the atom coordinates, with the goal of identifying the 
atom nearest the selected coordinates (or is there a smarter way to achieve 
that goal?).

Is this possible from Python currently?
I see that there is a MolDraw2D::getAtomCoords function in the cpp code for 
MolDraw2D, but I can't see it from the Python side, and there doesn't seem to 
be a way to get the scaling from the MolDraw2DSVG object.

As I understand it, the coordinates from the molecule are offset and scaled 
(and flipped for y) to fit the drawing canvas of the specified size. To get 
back to the original atom coordinates I must somehow reverse the 
transformation. Here are some script snippets illustrating what I am trying to 
achieve.

from rdkit import Chem
from rdkit.Chem import rdDepictor
from rdkit.Chem.Draw import rdMolDraw2D

# Get some SVG depiction of a mol
mol = Chem.MolFromSmiles('')
mc = Chem.Mol(mol.ToBinary())
rdDepictor.Compute2DCoords(mc)
drawer = rdMolDraw2D.MolDraw2DSVG(300, 300)
drawer.DrawMolecule(mc)
drawer.FinishDrawing()
svg = drawer.GetDrawingText().replace('svg:', '')

# Visualization and user interaction code here gives SVG coordinates
svg_x = 271.0
svg_y = 237.0

# Is there a function to scale back the coordinates? Alternatively, get the
# scaling and the offset from drawer and handle it manually.
atomcoords = drawer.getAtomCoords((svg_x, svg_y))
# But this function doesn't exist :-(
# ...
# Continue working with the rdkit mol
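
If the scale and offsets were known, the reverse transformation described above (offset, uniform scale, y flip) could be done by hand. The following is only a sketch in plain Python under that assumption: `DrawTransform`, its fields, and `nearest_atom` are hypothetical names, the numbers are made up, and nothing here reads the real values from a MolDraw2DSVG object.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DrawTransform:
    """Hypothetical canvas mapping: offset, uniform scale, then a y flip."""
    scale: float   # canvas units per molecule unit
    x_min: float   # smallest atom x, in molecule coordinates
    y_min: float   # smallest atom y, in molecule coordinates
    pad_x: float   # horizontal margin, in canvas units
    pad_y: float   # vertical margin, in canvas units
    height: float  # canvas height, needed to undo the y flip

    def to_svg(self, x, y):
        """Molecule coordinates -> canvas coordinates."""
        sx = self.scale * (x - self.x_min) + self.pad_x
        sy = self.height - (self.scale * (y - self.y_min) + self.pad_y)
        return sx, sy

    def to_atom(self, sx, sy):
        """Canvas coordinates -> molecule coordinates (inverse of to_svg)."""
        x = (sx - self.pad_x) / self.scale + self.x_min
        y = (self.height - sy - self.pad_y) / self.scale + self.y_min
        return x, y


def nearest_atom(atom_coords, point):
    """Index of the atom whose (x, y) is closest to point."""
    px, py = point
    return min(
        range(len(atom_coords)),
        key=lambda i: math.hypot(atom_coords[i][0] - px, atom_coords[i][1] - py),
    )


# Made-up numbers for illustration only.
transform = DrawTransform(scale=135.0, x_min=-1.0, y_min=-1.0,
                          pad_x=15.0, pad_y=15.0, height=300.0)
atoms = [(-1.0, -1.0), (0.0, 0.0), (1.0, 1.0)]
clicked = transform.to_atom(271.0, 237.0)
print(nearest_atom(atoms, clicked))
```

An equivalent route is to map every atom forward with `to_svg` and compare distances in canvas space, which avoids the inverse altogether.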

I would welcome some hints or suggestions.

Esben Jannik Bjerrum
cand.pharm, Ph.D
/Sent from my Ubuntu Touch Phone

Phone +45 2823 8009
http://dk.linkedin.com/in/esbenbjerrum
http://www.wildcardconsulting.dk


[Rdkit-discuss] Definition of HBA differs from pipeline pilot

2017-06-20 Thread Hongbin Yang






Hi RDKitters,

The definition of HBA in RDKit (by Lipinski) is:

32  HAcceptorSmarts = Chem.MolFromSmarts('[$([O,S;H1;v2]-[!$(*=[O,N,P,S])]),' + 
33                                       '$([O,S;H0;v2]),$([O,S;-]),$([N;v3;!$(N-*=!@[O,N,P,S])]),' + 
34                                       '$([nH0,o,s;+0])]') 

But in Pipeline Pilot (3.5) there are two HBA definitions, one of which is 
Lipinski's. I guess the other is the "first edition", which is defined as:

22  # HAcceptor  '[$([!#6;+0]);!$([F,Cl,Br,I]); 
23  # !$([o,s,nX3]);!$([Nv5,Pv5,Sv4,Sv6])]' 

Does this mean that we should use the newest edition of HBA and get rid of the 
default definition in Pipeline Pilot? This may change datasets filtered by 
rules such as RO5.
(I am not sure whether the HBA defined in PP is the same as the one defined in 
lines 22-23. I made a test: abacavir has 7 (current edition) and 6 (old) 
acceptors, respectively, and PP also returned these two results.)

reference: http://www.rdkit.org/docs/api/rdkit.Chem.Lipinski-pysrc.html#NumHAcceptors

Hongbin Yang 杨弘宾

Research: Toxicophore and Chemoinformatics
Pharmaceutical Science, School of Pharmacy

East China University of Science and Technology




[Rdkit-discuss] Which package should be used to improve the drawing quality in win64

2017-05-08 Thread Hongbin Yang






Hi all,

By default, rdkit uses pillow to render images when using 
rdkit.Chem.DrawMolecule, but the quality is not good enough.

If memory serves, pycairo can solve this problem. Unfortunately, I cannot find 
a (good) win64 pycairo in any Anaconda channel. The rdkit documentation 
suggests installing aggdraw (http://effbot.org/downloads/#aggdraw), but I think 
it is too old, not to mention its lack of support for win64-python27. (So does 
the documentation need updating?)

So I wonder which package I should install to improve the drawing quality. 
(rdMolDraw is good, but it's not as easy to use as DrawMolecule.)


Hongbin Yang




Re: [Rdkit-discuss] Another Can't kekulize mol observation

2017-04-26 Thread Hongbin Yang






Hi Markus,

"c1ccc(cc1)-c1nnc(n1)-c1c1" is different from "c1ccc(cc1)-c1nncn1-c1c1", so 
you cannot remove the parentheses.
The error "Can't kekulize mol." is caused by the triazole in your molecule.
"c1nncn1" says that the ring is aromatic, but it does not say where the H is.
For example, "C1=NN=CN1" is 4H-1,2,4-triazole and "C1=NC=NN1" is 
1H-1,2,4-triazole. Their Kekulé structures differ, but both of them can be 
represented by "c1nncn1".

There are two solutions I suggest:
1. Use `Chem.MolFromSmiles('c1ccc(cc1)-c1nnc(n1)-c1c1',False)`, which skips 
sanitisation (reference: 
http://www.rdkit.org/docs/api/rdkit.Chem.rdmolfiles-module.html#MolFromSmiles).
2. Kekulize it manually: 
`Chem.MolFromSmiles('c1ccc(cc1)-C1=NN=C(N1)-c1c1')`. This indicates that the H 
is on the 4-N.



Hongbin Yang 

From: Markus Metz
Date: 2017-04-27 09:30
To: RDKit Discuss
Subject: [Rdkit-discuss] Another Can't kekulize mol observation

Hello all:
I obtained this smiles string:
c1ccc(cc1)-c1nnc(n1)-c1c1
by removing atoms from the n1 in parentheses.
Using:
mol = Chem.MolFromSmiles("c1ccc(cc1)-c1nnc(n1)-c1c1")
throws an error: Can't kekulize mol.
Using
mol = Chem.MolFromSmiles("c1ccc(cc1)-c1nncn1-c1c1")
works fine.
Is there any workaround?
Any input is highly appreciated.
Cheers,
Markus




Re: [Rdkit-discuss] Cannot import rdBase after installed rdkit by source in a non-administrator linux cluster

2017-03-29 Thread Hongbin Yang






Hi Greg,

Thanks for your suggestion. However, I could not install it correctly with the 
"original conda" approach suggested in the documentation.

> `conda create -c rdkit -n hbyang-rdkit-env rdkit`
> The same error occurred, just like when I installed from source:
> /home/hbyang/.conda/envs/hbyang-rdkit-env/lib/python2.7/site-packages/rdkit/../../../libboost_serialization.so.1.56.0:
> undefined symbol: 
> _ZN5boost13serialization6detail17singleton_wrapperINS_7archive6detail12extra_detail3mapINS3_15binary_oarchive14m_is_de

Fortunately, I solved the problem by referring to an earlier mailing list 
message, https://sourceforge.net/p/rdkit/mailman/message/35103418/
> the linux packages that are available from the rdkit channel on anaconda.org 
> are based on centos6
> rdkit packages compatible with the rhel5 distribution could be available from 
> the bioconda channel

So I added the channel via `conda config --add channels bioconda`, installed 
rdkit with `conda install rdkit`, and it worked. I found that it requires 
boost 1.57.0-4, so I guess that 1.56.0 is not well compatible with rhel5. The 
cost is that I cannot use the latest version, but that's OK.

Hongbin Yang 杨弘宾


From: Greg Landrum
Date: 2017-03-29 14:01
To: 杨弘宾
CC: rdkit-discuss
Subject: Re: [Rdkit-discuss] Cannot import rdBase after installed rdkit by source in a non-administrator linux cluster

A first thing that's important to know and that may make all of this easier:
You can install the rdkit using conda even if you don't have write access to 
the directory where anaconda python is installed.
If you follow the directions in the documentation and create an environment to 
use the RDKit in you should be able to install your own packages without any 
problems. Environments are (normally) created in a subdir of your home 
directory, where you will (hopefully) have write access.
If you cannot do that or, for some other reason, want to install the RDKit from 
source, I will try and provide more help on that.
-greg

On Tue, Mar 28, 2017 at 5:56 PM, 杨弘宾 <yanyangh...@163.com> wrote:

Hi rdkiters,

Have you tried installing rdkit from source? Installing rdkit with conda 
worked fine on my PC. But I ran into trouble when I tried installing it on a 
server where I am only a regular user who cannot use "sudo" and where "python" 
is in a read-only directory.

Here is my cmake command:
`~applic/cmake/bin/cmake -D PYTHON_LIBRARY=/home/yccai/Programs/Anaconda/lib/python2.7/config/libpython2.7.a -D PYTHON_INCLUDE_DIR=/home/yccai/Programs/Anaconda/include/python2.7 -D PYTHON_EXECUTABLE=/home/yccai/Programs/Anaconda/bin/python -D BOOST_ROOT=/home/yccai/Programs/Anaconda -D Boost_NO_SYSTEM_PATHS=ON ..`

And the output:
-- The C compiler identification is GNU 4.1.2
-- The CXX compiler identification is GNU 4.1.2
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check if the system is big endian
-- Searching 16 bit integer
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of unsigned short
-- Check size of unsigned short - done
-- Using unsigned short
-- Check if the system is big endian - little endian
-- Found PythonInterp: /home/yccai/Programs/Anaconda/bin/python (found version "2.7.12")
-- Found PythonLibs: /home/yccai/Programs/Anaconda/lib/python2.7/config/libpython2.7.a (found version "2.7.12")
-- Boost version: 1.56.0
-- Found the following Boost libraries:
--   python
-- Could NOT find Eigen3 (missing:  EIGEN3_INCLUDE_DIR EIGEN3_VERSION_OK) (Required is at least version "2.91.0")
Eigen3 not found, disabling the Descriptors3D build.
-- Looking for include file pthread.h
-- Looking for include file pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Boost version: 1.56.0
-- Found the following Boost libraries:
--   thread
--   system
-- Boost version: 1.56.0
-- Found the following Boost libraries:
--   serialization
== Using strict rotor definition
== Updating Filters.cpp from pains file
== Done updating pains files
-- Boost version: 1.56.0
-- Found the following Boost libraries:
--   regex
-- Configuring done
-- Generating done
-- Build files have been written to: /home/hbyang/applic/rdkit-Release_2016_09_4/build

There was no error in `make` and `make install`.
B

Re: [Rdkit-discuss] Cannot import rdBase after installed rdkit by source in a non-administrator linux cluster

2017-03-28 Thread Hongbin Yang






Hi Andrew,

I did set LD_LIBRARY_PATH and added $conda/lib to it.

Today I tried installing a new boost (1.60.0) myself with the following 
commands:
```
./bootstrap.sh
./b2 install
```
Interestingly, `make` did not succeed this time. It failed with several errors 
such as:
```
/home/hbyang/.local/lib/libboost_thread.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long)@GLIBCXX_3.4.21'

/home/hbyang/.local/lib/libboost_thread.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long)@GLIBCXX_3.4.21'
```
Then I tried installing with conda:
`conda create -c rdkit -n hbyang-rdkit-env rdkit`
The same error occurred, just like when I installed from source:
/home/hbyang/.conda/envs/hbyang-rdkit-env/lib/python2.7/site-packages/rdkit/../../../libboost_serialization.so.1.56.0:
 undefined symbol: 
_ZN5boost13serialization6detail17singleton_wrapperINS_7archive6detail12extra_detail3mapINS3_15binary_oarchive14m_is_destroyedE

By the way, I found a similar problem reported at the end of issue 
[#762](https://github.com/rdkit/rdkit/issues/762):
user-agent conda/4.3.4 requests/2.12.4 CPython/2.7.12 Linux/2.6.18-308.el5 
CentOS/5.8 glibc/2.5

Hongbin Yang 杨弘宾

Research: Toxicophore and Chemoinformatics
Pharmaceutical Science, School of Pharmacy

East China University of Science and Technology 

From: Andrew Dalke
Date: 2017-03-29 03:35
To:
CC: rdkit-discuss
Subject: Re: [Rdkit-discuss] Cannot import rdBase after installed rdkit by source in a non-administrator linux cluster

On Mar 28, 2017, at 17:56, 杨弘宾 <yanyangh...@163.com> wrote:
> Have you tried install rdkit from source? It's ok when I installed rdkit 
> by conda in my PC. But when I tried installing it in a server in which I am 
> only a user who cannot use "sudo" and the "python" is in a read-only directory.
 
Yes I have, and I find it rather difficult. (My system has Python 2.7 and 
Python 3.5, for several versions of RDKit, so I can do regression testing 
across multiple environments.)
 
I use Python virtual environments which helps, in that I effectively can 
control a Python installation, but also adds its own layer of complexity.
 
 
> But when I used:
> `from rdkit import rdBase`
> error happened:
> ImportError: 
> /home/yccai/Programs/Anaconda/bin/../lib/libboost_serialization.so.1.56.0: 
> undefined symbol: 
> _ZN5boost13serialization6detail17singleton_wrapperINS_7archive6detail12extra_detail3mapINS3_15binary_oarchive14m_is_destroyedE
 
I think you are missing an LD_LIBRARY_PATH entry to point to your Boost 
libraries.
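
One way to see which Boost libraries a given LD_LIBRARY_PATH would expose is to list the matching files directory by directory, in search order. The following is a minimal sketch using only the Python standard library; `find_shared_libs` is a hypothetical helper written for this illustration, and it looks only at LD_LIBRARY_PATH. The dynamic loader also consults other locations (for example RPATH entries and the system cache), which this sketch does not cover.

```python
import os
from pathlib import Path


def find_shared_libs(name_prefix, search_path=None):
    """List shared libraries starting with name_prefix, in search-path order.

    search_path is a string in LD_LIBRARY_PATH format; when omitted, the
    current LD_LIBRARY_PATH environment variable is used.
    """
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for entry in search_path.split(os.pathsep):
        if not entry:
            continue  # skip empty entries such as a trailing separator
        directory = Path(entry)
        if not directory.is_dir():
            continue  # ignore entries that do not exist
        hits.extend(sorted(directory.glob(name_prefix + "*.so*")))
    return hits


if __name__ == "__main__":
    for lib in find_shared_libs("libboost_serialization"):
        print(lib)
```

If more than one directory contains a `libboost_serialization` file, the copy in the earliest directory is the one an LD_LIBRARY_PATH lookup would find first, so a mismatched version there is worth ruling out.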
 
 
Andrew
da...@dalkescientific.com
 
 
 