from:"Paolo Tosco"

Re: [Rdkit-discuss] GetSubstructMatches and unique match

2020-05-10 Thread Paolo Tosco


Dear Quoc-Tuan,

I think I have come up with a reasonably fast algorithm that seems to be 
more robust:


https://gist.github.com/ptosco/dc4d27153e6e8e45aed654761e4d7409

Cheers,
p.
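
For readers who cannot open the gist, the underlying problem (picking as many matches as possible so that no atom is used twice) can be sketched in plain Python. This is not necessarily the algorithm used in the gist: it is a brute-force backtracking search, and the function name max_disjoint_matches is made up for this illustration. The sample tuples are a handful of the borneol matches quoted further down in this thread.

```python
def max_disjoint_matches(matches):
    """Return a largest list of matches that share no atom index."""
    best = []

    def search(start, chosen, used):
        nonlocal best
        # Remember the largest conflict-free selection found so far
        if len(chosen) > len(best):
            best = list(chosen)
        for i in range(start, len(matches)):
            atoms = set(matches[i])
            if used.isdisjoint(atoms):
                chosen.append(matches[i])
                search(i + 1, chosen, used | atoms)
                chosen.pop()

    search(0, [], set())
    return best


# A few of the 35 borneol matches reported by GetSubstructMatches()
borneol_matches = (
    (1, 2, 3, 4, 8),
    (1, 5, 4, 3, 9),
    (2, 1, 5, 6, 7),
    (3, 4, 5, 6, 7),
    (8, 3, 4, 9, 10),
    (10, 4, 5, 6, 7),
)

units = max_disjoint_matches(borneol_matches)
print(len(units), units)
```

The search is exhaustive, so its cost grows quickly with the number of matches; for larger terpenes a smarter strategy would probably be needed.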

On 06/05/2020 09:11, Quoc-Tuan DO wrote:

Dear Paolo,

Thank you again for your code. Sorry for bothering you again. It works 
all fine for monoterpenes but not for diterpenes, sesquiterpenes nor 
triterpenes.


pattern: C~C~C(~C)~C

mol1: CC(=O)O[C@H]1CC[C@]2([C@H](C1(C)C)CC=C([C@@H]2CC/C(=C/C(=O)O)/C)C)C

=> ((17, 18, 19, 20, 23), (16, 24, 13, 14, 15), (8, 9, 4, 12, 7))

It should find 4 distinct units.

mol2: OCC12CCC(C2C2C(CC1)(C)C1(C)CCC3C(C1CC2)(C)CCC(C3(C)C)O)C(=C)C

=> ((16, 25, 27, 17, 15), (18, 19, 12, 13, 14), (1, 2, 5, 6, 7))

It should find 6 distinct units.

I tried with a smarts version of the pattern 
[#6]~[#6]~[#6](~[#6])~[#6], but got the same results as with smiles.


What do you think? Is there something missing in the query?

Thanks for your time,

Best regards,

QT



On 05/05/2020 at 14:52, Paolo Tosco wrote:


Dear Quoc-Tuan,

this should do what you need:

https://gist.github.com/ptosco/dc4d27153e6e8e45aed654761e4d7409

Cheers,
p.






___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] write sdf properties without position number

2020-05-05 Thread Paolo Tosco


Hi Nicolas,

quick and dirty solution: strip it with a regex, e.g.

sed 's|^\(>  <.*>\) *([0-9]*)|\1|'
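
The same clean-up can be done from Python. A minimal sketch, assuming the header lines look like ">  <NAME>  (1)"; the property name chembl_id and the helper name strip_record_number are placeholders for this example. Unlike the sed one-liner, the pattern is anchored at the end of the line and also drops trailing blanks.

```python
import re

# ">  <NAME>" followed by optional blanks and a record number in parentheses
_HEADER = re.compile(r"^(>  <.*>) *\([0-9]*\)\s*$")


def strip_record_number(line):
    """Remove the '(n)' record counter from an SD data header line."""
    return _HEADER.sub(r"\1", line)


sdf_lines = [
    ">  <chembl_id>  (1) ",
    "CHEMBL123",
    "",
    "$$$$",
]
cleaned = [strip_record_number(line) for line in sdf_lines]
print(cleaned)
```

Lines that are not data headers pass through unchanged, so the function can be applied to every line of the file.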

HTH,
p.

On 05/05/2020 16:35, Nicolas Bosc wrote:

Hi RDKit users,

Writing molecules in a sdf with properties automatically add a number 
after the property name which is the position of the associated 
molecule in the file:

>   * (1) *
CHEMBL123

How can I change this so there is no number? The program that I am 
using to read the sdf file fails because of this...


Thanks,

Nicolas



Re: [Rdkit-discuss] GetSubstructMatches and unique match

2020-05-05 Thread Paolo Tosco


Dear Quoc-Tuan,

this should do what you need:

https://gist.github.com/ptosco/dc4d27153e6e8e45aed654761e4d7409

Cheers,
p.

On 05/05/2020 11:52, Quoc-Tuan DO wrote:


Dear Paolo,

Thank you for your reply.

I understand now... I did not use uniquify option first then only 
uniquify=True. I thought the default would be uniquify=False.


Actually my problem is to find 2 distinct units of isoprene (pattern) 
in the borneol (smiles) as the latter is a monoterpene.


Do you have any idea how I can do this?

Thanks in advance for your time.

Best regards,

QT



On 04/05/2020 at 19:53, Paolo Tosco wrote:


Dear Quoc-Tuan,

On 04/05/2020 09:10, Greenpharma S.A.S. wrote:


Dear All,

Please could you help with the following problem (I could not find 
answers in discussion list) ?


pattern='C~C~C(~C)~C'

smiles='O[C@H]1C[C@H]2C([C@@]1(C)CC2)(C)C'


pat = Chem.MolFromSmiles(pattern)
mol = Chem.MolFromSmiles(smiles)
res = mol.GetSubstructMatches(pat, uniquify=True)


The results are:

((1, 2, 3, 4, 8), (1, 5, 4, 3, 9), (1, 5, 4, 3, 10), (1, 5, 4, 9, 
10), (2, 1, 5, 4, 6), (2, 1, 5, 4, 7), (2, 1, 5, 6, 7), (2, 3, 4, 5, 
9), (2, 3, 4, 5, 10), (2, 3, 4, 9, 10), (3, 4, 5, 1, 6), (3, 4, 5, 
1, 7), (3, 4, 5, 6, 7), (5, 4, 3, 2, 8), (6, 5, 4, 3, 9), (6, 5, 4, 
3, 10), (6, 5, 4, 9, 10), (7, 5, 4, 3, 9), (7, 5, 4, 3, 10), (7, 5, 
4, 9, 10), (7, 8, 3, 2, 4), (8, 3, 4, 5, 9), (8, 3, 4, 5, 10), (8, 
3, 4, 9, 10), (8, 7, 5, 1, 4), (8, 7, 5, 1, 6), (8, 7, 5, 4, 6), (9, 
4, 3, 2, 8), (9, 4, 5, 1, 6), (9, 4, 5, 1, 7), (9, 4, 5, 6, 7), (10, 
4, 3, 2, 8), (10, 4, 5, 1, 6), (10, 4, 5, 1, 7), (10, 4, 5, 6, 7))



I expect to have only 2 matches with uniquify=True as I only have 2 
units of the pattern.


GetSubstructMatches() will report all matches of the pattern against 
your molecule. In your case, there are 35 matches which are all 
constituted by different atom indices.



Furthermore, with or without uniquify, I have the same answers.

If you set uniquify=False, you actually get 70 matches, so twice as 
many answers. This time, matches can be constituted by the same 
indices, provided they are in a different permutation.


I have uploaded a gist here:

https://gist.github.com/ptosco/6d70cec235361fbaddc7cbc2cf9c3b5d

that hopefully will make this clearer.

Cheers,
p.

I also expected that there should be 2 "independent" lists but here, 
there is always at least one common atom between each list.


Is there something misunderstood or misused?

Thanks in advance for your help and explanations.

Best regards,

Quoc-Tuan




Re: [Rdkit-discuss] GetSubstructMatches and unique match

2020-05-04 Thread Paolo Tosco


Dear Quoc-Tuan,

On 04/05/2020 09:10, Greenpharma S.A.S. wrote:


Dear All,

Please could you help with the following problem (I could not find 
answers in discussion list) ?


pattern='C~C~C(~C)~C'

smiles='O[C@H]1C[C@H]2C([C@@]1(C)CC2)(C)C'


pat = Chem.MolFromSmiles(pattern)
mol = Chem.MolFromSmiles(smiles)
res = mol.GetSubstructMatches(pat, uniquify=True)


The results are:

((1, 2, 3, 4, 8), (1, 5, 4, 3, 9), (1, 5, 4, 3, 10), (1, 5, 4, 9, 10), 
(2, 1, 5, 4, 6), (2, 1, 5, 4, 7), (2, 1, 5, 6, 7), (2, 3, 4, 5, 9), 
(2, 3, 4, 5, 10), (2, 3, 4, 9, 10), (3, 4, 5, 1, 6), (3, 4, 5, 1, 7), 
(3, 4, 5, 6, 7), (5, 4, 3, 2, 8), (6, 5, 4, 3, 9), (6, 5, 4, 3, 10), 
(6, 5, 4, 9, 10), (7, 5, 4, 3, 9), (7, 5, 4, 3, 10), (7, 5, 4, 9, 10), 
(7, 8, 3, 2, 4), (8, 3, 4, 5, 9), (8, 3, 4, 5, 10), (8, 3, 4, 9, 10), 
(8, 7, 5, 1, 4), (8, 7, 5, 1, 6), (8, 7, 5, 4, 6), (9, 4, 3, 2, 8), 
(9, 4, 5, 1, 6), (9, 4, 5, 1, 7), (9, 4, 5, 6, 7), (10, 4, 3, 2, 8), 
(10, 4, 5, 1, 6), (10, 4, 5, 1, 7), (10, 4, 5, 6, 7))



I expect to have only 2 matches with uniquify=True as I only have 2 
units of the pattern.


GetSubstructMatches() will report all matches of the pattern against 
your molecule. In your case, there are 35 matches which are all 
constituted by different atom indices.



Furthermore, with or without uniquify, I have the same answers.

If you set uniquify=False, you actually get 70 matches, so twice as many 
answers. This time, matches can be constituted by the same indices, 
provided they are in a different permutation.
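
To illustrate the difference in plain Python: with uniquify=True, two matches that cover the same set of atom indices in a different order count as one, which can be emulated by comparing matches as sets. This is only a sketch of the idea, not RDKit's implementation; the helper name uniquify_matches and the second tuple (a permutation of the first, written out by hand) are assumptions of this example.

```python
def uniquify_matches(matches):
    """Keep the first match seen for every distinct set of atom indices."""
    seen = set()
    unique = []
    for match in matches:
        key = frozenset(match)
        if key not in seen:
            seen.add(key)
            unique.append(match)
    return unique


all_matches = [
    (1, 5, 4, 9, 10),
    (1, 5, 4, 10, 9),  # same atoms, the two methyl groups swapped
    (1, 5, 4, 3, 9),   # different atom set: a genuinely distinct match
]
print(uniquify_matches(all_matches))
```

Because the two branch atoms of C~C~C(~C)~C are interchangeable, every atom set can be matched in two orders, which is consistent with 70 matches collapsing to 35.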


I have uploaded a gist here:

https://gist.github.com/ptosco/6d70cec235361fbaddc7cbc2cf9c3b5d

that hopefully will make this clearer.

Cheers,
p.

I also expected that there should be 2 "independent" lists but here, 
there is always at least one common atom between each list.


Is there something misunderstood or misused?

Thanks in advance for your help and explanations.

Best regards,

Quoc-Tuan




Re: [Rdkit-discuss] [RDKit-discuss] rdchem.ResonanceMolSupplier cause segfault on some inputs

2020-04-29 Thread Paolo Tosco


Dear Victor,

I have tested this on the latest RDKit trunk code and it does not 
segfault; I believe this is the same bug described here:


https://github.com/rdkit/rdkit/issues/3048

and it is already fixed.

Cheers,
p.

On 29/04/2020 17:24, victor viterbo via Rdkit-discuss wrote:


 RDKit  3D

 23 25  0  0  0  0  0  0  0  0999 V2000
    0.3053    3.6659    0.0544 O   0  0  0  0  0  0  0  0  0  0 0 0
   -0.1542    2.5093    0.0748 N   0  0  0  0  0  0  0  0  0  0 0 0
   -1.5618    2.3423    0.0180 O   0  0  0  0  0  0  0  0  0  0 0 0
    0.4179    1.3007    0.1430 C   0  0  0  0  0  0  0  0  0  0 0 0
    1.7461    0.8529    0.2172 C   0  0  0  0  0  0  0  0  0  0 0 0
    1.9320   -0.5143    0.2726 C   0  0  0  0  0  0  0  0  0  0 0 0
    0.8664   -1.4626    0.2581 C   0  0  0  0  0  0  0  0  0  0 0 0
   -0.4354   -1.0479    0.1863 C   0  0  0  0  0  0  0  0  0  0 0 0
   -0.6211    0.3406    0.1300 C   0  0  0  0  0  0  0  0  0  0 0 0
   -1.8191    1.0230    0.0528 C   0  0  0  0  0  0  0  0  0  0 0 0
   -3.0437    0.2952    0.0215 C   0  0  0  0  0  0  0  0  0  0 0 0
   -2.9565   -1.0485    0.0711 C   0  0  0  0  0  0  0  0  0  0 0 0
   -1.6963   -1.8573    0.1583 C   0  0  0  0  0  0  0  0  0  0 0 0
    3.3058   -1.0642    0.3525 C   0  0  0  0  0  0  0  0  0  0 0 0
    3.5605   -2.2494    0.4147 O   0  0  0  0  0  0  0  0  0  0 0 0
    4.2582   -0.1354    0.3501 O   0  0  0  0  0  0  0  0  0  0 0 0
    2.5743    1.5404    0.2298 H   0  0  0  0  0  0  0  0  0  0 0 0
    1.1370   -2.5054    0.3065 H   0  0  0  0  0  0  0  0  0  0 0 0
   -3.9845    0.8147   -0.0397 H   0  0  0  0  0  0  0  0  0  0 0 0
   -3.8648   -1.6307    0.0494 H   0  0  0  0  0  0  0  0  0  0 0 0
   -1.7515   -2.4890    1.0517 H   0  0  0  0  0  0  0  0  0  0 0 0
    5.1452   -0.5339    0.4038 H   0  0  0  0  0  0  0  0  0  0 0 0
   -1.6720   -2.5533   -0.6873 H   0  0  0  0  0  0  0  0  0  0 0 0
  2  1  1  0
 14  6  1  0
 15 14  2  0
 16 14  1  0
 17  5  1  0
 18  7  1  0
 19 11  1  0
 20 12  1  0
 21 13  1  0
 22 16  1  0
 23 13  1  0
  3 10  1  0
  4  2  1  0
  2  3  1  0
 10  9  1  0
  8 13  1  0
 13 12  1  0
 12 11  1  0
  9  4  4  0
  5  6  4  0
  7  8  4  0
  4  5  4  0
  6  7  4  0
  8  9  4  0
 10 11  2  0
V    1 O
M  END





Re: [Rdkit-discuss] Complications with ConstrainedEmbed

2020-04-18 Thread Paolo Tosco

Hi Tim,

it's in mol.__sssAtoms, but that information is available only if you are
running code in the Jupyter notebook with the IPythonConsole extension
installed; see

https://github.com/rdkit/rdkit/blob/39bcee635e0ee8bc5da6798318fdcd4602c4baa6/rdkit/Chem/Draw/IPythonConsole.py#L142
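
A minimal sketch of reading that attribute defensively, so that the same code also runs outside Jupyter, where the attribute is simply absent. The helper name highlighted_atoms is made up for this example, and the SimpleNamespace objects merely stand in for RDKit molecules.

```python
from types import SimpleNamespace


def highlighted_atoms(mol):
    """Return the atom indices stored in mol.__sssAtoms, or [] if unset."""
    return list(getattr(mol, "__sssAtoms", None) or [])


# Stand-ins for a molecule after and before a substructure search in Jupyter
searched = SimpleNamespace(**{"__sssAtoms": [3, 4, 8]})
untouched = SimpleNamespace()

print(highlighted_atoms(searched))
print(highlighted_atoms(untouched))
```

Going through getattr() with a string also sidesteps Python's name mangling, which would rewrite mol.__sssAtoms if that expression appeared inside a class body.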

Cheers,
p.

On 18/04/2020 16:55, Tim Dudgeon wrote:

On 18/04/2020 11:56, Paolo Tosco wrote:

Hi Tim,

mol.GetSubstructMatch(query) will give you indices in mol that match
query.

Yes, I can re-calculate it, but the fact that it's getting highlighted
in Jupyter suggests that the mol already has that info so it shouldn't
need recalculating. But I can't find it.

Also note that rdFMCS.MCSResult has a queryMol property that
encodes the MCS query, so you don't need to rebuild the query molecule
out of the SMARTS pattern.

Good. I'll use that.

On 18/04/2020 10:27, Tim Dudgeon wrote:

I also updated the Jupyter notebook with the solution.

Out of interest, I now need to get the atom indices of the part of
the molecule that matched.
As Jupyter is nicely highlighting this that must already be present
in the molecule somehow, but I can't find out how.

I look at molecule and atom properties but can't find anything that
suggests "highlight me".

How is this encoded?

Tim

On 17/04/2020 19:02, Paolo Tosco wrote:

Hi Tim,

I’ll take a look later and get back to you.

Cheers,
p.

On 17 Apr 2020, at 18:55, Tim Dudgeon wrote:

I'm wanting to use AllChem.ConstrainedEmbed() to generate a
conformer of a molecule tethered to a molecule that should always
have some MCS. I found some code on the internet that mostly
works, but I don't fully understand.

It generally works as planned, but for a small number of examples
it fails.

Can someone guide me to what is wrong. Here is an example (good
and bad):

https://github.com/tdudgeon/jupyter_mpro/blob/master/tethering.ipynb


Re: [Rdkit-discuss] Complications with ConstrainedEmbed

2020-04-18 Thread Paolo Tosco


Hi Tim,

mol.GetSubstructMatch(query) will give you indices in mol that match query.

Also note that rdFMCS.MCSResult has a queryMol property that 
encodes the MCS query, so you don't need to rebuild the query molecule 
out of the SMARTS pattern.


p.

On 18/04/2020 10:27, Tim Dudgeon wrote:

I also updated the Jupyter notebook with the solution.

Out of interest, I now need to get the atom indices of the part of the 
molecule that matched.
As Jupyter is nicely highlighting this that must already be present in 
the molecule somehow, but I can't find out how.


I look at molecule and atom properties but can't find anything that 
suggests "highlight me".


How is this encoded?

Tim

On 17/04/2020 19:02, Paolo Tosco wrote:

Hi Tim,

I’ll take a look later and get back to you.

Cheers,
p.


On 17 Apr 2020, at 18:55, Tim Dudgeon  wrote:

I'm wanting to use AllChem.ConstrainedEmbed() to generate a 
conformer of a molecule tethered to a molecule that should always 
have some MCS. I found some code on the internet that mostly works, 
but I don't fully understand.


It generally works as planned, but for a small number of examples it 
fails.


Can someone guide me to what is wrong. Here is an example (good and 
bad):


https://github.com/tdudgeon/jupyter_mpro/blob/master/tethering.ipynb





Re: [Rdkit-discuss] Complications with ConstrainedEmbed

2020-04-17 Thread Paolo Tosco

Hi Tim,

I’ll take a look later and get back to you.

Cheers,
p.

> On 17 Apr 2020, at 18:55, Tim Dudgeon  wrote:
> 
> I'm wanting to use AllChem.ConstrainedEmbed() to generate a conformer of a 
> molecule tethered to a molecule that should always have some MCS. I found 
> some code on the internet that mostly works, but I don't fully understand.
> 
> It generally works as planned, but for a small number of examples it fails.
> 
> Can someone guide me to what is wrong. Here is an example (good and bad):
> 
> https://github.com/tdudgeon/jupyter_mpro/blob/master/tethering.ipynb
> 
> 
> 
> 

Re: [Rdkit-discuss] Compilation problems on Linux

2020-04-15 Thread Paolo Tosco


Hi Max,

please share the cmake/make outputs with me off-list and I'll try to help.

Cheers,
p.

On 15/04/2020 17:33, Max Pinheiro Jr wrote:

Hi Paolo,

Thank you for your quite fast answer! Yes, I compiled Boost 1.67 using 
the same gcc version, 8.1. I have seen this GLIBCXX possible solution 
that you have commented before, and I also tried that but didn't work 
anyway, I got the same problem with the Boost library and the 
compilation can't finish. I am wondering if may exist any other 
solution. I can also provide some other specific information if this 
would help to map the problem and find a solution.

Thank you again!
Max Pinheiro Jr

On Wed, 15 Apr 2020 at 18:25, Paolo Tosco 
<paolo.tosco.m...@gmail.com> wrote:


Hi Max,

you mention you are using gcc-8.1 and Boost 1.67. Did you compile
Boost with the same compiler or was it compiled with an earlier
version of gcc/g++?

If Boost was compiled with an earlier version of gcc/g++, you will
need to add to /home/mpinheiro/codes/rdkit-2020.09/CMakeLists.txt
the following line:

add_definitions("-D_GLIBCXX_USE_CXX11_ABI=0")

or the linker will fail during the compilation; see
https://github.com/rdkit/rdkit/issues/2013#issuecomment-553563418.

HTH, cheers
p.

On 15/04/2020 17:15, Max Pinheiro Jr wrote:

Dear all,

I have exhaustively tried to compile rdkit (latest git version)
on a Linux cluster but the compilation process was always failing
at the same point with an error message related to the boost
library. After searching in the forum, the only way I could
surpass the problem and finally get the program compiled was
setting the flag "RDK_USE_BOOST_SERIALIZATION" to OFF. However,
when I do a simple test trying to import the Chem module I get
the following error:



from rdkit import Chem
Traceback (most recent call last):
  File "", line 1, in <module>
  File "/home/mpinheiro/codes/rdkit-2020.09/rdkit/Chem/__init__.py", line 20, in <module>
    from rdkit.Chem import rdchem
SystemError: initialization of rdchem raised unreported exception



I am using gcc-8.1, cmake-3.11.2 and the version 1.67 of boost
library to build RDKit. The compilation instructions I have used
are the following:

cmake -DPy_ENABLE_SHARED=1 \
      -DRDK_INSTALL_INTREE=ON \
      -DRDK_BUILD_CPP_TESTS=ON \
      -DRDK_INSTALL_STATIC_LIBS=ON \
      -DRDK_BUILD_AVALON_SUPPORT=ON \
      -DRDK_BUILD_CAIRO_SUPPORT=ON \
      -DRDK_BUILD_INCHI_SUPPORT=ON \
      -DRDK_BUILD_PYTHON_WRAPPERS=ON \
      -DRDK_BUILD_SWIG_CSHARP_WRAPPER=ON \
      -DPYTHON_EXECUTABLE=/home/mpinheiro/.pyenv/versions/3.8.2/bin/python \
      -DPYTHON_LIBRARY=/home/mpinheiro/.pyenv/versions/3.8.2/lib/libpython3.8.a \
      -DPYTHON_INCLUDE_DIR=/home/mpinheiro/.pyenv/versions/3.8.2/include/python3.8 \
      -DPYTHON_NUMPY_INCLUDE_PATH="$(python -c 'import numpy ; print(numpy.get_include())')" \
      -DBOOST_ROOT=/home/mpinheiro/codes/boost-1.67/ \
      -DBOOST_INCLUDEDIR=/home/mpinheiro/codes/boost-1.67/include/boost \
      -DBOOST_LIBRARYDIR=/home/mpinheiro/codes/boost-1.67/lib ..

make -j 4 > make.log
make install

I have also checked the links created in the rdBase.so file as
shown below and everything seems to be fine:

 linux-vdso.so.1 =>  (0x2aaab000)
libRDKitRDBoost.so.1 => /home/mpinheiro/codes/rdkit-2020.09/lib/libRDKitRDBoost.so.1 (0x2adb1000)
libboost_python38.so.1.67.0 => /home/mpinheiro/codes/boost-1.67/lib/libboost_python38.so.1.67.0 (0x2afb5000)
libRDKitRDGeneral.so.1 => /home/mpinheiro/codes/rdkit-2020.09/lib/libRDKitRDGeneral.so.1 (0x2b1fb000)
libpthread.so.0 => /usr/lib64/libpthread.so.0 (0x2b423000)
libstdc++.so.6 => /trinity/shared/apps/custom/x86_64/gcc-8.1.0/lib64/libstdc++.so.6 (0x2b64)
libm.so.6 => /usr/lib64/libm.so.6 (0x2b9c4000)
libgcc_s.so.1 => /trinity/shared/apps/custom/x86_64/gcc-8.1.0/lib64/libgcc_s.so.1 (0x2bcc6000)
libc.so.6 => /usr/lib64/libc.so.6 (0x2bedf000)
librt.so.1 => /usr/lib64/librt.so.1 (0x2c2a2000)
libdl.so.2 => /usr/lib64/libdl.so.2 (0x2c4aa000)
libutil.so.1 => /usr/lib64/libutil.so.1 (0x2c6af000)
/lib64/ld-linux-x86-64.so.2 (0x4000)

As I said, I have tried many different tricks and suggestions
that I was able to find in the forum but none of them effectively
solved my problem to get the code working. So I would like to ask
you if someone

Re: [Rdkit-discuss] Compilation problems on Linux

2020-04-15 Thread Paolo Tosco


Hi Max,

you mention you are using gcc-8.1 and Boost 1.67. Did you compile Boost 
with the same compiler or was it compiled with an earlier version of 
gcc/g++?


If Boost was compiled with an earlier version of gcc/g++, you will need 
to add to /home/mpinheiro/codes/rdkit-2020.09/CMakeLists.txt the 
following line:


add_definitions("-D_GLIBCXX_USE_CXX11_ABI=0")

or the linker will fail during the compilation; see 
https://github.com/rdkit/rdkit/issues/2013#issuecomment-553563418.


HTH, cheers
p.

On 15/04/2020 17:15, Max Pinheiro Jr wrote:

Dear all,

I have exhaustively tried to compile rdkit (latest git version) on a 
Linux cluster but the compilation process was always failing at the 
same point with an error message related to the boost library. After 
searching in the forum, the only way I could surpass the problem and 
finally get the program compiled was setting the flag 
"RDK_USE_BOOST_SERIALIZATION" to OFF. However, when I do a simple test 
trying to import the Chem module I get the following error:



from rdkit import Chem
Traceback (most recent call last):
  File "", line 1, in <module>
  File "/home/mpinheiro/codes/rdkit-2020.09/rdkit/Chem/__init__.py", line 20, in <module>
    from rdkit.Chem import rdchem
SystemError: initialization of rdchem raised unreported exception


I am using gcc-8.1, cmake-3.11.2 and the version 1.67 of boost library 
to build RDKit. The compilation instructions I have used are the 
following:


cmake -DPy_ENABLE_SHARED=1 \
      -DRDK_INSTALL_INTREE=ON \
      -DRDK_BUILD_CPP_TESTS=ON \
      -DRDK_INSTALL_STATIC_LIBS=ON \
      -DRDK_BUILD_AVALON_SUPPORT=ON \
      -DRDK_BUILD_CAIRO_SUPPORT=ON \
      -DRDK_BUILD_INCHI_SUPPORT=ON \
      -DRDK_BUILD_PYTHON_WRAPPERS=ON \
      -DRDK_BUILD_SWIG_CSHARP_WRAPPER=ON \
      -DPYTHON_EXECUTABLE=/home/mpinheiro/.pyenv/versions/3.8.2/bin/python \
      -DPYTHON_LIBRARY=/home/mpinheiro/.pyenv/versions/3.8.2/lib/libpython3.8.a \
      -DPYTHON_INCLUDE_DIR=/home/mpinheiro/.pyenv/versions/3.8.2/include/python3.8 \
      -DPYTHON_NUMPY_INCLUDE_PATH="$(python -c 'import numpy ; print(numpy.get_include())')" \
      -DBOOST_ROOT=/home/mpinheiro/codes/boost-1.67/ \
      -DBOOST_INCLUDEDIR=/home/mpinheiro/codes/boost-1.67/include/boost \
      -DBOOST_LIBRARYDIR=/home/mpinheiro/codes/boost-1.67/lib ..

make -j 4 > make.log
make install

I have also checked the links created in the rdBase.so file as shown 
below and everything seems to be fine:


 linux-vdso.so.1 =>  (0x2aaab000)
libRDKitRDBoost.so.1 => /home/mpinheiro/codes/rdkit-2020.09/lib/libRDKitRDBoost.so.1 (0x2adb1000)
libboost_python38.so.1.67.0 => /home/mpinheiro/codes/boost-1.67/lib/libboost_python38.so.1.67.0 (0x2afb5000)
libRDKitRDGeneral.so.1 => /home/mpinheiro/codes/rdkit-2020.09/lib/libRDKitRDGeneral.so.1 (0x2b1fb000)
libpthread.so.0 => /usr/lib64/libpthread.so.0 (0x2b423000)
libstdc++.so.6 => /trinity/shared/apps/custom/x86_64/gcc-8.1.0/lib64/libstdc++.so.6 (0x2b64)
libm.so.6 => /usr/lib64/libm.so.6 (0x2b9c4000)
libgcc_s.so.1 => /trinity/shared/apps/custom/x86_64/gcc-8.1.0/lib64/libgcc_s.so.1 (0x2bcc6000)
libc.so.6 => /usr/lib64/libc.so.6 (0x2bedf000)
librt.so.1 => /usr/lib64/librt.so.1 (0x2c2a2000)
libdl.so.2 => /usr/lib64/libdl.so.2 (0x2c4aa000)
libutil.so.1 => /usr/lib64/libutil.so.1 (0x2c6af000)
/lib64/ld-linux-x86-64.so.2 (0x4000)

As I said, I have tried many different tricks and suggestions that I 
was able to find in the forum but none of them effectively solved my 
problem to get the code working. So I would like to ask you if someone 
has faced a similar problem and may already have some tips on how to 
fix it. I will really appreciate any help you can provide on this issue.


Thanks!

Max Pinheiro Jr



Re: [Rdkit-discuss] Help mapping atoms between two files

2020-04-04 Thread Paolo Tosco


Hi Gustavo,

that's because a2.pdb has formal charges whereas a1.pdb has none. If you 
neutralize a2 then you'll find a match.
You could use Chem.MolStandardize.rdMolStandardize.Uncharger or 
something as below:


from rdkit import Chem

a1 = Chem.MolFromPDBFile("a1.pdb")
a2 = Chem.MolFromPDBFile("a2.pdb")

# Reset every non-zero formal charge so that a2 matches the neutral a1
for a in a2.GetAtoms():
    if a.GetFormalCharge():
        a.SetFormalCharge(0)
Chem.SanitizeMol(a2)

print(a1.GetSubstructMatch(a2))

(7, 11, 13, 15, 3, 4, 17, 19, 25, 0, 23, 22, 10, 12, 14, 16, 9, 6, 26, 18, 5, 
8, 20, 1, 21, 24, 2)

HTH, cheers
p.

On 04/04/2020 18:54, Gustavo Seabra wrote:

Hi all,

I'm trying to get the substructure matches between two different PDB
files with the same molecule, but different atom order and naming. However,
GetSubstructMatches just returns nothing, i.e. no matches (files attached):

For example:

ref_mol = Chem.MolFromPDBFile(str("a1.pdb"))
tgt_mol = Chem.MolFromPDBFile(str("a2.pdb"))
ref_mol.GetNumAtoms(),tgt_mol.GetNumAtoms()

(27, 27)


ref_mol.GetSubstructMatches(tgt_mol)

()


ref_mol.HasSubstructMatch(tgt_mol)

False

Could anyone here suggest a different way to get the atom mapping between
the two molecules?

Thanks a lot,
--
Gustavo Seabra




Re: [Rdkit-discuss] How to sort a list of mol objects

2020-04-03 Thread Paolo Tosco


Hi Zhenting,

you mean sort by a property value?
For example, this would sort your list by decreasing value of the 
ACTIVITY property (e.g., assuming it is a pIC50):


mol_list.sort(key=lambda m: float(m.GetProp("ACTIVITY")), reverse=True)
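
To then keep only the best 100 molecules, the sorted list can be sliced. A minimal sketch, in which FakeMol is a stand-in exposing only GetProp() so that the example runs without RDKit, and top_n is a helper name made up here. Molecules lacking the ACTIVITY property would need to be filtered out first (e.g. with HasProp()).

```python
class FakeMol:
    """Stand-in for an RDKit Mol that only supports GetProp()."""

    def __init__(self, **props):
        self._props = props

    def GetProp(self, name):
        return self._props[name]


def top_n(mols, prop, n=100):
    """Return the n molecules with the highest numeric value of prop."""
    return sorted(mols, key=lambda m: float(m.GetProp(prop)), reverse=True)[:n]


mol_list = [FakeMol(ACTIVITY=v) for v in ("6.2", "8.9", "7.5")]
best = top_n(mol_list, "ACTIVITY", n=2)
print([m.GetProp("ACTIVITY") for m in best])
```

Unlike list.sort(), sorted() leaves the original list untouched, which may be convenient if the full set is still needed afterwards.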

Cheers,
p.

On 03/04/2020 16:38, Zhenting Gao wrote:

Hi,

I create a list of mol objects, now I want to sort these mol objects, 
so that I can save the top 100 mols into an SDF file.

Could you help?

Best
Zhenting
4/3/2020



Re: [Rdkit-discuss] problem building from source - ‘CT_CHIRALITY_PROP_PREFIX’ is not a member of ‘schrodinger::mae’

2020-03-30 Thread Paolo Tosco


Hi Tim,

try rm -rf External/CoordGen/coordgen* External/CoordGen/maeparser*

you might have some outdated coordgen libs. Deleting those and 
re-running cmake will download them afresh.


Cheers,
p.

On 30/03/2020 14:51, Tim Dudgeon wrote:

I'm finding an error building from source (on master branch).

Any ideas?

cmake -DPYTHON_EXECUTABLE=/usr/bin/python3 
-DRDK_BUILD_INCHI_SUPPORT=ON  -DRDK_BUILD_AVALON_SUPPORT=ON 
-DRDK_BUILD_PYTHON_WRAPPERS=ON  -DRDK_BUILD_SWIG_WRAPPERS=ON  ..

make -j 8


[ 66%] Building CXX object Code/GraphMol/FileParsers/CMakeFiles/FileParsers_static.dir/TplFileParser.cpp.o
[ 66%] Building CXX object Code/GraphMol/FileParsers/CMakeFiles/FileParsers_static.dir/TplFileWriter.cpp.o
[ 67%] Building CXX object Code/GraphMol/FileParsers/CMakeFiles/FileParsers_static.dir/PDBParser.cpp.o
[ 67%] Building CXX object Code/GraphMol/FileParsers/CMakeFiles/FileParsers_static.dir/PDBWriter.cpp.o
[ 67%] Building CXX object Code/GraphMol/FileParsers/CMakeFiles/FileParsers_static.dir/PDBSupplier.cpp.o
[ 67%] Building CXX object Code/GraphMol/FileParsers/CMakeFiles/FileParsers_static.dir/XYZFileWriter.cpp.o
[ 67%] Building CXX object Code/GraphMol/FileParsers/CMakeFiles/FileParsers_static.dir/MaeMolSupplier.cpp.o
[ 67%] Building CXX object Code/GraphMol/FileParsers/CMakeFiles/FileParsers_static.dir/ProximityBonds.cpp.o
[ 67%] Building CXX object Code/GraphMol/FileParsers/CMakeFiles/FileParsers_static.dir/SequenceParsers.cpp.o
[ 67%] Building CXX object Code/GraphMol/FileParsers/CMakeFiles/FileParsers_static.dir/SequenceWriters.cpp.o
[ 67%] Building CXX object Code/GraphMol/FileParsers/CMakeFiles/FileParsers_static.dir/SVGParser.cpp.o
/home/timbo/github/rdkit/rdkit/Code/GraphMol/FileParsers/MaeMolSupplier.cpp: In function ‘void RDKit::{anonymous}::set_mol_properties(RDKit::RWMol&, const schrodinger::mae::Block&)’:
/home/timbo/github/rdkit/rdkit/Code/GraphMol/FileParsers/MaeMolSupplier.cpp:214:37: error: ‘CT_CHIRALITY_PROP_PREFIX’ is not a member of ‘schrodinger::mae’
  214 | } else if (prop.first.find(mae::CT_CHIRALITY_PROP_PREFIX) == 0 ||
  | ^~~~
/home/timbo/github/rdkit/rdkit/Code/GraphMol/FileParsers/MaeMolSupplier.cpp:215:37: error: ‘CT_PSEUDOCHIRALITY_PROP_PREFIX’ is not a member of ‘schrodinger::mae’
  215 | prop.first.find(mae::CT_PSEUDOCHIRALITY_PROP_PREFIX) == 0) {
  | ^~
/home/timbo/github/rdkit/rdkit/Code/GraphMol/FileParsers/MaeMolSupplier.cpp:217:37: error: ‘CT_EZ_PROP_PREFIX’ is not a member of ‘schrodinger::mae’
  217 | } else if (prop.first.find(mae::CT_EZ_PROP_PREFIX) == 0) {
  | ^
make[2]: *** [Code/GraphMol/FileParsers/CMakeFiles/FileParsers_static.dir/build.make:310: Code/GraphMol/FileParsers/CMakeFiles/FileParsers_static.dir/MaeMolSupplier.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs
make[1]: *** [CMakeFiles/Makefile2:7238: Code/GraphMol/FileParsers/CMakeFiles/FileParsers_static.dir/all] Error 2
make: *** [Makefile:163: all] Error 2





Re: [Rdkit-discuss] Closing a file opened by Chem.SDMolSupplier?

2020-03-29 Thread Paolo Tosco

Hi Jean-Marc,

tempfn.close() should allow you to delete the file.
A better alternative would be using

with tempfile.TemporaryFile() as tempfn:
    ...

as the file will be automatically closed and deleted for you as soon as it goes 
out of scope at the end of the context manager block.
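
A minimal sketch of the context-manager approach using only the standard library. It uses tempfile.TemporaryDirectory rather than TemporaryFile, an assumption made here because SDWriter and SDMolSupplier in the original script take a file name; the file name benzene.sdf and the plain-text payload are placeholders.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "benzene.sdf")
    # Write, then read back, closing each handle before the next step
    with open(path, "w") as hnd:
        hnd.write("benzene\n")
    with open(path) as hnd:
        name = hnd.read().strip()
    existed_inside = os.path.exists(path)

# The directory and everything in it are gone once the block has ended
print(name, existed_inside, os.path.exists(path))
```

On Windows the clean-up can only succeed if every handle on the file has been released before the block ends, which presumably also applies to a handle held by a reader object.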

Cheers,
p.

> On 29 Mar 2020, at 13:44, Jean-Marc Nuzillard  
> wrote:
> 
> Dear all,
> 
> The following code:
> __
> from rdkit import Chem
> import os
> import tempfile
> 
> # create temp .sdf file with a benzene molecule inside
> m1 = Chem.MolFromSmiles('c1ccccc1')
> m1.SetProp('_Name', 'benzene')
> tempfn = tempfile.mktemp('.sdf')
> writer = Chem.SDWriter(tempfn)
> writer.write(m1)
> writer.close()
> 
> # read the molecule from temp .sdf file and print compound name
> reader = Chem.SDMolSupplier(tempfn)
> m2 = reader[0]
> print(m2.GetProp('_Name'))
> 
> # unlink temp file
> os.unlink(tempfn)
> __
> 
> running it in windows gives:
> 
> benzene
> Traceback (most recent call last):
>   File "rdclose.py", line 21, in <module>
> os.unlink(tempfn)
> PermissionError: [WinError 32] The process cannot access the file because it 
> is being used by another process:
> 'C:\\Users\\jmn\\AppData\\Local\\Temp\\tmpojn1j4nb.sdf'
> 
> I suspect the reason is that reader file is not closed.
> Trying reader.close() results in a message saying that reader has no close() 
> method.
> 
> Thank you in advance for helping me to get rid of the temp file.
> 
> Take care,
> 
> Jean-Marc Nuzillard
> 
> -- 
> Jean-Marc Nuzillard
> Directeur de Recherches au CNRS
> 
> Institut de Chimie Moléculaire de Reims
> CNRS UMR 7312
> Moulin de la Housse
> CPCBAI, Bâtiment 18
> BP 1039
> 51687 REIMS Cedex 2
> France
> 
> Tel : 03 26 91 82 10
> Fax : 03 26 91 31 66
> http://www.univ-reims.fr/icmr
> http://eos.univ-reims.fr/LSD/CSNteam.html
> 
> http://www.univ-reims.fr/LSD/
> http://www.univ-reims.fr/LSD/JmnSoft/
> 
> 
> 

Re: [Rdkit-discuss] Save clusters

2020-03-23 Thread Paolo Tosco


Hi Francesco,

np.savetxt() expects an array of numbers, while you have an array of 
tuples (i.e., the individual clusters), hence the error.


You don't actually need numpy to save an array in human-readable, text 
format. You might store your array in JSON format:


import json

with open("c:/temp/clusters.json", "w") as hnd:
    json.dump(clusters, hnd)

And then restore it as a tuple of tuples, as it originally was:

clusters = None
with open("c:/temp/clusters.json", "r") as hnd:
    clusters = tuple(map(tuple, json.load(hnd)))

You might also store the array in its string representation...

from ast import literal_eval

with open("c:/temp/clusters.txt", "w") as hnd:
    hnd.write(str(clusters) + "\n")

...and then restore it using ast.literal_eval():

clusters = None
with open("c:/temp/clusters.txt", "r") as hnd:
    clusters = literal_eval(hnd.read())
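
If you would rather have a file that is easy to read in a text editor or a spreadsheet, a third option is to write one cluster per line. The following is only a minimal sketch: the helper names save_clusters() and load_clusters() are made up for this example, and the file is written to a temporary directory rather than to c:/temp.

```python
import os
import tempfile


def save_clusters(clusters, path):
    """Write one cluster per line, member indices separated by spaces."""
    with open(path, "w") as hnd:
        for cluster in clusters:
            hnd.write(" ".join(str(idx) for idx in cluster) + "\n")


def load_clusters(path):
    """Read the file back into a tuple of tuples of ints."""
    with open(path, "r") as hnd:
        return tuple(
            tuple(int(tok) for tok in line.split())
            for line in hnd
            if line.strip()
        )


# a few clusters in the same shape that Butina.ClusterData() returns
clusters = ((17, 15, 46), (13, 4, 8), (91, 53), (7,))

path = os.path.join(tempfile.mkdtemp(), "clusters.txt")
save_clusters(clusters, path)
restored = load_clusters(path)
print(restored == clusters)
```

Unlike np.savetxt(), this does not require all rows to have the same length, which is the property that clusters of different sizes lack.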

HTH, cheers,
p.

On 23/03/2020 13:29, Francesco Coppola wrote:

Hello everyone,

I have a small problem with saving a job. With the fingerprints of a 
database of molecules, I made the clusters. It works, I see them, but 
*how can I save it*?


>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>> def ClusterFps(fps, cutoff=0.2):
...     from rdkit import DataStructs
...     from rdkit.ML.Cluster import Butina
...     dists=[]
...     nfps=len(fps)
...     for i in range(1, nfps):
...             sims=DataStructs.BulkTanimotoSimilarity(fps[i], fps[:i])
...             dists.extend([1-x for x in sims])
...     cs=Butina.ClusterData(dists, nfps, cutoff, isDistData=True)
...     return cs
...
>>> ms = [x for x in Chem.SDMolSupplier(r'C:\Users\HP\100.sdf',removeHs=False)]

>>> len(ms)
100
>>> fps = [AllChem.GetMorganFingerprintAsBitVect(x,2,1024) for x in ms]
>>> clusters=ClusterFps(fps,cutoff=0.4)
>>> print(clusters[1])
(13, 4, 8)
>>> print(clusters)
((17, 15, 46), (13, 4, 8), (91, 53), (78, 76), (64, 42), (59, 58),
(44, 43), (39, 38), (31, 30), (25, 24), (7,), (99,), (98,), (97,),
(96,), (95,), (94,), (93,), (92,), (90,), (89,), (88,), (87,), (86,),
(85,), (84,), (83,), (82,), (81,), (80,), (79,), (77,), (75,), (74,),
(73,), (72,), (71,), (70,), (69,), (68,), (67,), (66,), (65,), (63,),
(62,), (61,), (60,), (57,), (56,), (55,), (54,), (52,), (51,), (50,),
(49,), (48,), (47,), (45,), (41,), (40,), (37,), (36,), (35,), (34,),
(33,), (32,), (29,), (28,), (27,), (26,), (23,), (22,), (21,), (20,),
(19,), (18,), (16,), (14,), (12,), (11,), (10,), (9,), (6,), (5,),
(3,), (2,), (1,), (0,))


*If I try to use:*
>>> np.savetxt("DB_Clusters", clusters, delimiter="     ")
Traceback (most recent call last):
  File "C:\Anaconda3\envs\py37_rdkit\lib\site-packages\numpy\lib\npyio.py", line 1447, in savetxt
    v = format % tuple(row) + newline
TypeError: must be real number, not tuple

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<__array_function__ internals>", line 6, in savetxt
  File "C:\Anaconda3\envs\py37_rdkit\lib\site-packages\numpy\lib\npyio.py", line 1451, in savetxt
    % (str(X.dtype), format))
TypeError: Mismatch between array dtype ('object') and format specifier ('%.18e')


*Then I thought I hadn't imported Numpy, but the problem was not 
resolved.*

>>> import numpy as np
>>> np.savetxt("DB_Clu.txt", clusters, delimiter="      ")
Traceback (most recent call last):
  File "C:\Anaconda3\envs\py37_rdkit\lib\site-packages\numpy\lib\npyio.py", line 1447, in savetxt
    v = format % tuple(row) + newline
TypeError: must be real number, not tuple

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<__array_function__ internals>", line 6, in savetxt
  File "C:\Anaconda3\envs\py37_rdkit\lib\site-packages\numpy\lib\npyio.py", line 1451, in savetxt
    % (str(X.dtype), format))
TypeError: Mismatch between array dtype ('object') and format specifier ('%.18e')


*The problem is that I can't use this function to save clusters?
How can I save the results with the clusters?*
Sorry for the trouble,

Best regards,
Francesco


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] MMFF Atoms type definitions

2020-03-16 Thread Paolo Tosco


Hi Omar,

MMFF94 definitions are not encoded via SMARTS patterns. Definitions are 
stored in Code/ForceField/MMFF/Params.cpp (defaultMMFFDef).


You may find them here in a more human-readable form:

https://github.com/openbabel/openbabel/blob/master/data/mmffdef.par
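
To turn the numbers into something readable from Python you could keep a small lookup table next to the call that returns the numeric types. The sketch below is only an illustration: the MMFF_SYMBOLS dictionary is a hand-copied subset of four MMFF94 symbolic types (it is not something RDKit provides), and ethanol is just an example molecule.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Hand-copied subset of the MMFF94 numeric type -> symbol mapping;
# this dictionary is an assumption of this example, not an RDKit object.
MMFF_SYMBOLS = {
    1: "CR",    # alkyl carbon
    5: "HC",    # hydrogen attached to carbon
    6: "OR",    # alcohol or ether oxygen
    21: "HOR",  # hydroxyl hydrogen in alcohols
}

# MMFF typing needs real hydrogen atoms in the molecule graph
mol = Chem.AddHs(Chem.MolFromSmiles("CCO"))
props = AllChem.MMFFGetMoleculeProperties(mol)

types = [props.GetMMFFAtomType(a.GetIdx()) for a in mol.GetAtoms()]
for atom, t in zip(mol.GetAtoms(), types):
    print(atom.GetIdx(), atom.GetSymbol(), t, MMFF_SYMBOLS.get(t, "?"))
```

For anything beyond a handful of types it is probably easier to parse the symbol column out of the file linked above than to maintain such a table by hand.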

Cheers,
p.

On 16/03/2020 08:27, Omar H94 wrote:

Dear RDKit users,

The function *GetMMFFAtomType* from the *rdForceField *module returns 
a number describing the assigned atom type by the forcefield of the 
queried atom. I want to get the definition of the atom type which the 
number represents. I wonder if there's a dictionary or a documentation 
that contains the atom type meaning or SMARTS of each number?


Thanks,
Omar


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] Building RDKit from source under Ubuntu VM, ctest 87 tests failed out of 165. Any advice much appreciated.

2020-03-13 Thread Paolo Tosco


Dear Earl,

it looks like you might only need to add $RDBASE to your PYTHONPATH.
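
A quick way to check whether that is the problem is to ask the same interpreter that ctest uses (/home/deep/anaconda3/bin/python in your log) whether it can locate the rdkit package. This is only a small diagnostic sketch; the function name describe_rdkit_import() is made up for this example.

```python
import importlib.util
import os
import sys


def describe_rdkit_import():
    """Collect what this interpreter knows about locating the rdkit package."""
    spec = importlib.util.find_spec("rdkit")
    return {
        "interpreter": sys.executable,
        "RDBASE": os.environ.get("RDBASE"),
        "PYTHONPATH": os.environ.get("PYTHONPATH"),
        "rdkit_found": spec is not None,
        "rdkit_location": spec.origin if spec is not None else None,
    }


info = describe_rdkit_import()
for key, value in info.items():
    print(f"{key}: {value}")
```

If rdkit_found is False while RDBASE is set, then the directory stored in RDBASE is most likely missing from PYTHONPATH.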

p.

On 13/03/2020 16:38, Earl Higgins wrote:


Paolo,

Thank you for your rapid reply. It's a great suggestion, we are 
getting somewhere. I ran the first Python test with the "-V" 
(verbose?) option and, as you can see from the output below, Python is 
having a problem finding the rdkit module. Do you have any suggestions 
on how to fix this? Thank you so much.


Earl Higgins

$ RDBASE=~/conda-rdkit/rdkit ctest -I 2,2 -V

UpdateCTestConfiguration  from 
:/home/deep/conda-rdkit/rdkit/build/DartConfiguration.tcl


Parse Config file:/home/deep/conda-rdkit/rdkit/build/DartConfiguration.tcl

Add coverage exclude regular expressions.

SetCTestConfiguration:CMakeCommand:/home/deep/anaconda3/bin/cmake

UpdateCTestConfiguration  from 
:/home/deep/conda-rdkit/rdkit/build/DartConfiguration.tcl


Parse Config file:/home/deep/conda-rdkit/rdkit/build/DartConfiguration.tcl

Test project /home/deep/conda-rdkit/rdkit/build

Constructing a list of tests

Done constructing a list of tests

Updating test list for fixtures

Added 0 tests to meet fixture requirements

Checking test dependency graph...

Checking test dependency graph end

test 2

    Start 2: pyCoordGen

2: Test command: /home/deep/anaconda3/bin/python "/home/deep/conda-rdkit/rdkit/External/CoordGen/Wrap/testCoordGen.py"

2: Test timeout computed to be: 1500

2: Traceback (most recent call last):

2:   File "/home/deep/conda-rdkit/rdkit/External/CoordGen/Wrap/testCoordGen.py", line 13, in <module>

2:     from rdkit.Chem import rdCoordGen, rdMolAlign

2: ModuleNotFoundError: No module named 'rdkit'

1/1 Test #2: pyCoordGen ...***Failed    0.07 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) =   0.09 sec

The following tests FAILED:

      2 - pyCoordGen (Failed)

Errors while running CTest

*From:* Paolo Tosco 

Dear Earl,

given that all Python tests are failing my guess is that you might be 
running the tests with a Python interpreter different from the one you 
have built RDKit against. Re-run one of the failing tests with -V


ctest -I 2,2 -V

to get some more information.

Cheers,
p.

On 13/03/2020 15:56, Earl Higgins wrote:

I am new to RDKit, and my goal is to be able to build it from source … 
and I'm surprised at 87/165 test case failures right out of the box … 
Any guidance anyone could offer in this area would be most 
appreciated. Thank you in advance…



___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] Building RDKit from source under Ubuntu VM, ctest 87 tests failed out of 165. Any advice much appreciated.

2020-03-13 Thread Paolo Tosco


Dear Earl,

given that all Python tests are failing my guess is that you might be 
running the tests with a Python interpreter different from the one you 
have built RDKit against. Re-run one of the failing tests with -V


ctest -I 2,2 -V

to get some more information.

Cheers,
p.

On 13/03/2020 15:56, Earl Higgins wrote:


I am new to RDKit, and my goal is to be able to build it from source 
under Ubuntu 18.04.4 LTS running in a VM using Oracle VirtualBox (host 
Windows 10). I need to be able to build from source because, as a 
developer, I am on a team which is looking at making some enhancements 
to the MOL file load support, possibly adding support for Opensmiles 
as a distinct dialect of SMILES.


So, following the Linux/Python 3 instructions at 
https://www.rdkit.org/docs/Install.html#how-to-build-from-source-with-conda 
, I am able to download and install everything fine until I get to the 
step to run ctest. I get:


$ ctest

CMake Error at /home/deep/conda-rdkit/rdkit/build/CTestCustom.ctest:3 
(MESSAGE):


  Please set your RDBASE env variable before running the tests.

Problem reading custom configuration: 
/home/deep/conda-rdkit/rdkit/build/CTestCustom.ctest


Test project /home/deep/conda-rdkit/rdkit/build

No tests were found!!!

$

So I run:

$ RDBASE=~/conda-rdkit/rdkit ctest

When I do that, I get:

$ RDBASE=~/conda-rdkit/rdkit ctest

Test project /home/deep/conda-rdkit/rdkit/build

    Start   1: testCoordGen
  1/165 Test   #1: testCoordGen ...   Passed    0.37 sec
    Start   2: pyCoordGen
  2/165 Test   #2: pyCoordGen ...***Failed    0.05 sec
    Start   3: testDict
  3/165 Test   #3: testDict ...   Passed    0.63 sec
    Start   4: testRDValue
  4/165 Test   #4: testRDValue ...   Passed    0.00 sec
    Start   5: testDataStructs
  5/165 Test   #5: testDataStructs ...   Passed    0.01 sec
    Start   6: testFPB
  6/165 Test   #6: testFPB ...   Passed    0.01 sec
    Start   7: testMultiFPB
  7/165 Test   #7: testMultiFPB ...   Passed    0.04 sec
    Start   8: pyBV
  8/165 Test   #8: pyBV ...***Failed    0.05 sec
    Start   9: pyDiscreteValueVect
  9/165 Test   #9: pyDiscreteValueVect ...***Failed    0.09 sec
    Start  10: pySparseIntVect
 10/165 Test  #10: pySparseIntVect ...***Failed    0.06 sec
    Start  11: pyFPB
 11/165 Test  #11: pyFPB ...***Failed    0.05 sec
    Start  12: testTransforms
 12/165 Test  #12: testTransforms ...   Passed    0.01 sec
    Start  13: testGrid
 13/165 Test  #13: testGrid ...   Passed    0.04 sec
    Start  14: geometryTestsCatch

(Lines omitted)

161/165 Test #161: pythonTestDirDbase ...***Failed    0.03 sec
    Start 162: pythonTestDirSimDivFilters
162/165 Test #162: pythonTestDirSimDivFilters ...***Failed    0.04 sec
    Start 163: pythonTestDirVLib
163/165 Test #163: pythonTestDirVLib ...***Failed    0.04 sec
    Start 164: pythonTestDirChem
164/165 Test #164: pythonTestDirChem ...***Failed    0.06 sec
    Start 165: pythonTestSping
165/165 Test #165: pythonTestSping ...***Failed    0.03 sec


47% tests passed, 87 tests failed out of 165

Total Test time (real) =  48.62 sec

The following tests FAILED:

      2 - pyCoordGen (Failed)

      8 - pyBV (Failed)

      9 - pyDiscreteValueVect (Failed)

    10 - pySparseIntVect (Failed)

    11 - pyFPB (Failed)

    15 - testPyGeometry (Failed)

    19 - pyAlignment (Failed)

    22 - testMMFFForceField (Child aborted)

    23 - pyForceFieldConstraints (Failed)

    25 - pyDistGeom (Failed)

    28 - graphmolqueryTest (Child aborted)

    29 - graphmolMolOpsTest (Child aborted)

    31 - graphmoltestPickler (Child aborted)

    34 - hanoiTest (Child aborted)

(Lines omitted)

    157 - pyFeatures (Failed)

    158 - pythonTestDbCLI (Failed)

    159 - pythonTestDirML (Failed)

    160 - pythonTestDirDataStructs (Failed)

    161 - pythonTestDirDbase (Failed)

    162 - pythonTestDirSimDivFilters (Failed)

    163 - pythonTestDirVLib (Failed)

    164 - pythonTestDirChem (Failed)

    165 - pythonTestSping (Failed)

Errors while running CTest

$

My impression is RDKit is quite robust, well tested and portable and 
I'm surprised at 87/165 test case failures right out of the box. I'm 
very

Re: [Rdkit-discuss] RDKit in C++

2020-02-26 Thread Paolo Tosco


Hi Leon,

there is a nice document produced by David Cosgrove and Greg Landrum:

https://github.com/rdkit/rdkit/blob/master/Docs/Book/GettingStartedInC%2B%2B.md

The RDKit C++ unit tests, the RDKit C++ API documentation and the headers are 
also very helpful.


Cheers,
p.

On 26/02/2020 15:51, topgunhaides . wrote:

Hi guys,

I noticed that someone asked such question some years ago.
Since it is now 2020, do we now have anything like "Getting Started 
with the RDKit in C++"?


I am planning to transfer my RDKit python code to C++.
Can anyone give me some resources? I found some, but just in case 
that I missed important ones. Any suggestions are very welcome. Thanks!


Best,
Leon




___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] Draw.MolsToGridImage error: got multiple values for keyword argument

2020-02-25 Thread Paolo Tosco

Hi Konrad,

you should use the highlightAtomLists parameter rather than 
highlightAtoms, then your example will work.
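
For reference, here is a self-contained sketch of the same call with highlightAtomLists. The two molecules, their names and the highlighted atom indices are made up for this example, and useSVG=True is only there so that the snippet also runs where no PNG backend is available.

```python
from ast import literal_eval

from rdkit import Chem
from rdkit.Chem import Draw

# made-up input: (name, SMILES, atom indices to highlight)
examples = [
    ("phenol", "c1ccccc1O", (5, 6)),
    ("acetic acid", "CC(=O)O", (1, 2, 3)),
]

act_mols = []
for name, smi, atoms in examples:
    mol = Chem.MolFromSmiles(smi)
    mol.SetProp("_Name", name)
    # properties are stored as strings, hence str() here and literal_eval() below
    mol.SetProp("highlight_atom_numbers", str(atoms))
    act_mols.append(mol)

# one tuple of atom indices per molecule
highlight_lists = [
    literal_eval(x.GetProp("highlight_atom_numbers")) for x in act_mols
]

svg = Draw.MolsToGridImage(
    act_mols,
    molsPerRow=4,
    subImgSize=(200, 200),
    legends=[x.GetProp("_Name") for x in act_mols],
    highlightAtomLists=highlight_lists,
    useSVG=True,
)
```

The point is that highlightAtomLists takes one sequence of atom indices per molecule in the grid, which is exactly what the list comprehension over act_mols produces.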

Cheers,
p.

On 25/02/2020 16:02, Konrad Koehler via Rdkit-discuss wrote:

Hi everyone,

I am having trouble using a mol property value to define highlighted 
atoms when generating an image.

Starting from the beginning, I have defined a variable 
highlight_atom_numbers as a tuple:

>>> type(highlight_atom_numbers)
<class 'tuple'>

And set a mol property to this value:

mol.SetProp("highlight_atom_numbers",str(highlight_atom_numbers))

I then tried to create a 2D image of the molecule with the 
“highlight_atom_numbers” highlighted:

img=Draw.MolsToGridImage(
    act_mols,
    molsPerRow=4,
    subImgSize=(200,200),
    legends=[x.GetProp("_Name") for x in act_mols],
    highlightAtoms=[literal_eval(x.GetProp("highlight_atom_numbers")) for x in act_mols]
)

Which generates the following error message:

TypeError: Boost.Python.function() got multiple values for keyword 
argument 'highlightAtoms'

The literal_eval function should return a single tuple and not multiple values.

Anyone have any ideas for getting this to work?  Thanks.

- Konrad

PS: I have tried googling for a solution, for example:

Stack Overflow: TypeError got multiple values for keyword argument 

and tried the suggestions there, but that did not help.

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] What is the meaning of the output of rdkit.Chem.rdMolAlign.O3A.Align()?

2020-02-24 Thread Paolo Tosco


Hi Zhenting,

if you compare different conformers of the same structure you won't need 
O3A at all, as the atom-atom correspondences will be 1:1, as all 
conformers share the same topology; GetBestRMS() is all you need.
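
As a rough sketch of what I mean (1-butanol, the number of conformers and the random seed are arbitrary choices for this example), the RMSD of each conformer against the first one can be computed like this:

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolAlign

# example molecule; hydrogens are added before embedding
mol = Chem.AddHs(Chem.MolFromSmiles("CCCCO"))
cids = list(AllChem.EmbedMultipleConfs(mol, numConfs=5, randomSeed=42))

# GetBestRMS() aligns the probe conformer in place, so keep an
# untouched copy of the molecule to serve as the reference
ref = Chem.Mol(mol)

rmsds = []
for cid in cids[1:]:
    # symmetry-equivalent atom mappings are taken into account
    rmsd = rdMolAlign.GetBestRMS(mol, ref, prbId=cid, refId=cids[0])
    rmsds.append(rmsd)
    print(cid, round(rmsd, 3))
```

Since probe and reference have the same topology, no atom matching heuristics are involved: the only freedom is which symmetry-equivalent mapping gives the lowest RMSD.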


O3A is useful when you need to align molecules which have different 
topologies, as it will find non-obvious matches between atoms.


Regarding open-source shape similarity tools you might have a look at 
Shape-it:


http://silicos-it.be.s3-website-eu-west-1.amazonaws.com/software/shape-it/1.0.1/shape-it.html

Cheers,
p.

On 24/02/2020 14:16, Gao Zhenting wrote:

Hi Paolo,

Thanks for pointing that out. Currently, I am using Open3DAlign to 
select diverse conformers of a single structure by measuring the rmsd. 
According to your explanation, I will choose O3A.Align() to record the 
rmsd between conformers. And my preliminary observation is O3A.Align() 
gave me a more expected rmsd value comparing to 
rdkit.Chem.rdMolAlign.GetBestRMS(probeMol, refMol). (python code 
attached) What is your comment on this?


I do imagine to use Open3DAlign as a shape similarity tool. However 
with your advice, I will think it over again.

Do you have any suggestions on an open source shape similarity tool?

Best regards
Zhenting

On Mon, 24 Feb 2020 at 21:21, Paolo Tosco <mailto:paolo.tosco.m...@gmail.com>> wrote:


Hi Zhenting,

O3A.Align() returns the RMSD between the atom pairs that the O3A
algorithm was able to match across the two molecules.

O3A.Score() returns the score for an alignment. The score is
directly proportional to:

* the number of atom pairs that could be matched across the two
molecules
* the closeness of the match (i.e., how similar actually the atoms
in the matched pair are)

It is important to note that Open3DAlign was originally conceived
as a tool to generate good quality 3D overlays of molecules ahead
of a 3D-QSAR analysis. It was not designed, nor tested, as a tool
to assess 3D similarity between molecules and do virtual
screening. O3A.Score is used internally purely to pick the best
overlay between two molecules.

Cheers,
p

On 24/02/2020 12:45, Gao Zhenting wrote:


Hi Greg,


I am learning Open3D ALIGN


example code:

pyO3A = rdMolAlign.GetO3A(mol2py, mol1)
pyO3A.Align()


Questions

  * What is the meaning of the value from O3A.Align()?
      o Is that similar to atom-wise RMSD?
  * What is the meaning of the value from O3A.Score()?
      o Only an online tutorial presented the Score:
          + open3dqsar.org > Downloads > Open3DTOOLS tutorial 20130812
            <http://open3dqsar.org/downloads/Open3DTOOLS_tutorial_20130812.pdf>

Best Regards
Zhenting


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net  
<mailto:Rdkit-discuss@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] What is the meaning of the output of rdkit.Chem.rdMolAlign.O3A.Align()?

2020-02-24 Thread Paolo Tosco


Hi Zhenting,

O3A.Align() returns the RMSD between the atom pairs that the O3A 
algorithm was able to match across the two molecules.


O3A.Score() returns the score for an alignment. The score is directly 
proportional to:


* the number of atom pairs that could be matched across the two molecules
* the closeness of the match (i.e., how similar actually the atoms in 
the matched pair are)


It is important to note that Open3DAlign was originally conceived as a 
tool to generate good quality 3D overlays of molecules ahead of a 
3D-QSAR analysis. It was not designed, nor tested, as a tool to assess 
3D similarity between molecules and do virtual screening. O3A.Score is 
used internally purely to pick the best overlay between two molecules.
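
A minimal sketch of how the two values are obtained, together with the matched atom pairs they are based on; the two molecules and the random seed are arbitrary choices for this example, and embed() is just a local helper:

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolAlign


def embed(smiles, seed=42):
    """Build a molecule with hydrogens and a single 3D conformer."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    AllChem.EmbedMolecule(mol, randomSeed=seed)
    return mol


# two molecules with different topologies
ref = embed("c1ccccc1CCN")
probe = embed("c1ccccc1CCCO")

o3a = rdMolAlign.GetO3A(probe, ref)
rmsd = o3a.Align()       # RMSD over the matched atom pairs only
score = o3a.Score()      # alignment score used to rank overlays
matches = o3a.Matches()  # [probe atom index, reference atom index] pairs

print(len(matches), rmsd, score)
```

Note that the RMSD is computed only over the pairs listed in Matches(), so a low value with few matched pairs does not mean that the two molecules overlay well as a whole.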


Cheers,
p

On 24/02/2020 12:45, Gao Zhenting wrote:


Hi Greg,


I am learning Open3D ALIGN


example code:

pyO3A = rdMolAlign.GetO3A(mol2py, mol1)
pyO3A.Align()


Questions

  * What is the meaning of the value from O3A.Align()?
      o Is that similar to atom-wise RMSD?
  * What is the meaning of the value from O3A.Score()?
      o Only an online tutorial presented the Score:
          + open3dqsar.org > Downloads > Open3DTOOLS tutorial 20130812

Best Regards
Zhenting



___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] Pharmacophore Fingerprints

2020-02-21 Thread Paolo Tosco


Hi Antoine,

rdkit.Chem.Pharm2D.Generate.Gen2DFingerprint() expects a single molecule

https://www.rdkit.org/docs/source/rdkit.Chem.Pharm2D.Generate.html?highlight=gen2dfingerprint#rdkit.Chem.Pharm2D.Generate.Gen2DFingerprint

while you are passing a list of molecules:

pharmacophorefps = Generate.Gen2DFingerprint(list(df['ROMol']), sigFactory)
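
Looping over the molecules should do what you want. The sketch below uses three made-up SMILES and, to stay self-contained, the predefined Gobbi_Pharm2D.factory rather than the signature factory built in your script; with your own sigFactory the list comprehension stays the same.

```python
from rdkit import Chem
from rdkit.Chem.Pharm2D import Generate, Gobbi_Pharm2D

# made-up example molecules
smiles_list = ["OCC(=O)N", "c1ccccc1CCN", "CC(=O)Oc1ccccc1C(=O)O"]
mols = [Chem.MolFromSmiles(s) for s in smiles_list]

# one call per molecule: Gen2DFingerprint() takes a single Mol
fps = [Generate.Gen2DFingerprint(m, Gobbi_Pharm2D.factory) for m in mols]

for smi, fp in zip(smiles_list, fps):
    print(smi, fp.GetNumOnBits())
```

With a DataFrame, the resulting list can be assigned to a new column just as in your script, since it has one entry per row.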

HTH, cheers
p.

On 21/02/2020 16:22, Antoine Dumas wrote:


Hello,

I am trying to generate a set of pharmacophore fingerprints in python 
using RDKIT from a list of SMILES (20k molecules)


No matter what I do the script keeps throwing an error saying my code 
doesn’t match the C++ signature.


Here  is a copy of my code as it stands right now

from __future__ import print_function

import os

import csv

import numpy as np

import pandas as pd

from rdkit import RDConfig, Chem, DataStructs, rdBase

from rdkit.Chem import rdFingerprintGenerator, rdMolDescriptors, 
AllChem, rdFMCS, MACCSkeys, Draw, PandasTools, ChemicalFeatures, 
rdDepictor


from rdkit.Chem.Fingerprints import FingerprintMols

from rdkit.Chem.Draw import IPythonConsole, MolDraw2D

from rdkit.Chem.Pharm2D import Gobbi_Pharm2D, Generate

from rdkit.Chem.Pharm2D.SigFactory import SigFactory

from IPython.display import SVG

from tabulate import tabulate

os.chdir(r'C:\Users\adumas\Desktop\')

input1 = r'C:\Users\adumas\Desktop\FILE.csv'

output = r'C:\Users\adumas\Desktop\FILE.csv'

df = pd.read_csv(input1, delimiter = ',', header = 0, index_col = 
[''], names = ['','row ID','MOL_ID', 'SMILES',])


PandasTools.AddMoleculeColumnToFrame(df, smilesCol = 'SMILES')

SMILES = []

SMILES = df.iloc[0:,4]

ID = df.iloc[0:,1]

molecules = [Chem.MolFromSmiles(x) for x in SMILES]

fingerprints = [FingerprintMols.FingerprintMol(x) for x in molecules]

morganfps = rdFingerprintGenerator.GetFPs(list(df['ROMol']))

df['Morgan Fingerprint'] = morganfps

mcassfps = [MACCSkeys.GenMACCSKeys(x) for x in list(df['ROMol'])]

df['MCASS Fingerprint'] = mcassfps

fdefName = 'BaseFeatures.fdef'

featFactory = ChemicalFeatures.BuildFeatureFactory(fdefName)

sigFactory = SigFactory(featFactory, minPointCount=2, maxPointCount=9)

sigFactory.SetBins([(0,2),(2,5),(5,8)])

sigFactory.Init()

sigFactory.GetSigSize()

pharmacophorefps = Generate.Gen2DFingerprint(list(df['ROMol']), sigFactory)
# ^ Line throwing the error constantly, no matter whether I specify
# (SMILES, sigFactory) or (molecules, sigFactory) or
# (df.iloc[0:,4], sigFactory)


df['Pharmacophore Fingerprints'] = pharmacophorefps

And the error it throws me every time no matter how I try to define 
the list of smiles


*ArgumentError*: Python argument types in

rdkit.Chem.rdmolops.GetDistanceMatrix(Series, bool)

did not match C++ signature:

    GetDistanceMatrix(class RDKit::ROMol {lvalue} mol, bool 
useBO=False, bool useAtomWts=False, bool force=False, char const * 
__ptr64 prefix='')




___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] Markush Enumeration.

2020-02-21 Thread Paolo Tosco


Hi Jitender,

you could do that quite easily using reaction SMARTS; see for example 
this thread:


https://sourceforge.net/p/rdkit/mailman/message/35730514/

You could selectively replace a specific R attachment point by 
isotopically labeling it.
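
A minimal sketch of the idea follows. The core, the three R-groups and the convention that isotope label 1 marks the R1 attachment point are all assumptions made for this example, not taken from the thread linked above.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Join whatever is attached to a dummy atom labelled with isotope 1 on the
# core to whatever is attached to a dummy with the same label on the R-group.
# The two dummies are unmapped, so they are dropped from the product.
rxn = AllChem.ReactionFromSmarts("[*:1]-[1*].[1*]-[*:2]>>[*:1]-[*:2]")

core = Chem.MolFromSmiles("[1*]c1ccc(O)cc1")
r_groups = [Chem.MolFromSmiles(s) for s in ("[1*]C", "[1*]CC", "[1*]N")]

enumerated = []
for rg in r_groups:
    for (product,) in rxn.RunReactants((core, rg)):
        # reaction products are not sanitized automatically
        Chem.SanitizeMol(product)
        enumerated.append(Chem.MolToSmiles(product))

print(enumerated)
```

For a core with R1 and R2 you would label the second attachment point with isotope 2, write the analogous reaction with [2*], and run it on the products of the first step.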


Cheers,
p.

On 21/02/2020 09:55, Jitender Verma wrote:

Dear RDkit users,

I have a Markush structure with attachment points as R1, R2, and so 
on. How can I use RDkit to enumerate all the structures using specific 
R-groups from a database or library I have?


I am a new user of RDkit.

Thanks in anticipation



___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] conformations for cycles with multiple stereocenters

2020-02-21 Thread Paolo Tosco


Hi Pavel,

I believe you forgot to add hydrogen atoms to the molecule graph before 
generating 3D conformers:


mol_h = Chem.AddHs(mol)

AllChem.EmbedMultipleConfs(mol_h, 50, Chem.rdDistGeom.ETKDGv2())

mol_h.GetNumConformers()
50

HTH, cheers
p.

On 21/02/2020 12:57, Pavel Polishchuk wrote:

Hello,

  I want to generate conformers for a stereoisomeric sugar moiety. The 
code below runs (it loads the processor) but returns none of them.
  But if I remove all stereoconfiguration info in input SMILES the 
code generates conformers. Playing with this issue I noticed that 
conformers are generated if only one stereocenter is defined. If I 
define configuration of two centers the code stops working.

  I'm puzzled. Is there a solution or workaround?
  I use rdkit 2019.09.1

  mol = Chem.MolFromSmiles('OC[C@H]1O[C@H](O)C[C@@H](O)[C@H]1O')
  AllChem.EmbedMultipleConfs(mol, 50, Chem.rdDistGeom.ETKDGv2())
  mol.GetConformers()

Kind regards,
Pavel.


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss




Re: [Rdkit-discuss] Inconsistent sanitization of fragments

2020-02-18 Thread Paolo Tosco


Dear Puck,

Here there is some confusion between the concept of implicit/explicit 
hydrogens vs real hydrogen atoms.


In your previous example you were fragmenting an acetone molecule with 
real hydrogen atoms explicitly present in the molecule graph. Therefore, 
after calling FragmentOnBonds() it was appropriate to convert the dummy 
atoms into real hydrogens and leave them as such.


In your latest example you are fragmenting a molecule with no real 
hydrogen atoms, only implicit hydrogens, so after converting the dummy 
atoms into real hydrogens you have a mixture of real and implicit hydrogens.
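
The difference is easy to see by counting atoms. This is just a small illustration on dimethyl ether, the same "COC" molecule used earlier in the thread:

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("COC")

# hydrogens are implicit: they are counted on the heavy atoms,
# but they are not nodes in the molecule graph
graph_atoms = mol.GetNumAtoms()
hs_on_heavy_atoms = sum(a.GetTotalNumHs() for a in mol.GetAtoms())

# after AddHs() every hydrogen is a real atom in the graph
mol_h = Chem.AddHs(mol)
graph_atoms_h = mol_h.GetNumAtoms()
real_h_atoms = sum(1 for a in mol_h.GetAtoms() if a.GetAtomicNum() == 1)

print(graph_atoms, hs_on_heavy_atoms)  # 3 6
print(graph_atoms_h, real_h_atoms)     # 9 6
```

Turning a dummy atom into a hydrogen always creates a real atom of the second kind, which is why it mixes the two representations when the rest of the molecule only carries implicit hydrogens.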


Please refer to this gist for a (hopefully) clearer explanation:

https://gist.github.com/ptosco/162b52b018dc95709bd43757c0078a28

You may also refer to the following threads on the mailing list:

https://sourceforge.net/p/rdkit/mailman/message/29679834/
https://sourceforge.net/p/rdkit/mailman/message/36696340/

and to this blog post by Roger Sayle:

https://nextmovesoftware.com/blog/2013/02/27/explicit-and-implicit-hydrogens-taking-liberties-with-valence/

for further clarifications on the various kinds of hydrogen in the RDKit.

Cheers,
p.

On 18/02/2020 12:18, Puck van Gerwen wrote:

Dear Paolo

Thank you for this illustrative example. I tried the same method on a 
few other molecules and didn't find consistent results. In the 
notebook attached, the first molecule is left with a bivalent carbon, 
and the second molecule is only sanitized after explicitly calling the 
SanitizeFrags option when getting individual fragments. Am I missing 
something?


Kind regards
Puck

On Tue, 18 Feb 2020 at 11:17, Paolo Tosco <mailto:paolo.tosco.m...@gmail.com>> wrote:


Hi Puck,

I modified your previous example to use FragmentOnBonds():

from rdkit import Chem
from rdkit.Chem.Draw import IPythonConsole, MolsToGridImage

smiles = "COC"
mol = Chem.MolFromSmiles("COC")
mol

Here there are no hydrogens (as expected from the SMILES)

mol = Chem.AddHs(mol)
mol

mol_f = Chem.FragmentOnBonds(mol, (2,))

mol_f

for a in mol_f.GetAtoms():
    if (a.GetAtomicNum() == 0):
        # convert the dummy into hydrogen
        a.SetAtomicNum(1)
        # if you do not need the labels, you may clear them
        a.SetIsotope(0)

mol_f

Cheers,
p.

On 18/02/2020 09:45, Puck van Gerwen wrote:

Dear Paolo

Thanks very much for your continuous help. I'm not sure I like
the idea of explicitly defining the valence, so I'll consider
switching to FragmentOnBonds(). From the example you gave
previously though, it quenched the fragments with dummy atoms
rather than hydrogen. Is it possible to quench with hydrogen
using FragmentOnBonds()?

Kind regards
Puck

On Mon, 17 Feb 2020 at 13:42, Paolo Tosco
mailto:paolo.tosco.m...@gmail.com>>
wrote:

Hi Puck,

sorry for the delay in replying.

After removing the bond you will need to adjust the number of
explicit Hs on both ends with SetNumExplicitHs(), and then
add those Hs in the molecule graph as real atoms with
Chem.AddHs():

rwmol = Chem.RWMol(mol)
bonds = rwmol.GetBonds()
b = rwmol.GetBondWithIdx(2)
a1 = b.GetBeginAtom()
a2 = b.GetEndAtom()
rwmol.RemoveBond(a1.GetIdx(), a2.GetIdx())
a1.SetNumExplicitHs(a1.GetNumExplicitHs() + 1)
a2.SetNumExplicitHs(a2.GetNumExplicitHs() + 1)
rwmol = Chem.AddHs(rwmol, explicitOnly=True)

The rdmolops.FragmentOnBonds() function adjusts the valences
for you, but you will need to adjust them manually if you do
the fragmentation at a lower level with RemoveBond().

I have attached the modified notebook.

Cheers,
p.

On 12/02/2020 10:03, Puck van Gerwen wrote:

Dear all,

I am trying to read in SMILES to generate mol objects which
I then break into fragments using rwmol.RemoveBond().
Thereafter I want to sanitize the fragments by saturating
with hydrogen. However, I am finding that rdkit often
doesn't sanitize the fragments consistently, leaving
trivalent carbon atoms. I've attached a jupyter notebook of
an example. Could anyone help me consistently generate
sanitized fragments?

Best regards
-- 
*Puck van Gerwen*

Doktorandin / PhD candidate
Departement Chemie
Klingelbergstrasse 80
CH-4056 Basel
https://chemspacelab.org/


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net  
<mailto:Rdkit-discuss@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss




-- 
*Puck van Gerwen*

Doktorandin / PhD candidate
Departement Chemie
Klingelb

Re: [Rdkit-discuss] Inconsistent sanitization of fragments

2020-02-18 Thread Paolo Tosco


Hi Puck,

I modified your previous example to use FragmentOnBonds():

from rdkit import Chem
from rdkit.Chem.Draw import IPythonConsole, MolsToGridImage

smiles = "COC"
mol = Chem.MolFromSmiles("COC")
mol

Here there are no hydrogens (as expected from the SMILES)

mol = Chem.AddHs(mol)
mol

mol_f = Chem.FragmentOnBonds(mol, (2,))

mol_f

for a in mol_f.GetAtoms():
    if (a.GetAtomicNum() == 0):
        # convert the dummy into hydrogen
        a.SetAtomicNum(1)
        # if you do not need the labels, you may clear them
        a.SetIsotope(0)

mol_f

Cheers,
p.

On 18/02/2020 09:45, Puck van Gerwen wrote:

Dear Paolo

Thanks very much for your continuous help. I'm not sure I like the 
idea of explicitly defining the valence, so I'll consider switching to 
FragmentOnBonds(). From the example you gave previously though, it 
quenched the fragments with dummy atoms rather than hydrogen. Is it 
possible to quench with hydrogen using FragmentOnBonds()?


Kind regards
Puck

On Mon, 17 Feb 2020 at 13:42, Paolo Tosco <mailto:paolo.tosco.m...@gmail.com>> wrote:


Hi Puck,

sorry for the delay in replying.

After removing the bond you will need to adjust the number of
explicit Hs on both ends with SetNumExplicitHs(), and then add
those Hs in the molecule graph as real atoms with Chem.AddHs():

rwmol = Chem.RWMol(mol)
bonds = rwmol.GetBonds()
b = rwmol.GetBondWithIdx(2)
a1 = b.GetBeginAtom()
a2 = b.GetEndAtom()
rwmol.RemoveBond(a1.GetIdx(), a2.GetIdx())
a1.SetNumExplicitHs(a1.GetNumExplicitHs() + 1)
a2.SetNumExplicitHs(a2.GetNumExplicitHs() + 1)
rwmol = Chem.AddHs(rwmol, explicitOnly=True)

The rdmolops.FragmentOnBonds() function adjusts the valences for
you, but you will need to adjust them manually if you do the
fragmentation at a lower level with RemoveBond().

I have attached the modified notebook.

Cheers,
p.

On 12/02/2020 10:03, Puck van Gerwen wrote:

Dear all,

I am trying to read in SMILES to generate mol objects which I
then break into fragments using rwmol.RemoveBond(). Thereafter I
want to sanitize the fragments by saturating with hydrogen.
However, I am finding that rdkit often doesn't sanitize the
fragments consistently, leaving trivalent carbon atoms. I've
attached a jupyter notebook of an example. Could anyone help me
consistently generate sanitized fragments?

Best regards
-- 
*Puck van Gerwen*

Doktorandin / PhD candidate
Departement Chemie
Klingelbergstrasse 80
CH-4056 Basel
https://chemspacelab.org/


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net  
<mailto:Rdkit-discuss@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss




--
*Puck van Gerwen*
Doktorandin / PhD candidate
Departement Chemie
Klingelbergstrasse 80
CH-4056 Basel
https://chemspacelab.org/
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] Inconsistent sanitization of fragments

2020-02-17 Thread Paolo Tosco


Hi Puck,

sorry for the delay in replying.

After removing the bond you will need to adjust the number of explicit 
Hs on both ends with SetNumExplicitHs(), and then add those Hs in the 
molecule graph as real atoms with Chem.AddHs():


rwmol = Chem.RWMol(mol)
bonds = rwmol.GetBonds()
b = rwmol.GetBondWithIdx(2)
a1 = b.GetBeginAtom()
a2 = b.GetEndAtom()
rwmol.RemoveBond(a1.GetIdx(), a2.GetIdx())
a1.SetNumExplicitHs(a1.GetNumExplicitHs() + 1)
a2.SetNumExplicitHs(a2.GetNumExplicitHs() + 1)
rwmol = Chem.AddHs(rwmol, explicitOnly=True)

The rdmolops.FragmentOnBonds() function adjusts the valences for you, 
but you will need to adjust them manually if you do the fragmentation at 
a lower level with RemoveBond().
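
As a minimal, self-contained sketch of the steps above (this is not the attached notebook): it assumes dimethyl ether with all hydrogens present as real atoms, and bond index 1 is picked purely for illustration. The final sanitization, GetMolFrags() and RemoveHs() calls are additions to make the result easy to inspect.

```python
from rdkit import Chem

# Dimethyl ether with every hydrogen present as a real atom, so there are
# no implicit Hs available to repair the valences after the cut.
mol = Chem.AddHs(Chem.MolFromSmiles("COC"))

rwmol = Chem.RWMol(mol)
b = rwmol.GetBondWithIdx(1)  # illustrative choice: the bond between O(1) and C(2)
a1 = b.GetBeginAtom()
a2 = b.GetEndAtom()
rwmol.RemoveBond(a1.GetIdx(), a2.GetIdx())

# Each end of the broken bond is now one hydrogen short.
a1.SetNumExplicitHs(a1.GetNumExplicitHs() + 1)
a2.SetNumExplicitHs(a2.GetNumExplicitHs() + 1)

# Turn the explicit-H counts into real hydrogen atoms, then sanitize.
capped = Chem.AddHs(rwmol, explicitOnly=True)
Chem.SanitizeMol(capped)

# Split into separate molecules and report them as H-suppressed SMILES.
frags = Chem.GetMolFrags(capped, asMols=True)
fragment_smiles = sorted(Chem.MolToSmiles(Chem.RemoveHs(f)) for f in frags)
print(fragment_smiles)
```

The two fragments should come out as methanol and methane rather than as radicals.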


I have attached the modified notebook.

Cheers,
p.

On 12/02/2020 10:03, Puck van Gerwen wrote:

Dear all,

I am trying to read in SMILES to generate mol objects which I then 
break into fragments using rwmol.RemoveBond(). Thereafter I want to 
sanitize the fragments by saturating with hydrogen. However, I am 
finding that rdkit often doesn't sanitize the fragments consistently, 
leaving trivalent carbon atoms. I've attached a jupyter notebook of an 
example. Could anyone help me consistently generate sanitized fragments?


Best regards
--
*Puck van Gerwen*
Doktorandin / PhD candidate
Departement Chemie
Klingelbergstrasse 80
CH-4056 Basel
https://chemspacelab.org/


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


NotSanitizing.ipynb
Description: application/ipynb
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] PDB file format and dummy atoms

2020-02-13 Thread Paolo Tosco


Hi Gabriele,

this has come up previously:

https://sourceforge.net/p/rdkit/mailman/message/36796385/

This gist

https://gist.github.com/ptosco/ba11e577a8c616b958c49e23cc42a5e3

shows a possible solution, provided that your molecules do not contain 
Astatine, otherwise pick another unusual element.


Cheers,
p.

On 13/02/2020 14:01, Gabriele Macari wrote:


Hello everybody,

    I have a list of molecular fragments generated with BRICS and 
saved in PDB format. When I read them back with Chem.MolFromPDBlock() 
RDKiT is unable to recognize the dummy atom * and throws a warning 
(and no molecule is returned). Is it possible add custom atoms to the 
PDB Parser? Or is there another workaround?


Thanks in advance,

Gabriele



___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] Inconsistent sanitization of fragments

2020-02-12 Thread Paolo Tosco


Dear Puck,

Sure; I'll get back to you later today.

Cheers,
p.

On 12/02/2020 10:03, Puck van Gerwen wrote:

Dear all,

I am trying to read in SMILES to generate mol objects which I then 
break into fragments using rwmol.RemoveBond(). Thereafter I want to 
sanitize the fragments by saturating with hydrogen. However, I am 
finding that rdkit often doesn't sanitize the fragments consistently, 
leaving trivalent carbon atoms. I've attached a jupyter notebook of an 
example. Could anyone help me consistently generate sanitized fragments?


Best regards
--
*Puck van Gerwen*
Doktorandin / PhD candidate
Departement Chemie
Klingelbergstrasse 80
CH-4056 Basel
https://chemspacelab.org/


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] RDKit upgrade issue

2020-01-23 Thread Paolo Tosco


Hi Navid,

the Descriptors module is not auto-imported when Chem is imported.

So the first example will work if you do:

from rdkit import Chem
import rdkit.Chem.Descriptors
Chem.Descriptors.MolWt(Chem.MolFromSmiles('CC'))
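
This is general Python package behaviour rather than anything RDKit-specific. The sketch below reproduces it with a throwaway package created in a temporary directory; the names demo_chem_pkg, descriptors and mol_wt are made up for the illustration and have nothing to do with the real RDKit modules.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway package whose __init__.py does NOT import its
# submodule (hypothetical names, standing in for Chem / Descriptors).
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_chem_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "descriptors.py").write_text("def mol_wt():\n    return 30.07\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_chem_pkg

# Importing the package alone does not load the submodule...
before_import = hasattr(demo_chem_pkg, "descriptors")

import demo_chem_pkg.descriptors

# ...but once the submodule has been imported anywhere in the process,
# it becomes an attribute of the package.
after_import = hasattr(demo_chem_pkg, "descriptors")
value = demo_chem_pkg.descriptors.mol_wt()
print(before_import, after_import, value)
```

This is also why Chem.Descriptors can appear to work in some sessions: any earlier import of the submodule, even from another library, makes the attribute available.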

Cheers,
p.

On 23/01/2020 17:26, Navid Shervani-Tabar wrote:

Thanks, Paolo! That worked! But let me get this straight, if I use

from rdkit import Chem
Chem.Descriptors.MolWt(Chem.MolFromSmiles('CC'))

it does not work, but when I do

from rdkit import Chem
from rdkit.Chem import Descriptors
Descriptors.MolWt(Chem.MolFromSmiles('CC'))

it works. Shouldn't these be the same?

Thanks,
Navid


On Thu, Jan 23, 2020 at 12:07 PM Paolo Tosco <paolo.tosco.m...@gmail.com> wrote:


Hi Navid,

try adding

import rdkit.Chem.Descriptors

before attempting to use MolWt.

Cheers,
p.

On 23/01/2020 17:02, Navid Shervani-Tabar wrote:

Update: I was able to go back to square one. Using
RDKit-2019.09.3.0 I still get the error

module 'rdkit.Chem' has no attribute 'Descriptors'

when using Chem.Descriptors.MolWt.

Navid



On Thu, Jan 23, 2020 at 9:37 AM Navid Shervani-Tabar <nshe...@gmail.com> wrote:

Hello,

I was trying to use the MolWt function in RDKit. I tried
Chem.Descriptors.MolWt but I got

module 'rdkit.Chem' has no attribute 'Descriptors'

I thought that might be related to the fact that I used the
2019.03 version. So I updated using

conda install -c conda-forge rdkit

But now when I import

from rdkit.Chem.QED import qed

I get

Process finished with exit code 139 (interrupted by signal
11: SIGSEGV)

Any suggestions how to fix this?

Thanks!
Navid




___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net  
<mailto:Rdkit-discuss@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] RDKit upgrade issue

2020-01-23 Thread Paolo Tosco


Hi Navid,

try adding

import rdkit.Chem.Descriptors

before attempting to use MolWt.

Cheers,
p.

On 23/01/2020 17:02, Navid Shervani-Tabar wrote:
Update: I was able to go back to square one. Using RDKit-2019.09.3.0 I 
still get the error


module 'rdkit.Chem' has no attribute 'Descriptors'

when using Chem.Descriptors.MolWt.

Navid



On Thu, Jan 23, 2020 at 9:37 AM Navid Shervani-Tabar <nshe...@gmail.com> wrote:


Hello,

I was trying to use the MolWt function in RDKit. I tried
Chem.Descriptors.MolWt but I got

module 'rdkit.Chem' has no attribute 'Descriptors'

I thought that might be related to the fact that I used the
2019.03 version. So I updated using

conda install -c conda-forge rdkit

But now when I import

from rdkit.Chem.QED import qed

I get

Process finished with exit code 139 (interrupted by signal 11:
SIGSEGV)

Any suggestions how to fix this?

Thanks!
Navid




___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] Alignment using GetO3A

2020-01-14 Thread Paolo Tosco


Hi Charmaine,

can you provide a reproducible example of your workflow? Please reply to 
me directly.


Thanks, cheers
p.

On 14/01/2020 11:40, Chu, Charmaine wrote:


Dear Rdkit community,

I’ve been trying to align some small molecules to a defined 
substructure using GetO3A but it doesn’t seem to work properly.


For example, I would be aligning some small molecules containing a 
piperidine ring within the structure to a reference piperidine ring 
but it wouldn’t align properly.


I’ve tried stripping all atoms except the piperidine substructure from 
the small molecules before doing the alignment (followed by transforming 
the original small molecule with the resulting transformation matrix), 
but on inspecting the alignment the N within the piperidine ring does 
not always align to the same position as in the reference piperidine 
ring.


If I try adding a constraint map, the chair conformation does not 
always face the same way (as if it were reflected through the plane 
of best fit).


The work is being done using the python node within KNIME as a part of 
the workflow.


What would be the fix for this?

Regards,

Charmaine Chu



___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] Exhaustive fragmentation of molecules

2020-01-08 Thread Paolo Tosco


Dear Puck,

You may break a bond by creating a Chem.RWMol out of your Chem.Mol, and 
then calling the RemoveBond() method on your Chem.RWMol, or you may use 
dedicated functions in the rdmolops module. Individual fragments can 
then be obtained by calling rdmolops.GetMolFrags().
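
A minimal sketch of the RWMol route (not the content of the gist): the helper name single_bond_fragments is made up for the illustration, and the example assumes a molecule built from SMILES, where implicit hydrogens are recomputed when the fragments are sanitized. For a molecule in which all hydrogens are real atoms, the valences at the broken bond would need to be adjusted separately.

```python
from rdkit import Chem


def single_bond_fragments(mol):
    """For every bond, return the fragment SMILES obtained by removing it."""
    results = {}
    for bond in mol.GetBonds():
        rwmol = Chem.RWMol(mol)  # fresh editable copy for each bond
        rwmol.RemoveBond(bond.GetBeginAtomIdx(), bond.GetEndAtomIdx())
        # asMols=True returns one sanitized Mol per disconnected fragment
        frags = Chem.GetMolFrags(rwmol, asMols=True)
        results[bond.GetIdx()] = sorted(Chem.MolToSmiles(f) for f in frags)
    return results


ethanol = Chem.MolFromSmiles("CCO")
fragments = single_bond_fragments(ethanol)
for bond_idx, smiles_list in fragments.items():
    print(bond_idx, smiles_list)
```

Breaking a ring bond does not disconnect the molecule, so in that case the list for that bond holds a single (ring-opened) fragment.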


I have put together a gist here:

https://gist.github.com/ptosco/3fb93b7c09dac15b6d355eb0ad29f532

to show examples of the above; I hope this will help get you started on 
your task.


Cheers,
p.

On 08/01/2020 10:39, Puck van Gerwen wrote:

Dear rdkit community,

I am looking to start from a mol object (loaded from an .xyz file) and 
return all possible fragments (as mol objects) generated from breaking 
one bond (any bond order). I don't want any pre-encoded rules about 
which bonds to break as in BRICS. I saw some discussions on the forum 
about using EditableMol or other mol types. Would you be able to point 
me to the best way to do this?

Thanks very much.

--
Puck van Gerwen
Doktorandin
Gruppe von Anatole von Lilienfeld
Universität Basel


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] bond order is lost in reaction transformation

2020-01-03 Thread Paolo Tosco

Hi Devendra,

Your starting molecule is aromatic, as is your product; by default RDKit writes 
aromatic SMILES.
If you wish to write your molecule as kekulized SMILES, do the following:

In [1]: from rdkit import Chem

In [2]: smi = "CN1C(NC2=NC=CC=C2)=CC=C1"

In [3]: mol = Chem.MolFromSmiles(smi)

In [4]: Chem.MolToSmiles(mol)
Out[4]: 'Cn1cccc1Nc1ccccn1'

In [6]: Chem.Kekulize(mol)

In [7]: Chem.MolToSmiles(mol, kekuleSmiles=True)
Out[7]: 'CN1C=CC=C1NC1=NC=CC=C1'

Cheers,
p.

> On 3 Jan 2020, at 23:26, Devendra Dhaked  wrote:
> 
> 
> Hi,
> 
> I have written SMIRKS for reaction transformation of  reagent 
> "CN1C(NC2=NC=CC=C2)=CC=C1" into product "CN1CC=CC1=NC2=NC=CC=C2".
> 
> but after transformation in forward direction it showed lost bond order in 
> product molecule Cn1cC=C[c]1=Nc1n1.
> code for forward direction
> reactant1 = Chem.MolFromSmiles('CN1C(NC2=NC=CC=C2)=CC=C1')
> rxn = 
> AllChem.ReactionFromSmarts('[O,NX3;H1:2][cX3z2;r5:3]:[c;r5:4][c;r5:5]:[cr5R{1-2}:6]>>[O,S,NX2:2]=[CX3;z2;r5:3][C;r5:4]=[C;r5:5][CX4;r5;!H0:6]')
> ps=rxn.RunReactants((reactant1,))
> print(len(ps))
> print(Chem.MolToSmiles(ps[0][0],True))
> 
> Similarly in backward direction produced product(reactant) also lost its bond 
> order
> CN11Nc1n1.
> code for backward direction
> reactant1 = Chem.MolFromSmiles('CN1CC=CC1=NC2=NC=CC=C2')
> rxn = 
> AllChem.ReactionFromSmarts('[O,S,NX2:2]=[CX3;z2;r5:3][C;r5:4]=[C;r5:5][CX4;r5;!H0:6]>>[O,NX3;H1:2][cX3z2;r5:3]:[c;r5:4][c;r5:5]:[cr5R{1-2}:6]')
> ps=rxn.RunReactants((reactant1,))
> print(len(ps))
> print(Chem.MolToSmiles(ps[0][0],True))
> 
> Thanks.
> 
> -- 
> Regards
> 
> Devendra K Dhaked (Ph.D.)
> Center for Cancer Research
> Chemical Biology Laboratory
> National Cancer Institute (NIH)
> Boyles Street, Frederick, MD
> USA
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] passing options to javac when building from source

2020-01-03 Thread Paolo Tosco


Hi Tim,

based on

https://cmake.org/cmake/help/latest/module/UseJava.html

set(CMAKE_JAVA_COMPILE_FLAGS "-source 8")

might do the trick.

Cheers,
p.

On 03/01/2020 17:41, Tim Dudgeon wrote:

Hi Paolo,

I'm afraid that's not working for me. Still getting Java11 class files.
My cmake command is:

cmake -Wno-dev\
  -DPYTHON_EXECUTABLE=/usr/bin/python3\
  -DRDK_BUILD_INCHI_SUPPORT=ON\
  -DRDK_BUILD_AVALON_SUPPORT=ON\
  -DRDK_BUILD_PYTHON_WRAPPERS=ON\
  -DRDK_BUILD_SWIG_WRAPPERS=ON\
  -DJAVA_COMPILE="/usr/bin/javac -source 8"\
  ..


When it's built I run:

javap -cp ./Code/JavaWrappers/gmwrapper/org.RDKit.jar -verbose 
org.RDKit.RDKFuncs | grep major


and get:

  major version: 55

55 is the class version for Java11.

In fact if I set JAVA_COMPILE to complete nonsense everything still 
builds OK!


Tim

On 26/12/2019 15:39, Paolo Tosco wrote:

Hi Tim,

Try adding this to your CMake command:

-DJAVA_COMPILE="/usr/bin/javac -source 8"

Cheers,
p.

On 26/12/2019 15:22, Tim Dudgeon wrote:
When building the Java wrappers from source (the 
-DRDK_BUILD_SWIG_WRAPPERS=ON option), is it possible to specify options 
to pass on to javac?


Specifically I'm wanting to use the '-source 8' option as most 
distros now come with java11 (and make it difficult to install an 
earlier one) but I want to build a version of org.RDKit.jar that is 
compatible with older Java versions.


Thanks

Tim



___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] passing options to javac when building from source

2019-12-26 Thread Paolo Tosco


Hi Tim,

Try adding this to your CMake command:

-DJAVA_COMPILE="/usr/bin/javac -source 8"

Cheers,
p.

On 26/12/2019 15:22, Tim Dudgeon wrote:
When building the Java wrappers from source (the 
-DRDK_BUILD_SWIG_WRAPPERS=ON option), is it possible to specify options to 
pass on to javac?


Specifically I'm wanting to use the '-source 8' option as most distros 
now come with java11 (and make it difficult to install an earlier one) 
but I want to build a version of org.RDKit.jar that is compatible with 
older Java versions.


Thanks

Tim



___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss




Re: [Rdkit-discuss] Constructing a mol object from a PDB ligand

2019-12-16 Thread Paolo Tosco


Hi Illimar,

The RDKit PDB reader only recognizes standard amino acids and, once the 
PR I submitted on Saturday (https://github.com/rdkit/rdkit/pull/2850) is 
merged, nucleic acid bases.


Anything else will not have the correct hybridization/bond orders 
perceived, as those are not encoded in the PDB format and the PDB reader 
does not have any functionality to do that.


The 1ARJ case is peculiar, as it has an ARG residue which would be 
recognized if it were in the ATOM records, but not in the HETATM 
section, for which no attempt to perceive the correct hybridization/bond 
orders is made.


My suggestion, if you are using standard PDB files, is to download the 
SDF file:


https://www.rcsb.org/pdb/download/downloadLigandFiles.do?ligandIdList=A2F=3GOT=all=false=false

and construct your RDKit molecule from that.

You should be able to automate that without too much effort either 
constructing URLs using the template above or using the PDB REST API.


Cheers,
p.

On 16/12/2019 18:24, Illimar Hugo Rekand wrote:

Thanks, Paolo, for a good and clear example.


I adapted your code into my workflow to calculate some Lipinski-properties of 
RNA pdb-structures, and ran into some issues. I'm not sure if I should make a 
new thread or throw this onto this one I already created?


I used the following code under


from rdkit import Chem
from rdkit.Chem import rdmolops, Lipinski
from urllib.request import urlopen
import gzip
import pprint
pp = pprint.PrettyPrinter(indent=4)


Lipinski_dic = {
    'FractionCSP3': Lipinski.FractionCSP3,
    'HeavyAtomCount': Lipinski.HeavyAtomCount,
    'NHOHCount': Lipinski.NHOHCount,
    'NOCount': Lipinski.NOCount,
    'NumAliphaticCarbocycles': Lipinski.NumAliphaticCarbocycles,
    'NumAliphaticHeterocycles': Lipinski.NumAliphaticHeterocycles,
    'NumAliphaticRings': Lipinski.NumAliphaticRings,
    'NumAromaticCarbocycles': Lipinski.NumAromaticCarbocycles,
    'NumAromaticHeterocycles': Lipinski.NumAromaticHeterocycles,
    'NumAromaticRings': Lipinski.NumAromaticRings,
    'NumHAcceptors': Lipinski.NumHAcceptors,
    'NumHDonors': Lipinski.NumHDonors,
    'NumHeteroatoms': Lipinski.NumHeteroatoms,
    'NumRotatableBonds': Lipinski.NumRotatableBonds,
    'NumSaturatedCarbocycles': Lipinski.NumSaturatedCarbocycles,
    'NumSaturatedHeterocycles': Lipinski.NumSaturatedHeterocycles,
    'NumSaturatedRings': Lipinski.NumSaturatedRings,
    'RingCount': Lipinski.RingCount,
}

url = "https://files.rcsb.org/download/1arj.pdb.gz"
pdb_data = gzip.decompress(urlopen(url).read())
mol = Chem.RWMol(Chem.MolFromPDBBlock(pdb_data))
# cleave every bond that connects a HETATM atom to a non-HETATM atom
bonds_to_cleave = {
    (b.GetBeginAtomIdx(), b.GetEndAtomIdx())
    for b in mol.GetBonds()
    if b.GetBeginAtom().GetPDBResidueInfo().GetIsHeteroAtom()
    ^ b.GetEndAtom().GetPDBResidueInfo().GetIsHeteroAtom()
}
for b in bonds_to_cleave:
    mol.RemoveBond(*b)
# keep only the fragments that are made of HETATM atoms
hetatm_frags = [
    f for f in rdmolops.GetMolFrags(mol, asMols=True, sanitizeFrags=True)
    if f.GetNumAtoms()
    and f.GetAtomWithIdx(0).GetPDBResidueInfo().GetIsHeteroAtom()
]
for hetatm in hetatm_frags:
    res_name = hetatm.GetAtomWithIdx(0).GetPDBResidueInfo().GetResidueName()
    calculated_props = {}
    for prop in Lipinski_dic:
        function = Lipinski_dic[prop]
        x = function(hetatm)
        calculated_props[prop] = x
    pp.pprint(calculated_props)


and as you can see the properties of the ligand doesn't match up with what is 
expected (The number of SP3-atoms doesn't match up). When parsing through the 
structure 3got, it fails to recognize the aromatic rings of the ligand A2F. I'm 
assuming this is caused by RDKit not assigning bond orders correctly when 
reading in RNA and DNA pdb files (something which I have reported in earlier on 
this mailing list)?


Running hetatm.UpdatePropertyCache(strict=True) does not remedy this problem. 
Is there a clever way I can fix this quickly without waiting for this to be 
fixed in a future version?


Illimar Rekand
Ph.D. candidate,
Brenk-lab, Haug-lab
Department of Biomedicine
Department of Chemistry
University of Bergen



From: Illimar Hugo Rekand
Sent: Monday, December 16, 2019 5:55:56 PM
To: Paolo Tosco
Subject: Re: [Rdkit-discuss] Constructing a mol object from a PDB ligand


Hey, Paolo,


thanks for a good and clear example!


all the best,


Illimar Rekand
Ph.D. candidate,
Brenk-lab, Haug-lab
Department of Biomedicine
Department of Chemistry
University of Bergen


____
From: Paolo Tosco 
Sent: Monday, December 16, 2019 5:52:18 PM
To: Illimar Hugo Rekand; rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Constructing a mol object from a PDB ligand

Hi Illimar,

this gist:

https://gist.github.com/ptosco/2ee199b219

Re: [Rdkit-discuss] Constructing a mol object from a PDB ligand

2019-12-16 Thread Paolo Tosco


Hi Illimar,

this gist:

https://gist.github.com/ptosco/2ee199b219b27e01052a7a1433b3bd22

shows a way to achieve this.

Cheers,
p.

On 16/12/2019 16:07, Illimar Hugo Rekand wrote:

Hello, everyone


Is there a simple way to create a mol object from just the HETATM/ligand lines 
from a pdb-file?

Would it be viable to create a function where you could create a mol object 
from specific lines within a pdb-file?


Illimar Rekand
Ph.D. candidate,
Brenk-lab, Haug-lab
Department of Biomedicine
Department of Chemistry
University of Bergen



___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss




Re: [Rdkit-discuss] About RemoveHs and AddHs

2019-12-10 Thread Paolo Tosco


Hi Leon,

check the documentation for Chem.AddHs():

http://rdkit.org/docs/source/rdkit.Chem.rdmolops.html?highlight=addhs#rdkit.Chem.rdmolops.AddHs

ARGUMENTS:

[...]

addCoords: (optional) if this toggle is set, The Hs will have 3D 
coordinates set. Default value is 0 (no 3D coords).


So if you wish hydrogen coordinates to be nonzero, you need to add the 
addCoords parameter:


mol2 = Chem.AddHs(mol, addCoords=True)
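
A quick way to see the difference, as a sketch: since 123.sdf is not available here, the example assumes an embedded ethanol molecule as a stand-in for a 3D structure read from file, and the helper hydrogens_at_origin is made up for the illustration.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Build a small 3D molecule, then strip its hydrogens, mimicking what
# SDMolSupplier does by default when reading a 3D SD file.
mol3d = Chem.AddHs(Chem.MolFromSmiles("CCO"))
AllChem.EmbedMolecule(mol3d, randomSeed=42)
mol = Chem.RemoveHs(mol3d)


def hydrogens_at_origin(m):
    """Count hydrogens whose coordinates are exactly (0, 0, 0)."""
    conf = m.GetConformer()
    count = 0
    for atom in m.GetAtoms():
        if atom.GetAtomicNum() == 1:
            p = conf.GetAtomPosition(atom.GetIdx())
            if p.x == 0.0 and p.y == 0.0 and p.z == 0.0:
                count += 1
    return count


# Default: the new hydrogens are all placed at the origin.
without_coords = hydrogens_at_origin(Chem.AddHs(mol))
# addCoords=True: the new hydrogens get positions derived from the heavy atoms.
with_coords = hydrogens_at_origin(Chem.AddHs(mol, addCoords=True))
print(without_coords, with_coords)
```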

Cheers,
p.

On 10/12/2019 19:19, topgunhaides . wrote:

Hi guys,

Can anyone help me with the RemoveHs() and AddHs()? Please see 
example below.



from rdkit import Chem
from rdkit.Chem import AllChem

suppl = Chem.SDMolSupplier('123.sdf')
# 123.sdf has 3D structure with Hs coordinates
# By default, Hs are removed by SDMolSupplier (see mol1.sdf)

for mol in suppl:
    print(mol.GetNumAtoms())
    w1 = Chem.SDWriter('./mol1.sdf')
    w1.write(mol)
    mol2 = Chem.AddHs(mol)
    print(mol2.GetNumAtoms())
    w2 = Chem.SDWriter('./mol2.sdf')
    w2.write(mol2)

    AllChem.EmbedMultipleConfs(mol2, numConfs=1000, maxAttempts=1000,
                               pruneRmsThresh=1.0)

    cids = [conf.GetId() for conf in mol2.GetConformers()]


It is strange that the Hs coordinates after calling Chem.AddHs(mol) 
are all zeros (see below for part of the mol2.sdf):


..
    5.1350   -0.0950    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    7.7331    0.4050    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.4030   -0.0950    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
..

As you can see I am using mol2 for embedding, but its Hs coordinates 
are apparently problematic.

Can anyone help me with this?
I am planning to add Hs for embedding and optimization, then remove Hs 
for RMSD, and finally add Hs back for writing into output file. Thank you!


Best,
Leon




___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] Turn off warning in Huckel calculations.

2019-12-10 Thread Paolo Tosco


Dear Jan,

That warning is printed straight to stderr by the YAeHMOP code, so 
disabling Boost logging won't help.


If you are using the YAeHMOP code from Python, your best option is 
probably to catch stderr as I described here:


https://sourceforge.net/p/rdkit/mailman/message/36781695/

f = io.BytesIO()
with stderr_redirector(f):
    ...your YAeHMOP call(s)...
grabbed_stderr = f.getvalue().decode('utf-8')

The warnings should be grabbed by the redirector and should now be found 
in grabbed_stderr.


HTH, cheers
p.

On 10/12/2019 10:10, Jan Halborg Jensen wrote:
I am using the new Huckel feature to find bonded atoms using bond 
orders (https://github.com/jensengroup/xyz2mol : xyz2AC_huckel). This 
means I am doing a calculation on a mol object with no bonds, based on 
the xyz coordinates I read in, including hydrogens.


A calculation on water gives the following warning
!!! Warning !!! Distance between atoms 2 and 1 (0.962107 A) is suspicious.
!!! Warning !!! Distance between atoms 3 and 1 (0.962107 A) is suspicious.

where atom 2 and 3 are Hs. I believe this warning is because there are 
no bonds defined. Is there a way to turn off this warning?


rdBase.DisableLog('rdApp.error') doesn’t work.

Best regards, Jan


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] RD Kit PostgreSQL in a container

2019-12-04 Thread Paolo Tosco


Hi,

you might want to take a look at this Docker image maintained by Matt Swain:

https://github.com/mcs07/docker-postgres-rdkit

Cheers,
p.

On 04/12/2019 18:36, Webster Homer wrote:


I’m looking at running  RD Kit Postgresql cartridge in a docker 
container. Has anyone done this? There are PostgreSQL containers 
available on line at https://hub.docker.com/_/postgres if there is an 
existing dockerfile with the RDKit extension, that would be great.


If not has anyone built one? Ideally I’d start from one of the 
existing dockerfiles.


RDKit Postgresql in the current distribution is version 11.2, the 
dockerfiles on the hub include an 11 and an 11.6 version. Any idea as 
to which one to use?


I’m new to dockerfiles, I’d appreciate any suggestions

Regards,

Webster Homer




___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] Perceived bond orders of RNA PDB-files

2019-12-04 Thread Paolo Tosco


Hi Illimar,

that's because the StandardPDBDoubleBond() function in 
ProximityBonds.cpp only takes into account standard amino acids, but not 
nucleotides. I'll submit a PR later tonight to include standard DNA and 
RNA bases.


Cheers,
p.

On 04/12/2019 08:28, Illimar Hugo Rekand wrote:

Hello, everyone


I was wondering if there is any reason for why aromaticity and perceived bond 
orders are not set properly when using the MolFromPDB-function for RNA PDB 
files, while it works perfectly for protein PDB files?


Illimar Rekand
Ph.D. candidate,
Brenk-lab, Haug-lab
Department of Biomedicine
Department of Chemistry
University of Bergen



___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss




Re: [Rdkit-discuss] (no subject)

2019-12-03 Thread Paolo Tosco


Hi Eric,

as I mentioned, I do not have access to ChemDraw, so I am afraid I can't 
help more with that.


A possible workaround is to install OpenBabel in your conda environment 
and use that to do a cdx->mol conversion. While it should be possible to 
use the openbabel or pybel module, for some reason I couldn't get it to 
work, so in the example below I am calling the obabel.exe executable.


import subprocess
from rdkit import Chem
from rdkit.Chem.Draw import rdMolDraw2D
from IPython.display import SVG

RDKit WARNING: [17:45:59] Enabling RDKit 2019.09.1 jupyter extensions

proc = subprocess.Popen("obabel.exe -i cdx 1oit_2d.cdx -o mol".split(),
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = proc.communicate()

rdk_mol = Chem.MolFromMolBlock(stdout.decode("UTF-8"))

drawer = rdMolDraw2D.MolDraw2DSVG(400, 400)
Chem.Kekulize(rdk_mol)
rdMolDraw2D.PrepareAndDrawMolecule(drawer, rdk_mol)
drawer.FinishDrawing()
svg = drawer.GetDrawingText()

SVG(svg)

 

This should allow you to easily automate the SVG (or PNG) generation 
process.


HTH, cheers
p.

On 03/12/2019 15:08, Eric Murphy wrote:

Hi Paolo, all,

Thank you for responding. I tried this out this morning - though I 
could not find chemdraw using the combrowser.


Instead I went to the registry editor grabbed the CLSID from 
HKEY_CLASSES_ROOT\ChemDraw.Application.15.0\CLSID. (As far as I 
understand this should be identical to the GUID) I replaced the GUID 
in 
\Local\Continuum\anaconda3\pkgs\rdkit-2019.09.1-py37h422b363_0\Lib\site-packages\rdkit\utils\chemdraw.py 
- however, I'm not seeing any change in the error when I go to import 
via "from rdkit.utils import chemdraw".


Please let me know if I am doing something wrong.

Regards,
Eric Murphy, PhD


On Tue, Dec 3, 2019 at 3:47 AM Paolo Tosco <paolo.tosco.m...@gmail.com> wrote:


Hi Eric,

it looks like the GUID in rdkit/utils/chemdraw.py might be outdated.

Unfortunately I don't have access to ChemDraw, so I can't test
this myself, but you might be able to find the current ChemDraw
GUID as follows.

From a Python prompt, issue the following commands:

from win32com.client import combrowse
combrowse.main()

This will pop up a COM browser. Expand the "Registered Type
Libraries" and look for ChemDraw.
Expand it, and you should find the IID (which is a GUID)
corresponding to ChemDraw.
Try replacing 5F646AAB-3B56-48D2-904C-A68D7989C251 in
rdkit/utils/chemdraw.py and see if it helps.

Cheers,
p.

On 02/12/2019 21:04, Eric Murphy wrote:

Hello all,

I'm trying to make use of the rdkit.utils.chemdraw to automate
conversion of cdx files to png's. and other formats. However, I'm
getting the following error message: ImportError: ChemDraw
version (at least version 7) not found.

I'm currently using windows 10 with anaconda 3 with rdkit
2019.09.1 installed from conda forge. I do have chemdraw
professional 15.0 installed, so I am wondering if there is
anything that I need to do to the path, etc.

Regards,
Eric Murphy, PhD
Multiphase Flow Scientist
Mechanical Engineer
murphyericja...@gmail.com <mailto:murphyericja...@gmail.com> |
563-449-6661

LinkedIn <https://www.linkedin.com/in/eric-james-murphy/> |
ResearchGate
<https://www.researchgate.net/profile/Eric_Murphy5> | Google
Scholar
<https://scholar.google.com/citations?user=3Mu7770J=en>



Re: [Rdkit-discuss] handling of stereo information from mol files when not sanitizing

2019-12-03 Thread Paolo Tosco


Hi Rasmus,

the problem is that, as stated in the rdmolfiles.MolFromMolFile() 
docs, the removeHs option is only honored when sanitize is True.


So to obtain sensible results without sanitizing you should rather do 
something like:


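from rdkit import Chem

# set_correct_Chiral_flags() is assumed to be the helper from the attached script
m1 = Chem.MolFromMolFile('Ran1.sdf', sanitize=False)
# strip hydrogens explicitly, since removeHs is ignored without sanitization
m1 = Chem.RemoveHs(m1, sanitize=False)
print( Chem.MolToSmiles(set_correct_Chiral_flags(m1), isomericSmiles=True) )
m2 = Chem.MolFromMolFile('Ran2.sdf', sanitize=False)
m2 = Chem.RemoveHs(m2, sanitize=False)
print( Chem.MolToSmiles(set_correct_Chiral_flags(m2), isomericSmiles=True) )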

You may check the individual sanitization operations here:
https://www.rdkit.org/docs/source/rdkit.Chem.rdmolops.html?highlight=rdmolops%20sanitizeflags#rdkit.Chem.rdmolops.SanitizeFlags
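
As a rough illustration of what those flags control, the sketch below runs every sanitization step except kekulization and aromaticity perception on an unsanitized molecule. The alanine SMILES is only a stand-in for the attached SDF files, which are not reproduced here, and the choice of steps to skip is an assumption that depends on what you want to treat as ground truth.

```python
from rdkit import Chem

# Stand-in input (assumption): parse without any sanitization.
mol = Chem.MolFromSmiles('C[C@H](N)C(=O)O', sanitize=False)

# Start from all operations and switch off the ones to skip.
flags = Chem.SanitizeFlags
ops = (flags.SANITIZE_ALL
       ^ flags.SANITIZE_KEKULIZE
       ^ flags.SANITIZE_SETAROMATICITY)

# With catchErrors=True the first failing operation is returned instead
# of an exception being raised; SANITIZE_NONE means all steps passed.
failed = Chem.SanitizeMol(mol, sanitizeOps=ops, catchErrors=True)
print(failed)
print(Chem.MolToSmiles(mol))
```

Switching individual flags off one at a time is a simple way to see which operation changes the output you care about.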

Cheers,
p.

On 03/12/2019 12:46, Rasmus "Termo" Lundsgaard wrote:

Hi all

I would like to avoid sanitizing the sdf files, as information in 
these files should be seen as the ground truth.


I however have some problems in figuring out how to read and set 
chiral information from the file and also have RDKit behave the same 
way every time. Attached are two sdf files with no 3D information and only 
stereo information in the atoms section for R-alanine. The only 
difference, as I see it, is the order of the lines of the bond information.
Even so, I get two different SMILES back with isomeric information when 
not sanitizing.


Attached is also the minimal Python code, which for me at least outputs:

not setting chiral flags
CC(N)C(=O)O
CC(N)C(=O)O

setting chiral flags
[H]OC(=O)[C@]([H])(N([H])[H])C([H])([H])[H]
[H]OC(=O)[C@@]([H])(N([H])[H])C([H])([H])[H]

setting chiral flags and sanitize
C[C@@H](N)C(=O)O
C[C@@H](N)C(=O)O


Any ideas as to why this happens and how I can handle it strictly? Also, 
what exactly does the sanitizing do?


Regards Rasmus




Re: [Rdkit-discuss] Segmentation fault when using MMFFOptimizeMoleculeConfs

2019-12-03 Thread Paolo Tosco


Hi Kas,

Thanks for reporting this bug - I have just submitted a PR to fix it.

The reason why you got this bug in the first place is that you are 
creating an O-Na bond, then assigning a +1 charge to oxygen. MMFF94 will 
choke on this, as it does not have parameters for that beast.


The segfault was caused by the MMFFOptimizeMoleculeConfs code not 
handling the MMFF94 error condition properly.


If you wish to obtain a few reasonable starting geometries for an adduct 
between neutral glycine and a sodium ion, and then optimize them, you 
might rather do the following:


from rdkit import Chem
from rdkit.Chem import AllChem

glycine = Chem.MolFromSmiles('NCC(=O)O.[Na+]')
glycine = Chem.AddHs(glycine)

potential = AllChem.ETKDG()
AllChem.EmbedMolecule(glycine, potential)
ids = AllChem.EmbedMultipleConfs(glycine, 50, enforceChirality=True)
_=AllChem.MMFFOptimizeMoleculeConfs(glycine,
    ignoreInterfragInteractions=False)
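
To guard against structures that MMFF94 cannot type, a check with MMFFHasAllMoleculeParams() before optimizing may help. This is a minimal sketch under the assumption that a small number of conformers and a fixed random seed are acceptable for illustration; the variable names are mine.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Neutral glycine plus a separate sodium cation, as suggested above.
adduct = Chem.AddHs(Chem.MolFromSmiles('NCC(=O)O.[Na+]'))

# True only if MMFF94 atom types/parameters exist for every atom.
has_params = AllChem.MMFFHasAllMoleculeParams(adduct)

results = []
if has_params:
    cids = AllChem.EmbedMultipleConfs(adduct, 5, randomSeed=42)
    # Keep interactions between the two fragments switched on, otherwise
    # the ion and the glycine would not see each other.
    results = AllChem.MMFFOptimizeMoleculeConfs(
        adduct, ignoreInterfragInteractions=False)

# Each entry is a (not_converged, energy) pair, one per conformer.
for not_converged, energy in results:
    print(not_converged, energy)
```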

Cheers,
p.

On 03/12/2019 10:37, Kas Houthuijs wrote:

Hi all,

I've been using RDKit to generate geometries for sodium adduct ions 
that I subsequently use in a workflow. I wrote the following procedure 
for adding a Na+ to a neutral about a year ago, when I was still very 
new at RDKit:


from rdkit import Chem
from rdkit.Chem import AllChem

glycine = Chem.MolFromSmiles('NCC(=O)O')
glycine = Chem.AddHs(glycine)
glycineRW = Chem.RWMol(glycine)
Na = glycineRW.AddAtom((Chem.Atom(11)))
glycineRW.AddBond(3, Na, Chem.BondType.SINGLE)
glycineRW.GetAtomWithIdx(3).SetFormalCharge(1)
Chem.SanitizeMol(glycineRW)

potential = AllChem.ETKDG()
AllChem.EmbedMolecule(glycineRW, potential)
ids = AllChem.EmbedMultipleConfs(glycineRW, 50, enforceChirality=True)
_=AllChem.MMFFOptimizeMoleculeConfs(glycineRW)

Although a chemist might be a bit offended by this representation of a 
sodium adduct ion (since the charge ends up on the oxygen instead of 
the sodium), it performed quite well in my workflow. However, since 
upgrading to rdkit version 2019.09.01 the last line


_=AllChem.MMFFOptimizeMoleculeConfs(glycineRW)

returns a Segmentation fault. This did not occur for version 2019.03.2 
and earlier. Additionally the segmentation fault does not occur when 
substituting the sodium for a hydrogen.


Any advice on resolving the segmentation fault would be greatly 
appreciated! And if you have advice on making more realistic sodium 
adduct ions, that would also be very welcome ;-)


Best wishes,
Kas Houthuijs



Re: [Rdkit-discuss] Segmentation fault when using MMFFOptimizeMoleculeConfs

2019-12-03 Thread Paolo Tosco


Hi Kas,

I'll have a look into this and get back to you soon.

Cheers,
p.

On 03/12/2019 10:37, Kas Houthuijs wrote:

Hi all,

I've been using RDKit to generate geometries for sodium adduct ions 
that I subsequently use in a workflow. I wrote the following procedure 
for adding a Na+ to a neutral about a year ago, when I was still very 
new at RDKit:


from rdkit import Chem
from rdkit.Chem import AllChem

glycine = Chem.MolFromSmiles('NCC(=O)O')
glycine = Chem.AddHs(glycine)
glycineRW = Chem.RWMol(glycine)
Na = glycineRW.AddAtom((Chem.Atom(11)))
glycineRW.AddBond(3, Na, Chem.BondType.SINGLE)
glycineRW.GetAtomWithIdx(3).SetFormalCharge(1)
Chem.SanitizeMol(glycineRW)

potential = AllChem.ETKDG()
AllChem.EmbedMolecule(glycineRW, potential)
ids = AllChem.EmbedMultipleConfs(glycineRW, 50, enforceChirality=True)
_=AllChem.MMFFOptimizeMoleculeConfs(glycineRW)

Although a chemist might be a bit offended by this representation of a 
sodium adduct ion (since the charge ends up on the oxygen instead of 
the sodium), it performed quite well in my workflow. However, since 
upgrading to rdkit version 2019.09.01 the last line


_=AllChem.MMFFOptimizeMoleculeConfs(glycineRW)

returns a Segmentation fault. This did not occur for version 2019.03.2 
and earlier. Additionally the segmentation fault does not occur when 
substituting the sodium for a hydrogen.


Any advice on resolving the segmentation fault would be greatly 
appreciated! And if you have advice on making more realistic sodium 
adduct ions, that would also be very welcome ;-)


Best wishes,
Kas Houthuijs



Re: [Rdkit-discuss] (no subject)

2019-12-03 Thread Paolo Tosco


Hi Eric,

it looks like the GUID in rdkit/utils/chemdraw.py might be outdated.

Unfortunately I don't have access to ChemDraw, so I can't test this 
myself, but you might be able to find the current ChemDraw GUID as follows.


From a Python prompt, issue the following commands:

from win32com.client import combrowse
combrowse.main()

This will pop up a COM browser. Expand the "Registered Type Libraries" 
and look for ChemDraw.
Expand it, and you should find the IID (which is a GUID) corresponding 
to ChemDraw.
Try replacing 5F646AAB-3B56-48D2-904C-A68D7989C251 in 
rdkit/utils/chemdraw.py and see if it helps.
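
The replacement itself can also be scripted. This is only a sketch: replace_guid() is a hypothetical helper, not part of the RDKit, and it assumes the old GUID appears literally in the text of chemdraw.py.

```python
import re
import uuid

OLD_GUID = "5F646AAB-3B56-48D2-904C-A68D7989C251"  # value quoted above


def replace_guid(source_text, new_guid, old_guid=OLD_GUID):
    """Return source_text with old_guid swapped for new_guid.

    new_guid may be given with or without surrounding braces, as the
    registry editor shows it; uuid.UUID() raises ValueError if it is
    not a well-formed GUID.
    """
    normalized = str(uuid.UUID(new_guid.strip("{}"))).upper()
    pattern = re.compile(re.escape(old_guid), re.IGNORECASE)
    new_text, count = pattern.subn(normalized, source_text)
    if count == 0:
        raise ValueError("old GUID not found in source text")
    return new_text
```

You would read rdkit/utils/chemdraw.py into a string, pass it through the helper together with the GUID found in the COM browser, and write the result back (keeping a backup of the original file).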


Cheers,
p.

On 02/12/2019 21:04, Eric Murphy wrote:

Hello all,

I'm trying to make use of the rdkit.utils.chemdraw to automate 
conversion of cdx files to PNGs and other formats. However, I'm 
getting the following error message: ImportError: ChemDraw version (at 
least version 7) not found.


I'm currently using windows 10 with anaconda 3 with rdkit 2019.09.1 
installed from conda forge. I do have chemdraw professional 15.0 
installed, so I am wondering if there is anything that I need to do to 
the path, etc.


Regards,
Eric Murphy, PhD
Multiphase Flow Scientist
Mechanical Engineer
murphyericja...@gmail.com  | 
563-449-6661


LinkedIn  | 
ResearchGate  | 
Google Scholar 





Re: [Rdkit-discuss] Open3DAlign scoring of existing alignment?

2019-11-26 Thread Paolo Tosco


Hi James,

What you've done seems correct to me; I have prepared an example gist here:

https://gist.github.com/ptosco/40c2530c67d9c0930b8efbc8c92da0be

which indeed shows correct matches.

One observation here is that Open3DALIGN was conceived to generate 
alignments for 3D-QSAR; it really doesn't do a great job at computing a 3D 
similarity score, as it was not made for that purpose. So the attempts 
to score an existing overlay "in place" might be disappointing.


Cheers,
p.

On 26/11/2019 17:00, James Davidson wrote:


Dear All (especially Paolo!),

I have a strong suspicion I have already asked this at some point in 
the past – so apologies in advance (but I can’t seem to find the answer)…


I am interested in taking an existing overlay of two RDKit molecules 
in 3D and scoring the overlay using Open3DAlign scoring scheme (eg 
with MMFF atom-types), but **without** trying to optimise the 
alignment or score.


I thought setting maxIters=0 in the call to AllChem.GetO3A() would do 
the trick (I even tried setting options=3 to “trigger local 
optimization”).  Eg


o3a = AllChem.GetO3A(prb_mol, ref_mol, maxIters=0, options=3)

o3a.Matches()  # Show the matches

But while the options setting certainly changes the matching atoms 
(and the score), the matches don’t seem to correspond well to my 
starting alignment…


Any advice is greatly appreciated (including, of course, simply 
pointing me to the old answer that I am likely missing!)


Kind regards

James





Re: [Rdkit-discuss] The "confID" for "MMFFOptimizeMoleculeConfs"

2019-11-19 Thread Paolo Tosco


Hi Leon,

you are right, that's a documentation bug: The confId parameter is 
actually ignored, as you have already found out.
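
If the goal is to minimize one specific conformer, MMFFOptimizeMolecule() (singular) does take a confId. A minimal sketch, with a fixed random seed chosen only for reproducibility:

```python
from rdkit import Chem
from rdkit.Chem import AllChem

mh = Chem.AddHs(Chem.MolFromSmiles('OCCCN'))
cids = list(AllChem.EmbedMultipleConfs(mh, numConfs=3, randomSeed=42))

# Minimize only the first conformer; the others are left untouched.
# Return value: 0 = converged, 1 = more iterations needed,
# -1 = the force field could not be set up.
status = AllChem.MMFFOptimizeMolecule(mh, mmffVariant='MMFF94s',
                                      maxIters=1000, confId=cids[0])
print(status)
```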


Thanks for reporting this, cheers
p.

On 19/11/2019 20:56, topgunhaides . wrote:


Hi guys,

Does the "confID" argument actually work for 
"MMFFOptimizeMoleculeConfs"? Try the following code:



from rdkit import Chem
from rdkit.Chem import AllChem

mh = Chem.AddHs(Chem.MolFromSmiles('OCCCN'))
cids = AllChem.EmbedMultipleConfs(mh, numConfs=3, maxAttempts=1000,
                                  pruneRmsThresh=0.5, numThreads=0,
                                  randomSeed=-1)

# try to optimize one conformer at a time in the loop:
for cid in cids:
    mmffopt_1 = AllChem.MMFFOptimizeMoleculeConfs(mh, confId=cid,
                                                  maxIters=1000,
                                                  mmffVariant='MMFF94s',
                                                  numThreads=0)
    print(mmffopt_1)

# just optimize one specific conformer (ID = 0):
mmffopt_2 = AllChem.MMFFOptimizeMoleculeConfs(mh, confId=0, maxIters=1000,
                                              mmffVariant='MMFF94s',
                                              numThreads=0)
print(mmffopt_2)

# Or optimize all conformers:
mmffopt_3 = AllChem.MMFFOptimizeMoleculeConfs(mh, confId=-1, maxIters=1000,
                                              mmffVariant='MMFF94s',
                                              numThreads=0)
print(mmffopt_3)


In the documentation for MMFFOptimizeMoleculeConfs: "confId : indicates 
which conformer to optimize". However, in all three cases, it still 
optimizes all conformers and gives me the "whole" thing:


[(0, 1.0966514172064503), (0, -1.5120724826923375), (0, 0.6847373779429624)]
[(0, 1.0966514171119535), (0, -1.512072483200475), (0, 0.6847373779078172)]
[(0, 1.0966514168939838), (0, -1.5120724834832924), (0, 0.6847373779001575)]
[(0, 1.0966514168498929), (0, -1.512072483655178), (0, 0.6847371291858746)]
[(0, 1.096651416829605), (0, -1.5120724837465005), (0, 0.6847371291858746)]


Thank you.

Best,
Leon





Re: [Rdkit-discuss] ff.Minimize vs MMFFOptimize

2019-11-18 Thread Paolo Tosco


Hi Leon,

MMFFOptimizeMoleculeConfs() by default will distribute minimization 
tasks across all available CPU cores as it is multi-threaded at the C++ 
level, while ff.Minimize() will run single-threaded, unless you do the 
distribution of individual minimization tasks yourself in the Python 
layer (e.g., using Python multiprocessing.Pool or similar).


Apart from that technical difference, there is no scientific difference 
in the calculation itself.
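
To make the comparison concrete, here is a small sketch that runs both routes on separate copies of the same embedded molecule. The conformer count and seed are arbitrary choices for illustration; numThreads=0 asks for all available cores.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

m = Chem.AddHs(Chem.MolFromSmiles('OCCCN'))
cids = list(AllChem.EmbedMultipleConfs(m, 4, randomSeed=1))

# Route 1: single-threaded loop over conformers on a copy of the molecule.
m_loop = Chem.Mol(m)
props = AllChem.MMFFGetMoleculeProperties(m_loop, mmffVariant='MMFF94s')
loop_energies = []
for cid in cids:
    ff = AllChem.MMFFGetMoleculeForceField(m_loop, props, confId=cid)
    ff.Initialize()
    ff.Minimize(maxIts=1000)
    loop_energies.append(ff.CalcEnergy())

# Route 2: one call; conformers are distributed over threads in C++.
batch = AllChem.MMFFOptimizeMoleculeConfs(m, numThreads=0, maxIters=1000,
                                          mmffVariant='MMFF94s')
batch_energies = [energy for _, energy in batch]

for e_loop, e_batch in zip(loop_energies, batch_energies):
    print(e_loop, e_batch)
```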


Cheers,
p.

On 18/11/2019 17:05, topgunhaides . wrote:

Hey guys,

Are there any differences between ff.Minimize() and 
MMFFOptimizeMoleculeConfs() for conformer optimization? Please see the 
following example:



from rdkit import Chem
from rdkit.Chem import AllChem

m = Chem.AddHs(Chem.MolFromSmiles('OCCCN'))
cids = AllChem.EmbedMultipleConfs(m, 10, randomSeed=1)
ps = AllChem.MMFFGetMoleculeProperties(m, mmffVariant='MMFF94s')

for cid in cids:
    ff = AllChem.MMFFGetMoleculeForceField(m, ps, confId=cid)
    ff.Initialize()
    ff.Minimize(maxIts=1000)
    print(ff.CalcEnergy())

optm = AllChem.MMFFOptimizeMoleculeConfs(m, maxIters=1000,
                                         mmffVariant='MMFF94s')

print(optm)

Both give me the same set of energies for generated conformers. Thank 
you!


Best,
Leon





Re: [Rdkit-discuss] Questions about adding and removing hydrogen atoms

2019-11-18 Thread Paolo Tosco


Hi Omar,

you may have a look at this thread for an explanation and an example:

https://sourceforge.net/p/rdkit/mailman/message/36787480/

Cheers,
p.

On 18/11/2019 15:28, Omar H94 wrote:

What does it mean to take symmetry into account ?

On Mon, Nov 18, 2019 at 6:07 PM topgunhaides . wrote:


Hi Greg,

Thanks a lot! This is very helpful. Further questions:

1. If I need RMSD matrix for clustering, I guess I will have to
figure out a way to loop over all conformers to get the matrix
first, if I choose to use GetBestRMS()?

2. Does the AlignMolConformers() handle symmetry and align all
permutations to get the best "RMSlist"?

3. So I guess the EmbedMultipleConfs() also uses "standard" (no
symmetry consideration) method to compute RMS for pruning?

I appreciate your help!

Best,
Leon


On Thu, Nov 14, 2019 at 2:45 AM Greg Landrum <greg.land...@gmail.com> wrote:

Hi Leon,

There's not really a "right" answer to your question - it
depends on what you want to calculate.
I personally think it makes more sense to use the heavy atom
RMSD (which is what you get if you remove Hs before
calculating the RMSD), particularly if you are comparing to
experiment.

Note that AllChem.GetConformerRMSMatrix() does not take
symmetry into account, so you may not get the correct results.
I just opened a ticket to fix this, but in the meantime if you
have molecules with symmetry-equivalent atoms you are probably
better off generating the conformer RMS matrix manually using
GetBestRMS().
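
A minimal sketch of that manual approach, assuming a recent RDKit where rdMolAlign.GetBestRMS() takes (probe, reference, prbId, refId). The helper name best_rms_matrix() is made up for this example; it returns the lower triangle in the same flat order that GetConformerRMSMatrix() uses.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolAlign


def best_rms_matrix(mol):
    """Flat lower-triangle list of symmetry-aware RMSDs between conformers."""
    # GetBestRMS() moves the probe conformer, so keep an untouched copy
    # of the molecule to serve as the reference.
    ref = Chem.Mol(mol)
    cids = [conf.GetId() for conf in mol.GetConformers()]
    dists = []
    for i in range(1, len(cids)):
        for j in range(i):
            dists.append(rdMolAlign.GetBestRMS(mol, ref,
                                               prbId=cids[i], refId=cids[j]))
    return dists


mh = Chem.AddHs(Chem.MolFromSmiles('OCCCN'))
AllChem.EmbedMultipleConfs(mh, numConfs=3, randomSeed=42)
heavy = Chem.RemoveHs(mh)  # heavy-atom RMSD; conformers are kept
dists = best_rms_matrix(heavy)
print(dists)
```

The resulting list can be passed to a clustering routine that expects a flattened lower-triangle distance matrix.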

Best,
-greg

On Wed, Nov 13, 2019 at 5:17 PM topgunhaides . <sunzhi@gmail.com> wrote:

Hi guys,

I am new to RDKit, and have a question about adding and
removing Hs.

As recommended in the documentation, hydrogen atoms should
be added for generating conformers, optimization, etc.

However, for clustering, should the Hs be removed first,
before generating the conformer RMS matrix? For instance:


from rdkit import Chem
from rdkit.Chem import AllChem, TorsionFingerprints

suppl = Chem.SDMolSupplier('molecule.sdf')

for mol in suppl:
    mh = Chem.AddHs(mol)
    cids = AllChem.EmbedMultipleConfs(mh, numConfs=5, maxAttempts=1000,
                                      pruneRmsThresh=0.5, numThreads=0,
                                      randomSeed=1)
    m = Chem.RemoveHs(mh)
    # RMS matrix
    rmsmat = AllChem.GetConformerRMSMatrix(m, prealigned=False)
    # TFD matrix
    tfdmat = TorsionFingerprints.GetTFDMatrix(m)
    print(rmsmat)
    print(tfdmat)


Note I remove the Hs before getting RMS and TFD matrices.
Both resulting matrices are different if I do not remove
Hs. The RMS without Hs, in general, tend to be smaller
than the RMS with Hs. This will in turn affect the
subsequent clustering result.

Could you guys give me some suggestions? Thank you!

Best,
Leon




Re: [Rdkit-discuss] Fragmentation Help

2019-11-17 Thread Paolo Tosco


Hi Ben,

you could use SMARTS queries to do that, e.g.:

from rdkit import Chem
from rdkit.Chem.Draw import IPythonConsole

sucrose = "O1[C@H](CO)[C@@H](O)[C@H](O)[C@@H](O)[C@H]1O[C@@]2(O[C@@H]([C@@H](O)[C@@H]2O)CO)CO"
sucrose_mol = Chem.MolFromSmiles(sucrose)
sucrose_mol

primary_alcohol = Chem.MolFromSmarts("[CH2][OH1]")
sucrose_mol.GetSubstructMatches(primary_alcohol)

((2, 3), (19, 20), (21, 22))

secondary_alcohol = Chem.MolFromSmarts("[CH1][OH1]")
sucrose_mol.GetSubstructMatches(secondary_alcohol)

((4, 5), (6, 7), (8, 9), (15, 16), (17, 18))
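
For tertiary alcohols the same idea should work with a carbon that carries no hydrogens. The pattern below is my own suggestion rather than anything predefined in the RDKit, and tert-butanol is used only as a positive control since sucrose has no tertiary alcohols.

```python
from rdkit import Chem

sucrose_mol = Chem.MolFromSmiles(
    "O1[C@H](CO)[C@@H](O)[C@H](O)[C@@H](O)[C@H]1O"
    "[C@@]2(O[C@@H]([C@@H](O)[C@@H]2O)CO)CO")
tert_butanol = Chem.MolFromSmiles("CC(C)(C)O")

# sp3 carbon with four connections and zero hydrogens, bonded to an OH
tertiary_alcohol = Chem.MolFromSmarts("[CX4H0][OH1]")

sucrose_hits = sucrose_mol.GetSubstructMatches(tertiary_alcohol)
control_hits = tert_butanol.GetSubstructMatches(tertiary_alcohol)
print(len(sucrose_hits), len(control_hits))
```

Note that this simple pattern does not exclude carbons bearing a second oxygen (e.g. hemiketals); add further constraints if that matters for your biomass molecules.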

I hope this helps, cheers
p.

On 16/11/2019 22:02, Ben Davidson wrote:
Hi, I am using RDKit to basically identify the oxygenated groups in 
complex biomass. The ether, ester, ketone, and aldehyde getfrag 
commands are amazing. I am wondering if it is possible to identify the 
different types of alcohols in a molecule ie, primary, secondary, and 
finally tertiary. For example, sucrose has 8 alcohols, 5 
secondary and 3 primary; is it possible to have RDKit identify that 
for me? If you would like to see any code, feel free to email me at 
davidson.be...@gmail.com 

Any help would be greatly appreciated.
-Ben



Re: [Rdkit-discuss] Somthing wrong with MolDraw2DSVG

2019-11-06 Thread Paolo Tosco


Hi Zhang,

this looks like a bug triggered by molecules whose Y size is very small, 
such as all molecules which are constituted by a single, horizontal bond:


https://github.com/rdkit/rdkit/issues/2762

Cheers,
p.


On 11/04/19 07:14, Shengde wrote:

Hi,

I am trying to draw molecules in a grid using the following code.
Usually it works well. However, when I try to draw some molecules
with a single atom like "Cl", I get a *blank figure* with nothing on
it. As soon as I add a more complex molecule to the smi_list, like
["*Cl","*CC"], I get what I want again. Why can't I draw only
single-atom molecules in a figure? My rdkit version is *2019.03.2.*


from rdkit import Chem
from rdkit.Chem import Draw
import math
from IPython.display import SVG

smi_list = ["*Cl"]
mols = [Chem.MolFromSmiles(smi) for smi in smi_list]
sub_size = [250,250]
mols_num = len(mols)
columns_num = 5
rows_num = math.ceil(mols_num/5)
grid = [columns_num,rows_num]
d = Draw.rdMolDraw2D.MolDraw2DSVG(grid[0]*sub_size[0], grid[1]*sub_size[1],
                                  sub_size[0], sub_size[1])

opt = d.drawOptions()
opt.legendFontSize=20
d.SetFontSize(1.3*d.FontSize())
d.SetLineWidth(1)
d.DrawMolecules(mols,
                highlightAtoms=None,
                highlightBonds=None,
                highlightAtomColors=None,
                highlightBondColors=None,
                legends=None)
d.FinishDrawing()
SVG(d.GetDrawingText())


Thank you for your help!

Best regards,
Shengde, Zhang





Re: [Rdkit-discuss] Explicit H in substructure searches

2019-11-05 Thread Paolo Tosco


Hi Markus,

I tried to put together a comprehensible explanation in this gist:

https://gist.github.com/ptosco/1088937ce332bd66c999a2a5fbc855b3

Please also refer to the following threads on the mailing list:

https://sourceforge.net/p/rdkit/mailman/message/29679834/
https://sourceforge.net/p/rdkit/mailman/message/36696340/

and to this blog post by Roger Sayle:

https://nextmovesoftware.com/blog/2013/02/27/explicit-and-implicit-hydrogens-taking-liberties-with-valence/

for further clarifications.
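
One way to express "hydrogens at the nitrogens" without explicit H atoms in the query is to put hydrogen counts into a SMARTS pattern. This sketch reflects my reading of your question and is not necessarily the approach taken in the gist:

```python
from rdkit import Chem

m1 = Chem.MolFromSmiles('c1cn[nH]c1N')
m2 = Chem.MolFromSmiles('CNc1ccn[nH]1')
m3 = Chem.MolFromSmiles('Nc1ccnn(C)1')

# [nH]: aromatic nitrogen with exactly one hydrogen
# [NH2]: aliphatic nitrogen with exactly two hydrogens
query = Chem.MolFromSmarts('c1cn[nH]c1[NH2]')

hits = [m.HasSubstructMatch(query) for m in (m1, m2, m3)]
print(hits)  # [True, False, False]
```

Here the hydrogen counts are properties tested on the heavy atoms, so the target molecules can keep their hydrogens implicit.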

Cheers,
p.

On 05/11/2019 19:52, Markus Heller wrote:


Hi,

I’m trying to understand how to properly use explicit hydrogens in 
substructure searches.  Below is an example.  I would like to find all 
molecules that contain my query with hydrogens at the nitrogens, and I 
thought I was on the right track … Why does the first query with the 
explicit H not match m1?


Thanks

Markus



from rdkit import Chem
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import rdDepictor

rdDepictor.SetPreferCoordGen(True)
IPythonConsole.ipython_useSVG = True

m1 = Chem.MolFromSmiles('c1cn[nH]c1N')
m2 = Chem.MolFromSmiles('CNc1ccn[nH]1')
m3 = Chem.MolFromSmiles('Nc1ccnn(C)1')

# do not remove explicit H
params = Chem.SmilesParserParams()
params.removeHs=False
query = Chem.MolFromSmiles('c1cn[nH]c(N([H])([H]))1', params)

# first should be True, but all are False
m1.HasSubstructMatch(query)
m2.HasSubstructMatch(query)
m3.HasSubstructMatch(query)

# rebuild query with explicit H removed, not what I want
query = Chem.MolFromSmiles('c1cn[nH]c(N([H])([H]))1')
m1.HasSubstructMatch(query)
m2.HasSubstructMatch(query)
m3.HasSubstructMatch(query)



--

*Markus Heller, PhD*

Senior Scientist

Direct: 604.827.1122 Main: 604.827.1147

2405 Wesbrook Mall, 4th Floor, Vancouver, BC V6T 1Z3






Re: [Rdkit-discuss] Building RDKit Java Wrappers

2019-11-05 Thread Paolo Tosco


Hi Jenny,

-- Could NOT find Boost (missing: serialization) (found suitable version 
"1.65.1", minimum required is "1.56.0")


it looks like CMake found your Boost installation but either the 
serialization library was not built or possibly was not built for the 
same architecture (32-bit vs 64-bit) as the architecture you are 
attempting to build RDKit for.


In particular, I suggest that you add to your CMake command line the 
following arguments:


-G "Visual Studio 15 2017" -A x64

as the CMake generator for VS 2017 still defaults to 32-bit, whereas 
most likely you have built Boost for a 64-bit architecture.


Feel free to get back to me off-list if you still have trouble building 
the RDKit.


Cheers,
p.

On 05/11/2019 21:54, Jenny h wrote:

Dear all,

First of all sorry for the spam in case you received the mail earlier, 
but since I so far did not see it on the list I assumed it got somehow 
stuck in some spam filter.


I wanted to use the RDKit Java wrappers to just learn and play around 
with RDKit in Java (and maybe one day be able to write a KNIME node 
myself).
Unfortunately I am stuck already at the building process. 
Unfortunately, I am neither familiar with boost nor cmake, therefore I 
am sorry if I missed a trivial step.

In general I am working on Win10 64bit.

What I have done so far:
Install Visual Studio 2017
Install cmake
build boost 1.65.1 (using bootstrap and .\b2) and run b2 in the test 
folder of the serialization library, so far it looked fine (to me at 
least).

Then I added cmake as well as the boost root and boost\lib to the path.

Next I am trying to compile RDKit and the wrappers with the command:

cmake -DRDK_BUILD_PYTHON_WRAPPERS=OFF
-DRDK_BUILD_INCHI_SUPPORT=ON
-DRDK_BUILD_AVALON_SUPPORT=ON
-DRDK_BUILD_SWIG_WRAPPERS=ON
 -DBOOST_ROOT=C:\rdkitstuff\boost\boost_1_65_1 ..

However, I am getting the following output:
-- Selecting Windows SDK version 10.0.14393.0 to target Windows 10.0.18362.
-- Found Catch2 source in C:/rdkitstuff/rdkit/External/catch/catch
CATCH: C:/rdkitstuff/rdkit/External/catch/catch/single_include
-- Could NOT find InChI in system locations (missing: INCHI_LIBRARY INCHI_INCLUDE_DIR)
CUSTOM_INCHI_PATH = C:/rdkitstuff/rdkit/External/INCHI-API
-- Found InChI software locally
-- Could NOT find Boost (missing: serialization) (found suitable version "1.65.1", minimum required is "1.56.0")
== Using strict rotor definition
CMake Error at C:/rdkitstuff/CMake/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:146 (message):
  Could NOT find Boost (missing: system iostreams) (found suitable version
  "1.65.1", minimum required is "1.56.0")
Call Stack (most recent call first):
  C:/rdkitstuff/CMake/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:393 (_FPHSA_FAILURE_MESSAGE)
  C:/rdkitstuff/CMake/share/cmake-3.16/Modules/FindBoost.cmake:2162 (find_package_handle_standard_args)
  External/CoordGen/CMakeLists.txt:13 (find_package)
-- Configuring incomplete, errors occurred!
See also "C:/rdkitstuff/rdkit/build/CMakeFiles/CMakeOutput.log".
See also "C:/rdkitstuff/rdkit/build/CMakeFiles/CMakeError.log".


I have tried it with the cmake gui, the vs 2017 developer command line 
and the normal cmd from Win10. I get this error in all cases. I also 
tried it with boost 1.66.0 but the problem is the same. It looks like 
I forgot to specify some Path somewhere, although I assume that by 
specifying the BOOST_ROOT it should find boost with all necessary 
libraries. Maybe I missed something here?


I have not attached the error logs for more information since the mail 
was not sent the first time, however, if needed I am happy to try to 
send them again.


It would be great to get some help here as I am totally clueless on 
what else I could try. I am happy to provide any other information if 
more is needed.


Thanks in advance!

Best,

Jennifer




Re: [Rdkit-discuss] fingerprint a molecule with pseudoatoms denoted by 'Du'

2019-10-30 Thread Paolo Tosco


Hi Jenke,


I have put together a small gist showing a slightly hacky way to 
round-trip a molecule containing dummy atoms through a PDB block 
(assuming that your molecules do not contain astatine). If your dummy 
atoms are called "DU" rather than " *", you may just change the 
replace() expression with something that fits your needs.



HTH, cheers

p.


On 10/30/19 12:06, SCHEEN Jenke wrote:

Hi RDKitters,

I'm trying to use rdkit to generate molecular fingerprints (such as AP 
or ECFP) on molecules that have non-interactive pseudoatoms ('dummy 
atoms', denoted by Du). I attached a sample PDB file containing the 
dummy atoms on positions 21-24. Reading this file 
(Chem.rdmolfiles.MolFromPDBFile("test.pdb", sanitize=False)) throws a 
post-condition violation because the element 'Du' isn't recognised, 
which makes sense. I've been searching online and haven't been able to 
find any workarounds, do you have any suggestions?


Some notes:

  * I'm hoping that once rdkit is able to read in the pdb file the mol
object can be parsed without the FP constructor (e.g.
AllChem.GetMorganFingerprint) complaining.
  * The use of the term dummy atoms here should not be confused with
the dummy atoms depiction in fragmentising molecules in rdkit
(where * is the smiles notation).
  * For this project all I aim to do is generate structural
fingerprints for these types of ligands. This means I won't have
to worry about defining chemical properties to Du.
  * The context for this issue is that we're aiming to featurise the
ligands for an ML protocol where the dummy atoms are one of the
major descriptors of the problem.

  * I thought manually inserting a 119th element in atomic_data.cpp
might resolve the issue but I've been unable to locate the file in
my conda installation.
  * The ODDT python API seems to parse the Du element without any
issues but is limited in its FP generator diversity.


Best,

Jenke







Re: [Rdkit-discuss] AlignMol and GetBestRMS

2019-10-17 Thread Paolo Tosco


Hi Stamatia,

the difference between GetBestRMS and AlignMol is that GetBestRMS will 
yield the lowest RMSD across all symmetry-equivalent superimpositions, 
whereas AlignMol will yield the RMS from a single superimposition, by 
default obtained overlaying atoms with the same index across the two 
conformations.
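
A minimal sketch of the comparison, assuming a hypothetical test molecule (trifluoromethylbenzene, chosen only because it has symmetry-equivalent atoms) and two embedded conformers of it. Both functions move the probe in place, so each call gets its own copy:

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolAlign

# Hypothetical test molecule: the CF3 group and the ring provide
# several symmetry-equivalent atom mappings.
mol = Chem.AddHs(Chem.MolFromSmiles("FC(F)(F)c1ccccc1"))
cids = list(AllChem.EmbedMultipleConfs(mol, numConfs=2, randomSeed=42))

# AlignMol: one superimposition, atoms paired by index.
rms_align = rdMolAlign.AlignMol(Chem.Mol(mol), mol, cids[0], cids[1])

# GetBestRMS: tries every symmetry-equivalent mapping, keeps the lowest RMSD.
rms_best = rdMolAlign.GetBestRMS(Chem.Mol(mol), mol, cids[0], cids[1])

print("AlignMol RMS:  ", rms_align)
print("GetBestRMS RMS:", rms_best)
```

Since the index-to-index mapping is one of the mappings GetBestRMS considers, its value should never be higher than the AlignMol one.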


See this gist for an example:

http://htmlpreview.github.io/?https://gist.githubusercontent.com/ptosco/967ed3739cdcb169cf8308502f9c5881/raw/60e20790b1a78f88abfe348029c2134f8273a9c3/GetBestRMS.html

Cheers,
p.

On 10/17/19 16:36, Stamatia Zavitsanou wrote:
What is the difference with GetBestRMS function since the AlignMol is 
supposed to give me the minimum RMSD?


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] Setting atom type

2019-10-16 Thread Paolo Tosco


Hi Luan,

as mentioned in the e-mail thread that you have found, it is not 
currently possible to change atom type definitions from Python.


Could you give me a few more details regarding your use case? What are 
you trying to achieve? Feel free to reply to me off-list.


Thanks, cheers
p.


On 10/16/19 20:18, Luan Carvalho Martins wrote:

Dear all,

Some years ago, in this list, there was a discussion regarding the 
possibility of explicitly setting the atom type of an atom in MMFF or 
UFF (thread: 
https://sourceforge.net/p/rdkit/mailman/message/31600590/). It is not 
clear if this functionality made it into the code. Does anyone know if it 
did? Is there an alternative way to enforce an atom type in UFF/MMFF? 
E.g. by setting atom names/atomic numbers?


Thank you very much.

Sincerely,
Luan Carvalho Martins
luancarvalhomart...@gmail.com 




___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



Re: [Rdkit-discuss] identify and setformal charge of carboxylic acid

2019-10-16 Thread Paolo Tosco


Hi Jorgen,

you can use a reaction SMARTS for that purpose; see below for an example 
and the docs


https://www.rdkit.org/docs/GettingStartedInPython.html#chemical-reactions

for more information.

from rdkit import Chem
from rdkit.Chem import AllChem

# Reaction SMARTS: neutral carboxylic acid -> carboxylate
deprotonate_cooh = AllChem.ReactionFromSmarts(
    "[C:1](=[O:2])-[OH1:3]>>[C:1](=[O:2])-[O-H0:3]")

m1 = Chem.MolFromSmiles(
    'CC(=O)N[C@@H]1[C@H](C[C@@](O[C@H]1[C@H]([C@H](CO)O)O)(C(=O)O)O)O')
m2 = Chem.MolFromSmiles('CCN')
m3 = Chem.MolFromSmiles('OC(=O)CCC(=O)[O-]')
mols = (m1, m2, m3)

for m in mols:
    print(Chem.MolToSmiles(m))

CC(=O)N[C@@H]1[C@@H](O)C[C@](O)(C(=O)O)O[C@H]1[C@@H](O)[C@@H](O)CO
CCN
O=C([O-])CCC(=O)O

mols_deprot = []
for m in mols:
    # RunReactants returns an empty tuple when nothing matches;
    # in that case keep the input molecule unchanged
    m_deprot = deprotonate_cooh.RunReactants((m,))
    mols_deprot.append(m_deprot[0][0] if m_deprot else m)

for m in mols_deprot:
    print(Chem.MolToSmiles(m))

CC(=O)N[C@@H]1[C@@H](O)C[C@](O)(C(=O)[O-])O[C@H]1[C@@H](O)[C@@H](O)CO
CCN
O=C([O-])CCC(=O)[O-]
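
Note that a single RunReactants() call applies the transformation once per product, so a molecule with several COOH groups would only lose one proton in the loop above. A possible sketch for deprotonating all of them is to re-run the reaction until nothing matches; the helper name deprotonate_all_cooh and the max_steps guard are just illustrative choices, not part of RDKit:

```python
from rdkit import Chem
from rdkit.Chem import AllChem

DEPROTONATE_COOH = AllChem.ReactionFromSmarts(
    "[C:1](=[O:2])-[OH1:3]>>[C:1](=[O:2])-[O-H0:3]")


def deprotonate_all_cooh(mol, max_steps=50):
    """Hypothetical helper: apply the reaction until no COOH is left."""
    for _ in range(max_steps):
        products = DEPROTONATE_COOH.RunReactants((mol,))
        if not products:
            break
        # Take the first product and sanitize it so that H counts
        # are up to date before the next round of matching
        mol = products[0][0]
        Chem.SanitizeMol(mol)
    return mol


diacid = Chem.MolFromSmiles("OC(=O)CCC(=O)O")
dianion = deprotonate_all_cooh(diacid)
print(Chem.MolToSmiles(dianion))
```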

Cheers,
p.

On 10/16/19 18:55, Jorgen Simonsen wrote:

Hi all,

I have hit the wall with how to do this the smartest way - I have a 
bunch of molecules and I need to set their charge state. One of the 
molecules:



# Has a carboxylic acid
m1 = Chem.MolFromSmiles('CC(=O)N[C@@H]1[C@H](C[C@@](O[C@H]1[C@H]([C@H](CO)O)O)(C(=O)O)O)O')


So my question is what is the best way - iterate through the molecule 
and identify the carbon that has =O,-O attached - maybe there is 
already a functionality to do this in rdkit. Or is there a function 
that deprotonates all carboxylic groups?


Any advice on how to proceed would be very much appreciated, thanks.




___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



Re: [Rdkit-discuss] SDMolSupplier, next()

2019-10-15 Thread Paolo Tosco


Dear Jean-Marc,

in Python 3 you need to use next(suppl) or suppl.__next__().
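
A small self-contained sketch of the Python 3 idiom; it writes a throwaway SDF first (file name and molecules are arbitrary) so that it does not depend on your simple.sdf:

```python
import os
import tempfile

from rdkit import Chem

# Write a small throwaway SDF so the example is self-contained
path = os.path.join(tempfile.mkdtemp(), "simple.sdf")
writer = Chem.SDWriter(path)
for smi in ("CCO", "c1ccccc1"):
    writer.write(Chem.MolFromSmiles(smi))
writer.close()

suppl = Chem.SDMolSupplier(path)
first = next(suppl)    # Python 3 replacement for suppl.next()
second = next(suppl)
print(first.GetNumAtoms(), second.GetNumAtoms())
```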

Cheers,
p.


On 10/15/19 21:34, Jean-Marc Nuzillard wrote:

Dear all,

The code:
from rdkit import Chem
sdfnamein = "simple.sdf"
suppl = Chem.SDMolSupplier(sdfnamein)
m = suppl.next()
print(m.GetNumAtoms())
prints:
Traceback (most recent call last):
  File "demo.py", line 4, in <module>
    m = suppl.next()
AttributeError: 'SDMolSupplier' object has no attribute 'next'
even though the code in 
http://www.rdkit.org/docs/source/rdkit.Chem.rdmolfiles.html#rdkit.Chem.rdmolfiles.SDMolSupplier,
at paragraph "Lazy evaluation 2" indicates:
>>> suppl = SDMolSupplier('in.sdf')
>>> mol1 = suppl.next()
I run rdkit 2018.09.1.0 from Anaconda in Windows 10.

for mol in suppl:
    mol.GetNumAtoms()
works fine.

Best,

Jean-Marc

--
Dr. Jean-Marc Nuzillard
Institute of Molecular Chemistry, CNRS UMR 7312
Faculté des Sciences Exactes et Naturelles, Bâtiment 18
BP 1039
51687 REIMS Cedex 2
France

Tel : 33 3 26 91 82 10
Fax : 33 3 26 91 31 66
http://www.univ-reims.fr/icmr
http://eos.univ-reims.fr/LSD/CSNteam.html

http://www.univ-reims.fr/LSD/
http://www.univ-reims.fr/LSD/JmnSoft/




___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



Re: [Rdkit-discuss] Compatibility with pylint in vscode

2019-10-12 Thread Paolo Tosco

Hi Hongbin,

Try configuring

extension-pkg-whitelist=rdkit 

Then pylint should recognise RDKit methods.

Cheers,
p.

> On 12 Oct 2019, at 08:12, Hongbin Yang  wrote:
> 
> Dear RDKit users,
> 
> Does anyone use vscode with pylint support? 
> In my IDE, it tells me that "Module 'rdkit.Chem' has no 'MolFromSmiles' 
> member.", with a red wavy line under the code "Chem". The 
> conda/python environment is correctly configured and the scripts run.
> 
> I know that it may be caused by the fact that some modules and functions in 
> RDKit are just wrappers around C++, so pylint may not recognize these 
> modules or functions. 
> 
> But the red wavy line is really annoying. Are there any suggestions 
> other than disabling pylint?
> 
> Best regards,
> 
> Hongbin Yang 杨弘宾, Ph.D.
> Research: Toxicophore and Chemoinformatics
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] Inchi which flavour??

2019-10-09 Thread Paolo Tosco


Hi Mike,

please find here a solution which I have just tested and works well on 
both Unix and Windows.


You need to redirect the C++ stderr stream with ctypes around the call 
whose output you wish to grab.


This can be done defining a context manager that uses ctypes:

import os
import sys
import datetime
import ctypes
import io
import tempfile
from contextlib import contextmanager

# Adapted from
# https://eli.thegreenplace.net/2015/redirecting-all-kinds-of-stdout-in-python/

if (sys.platform == "win32"):
    kernel32 = ctypes.WinDLL("kernel32")
    # https://docs.microsoft.com/en-us/windows/console/getstdhandle
    C_STD_ERROR_HANDLE = -12
    c_stderr = kernel32.GetStdHandle(C_STD_ERROR_HANDLE)
    c_flush = kernel32.FlushFileBuffers
else:
    libc = ctypes.CDLL(None)
    c_stderr = ctypes.c_void_p.in_dll(libc, "stderr")
    c_flush = libc.fflush

@contextmanager
def stderr_redirector(stream):
    # The original fd stderr points to.
    original_stderr_fd = sys.stderr.fileno()

    def _redirect_stderr(to_fd):
        """Redirect stderr to the given file descriptor."""
        # Flush the C-level buffer stderr
        c_flush(c_stderr)
        # Flush and close sys.stderr - also closes the file descriptor (fd)
        sys.stderr.close()
        # Make original_stderr_fd point to the same file as to_fd
        os.dup2(to_fd, original_stderr_fd)
        # Create a new sys.stderr that points to the redirected fd
        sys.stderr = io.TextIOWrapper(os.fdopen(original_stderr_fd, 'wb'))

    # Save a copy of the original stderr fd in saved_stderr_fd
    saved_stderr_fd = os.dup(original_stderr_fd)
    try:
        # Create a temporary file and redirect stderr to it
        tfile = tempfile.TemporaryFile(mode='w+b')
        _redirect_stderr(tfile.fileno())
        # Yield to caller, then redirect stderr back to the saved fd
        yield
        _redirect_stderr(saved_stderr_fd)
        # Copy contents of temporary file to the given stream
        tfile.flush()
        tfile.seek(0, io.SEEK_SET)
        stream.write(tfile.read())
    finally:
        tfile.close()
        os.close(saved_stderr_fd)

Then, all you need to do to grab the RDKit warning printed to stderr is to 
use the stderr_redirector() context manager around the relevant call, then 
check the grabbed output for relevant content.


For instance, in your example wrap the Chem.MolToInchi() call as follows:

f = io.BytesIO()
with stderr_redirector(f):
    InChi = Chem.MolToInchi(Chem.MolFromSmiles(y))
grabbed_stderr = f.getvalue().decode('utf-8')
if ("WARNING" in grabbed_stderr):
    print("caught: ", grabbed_stderr)
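
As a possible alternative that avoids stream redirection altogether, here is a sketch that assumes your RDKit build includes InChI support and that the low-level rdkit.Chem.rdinchi.MolToInchi() wrapper returns the InChI together with a return code and the message produced by the InChI library. Whether every warning you are after ends up in that message is something you would need to verify on your own structures:

```python
from rdkit import Chem
from rdkit.Chem import rdinchi

mol = Chem.MolFromSmiles("CCC1=CN=C(NC1=O)NC")

# The low-level call returns a tuple instead of only logging:
# (InChI string, return code, message, log, aux info)
inchi, retcode, message, log, aux = rdinchi.MolToInchi(mol, "")

if retcode != 0:
    # A non-zero return code signals a warning or an error from InChI
    print("InChI reported:", message)
print(inchi)
```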

Cheers,
p.

On 09/10/2019 18:10, Mike Mazanetz wrote:


Hi,

Many thanks for this, it is very helpful to see some code.

Yes, as it stands, I have yet to get the warnings that are seen in stdout 
sent to a file; only errors seem to find their way to my files.


Usually warnings about stereochemistry don’t get captured. Has anyone else seen 
this? I’m guessing it’s the same for failed InChIs too?


Thanks,

mike

*From:*Scalfani, Vincent 
*Sent:* 09 October 2019 14:40
*To:* Maciek Wójcikowski ; Greg Landrum 


*Cc:* RDKit Discuss 
*Subject:* Re: [Rdkit-discuss] Inchi which flavour??

Hi Maciek and Mike,

If I understand your question correctly, you can specify InChI option 
parameters when calculating InChIs. Here is an example:


m = Chem.MolFromSmiles('CCC1=CN=C(NC1=O)NC')

Chem.MolToInchi(m)

'InChI=1S/C7H11N3O/c1-3-5-4-9-7(8-2)10-6(5)11/h4H,3H2,1-2H3,(H2,8,9,10,11)'

Now, try with one of the non-standard options such as FixedH:

Chem.MolToInchi(m,'/FixedH')

'InChI=1/C7H11N3O/c1-3-5-4-9-7(8-2)10-6(5)11/h4H,3H2,1-2H3,(H2,8,9,10,11)/f/h8,10H'

To answer the question of what happens when the InChI calculation 
fails, I get an empty string.


m = 
Chem.MolFromSmiles('[C@H]1([C@H](C1C2[C@@H]([C@@H]2C(=O)O)C(=O)O)C(=O)O)C(=O)O')


Chem.MolToInchi(m)

'  '

There is also an InChI option that can warn on empty structures, and 
calculate an empty InChI, which I am assuming is supposed to be 
‘InChI=1S//’, however, when trying this option I get the same result 
as above.


Chem.MolToInchi(m,'/WarnOnEmptyStructure')

'  '

I hope that helps.

Vin

*From:*Maciek Wójcikowski >

*Sent:* Wednesday, October 9, 2019 3:41 AM
*To:* Greg Landrum >
*Cc:* RDKit Discuss >

*Subject:* Re: [Rdkit-discuss] Inchi which flavour??

Mike,

On top of what Greg said what might be particularly useful is an 
options parameter where you can pass some non default params to InChI 
call.


On Wed, 9 Oct 2019, 07:22, Greg Landrum wrote:


Hi Mike,

The InChI API itself is not exposed. The contents of the
module are in the documentation along with some explanations of
how to call it:

Re: [Rdkit-discuss] Inchi which flavour??

2019-10-09 Thread Paolo Tosco


Hi Mike,

as I promised I'll put together something for you to capture warnings; 
I'll try to get it done tonight.


p.


On 10/09/19 18:10, Mike Mazanetz wrote:


Hi,

Many thanks for this, it is very helpful to see some code.

Yes, as it stands, I have yet to get the warnings that are seen in stdout 
sent to a file; only errors seem to find their way to my files.


Usually warnings about stereochemistry don’t get captured. Has anyone else seen 
this? I’m guessing it’s the same for failed InChIs too?


Thanks,

mike

*From:*Scalfani, Vincent 
*Sent:* 09 October 2019 14:40
*To:* Maciek Wójcikowski ; Greg Landrum 


*Cc:* RDKit Discuss 
*Subject:* Re: [Rdkit-discuss] Inchi which flavour??

Hi Maciek and Mike,

If I understand your question correctly, you can specify InChI option 
parameters when calculating InChIs. Here is an example:


m = Chem.MolFromSmiles('CCC1=CN=C(NC1=O)NC')

Chem.MolToInchi(m)

'InChI=1S/C7H11N3O/c1-3-5-4-9-7(8-2)10-6(5)11/h4H,3H2,1-2H3,(H2,8,9,10,11)'

Now, try with one of the non-standard options such as FixedH:

Chem.MolToInchi(m,'/FixedH')

'InChI=1/C7H11N3O/c1-3-5-4-9-7(8-2)10-6(5)11/h4H,3H2,1-2H3,(H2,8,9,10,11)/f/h8,10H'

To answer the question of what happens when the InChI calculation 
fails, I get an empty string.


m = 
Chem.MolFromSmiles('[C@H]1([C@H](C1C2[C@@H]([C@@H]2C(=O)O)C(=O)O)C(=O)O)C(=O)O')


Chem.MolToInchi(m)

'  '

There is also an InChI option that can warn on empty structures, and 
calculate an empty InChI, which I am assuming is supposed to be 
‘InChI=1S//’, however, when trying this option I get the same result 
as above.


Chem.MolToInchi(m,'/WarnOnEmptyStructure')

'  '

I hope that helps.

Vin

*From:*Maciek Wójcikowski >

*Sent:* Wednesday, October 9, 2019 3:41 AM
*To:* Greg Landrum >
*Cc:* RDKit Discuss >

*Subject:* Re: [Rdkit-discuss] Inchi which flavour??

Mike,

On top of what Greg said what might be particularly useful is an 
options parameter where you can pass some non default params to InChI 
call.


On Wed, 9 Oct 2019, 07:22, Greg Landrum wrote:


Hi Mike,

The InChI API itself is not exposed. The contents of the
module are in the documentation along with some explanations of
how to call it:

http://rdkit.org/docs/source/rdkit.Chem.rdinchi.html

If something is missing there, please let us know.

-greg

On Tue, Oct 8, 2019 at 5:20 PM mailto:mi...@novadatasolutions.co.uk>> wrote:

Dear RdKit users,

I was reading the inchi module docs and I couldn't find
methods to call the InChI API.  Are these exposed in RDKit?

It says the default is the standard Inchi.  What happens when
this conversion fails?

Thanks,

Mike

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net

https://lists.sourceforge.net/lists/listinfo/rdkit-discuss






___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



Re: [Rdkit-discuss] regarding hydrogens from SMILES

2019-10-08 Thread Paolo Tosco


Hi Jorgen,

use the MMFF94s variant of the forcefield if you wish to force trigonal 
nitrogens to be planar:


AllChem.MMFFOptimizeMolecule(m2, mmffVariant="MMFF94s")
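
A short sketch of the full sequence, using acetamide as a hypothetical small test case rather than your ATP structure. As far as I know, the return value of MMFFOptimizeMolecule is 0 when the minimization converged, 1 when more iterations are needed and -1 when the force field could not be set up, so it is worth checking it and raising maxIters if necessary:

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Hypothetical small test case with a trigonal nitrogen
mol = Chem.AddHs(Chem.MolFromSmiles("CC(N)=O"))
AllChem.EmbedMolecule(mol, randomSeed=42)

# MMFF94s is the variant that favours planar trigonal nitrogens
status = AllChem.MMFFOptimizeMolecule(mol, mmffVariant="MMFF94s",
                                      maxIters=2000)
if status == 0:
    print("converged")
elif status == 1:
    print("not converged yet, try a larger maxIters")
else:
    print("MMFF could not be set up for this molecule")
```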

More information here:

https://doi.org/10.1002/(SICI)1096-987X(199905)20:7%3C720::AID-JCC7%3E3.0.CO;2-X

Cheers,
p.

On 10/08/19 15:27, Jorgen Simonsen wrote:

Cheers Paolo,

It looks like that it keeps sp3 as the optimal geometry and not sp2.
The optimization did converge :

AllChem.MMFFOptimizeMolecule(m2,)

#returned 1

I think it is getting the types wrong or I have to specify the types?



On Tue, Oct 8, 2019 at 10:10 AM Paolo Tosco wrote:


Hi Jorgen,

optimizing your molecule geometry with UFF or MMFF should fix the
problem:

AllChem.UFFOptimizeMolecule(m2)

or

AllChem.MMFFOptimizeMolecule(m2)

see rdkit.Chem.rdForceFieldHelpers.UFFOptimizeMolecule

<https://www.rdkit.org/docs/source/rdkit.Chem.rdForceFieldHelpers.html?highlight=optimize#rdkit.Chem.rdForceFieldHelpers.UFFOptimizeMolecule>
or rdkit.Chem.rdForceFieldHelpers.MMFFOptimizeMolecule

<https://www.rdkit.org/docs/source/rdkit.Chem.rdForceFieldHelpers.html?highlight=optimize#rdkit.Chem.rdForceFieldHelpers.MMFFOptimizeMolecule>.

Cheers,
p.

On 10/08/19 14:41, Jorgen Simonsen wrote:

Hi all,

I am trying to build 3D structures from SMILES, which works fine for most
of the molecules. I get the SMILES from PubChem
('canonical_smiles' and 'isomeric_smiles'), but for some of the
molecules the hydrogens are not added correctly and are out of
plane, e.g. the amide group in ATP (see below for an example) or
arginine in a peptide.

I use the following code to generate the 3D structure:

from rdkit import Chem
from rdkit.Chem import AllChem
m1 = Chem.MolFromSmiles('C1=NC(=C2C(=N1)N(C=N2)C3C(C(C(O3)COP(=O)(O)OP(=O)(O)OP(=O)(O)O)O)O)N')

m2 = Chem.AddHs(m1)
AllChem.EmbedMolecule(m2)

w = Chem.SDWriter('foo.sdf')
w.write(m2)
w.close()

# or to mol file

print(Chem.MolToMolBlock(m2),file=open('foo.mol','w+'))

How can I ensure that the atom types are correct?

Thanks in advance

Best
Jorgen











___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
<mailto:Rdkit-discuss@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss





Re: [Rdkit-discuss] regarding hydrogens from SMILES

2019-10-08 Thread Paolo Tosco


Hi Jorgen,

optimizing your molecule geometry with UFF or MMFF should fix the problem:

AllChem.UFFOptimizeMolecule(m2)

or

AllChem.MMFFOptimizeMolecule(m2)

see rdkit.Chem.rdForceFieldHelpers.UFFOptimizeMolecule 
or rdkit.Chem.rdForceFieldHelpers.MMFFOptimizeMolecule.


Cheers,
p.

On 10/08/19 14:41, Jorgen Simonsen wrote:

Hi all,

I am trying to build 3D structures from SMILES, which works fine for most of 
the molecules. I get the SMILES from PubChem 
('canonical_smiles' and 'isomeric_smiles'), but for some of the molecules 
the hydrogens are not added correctly and are out of plane, e.g. the 
amide group in ATP (see below for an example) or arginine in a peptide.


I use the following code to generate the 3D structure:

from rdkit import Chem
from rdkit.Chem import AllChem
m1 = Chem.MolFromSmiles('C1=NC(=C2C(=N1)N(C=N2)C3C(C(C(O3)COP(=O)(O)OP(=O)(O)OP(=O)(O)O)O)O)N')


m2 = Chem.AddHs(m1)
AllChem.EmbedMolecule(m2)

w = Chem.SDWriter('foo.sdf')
w.write(m2)
w.close()

# or to mol file

print(Chem.MolToMolBlock(m2),file=open('foo.mol','w+'))

How can I ensure that the atom types are correct?

Thanks in advance

Best
Jorgen











___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



Re: [Rdkit-discuss] Saving chains from PDB file

2019-10-05 Thread Paolo Tosco

Hi Chris,

The following, though quite inefficient, will work:

from rdkit import Chem
mol = Chem.MolFromPDBFile("1CX2.pdb")
chains = {a.GetPDBResidueInfo().GetChainId() for a in mol.GetAtoms()}
# One editable copy of the whole structure per chain
chain_mols = {c: Chem.RWMol(mol) for c in chains}
for c, m in chain_mols.items():
    # Bonds with at least one atom outside chain c
    bonds_to_remove = [(b.GetBeginAtomIdx(), b.GetEndAtomIdx())
                       for b in m.GetBonds()
                       if b.GetBeginAtom().GetPDBResidueInfo().GetChainId() != c
                       or b.GetEndAtom().GetPDBResidueInfo().GetChainId() != c]
    # Atoms outside chain c
    atoms_to_remove = [a.GetIdx() for a in m.GetAtoms()
                       if a.GetPDBResidueInfo().GetChainId() != c]
    for b in bonds_to_remove:
        m.RemoveBond(*b)
    # Remove from the highest index down so remaining indices stay valid
    for a in sorted(atoms_to_remove, reverse=True):
        m.RemoveAtom(a)
    Chem.MolToPDBFile(m, "{0:s}.pdb".format(c))

Individual chains are saved to <chain ID>.pdb.

As chains will be separate fragments, a more efficient way would be to use 
rdmolops.GetMolFrags(asMols=True) which would avoid the bond/atom removal.
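
Another option, different from both approaches above, might be rdmolops.SplitMolByPDBChainId(), which as far as I know returns a dictionary mapping chain IDs to molecules. The sketch below is self-contained and uses a made-up two-water PDB block (the water_line helper is purely illustrative); with a real file you would start from Chem.MolFromPDBFile() instead:

```python
from rdkit import Chem


def water_line(serial, chain, x):
    """Illustrative helper: one fixed-column HETATM record for a water O."""
    return ("HETATM{:5d}  O   HOH {}{:4d}    {:8.3f}{:8.3f}{:8.3f}"
            "{:6.2f}{:6.2f}          {:>2s}").format(
                serial, chain, serial, x, 0.0, 0.0, 1.0, 0.0, "O")


# Made-up structure: one water in chain A, one in chain B
pdb_block = "\n".join(
    [water_line(1, "A", 0.0), water_line(2, "B", 5.0), "END"])
mol = Chem.MolFromPDBBlock(pdb_block)

# Dictionary keyed by chain ID, one molecule per chain
by_chain = Chem.SplitMolByPDBChainId(mol)
for chain_id, chain_mol in sorted(by_chain.items()):
    print(chain_id, chain_mol.GetNumAtoms())
    # To save each chain: Chem.MolToPDBFile(chain_mol, chain_id + ".pdb")
```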

Sorry for the poor formatting but this is what I could come up with IPython on 
the iPhone :-(

p.

> On 5 Oct 2019, at 12:46, Chris Swain via Rdkit-discuss 
>  wrote:
> 
> Hi,
> 
> I have a number of PDB files (foo.pdb.gz) and I want to separate each chain 
> in each file out into a separate file. So if a file contains 4 chains it will 
> generate 4 separate files.
> 
> Can I do this using RDKit, if so how?
> 
> Cheers
> 
> Chris
> 
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



Re: [Rdkit-discuss] Bug in Chem.SetDihedralDeg ?

2019-09-26 Thread Paolo Tosco

Dear Thomas,

That sounds very much like a bug. I’ll have a look and see what’s going wrong 
there; I’ll be in touch  soon.
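
For anyone who wants to reproduce the kind of scan described in the quoted message, here is a minimal sketch. It assumes the torsion is set through rdMolTransforms.SetDihedralDeg() on an embedded NCCO conformer (the attached script and SD files are not reproduced here), and it deliberately leaves out the problematic 0.0 value:

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolTransforms

mol = Chem.AddHs(Chem.MolFromSmiles("NCCO"))
AllChem.EmbedMolecule(mol, randomSeed=42)
conf = mol.GetConformer()

# Indices of the four heavy atoms defining the N-C-C-O torsion
n, c1, c2, o = mol.GetSubstructMatch(Chem.MolFromSmarts("NCCO"))

results = []
for angle in (60.0, 120.0, -90.0):
    rdMolTransforms.SetDihedralDeg(conf, n, c1, c2, o, angle)
    measured = rdMolTransforms.GetDihedralDeg(conf, n, c1, c2, o)
    results.append((angle, measured))
    print(angle, measured)
```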

Cheers,
p.

> On 26 Sep 2019, at 18:13,  
>  wrote:
> 
> Hi,
>  
> I came across a weird behavior of Chem.SetDihedralDeg:
>  
> In the attached python script I try to generate conformations by rotating 
> around the central bond of NCCO. When I call Chem.SetDihedralDeg with a 
> torsion value of 0, a very weird and distorted structure is produced, 
> however, when I set the torsion value any other value (even a value of 0.0001 
> works), the resulting conformations are ok.
>  
> I have attached my python script, the input sd file (input.sdf) and the 
> output sd file for torsion 0.0 (bug.sdf) and for the torsion 0.1 (ok.sdf)
>  
> Is this intended behavior and I miss sth obvious?
>  
> Best,
> Th.
>  
> Mit freundlichen Grüßen / Kind regards,
> Dr. Thomas Fox
> 
> Boehringer Ingelheim Pharma GmbH & Co. KG
> Medicinal Chemistry 
> Tel.: +49 (7351) 54-7585
> Fax: +49 (7351) 83-7585 
> mailto:thomas@boehringer-ingelheim.com
>  
> 
> 
> 
> 
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] problem when drawing a murcko scaffold

2019-08-28 Thread Paolo Tosco


Hi Jose Manuel,

the problem is just that the scaffold returned by 
MurckoScaffold.GetScaffoldForMol() has no explicit hydrogens on the imino N:


for atom in ms.GetAtoms():
    print(atom.GetIdx(), atom.GetAtomicNum(), atom.GetNumExplicitHs(),
  atom.GetNumImplicitHs(), atom.GetIsAromatic())

0 7 1 0 True
1 6 0 1 True
2 7 0 0 True
3 6 0 0 True
4 6 0 0 True
5 7 0 0 False <--
6 7 1 0 True
7 6 0 1 True
8 7 0 0 True
9 6 0 0 True

Therefore, after sanitizing, that nitrogen is set to be a radical:

ms_all.GetAtomWithIdx(5).GetNumRadicalElectrons()
1

and the Unicode bullet operator used to represent the radical cannot be 
encoded by the latin-1 codec, hence the UnicodeEncodeError.


If you do a

ms_all.GetAtomWithIdx(5).SetNumExplicitHs(1)

before sanitizing, your problem will disappear.
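
The effect can be sketched on a hypothetical stand-in, C[N]C, which like your scaffold has a bracketed nitrogen with no hydrogens (this is not the molecule from your gist):

```python
from rdkit import Chem

# Hypothetical stand-in: a bare [N] with two neighbours and no H
smiles = "C[N]C"

# Default parsing sanitizes, and the N ends up as a radical
as_parsed = Chem.MolFromSmiles(smiles)
radicals_before = as_parsed.GetAtomWithIdx(1).GetNumRadicalElectrons()

# Setting one explicit H before sanitizing avoids the radical
fixed = Chem.MolFromSmiles(smiles, sanitize=False)
fixed.GetAtomWithIdx(1).SetNumExplicitHs(1)
Chem.SanitizeMol(fixed)
radicals_after = fixed.GetAtomWithIdx(1).GetNumRadicalElectrons()

print(radicals_before, radicals_after)
```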

Cheers,
p.

On 28/08/2019 13:22, Jose-Manuel Gally wrote:


Dear all,

I noticed a strange behavior when extracting murcko scaffolds from 
preprocessed molecules with an inhouse standardization protocol.


I made a gist to illustrate the problem:

https://gist.github.com/jose-manuel/04d69dd3ac52cca74449e73d614df42e

This leaves me with several questions:

 1. When working with the standardized molecule, I get a drawing of
the murcko scaffold without Hs on the terminal nitrogen.
Why is that? I would expect either a radical (so with '.') or an
additional hydrogen. The smiles does not indicate the molecule is
a radical either.

 2. When sanitizing the molecule to update the smiles, I get a radical
by default, instead of a H bound to the nitrogen. Why is not a H
added instead? If I switch off the FINDRADICALS sanitization flag,
I do not get an extra hydrogen either...

 3. When I apply the default Sanitization to the murcko scaffold and
try to display it, I get an UnicodeEncodeError.
If I manually replace [N] by N in the smiles and create a new
molecule from it, I don't get an error anymore. Is there a
workaround? Interestingly, the function Draw.MolsToGridImage works
just fine but I could not find how to change the atom label size
and bond width.

Am I missing something obvious?

Many thanks in advance as any feedback would be much appreciated!

Cheers,
Jose Manuel

 


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] Setting custom coordinates for new atoms

2019-08-28 Thread Paolo Tosco

    # this is usually ~1.72 Å for a N in an aromatic six-membered ring
    length_unit_vector = rdGeometry.Point3D.Length(unit_vector)
    print("length:", length_unit_vector, "Å")
    # let's figure out how much the unit vector needs to be scaled
    scalar = 2.4 / length_unit_vector
    # let's scale the vector
    scaled_vector = [i * scalar for i in list(unit_vector)]
    # make a Point3D object. Easier to handle in RDKit
    scaled_vector_3d = rdGeometry.Point3D(scaled_vector[0],
                                          scaled_vector[1], scaled_vector[2])
    A1 = atom_3d_point + scaled_vector_3d # New atom coordinate
    A2 = atom_3d_point - scaled_vector_3d # New atom coordinate
    print("A1", list(A1), A1,"A2", list(A2), A2)
    A1_idx = mw.AddAtom(Chem.Atom(6))
    print("A1_ix", A1_idx)
    print(type(A1_idx))
    c = conf.SetAtomPosition(A1_idx, A1)
    A2_idx = mw.AddAtom(Chem.Atom(6))
    print("A2_idx", A2_idx)


prot = Chem.MolFromPDBFile("./3etr.pdb")
protconf = prot.GetConformer()

block_apolar_N_atoms(prot, protconf)



I get the following error:


RuntimeError: Pre-condition Violation

 Violation occurred on line 51 in file Code/GraphMol/Conformer.cpp
 Failed Expression: dp_mol->getNumAtoms() == d_positions.size()
 RDKIT: 2019.09.1dev1
 BOOST: 1_67

Hoping to hear from you soon!


Illimar Rekand
Ph.D. candidate,
Brenk-lab, Haug-lab
Department of Biomedicine
Department of Chemistry
University of Bergen



From: Paolo Tosco 
Sent: Thursday, August 22, 2019 11:47:19 AM
To: Illimar Hugo Rekand; rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Setting custom coordinates for new atoms

Hi Illimar,

AddAtom() will return the index i of the added atom, then you can call
SetAtomPosition on that index on the molecule conformer and pass a
Point3D with the desired coordinates:

conf.SetAtomPosition(i, Point3D(x, y, z))

Cheers,
p.

On 08/22/19 09:24, Illimar Hugo Rekand wrote:

Hello, everyone


I'm wondering whether there is a way to set custom coordinates to an atom in a 
conformer?

In particular I'm interested in using the AddAtom() function in the RWMol class 
to place a new dummy atom in a PDB-file.


Illimar Rekand
Ph.D. candidate,
Brenk-lab, Haug-lab
Department of Biomedicine
Department of Chemistry
University of Bergen



___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss




Re: [Rdkit-discuss] Setting custom coordinates for new atoms

2019-08-22 Thread Paolo Tosco


Hi Illimar,

AddAtom() will return the index i of the added atom, then you can call 
SetAtomPosition on that index on the molecule conformer and pass a 
Point3D with the desired coordinates:


conf.SetAtomPosition(i, Point3D(x, y, z))
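
A minimal sketch on a small embedded molecule instead of a PDB file; methanol and the coordinates are arbitrary. One assumption worth stating: the conformer is fetched from the same RWMol the atom was added to, after the AddAtom() call, so that the conformer and the atom list stay in sync:

```python
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Geometry import Point3D

# Arbitrary small molecule with 3D coordinates
mol = Chem.AddHs(Chem.MolFromSmiles("CO"))
AllChem.EmbedMolecule(mol, randomSeed=42)

rwmol = Chem.RWMol(mol)
new_idx = rwmol.AddAtom(Chem.Atom(0))   # atomic number 0 = dummy atom

# Fetch the conformer from the edited molecule, then place the new atom
conf = rwmol.GetConformer()
conf.SetAtomPosition(new_idx, Point3D(1.0, 2.0, 3.0))

pos = rwmol.GetConformer().GetAtomPosition(new_idx)
print(new_idx, pos.x, pos.y, pos.z)
```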

Cheers,
p.

On 08/22/19 09:24, Illimar Hugo Rekand wrote:

Hello, everyone


I'm wondering whether there is a way to set custom coordinates to an atom in a 
conformer?

In particular I'm interested in using the AddAtom() function in the RWMol class 
to place a new dummy atom in a PDB-file.


Illimar Rekand
Ph.D. candidate,
Brenk-lab, Haug-lab
Department of Biomedicine
Department of Chemistry
University of Bergen



___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss





Re: [Rdkit-discuss] Get distances between two atoms in different molecules

2019-08-16 Thread Paolo Tosco

Hi Illimar,

You may use Point3D.Distance:

conf_a.GetAtomPosition(i).Distance(conf_b.GetAtomPosition(j))
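
A self-contained sketch with two single-atom molecules whose coordinates are set by hand (single_atom_mol is just an illustrative helper). The distance is only meaningful if both molecules share the same coordinate frame, e.g. a ligand and a protein from the same complex:

```python
from rdkit import Chem
from rdkit.Geometry import Point3D


def single_atom_mol(smiles, xyz):
    """Illustrative helper: one-atom molecule with a hand-set position."""
    mol = Chem.MolFromSmiles(smiles)
    conf = Chem.Conformer(mol.GetNumAtoms())
    conf.SetAtomPosition(0, Point3D(*xyz))
    mol.AddConformer(conf)
    return mol


mol_a = single_atom_mol("O", (0.0, 0.0, 0.0))
mol_b = single_atom_mol("N", (3.0, 4.0, 0.0))

conf_a = mol_a.GetConformer()
conf_b = mol_b.GetConformer()

# No combined molecule is needed: just compare the two Point3D objects
distance = conf_a.GetAtomPosition(0).Distance(conf_b.GetAtomPosition(0))
print(distance)
```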

Cheers,
p.

> On 16 Aug 2019, at 16:01, Illimar Hugo Rekand  wrote:
> 
> Hello everyone,
> 
> 
> I see that GetBondDistance() is a useful tool for calculating the distances 
> between atoms, whether bonded or nonbonded. However, I am unsure on how to 
> implement this function to calculate distances between atoms in different 
> molecule, as the functions requires a conformer as an argument.
> 
> 
> Would I need to generate a conformer containing both molecules in order to 
> use this function for calculating the intermolecular atom distances?
> 
> 
> Illimar Rekand
> Ph.D. candidate,
> Brenk-lab, Haug-lab
> Department of Biomedicine
> Department of Chemistry
> University of Bergen
> 
> 
> 
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



Re: [Rdkit-discuss] GetSubstructMatches() as smiles

2019-08-07 Thread Paolo Tosco

Hi Mel,

You can use

Chem.MolFragmentToSmiles(mol, match)

where match is a tuple of atom indices returned by GetSubstructMatch().
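
The same works for each tuple returned by GetSubstructMatches(). A quick sketch, with aspirin and the C(=O)O query as arbitrary examples:

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")   # aspirin
query = Chem.MolFromSmarts("C(=O)O")

fragments = []
for match in mol.GetSubstructMatches(query):
    # SMILES of only the matched atoms and the bonds between them
    frag_smiles = Chem.MolFragmentToSmiles(mol, match)
    fragments.append(frag_smiles)
    print(match, frag_smiles)
```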

Cheers,
p.

> On 7 Aug 2019, at 11:36, Melissa Adasme  wrote:
> 
> Dear rdkitters,
> 
> I'm trying to find substructures (query molecules built from SMARTS) matching 
> my molecules (SMILES). I found the GetSubstructMatches() method which works 
> pretty well returning the indices of matching atoms in my molecule. 
> 
> I wonder if there is a way to directly obtain the SMILES of the found 
> substructures instead of the atom indexes or maybe a way to transform the 
> indexes to smiles?
> 
> Many thanks in advance!
> Mel
> 

Re: [Rdkit-discuss] Protonation site

2019-08-05 Thread Paolo Tosco


Hi Nitzan,

you should try to modify the molecule conformation as little as possible 
between the two calculations; see the example below.


from rdkit import Chem
from rdkit.Chem import AllChem

# Ritalin (methylphenidate) protonated on the piperidine nitrogen
test_ion1 = Chem.MolFromSmiles('COC(=O)C(c1ccccc1)C1CCCC[NH2+]1')
test_ion1 = Chem.AddHs(test_ion1)
AllChem.EmbedMolecule(test_ion1, randomSeed=2)
prop1 = AllChem.MMFFGetMoleculeProperties(test_ion1,
                                          mmffVariant="MMFF94")
ff1 = AllChem.MMFFGetMoleculeForceField(test_ion1, prop1)
ff1.Minimize(maxIts=1000)
print(ff1.CalcEnergy())

50.90705167464979

# mn holds the indices of the protonated N followed by its two hydrogens
mn = test_ion1.GetSubstructMatch(Chem.MolFromSmarts("[NH2+]([H])[H]"))
n = mn[0]
na = test_ion1.GetAtomWithIdx(n)
# mo holds the indices of the ester group, carbonyl oxygen first
mo = test_ion1.GetSubstructMatch(Chem.MolFromSmarts("O=CO"))
o = mo[0]
# Remove each N-H in turn and protonate the carbonyl oxygen instead,
# keeping the rest of the geometry untouched
for h in mn[1:]:
    test_ion2 = Chem.RWMol(test_ion1)
    na.SetFormalCharge(0)
    test_ion2.RemoveBond(n, h)
    test_ion2.RemoveAtom(h)
    oa = test_ion2.GetAtomWithIdx(o)
    oa.SetFormalCharge(1)
    oa.SetNoImplicit(True)
    oa.SetNumExplicitHs(1)
    Chem.SanitizeMol(test_ion2)
    test_ion2 = Chem.AddHs(test_ion2, addCoords=True)
    prop2 = AllChem.MMFFGetMoleculeProperties(test_ion2,
                                              mmffVariant="MMFF94")
    ff2 = AllChem.MMFFGetMoleculeForceField(test_ion2, prop2)
    print(ff2.CalcEnergy())

53.20633094793681
58.41577310354003


In this example I built two conformers with axial and equatorial N-H, 
and both their MMFF94 energies are larger than that of the NH2+ tautomer, 
as expected.
To make the calculation more accurate one should actually take into 
consideration multiple conformers and orientations of the mobile 
protons. Also, molecular mechanics is most likely not accurate enough 
for this kind of exercise - you should look into using a QM method to 
get more accurate figures.
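
To put the three numbers above into perspective, here is a minimal sketch that converts them into relative energies and Boltzmann weights. It assumes the MMFF94 energies are in kcal/mol and that comparing them directly across protonation states is meaningful, which, as noted, is only a crude approximation; the function name boltzmann_populations and the temperature of 298.15 K are my own choices.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)

def boltzmann_populations(energies, temperature=298.15):
    """Return normalized Boltzmann weights for a list of energies (kcal/mol)."""
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / (R_KCAL * temperature)) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]

# Energies printed in the example above: N-protonated form first,
# then the two O-protonated forms
energies = [50.90705167464979, 53.20633094793681, 58.41577310354003]

relative = [e - min(energies) for e in energies]
populations = boltzmann_populations(energies)

for rel, pop in zip(relative, populations):
    print(f"dE = {rel:5.2f} kcal/mol   population = {pop:.6f}")
```

With these inputs the N-protonated form dominates; a proper treatment would average over many conformers per species and use QM energies.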


Cheers,
p.

On 08/05/19 19:26, Nitzan Tzanani wrote:

Dear rdkitters,

I am trying to use RDKit descriptors to investigate the protonation of 
molecules in the gas phase.
While testing my procedure on Ritalin, I tried to calculate the 
protonation energy on different functional groups in the molecule.
Although my understanding tells me protonation should occur on the 
secondary amine, the energy calculations point to the keto group as the 
favorable site.

Below is my code.
Is it wrong to use this approach? Is it a problem of using the wrong 
parameters?

Any help would be greatly appreciated.

Yours

Nitzan

test_ion1 = Chem.MolFromSmiles('COC(=O)C(c1ccccc1)C1CCCC[NH2+]1')
test_ion1 = Chem.AddHs(test_ion1)
AllChem.EmbedMolecule(test_ion1, randomSeed=2)
AllChem.MMFFOptimizeMolecule(test_ion1, maxIters=1000, mmffVariant="MMFF94")
Prop1 = AllChem.MMFFGetMoleculeProperties(test_ion1, mmffVariant="MMFF94")
ff1 = AllChem.MMFFGetMoleculeForceField(test_ion1, Prop1)
print(ff1.CalcEnergy())

test_ion2 = Chem.MolFromSmiles('COC(=[OH+])C(c1ccccc1)C1CCCCN1')
test_ion2 = Chem.AddHs(test_ion2)
AllChem.EmbedMolecule(test_ion2, randomSeed=2)
AllChem.MMFFOptimizeMolecule(test_ion2, maxIters=1000, mmffVariant="MMFF94")
Prop2 = AllChem.MMFFGetMoleculeProperties(test_ion2, mmffVariant="MMFF94")
ff2 = AllChem.MMFFGetMoleculeForceField(test_ion2, Prop2)
print(ff2.CalcEnergy())

 

Re: [Rdkit-discuss] How to make non-planar rings have a consistent conformation?

2019-08-05 Thread Paolo Tosco


Dear Masgils,

you might look into using

AllChem.ConstrainedEmbed()

and supplying pre-generated 3D coordinates for your 6-membered chair (or 
boat) ring conformations; see the example below where I constrain the 
piperidine ring in bilastine to have a chair conformation.


You might need further filtering/constraints if you wish to remove axial 
conformations.


Cheers,
p.

from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem.Draw import IPythonConsole
import py3Dmol

# Reference chair conformation of cyclohexane as a V2000 mol block
cyclohexane_chair = """\
Untitled
     RDKit          3D

  6  6  0  0  0  0  0  0  0  0999 V2000
   -2.3879    0.9357    0.3335 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.8567    1.6374   -0.9329 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3152    1.5949   -0.9570 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.1794    0.1349   -0.9091 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3566   -0.5682    0.3547 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.8980   -0.5257    0.3727 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0
  2  3  1  0
  3  4  1  0
  4  5  1  0
  5  6  1  0
  6  1  1  0
M  END
"""

bilastine_smi = "CCOCCn1c(C2CC[NH+](CCc3ccc(C(C)(C)C(=O)[O-])cc3)CC2)nc2ccccc21"

chair_core = Chem.MolFromMolBlock(cyclohexane_chair)

# Make the core atoms generic so that the carbon-only chair
# also matches the piperidine ring
params = AllChem.AdjustQueryParameters()
params.makeAtomsGeneric = True
chair_core_generic = AllChem.AdjustQueryProperties(chair_core, params)

chair_core_generic

bilastine = Chem.AddHs(Chem.MolFromSmiles(bilastine_smi))

bilastine_3d = AllChem.ConstrainedEmbed(bilastine, chair_core_generic,
                                        useTethers=True)


On 08/05/19 02:34, Masgils wrote:

Dear RDKitters,

I want all the nonplanar rings in a group of compounds to have the 
same conformation. The goal is to make the rings perfectly aligned.


In RDKit, the boat and chair conformations generated by 
AllChem.EmbedMolecule() will have different bond lengths and angles 
depending on the atoms in the ring.
Is it possible to unify all the rings in the chair conformation into 
one conformation, and to do the same for the boat conformation?


Or can we force all the rings to be flat?

Best Regards,
M.







Re: [Rdkit-discuss] How to get substructure matches with different atoms?

2019-07-25 Thread Paolo Tosco


Dear Masgils,

you may try something along these lines, i.e. make atoms and/or bonds 
generic on one of the molecules with rdmolops.AdjustQueryProperties() in 
order to get substructures to match, and then use rdMolAlign.GetBestRMS():


from rdkit import Chem
from rdkit.Chem import AllChem

piperidine = Chem.AddHs(Chem.MolFromSmiles("C1CC(C)CCN1"))
AllChem.EmbedMolecule(piperidine)
AllChem.MMFFOptimizeMolecule(piperidine)
piperidine_noh = Chem.RemoveHs(piperidine)

piperazine = Chem.AddHs(Chem.MolFromSmiles("C1CN(C)CCN1"))
AllChem.EmbedMolecule(piperazine)
AllChem.MMFFOptimizeMolecule(piperazine)
piperazine_noh = Chem.RemoveHs(piperazine)

piperidine_noh

piperazine_noh

# No match: the piperazine query has an N where piperidine has a C
piperidine_noh.GetSubstructMatches(piperazine_noh)

()

# Turn the piperazine into a query with generic atoms and bonds
params = AllChem.AdjustQueryParameters()
params.makeAtomsGeneric = True
params.makeBondsGeneric = True
piperazine_noh_generic = AllChem.AdjustQueryProperties(
    piperazine_noh, params)

piperazine_noh_generic

piperidine_noh.GetSubstructMatches(piperazine_noh_generic)

((0, 1, 2, 3, 4, 5, 6),)

AllChem.GetBestRMS(piperazine_noh_generic, piperidine_noh)

0.39432427325884206


Hope this helps, cheers
p.

On 07/25/19 16:53, Masgils wrote:

Hi, all

Is it possible to use GetSubstructMatches() to match a substructure that differs 
from ref_mol by one or two atoms (e.g. a piperidine ring and a piperazine ring)?

And how can I get the RMSD between corresponding atoms of the two substructures?






Re: [Rdkit-discuss] Problem with Installation Conda Miniconda Azure Linux

2019-07-09 Thread Paolo Tosco

nda_cos6_linux_gnu 


    CONDA_DEFAULT_ENV=dev_env
    CONDA_EXE=/home/User/programs/Miniconda3/bin/conda
    CONDA_PREFIX=/home/User/programs/Miniconda3/envs/dev_env
    CONDA_PREFIX_1=/home/User/programs/Miniconda3
    CONDA_PROMPT_MODIFIER=
    CONDA_PYTHON_EXE=/home/User/programs/Miniconda3/bin/python
    CONDA_ROOT=/home/User/programs/Miniconda3
    CONDA_SHLVL=2
    PATH=/home/User/programs/Miniconda3/bin:/home/User/programs/Miniconda3/envs/dev_env/bin:/home/User/programs/Miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
    REQUESTS_CA_BUNDLE=
    SSL_CERT_FILE=
    ftp_proxy=
    http_proxy=
    https_proxy=

    active environment : dev_env
    active env location : /home/User/programs/Miniconda3/envs/dev_env
    shell level : 2
    user config file : /home/User/.condarc
    populated config files : /home/User/.condarc
    conda version : 4.7.5
    conda-build version : not installed
    python version : 3.7.3.final.0
    virtual packages :
    base environment : /home/User/programs/Miniconda3 (writable)
    channel URLs : https://conda.anaconda.org/anaconda/linux-64
                   https://conda.anaconda.org/anaconda/noarch
                   https://repo.anaconda.com/pkgs/main/linux-64
                   https://repo.anaconda.com/pkgs/main/noarch
                   https://repo.anaconda.com/pkgs/r/linux-64
                   https://repo.anaconda.com/pkgs/r/noarch
    package cache : /home/User/programs/Miniconda3/pkgs
                    /home/User/.conda/pkgs
    envs directories : /home/User/programs/Miniconda3/envs
                       /home/User/.conda/envs
    platform : linux-64
    user-agent : conda/4.7.5 requests/2.21.0 CPython/3.7.3 Linux/4.18.0-1024-azure ubuntu/18.04.2 glibc/2.27
    UID:GID : 1002:1003
    netrc file : None
    offline mode : False
"
(I replaced my Username with "User".)

Best,
Jasmin

On 09.07.2019 at 13:30, Paolo Tosco wrote:

Hi Jasmin,

after you have activated your environment it should be sufficient to
run the command

conda install boost

and then build the RDKit as you were doing previously. cmake should
now pick the conda Boost instead of system Boost.

Cheers,
p.









Re: [Rdkit-discuss] Problem with Installation Conda Miniconda Azure Linux

2019-07-09 Thread Paolo Tosco


Hi Jasmin,

after you have activated your environment it should be sufficient to run 
the command


conda install boost

and then build the RDKit as you were doing previously. cmake should now 
pick the conda Boost instead of system Boost.


Cheers,
p.


On 07/09/19 12:14, Jasmin wrote:

Hi Paolo,

yes, you're right. This is probably a stupid question, but how do I 
build within the anaconda environment?
Given that 'my_env' is an environment I created with "conda create --name 
my_env", I thought I could just run "conda activate my_env", but this 
does not seem to change anything.


Best,
Jasmin

Re: [Rdkit-discuss] Problem with Installation Conda Miniconda Azure Linux

2019-07-09 Thread Paolo Tosco


Hi Jasmin,

as it looks like you are attempting to build RDKit in a conda 
environment, you should be building against the Boost version supplied 
by conda, while this line


Boost include path: /usr/include

shows that cmake has picked up the system-installed Boost from 
libboost-all-dev. This Boost is most likely built against the system 
Python, which is probably different from, and incompatible with, the 
python3.6 supplied by conda.
If you do a conda install boost in the conda environment where you are 
building the RDKit you should see that cmake picks up the conda Boost, 
and the problem should then disappear.


Cheers,
p.

On 07/09/19 08:37, Jasmin wrote:

Hello!

My Problem still persists.

I am trying to build from source right now. Now I am having problems 
with the boost-libraries.

The output says:
"
CMake Warning at /usr/share/cmake-3.10/Modules/FindBoost.cmake:1626 
(message):

  No header defined for python3; skipping header check
"
and later:
"
CMake Error at /usr/share/cmake-3.10/Modules/FindBoost.cmake:1947 
(message):

  Unable to find the requested Boost libraries.

  Boost version: 1.62.0

  Boost include path: /usr/include

  Could not find the following Boost libraries:

  boost_python

  No Boost libraries were found.  You may need to set BOOST_LIBRARYDIR 
to the

  directory containing Boost libraries or BOOST_ROOT to the location of
  Boost.
"
and the same thing for boost_system.
I installed libboost-all-dev. I think I found the boost-libraries in 
/usr/lib/x86_64-linux-gnu/ . Does

"
cmake .. -DPy_ENABLE_SHARED=1 -DRDK_INSTALL_INTREE=ON 
-DRDK_INSTALL_STATIC_LIBS=OFF -DRDK_BUILD_CPP_TESTS=ON 
-DPYTHON_NUMPY_INCLUDE_PATH="$CONDA_PREFIX/lib/python3.6/site-packages/numpy/core/include" 
-DBOOST_ROOT="$CONDA_PREFIX"

"
assume the libraries are in $CONDA_PREFIX? Do I need to move or copy 
something here?


Thank you for your time. Any help is appreciated!
Best,
Jasmin


On 03.07.2019 at 10:38, Jasmin wrote:

Hi Greg,

no, I just replaced my real username.

Best,
Jasmin

On 03.07.2019 at 08:55, Greg Landrum wrote:

Hi Jasmin,

I would definitely expect the conda installation to work. The paths
that your output shows are a bit strange: it looks like your username
on the Azure machine is "". If that is the case, it could be
that this is causing a problem (I've not seen user names with "<>"
characters before).

Could you please execute the following 4 commands and send the entire
output that you get?

cd
pwd
which conda
conda install -c rdkit rdkit

Best,

-greg

On Wed, Jul 3, 2019 at 8:19 AM Jasmin  wrote:


Hello!

I am experiencing problems installing rdkit with conda in a
Miniconda environment on an Azure Ubuntu machine using "conda install -c
rdkit rdkit".
The error message I'm getting is:
'''
environment variables:
CIO_TEST=
CONDA_DEFAULT_ENV=base
CONDA_EXE=/home//programs/Miniconda3/bin/conda
CONDA_PREFIX=/home//programs/Miniconda3
CONDA_PROMPT_MODIFIER=
CONDA_PYTHON_EXE=/home//programs/Miniconda3/bin/python
CONDA_ROOT=/home//programs/Miniconda3
CONDA_SHLVL=1
PATH=/home//programs/Miniconda3/bin:/home//programs/Miniconda3/bin:/home//programs/Miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
REQUESTS_CA_BUNDLE=
SSL_CERT_FILE=
ftp_proxy=
http_proxy=
https_proxy=

active environment : base
active env location : /home//programs/Miniconda3
shell level : 1
user config file : /home//.condarc
populated config files : /home//.condarc
conda version : 4.7.5
conda-build version : not installed
python version : 3.7.3.final.0
virtual packages :
base environment : /home//programs/Miniconda3 (writable)
channel URLs : https://conda.anaconda.org/conda-forge/linux-64
               https://conda.anaconda.org/conda-forge/noarch
               https://repo.anaconda.com/pkgs/main/linux-64
               https://repo.anaconda.com/pkgs/main/noarch
               https://repo.anaconda.com/pkgs/r/linux-64
               https://repo.anaconda.com/pkgs/r/noarch
package cache : /home//programs/Miniconda3/pkgs
                /home//.conda/pkgs
envs directories : /home//programs/Miniconda3/envs
                   /home//.conda/envs
platform : linux-64
user-agent : conda/4.7.5 requests/2.21.0 CPython/3.7.3 Linux/4.18.0-1018-azure ubuntu/18.04.2 glibc/2.27
UID:GID : 1002:1003
netrc file : None
offline mode : False

An unexpected error has occurred. Conda has prepared the above report.
'''

I also tried to install from the repositories: sudo apt-get install
python-rdkit librdkit1 rdkit-data
There is no error message, but trying to import rdkit in a python3
console gives me a "module not found" error.

The third thing I tried was to build from source in conda
(https://www.rdkit.org/docs/Install.html). But when running:
cmake .. -DPy_ENABLE_SHARED=1
-DRDK_INSTALL_INTREE=ON
-DRDK_INSTALL_STATIC_LIBS=OFF
-DRDK_BUILD_CPP_TESTS=ON
-DPYTHON_NUMPY_INCLUDE_PATH="$CONDA_PREFIX/lib/python3.7/site-packages/numpy/core/include"
-DBOOST_ROOT="$CONDA_PREFIX"
I'm getting:
'''
-- Catch not found in

Re: [Rdkit-discuss] How to calculate Lennard-Jones potential by using RDkit?

2019-07-08 Thread Paolo Tosco


Hi,

a similar question has come up recently on the RDKit mailing list:

https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg08995.html

the GetUFFVdWParams() and GetMMFFVdWParams() functions will return the 
vdW parameters for UFF and MMFF94 force fields, respectively. For UFF 
the function indeed returns sigma and epsilon van der Waals parameters 
for atoms with indexes idx1, idx2 as a (x_ij, D_ij) tuple, as per the 
vdW functional form in UFF:




However, for MMFF94, as explained in the link reported above, the 
parameters returned by
GetMMFFVdWParams() are not quite sigma and epsilon as the vdW term uses 
a different 14-7 term with combination rules which depend on the 
chemical nature of the two particles.
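
For the UFF case, a small sketch of how the (x_ij, D_ij) pair relates to the textbook sigma/epsilon form may help. It assumes the usual UFF 12-6 expression E = D_ij * [(x_ij/r)^12 - 2 (x_ij/r)^6], in which x_ij is the position of the energy minimum and D_ij the well depth; under that assumption epsilon = D_ij and sigma = x_ij / 2^(1/6). The function names and the numeric values are illustrative only, not output from RDKit.

```python
def lj_rmin_form(r, x_ij, d_ij):
    """12-6 energy written in terms of the minimum position x_ij and well depth d_ij."""
    ratio6 = (x_ij / r) ** 6
    return d_ij * (ratio6 * ratio6 - 2.0 * ratio6)

def lj_sigma_form(r, sigma, epsilon):
    """12-6 energy in the textbook form 4*eps*[(sigma/r)^12 - (sigma/r)^6]."""
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 * ratio6 - ratio6)

def sigma_from_rmin(x_ij):
    """Zero-crossing distance sigma corresponding to a minimum at x_ij."""
    return x_ij / 2.0 ** (1.0 / 6.0)

# Illustrative values (Angstrom, kcal/mol), standing in for a (x_ij, D_ij) tuple
x_ij, d_ij = 3.851, 0.105
sigma, epsilon = sigma_from_rmin(x_ij), d_ij

for r in (3.0, sigma, x_ij, 5.0):
    print(r, lj_rmin_form(r, x_ij, d_ij), lj_sigma_form(r, sigma, epsilon))
```

Both functions describe the same curve, so the two columns printed for each distance agree up to rounding.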


Cheers,
Paolo

On 07/08/19 07:25, Masgils wrote:

Hi, All

I want to calculate the Lennard-Jones potential between a small-molecule 
probe and an atom. How do I get the sigma and epsilon parameters 
using RDKit?


V(r) = 4\varepsilon \left[ \left( \frac{\sigma}{r} \right)^{12} - \left( \frac{\sigma}{r} \right)^{6} \right]


Best Regards,
M.







Re: [Rdkit-discuss] Error from qed

2019-07-04 Thread Paolo Tosco


Hi Navid,

based on the adjacency matrix that you reported the first molecule has 
the following topology:


i.e., it features an oxygen atom with a valence of 3, plus a disconnected 
oxygen (the last one in the node list). That molecule makes little chemical 
sense to me - it looks like some sort of slightly broken sydnonimine. 
However, unless you set a +1 formal charge on the oxygen with index 6 in 
your node list, the RDKit will fail to sanitize that molecule.


The second compound is nitromethane followed by a bunch of disconnected 
atoms:


Again, as that nitrogen has a valence of 4, RDKit will fail to sanitize 
the molecule unless you set a formal charge of +1 on that nitrogen (and 
most likely a formal charge of -1 on the single-bonded oxygen). However, 
all those disconnected atoms (C, N, H) in the rest of your node seem 
suspicious to me - I wonder if something has gone wrong somewhere in 
your workflow.
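
A quick way to spot both kinds of problem before building the molecule is to sum each row of the adjacency matrix. The sketch below is plain Python and does not use the RDKit; the table of maximum valences is a simplified assumption for neutral H, C, N and O only, and the function name diagnose is made up for this example.

```python
# Simplified maximum valences for neutral atoms (an assumption, not the
# RDKit valence model)
MAX_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2}

def diagnose(node, adj_mat):
    """Return (over_valent, isolated) for a node list and bond-order matrix.

    over_valent is a list of (index, symbol, valence) tuples; isolated is a
    list of indices of atoms without any bond.
    """
    over_valent, isolated = [], []
    for idx, (symbol, row) in enumerate(zip(node, adj_mat)):
        valence = sum(row)
        if valence == 0:
            isolated.append(idx)
        elif valence > MAX_VALENCE[symbol]:
            over_valent.append((idx, symbol, valence))
    return over_valent, isolated

# First molecule from the original question
node1 = ['C', 'C', 'C', 'C', 'C', 'N', 'O', 'N', 'O']
adj1 = [[0, 1, 0, 0, 0, 0, 0, 0, 0],
        [1, 0, 2, 1, 0, 0, 0, 0, 0],
        [0, 2, 0, 0, 0, 0, 0, 0, 0],
        [0, 1, 0, 0, 1, 0, 0, 0, 0],
        [0, 0, 0, 1, 0, 1, 2, 0, 0],
        [0, 0, 0, 0, 1, 0, 0, 0, 0],
        [0, 0, 0, 0, 2, 0, 0, 1, 0],
        [0, 0, 0, 0, 0, 0, 1, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0]]

# Second molecule from the original question
node2 = ['O', 'N', 'O', 'C', 'C', 'C', 'C', 'N', 'H']
adj2 = [[0, 2, 0, 0, 0, 0, 0, 0, 0],
        [2, 0, 1, 1, 0, 0, 0, 0, 0],
        [0, 1, 0, 0, 0, 0, 0, 0, 0],
        [0, 1, 0, 0, 0, 0, 0, 0, 0]] + [[0] * 9 for _ in range(5)]

print(diagnose(node1, adj1))
print(diagnose(node2, adj2))
```

Atoms flagged as over-valent are candidates for a formal charge, while isolated atoms usually point to padding or a parsing problem upstream.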


I hope the above helps, cheers
p.

On 03/07/2019 22:10, Navid Shervani-Tabar wrote:

Hello,

I was trying to compute the quantitative estimate of drug-likeness (QED) 
for the qm9 dataset. The qm9 dataset provides a graph representation of each 
molecule; these are translated into mol objects and then QED is 
estimated.


```
from rdkit.Chem.QED import qed
.
.
.
mol = Graphs2Mol(node, adj_mat)
my_qed = qed(mol)
```

I was trying to find qed for molecule with node and adjacency matrix

```
node = ['C', 'C', 'C', 'C', 'C', 'N', 'O', 'N', 'O']

adj_mat =
[[0 1 0 0 0 0 0 0 0]
 [1 0 2 1 0 0 0 0 0]
 [0 2 0 0 0 0 0 0 0]
 [0 1 0 0 1 0 0 0 0]
 [0 0 0 1 0 1 2 0 0]
 [0 0 0 0 1 0 0 0 0]
 [0 0 0 0 2 0 0 1 0]
 [0 0 0 0 0 0 1 0 0]
 [0 0 0 0 0 0 0 0 0]]
```

and I received the following error

`ValueError: Sanitization error: Explicit valence for atom # 6 O, 3, is greater than permitted`

The same happened with
```
node = ['O', 'N', 'O', 'C', 'C', 'C', 'C', 'N', 'H']

adj_mat =
[[0 2 0 0 0 0 0 0 0]
 [2 0 1 1 0 0 0 0 0]
 [0 1 0 0 0 0 0 0 0]
 [0 1 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0]]
```

and I received

`ValueError: Sanitization error: Explicit valence for atom # 1 N, 4, is greater than permitted`

Any suggestions on what might be incorrect? Thanks.

Navid



Re: [Rdkit-discuss] What is returned by GetMMFFVdWParams?

2019-07-03 Thread Paolo Tosco


Hi Lewis,

The reason why it takes two particle indices as input is that MMFF94 
uses rather complex combination rules for the van der Waals term.


In particular, E_vdW between two particles i and j is given by the 
following equation:


where R_ij is the distance between particles i and j, while 
R*_ij and ε_ij are constants computed through these equations:


B is 0 if one of the particles is a HBD, otherwise it is 0.2, while 
the expression for γ is:


This is the reason why you need to pass two particles to get the values 
of R*_ij and ε_ij: you need to know if either particle is a 
HBD to compute R*_ij, and ε_ij depends on R*_ij.


Finally, if i-j is a HBD-HBA interaction, R*_ij is further scaled 
by the factor DARAD = 0.8, while ε_ij (which is computed using the 
unscaled R*_ij) is scaled by the factor DAEPS = 0.5.
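
The following sketch puts the pieces described above together in plain Python. It assumes the standard MMFF94 buffered 14-7 form, E = ε_ij * (1.07 R*_ij / (R_ij + 0.07 R*_ij))^7 * (1.12 R*_ij^7 / (R_ij^7 + 0.12 R*_ij^7) - 2), the arithmetic-mean combination rule R*_ij = 0.5 (R*_ii + R*_jj) (1 + B (1 - exp(-β γ_ij^2))) with γ_ij = (R*_ii - R*_jj) / (R*_ii + R*_jj), and β = 12; these expressions are quoted from memory of the MMFF94 paper rather than from this message, so please check them against the reference. The function names are made up, and ε_ij is taken as an input rather than computed.

```python
import math

B = 0.2       # set to 0 when either particle is an H-bond donor
BETA = 12.0   # assumed value of the exponent constant
DARAD = 0.8   # scaling of R*_ij for donor-acceptor pairs
DAEPS = 0.5   # scaling of eps_ij for donor-acceptor pairs

def combine_r_star(r_star_ii, r_star_jj, either_is_donor=False):
    """Unscaled R*_ij from the per-atom values R*_ii and R*_jj."""
    gamma = (r_star_ii - r_star_jj) / (r_star_ii + r_star_jj)
    b = 0.0 if either_is_donor else B
    return 0.5 * (r_star_ii + r_star_jj) * (1.0 + b * (1.0 - math.exp(-BETA * gamma ** 2)))

def scale_donor_acceptor(r_star_ij, eps_ij):
    """Apply the DARAD/DAEPS scaling used for donor-acceptor pairs."""
    return DARAD * r_star_ij, DAEPS * eps_ij

def buffered_14_7(r_ij, r_star_ij, eps_ij):
    """Buffered 14-7 van der Waals energy at distance r_ij."""
    a_term = 1.07 * r_star_ij / (r_ij + 0.07 * r_star_ij)
    b_term = 1.12 * r_star_ij ** 7 / (r_ij ** 7 + 0.12 * r_star_ij ** 7) - 2.0
    return eps_ij * a_term ** 7 * b_term

# Values reported by GetMMFFVdWParams(0, 1) in the question below
r_star, eps = 3.9377389919289634, 0.06779699304291371
print(buffered_14_7(r_star, r_star, eps))  # energy at the minimum, close to -eps
print(buffered_14_7(5.0, r_star, eps))     # weaker attraction at larger distance
```

In the question's example the scaled and unscaled values are identical, which is what one would expect for a pair that is not a donor-acceptor interaction.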


More information is available on the MMFF94 paper by Thomas A. Halgren 
which describes the non-bonded terms:


Merck molecular force field. II. MMFF94 van der Waals and electrostatic 
parameters for intermolecular interactions 



Cheers,
p.

On 03/07/2019 03:24, Lewis Martin wrote:

Hi all,
Can anyone please help explain what values are returned 
by GetMMFFVdWParams? It takes two indices as input, so is it an 
interaction term between the two? Or is it the well depth and minimum 
(i.e. epsilon and R)?


Example:
In:
m = Chem.MolFromSmiles('C1CCC1OC')
m2=Chem.AddHs(m)
AllChem.EmbedMolecule(m2)
AllChem.MMFFOptimizeMolecule(m2)
pyMP = AllChem.MMFFGetMoleculeProperties(m2)
pyFF = AllChem.MMFFGetMoleculeForceField(m2, pyMP)
pyMP.GetMMFFVdWParams(0,1)

Out:
(3.9377389919289634,
  0.06779699304291371,
  3.9377389919289634,
  0.06779699304291371)
The code at 
https://github.com/rdkit/rdkit/blob/master/Code/ForceField/Wrap/ForceField.cpp
says it returns R_ij_starUnscaled, epsilonUnscaled, R_ij_star, and 
epsilon, but I'm unclear on what the scaling is or why two indices are 
needed.

Thanks!
lewis




Re: [Rdkit-discuss] support for Postgres 11 ?

2019-07-01 Thread Paolo Tosco


Hi John,

yes, it is, I have myself compiled the RDKit PostgreSQL cartridge 
against Postgres 11.


It should just be a matter of adding the following flags to the cmake 
command:


  -DPostgreSQL_ROOT=/usr/pgsql-11 \
  -DPostgreSQL_INCLUDE_DIR=/usr/pgsql-11/include \
  -DPostgreSQL_TYPE_INCLUDE_DIR=/usr/pgsql-11/include/server \
  -DPostgreSQL_LIBRARY_DIR=/usr/pgsql-11/lib \
  -DRDK_BUILD_PGSQL=ON \
  -DRDK_PGSQL_STATIC=ON

and it should work.
Feel free to get back to me off-list if the above does not work for you; 
also, it would be useful to see the error messages that you are getting.


Best,
Paolo

On 07/01/19 18:50, John Irwin wrote:

Hi -

I notice that the anaconda installation of RDKit is with Postgres 9.6.
We tried to install RDKit with Postgres 10.4 and 11 and struggled to 
compile from source.
Meanwhile, Postgres 12 is already in beta and we don't want to get too 
far behind.

Is RDKit compatible with Postgres 11?

Thanks

John

John Irwin
UCSF Pharmaceutical Chemistry
http://irwinlab.compbio.ucsf.edu





Re: [Rdkit-discuss] Query about Substructure match by RDKit

2019-06-28 Thread Paolo Tosco

Hi Goutam,

I assume you are typing the first example interactively in the Python 
interpreter, an IPython shell or a Jupyter notebook.
In that case you only need to enter the same commands in a text editor, save 
the script with a name (e.g., substructure.py) and run it from a shell:

python substructure.py

Note that a script, unlike the interactive interpreter, does not echo the value 
of an expression, so wrap the HasSubstructMatch() call in print() to see the result.

If this is not what you are after, please get back to me off-list so that we 
keep the mailing list traffic to a minimum.

Cheers,
p.

> On 28 Jun 2019, at 19:10, Goutam Mukherjee  wrote:
> 
> Hi All,
> 
> I have a query about substructure matching using RDKit.
> I have a target file of SMILES codes (which need to be converted to canonical 
> SMILES format) that may contain one or more SMILES codes.
> I have one more file, a query file, which contains just one SMILES code of a 
> small fragment. Say,
> Target file: Brc1ccc(cc1)C(=O)Nc1n[nH]c(c1)C1C1
> Query file format: c1cn[nH]c1
> I have to check whether the query fragment is present in the target file or 
> not.
> 
> I am able to do it at the interactive prompt.
> Here are the commands:
> from __future__ import print_function
> from rdkit import Chem
> m = Chem.MolFromSmiles('Brc1ccc(cc1)C(=O)Nc1n[nH]c(c1)C1C1')
> m.HasSubstructMatch(Chem.MolFromSmarts('c1cn[nH]c1'))
> True
> 
> However, I want to do this substructure search from the command line (say, 
> with a Python script).
> Could you please help me do this?
> Your help will be highly appreciated.
> 
> Thanks and Kind Regards,
> Goutam

Re: [Rdkit-discuss] Chem.RemoveHs does not remove explicit hydrogens

2019-06-22 Thread Paolo Tosco

Dear Steven,

yes, it is the expected behaviour.

Molecule mol encodes acetone with all hydrogens made explicit, such that 
the RDKit does not need to use its valence model to compute the number 
of implicit Hs each heavy atom is connected to.

However, as you can see in your mol.GetAtoms() loop, the molecule does 
not contain any real hydrogen atoms in its graph; when you loop over the 
molecule's atoms you only retrieve the heavy atoms.

If you omit one of the explicit hydrogens on the first carbon in your 
SMILES string, the RDKit will replace it with an implicit hydrogen as it 
will use its valence model to fill in the missing explicit hydrogen in 
the assumption that the molecule is not a radical:

mol2  =  Chem.MolFromSmiles('C(C(=O)C([H])([H])[H])([H])[H]')

for atom in mol2.GetAtoms():
    print(atom.GetAtomicNum(), atom.GetNumImplicitHs(),
          atom.GetNumExplicitHs(), atom.GetNumRadicalElectrons())

6 1 2 0
6 0 0 0
8 0 0 0
6 0 3 0

However, if you would rather the RDKit did not add implicit hydrogens, 
you can set the NoImplicit flag on the parent 
heavy atoms. If you then sanitize the molecule, you'll see that it has 
indeed become a radical:

for atom in mol2.GetAtoms():
    atom.SetNoImplicit(True)

Chem.SanitizeMol(mol2)

rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

for atom in mol2.GetAtoms():
    print(atom.GetAtomicNum(), atom.GetNumImplicitHs(),
          atom.GetNumExplicitHs(), atom.GetNumRadicalElectrons())

6 0 2 1
6 0 0 0
8 0 0 0
6 0 3 0

If you now want to add real hydrogen atoms to the molecule graph you can 
do it with Chem.AddHs(); this zeroes all explicit/implicit H counts on 
heavy atoms and replaces explicit/implicit Hs with real hydrogen atoms:

mol2_h  =  Chem.AddHs(mol2,  explicitOnly=True)

for atom in mol2_h.GetAtoms():
    print(atom.GetAtomicNum(), atom.GetNumImplicitHs(),
          atom.GetNumExplicitHs(), atom.GetNumRadicalElectrons())

6 0 0 1
6 0 0 0
8 0 0 0
6 0 0 0
1 0 0 0
1 0 0 0
1 0 0 0
1 0 0 0
1 0 0 0

I hope the above helps. I also suggest having a look at this blog post 
by Roger Sayle:

https://nextmovesoftware.com/blog/2013/02/27/explicit-and-implicit-hydrogens-taking-liberties-with-valence/

It is a quick read and it explains very clearly the difference between 
implicit Hs, explicit Hs, and explicit Hs included in the molecule graph.

Cheers,
p.

On 17/06/2019 20:40, Steven Kearnes via Rdkit-discuss wrote:

mol = Chem.MolFromSmiles('C(C(=O)C([H])([H])[H])([H])([H])[H]')
for atom in mol.GetAtoms():
    print(atom.GetNumImplicitHs(), atom.GetNumExplicitHs())

>>>
0 3
0 0
0 0
0 3

mol_noh = Chem.RemoveHs(mol, updateExplicitCount=True)
for atom in mol_noh.GetAtoms():
    print(atom.GetNumImplicitHs(), atom.GetNumExplicitHs())

>>>
0 3
0 0
0 0
0 3

Note that if I add the hydrogens first, everything works as expected:

mol_h = Chem.AddHs(mol, explicitOnly=True)
mol_noh = Chem.RemoveHs(mol_h)
for atom in mol_noh.GetAtoms():
    print(atom.GetNumImplicitHs(), atom.GetNumExplicitHs())

>>>
3 0
0 0
0 0
3 0

Perhaps this is expected behavior?

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] Shape alignment in wrong 3D reference frame - how to fix?

2019-06-18 Thread Paolo Tosco


Dear Jörg,

I have tried to put together a simple example of constrained alignment 
using a constraintMap.


The gist is here:

https://gist.github.com/ptosco/d948935d41c9f2ab6cae958da56b50a4

while its HTML counterpart (easier for visualizing 3Dmol.js objects) is 
here:


http://htmlpreview.github.io/?https://gist.githubusercontent.com/ptosco/e5ec85fe17810e7e3e67120ed9435226/raw/9e9f83543d7c72d5fee993ae04b9397b7d1e12d9/o3a_restraints.html
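
For reference, here is a minimal sketch of how a constraint map can be 
assembled from two substructure matches; it is not necessarily what the 
gist does. The constraintMap argument of rdMolAlign.GetO3A() takes a list 
of (probe atom index, reference atom index) pairs, so the two match 
tuples need to be zipped atom by atom rather than nested as two whole 
tuples. The helper name build_constraint_map and the sample indices are 
made up for illustration; the weight of 100.0 is the default stated in 
the RDKit documentation.

```python
def build_constraint_map(probe_match, ref_match, weight=100.0):
    """Pair up two substructure matches atom by atom.

    probe_match and ref_match are the index tuples returned by
    GetSubstructMatch() for the same query on probe and reference.
    """
    if len(probe_match) != len(ref_match):
        raise ValueError("matches must have the same length")
    if not probe_match:
        raise ValueError("empty match: the query did not hit")
    # One (probe index, reference index) pair per matched query atom
    constraint_map = [(int(p), int(r)) for p, r in zip(probe_match, ref_match)]
    # One weight per constraint pair
    constraint_weights = [float(weight)] * len(constraint_map)
    return constraint_map, constraint_weights


# Hypothetical matches, e.g. from query3d.GetSubstructMatch(smartsQuery)
probe_match = (0, 1, 5)
ref_match = (3, 4, 9)
cmap, cweights = build_constraint_map(probe_match, ref_match)
print(cmap)      # [(0, 3), (1, 4), (5, 9)]
print(cweights)  # [100.0, 100.0, 100.0]
```

The two lists can then be passed to GetO3A() as constraintMap=cmap and 
constraintWeights=cweights.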

I hope this helps, cheers
p.

On 18/06/2019 08:37, Jörg Kurt Wegner wrote:
Hi Paolo, works like a charm. I did not know you can pass the CID 
directly, so had a workaround via molblocks, which did not work. 
Thanks again.


Now I have stumbled over another challenge. If I interpret it correctly, 
I should be able to SMARTS-match atoms, even if they are different 
elements, as long as the SMARTS matching and the weight vectors are 
the same.
The returned SMARTS matches are integer tuples; the RDKit test cases 
use constraintMap=[[query3dd_contraint_idx, 
references3dd_contraint_idx]], but often with only single integers.
It seems my syntax is wrong? I seem to define the constraint map 
incorrectly; any hints? Ever seen this or worked with constraint maps?


query3dd_contraint_idx = query3d.GetSubstructMatch(smartsQuery)
references3dd_contraint_idx = references3d.GetSubstructMatch(smartsQuery)
rdMolAlign.GetO3A(query3d, references3d, query3d_props, references3d_props,
                  constraintMap=[[query3dd_contraint_idx, references3dd_contraint_idx]],
                  constraintWeights=contraint_weights, prbCid=cid)


Cheers
/.Joerg

On Sun, Jun 16, 2019 at 8:51 PM Paolo Tosco 
<paolo.tosco.m...@gmail.com> wrote:


Dear Jörg,

I have just tried this and it seems to work for me.

I have created a gist here which shows an example usage on
multiple conformations:

https://gist.github.com/ptosco/b9b7341251457fe26441dc17609ae34a

As the notebook contains 3Dmol.js renderings that won't show up in
the gist, I have also created an HTML version which may be more
convenient for you to browse here:


http://htmlpreview.github.io/?https://gist.githubusercontent.com/ptosco/2bc42766a1672f61135d7d7dcce72223/raw/302e459d98c9f4e2fd0f1f2b23221d54c7bd2d8a/o3a.html

I think the problem in your case might originate from the fact
that when you call pyO3A.Align() the transformation is directly
applied to the query molecule, and the RMSD is returned.

If you call pyO3A.Trans() after calling pyO3A.Align() the
transformation will actually be an identity transformation, as the
query has already been aligned when Align() was called, so no
further transformation is required. So if you try to apply the
transformation to the original coordinates instead, that will
result in no change to the coordinates, being an identity.

If you only need the transformation to apply it at a later stage,
you should not call Align first.

I hope the above is clear; feel free to contact me if not (also
off-list).

Cheers,
Paolo

On 16/06/2019 17:56, Jörg Kurt Wegner wrote:

It seems the shape alignment is in a wrong 3D reference frame -
how to fix this?
Here a code snippet looping over all conformations, then finding
the one with the lowest score.
I was under the assumption "query3d" should be at the end in the
same reference frame as "references3d", but it is not? Has anyone
a working 3D shape alignment, ensuring things are truly aligned
in the same reference frame? Thanks /.Joerg

pyO3A=rdMolAlign.GetO3A(query3d_conf, references3d)
rmsd=pyO3A.Align()
score = pyO3A.Score()
rmsd, trans_matrix = pyO3A.Trans()

if highestConfId!=-1:
        rdkit.Chem.AllChem.TransformMol(query3d, trans_matrix,
confId=highestConfId, keepConfs=True)

-- 
LinkedIn <http://linkedin.com/in/joergkurtwegner>, Twitter

<http://twitter.com/joergkurtwegner>, FaceBook
<http://www.facebook.com/joergkurtwegner>


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net  
<mailto:Rdkit-discuss@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss




--
LinkedIn <http://linkedin.com/in/joergkurtwegner>, Twitter 
<http://twitter.com/joergkurtwegner>, FaceBook 
<http://www.facebook.com/joergkurtwegner>
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] Shape alignment in wrong 3D reference frame - how to fix?

2019-06-16 Thread Paolo Tosco


Dear Jörg,

I have just tried this and it seems to work for me.

I have created a gist here which shows an example usage on multiple 
conformations:


https://gist.github.com/ptosco/b9b7341251457fe26441dc17609ae34a

As the notebook contains 3Dmol.js renderings that won't show up in the 
gist, I have also created an HTML version which may be more convenient 
for you to browse here:


http://htmlpreview.github.io/?https://gist.githubusercontent.com/ptosco/2bc42766a1672f61135d7d7dcce72223/raw/302e459d98c9f4e2fd0f1f2b23221d54c7bd2d8a/o3a.html

I think the problem in your case might originate from the fact that when 
you call pyO3A.Align() the transformation is directly applied to the 
query molecule, and the RMSD is returned.


If you call pyO3A.Trans() after calling pyO3A.Align() the transformation 
will actually be an identity transformation, as the query has already 
been aligned when Align() was called, so no further transformation is 
required. So if you try to apply the transformation to the original 
coordinates instead, that will result in no change to the coordinates, 
being an identity.


If you only need the transformation to apply it at a later stage, you 
should not call Align first.
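
To make the difference concrete, here is a minimal sketch that computes 
the transformation with Trans() only and applies it afterwards. The two 
molecules, the random seeds and the variable names are arbitrary choices 
for illustration and do not come from your script; the calls themselves 
(GetO3A, Trans, TransformMol) are the ones already used in this thread.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolAlign

# Two small, arbitrary molecules with one 3D conformer each
ref = Chem.AddHs(Chem.MolFromSmiles("c1ccccc1CCN"))
prb = Chem.AddHs(Chem.MolFromSmiles("c1ccccc1CCO"))
AllChem.EmbedMolecule(ref, randomSeed=42)
AllChem.EmbedMolecule(prb, randomSeed=7)

before = prb.GetConformer().GetPositions().copy()

o3a = rdMolAlign.GetO3A(prb, ref)
# Trans() only computes the optimal transformation; the probe is not moved
rmsd, trans_matrix = o3a.Trans()
unchanged = np.allclose(before, prb.GetConformer().GetPositions())

# Apply the stored 4x4 matrix at whatever later stage it is needed
AllChem.TransformMol(prb, np.array(trans_matrix), keepConfs=True)
moved = not np.allclose(before, prb.GetConformer().GetPositions())

print(unchanged, moved)
```

Calling o3a.Align() instead would move the probe straight away, which is 
why a Trans() call made after it has nothing left to do.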


I hope the above is clear; feel free to contact me if not (also off-list).

Cheers,
Paolo

On 16/06/2019 17:56, Jörg Kurt Wegner wrote:
It seems the shape alignment is in a wrong 3D reference frame - how to 
fix this?
Here a code snippet looping over all conformations, then finding the 
one with the lowest score.
I was under the assumption "query3d" should be at the end in the same 
reference frame as "references3d", but it is not? Has anyone a working 
3D shape alignment, ensuring things are truly aligned in the same 
reference frame? Thanks /.Joerg


pyO3A=rdMolAlign.GetO3A(query3d_conf, references3d)
rmsd=pyO3A.Align()
score = pyO3A.Score()
rmsd, trans_matrix = pyO3A.Trans()

if highestConfId!=-1:
        rdkit.Chem.AllChem.TransformMol(query3d, trans_matrix, 
confId=highestConfId, keepConfs=True)






___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] Catching errors in SMILES files

2019-06-04 Thread Paolo Tosco


Hi David,

I think I already have a fix for this bug; I'll submit a PR later. If 
you could create a GitHub issue, that would be great, so I can link my 
PR to the bug.


Thanks, cheers
p.


On 06/04/19 12:10, David Cosgrove wrote:

Hi Paolo,
Many thanks for the speedy reply.  I'll do as you suggest for now.  Do 
you want me to file an issue on github, or even, maybe, see if I can 
fix it myself?

Cheers,
Dave


On Mon, Jun 3, 2019 at 5:32 PM Paolo Tosco <paolo.tosco.m...@gmail.com> wrote:


Hi David,

a workaround could be adding a final check after the for loop:

#!/usr/bin/env python

from rdkit import Chem

suppl1 = Chem.SmilesMolSupplier('test1.smi', titleLine=False,
                                nameColumn=1)
rec_num = 0
print("len(suppl1) = {0:d}".format(len(suppl1)))
for mol in suppl1:
    rec_num += 1
    if not mol:
        print('Record {} not read.'.format(rec_num))
    else:
        print('Record {} read ok.'.format(rec_num))
# If the last record is bad, the loop stops one record short
if (rec_num == len(suppl1) - 1):
    rec_num += 1
    print('Record {} not read.'.format(rec_num))


suppl2 = Chem.SmilesMolSupplier('test2.smi', titleLine=False,
                                nameColumn=1)
rec_num = 0
print("len(suppl2) = {0:d}".format(len(suppl2)))
for mol in suppl2:
    rec_num += 1
    if not mol:
        print('Record {} not read.'.format(rec_num))
    else:
        print('Record {} read ok.'.format(rec_num))
# If the last record is bad, the loop stops one record short
if (rec_num == len(suppl2) - 1):
    rec_num += 1
    print('Record {} not read.'.format(rec_num))

This should work until what seems to be an issue in the
SmilesSupplier is fixed.

Cheers,
p.

On 06/03/19 16:49, David Cosgrove wrote:

Hi,

I'm trying to catch the line numbers of lines in a SMILES file
that aren't parsed by the SmilesMolSupplier.  Example code is
attached, along with 2 SMILES files.  When there is a bad SMILES
string on the last line, the error is not reported, as in
test2.smi.  I've tried iterating through the file in a loop using
next(suppl1) and catching the StopIteration exception, but I have
the same issue. Is there a way to spot a last bad record in a file?

Thanks,
Dave

-- 
David Cosgrove

Freelance computational chemistry and chemoinformatics developer
http://cozchemix.co.uk





___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
<mailto:Rdkit-discuss@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss




--
David Cosgrove
Freelance computational chemistry and chemoinformatics developer
http://cozchemix.co.uk



___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] Catching errors in SMILES files

2019-06-03 Thread Paolo Tosco


Hi David,

a workaround could be adding a final check after the for loop:

#!/usr/bin/env python

from rdkit import Chem

suppl1 = Chem.SmilesMolSupplier('test1.smi', titleLine=False, nameColumn=1)
rec_num = 0
print("len(suppl1) = {0:d}".format(len(suppl1)))
for mol in suppl1:
    rec_num += 1
    if not mol:
        print('Record {} not read.'.format(rec_num))
    else:
        print('Record {} read ok.'.format(rec_num))
# If the last record is bad, the loop stops one record short
if (rec_num == len(suppl1) - 1):
    rec_num += 1
    print('Record {} not read.'.format(rec_num))


suppl2 = Chem.SmilesMolSupplier('test2.smi', titleLine=False, nameColumn=1)
rec_num = 0
print("len(suppl2) = {0:d}".format(len(suppl2)))
for mol in suppl2:
    rec_num += 1
    if not mol:
        print('Record {} not read.'.format(rec_num))
    else:
        print('Record {} read ok.'.format(rec_num))
# If the last record is bad, the loop stops one record short
if (rec_num == len(suppl2) - 1):
    rec_num += 1
    print('Record {} not read.'.format(rec_num))

This should work until what seems to be an issue in the SmilesSupplier 
is fixed.
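
An alternative that sidesteps the supplier altogether is to read the 
file yourself and parse one line at a time, so that every bad record, 
including the last one, is reported with its record number. Below is a 
minimal sketch: find_bad_records is a made-up helper, and the parser is 
passed in as a callable so the logic can be tried without RDKit. In 
practice you would pass Chem.MolFromSmiles, which returns None for a 
SMILES it cannot parse. The toy parser and the sample records are purely 
illustrative.

```python
def find_bad_records(lines, parse):
    """Return the 1-based numbers of the records that parse() rejects.

    Each non-empty line is expected to hold a SMILES optionally followed
    by a name, as in a .smi file without a title line.
    """
    bad = []
    rec_num = 0
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # blank lines are skipped and not counted as records
        rec_num += 1
        if parse(fields[0]) is None:
            bad.append(rec_num)
    return bad


def toy_parse(smiles):
    """Stand-in for Chem.MolFromSmiles: rejects unbalanced parentheses."""
    return smiles if smiles.count("(") == smiles.count(")") else None


records = ["CCO ethanol", "c1ccccc1 benzene", "CC(C bad_last_record"]
bad_records = find_bad_records(records, toy_parse)
print(bad_records)  # [3]
```

With a real file, an open file handle can be passed as the first 
argument, since iterating over it yields one line at a time.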


Cheers,
p.

On 06/03/19 16:49, David Cosgrove wrote:

Hi,

I'm trying to catch the line numbers of lines in a SMILES file that 
aren't parsed by the SmilesMolSupplier.  Example code is attached, 
along with 2 SMILES files.  When there is a bad SMILES string on the 
last line, the error is not reported, as in test2.smi.  I've tried 
iterating through the file in a loop using next(suppl1) and catching 
the StopIteration exception, but I have the same issue.  Is there a 
way to spot a last bad record in a file?


Thanks,
Dave

--
David Cosgrove
Freelance computational chemistry and chemoinformatics developer
http://cozchemix.co.uk





___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



Re: [Rdkit-discuss] Unable to build RDKit C# wrapper due to boost libraries

2019-05-03 Thread Paolo Tosco

3.14/Modules/UseSWIG.cmake:695 
(SWIG_ADD_SOURCE_TO_MODULE)

  Code/JavaWrappers/csharp_wrapper/CMakeLists.txt:63 (SWIG_ADD_LIBRARY)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Configuring done
-- Generating done
-- Build files have been written to: 
C:/Users/paolo/AppData/Local/rdkit-Release_2019_03_1/build


(my-rdkit-env) C:\Users\paolo\AppData\Local\rdkit-Release_2019_03_1>cd build

(my-rdkit-env) 
C:\Users\paolo\AppData\Local\rdkit-Release_2019_03_1\build>"C:\Program 
Files (x86)\Microsoft Visual 
Studio\2017\Professional\MSBuild\15.0\Bin\MSBuild.exe" /m:4 
/p:Configuration=Release /p:Platform=x64 INSTALL.vcxproj



[...]


    227 Warning(s)
    0 Error(s)

Time Elapsed 00:27:44.47

(my-rdkit-env) C:\Users\paolo\AppData\Local\rdkit-Release_2019_03_1\build>

Cheers,
p.



On 05/03/19 10:58, Esther Barlow-Smith wrote:

Hi Paolo,
I have tried it with the conda provided boost and it didn't work which 
is why I thought I'd try with a non-Anaconda Boost, but I've now 
switched back to the conda boost version. Unfortunately it still does 
not work.

Esther


*From:* Paolo Tosco 
*Sent:* 03 May 2019 10:43
*To:* Esther Barlow-Smith
*Subject:* Re: [Rdkit-discuss] Unable to build RDKit C# wrapper due to 
boost libraries


Hi Esther,


I have looked at your error messages in more detail.

So it looks like you are using cmake from an Anaconda CMD shell:


conda activate my-rdkit-env


but are then trying to use a non-Anaconda Boost build located in 
C:\Users\e.barlow-smith\AppData\Local\boost_1_67_0.


Is there any specific reason why you are doing so? This is going to 
fail unless you built Boost against the conda Python.


If you use the conda-provided Boost doing a


conda install boost


you can skip defining BOOST_ROOT, BOOST_LIBRARYDIR and 
BOOST_INCLUDEDIR, and cmake should automatically find and use the 
conda Boost.



Let me know if this works.


p.


On 05/03/19 10:28, Esther Barlow-Smith wrote:

Hi Paolo,
I updated CMake to the 3.14.3 version with conda but it gave me the 
same error. I have also tried previously with boost 1.56 and it 
didn't work but I could try using it again.

Esther


*From:* Paolo Tosco  
<mailto:paolo.tosco.m...@gmail.com>

*Sent:* 03 May 2019 10:08
*To:* Esther Barlow-Smith
*Subject:* Re: [Rdkit-discuss] Unable to build RDKit C# wrapper due 
to boost libraries


Hi Esther,


it is worth updating CMake to the latest version: the Boost version you 
are using is fairly new, and an older CMake might not be able to 
find it.



Let me know how it goes.


p.


On 05/03/19 10:06, Esther Barlow-Smith wrote:

Hi Paolo,
I tried again with your correction but I got the same error that the 
boost libraries could not be found.

Thanks,
Esther


*From:* Paolo Tosco  
<mailto:paolo.tosco.m...@gmail.com>

*Sent:* 03 May 2019 09:43
*To:* Esther Barlow-Smith; rdkit-discuss@lists.sourceforge.net 
<mailto:rdkit-discuss@lists.sourceforge.net>
*Subject:* Re: [Rdkit-discuss] Unable to build RDKit C# wrapper due 
to boost libraries


Dear Esther,

you should not set BOOST_INCLUDEDIR, BOOST_ROOT, BOOST_LIBRARYDIR as 
environment variables, but as CMake parameters, i.e. something like:


cmake -S %RDBASE% -B "%RDBASE%\build" 
-DRDK_BUILD_SWIG_CSHARP_WRAPPER=ON 
-DBOOST_ROOT=C:\Users\e.barlow-smith\AppData\Local\boost_1_67_0\boost 
-DBOOST_INCLUDEDIR=C:\Users\e.barlow-smith\AppData\Local\boost_1_67_0\boost\include 
-DBOOST_LIBRARYDIR=C:\Users\e.barlow-smith\AppData\Local\boost_1_67_0\boost\lib


Cheers,
p.


On 05/02/19 20:28, Esther Barlow-Smith wrote:

Dear all,

I am trying to build the RDKit C# wrapper but I am having issues 
due to the build not finding certain boost libraries (boost_system, 
boost_iostreams and boost_python). I have tried solving the error 
by installing boost in different locations; through anaconda and 
also by installing boost separately but I get the error for both 
instances. I have also set the 3 environment 
variables;  BOOST_INCLUDEDIR, BOOST_ROOT, BOOST_LIBRARYDIR, but 
CMake ignores the variables no matter what they are set to.

The error is reproducible for me with the following code:

set RDBASE=C:\Users\e.barlow-smith\AppData\Local\RDKit
set BOOST_ROOT=C:\Users\e.barlow-smith\AppData\Local\boost_1_67_0\boost
cd %RDBASE%
conda activate my-rdkit-env
cmake -S %RDBASE% -B "%RDBASE%\build" 
-DRDK_BUILD_SWIG_CSHARP_WRAPPER=ON


My configuration:
OS: Windows: 10.0.17
Rdkit: 2019.03.01
Python: 3.7.3
CMake: 3.14
Conda: 4.6.11
swig: 3.0.12

*Error message:*
-- Found Catch2 source in 
C:/Users/e.barlow-smith/AppData/Local/RDKit/External/catch/catch
CATCH: 
C:/Users/e.barlow-smith/AppData/Local/RDKit/External/catch/catch/single_include

CMake Warning (dev) at CMakeLists.txt:246 (find_p

Re: [Rdkit-discuss] Unable to build RDKit C# wrapper due to boost libraries

2019-05-03 Thread Paolo Tosco


Dear Esther,

you should not set BOOST_INCLUDEDIR, BOOST_ROOT, BOOST_LIBRARYDIR as 
environment variables, but as CMake parameters, i.e. something like:


cmake -S %RDBASE% -B "%RDBASE%\build" -DRDK_BUILD_SWIG_CSHARP_WRAPPER=ON 
-DBOOST_ROOT=C:\Users\e.barlow-smith\AppData\Local\boost_1_67_0\boost 
-DBOOST_INCLUDEDIR=C:\Users\e.barlow-smith\AppData\Local\boost_1_67_0\boost\include 
-DBOOST_LIBRARYDIR=C:\Users\e.barlow-smith\AppData\Local\boost_1_67_0\boost\lib


Cheers,
p.


On 05/02/19 20:28, Esther Barlow-Smith wrote:

Dear all,

I am trying to build the RDKit C# wrapper but I am having issues due 
to the build not finding certain boost libraries (boost_system, 
boost_iostreams and boost_python). I have tried solving the error by 
installing boost in different locations; through anaconda and also by 
installing boost separately but I get the error for both instances. I 
have also set the 3 environment variables;  BOOST_INCLUDEDIR, 
BOOST_ROOT, BOOST_LIBRARYDIR, but CMake ignores the variables no 
matter what they are set to.

The error is reproducible for me with the following code:

set RDBASE=C:\Users\e.barlow-smith\AppData\Local\RDKit
set BOOST_ROOT=C:\Users\e.barlow-smith\AppData\Local\boost_1_67_0\boost
cd %RDBASE%
conda activate my-rdkit-env
cmake -S %RDBASE% -B "%RDBASE%\build" -DRDK_BUILD_SWIG_CSHARP_WRAPPER=ON

My configuration:
OS: Windows: 10.0.17
Rdkit: 2019.03.01
Python: 3.7.3
CMake: 3.14
Conda: 4.6.11
swig: 3.0.12

*Error message:*
-- Found Catch2 source in 
C:/Users/e.barlow-smith/AppData/Local/RDKit/External/catch/catch
CATCH: 
C:/Users/e.barlow-smith/AppData/Local/RDKit/External/catch/catch/single_include

CMake Warning (dev) at CMakeLists.txt:246 (find_package):
  Policy CMP0074 is not set: find_package uses <PackageName>_ROOT variables.
  Run "cmake --help-policy CMP0074" for policy details.  Use the cmake_policy
  command to set the policy and suppress this warning.

  Environment variable Boost_ROOT is set to:
C:\Users\e.barlow-smith\AppData\Local\boost_1_67_0\boost
  For compatibility, CMake is ignoring the variable.
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning at 
C:/Users/e.barlow-smith/AppData/Local/Continuum/anaconda3/envs/my-rdkit-env/Library/share/cmake-3.14/Modules/FindBoost.cmake:1799 
(message):

  No header defined for python-py37; skipping header check
Call Stack (most recent call first):
  CMakeLists.txt:246 (find_package)

CMake Error at 
C:/Users/e.barlow-smith/AppData/Local/Continuum/anaconda3/envs/my-rdkit-env/Library/share/cmake-3.14/Modules/FindBoost.cmake:2132 
(message):

  Unable to find the requested Boost libraries.
  Boost version: 1.67.0
  Boost include path:
C:/Users/e.barlow-smith/AppData/Local/Continuum/anaconda3/envs/my-rdkit-env/Library/include

  Could not find the following Boost libraries:
          boost_python
  No Boost libraries were found.  You may need to set BOOST_LIBRARYDIR 
to the

  directory containing Boost libraries or BOOST_ROOT to the location of
  Boost.
Call Stack (most recent call first):
  CMakeLists.txt:254 (find_package)

-- Could NOT find Boost
== Using strict rotor definition
-- maeparser include dir set as 
'C:/Users/e.barlow-smith/AppData/Local/RDKit/External/CoordGen'
-- maeparser libraries set as 
'C:/Users/e.barlow-smith/AppData/Local/Continuum/anaconda3/envs/my-rdkit-env/Library/lib/maeparser.lib'
-- Found maeparser: 
C:/Users/e.barlow-smith/AppData/Local/RDKit/External/CoordGen
-- coordgen include dir set as 
C:/Users/e.barlow-smith/AppData/Local/RDKit/External/CoordGen
-- coordgen libraries set as 
'C:/Users/e.barlow-smith/AppData/Local/Continuum/anaconda3/envs/my-rdkit-env/Library/lib/coordgen.lib'

-- coordgen templates file set as 'coordgen_TEMPLATE_FILE-NOTFOUND'
-- Could NOT find coordgen (missing: coordgen_TEMPLATE_FILE)
-- Found coordgenlibs source in 
C:/Users/e.barlow-smith/AppData/Local/RDKit/External/CoordGen/coordgen


  Could not find the following Boost libraries:
          boost_system
          boost_iostreams

  Some (but not all) of the required Boost libraries were found.  You may
  need to install these additional Boost libraries.  Alternatively, set
  BOOST_LIBRARYDIR to the directory containing Boost libraries or 
BOOST_ROOT

  to the location of Boost.
Call Stack (most recent call first):
  Code/RDStreams/CMakeLists.txt:1 (find_package)

-- Could NOT find Boost
CMake Warning (dev) at Code/GraphMol/FileParsers/CMakeLists.txt:6 
(find_package):
CMake Error at 
C:/Users/e.barlow-smith/AppData/Local/Continuum/anaconda3/envs/my-rdkit-env/Library/share/cmake-3.14/Modules/FindBoost.cmake:2132 
(message):

  Unable to find the requested Boost libraries.
-- Could NOT find Boost
== Making EnumerateLibrary without boost Serialization support
== Making FilterCatalog without boost Serialization support
== Updating Filters.cpp from pains file
== Done updating pains files
== Making SubstructLibrary without boost Serialization support
-- Found RapidJSON source in

Re: [Rdkit-discuss] Get num of heavy atoms returns incorrect value

2019-05-01 Thread Paolo Tosco


Hi Lukas,

in the RDKit notation all atoms are explicit if they are present in the 
molecule graph, including hydrogens. You mention that hydrogens are 
explicitly present in your input structure, so that's the expected 
behaviour.


If you wish to retrieve the number of heavy atoms you can use 
mol.GetNumHeavyAtoms().


If you want to remove hydrogens from the molecule graph and make them 
implicit, you may call Chem.RemoveHs().
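
As a quick illustration of the difference between these counts, here is 
a small sketch on ethanol; the molecule and the variable names are 
arbitrary and only meant as an example. It builds the molecule from 
SMILES, so it assumes a sanitized molecule rather than one constructed 
by hand from an mmCIF file.

```python
from rdkit import Chem

# Ethanol with hydrogens as real atoms in the graph (arbitrary example)
mol_h = Chem.AddHs(Chem.MolFromSmiles("CCO"))
n_all = mol_h.GetNumAtoms()         # every node in the graph, Hs included
n_heavy = mol_h.GetNumHeavyAtoms()  # non-hydrogen atoms only

# Remove the hydrogens from the graph so that they become implicit
mol_noh = Chem.RemoveHs(mol_h)
n_graph = mol_noh.GetNumAtoms()
n_with_implicit = mol_noh.GetNumAtoms(onlyExplicit=False)

print(n_all, n_heavy, n_graph, n_with_implicit)  # 9 3 3 9
```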


Hope this helps, cheers
p.


On 05/01/19 15:23, Lukas Pravda wrote:


Dear all,

I construct my own rdkit.Mol objects from mmcif files. I wanted to use 
mol.GetNumAtoms(onlyExplicit=True) to get the number of heavy atoms 
for that molecule, however, I have noticed that the function returns 
all the time number of all atoms in the molecule including hydrogens 
(47 vs. expected 31). When I try to iterate over the atoms to get 
number of Implicit/Explicit Hs for each atom I get 0 for all the atoms 
in the molecule, although the element types are correct (C’s, O’s, H’s 
etc.)


So I assume that I construct the molecule incorrectly and wonder if 
there’s a way to tag hydrogen atoms correctly when I construct them.


Hydrogens are explicitly present in my input structures and I’d like 
to get GetNumAtoms(onlyExplicit=True) function to work as expected. 
Attached is a python pickle of ATP molecule with two conformations.


Interestingly rdkit.Chem.Descriptors.HeavyAtomCount(self.mol) returns 
correct value as expected.


My configuration:

OS: MacOS: 10.14.4

Rdkit: 2019.03.01

Python: 3.7.3

Best,

Lukas





___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



Re: [Rdkit-discuss] molecular dynamics using RDkit only

2019-04-13 Thread Paolo Tosco


Dear Jim,

I have indeed written some preliminary code to:

1) Implement MMFF94/MMFF94s into OpenMM
2) Integrate OpenMM within the RDKit

Although it still needs polishing and unit tests, this code is 
functional and usable.


I will make sure that it builds and integrates correctly with the 
current version of OpenMM and RDKit and notify the mailing list as soon 
as this is done.


Cheers,
p.


On 04/13/19 19:13, James T. Metz via Rdkit-discuss wrote:

RDkit Discussion Group,

    I am aware of RDkit scripts that use the MMFF force field to minimize
small molecules.  Has anyone written RDkit code to perform molecular
dynamics (MD) of small molecules or protein-ligand complexes using only
RDkit and existing RDkit force fields?  I am aware of a number of other
programs to perform MD, but I am specifically interested in RDkit/Python
only codes at present, i.e., no other dependencies.  If anyone has any
code, even if preliminary, that calculates the potential energies, forces,
velocities, accelerations, etc. to propagate the motions of the atoms, and
is willing to share it, that would be much appreciated.

    Regards,
    Jim Metz






___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



Re: [Rdkit-discuss] Beta of the 2019.03 release available

2019-04-06 Thread Paolo Tosco


Dear Markus,

in case you ran into the Float8GetDatum undefined symbol issue (it just 
happened to me), please make sure that you are actually building the 
cartridge against the PostgreSQL 11 headers.


You may check this in the CMake output; for example, in my case it has 
to be:


postgres: /usr/pgsql-11/include;/usr/pgsql-11/include/server

and to obtain this I had to define in my cmake command line:

  -DPostgreSQL_ROOT=/usr/pgsql-11 \
  -DPostgreSQL_INCLUDE_DIR=/usr/pgsql-11/include \
  -DPostgreSQL_TYPE_INCLUDE_DIR=/usr/pgsql-11/include/server \
  -DPostgreSQL_LIBRARY_DIR=/usr/pgsql-11/lib \

in addition to the usual

  -DRDK_BUILD_PGSQL=ON \
  -DRDK_PGSQL_STATIC=ON \

to prevent FindPostgreSQL.cmake from picking up the headers of the 
system CentOS 7 PostgreSQL in /usr/include rather than the PostgreSQL 11 
headers in /usr/pgsql-11/include.


HTH, cheers,
p.

On 05/04/2019 17:42, Markus Sitzmann wrote:

Hi Greg,

my Chembience RDKit image build with version 2019.03-b1b went fine 
(well, I just pull it with conda; in case someone is interested it is 
available with tag 0.2.10-beta-1 at Dockerhub).


For the Postgres extension (which I still compile myself during the 
Docker build against Postgres), your Python 3 enforcement uncovered 
some dark corners of my build process, but that is fixed. However, 
compiling 2019.03-b1b against Postgres 11 fails (am I too cheeky?).


Markus

On Wed, Apr 3, 2019 at 11:38 AM Greg Landrum <greg.land...@gmail.com> wrote:


Dear all,

The beta of the 2019.03 RDKit release has been tagged in github:
https://github.com/rdkit/rdkit/releases/tag/Release_2019_03_1b1

There are a couple more bug fixes and maybe one more feature
expected before the actual release, but I wanted to go ahead and
get the beta out there.

I've done conda builds for Python 3.6 and 3.7 for Windows, Mac,
and Linux. These all use the beta label so that they do not
install by default; you'll need to run "conda install" as follows:

conda install -c rdkit/label/beta rdkit

Be sure to confirm that it's installing the right version when you
are prompted (if there's no build available, it will pick the
current production release instead).

The relevant section of the release notes is below, or you can see
a nicely formatted version here:
https://github.com/rdkit/rdkit/releases/tag/Release_2019_03_1b1

As usual, if you have time to try out the new release I would love
feedback. If nothing major comes up, I plan to do the actual
release early next week.

Best,
-greg

# Release_2019.03.1
(Changes relative to Release_2018.09.1)

## REALLY IMPORTANT ANNOUNCEMENT
- As of this release (2019.03.1) the RDKit no longer supports Python 2.
  Please read this rdkit-discuss post to learn what your options are if you
  need to keep using Python 2:
  https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg08354.html

## Backwards incompatible changes
- The fix for github #2245 means that the default behavior of the MaxMinPicker
  is now truly random. If you would like to reproduce the previous behavior,
  provide a seed value of 42.
- The uncharging method in the MolStandardizer now attempts to generate
  canonical results for a given molecule. This may result in different output
  for some molecules.
   
## Highlights:

- There's now a Japanese translation of large parts of the RDKit documentation
- SGroup data can now be read from and written to Mol/SDF files
- The enhanced stereo handling has been improved: the information is now
  accessible from Python, EnumerateStereoisomers takes advantage of it, and it
  can be read from and written to CXSmiles

## Acknowledgements:
Michael Banck, Francois Berenger, Thomas Blaschke, Brian Cole, Andrew Dalke,
Bakary N'tji Diallo, Guillaume Godin, Jan Holst Jensen, Sunhwan Jo, Brian
Kelley, Petr Kubat, Karl Leswing, Susan Leung, John Mayfield, Adam Moyer, Dan
Nealschneider, Noel O'Boyle, Stephen Roughley, Takayuki Serizawa, Gianluca
Sforna, Ricardo Rodriguez Schmidt, Matt Swain, Paolo Tosco, Ricardo Vianello,
'John-Videogames', 'magattaca', 'msteijaert', 'paconius', 'sirbiscuit'

## Bug Fixes:
   - PgSQL: fix boolean definitions for Postgresql 11
  (github pull #2129 from pkubatrh)
   - update fingerprint tutorial notebook
  (github pull #2130 from greglandrum)
   - Fix typo in RecapHierarchyNode destructor
  (github pull #2137 from iwatobipen)
   - SMARTS roundtrip failure
  (github issue #2142 from mcs07)
   - Error thrown in rdMolStandardize.ChargeParent
  (github issue #2144 from paconius)
   - SMILES parsing inconsistency based on input order
  (github issue #2148 from coleb)
   - MolDraw2D: line width not in python wrapper
  (github issue #2149 from greglandrum)

Re: [Rdkit-discuss] multiThreads OptimizeMoleculeConf with constraint?

2019-03-29 Thread Paolo Tosco


Hi Rafal,

There is no C++-level function to do that, but you can use Python 
multiprocessing to achieve the same result with only a small efficiency 
loss, due to the larger overhead of starting multiple processes as 
compared to multiple threads:


from multiprocessing import Pool
from rdkit import Chem
from rdkit.Chem import rdDistGeom, ChemicalForceFields

mol = Chem.AddHs(Chem.MolFromSmiles(
    "CCOCCn1c(C2CC[NH+](CCc3ccc(C(C)(C)C(=O)[O-])cc3)CC2)nc2c21"))

def min_proc(mol_confid_tuple):
    mol, confid = mol_confid_tuple
    ffps = ChemicalForceFields.MMFFGetMoleculeProperties(mol)
    ff = ChemicalForceFields.MMFFGetMoleculeForceField(
        mol, ffps, confId=confid, ignoreInterfragInteractions=False)
    ff.MMFFAddDistanceConstraint(1, 2, False, 1.2, 1.55, 1e05)
    res = ff.Minimize(maxIts=200)
    return (mol, confid)

confids = list(rdDistGeom.EmbedMultipleConfs(
    mol, numConfs=100, numThreads=10))

p = Pool(10)
mol_confid_tuples = p.map(min_proc,
                          [(mol, confid) for confid in confids])
mol.RemoveAllConformers()
[mol.AddConformer(m.GetConformer(confid))
 for m, confid in mol_confid_tuples];

Cheers,
p.

On 03/28/19 19:44, Rafal Roszak wrote:

Hello all,

Is it possible to optimize multiple conformations with constraints using more 
than one thread (core)?
I have the following code:

Chem.rdDistGeom.EmbedMultipleConfs(mol, numConfs=100, numThreads=10)
ffps = ChemicalForceFields.MMFFGetMoleculeProperties(mol)
ff = ChemicalForceFields.MMFFGetMoleculeForceField(mol, ffps, confId=confid, 
ignoreInterfragInteractions=False)
ff.MMFFAddDistanceConstraint(1, 2, False, 1.2, 1.55, 1e05)
res=ff.Minimize(maxIts=500)

which generates the initial structures in parallel but then optimizes them 
one by one. I found an interesting function in AllChem,

AllChem.MMFFOptimizeMoleculeConfs

which can optimize all conformers using many threads, but how can I combine 
this with constraints?

Best regards,

Rafał


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



Re: [Rdkit-discuss] Get list of dihedrals using atom names/numbers from PDB

2019-03-27 Thread Paolo Tosco


Hi Angelica,

if torsion_lists is the tuple of (non_ring, ring) torsion lists in mol:

torsion_lists = TorsionFingerprints.CalculateTorsionLists(mol)

you can get the same list where the atom index is replaced by a 
(PDB_atom_serial, PDB_atom_name) tuple as follows:


torsion_lists_sn = [[tuple([[tuple([(
    mol.GetAtomWithIdx(i).GetMonomerInfo().GetSerialNumber(),
    mol.GetAtomWithIdx(i).GetMonomerInfo().GetName().strip()
) for i in t]) for t in tt], v]) for tt, v in tl] for tl in torsion_lists]

which is a rather unreadable nested list comprehension equivalent to the 
following series of nested loops:


torsion_lists_sn = []
for tl in torsion_lists:
    tl2 = []
    for tt, v in tl:
        tv2 = []
        for t in tt:
            tt2 = []
            for i in t:
                pdb_res_info = mol.GetAtomWithIdx(i).GetMonomerInfo()
                serial = pdb_res_info.GetSerialNumber()
                name = pdb_res_info.GetName().strip()
                tt2.append((serial, name))
            tv2.append(tuple(tt2))
        tl2.append((tv2, v))
    torsion_lists_sn.append(tl2)

Cheers,
p.

On 03/27/19 21:43, Angelica Parente wrote:
I’m trying to use the module 
rdkit.Chem.TorsionFingerprints.CalculateTorsionLists to print out a 
list of torsions. I pass in a PDB file using Chem.MolFromPDBFile, and 
would ideally like the torsion list to specify the atoms using the 
atom names and/or numbers from the PDB file. Right now I get a list of 
rdkit atom indices, and am not sure how to convert these back to the 
original PDB naming/numbering schemes.

Thanks,

Angelica





Re: [Rdkit-discuss] ForwardSDMolSupplier, stdin, and python 3.7

2019-03-27 Thread Paolo Tosco


Hi Andy,

for that to work with Python 3 you will need to replace sys.stdin with 
sys.stdin.buffer.
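
As a minimal sketch of that change (the helper names read_mols and main are 
made up for illustration and are not part of the RDKit API): under Python 3, 
sys.stdin is a text stream, whereas ForwardSDMolSupplier expects a file-like 
object that returns bytes, which is what sys.stdin.buffer provides.

```python
import sys
from rdkit import Chem


def read_mols(stream):
    """Yield molecules from a binary SD stream, skipping unparsable records.

    read_mols is a hypothetical helper name, not an RDKit function.
    """
    for mol in Chem.ForwardSDMolSupplier(stream):
        if mol is not None:
            yield mol


def main():
    # sys.stdin.buffer is the underlying bytes stream behind sys.stdin
    for mol in read_mols(sys.stdin.buffer):
        print(Chem.MolToSmiles(mol))
```

Calling main() at the end of a script and running it as 
"python test.py < input.sdf" should then iterate over the records without 
the TypeError. Any other binary file-like object (for example an 
io.BytesIO buffer or a file opened in 'rb' mode) can be passed to read_mols 
in the same way.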


HTH, cheers
Paolo


On 03/27/19 18:32, Andy wrote:

Hi,

I have used the git version of rdkit for years (local build with 
python 2.7). The last update pushed me to use python3, and, in 
addition to the usual fixes, I noticed that ForwardSDMolSupplier does 
not work with sys.stdin anymore, i.e., the usual construct


fin = Chem.ForwardSDMolSupplier(sys.stdin)
for mol in fin:

gives
[11:59:54] Unexpected error hit on line 1
[11:59:54] ERROR: moving to the begining of the next molecule
TypeError: expected bytes, str found

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "test.py", line 7, in 
    for mol in fin:
SystemError:  returned 
a result with an error set


The pre-built conda package 
(linux-64/rdkit-2018.09.2.0-py37hc20afe1_1.tar.bz2) shows the same 
behavior. I am using a 64-bit Linux (Arch) system.


Thanks,
--Andy







Re: [Rdkit-discuss] add hydrogen after its removal

2019-03-22 Thread Paolo Tosco


Hi Pavel,

After you have first called AddHs(), all hydrogens in your molecule are 
now in the molecule graph as real atoms, and there are no more 
implicit/explicit Hs.


Therefore, when you call RemoveAtom(), you are removing a real atom from 
the molecule graph, and the implicit/explicit H count stays 0 for the 
parent heavy atom. Hence calling AddHs() again does not add any hydrogens.


If you wish the hydrogen to come back when you call AddHs(), you need to 
increase the explicit H count on the parent heavy atom:


from rdkit import Chem

print(' load mol ')
m = Chem.MolFromSmiles('c1ccccc1O')
print(Chem.MolToSmiles(m, allHsExplicit=True))
print(' add Hs ')
m = Chem.AddHs(m)
print(Chem.MolToSmiles(m, allHsExplicit=True))
print(' remove H ')
nbrs = m.GetAtomWithIdx(7).GetNeighbors()
if (len(nbrs) == 1 and nbrs[0].GetAtomicNum() > 1):
    nbrs[0].SetNumExplicitHs(nbrs[0].GetNumExplicitHs() + 1)
em = Chem.EditableMol(m)
em.RemoveAtom(7)
print(Chem.MolToSmiles(em.GetMol(), allHsExplicit=True))
print(' add Hs ')
mm = Chem.AddHs(em.GetMol())
print(Chem.MolToSmiles(mm, allHsExplicit=True))

 load mol 
[OH][c]1[cH][cH][cH][cH][cH]1
 add Hs 
[H][O][c]1[c]([H])[c]([H])[c]([H])[c]([H])[c]1[H]
 remove H 
[H][O][c]1[cH][c]([H])[c]([H])[c]([H])[c]1[H]
 add Hs 
[H][O][c]1[c]([H])[c]([H])[c]([H])[c]([H])[c]1[H]
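
The same bookkeeping can be wrapped in a small reusable function. This is 
only a sketch: remove_h_keep_count is a hypothetical helper name, it uses 
RWMol instead of EditableMol, and the UpdatePropertyCache(strict=False) call 
is an added precaution to refresh cached valences after the edit rather than 
something the steps above require.

```python
from rdkit import Chem


def remove_h_keep_count(mol, h_idx):
    """Remove the hydrogen atom h_idx and record it on its parent heavy atom.

    Hypothetical helper (not an RDKit function): it increments the parent's
    explicit H count so that a later Chem.AddHs() can restore the hydrogen.
    """
    rwmol = Chem.RWMol(mol)
    h_atom = rwmol.GetAtomWithIdx(h_idx)
    if h_atom.GetAtomicNum() != 1:
        raise ValueError("atom %d is not a hydrogen" % h_idx)
    nbrs = h_atom.GetNeighbors()
    if len(nbrs) == 1 and nbrs[0].GetAtomicNum() > 1:
        nbrs[0].SetNumExplicitHs(nbrs[0].GetNumExplicitHs() + 1)
    rwmol.RemoveAtom(h_idx)
    # Refresh cached valence information after editing (precaution)
    rwmol.UpdatePropertyCache(strict=False)
    return rwmol.GetMol()


m = Chem.AddHs(Chem.MolFromSmiles('c1ccccc1O'))
m_minus_h = remove_h_keep_count(m, 7)
m_restored = Chem.AddHs(m_minus_h)
print(m.GetNumAtoms(), m_minus_h.GetNumAtoms(), m_restored.GetNumAtoms())
```

If the explicit H count is tracked as described, the last atom count equals 
the first one, i.e. AddHs() puts the removed hydrogen back.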

HTH, cheers,
p.

On 03/22/19 10:20, Pavel wrote:


Hello,

  I encountered an issue that I can neither understand nor solve. 
This might be a bug or a feature. After removing some specific 
hydrogens I could not add them back. Is this expected behavior, or should 
I create an issue on GitHub?


print(' load mol ')
m = Chem.MolFromSmiles('c1ccccc1O')
print(Chem.MolToSmiles(m, allHsExplicit=True))
print(' add Hs ')
m = Chem.AddHs(m)
print(Chem.MolToSmiles(m, allHsExplicit=True))
print(' remove H ')
em = Chem.EditableMol(m)
em.RemoveAtom(7)
print(Chem.MolToSmiles(em.GetMol(), allHsExplicit=True))
print(' add Hs ')
mm = Chem.AddHs(em.GetMol())
print(Chem.MolToSmiles(mm, allHsExplicit=True))
print(' update property cache ')
mm.UpdatePropertyCache()
print(Chem.MolToSmiles(mm, allHsExplicit=True))

output:

 load mol 
[OH][c]1[cH][cH][cH][cH][cH]1
 add Hs 
[H][O][c]1[c]([H])[c]([H])[c]([H])[c]([H])[c]1[H]
 remove H 
[H][O][c]1[c][c]([H])[c]([H])[c]([H])[c]1[H]
 add Hs 
[H][O][c]1[c][c]([H])[c]([H])[c]([H])[c]1[H]
 update property cache 
[H][O][c]1[c][c]([H])[c]([H])[c]([H])[c]1[H]

Pavel.





Re: [Rdkit-discuss] AM1-BCC charges for small molecules

2019-03-11 Thread Paolo Tosco


Dear Jim,

you can do that quite conveniently using antechamber from the AmberTools 
package.


antechamber takes as input formats PDB, MOL2 and MOL, so it should be 
pretty straightforward for you to interface it with the RDKit.


Please feel free to get back to me off-list if you get into trouble.
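
One possible way to wire the two together is sketched below. It assumes 
antechamber is on the PATH and that the flags used (-i/-fi for input file and 
format, -o/-fo for output, -c bcc for AM1-BCC, -nc for the net charge) 
behave as in the AmberTools documentation; please check them against your 
installed version. The function names antechamber_command and am1bcc_mol2 
are made up for this example.

```python
import subprocess
from rdkit import Chem
from rdkit.Chem import AllChem


def antechamber_command(in_mol, out_mol2, net_charge):
    """Build the antechamber command line for AM1-BCC charges (assumed flags)."""
    return ["antechamber",
            "-i", in_mol, "-fi", "mdl",      # MDL MOL file as input
            "-o", out_mol2, "-fo", "mol2",   # MOL2 output carrying the charges
            "-c", "bcc",                     # AM1-BCC charge method
            "-nc", str(net_charge)]          # net molecular charge


def am1bcc_mol2(smiles, basename):
    """Write a 3D MOL file for smiles with RDKit and run antechamber on it."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    if AllChem.EmbedMolecule(mol, randomSeed=42) < 0:
        raise RuntimeError("3D embedding failed for " + smiles)
    Chem.MolToMolFile(mol, basename + ".mol")
    cmd = antechamber_command(basename + ".mol", basename + ".mol2",
                              Chem.GetFormalCharge(mol))
    subprocess.run(cmd, check=True)
    return basename + ".mol2"
```

For batch mode, am1bcc_mol2 could simply be called in a loop over SMILES 
strings, e.g. am1bcc_mol2("CCO", "ethanol"); the resulting MOL2 file holds 
the charges in its atom records.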

Cheers,
p.

On 03/11/19 18:55, James T. Metz via Rdkit-discuss wrote:

RDkit Discussion Group,

    I am interested in generating and assigning AM1-BCC charges to small 
molecules, preferably in batch mode.  I understand this topic has been 
discussed previously, but has there been RDkit code written to do this?  
Since this relies on the results of AM1 calculations, has anyone perhaps 
written RDkit code to calculate and assign the charges if I have already 
generated a MOPAC output file by some other means?

    I greatly appreciate all the capabilities of RDkit, and not to be 
off-topic, but if someone is aware of a non-RDkit way to generate AM1-BCC 
charges, that might work for me.  Hence, please let me know. Thank you.

    Regards,
    Jim Metz







Re: [Rdkit-discuss] Why this doesn't work? HasStructMatch function

2019-02-20 Thread Paolo Tosco


Hi Xiaobo,


as this molecule does not contain the middle aromatic ring, the bonds 
connecting the two m-tolyl substituents to the carbonyl group are 
actually single, so the SMARTS pattern specifying explicit single bonds 
matches.
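
The difference between the two molecules can be checked directly. The sketch 
below is illustrative only: the variable and function names are made up, and 
the benzophenone-like queries are written out in full with phenyl rings as 
c1ccccc1. It runs the explicit-single, default and any-bond queries against 
both molecules and reports whether the bonds flanking the carbonyl carbon 
are flagged aromatic.

```python
from rdkit import Chem

acridone = Chem.MolFromSmiles('CN(C(C=CC=C1)=C1C2=O)C3=C2C=CC=C3')
ketone = Chem.MolFromSmiles('C1=CC=C(C=C1C(C2=CC(=CC=C2)C)=O)C')

queries = {
    'single': 'c1ccccc1-[#6](-c2ccccc2)=[#8]',   # explicit single bonds
    'default': 'c1ccccc1[#6](c2ccccc2)=[#8]',    # single or aromatic
    'any': 'c1ccccc1~[#6](~c2ccccc2)=[#8]',      # any bond
}


def match_table(mol):
    """Map each query name to whether it matches mol."""
    return {name: mol.HasSubstructMatch(Chem.MolFromSmarts(smarts))
            for name, smarts in queries.items()}


def carbonyl_flank_aromatic(mol):
    """For the first C=O found, list the aromatic flag of the carbon's other bonds."""
    c_idx, o_idx = mol.GetSubstructMatch(Chem.MolFromSmarts('[#6]=[#8]'))
    carbon = mol.GetAtomWithIdx(c_idx)
    return [bond.GetIsAromatic() for bond in carbon.GetBonds()
            if bond.GetOtherAtomIdx(c_idx) != o_idx]


for label, mol in (('acridone', acridone), ('ketone', ketone)):
    print(label, match_table(mol), carbonyl_flank_aromatic(mol))
```

Leaving the bond unspecified in SMARTS means "single or aromatic", which is 
why that form is the more forgiving choice when the same query has to cover 
both open-chain and ring-fused ketones.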



Cheers,

p.


On 02/19/19 23:42, Li, Xiaobo [xiaoboli] wrote:


Hi Paolo,


Thanks.


But why does this return True?


m=Chem.MolFromSmiles('C1=CC=C(C=C1C(C2=CC(=CC=C2)C)=O)C')
s=Chem.MolFromSmarts('c1ccccc1-[#6](-c2ccccc2)=[#8]')
m.HasSubstructMatch(s)


Output: True


m




Best regards,


Xiaobo Li




From: Paolo Tosco
Sent: 19 February 2019 23:15
To: Li, Xiaobo [xiaoboli]
Subject: Re: [Rdkit-discuss] Why this doesn't work? HasStructMatch function

Hi Xiaobo,


it will work if you use as SMARTS query one of the following 
expressions, not if you specify an explicit single bond, as those two 
bonds are aromatic in molecule m:



s=Chem.MolFromSmarts('c1ccccc1[#6](c2ccccc2)=[#8]')
s=Chem.MolFromSmarts('c1ccccc1~[#6](~c2ccccc2)=[#8]')
s=Chem.MolFromSmarts('c1ccccc1:[#6](:c2ccccc2)=[#8]')

Cheers,
p.

On 02/19/19 22:41, Li, Xiaobo [xiaoboli] wrote:

Dear all,
Why is the output False?
m=Chem.MolFromSmiles('CN(C(C=CC=C1)=C1C2=O)C3=C2C=CC=C3')
s=Chem.MolFromSmarts('c1ccccc1-[#6](-c2ccccc2)=[#8]')
m.HasSubstructMatch(s)
Output: False
m

s

Best regards,
Xiaobo Li




