Re: [Rdkit-discuss] implicit conversion of smart pointer with version 2018_03_4

2018-12-13 Thread Yingfeng Wang
Greg,

I figured it out. Adding const should work.

const Atom *atom = (*pCurrentROMol)[*atBegin];

Thanks.
Yingfeng

On Wed, Dec 12, 2018 at 10:21 AM Yingfeng Wang  wrote:

> Greg,
>
> I figured out the problem. Removing "->get()" solves it.
>
> Here is the updated version.
>
> boost::tie(firstB,lastB) = pCurrentROMol->getEdges();
> while(firstB!=lastB)
> {
>     auto bond = (*pCurrentROMol)[*firstB];
>     ...
>     sEdgeLabel += ((queryIsBondInRing(bond)) ? "ring|" : "linear|" );
>     ...
>     ++firstB;
> }
>
>
> The original version was
>
>
> boost::tie(firstB,lastB) = pCurrentROMol->getEdges();
> while(firstB!=lastB)
> {
>     boost::shared_ptr bond = (*pCurrentROMol)[*firstB];
>     ...
>     sEdgeLabel += ((queryIsBondInRing(bond.get())) ? "ring|" : "linear|" );
>     ...
>     ++firstB;
> }
>
>
> However, I still have a question. Why do "Atom *atom" and "Bond *bond" not
> work? Yes, "auto" solves the problem, but I feel it just covers up the
> underlying issue.
>
>
> Thanks.
>
> Yingfeng
>
>
>
>
> On Wed, Dec 12, 2018 at 9:50 AM Yingfeng Wang  wrote:
>
>> Greg,
>>
>> Thanks.  The first way does not work. I got the following error message.
>>
>> Database.cpp:148:15: error: cannot initialize a variable of type
>>   'RDKit::Atom *' with an rvalue of type 'const RDKit::Atom *'
>> Atom *atom = (*pCurrentROMol)[*atBegin];
>>
>>
>> The second way, which uses "auto", works. I also updated the code for
>> bond and got a new error with the following error message.
>>
>> Database.cpp:164:49: error: no member named 'get' in 'RDKit::Bond'
>> sEdgeLabel += ((queryIsBondInRing(bond->get())) ? "ring|" : "linear|" );
>>     ^
>> 1 error generated.
>>
>> Could you please confirm that the updated class Bond removes method
>> get()? If this is the case, which method should I use?
>>
>> Again, thank you very much for your help!
>>
>> Yingfeng
>>
>>
>>
>> On Wed, Dec 12, 2018 at 12:58 AM Greg Landrum 
>> wrote:
>>
>>> Hi Yingfeng,
>>>
>>> As part of the move over to Modern C++ we also changed the way atoms and
>>> bonds are stored in molecules: you now get raw pointers back instead of
>>> smart pointers.
>>> If you change your code from:
>>>  boost::shared_ptr atom = (*pCurrentROMol)[*atBegin];
>>> to:
>>>  Atom *atom = (*pCurrentROMol)[*atBegin];
>>> or, even simpler:
>>>  auto atom = (*pCurrentROMol)[*atBegin];
>>>
>>> things should work.
>>> -greg
>>>
>>>
>>> On Wed, Dec 12, 2018 at 12:36 AM Yingfeng Wang 
>>> wrote:
>>>
>>>> I am using the C++ library of RDKit on Mac. My C++ code works with
>>>> RDKit_2017_09_3. However, after I switch to RDKit 2018_03_4, I got the
>>>> following error when compiling my C++ source code.
>>>>
>>>> Database.cpp:148:33: error: no viable conversion from
>>>>   'const RDKit::Atom *' to 'boost::shared_ptr'
>>>> boost::shared_ptr atom = (*pCurrentROMol)[*atBegin];
>>>> ^  ~~
>>>> /usr/local/Cellar/boost/1.68.0/include/boost/smart_ptr/shared_ptr.hpp:358:21: note:
>>>>   candidate constructor not viable: no known conversion from
>>>>   'const RDKit::Atom *' to 'boost::detail::sp_nullptr_t' (aka 'nullptr_t')
>>>>   for 1st argument
>>>> BOOST_CONSTEXPR shared_ptr( boost::detail::sp_nullptr_t ) BOOST_SP_N...
>>>> ^
>>>> /usr/local/Cellar/boost/1.68.0/include/boost/smart_ptr/shared_ptr.hpp:422:5: note:
>>>>   candidate constructor not viable: no known conversion from
>>>>   'const RDKit::Atom *' to 'const boost::shared_ptr &' for 1st argument
>>>> shared_ptr( shared_ptr const & r ) BOOST_SP_NOEXCEPT : px( r.px ), p...
>>>> ^
>>>>
>>>> I am using Clang on Mac. The version information is given as follows.
>>>>
>>>> g++ -v
>>>> Configured with: --prefix=/Library/Developer/CommandLineTools/usr
>>>> --with-gxx-include-dir=/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/c++/4.2.1
>>>> Apple LLVM version 10.0.0 (clang-1000.10.44.4)
>>>> Target: x86_64-apple-darwin18.2.0
>>>> Thread model: posix
>>>>
>>>> I notice that "Starting with the 2018_03 release, the RDKit core C++
>>>> code is written in modern C++; for this release that means C++11. "
>>>>
>>>> Actually, I also use -std=c++11 when compiling my C++ source code. I
>>>> also tested RDKit 2018_09_1 and got the similar error. I am wondering how
>>>> to fix this problem.
>>>>
>>>> Thanks.
>>>>
>>>> Yingfeng
>>>>
>>>> ___
>>>> Rdkit-discuss mailing list
>>>> Rdkit-discuss@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

>>>

Re: [Rdkit-discuss] RGroup matching in RGroup decomposition code

2018-12-13 Thread Stiefl, Nikolaus
Hi Paolo,
That's cool, thanks. It may also help me solve my problem of R-group label
numbering not taking into account the actual R-group numbering (i.e. if a
molecule has R8 and R5 as its sole R-group definitions, they get R1, R2
labels).
I was also in contact with Brian Kelley, and he suggested fixing it in the
underlying codebase, so I hope this will be fixed in the next version :)
Cheers
Nik


From: Paolo Tosco 
Sent: Thursday, December 13, 2018 11:09 AM
To: Stiefl, Nikolaus ; RDKit Discuss 

Subject: Re: [Rdkit-discuss] RGroup matching in RGroup decomposition code


Hi Nik,

There is a way to achieve what you describe, even though it is slightly 
cumbersome:

from rdkit import Chem
from rdkit.Chem import rdmolops
from rdkit.Chem.Draw import MolsToGridImage, IPythonConsole
from rdkit.Chem.rdRGroupDecomposition import (
    RGroupDecomposition, RGroupDecompositionParameters)

smis = ['Cc1ccnc(O)c1', 'Cc1cc(Cl)ccn1', 'Nc1n1',
        'Nc1ccc(Br)cn1', 'c1ccncc1']
mols = [Chem.MolFromSmiles(smi) for smi in smis]

MolsToGridImage(mols)

params = RGroupDecompositionParameters()
# rather than using the built-in flag we will manually
# adjust the query in two steps using AdjustQueryProperties()
params.onlyMatchAtRGroups = False
# just atom number the rgroups
core1 = Chem.MolFromSmiles('n1ccc([*:2])cc([*:1])1')
# make dummies queries
core1_params = rdmolops.AdjustQueryParameters()
core1_params.makeDummiesQueries = True
core1_params.adjustDegree = False
core1 = rdmolops.AdjustQueryProperties(core1, core1_params)
# change the atoms connected to the dummies into dummies
former_atomic_nums = {}
for b in core1.GetBonds():
    if (b.GetBeginAtom().GetAtomicNum() == 0):
        a = b.GetEndAtom()
    elif (b.GetEndAtom().GetAtomicNum() == 0):
        a = b.GetBeginAtom()
    else:
        continue
    former_atomic_nums[a.GetIdx()] = a.GetAtomicNum()
    a.SetAtomicNum(0)
# this has the same effect as setting onlyMatchAtRGroups to True
# but we can avoid applying it to the atoms connected to the R groups
core1_params.adjustHeavyDegreeFlags = Chem.ADJUST_IGNOREDUMMIES
core1_params.makeDummiesQueries = False
core1_params.adjustDegree = False
core1_params.adjustHeavyDegree = True
core1 = rdmolops.AdjustQueryProperties(core1, core1_params)
# restore the original atomic numbers
for i, an in former_atomic_nums.items():
    core1.GetAtomWithIdx(i).SetAtomicNum(an)
rg1 = RGroupDecomposition(core1, params)
failMols = []
for m in mols:
    res = rg1.Add(m)
    if res < 0:
        failMols.append(m)
rg1.Process()

True

print("FailedMols: %s"%" ".join([Chem.MolToSmiles(m) for m in failMols]))

FailedMols: Nc1ccc(Br)cn1

core1

d = rg1.GetRGroupsAsColumns(asSmiles=False)

MolsToGridImage(d['Core'])

MolsToGridImage(d['R1'])

MolsToGridImage(d['R2'])

Hope that helps, cheers
p.

On 12/11/18 11:01, Stiefl, Nikolaus wrote:
Hi all,

I was playing around with the RGroup decomposition code and must say that I am
pretty impressed by it. The fact that one can directly work with an MDL R-group
file and that the output is a pandas DataFrame makes analysis really slick -
well done!

However, one thing that irritates me is the fact that seemingly when I have 
R-groups defined in my core and enforce matching only at R-groups then 
molecules having hydrogen atoms in that position are ignored in the "Add" step. 
I would expect those to be included as long as the molecules don't have 
additional heavy atoms in positions that are not defined as R-groups in the 
core.

__ snip 

from rdkit import Chem
from rdkit.Chem.rdRGroupDecomposition import (
    RGroupDecomposition, RGroupDecompositionParameters)


smis = ['Cc1ccnc(O)c1', 'Cc1cc(Cl)ccn1', 'Nc1n1', 'Nc1ccc(Br)cn1',
        'c1ccncc1']
mols = [Chem.MolFromSmiles(smi) for smi in smis]
params = RGroupDecompositionParameters()

params.onlyMatchAtRGroups = True

# just atom number the rgroups
core1 = Chem.MolFromSmiles('n1ccc([*:2])cc([*:1])1')
rg1 = RGroupDecomposition(core1, params)

failMols = []
for m in mols:
    res = rg1.Add(m)
    if res < 0:
        failMols.append(m)

rg1.Process()

print("FailedMols: %s"%" ".join([Chem.MolToSmiles(m) for m in failMols]))

 end snip 


the output shows that molecules 3-5 are not included at the "Add" step

>> FailedMols: Nc1n1 Nc1ccc(Br)cn1 c1ccncc1

For molecule 4 (the 5-bromo substituted aminopyridine) I agree; however, I
don't understand how I can make sure mols 3 and 5 are also included ... is
there a magic parameter that I can set?

Cheers
Nik









Re: [Rdkit-discuss] assigning bond orders from reference: problems with COOH group

2018-12-13 Thread Paolo Tosco

Hi Mariana,

if you modify your Jupyter notebook as in

https://gist.github.com/ptosco/d807c64df9f277e92284bc5a7cecbc1d

then it should work also in the presence of graph isomorphisms.
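
As a rough sketch of what handling graph isomorphisms can involve (this is not the code from the gist, whose contents are not reproduced here): a carboxylic acid matches its template in two equivalent ways, so one option is to enumerate all matches, for example with GetSubstructMatches(..., uniquify=False), and keep the one that maps the template's hydroxyl oxygen onto the pose oxygen that carries the hydrogen. The helper pick_match and the carbon index 44 below are hypothetical; only the selection logic is shown, in plain Python:

```python
def pick_match(matches, template_hydroxyl_idxs, pose_protonated_idxs):
    """Pick the match that puts every template hydroxyl O on a protonated pose atom.

    matches: tuples where match[i] is the pose atom matched to template atom i
    template_hydroxyl_idxs: template indices of the -OH oxygens
    pose_protonated_idxs: pose indices of the oxygens that carry a hydrogen
    """
    for match in matches:
        if all(match[t] in pose_protonated_idxs for t in template_hydroxyl_idxs):
            return match
    return None  # no match agrees with the pose; the caller decides what to do


# Hypothetical template fragment: atom 0 = C, atom 1 = carbonyl O, atom 2 = hydroxyl O.
# The two tuples are the symmetric ways of mapping it onto pose atoms 44, 45 and 46.
matches = [(44, 46, 45), (44, 45, 46)]
chosen = pick_match(matches, template_hydroxyl_idxs=[2], pose_protonated_idxs={46})
print(chosen)  # (44, 45, 46): O45 gets the double bond, O46 keeps its hydrogen
```

The chosen tuple would then drive the bond-order transfer in place of the first match that the substructure search happens to return.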

Cheers,
p.


On 12/13/18 12:18, Mariana Assmann wrote:

Hi all,

I have docking poses (from AutoDock Vina) in a format that does not
provide bond information, and there is no information on hydrogen
atoms unless they are important for hydrogen bonds. I was assigning
the bond orders from the original reference structure or its SMILES
string using the methods in
https://gist.github.com/ptosco/4844d3635cf14d11e5e14381993915c1 using
AllChem.AssignBondOrdersFromTemplate() and adding the original
hydrogens back on. This approach does not work well when -COOH
groups are present. In AssignBondOrdersFromTemplate, the substructure
match almost always assigns the oxygens the opposite way, so that my
resulting structure is -C(OH)=OH, which can't be sanitized in the end.
I would like to keep the configuration of the docking output.


Is there a way to correct that in an automated way? I have quite a few 
poses with that problem and can't do that manually.


I have uploaded a gist with an example problem here: 
https://gist.github.com/c4asma/b338321238d1924cc2964612ed278e9e
Here, oxygen with atom number 46 has the hydrogen, and number 45 
should be assigned a double bond (input 7). But, after 
AllChem.AssignBondOrdersFromTemplate() oxygen 46 is assigned the 
double bond (input 15)


Thank you,
Mariana








[Rdkit-discuss] Question about (generic) Murcko frameworks

2018-12-13 Thread Gonzalo Colmenarejo
Hi there,

is it normal to get, for the same set of SMILES, more generic Murcko
frameworks than normal frameworks? When running this:

# Assumed imports; the original message does not show them
from rdkit import Chem
from rdkit.Chem.Scaffolds import MurckoScaffold as ms

### Generate framework for a SMILES, handling for errors
def framecheck(s):
    try:
        return Chem.MolToSmiles(ms.GetScaffoldForMol(Chem.MolFromSmiles(s)))
    except Exception:
        pass


### Generate generic framework for a SMILES, handling for errors
def gframecheck(s):
    try:
        return Chem.MolToSmiles(ms.MakeScaffoldGeneric(Chem.MolFromSmiles(s)))
    except Exception:
        pass

# Count unique frameworks
fraq = [framecheck(s) for s in smis]
fraq = list(set(fraq))
len(fraq)


# Count unique generic frameworks
gfraq = [gframecheck(s) for s in smis]
gfraq = list(set(gfraq))
len(gfraq)

For the attached set of SMILES I get 1431 frameworks and 2207 generic
frameworks. The generic frameworks are supposed to have all atom types set
to C and all bonds set to single, so I would expect fewer generic frameworks.
Is some sort of canonicalization necessary?
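
A possible explanation, offered as an assumption rather than a confirmed diagnosis: gframecheck above passes the whole molecule to MakeScaffoldGeneric, which turns atoms into carbons and bonds into single bonds but does not strip side chains, so molecules that share a scaffold can still give different generic SMILES. A minimal sketch that extracts the scaffold first and only then makes it generic (the helper names count_unique and generic_framework are illustrative, not part of RDKit):

```python
def count_unique(items, to_key):
    """Count the distinct non-None keys that to_key produces for items."""
    keys = {to_key(item) for item in items}
    keys.discard(None)  # failed conversions should not count as a framework
    return len(keys)


def generic_framework(smiles):
    """Generic Murcko framework of a SMILES, or None if it cannot be parsed."""
    # RDKit is imported here so that count_unique stays usable on its own
    from rdkit import Chem
    from rdkit.Chem.Scaffolds import MurckoScaffold

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    # strip the side chains first ...
    scaffold = MurckoScaffold.GetScaffoldForMol(mol)
    # ... then set all atoms to C and all bonds to single
    generic = MurckoScaffold.MakeScaffoldGeneric(scaffold)
    return Chem.MolToSmiles(generic)
```

With the list from the message, count_unique(smis, generic_framework) would give the generic count. Note also that in the snippet above a None returned for an unparsable SMILES is counted as one framework by set().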

Thanks in advance

Gonzalo


examp.smi
Description: Binary data


[Rdkit-discuss] assigning bond orders from reference: problems with COOH group

2018-12-13 Thread Mariana Assmann

Hi all,

I have docking poses (from AutoDock Vina) in a format that does not
provide bond information, and there is no information on hydrogen atoms
unless they are important for hydrogen bonds. I was assigning the bond
orders from the original reference structure or its SMILES string using
the methods in
https://gist.github.com/ptosco/4844d3635cf14d11e5e14381993915c1 using
AllChem.AssignBondOrdersFromTemplate() and adding the original hydrogens
back on. This approach does not work well when -COOH groups are
present. In AssignBondOrdersFromTemplate, the substructure match almost
always assigns the oxygens the opposite way, so that my resulting
structure is -C(OH)=OH, which can't be sanitized in the end. I would like
to keep the configuration of the docking output.


Is there a way to correct that in an automated way? I have quite a few 
poses with that problem and can't do that manually.


I have uploaded a gist with an example problem here: 
https://gist.github.com/c4asma/b338321238d1924cc2964612ed278e9e
Here, oxygen with atom number 46 has the hydrogen, and number 45 should 
be assigned a double bond (input 7). But, after 
AllChem.AssignBondOrdersFromTemplate() oxygen 46 is assigned the double 
bond (input 15)


Thank you,
Mariana




Re: [Rdkit-discuss] RGroup matching in RGroup decomposition code

2018-12-13 Thread Paolo Tosco

Hi Nik,

There is a way to achieve what you describe, even though it is slightly 
cumbersome:


from rdkit import Chem
from rdkit.Chem import rdmolops
from rdkit.Chem.Draw import MolsToGridImage, IPythonConsole
from rdkit.Chem.rdRGroupDecomposition import (
    RGroupDecomposition, RGroupDecompositionParameters)

smis = ['Cc1ccnc(O)c1', 'Cc1cc(Cl)ccn1', 'Nc1n1',
        'Nc1ccc(Br)cn1', 'c1ccncc1']
mols = [Chem.MolFromSmiles(smi) for smi in smis]

MolsToGridImage(mols)

params = RGroupDecompositionParameters()
# rather than using the built-in flag we will manually
# adjust the query in two steps using AdjustQueryProperties()
params.onlyMatchAtRGroups = False
# just atom number the rgroups
core1 = Chem.MolFromSmiles('n1ccc([*:2])cc([*:1])1')
# make dummies queries
core1_params = rdmolops.AdjustQueryParameters()
core1_params.makeDummiesQueries = True
core1_params.adjustDegree = False
core1 = rdmolops.AdjustQueryProperties(core1, core1_params)
# change the atoms connected to the dummies into dummies
former_atomic_nums = {}
for b in core1.GetBonds():
    if (b.GetBeginAtom().GetAtomicNum() == 0):
        a = b.GetEndAtom()
    elif (b.GetEndAtom().GetAtomicNum() == 0):
        a = b.GetBeginAtom()
    else:
        continue
    former_atomic_nums[a.GetIdx()] = a.GetAtomicNum()
    a.SetAtomicNum(0)
# this has the same effect as setting onlyMatchAtRGroups to True
# but we can avoid applying it to the atoms connected to the R groups
core1_params.adjustHeavyDegreeFlags = Chem.ADJUST_IGNOREDUMMIES
core1_params.makeDummiesQueries = False
core1_params.adjustDegree = False
core1_params.adjustHeavyDegree = True
core1 = rdmolops.AdjustQueryProperties(core1, core1_params)
# restore the original atomic numbers
for i, an in former_atomic_nums.items():
    core1.GetAtomWithIdx(i).SetAtomicNum(an)
rg1 = RGroupDecomposition(core1, params)
failMols = []
for m in mols:
    res = rg1.Add(m)
    if res < 0:
        failMols.append(m)
rg1.Process()

True

print("FailedMols: %s"%" ".join([Chem.MolToSmiles(m) for m in failMols]))

FailedMols: Nc1ccc(Br)cn1

core1

d = rg1.GetRGroupsAsColumns(asSmiles=False)

MolsToGridImage(d['Core'])

MolsToGridImage(d['R1'])

MolsToGridImage(d['R2'])


Hope that helps, cheers
p.

On 12/11/18 11:01, Stiefl, Nikolaus wrote:


Hi all,

I was playing around with the RGroup decomposition code and must say
that I am pretty impressed by it. The fact that one can directly work
with an MDL R-group file and that the output is a pandas DataFrame makes
analysis really slick – well done!


However, one thing that irritates me is the fact that seemingly when I 
have R-groups defined in my core and enforce matching only at R-groups 
then molecules having hydrogen atoms in that position are ignored in 
the “Add” step. I would expect those to be included as long as the 
molecules don’t have additional heavy atoms in positions that are not 
defined as R-groups in the core.


__ snip 

from rdkit import Chem
from rdkit.Chem.rdRGroupDecomposition import (
    RGroupDecomposition, RGroupDecompositionParameters)

smis = ['Cc1ccnc(O)c1', 'Cc1cc(Cl)ccn1', 'Nc1n1', 'Nc1ccc(Br)cn1',
        'c1ccncc1']
mols = [Chem.MolFromSmiles(smi) for smi in smis]
params = RGroupDecompositionParameters()

params.onlyMatchAtRGroups = True

# just atom number the rgroups
core1 = Chem.MolFromSmiles('n1ccc([*:2])cc([*:1])1')
rg1 = RGroupDecomposition(core1, params)

failMols = []
for m in mols:
    res = rg1.Add(m)
    if res < 0:
        failMols.append(m)

rg1.Process()

print("FailedMols: %s"%" ".join([Chem.MolToSmiles(m) for m in failMols]))

 end snip 

the output shows that molecules 3-5 are not included at the “Add” step

>> FailedMols: Nc1n1 Nc1ccc(Br)cn1 c1ccncc1

For molecule 4 (the 5-bromo substituted aminopyridine) I agree;
however, I don’t understand how I can make sure mols 3 and 5 are also
included … is there a magic parameter that I can set?


Cheers

Nik






