Hello

Here are a few suggestion you can try that may speed up your code.

Instead of GetSubstructMatches, you can use the list of neighbours of each
atom. Here is something that may work, it creates an iterator that will
return all the atoms for bond angles. I did not test it, however it may
give you an idea.

def iterAngles(mol): """ Return an iterator over all angles in molecule """
natoms = mol.GetNumAtoms() # Get the coordinates of all atoms conf =
mol.GetConformer() coords = [conf.GetAtomPosition(i) for i in
range(natoms)] for atom in mol.GetAtoms(): center = coords[atom.GetIdx()]
neighbours = [a.GetIdx() for a in atom.GetNeighbors()] for a1, a2 in
combinations(neighbours, 2): yield coords[a1], center, coords[a2]
The other one is reducing the calls to numpy functions. It is not necessary
to calculate the angle, instead us the cos(angle) value in the comparison.
I also avoid calling the np.norm function. That will call sqrt for each
vector, so we can move it out.

def cangle3(v1, v2): v12 = np.dot(v1, v2) v11v22 = np.dot(v1, v1) *
np.dot(v2, v2) c = v12 / np.sqrt(v11v22) # -> cosine of the angle return
np.clip(c, -1, 1) def angleContribution3(angles): """ Return the angle
contributions """ a60 = a90 = a102 = aUnmatched = 0 cos102 = np.cos(102.0 /
180 * np.pi) for a1, c, a2 in angles: v1 = np.array([a1.x - c.x, a1.y -
c.y, a1.z - c.z]) v2 = np.array([a2.x - c.x, a2.y - c.y, a2.z - c.z])
cosAngle = cangle3(v1, v2) if 0.5 <= cosAngle: a60 += 1 elif 0 <= cosAngle
< 0.5: a90 += 1 elif cos102 <= cosAngle < 0: a102 += 1 else: aUnmatched +=
1 return a60, a90, a102, aUnmatched

I tried this function with a random list of 100,000 vectors and get these
timings:

Your code: 1.6375901699066162
Remove arccos, abs, and degrees: 1.440047025680542
Remove call of lingalg.norm: 0.9320440292358398

The following code gets it down to 0.7 seconds by removing the call to
sqrt. However I think we loose clarity of the code here and documentation
will be crucial.

def cangle4(v1, v2): v12 = np.dot(v1, v2) v11v22 = np.dot(v1, v1) *
np.dot(v2, v2) return np.sign(v12) * (v12 * v12) / v11v22 def
angleContribution4(angles): """ Return the angle contributions """ a60 =
a90 = a102 = aUnmatched = 0 cos102 = np.cos(102.0 / 180 * np.pi) cos102 =
-cos102 * cos102 for a1, c, a2 in angles: v1 = np.array([a1.x - c.x, a1.y -
c.y, a1.z - c.z]) v2 = np.array([a2.x - c.x, a2.y - c.y, a2.z - c.z])
cosAngle = cangle4(v1, v2) if 0.5 * 0.5 <= cosAngle: a60 += 1 elif 0 <=
cosAngle < 0.5 * 0.5: a90 += 1 elif cos102 <= cosAngle < 0: a102 += 1 else:
aUnmatched += 1 return a60, a90, a102, aUnmatched

I'm curious if these changes will speed up your calculation.

Best,

Peter

On Wed, Sep 14, 2016 at 3:18 PM Guillaume GODIN <
guillaume.go...@firmenich.com> wrote:

> Dear Greg,
>
>
> I found my mistake I need the bond angle not the torsion.
>
>
> I add this function for that...
>
>
> from numpy import (array, dot, arccos, clip,zeros,degrees)
> from numpy.linalg import norm
>
>
> def cangle(v1,v2):
>     c=dot(v1,v2)/norm(v1)/norm(v2) # -> cosine of the angle
>     return arccos(clip(c, -1, 1))  # if you really want the angle
>
> # THIS CODE IS WORKING BUT IT'S SLOW!
> def AnglesBond(mol):
>     angles = mol.GetSubstructMatches(Chem.MolFromSmarts('*~*~*'))
>     conf = mol.GetConformer()
>     A60=0
>     A90=0
>     A102=0
>     for ang in angles:
>         a1 = conf.GetAtomPosition(ang[0])
>         c = conf.GetAtomPosition(ang[1])
>         a2 = conf.GetAtomPosition(ang[2])
>         v1=array([a1.x-c.x,a1.y-c.y,a1.z-c.z])
>         v2=array([a2.x-c.x,a2.y-c.y,a2.z-c.z])
>         Angle = abs(degrees(cangle(v1,v2)))
>         if Angle<=60:
>             A60+=1
>         elif Angle>60 and Angle<=90:
>             A90+=1
>         elif Angle>90 and Angle<=102:
>             A102+=1
>     return (A60,A90,A102)
>
>
> the stats are not perfect but in progress.
>
>
> ​
>
>
>
> BR,
>
>
> Guillaume
> ------------------------------
> *De :* Greg Landrum <greg.land...@gmail.com>
> *Envoyé :* mercredi 14 septembre 2016 14:14
>
> *À :* Guillaume GODIN
> *Cc :* RDKit Discuss
> *Objet :* Re: [Rdkit-discuss] Angstroms Hydrogen bonding
>
>
> On Wed, Sep 14, 2016 at 4:16 AM, Guillaume GODIN <
> guillaume.go...@firmenich.com> wrote:
>
>> Your solution is perfect!
>>
>>
>> glad it worked
>
>
>> I am currently implementing this article "
>> http://www.mdpi.com/1420-3049/20/10/18279"; using RDKit.
>>
>>
> Interesting, I hadn't seen that one.
>
>
>
>> It's now almost done, I need to check the results on Heat of Formation
>> now and understand why my code to get Angle in torsions is not working.
>>
>> def Angles(mol):
>>     tors =
>> mol.GetSubstructMatches(Chem.MolFromSmarts('[C]-[C;O;S]-[C;O;S]-[C]'))
>>     conf = mol.GetConformer()
>>     A60=0
>>     A90=0
>>     A102=0
>>     for tor in tors:
>>         Angle = abs(AllChem.GetDihedralDeg(conf,tor[0], tor[1], tor[2],
>> tor[3]))
>>         if Angle<=60:
>>             A60+=1
>>         elif Angle>60 and Angle<=90:
>>             A90+=1
>>         elif Angle>90 and Angle<=102:
>>             A102+=1
>>     return (A60,A90,A102)
>>
>> Comparing to the publication on Heat of Formation page 18294, I have to
>> much more occurences (ie. a factor of 50-100 more then expected).
>>
>
> I suspect that the problem is that the SMARTS you
> specify: '[C]-[C;O;S]-[C;O;S]-[C]' won't actually match anything. [C;O;S]
> means 'aliphatic C AND aliphatic O AND aliphatic S'. It think you like mean
> '[C,O,S]', which is 'aliphatic C OR aliphatic O OR aliphatic S'.
> You might also want to think about whether or not you should replace C, O,
> S (all of which only match aliphatic elements) with #6, #8, #16 (which can
> match aliphatic or aromatic) throughout the SMARTS
>
>
>> When it will be done I will share the code with you guys.
>>
>>
> Cool, I look forward to trying that out.
>
> Best,
> -greg
>
>
>
>> BR,
>>
>> Guillaume
>> ------------------------------
>> *De :* Greg Landrum <greg.land...@gmail.com>
>> *Envoyé :* mercredi 14 septembre 2016 12:16
>> *À :* Guillaume GODIN
>> *Cc :* RDKit Discuss
>> *Objet :* Re: [Rdkit-discuss] Angstroms Hydrogen bonding
>>
>> Hi Guillaume,
>>
>> On Tue, Sep 13, 2016 at 10:12 PM, Guillaume GODIN <
>> guillaume.go...@firmenich.com> wrote:
>>
>>> 1 Does 3D coordinates of a conformer is in Angstroms ?
>>>
>>> If you read the conformer from a file, for example a mol file, then the
>> 3D coordinates are in whatever units they were in in that file. This is
>> usually Angstrsom.
>> If you generate the conformation using one of the RDKit embedding
>> functions, then they are certainly in Angstroms.
>>
>>> 2 How to enumerate all HBonding to determine the bond length ?
>>>
>>
>> Interesting question. Here's a python function that might be a starting
>> point:
>>
>> def
>> findHBonds(m,confId=-1,possiblePartners='[#8,#7]',possibleHs='[#1][#8,#7]',distThresh=2.5):
>>     conf = m.GetConformer(confId)
>>     partners =[x[0] for x in
>>  m.GetSubstructMatches(Chem.MolFromSmarts(possiblePartners))]
>>     hs=  [x[0] for x in
>> m.GetSubstructMatches(Chem.MolFromSmarts(possibleHs))]
>>     res = []
>>     for h in hs:
>>         ph = conf.GetAtomPosition(h)
>>         for partner in partners:
>>             if m.GetBondBetweenAtoms(h,partner) is not None:
>>                 continue
>>             d = conf.GetAtomPosition(partner).Distance(ph)
>>             if d<=distThresh:
>>                 res.append((h,partner,d))
>>     return res
>>
>>
>> In order to allow flexibility about what an H bond is, I left the
>> definitions of the acceptors (partners in the above code) and polar Hs
>> (just Hs in the above code) as SMARTS definitions so that they can be
>> customized.
>>
>>
>>
>> **********************************************************************
>> DISCLAIMER
>> This email and any files transmitted with it, including replies and
>> forwarded copies (which may contain alterations) subsequently transmitted
>> from Firmenich, are confidential and solely for the use of the intended
>> recipient. The contents do not represent the opinion of Firmenich except to
>> the extent that it relates to their official business.
>> **********************************************************************
>>
>
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
------------------------------------------------------------------------------
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Reply via email to