# Re: [Rdkit-discuss] Tversky Shape similarity

```
Hi Susan,

On Tue, Feb 20, 2018 at 1:24 PM, Susan Leung <susan.le...@st-hildas.ox.ac.uk> wrote:

>
> Thank you very much for your response and for the example. I am now trying
> to understand how the two functions work.
>
> I have a few questions about ShapeProtrudeDist and ShapeTanimotoDist.
>
>    1. In ./Code/Geometry/GridUtils.cpp, lines 28-31 in tanimotoDistance
>    and lines 46-49 in protrudeDistance seem the same to me. What is the
>    difference between them? How does one calculate dist and inter and the
>    other totProtrude and intersectVolume? And why is inter a double on
>    line 31, whereas intersectVolume on line 49 is an unsigned int?
>
In this case I think you're over-analyzing what's there: the differences
are mostly just a result of lazy coding (I think I can say that since I
think I wrote that stuff). :-)
The values returned from many of the methods on the grid are unsigned ints
(since the grids themselves use ints to store occupancies), but at some
point you need to get to a double in order to calculate a normed distance
(like Tanimoto or protrude). The differences in the code basically just
reflect different points at which that could happen.

>
>    2. I follow your example below, that the ShapeProtrudeDist is 0.0, but
>    could you point me to where in the code it does the check for which is the
>    smaller shape?
>
That test isn't actually done in the function that does the calculation on
the grids (that just calculates what you tell it), but is done in the
convenience function that generates the protrude distance between two
molecules. The specific line is here:
https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/ShapeHelpers/ShapeUtils.cpp#L187

>    3. Am I right in understanding that the ShapeProtrudeDist is the
>    amount of volume protrusion of the smaller molecule (from the larger
>    molecule) as a percentage with respect to the total volume of the smaller?
>
Yes, it's the fraction (0 to 1) of the smaller molecule's volume that
protrudes from the larger.

Apologies that this is a bit idiosyncratic. There are parts of the RDKit
(and this is one) where we needed a value for a particular purpose (in this
case: how good is the shape overlap of a small shape with a large one) and
just came up with something quickly instead of looking for existing
(perhaps equivalent) definitions.

Does that help?
-greg

> Thanks very much for the help!
>
>
> Susan
> ------------------------------
> *From:* Greg Landrum [greg.land...@gmail.com]
> *Sent:* 14 February 2018 06:25
> *To:* Susan Leung
> *Cc:* rdkit-discuss@lists.sourceforge.net
> *Subject:* Re: [Rdkit-discuss] Tversky Shape similarity
>
> Hi Susan,
>
> There isn't currently a function available to calculate Tversky distance
> using shapes (though it wouldn't be terribly difficult to add), but you can
> use the ShapeProtrudeDist to generate a measure for comparing two molecules
> of unequal size. Here's a simple demonstration:
>
> In [1]: from rdkit import Chem
>
> In [2]: from rdkit.Chem import AllChem
>
> In [3]: m = Chem.AddHs(Chem.MolFromSmiles('CC'))
>
> In [4]: AllChem.EmbedMolecule(m)
> Out[4]: 0
>
> In [5]: nm = Chem.RWMol(m)
>
> In [6]: nm.RemoveAtom(7)
>
> In [7]: nm.RemoveAtom(6)
>
> In [8]: nm.RemoveAtom(5)
>
> In [9]: from rdkit.Chem import rdShapeHelpers
>
> In [13]: rdShapeHelpers.ShapeTanimotoDist(m,nm,ignoreHs=False)
> Out[13]: 0.09966499162479062
>
> In [15]: rdShapeHelpers.ShapeProtrudeDist(m,nm,ignoreHs=False)
> Out[15]: 0.0
>
>
>
> Note that by default ShapeProtrudeDist will reorder the arguments so that
> it's always looking at the fraction of the smaller shape that protrudes from
> the larger shape.
>
> I hope this helps,
> -greg
>
>
> On Tue, Feb 13, 2018 at 2:07 PM, Susan Leung <susan.le...@st-hildas.ox.ac.uk> wrote:
>
>> Dear all,
>>
>> I would like to compute the shape overlap between two molecules. I
>> understand that there is rdShapeHelpers.ShapeTanimotoDist; however, I
>> would like to compare the two molecules in a Tversky manner.
>>
>> For example, I have mol1 and mol2 and I want a score which penalises mol1
>> for failing to cover mol2 but does not penalise mol1 for having extra
>> volume.
>>
>> Best wishes,
>>
>> Susan
>>
>> ------------------------------------------------------------
>> ------------------
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>
```
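
To make the relationship between the measures discussed in this thread concrete, here is a minimal sketch in plain Python. It is not the RDKit implementation: ordinary lists stand in for the grids' integer occupancy vectors, and every function name (`l1_norm`, `overlap_terms`, `tanimoto_dist`, `protrude_dist`, `tversky_sim`) is made up for this illustration. The formulas are the usual textbook definitions, assumed here to match what the grid functions compute. The Tversky weights `alpha` and `beta` are likewise illustrative; with `alpha=0` and `beta=1` the score penalises mol1 only for failing to cover mol2, which is the behaviour asked for in the original question.

```python
def l1_norm(v1, v2):
    """Sum of absolute per-point differences between two occupancy vectors."""
    if len(v1) != len(v2):
        raise ValueError("occupancy vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(v1, v2))


def overlap_terms(v1, v2):
    """Return (intersection, only_in_v1, only_in_v2) in grid units.

    Uses min(a, b) == (a + b - |a - b|) / 2, summed over all grid points.
    """
    dist = l1_norm(v1, v2)
    tot1, tot2 = sum(v1), sum(v2)
    inter = (tot1 + tot2 - dist) / 2
    return inter, tot1 - inter, tot2 - inter


def tanimoto_dist(v1, v2):
    """1 - intersection / union; symmetric in its arguments."""
    inter, only1, only2 = overlap_terms(v1, v2)
    union = inter + only1 + only2
    return (only1 + only2) / union if union else 0.0


def protrude_dist(v1, v2, allow_reordering=True):
    """Fraction of v1 that lies outside v2.

    With allow_reordering, the arguments are swapped so that v1 is the
    smaller shape (an assumption modelled on the behaviour described above).
    """
    if allow_reordering and sum(v1) > sum(v2):
        v1, v2 = v2, v1
    inter, only1, _ = overlap_terms(v1, v2)
    total = inter + only1
    return only1 / total if total else 1.0


def tversky_sim(v1, v2, alpha, beta):
    """Tversky similarity: inter / (inter + alpha*only_v1 + beta*only_v2)."""
    inter, only1, only2 = overlap_terms(v1, v2)
    denom = inter + alpha * only1 + beta * only2
    return inter / denom if denom else 0.0


# Toy shapes: `small` is entirely contained in `large`.
large = [3, 3, 3, 3, 0, 0]
small = [3, 3, 0, 0, 0, 0]

print(tanimoto_dist(large, small))                          # 0.5
print(protrude_dist(large, small))                          # 0.0 (swapped)
print(protrude_dist(large, small, allow_reordering=False))  # 0.5
print(tversky_sim(large, small, alpha=0.0, beta=1.0))       # 1.0
print(tversky_sim(small, large, alpha=0.0, beta=1.0))       # 0.5
```

With `alpha=0` and `beta=1` the Tversky score equals one minus the fraction of the second shape that protrudes from the first, with no reordering. In RDKit itself, the Python wrapper for `ShapeProtrudeDist` exposes an `allowReordering` keyword (default `True`), so `1 - rdShapeHelpers.ShapeProtrudeDist(mol2, mol1, allowReordering=False)` should give the same kind of asymmetric score for two aligned molecules.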