Re: [Rdkit-discuss] numpy array to bit vector

2019-11-14 Thread Thomas Evangelidis
Great, thank you! Btw, does RDKit offer any scalar vector similarity
functions apart from the bit vector similarities?

On Thu, 14 Nov 2019 at 16:48, Greg Landrum wrote:

> Yep, that's about 7x faster than what I came up with.
> Thanks Maciek!
>
> -greg

-- 

==

Dr. Thomas Evangelidis

Research Scientist

IOCB - Institute of Organic Chemistry and Biochemistry of the Czech Academy
of Sciences, Prague, Czech Republic
  &
CEITEC - Central European Institute of Technology, Brno, Czech Republic

email: teva...@gmail.com, Twitter: tevangelidis, LinkedIn: Thomas Evangelidis

website: https://sites.google.com/site/thomasevangelidishomepage/
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] numpy array to bit vector

2019-11-14 Thread Greg Landrum
Yep, that's about 7x faster than what I came up with.
Thanks Maciek!

-greg


On Thu, Nov 14, 2019 at 4:35 PM Maciek Wójcikowski wrote:

> Hi Thomas,
>
> You could also use SetBitsFromList() method:
>
>> bv.SetBitsFromList(np.where(ar)[0].tolist())
>>
>


Re: [Rdkit-discuss] numpy array to bit vector

2019-11-14 Thread Maciek Wójcikowski
Hi Thomas,

You could also use the SetBitsFromList() method:

bv.SetBitsFromList(np.where(ar)[0].tolist())


Best regards,
Maciek Wójcikowski
mac...@wojcikowski.pl
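
For reference, a minimal sketch of how the one-liner above might be wrapped into a complete conversion function. The helper names `on_bit_indices` and `np_to_bv_fast` are made up for illustration, and RDKit is imported inside the function only so that the index-extraction step can be tried without RDKit installed.

```python
import numpy as np


def on_bit_indices(ar):
    """Return the positions of the nonzero entries as plain Python ints."""
    # np.where returns a tuple with one index array per dimension; a 1-D
    # fingerprint has a single entry. tolist() converts NumPy integers to
    # Python ints before they are handed to RDKit.
    return np.where(np.asarray(ar))[0].tolist()


def np_to_bv_fast(ar):
    """Hypothetical helper: build an ExplicitBitVect from a 1-D 0/1 array."""
    from rdkit import DataStructs  # imported lazily; requires RDKit

    bv = DataStructs.ExplicitBitVect(len(ar))
    bv.SetBitsFromList(on_bit_indices(ar))
    return bv


fv1 = np.array([1, 1, 0, 0, 1, 0, 1])
print(on_bit_indices(fv1))  # [0, 1, 4, 6]
```

The resulting bit vector should then be usable with the similarity functions in rdkit.DataStructs, in the same way as the output of the loop-based version.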




Re: [Rdkit-discuss] numpy array to bit vector

2019-11-14 Thread Greg Landrum
Hi Thomas,

There may be more efficient ways to do this, but here's something that
works (and isn't the slowest thing I came up with):
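from rdkit import DataStructs

def np_to_bv(fv):
    # Build a bit vector of the same length and switch on each nonzero position
    bv = DataStructs.ExplicitBitVect(len(fv))
    for i, v in enumerate(fv):
        if v:
            bv.SetBit(i)
    return bv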

-greg





Re: [Rdkit-discuss] numpy array to bit vector

2019-11-14 Thread Thomas Evangelidis
Greetings,

I am reopening this old thread in the hope that someone can answer my
initial question this time, which was "How do I convert numpy.ndarray
objects to rdkit.DataStructs.ExplicitBitVect objects?". At the time I asked
the question, I worked around the problem by calculating Tanimoto
similarities with SciPy, but now I want to use all the similarity functions
offered by rdkit.DataStructs. I have been struggling with this for quite
some time, although I feel that the answer is simple.

So basically, I have these arrays and want to calculate their
DataStructs.McConnaugheySimilarity. How do I do it?

fv1 = numpy.array([1,1,0,0,1,0,1])


fv2 = numpy.array([0,1,1,0,1,0,0])

Thanks in advance.
Thomas
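
A sketch of the quantity being asked about, computed directly in NumPy as a cross-check. It assumes the usual definition of the McConnaughey coefficient, (c(a+b) - ab)/(ab), where a and b are the on-bit counts of the two vectors and c is the number of on bits they share. The function name `mcconnaughey` and the choice of returning 0.0 when either vector is empty are assumptions of this sketch and may not match RDKit's handling of that corner case.

```python
import numpy as np


def mcconnaughey(v1, v2):
    """McConnaughey coefficient for two 0/1 arrays of equal length."""
    v1 = np.asarray(v1, dtype=bool)
    v2 = np.asarray(v2, dtype=bool)
    a = int(v1.sum())            # on bits in v1
    b = int(v2.sum())            # on bits in v2
    c = int((v1 & v2).sum())     # on bits common to both
    if a == 0 or b == 0:
        return 0.0               # assumption: undefined case mapped to 0
    return (c * (a + b) - a * b) / (a * b)


fv1 = np.array([1, 1, 0, 0, 1, 0, 1])
fv2 = np.array([0, 1, 1, 0, 1, 0, 0])
print(mcconnaughey(fv1, fv2))  # a=4, b=3, c=2 -> (2*7 - 12) / 12 = 1/6
```

Once the arrays have been converted to ExplicitBitVect objects, the value can be compared against DataStructs.McConnaugheySimilarity(bv1, bv2).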




Re: [Rdkit-discuss] numpy array to bit vector

2017-03-17 Thread Thomas Evangelidis
Guys, my question was how to cast a fingerprint in the form of a binary
array back to the bit vector form, in order to calculate Tanimoto
distances. According to Curt's answer (thanks for that!), I can calculate
the Tanimoto similarity directly from the binary arrays. distance.jaccard
also works with numpy arrays (thanks Matthew!).





Re: [Rdkit-discuss] numpy array to bit vector

2017-03-17 Thread Curt Fischer
Hi Greg,

On Thu, Mar 16, 2017 at 9:05 PM, Greg Landrum wrote:

> I'm a bit confused by all this. The RDKit has Tanimoto (and a bunch of
> other similarity measures) built in:
>
>
Good point (as always). I'd been assuming that, for some reason, the OP had
fingerprints that had been converted to numpy.ndarray objects, not
rdkit.DataStructs.ExplicitBitVect objects.

Looking back over the thread, maybe what was really being asked was, "How
do I convert numpy.ndarray objects to rdkit.DataStructs.ExplicitBitVect
objects?"

Curt
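
One possible route for the conversion described above goes through a string of 0s and 1s, which RDKit's DataStructs.CreateFromBitString turns into an ExplicitBitVect. This is only a sketch of one option: the helper names are hypothetical, and RDKit is imported inside the function so that the string-building step can be run on its own.

```python
import numpy as np


def to_bit_string(ar):
    """Render a 1-D 0/1 array as a string such as '1100101'."""
    return "".join("1" if v else "0" for v in np.asarray(ar).ravel())


def np_to_bv_via_string(ar):
    """Hypothetical helper: ndarray -> ExplicitBitVect via a bit string."""
    from rdkit import DataStructs  # imported lazily; requires RDKit

    return DataStructs.CreateFromBitString(to_bit_string(ar))


print(to_bit_string(np.array([1, 1, 0, 0, 1, 0, 1])))  # 1100101
```

Building the string costs one Python-level pass over the array, so for long fingerprints an index-based approach may well be faster; that has not been measured here.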
--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot


Re: [Rdkit-discuss] numpy array to bit vector

2017-03-16 Thread Greg Landrum
I'm a bit confused by all this. The RDKit has Tanimoto (and a bunch of
other similarity measures) built in:

In [6]: from rdkit import DataStructs

In [7]: fp1 = rdMolDescriptors.GetMorganFingerprintAsBitVect(theobromine,2,2048)

In [8]: fp2 = rdMolDescriptors.GetMorganFingerprintAsBitVect(caffeine,2,2048)

In [9]: DataStructs.TanimotoSimilarity(fp1,fp2)
Out[9]: 0.5294117647058824

Is there a reason that you're interested in using something else?

-greg





Re: [Rdkit-discuss] numpy array to bit vector

2017-03-16 Thread matthew
I don't think you even need to cast them to numpy arrays if you use
scipy. It should be able to take the bit vectors directly. Also, Jaccard
distance is another name for Tanimoto distance. This simplifies Curt's
code:

from __future__ import print_function
from rdkit import Chem
from rdkit.Chem import AllChem

from scipy.spatial import distance

mol1 = Chem.MolFromSmiles('CCO')
mol2 = Chem.MolFromSmiles('CCC')

fp1 = AllChem.GetMorganFingerprintAsBitVect(mol1, 8)
fp2 = AllChem.GetMorganFingerprintAsBitVect(mol2, 8)

# jaccard distance is the same as tanimoto distance
# 1 - distance = similarity
print(1 - distance.jaccard(fp1, fp2))

# 0.4285714285714286

Matt
PhD Student
Chemoinformatics Research Group
University of Sheffield
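
The identity used above (Tanimoto similarity = 1 - Jaccard distance for binary vectors) can be checked without SciPy or RDKit. In this sketch the function names are illustrative, and the Jaccard distance is written out by hand as the fraction of differing positions among those where at least one vector is set; it is not SciPy's implementation, and the handling of two all-zero vectors is an assumption.

```python
import numpy as np


def jaccard_distance(u, v):
    """Mismatching positions divided by positions set in either vector."""
    u = np.asarray(u, dtype=bool)
    v = np.asarray(v, dtype=bool)
    either = np.logical_or(u, v).sum()
    if either == 0:
        return 0.0  # assumption: two all-zero vectors are treated as identical
    return np.logical_xor(u, v).sum() / either


def tanimoto_similarity(u, v):
    """Shared on bits divided by bits set in either vector."""
    u = np.asarray(u, dtype=bool)
    v = np.asarray(v, dtype=bool)
    either = np.logical_or(u, v).sum()
    if either == 0:
        return 1.0  # assumption, chosen to stay consistent with the distance
    return np.logical_and(u, v).sum() / either


fv1 = np.array([1, 1, 0, 0, 1, 0, 1])
fv2 = np.array([0, 1, 1, 0, 1, 0, 0])
print(tanimoto_similarity(fv1, fv2))     # 2 shared / 5 set in either
print(1.0 - jaccard_distance(fv1, fv2))  # same value up to rounding
```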



Re: [Rdkit-discuss] numpy array to bit vector

2017-03-16 Thread Curt Fischer
If you are looking for something quick and dirty, you could stay in numpy
to calculate Tanimoto.

from __future__ import division  # must come before all other imports

from rdkit import Chem
from rdkit.Chem import AllChem

import numpy as np

mol1 = Chem.MolFromSmiles('CCO')
mol2 = Chem.MolFromSmiles('CCC')

fp1 = np.array(AllChem.GetMorganFingerprintAsBitVect(mol1, 8), dtype='bool')
fp2 = np.array(AllChem.GetMorganFingerprintAsBitVect(mol2, 8), dtype='bool')

def tanimoto(v1, v2):
    """
    Calculates tanimoto similarity for two bit vectors
    """
    return np.bitwise_and(v1, v2).sum() / np.bitwise_or(v1, v2).sum()

tanimoto(fp1, fp2)

Out[4]: 0.42857142857142855
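
If many fingerprints are already held as rows of a 0/1 NumPy matrix, the same formula can be applied to all pairs at once. This is a sketch rather than something proposed in the thread: the name `tanimoto_matrix` is hypothetical, and pairs of all-zero rows are assigned 0.0 here by assumption.

```python
import numpy as np


def tanimoto_matrix(fps):
    """All-pairs Tanimoto similarity for the rows of a 2-D 0/1 array."""
    x = np.asarray(fps, dtype=np.int64)
    common = x @ x.T                      # shared on bits for each pair
    counts = x.sum(axis=1)                # on bits per fingerprint
    union = counts[:, None] + counts[None, :] - common
    out = np.zeros(common.shape, dtype=float)
    # Divide only where the union is nonzero; other entries stay 0.0
    np.divide(common, union, out=out, where=union > 0)
    return out


fps = np.array([[1, 1, 0, 0, 1, 0, 1],
                [0, 1, 1, 0, 1, 0, 0]])
print(tanimoto_matrix(fps))  # off-diagonal entries are 2 / 5
```

The matrix product counts shared on bits because each row contains only 0s and 1s; with count vectors it would compute something else.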




Re: [Rdkit-discuss] numpy array to bit vector

2017-03-16 Thread Francois BERENGER
Hi,

Here is a Python script that was created with the help of some
rdkit wizards:
https://github.com/UnixJunkie/mol2ecfp4

It works with unfolded ECFP4 fingerprints, so not exactly
what you are looking for.
There would be more modifications needed in order to fold
the fingerprint to the desired number of bits.

Regards,
Francois.
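
Folding, as mentioned above, is commonly done by taking each feature identifier modulo the desired length. The following sketch illustrates that idea on made-up identifiers; the function name, the identifiers and the length 16 are all assumptions, and the linked script may need a different treatment.

```python
import numpy as np


def fold_features(feature_ids, n_bits):
    """Fold non-negative integer feature IDs into n_bits positions."""
    folded = np.zeros(n_bits, dtype=np.uint8)
    for fid in feature_ids:
        # Different IDs can land on the same position (a bit collision)
        folded[fid % n_bits] = 1
    return folded


# Made-up identifiers standing in for unfolded fingerprint features;
# 3 and 19 collide on position 3, and 40 lands on position 8.
ids = [3, 19, 40]
print(fold_features(ids, 16))
```

A shorter target length saves memory but increases the chance of collisions, which in turn changes the similarity values computed from the folded vectors.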



Re: [Rdkit-discuss] numpy array to bit vector

2017-03-16 Thread Francois BERENGER
I'll send a Python script.
It works for .smi files.
If anyone can adapt it to work on sdf files, that would be wonderful.

Just give me 5 minutes to put it on GitHub.

On 03/16/2017 09:28 AM, Thomas Evangelidis wrote:
> Hello,
>
> I created a numpy array from a molecule using the following function:
>
> AllChem.GetMorganFingerprintAsBitVect()
>
>
> Now I would like to convert the numpy array back to a bit vector, in order
> to calculate the Tanimoto similarity of two compounds. Is this possible?
>
> thanks
> Thomas
>
>
>
> --
>
> ==
>
> Thomas Evangelidis
>
> Research Specialist
>
> CEITEC - Central European Institute of Technology
> Masaryk University
> Kamenice 5/A35/1S081,
> 62500 Brno, Czech Republic
>
> email: tev...@pharm.uoa.gr 
>
>   teva...@gmail.com 
>
>
> website: https://sites.google.com/site/thomasevangelidishomepage/
>
>
