Re: [Rdkit-discuss] numpy array to bit vector

2019-11-14 Thread Thomas Evangelidis
Great, thank you! Btw, does RDKit offer any scalar vector similarity functions apart from the bit vector similarities?
On Thu, 14 Nov 2019 at 16:48, Greg Landrum wrote:
> Yep, that's about 7x faster than what I came up with.
> Thanks Maciek!
>
> -greg
>
> On Thu, Nov 14, 2019 at 4:35 PM

Re: [Rdkit-discuss] numpy array to bit vector

2019-11-14 Thread Greg Landrum
Yep, that's about 7x faster than what I came up with.
Thanks Maciek!
-greg
On Thu, Nov 14, 2019 at 4:35 PM Maciek Wójcikowski wrote:
> Hi Thomas,
>
> You could also use the SetBitsFromList() method:
>
>> bv.SetBitsFromList(np.where(ar)[0].tolist())
>
> Pozdrawiam, | Best regards,
>

Re: [Rdkit-discuss] numpy array to bit vector

2019-11-14 Thread Maciek Wójcikowski
Hi Thomas,
You could also use the SetBitsFromList() method:
    bv.SetBitsFromList(np.where(ar)[0].tolist())
Pozdrawiam, | Best regards,
Maciek Wójcikowski
mac...@wojcikowski.pl
On Thu, 14 Nov 2019 at 16:28, Greg Landrum wrote:
> Hi Thomas,
>
> There may be more efficient ways to do
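For readers who want to try the SetBitsFromList() suggestion in isolation, here is a minimal sketch. It assumes a 1-D binary numpy array; the function name np_to_bv and the toy array are illustrative only and do not come from the message itself.

```python
import numpy as np
from rdkit import DataStructs


def np_to_bv(ar):
    """Convert a 1-D binary numpy array to an ExplicitBitVect (illustrative helper)."""
    ar = np.asarray(ar)
    bv = DataStructs.ExplicitBitVect(int(ar.size))
    # np.where returns a tuple of index arrays; [0] holds the indices of the
    # non-zero entries. tolist() turns them into plain Python ints.
    bv.SetBitsFromList(np.where(ar)[0].tolist())
    return bv


# Toy input, assumed for illustration only.
ar = np.array([0, 1, 1, 0, 0, 1, 0, 0])
bv = np_to_bv(ar)
on_bits = list(bv.GetOnBits())
print(bv.GetNumBits(), on_bits)
```

The resulting bit vector can then be passed to the similarity functions in rdkit.DataStructs.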

Re: [Rdkit-discuss] numpy array to bit vector

2019-11-14 Thread Greg Landrum
Hi Thomas,
There may be more efficient ways to do this, but here's something that works (and isn't the slowest thing I came up with):
    def np_to_bv(fv):
        bv = DataStructs.ExplicitBitVect(len(fv))
        for i,v in enumerate(fv):
            if v:
                bv.SetBit(i)
        return bv
-greg
On Thu,

Re: [Rdkit-discuss] numpy array to bit vector

2019-11-14 Thread Thomas Evangelidis
Greetings, I am opening this old thread again for someone to answer my initial question this time, which was "How do I convert numpy.ndarray objects to rdkit.DataStructs.ExplicitBitVect objects?". At the time I asked the question I circumvented the problem by calculating Tanimoto similarities

Re: [Rdkit-discuss] numpy array to bit vector

2017-03-17 Thread Thomas Evangelidis
Guys, my question was how to cast a fingerprint in the form of a binary array back to the bit vector form, in order to calculate Tanimoto distances. According to Curt's answer (thanks for that!), I can calculate the Tanimoto simply by using binary arrays. distance.jaccard also works with numpy

Re: [Rdkit-discuss] numpy array to bit vector

2017-03-17 Thread Curt Fischer
Hi Greg,
On Thu, Mar 16, 2017 at 9:05 PM, Greg Landrum wrote:
> I'm a bit confused by all this. The RDKit has Tanimoto (and a bunch of
> other similarity measures) built in:
>
Good point (as always). I'd been assuming that, for some reason, the OP had fingerprints that

Re: [Rdkit-discuss] numpy array to bit vector

2017-03-16 Thread Greg Landrum
I'm a bit confused by all this. The RDKit has Tanimoto (and a bunch of other similarity measures) built in:
    In [6]: from rdkit import DataStructs
    In [7]: fp1 = rdMolDescriptors.GetMorganFingerprintAsBitVect(theobromine,2,2048)
    In [8]: fp2 =
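To illustrate the built-in similarity measures mentioned here without depending on the truncated session, the following sketch uses two toy bit vectors in place of the Morgan fingerprints. The helper bv_from_bits and the bit patterns are assumptions made for this example only.

```python
from rdkit import DataStructs


def bv_from_bits(n_bits, on_bits):
    """Build an ExplicitBitVect with the given bits switched on (illustrative helper)."""
    bv = DataStructs.ExplicitBitVect(n_bits)
    bv.SetBitsFromList(list(on_bits))
    return bv


# Toy fingerprints: 4 bits set in each, 2 of them shared.
fp1 = bv_from_bits(8, [0, 1, 2, 3])
fp2 = bv_from_bits(8, [2, 3, 4, 5])

# Tanimoto = shared / (on1 + on2 - shared) = 2 / 6
sim = DataStructs.TanimotoSimilarity(fp1, fp2)
# Dice = 2 * shared / (on1 + on2) = 4 / 8
dice = DataStructs.DiceSimilarity(fp1, fp2)
print(sim, dice)
```

The same calls accept fingerprints generated from molecules, as long as both are bit vectors of equal length.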

Re: [Rdkit-discuss] numpy array to bit vector

2017-03-16 Thread matthew
I don't think you even need to cast them to numpy arrays if you use scipy. It should be able to take bit arrays. Also, Jaccard distance is another name for Tanimoto distance. This simplifies the code above:
    from __future__ import print_function
    from rdkit import Chem
    from rdkit.Chem import

Re: [Rdkit-discuss] numpy array to bit vector

2017-03-16 Thread Curt Fischer
If you are looking for something quick and dirty, you could stay in numpy to calculate Tanimoto.
    from __future__ import division
    from rdkit import Chem
    from rdkit.Chem import AllChem
    import numpy as np
    mol1 = Chem.MolFromSmiles('CCO')
    mol2 = Chem.MolFromSmiles('CCC')
    fp1 =
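The rest of this message is cut off, so the following is not Curt's code but a minimal sketch of the same idea: computing Tanimoto similarity directly on binary numpy arrays. The function name, the toy arrays, and the choice to return 0.0 when both arrays are empty are assumptions of this sketch.

```python
import numpy as np


def tanimoto(a, b):
    """Tanimoto similarity of two equal-length binary arrays (illustrative helper)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.count_nonzero(a | b)
    if union == 0:
        # Both arrays have no bits set; returning 0.0 is a convention
        # chosen for this sketch, other tools may define it differently.
        return 0.0
    return np.count_nonzero(a & b) / union


# Toy fingerprints, assumed for illustration only.
fp1 = np.array([1, 1, 0, 0])
fp2 = np.array([1, 0, 1, 0])
sim = tanimoto(fp1, fp2)  # 1 shared bit out of 3 bits set in either array
print(sim, 1.0 - sim)     # similarity and the corresponding Jaccard distance
```

Since Jaccard distance is one minus this similarity, the result can be cross-checked against a distance function such as the distance.jaccard mentioned elsewhere in the thread.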

Re: [Rdkit-discuss] numpy array to bit vector

2017-03-16 Thread Francois BERENGER
Hi, Here is a Python script that was created with the help of some rdkit wizards: https://github.com/UnixJunkie/mol2ecfp4 It works with unfolded ECFP4 fingerprints, so not exactly what you are looking for. There would be more modifications needed in order to fold the fingerprint to the desired

Re: [Rdkit-discuss] numpy array to bit vector

2017-03-16 Thread Francois BERENGER
I'll send a Python script. It works for .smi files. If anyone can adapt it to work on SDF files, that would be wonderful. Just give me 5 minutes to put it on GitHub.
On 03/16/2017 09:28 AM, Thomas Evangelidis wrote:
> Hello,
>
> I created a numpy array from a molecule using the following function:
>