Dear RDKit community,
I was treating AllChem.GetMorganFingerprint(m1,2) the same as ECFP4. I am
writing a paper for an open source tool, so I need to be very accurate. I
have seen one open source implementation for ECFP, which is from CDK. Most
researchers are using Pipeline Pilot to calculate
fingerprints are binary, thus can be stored as np.bool_, which
compared to double should be 8 times more memory efficient (one byte per
element instead of eight; packing the bits would reach 64 times).
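
The memory figures can be checked directly in NumPy. This is a minimal sketch, assuming the 100 x 2048 array size mentioned elsewhere in this thread; the variable names are only illustrative. np.bool_ uses one byte per element, so the saving over double is 8-fold, and packing eight bits into each byte with np.packbits gives 64-fold.

```python
import numpy as np

n_mols, n_bits = 100, 2048  # sizes borrowed from the thread's example

as_double = np.zeros((n_mols, n_bits), dtype=np.float64)  # 8 bytes per bit
as_bool = np.zeros((n_mols, n_bits), dtype=np.bool_)      # 1 byte per bit
as_packed = np.packbits(as_bool, axis=1)                  # 8 bits per byte

print(as_double.nbytes)  # 1638400
print(as_bool.nbytes)    # 204800
print(as_packed.nbytes)  # 25600

# The packed form is lossless: unpacking restores the boolean matrix.
restored = np.unpackbits(as_packed, axis=1).astype(np.bool_)
```

The packed array is compact for storage, but most NumPy operations expect the unpacked boolean form, so np.unpackbits is usually needed before computing on it.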
Best,
Maciej
Pozdrawiam, | Best regards,
Maciek Wójcikowski
mac...@wojcikowski.pl
2015-08-27 16:15 GMT+02:00 Jing Lu ajin...@gmail.com:
Hi Greg,
Thanks
Hi Greg,
Thanks! It works! But is it possible to fold the fingerprint to a smaller
size? np.zeros((100,2048)) still takes a lot of memory...
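
To illustrate what folding means here, the following is a rough sketch on a plain NumPy boolean matrix. It assumes the usual modulo scheme, in which bit j of the long vector is OR-ed onto bit j % new_size of the short one. The helper fold_fingerprints is a hypothetical name, not an RDKit function. RDKit itself can also produce shorter vectors, for example through the nBits argument of AllChem.GetMorganFingerprintAsBitVect.

```python
import numpy as np

def fold_fingerprints(fps, fold_factor=2):
    """Fold each row by OR-ing together `fold_factor` equal-length chunks."""
    n_mols, n_bits = fps.shape
    if n_bits % fold_factor != 0:
        raise ValueError("n_bits must be divisible by fold_factor")
    new_size = n_bits // fold_factor
    # After the reshape, original bit j sits in chunk j // new_size at
    # position j % new_size; any() over the chunk axis ORs the chunks.
    return fps.reshape(n_mols, fold_factor, new_size).any(axis=1)

fps = np.zeros((100, 2048), dtype=np.bool_)
fps[0, [3, 515, 2047]] = True  # 515 % 512 == 3, so it collides with bit 3

folded = fold_fingerprints(fps, fold_factor=4)  # 2048 -> 512 bits
print(folded.shape)                       # (100, 512)
print(np.flatnonzero(folded[0]).tolist()) # [3, 511]
```

Folding trades memory for more bit collisions, so similarity values computed on the folded vectors will differ somewhat from those on the full-length ones.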
Best,
Jing
On Wed, Aug 26, 2015 at 11:02 PM, Greg Landrum greg.land...@gmail.com
wrote:
On Thu, Aug 27, 2015 at 3:00 AM, Jing Lu ajin
/bayon/
It's not part of RDKit, but I think the library can cluster molecules
using ECFP4.
Unfortunately, the input file format of bayon is not a distance matrix, but
the format is easy to prepare.
Best regards.
Takayuki
On Sun, Aug 23, 2015 at 12:03, Jing Lu ajin...@gmail.com wrote:
Currently, I prefer
Dear RDKit users,
If I want to cluster more than 1M molecules by ECFP4, how could I do it? If
I calculate the distance between every pair of molecules, the
distance matrix will be too big. Does RDKit support any heuristic
clustering algorithm without calculating the distance matrix of the
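
As an illustration of clustering without an N x N distance matrix, here is a minimal sketch of a generic leader (sphere-exclusion style) algorithm. It is not an RDKit function and not necessarily what the replies in this thread recommend. Fingerprints are assumed to be given as Python sets of on-bit indices, and the names tanimoto, leader_cluster, and threshold are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two sets of on-bit indices."""
    if not a and not b:
        return 1.0
    inter = len(a & b)
    return inter / (len(a) + len(b) - inter)

def leader_cluster(fps, threshold=0.6):
    """Assign each fingerprint to the first leader at or above `threshold`.

    A fingerprint that matches no existing leader becomes a new leader.
    Only comparisons against leaders are made, so no full pairwise
    distance matrix is ever stored.
    """
    leaders = []     # indices of cluster representatives
    assignment = []  # cluster id for each input fingerprint
    for i, fp in enumerate(fps):
        for cid, lead in enumerate(leaders):
            if tanimoto(fp, fps[lead]) >= threshold:
                assignment.append(cid)
                break
        else:
            leaders.append(i)
            assignment.append(len(leaders) - 1)
    return leaders, assignment

# Toy fingerprints, made up for the example.
fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {10, 11, 12}, {10, 11, 13}, {1, 2, 3, 4, 5}]
leaders, assignment = leader_cluster(fps, threshold=0.5)
print(leaders)     # [0, 2]
print(assignment)  # [0, 0, 1, 1, 0]
```

The cost is roughly the number of molecules times the number of leaders, and the result depends on input order, so this is a heuristic rather than an exact replacement for matrix-based clustering.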