Well, I'm not really familiar with the Taylor-Butina clustering method, so
I'm proposing a methodology based on generalizing something that I found to
be useful in a somewhat different clustering context.
Presuming that what you are clustering is the fingerprints of structures,
and that you know
I was very happy to hear about the integration of MolVS into RDKit core
in the talk by Susan Leung at the recent UGM.
https://github.com/rdkit/UGM_2018/blob/master/Presentations/Leung_GSoC_RDKit-MolVS_Integration.pdf
This is going to be incredibly useful once it gets released.
To help with
On Sep 21, 2018, at 14:53, Philipp Thiel wrote:
> you probably read about the Tanimoto being a proper metric in case of having
> binary data
> in Leach and Gillet 'Introduction to Chemoinformatics' chapter 5.3.1 in the
> revised edition.
What we call Tanimoto is more broadly known as the
(I see that I accidentally responded to Andrew, only, earlier; I'm copying
to the group this time.)
FWIW, in work on conformational clustering, I used the “most
representative” molecule; that is, the real molecule closest to the
mathematical centroid. This would probably be the best way of
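As a rough illustration of the "most representative" idea, here is a minimal sketch. It assumes each molecule (or conformer) has already been reduced to a numeric feature vector of equal length; the function name `most_representative` and the toy vectors are hypothetical and not taken from the original work. It averages the vectors to get the mathematical centroid, then returns the index of the real item closest to it by Euclidean distance.

```python
import math


def most_representative(vectors):
    """Return the index of the real item closest to the mathematical centroid.

    `vectors` is a list of equal-length lists of floats (hypothetical
    per-molecule descriptors; any numeric representation would do).
    """
    n = len(vectors)
    dim = len(vectors[0])
    # Mathematical centroid: the component-wise mean of all vectors.
    centroid = [sum(v[i] for v in vectors) / n for i in range(dim)]

    def distance_to_centroid(v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(v, centroid)))

    # The representative is a real member of the set, not the centroid itself.
    return min(range(n), key=lambda i: distance_to_centroid(vectors[i]))


# Toy data: three 2-D points; the centroid is at about (3.67, 0).
toy = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]]
best = most_representative(toy)
```

For fingerprints, the same pattern should work with a similarity-based distance in place of the Euclidean one, but that substitution is an assumption and not something stated in the thread.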
Hi Colin,
RDKit writes charge information to mol blocks using the CHG line:
In [3]: m = Chem.MolFromSmiles('C[NH3+]')
In [4]: print(Chem.MolToMolBlock(m))
RDKit 2D
2 1 0 0 0 0 0 0 0 0999 V2000
0.0.0. C 0 0 0 0 0 0 0 0 0 0 0 0
Well, yes, I do have this line; I left out the rest of the file for
clarity. The thing is, tools such as MOE and PyMOL read it without
problems, but rDock, for example, can't read it properly and returns a
neutral N, which is wrong. And if I open it with PyMOL and save it
back in mol format,
Hello,
A while back I noticed that RDKit has issues with boron-containing
compounds. One is below, and I admit it is a strange one. I read in an SDF
file and write it out after calculating a formal charge on the molecule.
It seems to be read into RDKit OK, but writing errored out with
Hey everyone,
I have a question concerning the Chem.MolToMolFile() function.
When I open this file containing an N+ (here is the corresponding line in
the mol file):
   11.3700    3.4360  -11.8300 N   0  3  0  0  0  0  0  0  0  0  0  0
and just save it back without any modification, the
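For context, a V2000 mol file can carry a formal charge in two places: the charge code in the atom line (the "3" after the element symbol above, which means +1 in the CTfile convention) and an "M  CHG" property line. The sketch below decodes both, assuming strictly fixed-width V2000 formatting. The helper names `atom_line_charge` and `chg_line_charges` are hypothetical and are not RDKit functions.

```python
# V2000 atom-block charge codes (CTfile convention): code -> formal charge.
# Code 4 denotes a doublet radical, which carries no formal charge.
CHARGE_CODES = {0: 0, 1: 3, 2: 2, 3: 1, 4: 0, 5: -1, 6: -2, 7: -3}


def atom_line_charge(line):
    """Return (symbol, formal charge) from a fixed-width V2000 atom line."""
    symbol = line[31:34].strip()   # columns 32-34: element symbol
    code = int(line[36:39])        # columns 37-39: charge code
    return symbol, CHARGE_CODES[code]


def chg_line_charges(line):
    """Return {atom index: charge} from an 'M  CHG' property line."""
    fields = line.split()          # e.g. ['M', 'CHG', '1', '2', '1']
    count = int(fields[2])
    pairs = fields[3:3 + 2 * count]
    return {int(pairs[i]): int(pairs[i + 1]) for i in range(0, len(pairs), 2)}


# Rebuild the atom line quoted above with explicit field widths so that the
# column positions do not depend on how the whitespace was copied.
atom_line = "%10.4f%10.4f%10.4f %-3s%2d%3d" % (11.37, 3.436, -11.83, "N", 0, 3)
atom_line += "  0" * 10

n_charge = atom_line_charge(atom_line)
chg = chg_line_charges("M  CHG  1   2   1")
```

A reader that looks at only one of the two locations would see a neutral atom whenever the writer used the other. Whether that is what happens here is an assumption that would need checking against the full file.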
On Sep 25, 2018, at 17:13, Peter S. Shenkin wrote:
> FWIW, in work on conformational clustering, I used the “most representative”
> molecule; that is, the real molecule closest to the mathematical centroid.
> This would probably be the best way of displaying a single molecule that
> typifies