Hi guys,

I am trying to construct my own symmetrical RMS matrix (lower half) for
Butina clustering, by using GetBestRMS which considers symmetry.

I need to get the matrix with the RMS elements in the correct order first. Here
is what I did for testing, using just GetConformerRMSMatrix and
GetConformerRMS:


from rdkit import Chem
from rdkit.Chem import AllChem

mh = Chem.AddHs(Chem.MolFromSmiles('OCCCN'))
cids = AllChem.EmbedMultipleConfs(mh, numConfs=5, maxAttempts=1000,
                                  pruneRmsThresh=0.5, numThreads=0,
                                  randomSeed=1)
m = Chem.RemoveHs(mh)
mat_a = AllChem.GetConformerRMSMatrix(m, prealigned=False)
print(mat_a)

mat_b = []
count = len(cids)
for i in range(count - 1):
    for j in range(i + 1, count):
        mat_b.append(AllChem.GetConformerRMS(m, cids[i], cids[j]))
print(mat_b)


mat_a:
[0.660379357470512, 0.5803507133538487, 0.8111033830159597,
0.7063747192537949, 0.10437239857420268, 0.8858184043706921,
0.9292367217722529, 0.872233146598343, 0.6451929254710606,
0.9110647560331953]
mat_b:
[0.660379357470513, 0.5803507133538501, 0.7063747192537968,
0.929236721772254, 0.7045981421188982, 0.09521549761836234,
0.6494273777558387, 0.7667663565750649, 0.6265013024617176,
0.6467365004737882]

You see the two matrices do not match. Apparently, my mat_b gives me this
rms list: [01, 02, 03, 04, 12, 13, 14, 23, 24, 34] (numbers are id pairs)

According to the documentation, GetConformerRMSMatrix should give me the
following matrix and so the list [ a, b, c, d, e, f, g, h, i,  j ]:
rmsmatrix = [ a,
              b, c,
              d, e, f,
              g, h, i, j ]

After assigning the conformer id numbers to rows and columns:
rmsmatrix = [      0   1   2   3   4
               0
               1   a,
               2   b,  c,
               3   d,  e,  f,
               4   g,  h,  i,  j   ]
So the mat_a from GetConformerRMSMatrix should be:
[ a,   b,   c,   d,   e,   f,    g,    h,   i,    j  ] =
[01, 02, 12, 03, 13, 23, 04, 14, 24, 34 ]
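
The two orderings above can be sketched in plain Python, without RDKit. The helper names below (lower_triangle_pairs, upper_triangle_pairs, reorder_upper_to_lower) are hypothetical and only illustrate the index bookkeeping; they are not RDKit functions.

```python
def lower_triangle_pairs(n):
    """Pairs (j, i) with j < i, walking the lower triangle row by row."""
    return [(j, i) for i in range(1, n) for j in range(i)]


def upper_triangle_pairs(n):
    """Pairs (i, j) with i < j, in the order the nested loop for mat_b produces."""
    return [(i, j) for i in range(n - 1) for j in range(i + 1, n)]


def reorder_upper_to_lower(values, n):
    """Rearrange a list given in nested-loop order into lower-triangle row order."""
    by_pair = dict(zip(upper_triangle_pairs(n), values))
    return [by_pair[pair] for pair in lower_triangle_pairs(n)]


# For 5 conformers this prints the pairs 01, 02, 12, 03, 13, 23, 04, 14, 24, 34
print(lower_triangle_pairs(5))
```

Calling reorder_upper_to_lower(mat_b, 5) would then put mat_b into the same order as the list expected from GetConformerRMSMatrix.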

This might explain the difference in ordering between mat_a and mat_b. But
some numbers are still very different, even after reordering manually. I
cannot figure out why.
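
To see which entries actually disagree after reordering, the two printed lists can be compared pair by pair. This is only a sketch: it assumes the two values that were split across lines in the printed output are 0.872233146598343 and 0.7667663565750649, and it relies on the pair orderings described earlier in this message.

```python
import math

# Values copied from the printed output (split numbers rejoined, an assumption).
mat_a = [0.660379357470512, 0.5803507133538487, 0.8111033830159597,
         0.7063747192537949, 0.10437239857420268, 0.8858184043706921,
         0.9292367217722529, 0.872233146598343, 0.6451929254710606,
         0.9110647560331953]
mat_b = [0.660379357470513, 0.5803507133538501, 0.7063747192537968,
         0.929236721772254, 0.7045981421188982, 0.09521549761836234,
         0.6494273777558387, 0.7667663565750649, 0.6265013024617176,
         0.6467365004737882]

n = 5
# mat_a order (lower triangle, row by row): 01, 02, 12, 03, 13, 23, ...
lower_pairs = [(j, i) for i in range(1, n) for j in range(i)]
# mat_b order (nested loop): 01, 02, 03, 04, 12, 13, ...
upper_pairs = [(i, j) for i in range(n - 1) for j in range(i + 1, n)]

a_by_pair = dict(zip(lower_pairs, mat_a))
b_by_pair = dict(zip(upper_pairs, mat_b))

# Split the pairs into those where both lists agree and those where they do not.
agree = [p for p in lower_pairs
         if math.isclose(a_by_pair[p], b_by_pair[p], abs_tol=1e-9)]
differ = [p for p in lower_pairs if p not in agree]

print("agree: ", agree)
print("differ:", differ)
```

With these numbers, only the pairs that involve conformer 0 agree, and every other pair is larger in mat_a than in mat_b. That pattern would be consistent with GetConformerRMSMatrix superimposing all conformers onto the first one only, while GetConformerRMS aligns each pair separately, but that reading is an assumption to check against the RDKit documentation.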
Did I miss anything important? I am new to RDKit. Can anyone help me with
this? Thanks a lot!

Best,
Leon
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
