Hi Lorenzo,

As you've discovered, GetEuclideanDistMat() just returns one triangle of
the matrix as a 1D array.
I haven't tried to convert this back into an actual symmetric matrix (at
least I don't think I have), but it does look like using np.tri works. That
only sets the lower triangle, so you also need to add on the transpose.
Maybe try something like this (I've also simplified the calculation of n):

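import numpy as np
from rdkit.DataManip.Metric.rdMetricMatrixCalc import GetEuclideanDistMat

lower = GetEuclideanDistMat(descriptors.values)
n = len(descriptors.values)  # number of compounds (rows)
mask = np.tri(n, dtype=bool, k=-1)  # True strictly below the diagonal
distances = np.zeros((n, n), dtype=float)
distances[mask] = lower  # fill the lower triangle
distances += distances.transpose()  # mirror it into the upper triangle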
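
If you want to convince yourself that the mask-and-transpose trick is right
without involving RDKit at all, here is a rough NumPy-only sketch. The
condensed_lower() helper is a hypothetical stand-in for GetEuclideanDistMat();
it assumes the 1D result lists the strict lower triangle row by row, i.e.
d(1,0), d(2,0), d(2,1), ..., which is my understanding of the ordering but
worth checking against your own data. to_square() is likewise just a name I
made up for the reconstruction step.

```python
import numpy as np


def condensed_lower(points):
    """Hypothetical stand-in for GetEuclideanDistMat (assumed ordering):
    strict lower triangle, row by row: d(1,0), d(2,0), d(2,1), ..."""
    n = len(points)
    return np.array([np.linalg.norm(points[i] - points[j])
                     for i in range(1, n) for j in range(i)])


def to_square(lower, n):
    """Rebuild the full symmetric matrix from the 1D lower triangle."""
    mask = np.tri(n, dtype=bool, k=-1)       # True strictly below the diagonal
    distances = np.zeros((n, n), dtype=float)
    distances[mask] = lower                  # boolean assignment fills row by row
    return distances + distances.T           # mirror into the upper triangle


# Made-up data: 5 "compounds" with 3 descriptors each
rng = np.random.default_rng(0)
points = rng.random((5, 3))

square = to_square(condensed_lower(points), len(points))

# Brute-force reference using broadcasting
reference = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
assert np.allclose(square, reference)
```

Since the diagonal of a distance matrix is zero, adding the transpose does not
double-count anything.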


Note that if you have scikit-learn installed, it's *much* easier to use:
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.euclidean_distances.html
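
A minimal sketch of what that would look like, assuming scikit-learn is
installed. The small array X is made up for illustration; in your case you
would pass descriptors.values instead.

```python
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances

# Made-up descriptor matrix: 3 "compounds" x 2 descriptors
X = np.array([[0.0, 0.0],
              [3.0, 4.0],
              [6.0, 8.0]])

# With a single argument this returns the full, symmetric n x n matrix
distances = euclidean_distances(X)
```

This gives you the full square matrix directly, so there is no triangle to
unpack.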

-greg


On Fri, Oct 4, 2019 at 5:14 PM Lorenzo Fabbri via Rdkit-discuss <
rdkit-discuss@lists.sourceforge.net> wrote:

> I have a matrix of descriptors and I want to use GetEuclideanDistMat to
> get the pairwise Euclidean distances. Once I compute it, I need to create a
> full matrix (number of compounds x number of compounds) from the 1D vector.
> I’m currently using
>
> lower = GetEuclideanDistMat(descriptors.values)
> n = int(np.sqrt(len(lower)*2)) + 1
> mask = np.tri(n, dtype=bool, k=-1)
> distances = np.zeros((n, n), dtype=float)
> distances[mask] = lower
>
> It seems to be working but I’m getting some weird results (very small
> distances for very different compounds), so I’m guessing I’m doing
> something wrong with the code above. Any suggestion?
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>