Hello everyone,

I have a small problem saving the results of a job. I clustered a database
of molecules using their fingerprints. The clustering works and I can see
the clusters, but *how can I save them*?

>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>> def ClusterFps(fps, cutoff=0.2):
...     from rdkit import DataStructs
...     from rdkit.ML.Cluster import Butina
...     dists=[]
...     nfps=len(fps)
...     for i in range(1, nfps):
...         sims=DataStructs.BulkTanimotoSimilarity(fps[i], fps[:i])
...         dists.extend([1-x for x in sims])
...     cs=Butina.ClusterData(dists, nfps, cutoff, isDistData=True)
...     return cs
...
>>> ms = [x for x in Chem.SDMolSupplier(r'C:\Users\HP\100.sdf',removeHs=False)]
>>> len(ms)
100
>>> fps = [AllChem.GetMorganFingerprintAsBitVect(x,2,1024) for x in ms]
>>> clusters=ClusterFps(fps,cutoff=0.4)
>>> print(clusters[1])
(13, 4, 8)
>>> print(clusters)
((17, 15, 46), (13, 4, 8), (91, 53), (78, 76), (64, 42), (59, 58),
(44, 43), (39, 38), (31, 30), (25, 24), (7,), (99,), (98,), (97,), (96,),
(95,), (94,), (93,), (92,), (90,), (89,), (88,), (87,), (86,), (85,),
(84,), (83,), (82,), (81,), (80,), (79,), (77,), (75,), (74,), (73,),
(72,), (71,), (70,), (69,), (68,), (67,), (66,), (65,), (63,), (62,),
(61,), (60,), (57,), (56,), (55,), (54,), (52,), (51,), (50,), (49,),
(48,), (47,), (45,), (41,), (40,), (37,), (36,), (35,), (34,), (33,),
(32,), (29,), (28,), (27,), (26,), (23,), (22,), (21,), (20,), (19,),
(18,), (16,), (14,), (12,), (11,), (10,), (9,), (6,), (5,), (3,), (2,),
(1,), (0,))

*If I try to use:*

>>> np.savetxt("DB_Clusters", clusters, delimiter="     ")
Traceback (most recent call last):
  File "C:\Anaconda3\envs\py37_rdkit\lib\site-packages\numpy\lib\npyio.py", line 1447, in savetxt
    v = format % tuple(row) + newline
TypeError: must be real number, not tuple

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<__array_function__ internals>", line 6, in savetxt
  File "C:\Anaconda3\envs\py37_rdkit\lib\site-packages\numpy\lib\npyio.py", line 1451, in savetxt
    % (str(X.dtype), format))
TypeError: Mismatch between array dtype ('object') and format specifier ('%.18e')

*Then I thought I might not have imported NumPy, but importing it did not
resolve the problem:*
>>> import numpy as np
>>> np.savetxt("DB_Clu.txt", clusters, delimiter="      ")
Traceback (most recent call last):
  File "C:\Anaconda3\envs\py37_rdkit\lib\site-packages\numpy\lib\npyio.py", line 1447, in savetxt
    v = format % tuple(row) + newline
TypeError: must be real number, not tuple

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<__array_function__ internals>", line 6, in savetxt
  File "C:\Anaconda3\envs\py37_rdkit\lib\site-packages\numpy\lib\npyio.py", line 1451, in savetxt
    % (str(X.dtype), format))
TypeError: Mismatch between array dtype ('object') and format specifier ('%.18e')


*Is the problem that I can't use this function to save clusters? How can I
save the clustering results?*
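
A possible way around this, offered as a minimal sketch rather than a
tested answer: the clusters are tuples of different lengths, so they do not
form the rectangular numeric array that np.savetxt formats with '%.18e'.
Writing one cluster per line with plain Python file I/O sidesteps that. The
helper names save_clusters and load_clusters, the output location and the
short sample tuple are assumptions made for illustration; in the session
above, the real `clusters` returned by ClusterFps would be passed instead.

```python
import os
import tempfile

# Hypothetical sample with the same shape as the Butina output:
# a tuple of tuples of molecule indices, of differing lengths.
clusters = ((17, 15, 46), (13, 4, 8), (91, 53), (7,), (0,))


def save_clusters(clusters, path):
    """Write one cluster per line as space-separated molecule indices."""
    with open(path, "w") as handle:
        for members in clusters:
            handle.write(" ".join(str(idx) for idx in members) + "\n")


def load_clusters(path):
    """Read the file back into a tuple of tuples of ints."""
    with open(path) as handle:
        return tuple(
            tuple(int(token) for token in line.split())
            for line in handle
            if line.strip()
        )


# Assumed output location: a file in the system temporary directory.
out_path = os.path.join(tempfile.gettempdir(), "DB_Clusters.txt")
save_clusters(clusters, out_path)
restored = load_clusters(out_path)
```

Each line of the resulting file is one cluster, and its line position
matches the cluster's index in the tuple, so clusters[1] ends up on the
second line.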

Sorry for the trouble,

Best regards,
Francesco
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
