Hello everyone, I have a small problem with saving my results. I clustered a database of molecules using their fingerprints. It works and I can see the clusters, but *how can I save them*?
>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>> def ClusterFps(fps, cutoff=0.2):
...     from rdkit import DataStructs
...     from rdkit.ML.Cluster import Butina
...     dists=[]
...     nfps=len(fps)
...     for i in range(1, nfps):
...         sims=DataStructs.BulkTanimotoSimilarity(fps[i], fps[:i])
...         dists.extend([1-x for x in sims])
...     cs=Butina.ClusterData(dists, nfps, cutoff, isDistData=True)
...     return cs
...
>>> ms = [x for x in Chem.SDMolSupplier(r'C:\Users\HP\100.sdf',removeHs=False)]
>>> len(ms)
100
>>> fps = [AllChem.GetMorganFingerprintAsBitVect(x,2,1024) for x in ms]
>>> clusters=ClusterFps(fps,cutoff=0.4)
>>> print(clusters[1])
(13, 4, 8)
>>> print(clusters)
((17, 15, 46), (13, 4, 8), (91, 53), (78, 76), (64, 42), (59, 58), (44, 43),
(39, 38), (31, 30), (25, 24), (7,), (99,), (98,), (97,), (96,), (95,), (94,),
(93,), (92,), (90,), (89,), (88,), (87,), (86,), (85,), (84,), (83,), (82,),
(81,), (80,), (79,), (77,), (75,), (74,), (73,), (72,), (71,), (70,), (69,),
(68,), (67,), (66,), (65,), (63,), (62,), (61,), (60,), (57,), (56,), (55,),
(54,), (52,), (51,), (50,), (49,), (48,), (47,), (45,), (41,), (40,), (37,),
(36,), (35,), (34,), (33,), (32,), (29,), (28,), (27,), (26,), (23,), (22,),
(21,), (20,), (19,), (18,), (16,), (14,), (12,), (11,), (10,), (9,), (6,),
(5,), (3,), (2,), (1,), (0,))

*If I try to use:*

>>> np.savetxt("DB_Clusters", clusters, delimiter=" ")
Traceback (most recent call last):
  File "C:\Anaconda3\envs\py37_rdkit\lib\site-packages\numpy\lib\npyio.py", line 1447, in savetxt
    v = format % tuple(row) + newline
TypeError: must be real number, not tuple

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<__array_function__ internals>", line 6, in savetxt
  File "C:\Anaconda3\envs\py37_rdkit\lib\site-packages\numpy\lib\npyio.py", line 1451, in savetxt
    % (str(X.dtype), format))
TypeError: Mismatch between array dtype ('object') and format specifier ('%.18e')

*Then I thought I hadn't imported NumPy, but importing it did not solve the problem.*

>>> import numpy as np
>>> np.savetxt("DB_Clu.txt", clusters, delimiter=" ")
Traceback (most recent call last):
  File "C:\Anaconda3\envs\py37_rdkit\lib\site-packages\numpy\lib\npyio.py", line 1447, in savetxt
    v = format % tuple(row) + newline
TypeError: must be real number, not tuple

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<__array_function__ internals>", line 6, in savetxt
  File "C:\Anaconda3\envs\py37_rdkit\lib\site-packages\numpy\lib\npyio.py", line 1451, in savetxt
    % (str(X.dtype), format))
TypeError: Mismatch between array dtype ('object') and format specifier ('%.18e')

*Is the problem that I can't use this function to save clusters? How can I save the clustering results?*

Sorry for the trouble,
Best regards,
Francesco
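
For context, the traceback reports an array of dtype 'object': the clusters are tuples of different lengths, so they do not form the rectangular numeric array that np.savetxt formats with '%.18e'. A minimal sketch of one possible workaround, using only the Python standard library, writes one cluster per line or dumps the whole structure as JSON. The helper names save_clusters_txt and load_clusters_txt, the output file names, and the shortened clusters tuple are illustrative assumptions, not part of RDKit or NumPy.

```python
import json
import os
import tempfile

# A few of the clusters printed above; each tuple holds molecule indices.
# (Shortened for illustration; the real result of ClusterFps works the same way.)
clusters = ((17, 15, 46), (13, 4, 8), (91, 53), (7,), (0,))


def save_clusters_txt(clusters, path):
    """Hypothetical helper: write one cluster per line, indices separated by spaces."""
    with open(path, "w") as fh:
        for cluster in clusters:
            fh.write(" ".join(str(idx) for idx in cluster) + "\n")


def load_clusters_txt(path):
    """Hypothetical helper: read the file back into a tuple of tuples of ints."""
    with open(path) as fh:
        return tuple(
            tuple(int(tok) for tok in line.split())
            for line in fh
            if line.strip()
        )


# Write into a temporary directory so the sketch does not clutter the working dir.
out_dir = tempfile.mkdtemp()

# Option 1: plain text, one cluster per line.
txt_path = os.path.join(out_dir, "DB_Clusters.txt")
save_clusters_txt(clusters, txt_path)
restored = load_clusters_txt(txt_path)

# Option 2: JSON, which stores the nested structure directly (tuples become lists).
json_path = os.path.join(out_dir, "DB_Clusters.json")
with open(json_path, "w") as fh:
    json.dump(clusters, fh)
with open(json_path) as fh:
    restored_json = tuple(tuple(c) for c in json.load(fh))
```

The text file is easy to inspect by eye (line number = cluster number), while the JSON file round-trips the nesting without custom parsing. Either way the stored values are indices into ms, so the molecules themselves still have to be looked up from the original SD file.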
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss