Hi Francesco,
np.savetxt() expects a 1-D or 2-D array of numbers, while you have a
tuple of tuples of different lengths (i.e., the individual clusters),
which numpy turns into an array of objects, hence the error.
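The error message can be reproduced without numpy. This is only a sketch, assuming (as the traceback suggests) that savetxt formats each row with its default '%.18e' and that each row ends up holding one whole cluster; the variable names are just for illustration:

```python
# Illustrative only: mimic the formatting step shown in the traceback.
fmt = "%.18e"          # default number format used by np.savetxt
row = ((13, 4, 8),)    # a "row" whose single element is a whole cluster

try:
    line = fmt % tuple(row)   # same expression as in npyio.py
    message = None
except TypeError as exc:
    line = None
    message = str(exc)

print(message)
```

The format specifier wants a real number, but what it gets is a tuple.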
You don't actually need numpy to save an array in human-readable, text
format. You might store your array in JSON format:
import json
with open("c:/temp/clusters.json", "w") as hnd:
    json.dump(clusters, hnd)
And then restore it as a tuple of tuples, as it originally was:
clusters = None
with open("c:/temp/clusters.json", "r") as hnd:
    clusters = tuple(map(tuple, json.load(hnd)))
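The tuple(map(tuple, ...)) step is needed because JSON has no tuple type: tuples are written as JSON arrays and come back as lists. A small in-memory sketch, using a made-up three-cluster example rather than your real data:

```python
import json

# Made-up sample in the same shape as the Butina output.
clusters = ((17, 15, 46), (13, 4, 8), (7,))

# Round-trip through a JSON string instead of a file.
restored = json.loads(json.dumps(clusters))
print(restored)  # nested lists, not tuples

# Convert back to the original tuple-of-tuples layout.
as_tuples = tuple(map(tuple, restored))
print(as_tuples == clusters)
```

If you are happy to work with lists of lists, you can skip the conversion.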
You might also store the array in its string representation...
from ast import literal_eval
with open("c:/temp/clusters.txt", "w") as hnd:
    hnd.write(str(clusters) + "\n")
...and then restore it using ast.literal_eval():
clusters = None
with open("c:/temp/clusters.txt", "r") as hnd:
    clusters = literal_eval(hnd.read())
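If you would rather have a file that is easy to read by eye or to open in other tools, a third possibility is one cluster per line, with the molecule indices separated by spaces. This is only a sketch: the helper names save_clusters() and load_clusters() and the file name are made up for illustration, and it assumes every cluster is a non-empty tuple of integers.

```python
import os
import tempfile


def save_clusters(clusters, path):
    """Write one cluster per line, indices separated by single spaces."""
    with open(path, "w") as hnd:
        for cluster in clusters:
            hnd.write(" ".join(str(idx) for idx in cluster) + "\n")


def load_clusters(path):
    """Read the file back into a tuple of tuples of ints."""
    with open(path, "r") as hnd:
        return tuple(
            tuple(int(tok) for tok in line.split())
            for line in hnd
            if line.strip()  # skip blank lines
        )


# Made-up sample in the same shape as the Butina output.
clusters = ((17, 15, 46), (13, 4, 8), (7,))
path = os.path.join(tempfile.gettempdir(), "clusters_by_line.txt")
save_clusters(clusters, path)
print(load_clusters(path) == clusters)
```

The line number then doubles as the cluster number, which can be handy when inspecting the file by hand.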
HTH, cheers,
p.
On 23/03/2020 13:29, Francesco Coppola wrote:
Hello everyone,
I have a small problem with saving a job. With the fingerprints of a
database of molecules, I made the clusters. It works, I see them, but
*how can I save it*?
>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>> def ClusterFps(fps, cutoff=0.2):
...     from rdkit import DataStructs
...     from rdkit.ML.Cluster import Butina
...     dists=[]
...     nfps=len(fps)
...     for i in range(1, nfps):
...         sims=DataStructs.BulkTanimotoSimilarity(fps[i], fps[:i])
...         dists.extend([1-x for x in sims])
...     cs=Butina.ClusterData(dists, nfps, cutoff, isDistData=True)
...     return cs
...
>>> ms = [x for x in Chem.SDMolSupplier(r'C:\Users\HP\100.sdf',removeHs=False)]
>>> len(ms)
100
>>> fps = [AllChem.GetMorganFingerprintAsBitVect(x,2,1024) for x in ms]
>>> clusters=ClusterFps(fps,cutoff=0.4)
>>> print(clusters[1])
(13, 4, 8)
>>> print(clusters)
((17, 15, 46), (13, 4, 8), (91, 53), (78, 76), (64, 42), (59, 58),
(44, 43), (39, 38), (31, 30), (25, 24), (7,), (99,), (98,), (97,),
(96,), (95,), (94,), (93,), (92,), (90,), (89,), (88,),
(87,), (86,), (85,), (84,), (83,), (82,), (81,), (80,), (79,),
(77,), (75,), (74,), (73,), (72,), (71,), (70,), (69,), (68,), (67,),
(66,), (65,), (63,), (62,), (61,), (60,), (57,),
(56,), (55,), (54,), (52,), (51,), (50,), (49,), (48,), (47,), (45,),
(41,), (40,), (37,), (36,), (35,), (34,), (33,), (32,), (29,), (28,),
(27,), (26,), (23,), (22,), (21,), (20,), (19,),
(18,), (16,), (14,), (12,), (11,), (10,), (9,), (6,), (5,), (3,),
(2,), (1,), (0,))
*If I try to use:*
>>> np.savetxt("DB_Clusters", clusters, delimiter=" ")
Traceback (most recent call last):
  File "C:\Anaconda3\envs\py37_rdkit\lib\site-packages\numpy\lib\npyio.py", line 1447, in savetxt
    v = format % tuple(row) + newline
TypeError: must be real number, not tuple

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<__array_function__ internals>", line 6, in savetxt
  File "C:\Anaconda3\envs\py37_rdkit\lib\site-packages\numpy\lib\npyio.py", line 1451, in savetxt
    % (str(X.dtype), format))
TypeError: Mismatch between array dtype ('object') and format specifier ('%.18e')
*Then I thought I hadn't imported Numpy, but the problem was not
resolved.*
>>> import numpy as np
>>> np.savetxt("DB_Clu.txt", clusters, delimiter=" ")
Traceback (most recent call last):
  File "C:\Anaconda3\envs\py37_rdkit\lib\site-packages\numpy\lib\npyio.py", line 1447, in savetxt
    v = format % tuple(row) + newline
TypeError: must be real number, not tuple

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<__array_function__ internals>", line 6, in savetxt
  File "C:\Anaconda3\envs\py37_rdkit\lib\site-packages\numpy\lib\npyio.py", line 1451, in savetxt
    % (str(X.dtype), format))
TypeError: Mismatch between array dtype ('object') and format specifier ('%.18e')
*Is the problem that I can't use this function to save clusters?
How can I save the clustering results?*
Sorry for the trouble,
Best regards,
Francesco
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss