Hey you lovely people,

as I am creating a set of building blocks for my in-silico reaction, I downloaded various accessible databases (ChEMBL28, GDB-13, GDB-17, PubChem, eMolecules and Mcule) and want to filter them with "HasSubstructMatch". Unfortunately, I run into a "File parsing error: ran out of lines". I open the .smi files with SmilesMolSupplier and then simply loop over them:

    from rdkit import Chem
    from rdkit.Chem import Descriptors

    # infile, target_file, mwt and pattern1 are defined earlier in my script
    with open(target_file, "w") as outfile:
        suppl = Chem.SmilesMolSupplier(infile, sanitize=False, nameColumn=-1)
        for mol in suppl:
            if Descriptors.MolWt(mol) <= mwt:
                if mol.HasSubstructMatch(pattern1) == True:
                    mol = Chem.MolToSmiles(mol)
                    outfile.write(mol + "\n")
                else:
                    continue
            else:
                continue

I suspect it has something to do with the length of the files, but I don't know how to actually fix that.

Thanks for all your help!

Kind regards
Philipp
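
P.S. One thing I have not tried yet is reading the files line by line myself, so that blank or odd lines can be skipped and the offending line number is known. Below is a minimal sketch in plain Python. The helper name iter_smiles_records and its has_title_line flag are my own invention, not part of RDKit, and it assumes whitespace-delimited files with the SMILES in the first column:

```python
from typing import Iterator, Tuple


def iter_smiles_records(path: str, has_title_line: bool = False) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, smiles) for every non-blank line of a .smi file.

    Hypothetical helper: assumes whitespace-delimited columns with the
    SMILES string in the first column.
    """
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line_number, line in enumerate(handle, start=1):
            if has_title_line and line_number == 1:
                continue  # skip the header row, if the file has one
            fields = line.split()
            if not fields:
                continue  # blank or whitespace-only line: nothing to parse
            yield line_number, fields[0]
```

Each yielded SMILES could then be handed to Chem.MolFromSmiles, which returns None for strings it cannot parse, so a failing record could be reported by line number and skipped. Because the file is streamed, memory use should stay flat even for the larger databases.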