Re: [Rdkit-discuss] GetSubstructMatch bug? + mol depiction issue

2021-11-04 Thread Greg Landrum
Yeah, this is definitely a bug. @Ling: thanks for raising it. @Ivan: thanks for simplifying the example! I created a GitHub issue here: https://github.com/rdkit/rdkit/issues/4674
-greg
On Thu, Nov 4, 2021 at 11:33 PM Ivan Tubert-Brohman <ivan.tubert-broh...@schrodinger.com> wrote:
> That does

Re: [Rdkit-discuss] GetSubstructMatch bug? + mol depiction issue

2021-11-04 Thread Wim Dehaen
Re: your second issue with the 2D depiction, especially for these kinds of bicyclic structures, the Schrödinger coordgen that is part of RDKit gives better-looking results (at the cost of a somewhat longer running time). You can turn it on using:
    from rdkit.Chem import rdDepictor
    rdDepictor.SetPre

Re: [Rdkit-discuss] GetSubstructMatch bug? + mol depiction issue

2021-11-04 Thread Ivan Tubert-Brohman
That does seem like a bug. You can also see it without involving DeleteSubstructs, by starting from different SMILES representations of the same molecule:
>>> m1 = Chem.MolFromSmiles('FC12C31C32F')
>>> m2 = Chem.MolFromSmiles('C12C31C32')
>>> m3 = Chem.MolFromSmiles('C1CC2C3C(C1)C23')

Re: [Rdkit-discuss] Reading text records from SDF from gzipped files

2021-11-04 Thread Andrew Dalke
Hi Tim,
You might also consider using chemfp, which has this sort of functionality available through its toolkit wrapper API:
    from chemfp import rdkit_toolkit as T
    import itertools
    with T.read_ids_and_molecules("chembl_28.sdf.gz") as reader:
        loc = reader.location
        for id, mol in itertools.

Re: [Rdkit-discuss] Reading text records from SDF from gzipped files

2021-11-04 Thread Tim Dudgeon
Thanks Paolo, that's fantastic. The first option was what I needed.
Tim
On Thu, Nov 4, 2021 at 4:36 PM Paolo Tosco wrote:
> Hi Tim,
>
> if you need access to the original text, you'll have to do the chunking
> yourself, e.g.:
>
> import gzip
>
> def molgen(hnd):
>     mol_text_tmp = ""
>     whi

Re: [Rdkit-discuss] Reading text records from SDF from gzipped files

2021-11-04 Thread Paolo Tosco
Hi Tim,
if you need access to the original text, you'll have to do the chunking yourself, e.g.:
    import gzip
    def molgen(hnd):
        mol_text_tmp = ""
        while 1:
            line = hnd.readline()
            if not line:
                return
            line = line.decode("utf-8")
            mol_text_tmp += line
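Since the preview above is cut off, here is a minimal, self-contained sketch of the kind of chunking it describes. It is not Paolo's original code: the function name `iter_sdf_records` and the placeholder sample data are made up for illustration, and it assumes that every record ends with a line containing only `$$$$`. It uses only the Python standard library, so it returns the raw text of each record and does not build molecules.

```python
import gzip
import io


def iter_sdf_records(handle):
    """Yield the text of each SD record read from a binary file-like object.

    Hypothetical helper for illustration; assumes each record is
    terminated by a line containing only "$$$$".
    """
    lines = []
    for raw in handle:
        line = raw.decode("utf-8")
        lines.append(line)
        # A "$$$$" line closes the current record.
        if line.rstrip("\r\n") == "$$$$":
            yield "".join(lines)
            lines = []
    # Any trailing text without a "$$$$" terminator is dropped in this sketch.


# Build a tiny two-record gzipped SD "file" in memory for demonstration.
# The record bodies are placeholders, not valid mol blocks.
sample = "record one\n> <ID>\n1\n\n$$$$\nrecord two\n> <ID>\n2\n\n$$$$\n"
buf = io.BytesIO(gzip.compress(sample.encode("utf-8")))

with gzip.open(buf, "rb") as hnd:
    records = list(iter_sdf_records(hnd))

print(len(records))
```

With a real file, `gzip.open("some_file.sdf.gz", "rb")` would replace the in-memory buffer (the file name here is a placeholder). Each yielded string could then be handed to an RDKit parser such as `Chem.MolFromMolBlock` to obtain the molecule alongside its text, though that step is not shown or tested here.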

[Rdkit-discuss] Reading text records from SDF from gzipped files

2021-11-04 Thread Tim Dudgeon
I need to access the text of each record of an SDF, as well as create a mol instance. I was successfully doing this using SDMolSupplier.GetItemText(). Then I needed to switch to handling gzipped SD files, and SDMolSupplier can only take a file name in its constructor. ForwardSDMolSupplier ca