Thanks Paolo, that's fantastic. The first option was what I needed.

Tim

On Thu, Nov 4, 2021 at 4:36 PM Paolo Tosco <paolo.tosco.m...@gmail.com> wrote:
> Hi Tim,
>
> if you need access to the original text, you'll have to do the chunking
> yourself, e.g.:
>
> import gzip
> from rdkit import Chem
>
> def molgen(hnd):
>     # Accumulate lines up to the "$$$$" record terminator, then yield the record
>     mol_text_tmp = ""
>     while True:
>         line = hnd.readline()
>         if not line:
>             return
>         line = line.decode("utf-8")
>         mol_text_tmp += line
>         if line.startswith("$$$$"):
>             mol_text = mol_text_tmp
>             mol_text_tmp = ""
>             yield mol_text
>
> with gzip.open("yourfile.sdf.gz", "rb") as gzip_hnd:
>     for mol_text in molgen(gzip_hnd):
>         print(mol_text)
>         # Parse the single-record text with a fresh supplier
>         suppl = Chem.SDMolSupplier()
>         suppl.SetData(mol_text)
>         mol = next(suppl)
>         print(mol.GetNumAtoms())
>         print("------------------")
>
> If you are happy with the RDKit-generated text, you can combine the
> ForwardSDMolSupplier with the SDWriter:
>
> import gzip
> from io import StringIO
> from rdkit import Chem
>
> with gzip.open("yourfile.sdf.gz", "rb") as gzip_hnd:
>     with Chem.ForwardSDMolSupplier(gzip_hnd) as suppl:
>         for mol in suppl:
>             buf = StringIO()
>             # Leaving the with block closes the writer and flushes the buffer
>             with Chem.SDWriter(buf) as w:
>                 w.write(mol)
>             print(buf.getvalue())
>             print(mol.GetNumAtoms())
>             print("------------------")
>
> Cheers,
> p.
>
> On Thu, Nov 4, 2021 at 5:09 PM Tim Dudgeon <tdudgeon...@gmail.com> wrote:
>
>> I need to access the text of each record of an SD file, as well as
>> create a mol instance from it.
>> I was successfully doing this using SDMolSupplier.GetItemText().
>> Then I needed to switch to handling gzipped SD files, and SDMolSupplier
>> can only take a file name in its constructor.
>> ForwardSDMolSupplier can handle a gzip file-like instance, but doesn't
>> have the GetItemText() function.
>> Reading the file records as text is easy enough, but I can't figure out
>> how to get the SD file properties (Chem.MolFromMolBlock() does not handle
>> the properties).
>>
>> Seems like there should be an easy way to handle this that I'm not seeing!
>>
>> Tim
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
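
For anyone finding this thread later: the chunking step from the first option can also be sketched with the gzip stream opened in text mode, which avoids decoding each line by hand. This is only a minimal sketch using the standard library; the names iter_sd_records and demo_records are made up for illustration, and the demo data is placeholder text rather than valid mol blocks.

```python
import gzip
import io


def iter_sd_records(handle):
    """Yield the raw text of each SD record from a text-mode file handle.

    Hypothetical helper: a record is taken to end at a line starting
    with "$$$$", the same rule used by the chunking code in this thread.
    """
    lines = []
    for line in handle:
        lines.append(line)
        if line.startswith("$$$$"):
            yield "".join(lines)
            lines = []


def demo_records():
    """Round-trip two placeholder records through an in-memory gzip stream."""
    # Placeholder content only; these are NOT parseable mol blocks.
    fake_sdf = (
        "mol1\n(placeholder)\nM  END\n> <ID>\n1\n\n$$$$\n"
        "mol2\n(placeholder)\nM  END\n> <ID>\n2\n\n$$$$\n"
    )
    raw = gzip.compress(fake_sdf.encode("utf-8"))
    # "rt" makes gzip.open return decoded str lines instead of bytes.
    with gzip.open(io.BytesIO(raw), "rt", encoding="utf-8") as hnd:
        return list(iter_sd_records(hnd))


if __name__ == "__main__":
    for record in demo_records():
        print(record, end="")
        print("------------------")
```

With a real file, each yielded record could be passed to Chem.SDMolSupplier().SetData() as in the first option quoted above. Trailing text after the last "$$$$" line is dropped here, which is an assumption about how incomplete records should be treated.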