On Dec 17, 2016, at 1:45 AM, Milinda Samaraweera wrote:
> However at the end of each tag header I noticed there is a number (bolded):
>
> ...
> > <Name_IUPAC_CAS> (1)
> N1-(2-ethylbutyl)hexane-1,3,6-triamine
> ...
>
> What is this number and how you avoid printing this number when SDwriter is
> used? As this number is not found in standard SD files.

Many programs do not generate a term in parentheses, although it is allowed by the connection table specification as a way to designate an "external registry number". The ctfile.pdf I have from 2011 says:

  • Note: The > sign is a reserved character. A field name cannot contain
    hyphen (-), period (.), less than (<), greater than (>), equal sign (=),
    percent sign (%) or blank space ( ). Field names must begin with an alpha
    character and can contain alpha and numeric characters after that,
    including underscore.

  Optional information for the data header includes:

  • The compound's external and internal registry numbers. External
    registry numbers must be enclosed in parentheses.
  • Any combination of information

  The following are examples of valid data headers:

    > <MELTING_POINT>
    > 55 (MD-08974) <BOILING_POINT> DT12
    > DT12 55
    > (MD-0894) <BOILING_POINT> FROM ARCHIVES

As you have discovered, RDKit writes the output record number for each molecule in this field. I see no way to disable that through the API.

The two options I can suggest for now are:

1) Implement your own writer using MolToMolBlock() to generate the connection table text and your own code to enumerate through the properties. The result looks something like:

    from rdkit import Chem

    def mol_to_sd_block(mol):
        # Connection table text for the molecule.
        block = Chem.MolToMolBlock(mol)
        lines = [block]
        # One data item per property: header line, value, blank line.
        for name in mol.GetPropNames():
            lines.append("> <%s>\n%s\n\n" % (name, mol.GetProp(name)))
        # Record terminator.
        lines.append("$$$$\n")
        return "".join(lines)

2) Maintain your own fork where you've deleted line 87 or so of Code/GraphMol/FileParsers/SDWriter.cpp, where it says:

    if (d_molid >= 0) (*dp_ostream) << "(" << d_molid + 1 << ") ";

Cheers,

    Andrew
    da...@dalkescientific.com

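P.S. A third possibility, which is only a sketch and not something taken from RDKit itself, is to post-process the SD text after it has been written and drop the parenthesized record number from each data header line. The function name strip_record_numbers and the regular expression are my own, and the pattern assumes the header has the form "> <FIELD_NAME> (N)" with nothing after the number. Note that it would also remove a legitimate external registry number if that number happened to be all digits.

```python
import re

# Matches a data header such as ">  <Name_IUPAC_CAS>  (1) " and captures
# everything up to and including the ">" that closes the field name.
# Assumption: the parenthesized term is all digits and ends the line.
_HEADER_WITH_NUMBER = re.compile(r"^(>\s*<[^<>]+>)\s*\(\d+\)\s*$")


def strip_record_numbers(sd_text):
    """Return sd_text with a trailing "(N)" removed from data header lines."""
    cleaned = []
    for line in sd_text.splitlines(keepends=True):
        # Separate the line ending so it can be put back unchanged.
        body = line.rstrip("\r\n")
        newline = line[len(body):]
        match = _HEADER_WITH_NUMBER.match(body)
        if match:
            line = match.group(1) + newline
        cleaned.append(line)
    return "".join(cleaned)
```

To use it, you would write the records to an in-memory buffer (for example an io.StringIO) instead of directly to a file, pass the buffer's contents through strip_record_numbers(), and save the result. Headers that carry other optional information, such as "> 55 (MD-08974) <BOILING_POINT> DT12", do not match the pattern and are left alone.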