On Dec 17, 2016, at 1:45 AM, Milinda Samaraweera wrote:
> However at the end of each tag header I noticed there is a number (bolded):
>
> ...
> > <Name_IUPAC_CAS> (1)
> N1-(2-ethylbutyl)hexane-1,3,6-triamine
> ...
> What is this number and how you avoid printing this number when SDwriter is
> used? As this number is not found in standard SD files.
Many programs do not generate a term in parentheses, although it is allowed by
the connection table specification as a way to designate an "external registry
number".
The ctfile.pdf I have from 2011 says:
• Note: The > sign is a reserved character. A field name cannot contain
hyphen (-), period (.), less than (<), greater than (>), equal sign (=),
percent sign (%) or blank space ( ). Field names must begin with an alpha
character and can contain alpha and numeric characters after that,
including underscore.
Optional information for the data header includes:
• The compound’s external and internal registry numbers.
External registry numbers must be enclosed in parentheses.
• Any combination of information
The following are examples of valid data headers:
> <MELTING_POINT>
> 55 (MD-08974) <BOILING_POINT> DT12
> DT12 55
> (MD-0894) <BOILING_POINT> FROM ARCHIVES
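To make the header layout concrete, here is a rough Python sketch of how a
reader might pull the optional parts out of a data header line. It is my own
illustration, not code from the specification or from RDKit, and the name
parse_data_header is hypothetical. It assumes the field name is whatever sits
between "<" and ">" and the external registry number is whatever sits between
"(" and ")"; it does not check the field-name character rules quoted above.

```python
import re

# Assumption: the field name is the text between "<" and ">", and the
# external registry number is the text between "(" and ")".
FIELD_NAME = re.compile(r"<([^<>]*)>")
EXTERNAL_REGNO = re.compile(r"\(([^()]*)\)")


def parse_data_header(line):
    """Hypothetical helper: return (field_name, external_regno) for a header.

    Either element is None when the header does not contain that term.
    """
    if not line.startswith(">"):
        raise ValueError("a data header must start with '>'")
    rest = line[1:]
    field_name = None
    name_match = FIELD_NAME.search(rest)
    if name_match:
        field_name = name_match.group(1)
        # Drop the <...> term so it cannot be mistaken for a registry number.
        rest = rest[:name_match.start()] + rest[name_match.end():]
    regno_match = EXTERNAL_REGNO.search(rest)
    external_regno = regno_match.group(1) if regno_match else None
    return field_name, external_regno


print(parse_data_header("> 55 (MD-08974) <BOILING_POINT> DT12"))
print(parse_data_header("> <Name_IUPAC_CAS> (1)"))
```

Read this way, the "(1)" in the header from the original question lands in the
external registry number slot, which is why it is legal even though few
programs write it.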
As you have discovered, RDKit stores the output record number for each molecule
in this field.
I see no way to disable that through the API. The two options I can suggest for
now are:
1) Implement your own writer using MolToMolBlock() to generate the connection
table text and your own code to enumerate through the properties. The result
looks something like:
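    from rdkit import Chem

    def mol_to_sd_block(mol):
        # Connection table text for the molecule
        block = Chem.MolToMolBlock(mol)
        lines = [block]
        # One data item per property: header line, value, blank line
        for name in mol.GetPropNames():
            lines.append("> <%s>\n%s\n\n" % (name, mol.GetProp(name)))
        # Record terminator
        lines.append("$$$$\n")
        return "".join(lines)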
2) Maintain your own fork where you've deleted line 87 or so of
Code/GraphMol/FileParsers/SDWriter.cpp, where it says:
if (d_molid >= 0) (*dp_ostream) << "(" << d_molid + 1 << ") ";
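A further possibility, which is not one of the two options above and is
offered only as a sketch, is to leave RDKit alone and post-process the text it
wrote. The function name strip_record_numbers is hypothetical. It assumes the
only thing to remove is a purely numeric "(n)" term following the "<name>" on
a line that starts with ">", which matches the header shown in the question.

```python
import re

# Assumption: a header line starts with ">", contains one <name> term, and
# ends with a numeric "(n)" term, optionally padded with spaces or tabs.
HEADER_WITH_ID = re.compile(
    r"^(>[ \t]*<[^<>\n]*>)[ \t]*\(\d+\)[ \t]*$", re.MULTILINE
)


def strip_record_numbers(sd_text):
    """Hypothetical helper: drop the trailing "(n)" from data header lines."""
    return HEADER_WITH_ID.sub(r"\1", sd_text)


record = (
    ">  <Name_IUPAC_CAS>  (1) \n"
    "N1-(2-ethylbutyl)hexane-1,3,6-triamine\n"
    "\n"
    "$$$$\n"
)
print(strip_record_numbers(record))
```

The caveat is that a data value line which happens to look exactly like such a
header would be rewritten too, so this is only safe when you know your
property values cannot take that form.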
Cheers,
Andrew