Hi David,

A workaround could be to add a final check after the for loop:

#!/usr/bin/env python

from rdkit import Chem

suppl1 = Chem.SmilesMolSupplier('test1.smi', titleLine=False, nameColumn=1)
rec_num = 0
print("len(suppl1) = {0:d}".format(len(suppl1)))
for mol in suppl1:
    rec_num += 1
    if not mol:
        print('Record {} not read.'.format(rec_num))
    else:
        print('Record {} read ok.'.format(rec_num))
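# Final check: if the loop ended one record short of len(suppl1),
# report the last record as not read.
if rec_num == len(suppl1) - 1:
    rec_num += 1
    print('Record {} not read.'.format(rec_num))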


suppl2 = Chem.SmilesMolSupplier('test2.smi', titleLine=False, nameColumn=1)
rec_num = 0
print("len(suppl2) = {0:d}".format(len(suppl2)))
for mol in suppl2:
    rec_num += 1
    if not mol:
        print('Record {} not read.'.format(rec_num))
    else:
        print('Record {} read ok.'.format(rec_num))
# Same final check for the second file.
if rec_num == len(suppl2) - 1:
    rec_num += 1
    print('Record {} not read.'.format(rec_num))

This should work until what seems to be an issue in the SmilesMolSupplier is fixed.
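
Since the same check is repeated for both files, it could be wrapped in a small helper. The following is only a sketch: the names unread_records and FakeSupplier are made up for illustration, and FakeSupplier is a stand-in that imitates the behaviour described in this thread (len() counts a record that iteration never yields), so the example runs without RDKit.

```python
def unread_records(supplier):
    """Return the 1-based numbers of records that were not read."""
    failed = []
    rec_num = 0
    for mol in supplier:
        rec_num += 1
        if mol is None:
            failed.append(rec_num)
    # Final check from the workaround: if iteration stopped one record
    # short of len(supplier), count the last record as not read.
    if rec_num == len(supplier) - 1:
        failed.append(rec_num + 1)
    return failed


class FakeSupplier:
    """Hypothetical stand-in for a supplier; not part of RDKit."""

    def __init__(self, mols, total):
        self._mols = mols    # what iteration yields
        self._total = total  # what len() reports

    def __len__(self):
        return self._total

    def __iter__(self):
        return iter(self._mols)


# Three records are yielded (the second one bad), but len() reports four.
print(unread_records(FakeSupplier(["mol", None, "mol"], 4)))  # [2, 4]
```

With RDKit the call would presumably look like unread_records(Chem.SmilesMolSupplier('test2.smi', titleLine=False, nameColumn=1)), under the same assumption as the workaround itself: that the loop ends exactly one record short when the last line is bad.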

Cheers,
p.

On 06/03/19 16:49, David Cosgrove wrote:
Hi,

I'm trying to catch the line numbers of lines in a SMILES file that aren't parsed by the SmilesMolSupplier.  Example code is attached, along with 2 SMILES files.  When there is a bad SMILES string on the last line, the error is not reported, as in test2.smi.  I've tried iterating through the file in a loop using next(suppl1) and catching the StopIteration exception, but I have the same issue.  Is there a way to spot a last bad record in a file?

Thanks,
Dave

--
David Cosgrove
Freelance computational chemistry and chemoinformatics developer
http://cozchemix.co.uk





_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
