Re: [Rdkit-discuss] Catching errors in SMILES files

Greg Landrum Thu, 06 Jun 2019 06:08:01 -0700

For what it's worth, I think I fixed this (and cleared up some other
problems) in this PR:
https://github.com/rdkit/rdkit/pull/2482


On Tue, Jun 4, 2019 at 1:44 PM Paolo Tosco <[email protected]>
wrote:

> Hi David,
>
> I think I already have a fix for this bug, I'll submit a PR later. If you
> can create a GitHub issue it would be great so I can link my PR to the bug.
>
> Thanks, cheers
> p.
>
> On 06/04/19 12:10, David Cosgrove wrote:
>
> Hi Paolo,
> Many thanks for the speedy reply.  I'll do as you suggest for now.  Do you
> want me to file an issue on github, or even, maybe, see if I can fix it
> myself?
> Cheers,
> Dave
>
>
> On Mon, Jun 3, 2019 at 5:32 PM Paolo Tosco <[email protected]>
> wrote:
>
>> Hi David,
>>
>> a workaround could be adding a final check after the for loop:
>>
>> #!/usr/bin/env python
>>
>> from rdkit import Chem
>>
>> suppl1 = Chem.SmilesMolSupplier('test1.smi', titleLine=False,
>> nameColumn=1)
>> rec_num = 0
>> print("len(suppl1) = {0:d}".format(len(suppl1)))
>> for mol in suppl1:
>>     rec_num += 1
>>     if not mol:
>>         print('Record {} not read.'.format(rec_num))
>>     else:
>>         print('Record {} read ok.'.format(rec_num))
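>> # If the loop stopped one record short of len(suppl1), the last record
>> # was bad and was skipped silently, so report it here.
>> if rec_num == len(suppl1) - 1:
>>     rec_num += 1
>>     print('Record {} not read.'.format(rec_num))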
>>
>>
>> suppl2 = Chem.SmilesMolSupplier('test2.smi', titleLine=False,
>> nameColumn=1)
>> rec_num = 0
>> print("len(suppl2) = {0:d}".format(len(suppl2)))
>> for mol in suppl2:
>>     rec_num += 1
>>     if not mol:
>>         print('Record {} not read.'.format(rec_num))
>>     else:
>>         print('Record {} read ok.'.format(rec_num))
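>> # Same final check for the second file: a bad last record is skipped
>> # silently, leaving the loop one short of len(suppl2).
>> if rec_num == len(suppl2) - 1:
>>     rec_num += 1
>>     print('Record {} not read.'.format(rec_num))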
>>
>> This should work until what seems to be an issue in the SmilesMolSupplier
>> is fixed.
>>
>> Cheers,
>> p.
>>
>> On 06/03/19 16:49, David Cosgrove wrote:
>>
>> Hi,
>>
>> I'm trying to catch the line numbers of lines in a SMILES file that
>> aren't parsed by the SmilesMolSupplier.  Example code is attached, along
>> with 2 SMILES files.  When there is a bad SMILES string on the last line,
>> the error is not reported, as in test2.smi.  I've tried iterating through
>> the file in a loop using next(suppl1) and catching the StopIteration
>> exception, but I have the same issue.  Is there a way to spot a last bad
>> record in a file?
>>
>> Thanks,
>> Dave
>>
>> --
>> David Cosgrove
>> Freelance computational chemistry and chemoinformatics developer
>> http://cozchemix.co.uk
>>
>>
>>
>>
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>>
>
> --
> David Cosgrove
> Freelance computational chemistry and chemoinformatics developer
> http://cozchemix.co.uk
>
>
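
An alternative that does not depend on how the supplier behaves at the end
of the file: read the file line by line and hand each SMILES to a parser
yourself, so the line number of a failure is always known, including on the
last line. The following is a minimal sketch, not code from the thread. The
helper name find_bad_records and its parse argument are made up for
illustration, and it assumes whitespace-delimited records with the SMILES in
the first column and no title line, as in the calls quoted above.

```python
from typing import Callable, Iterable, List, Optional


def find_bad_records(lines: Iterable[str],
                     parse: Callable[[str], Optional[object]]) -> List[int]:
    """Return the 1-based line numbers whose SMILES field `parse` rejects.

    `parse` is expected to return None for a SMILES it cannot handle.
    Hypothetical helper: it assumes the SMILES is the first
    whitespace-delimited field on each line and that there is no title line.
    """
    bad = []
    for line_num, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            # Blank lines are not records; skip them but keep counting lines.
            continue
        if parse(fields[0]) is None:
            bad.append(line_num)
    return bad
```

With RDKit installed, passing Chem.MolFromSmiles as parse should work, since
it returns None for a SMILES it cannot parse, for example
find_bad_records(open('test2.smi'), Chem.MolFromSmiles). Because every line
is checked explicitly, a bad record on the last line is treated like any
other. This has not been run against the attached files.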

_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
