As I understand it, the pandas CSV parser handles quoting automatically
when it builds the DataFrame, so that route would deal with quoted data
without intervention. It's just a case of standardisation. If I weren't
reluctant to rewrite the file without the quotes, I'd handle it in a
sanitisation step before it got to this stage. I mainly raised the
question in case there was a quotechar feature I was missing.
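
For the sanitisation route, a minimal sketch of stripping the quotes in
memory with the standard csv module, so the file on disk is left alone.
The helper name strip_csv_quotes, the candidate delimiter list and the
4096-character sample size are my own assumptions, not anything from
RDKit. It also assumes double-quote quoting and that no field contains
the delimiter once unquoted (true for SMILES, not necessarily for names).

```python
import csv
import io


def strip_csv_quotes(raw_text, delimiter=None, candidates=",;\t|"):
    """Return (unquoted_text, delimiter) for delimited text with quoted fields.

    Hypothetical helper: assumes double-quote quoting and that no field
    contains the delimiter after the quotes are removed.
    """
    if delimiter is None:
        try:
            # Restrict the sniffer to plausible delimiters so it cannot
            # pick a character that occurs inside the SMILES strings.
            sniffed = csv.Sniffer().sniff(raw_text[:4096], delimiters=candidates)
            delimiter = sniffed.delimiter
        except csv.Error:
            # Single-column input often has no detectable delimiter.
            delimiter = ","
    # csv.reader removes the surrounding quotes while splitting each row.
    reader = csv.reader(io.StringIO(raw_text, newline=""), delimiter=delimiter)
    # Re-join the bare fields and skip completely empty lines.
    lines = [delimiter.join(row) for row in reader if row]
    return "\n".join(lines) + "\n", delimiter


if __name__ == "__main__":
    quoted = '"smiles","name"\n"C1=CC=CC=C1","benzene"\n'
    text, delim = strip_csv_quotes(quoted, delimiter=",")
    print(text, end="")
    # With RDKit installed, the cleaned text could then be handed over
    # without going through a file (untested here, so treat as a sketch):
    #   supplier = Chem.SmilesMolSupplier()
    #   supplier.SetData(text, delimiter=delim, smilesColumn=0,
    #                    nameColumn=-1, titleLine=True)
```

If the delimiter is already known from the sniffing step in the original
workflow, passing it explicitly avoids sniffing a second time.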

On Mon, 10 Jan 2022 at 14:10, Tim Dudgeon <tdudgeon...@gmail.com> wrote:

> I think you have to add a step that removes the quote marks if they are
> present?
> Tim
>
> On Mon, Jan 10, 2022 at 10:15 AM James Wallace <jeawall...@gmail.com>
> wrote:
>
>> As the subject suggests, I'm trying to find a universal solution for
>> reading CSVs via the SmilesMolSupplier. As the input could be single
>> column or multiple column, using the pandas tools for interconversion is
>> overkill.
>>
>> The general structure I use for analysing the CSV is:
>>
>>
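>> import csv
>> from rdkit import Chem
>>
>> sniffer = csv.Sniffer()
>> # The with block closes the file, so no explicit close() is needed
>> with open(chem_file_name, "r", newline="") as csv_upload_file:
>>     first_line = csv_upload_file.readline()
>>     dialect = sniffer.sniff(first_line)
>>     has_header = sniffer.has_header(first_line)
>>
>> # chem_file_name and smi_col_header come from the rest of the workflow
>> supplier = Chem.SmilesMolSupplier(
>>     chem_file_name, delimiter=str(dialect.delimiter),
>>     smilesColumn=smi_col_header, nameColumn=-1, titleLine=has_header)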
>>
>> If I use a CSV without quoted data, this is fine: I can autodetect the
>> delimiter, the column header is loaded in by the rest of my workflow, and
>> everything else is worked out through the CSV sniffer. However, where the
>> data is quoted, the actual parsing fails because of the quote marks.
>>
>> [10:09:56] SMILES Parse Error: syntax error for input: '"C1=CC=CC=C1"'
>> [10:09:56] ERROR: Smiles parse error on line 1
>>
>> Is there some easy way of handling this, or do I have to mandate not
>> using quoting of data in the CSV generation?
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>
