Re: [Rdkit-discuss] Using SmilesMolSuplier with CSV containing quotemarks

2022-01-10 Thread James Wallace
As I understand it, the pandas CSV parsing can handle it automatically when it makes the DataFrame, so it would effectively handle this approach without intervention. It's just a case of standardisation, although if it wasn't for the fact I don't really want to rewrite the file without the quotes,

Re: [Rdkit-discuss] Using SmilesMolSuplier with CSV containing quotemarks

2022-01-10 Thread Tim Dudgeon
I think you have to add a step that removes the quote marks if they are present? Tim On Mon, Jan 10, 2022 at 10:15 AM James Wallace wrote: > As the subject suggests, I'm trying to find a universal solution for > reading CSVs via the SmilesMolSupplier (as the input setup could be single > column
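
For illustration only (this is not code from the thread): a minimal sketch of the kind of quote-removal step suggested in this reply, assuming a comma delimiter and double quotes. The helper name `strip_quotes` and the sample lines are hypothetical. It is deliberately naive and would mis-handle a quoted field that itself contains the delimiter.

```python
def strip_quotes(line, delimiter=","):
    """Remove one pair of surrounding double quotes from each field.

    Naive on purpose: it splits on every delimiter, so a quoted field
    that itself contains the delimiter is not handled correctly.
    (Hypothetical helper for illustration, not an RDKit function.)
    """
    fields = line.rstrip("\r\n").split(delimiter)
    cleaned = []
    for field in fields:
        field = field.strip()
        # Only strip when the field is wrapped in a matching pair of quotes.
        if len(field) >= 2 and field[0] == '"' and field[-1] == '"':
            field = field[1:-1]
        cleaned.append(field)
    return delimiter.join(cleaned)


# Made-up sample lines: one quoted, one unquoted.
lines = ['"CCO","ethanol"', "c1ccccc1,benzene"]
cleaned_lines = [strip_quotes(line) for line in lines]
print(cleaned_lines)
```

Lines without quote marks pass through unchanged, so the same step can be applied whether or not the input file is quoted.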

Re: [Rdkit-discuss] Using SmilesMolSuplier with CSV containing quotemarks

2022-01-10 Thread James Wallace
I figured as much, and I guess in my case the pandas side will be useful enough for this, thanks. On Mon, 10 Jan 2022 at 14:05, Greg Landrum wrote: > Hi James, > > The RDKit does not have a full-featured CSV parser, writing such a thing > is a non-trivial task. If you need to support general CSV

Re: [Rdkit-discuss] Using SmilesMolSuplier with CSV containing quotemarks

2022-01-10 Thread Greg Landrum
Hi James, The RDKit does not have a full-featured CSV parser; writing such a thing is a non-trivial task. If you need to support general CSV, I'd suggest using pandas or Python's built-in csv module... it seems like overkill, but dealing with all the oddness that can show up in CSVs is really not e
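
A minimal sketch of the csv-module route suggested above (not code from the thread). The helper name `read_smiles_column`, the column layout, and the sample data are assumptions for illustration. It only extracts the unquoted SMILES strings; each one could then be handed to RDKit, for example via `Chem.MolFromSmiles`.

```python
import csv
import io


def read_smiles_column(handle, smiles_column=0, has_header=True):
    """Return the SMILES strings from one column of a CSV file-like object.

    csv.reader removes surrounding quote marks and copes with quoted
    fields that contain the delimiter.
    (Hypothetical helper for illustration, not an RDKit function.)
    """
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)  # skip the title line if there is one
    smiles = []
    for row in reader:
        # Skip blank rows and rows too short to hold the SMILES column.
        if len(row) > smiles_column and row[smiles_column].strip():
            smiles.append(row[smiles_column].strip())
    return smiles


# Made-up sample: one quoted and one unquoted SMILES, plus a quoted
# name that contains a comma.
sample = 'smiles,name\n"CCO","ethanol"\nc1ccccc1,"benzene, pure"\n'
smiles_list = read_smiles_column(io.StringIO(sample))
print(smiles_list)
```

The same function works for single-column input by passing `has_header=False` when there is no title line, which covers both layouts mentioned in the original question.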

[Rdkit-discuss] Using SmilesMolSuplier with CSV containing quotemarks

2022-01-10 Thread James Wallace
As the subject suggests, I'm trying to find a universal solution for reading CSVs via the SmilesMolSupplier (as the input setup could be single column or multiple column, using the pandas tools for interconversion is overkill). The general structure I use for analysing the CSV is: with open(chem_