On Friday, 22 January 2016 at 01:36:40 UTC, cym13 wrote:
On Friday, 22 January 2016 at 01:27:13 UTC, H. S. Teoh wrote:
And now that you mention this, RFC-4180 does not allow doubled quotes in an unquoted field. I'll take that out of the code (it improves performance :-D).

Right, re-reading the RFC would have been a great thing. That said I saw that kind of CSV in the real world, so I don't know what to think of it. I'm not saying it should be supported, but I wonder if there are points outside RFC-4180 that are taken for granted.

You have to understand CSV didn't come from a standard. People started using it because it was simple for writing out some tabular data. Then they changed it because their data changed. It's not as if their language came with a CSV parser; it was always hand-written, and people still do that today. And that is why data is delimited with so many things other than commas (people thought they wouldn't need to escape their data).

So yes, some CSV parsers will accept comments but that just means it breaks for people that have # in their data. Yeah, you can assume that two double quotes in unquoted data is just a quote, but then it breaks for those who have that kind of data which isn't escaped.
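
To make the doubled-quote trade-off concrete, here is a minimal sketch. It uses Python's csv module purely as a stand-in (this is not std.csv, and it says nothing about how std.csv behaves). `parse_strict` and `parse_lenient` are hypothetical helper names made up for illustration; the lenient one is a crude approximation that collapses doubled quotes after parsing.

```python
import csv


def parse_strict(line):
    """Parse one line; Python's csv leaves quotes inside an unquoted field as-is."""
    return next(csv.reader([line]))


def parse_lenient(line):
    """Hypothetical lenient variant: treat "" as one quote in every field.

    Crude on purpose: it runs after parsing, so it also rewrites fields
    that were correctly quoted and really do contain two quotes.
    """
    return [field.replace('""', '"') for field in parse_strict(line)]


# Unquoted field whose data really contains two double quotes.
line = 'widget,12"" shelf'
print(parse_strict(line))   # second field keeps both quotes
print(parse_lenient(line))  # second field silently loses one quote
```

Neither reading is wrong in general. The lenient one helps whoever wrote sloppy escaping, and corrupts the data of whoever really had two quotes in an unquoted field.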

There are also many other issues with CSV data, like whether the file is in ASCII, UTF, or some other code page. And many times CSV isn't well-formed because the data was output without proper escaping.

std.csv isn't the end-all of CSV parsers, but it will at least handle well-formed CSV that uses different separators or quotes.
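
As a rough illustration of what "well-formed CSV with a different separator or quote" means, here is a small sketch. Again it uses Python's csv module as a stand-in and not std.csv; `read_rows` is a hypothetical helper and the sample data is invented.

```python
import csv
import io


def read_rows(text, delimiter=",", quotechar='"'):
    """Read well-formed delimited text with a configurable separator and quote."""
    # newline="" lets the csv module handle line endings inside quoted fields.
    return list(csv.reader(io.StringIO(text, newline=""),
                           delimiter=delimiter, quotechar=quotechar))


# Semicolon-separated, single-quoted; the quote is escaped by doubling it.
text = "name;note\n'Smith; John';'it''s fine'\n"
for row in read_rows(text, delimiter=";", quotechar="'"):
    print(row)
```

The point is that once the separator and quote are stated up front, properly escaped data parses unambiguously, whatever the two characters happen to be.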
