On Thursday, 21 January 2016 at 21:24:49 UTC, H. S. Teoh wrote:
[snip]
There are some limitations to this approach: while the current
code does try to unwrap quoted values in the CSV, it does not
correctly parse escaped double quotes ("") in the fields. This
is because to process those values correctly we'd have to copy
the field data into a new string and construct its interpreted
value, which is slow. So I leave it as an exercise for the
reader to implement (it's not hard, when the double
double-quote sequence is detected, allocate a new string with
the interpreted data instead of slicing the original data.
Either that, or just unescape the quotes in the application
code itself).
What about wrapping the slices in a range-like interface that
would unescape the quotes on demand? You could even set a flag on
it during the initial pass to say the field contains doubled
quotes that need to be unescaped, so fields without them skip the
per-pop check for double quotes entirely (that's probably a
pretty minor boost, if any, though).
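
Something along these lines, as a rough sketch in D. The names LazyField, hasEscapes and materialize are made up for illustration and are not from the code being discussed. It assumes the parser has already stripped the outer quotes from the slice and set the flag when it saw a "" pair. It also walks the data by code unit, which is fine for the quote handling but means non-ASCII text comes out as raw UTF-8 bytes rather than decoded characters.

```d
/// Hypothetical wrapper around a field slice; an input range of char.
struct LazyField
{
    const(char)[] data;  // slice of the original buffer, outer quotes removed
    bool hasEscapes;     // set by the parser if the field contains ""

    @property bool empty() const { return data.length == 0; }

    // For an escaped pair this is the first '"', which is exactly
    // the single quote character the caller should see.
    @property char front() const { return data[0]; }

    void popFront()
    {
        // An escaped quote takes two bytes in the raw data but stands for
        // one character, so skip both. Fields without the flag never
        // pay for the comparison.
        if (hasEscapes && data.length >= 2 && data[0] == '"' && data[1] == '"')
            data = data[2 .. $];
        else
            data = data[1 .. $];
    }
}

/// Builds the interpreted value; allocates only when this is called.
string materialize(LazyField f)
{
    string result;
    foreach (c; f)
        result ~= c;
    return result;
}
```

Application code that only compares or scans a field can iterate the range directly and never allocate; materialize is there for the cases that really need a separate string.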