Expanding on what Jay replied to:
Hi,
well we could discuss endlessly, what well formed CSV files are.
Not really, the rules are pretty set in stone, it isn't open to interpretation.
Given that we cannot fix the generation of the CSV file, why not making the
importer a little bit smarter?
The .import command already treats " as literal data, when it doesn't appear at the
beginning of the cell, but it requires an even number of " in the same cell to don't
get confused in breaking up the row into cells.
All I ask for is to treat an odd number of " in a cell as literal data except when
the rule for dequoting applies, i. e. when the cell starts with a ".
This is impossible - How do you know where the end of a cell is if you are
ignoring/counting Quotes???
For instance, what is the resulting fields from this CSV: [123, 9" Nail, ", Caliper, "set, up
5"", hold", 8 ]
In REAL CSV terms that would be: 123 | 9 Nail, | Caliper | set, up 5",
hold | 8
How would you want that to be interpreted? Once we encounter the first quote, how do we know at which quote the field ends? How do
we know how many quotes are in the field? Or do we just ignore quotes altogether and break on every comma - in which case the above
becomes: 123 | 9" Nail | " | Caliper | "set | up 5"" | hold" | 8
Surely it's easy to see how this is incorrect?
How about your other suggestion about taking fields starting with quotes to be quoted in pairs (even numbers) but otherwise ignoring
odd numbered quotes - how do we know at which quote to stop counting?
Going strictly by your suggestion It might be interpreted as: 123 | 9" Nail | , Caliper,
| up 5"" | hold" | 8 |
or maybe even as: 123 | 9" Nail | , Caliper, set | up 5"" | hold" | 8
| (the interpreter would never be sure)
The interpreter working like this is not "a little bit smarter" but rather is
very much non-conforming.
You have a very specific case of weirdly created data - I already offered help to give you an interpreter specific to it and will be
happy to, but there is no way the SQlite can be altered like this, it will read all other CSV files wrong - and making a case
specifc SQLite is possible (you can alter the C code and compile your own easily), but all future updates will need this adjustment.
Submitting a patch will require your version to work on all other CSV file too - which as desribed above - it won't.
is there no way to fix the output? Or use an in-between step of converting the data before import from the weird layout to proper
CSV? (we can help with that).
Have a great day!
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users