Hi Reinhard - studying this CSV formatting (and file-formatting in general) is somewhat of a hobby of mine and have solved many issues around it, so I can tell you that your request is a border case and found nowhere else in CSV realms. I think that whatever system is making the files you are trying to import is not actually creating CSV files, but just output using it's own format or another related format (Like SYLK etc) that happens to be delimited by commas.

In any official CSV format (and there are more than one, Holland for instance uses semi-colons as delimeters for official CSV) the quotation character is both standard and required. Consider the data section:
[123, "Rue de Caprice, Cordon's est. '99", Deschamps, Bordeaux, "1465""5", 
France]
Without the Quotes, the second field would look like two fields and the 4th field would contain insane amounts of quotes, and this is very VALID CSV, so it is quite impossible to create a CSV file without quoting or with a different quoting format even. Notice also that the fields essentially start with spaces, so the first character of the field is not always a quote in a quoted field value. It could well be a space, control character used in print formatting such as 0x0C or 0x0F (especially used in older dot-matrix applications), it may also start with any whitespace characters such as CR, LF, TAB, VTAB, and of course Space itself or any number/combination of them, followed by a quote which starts the actual field.

It is also fully permissible to have comments in datasets of CSV such as [F1,"F2" This is a comment, F3] which if you put in a CSV file and load into a CSV program, like Excel for instance, would show up as: F1 F2 F3 and nothing else.

I have made quite a few file format interpreters/translators in my life, (even very fast Raw-Delimeted-->SQLite importers) of which I am very happy to send you for free to support whatever app is giving you these output purported to be CSV (but isn't).

A quick check to know the validation, is to simply take whatever output you have, save it to any file with .csv extension and then double-click it which will hopefully open it in Excel or some other Openoffice or whatever proper spreadsheet system you use, and if it isn't loaded correctly in there, then the input format is likely at fault. I know ALL of these will break with an unquoted [ 19" rack ] entry amongst the fields for sure.

May I also ask you do this test before assuming it "an SQLite bug" in future when in fact SQLite works exactly as it should in these cases and, in fact, may not work any different.

Mit vielen dank und freundlichen grüßen
Ryan


On 2013/06/25 11:19, Nißl Reinhard wrote:
Hi,

for example, it cannot import the column value

             19" rack

because it stays in quotation mode until it finds a further ", which is incorrect. 
Quotation mode may only get activated when " appears at the beginning of a column 
value.

Once the line has been broken into column values, the import command properly 
handles that rule when dequoting each column value before inserting the row 
into the table.

Besides this bug, it would be nice if the quotation character could be 
specified like the separator, so that de-/quoting can even be turned off it the 
file hasn't been created appropriately. E. g. sqlserver bulk import does not 
support quoting, hence the files may not be created with quoting turned on.

Mit freundlichen Grüßen / Best regards

Reinhard Nißl
Softwareentwicklung


_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to