Re: [sqlite] Configuring csv extension to use '; ' as column delimiter instead of ',' ?

2018-05-03 Thread Niall O'Reilly
On 1 May 2018, at 15:20, Simon Slavin wrote:

> Seems like the best way to solve this would be to write a converter for 
> Windows which converts SCSV to CSV.  Then it could be used by all Excel users 
> instead of just SQLite users.

As the heavy lifting of implementing Python on Windows has been done already, I 
think
that some wrapping around this might do the job, since the separator is 
parameterized:
https://docs.python.org/3.6/library/csv.html

Like Simon,

> I'd do it myself but I don't use Windows.

8-)

Niall O'Reilly


signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Configuring csv extension to use '; ' as column delimiter instead of ',' ?

2018-05-01 Thread David Burgess
For those of you who use SQLite to prepare CSV for import/open into
Excel beware of this problem:

"Text","Next bit is a reference id","A001"
"text","same again","0009"

On Windows, In the second row, 3rd column Excel will remove the
leading zeroes, if the file has an extension of .csv
The same content in a file with the extension .txt is processed as
csv, but without the removal of the leading zeroes from text values.

This is value translation not formatting. Nightmare.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Configuring csv extension to use '; ' as column delimiter instead of ',' ?

2018-05-01 Thread Peter Da Silva


On 5/1/18, 1:42 PM, "sqlite-users on behalf of R Smith" 
 wrote:
My point is that CSV was not necessarily "meant" to be what you say. Who 
exactly "meant" for it to be that? Because the official stuff makes no 
such claim or mention.

Bah. Existential shenanigans. There's probably some pseudo-Latin or German term 
for what I meant.

Your point is however more towards the WHY question, and no, I am not 
sure. But, you know, it's Microsoft - after they made "locales" they 
probably wanted someone to use it, so they bullied Excel division into 
it. :)

From what I've heard the Excel division is the least bulliable part of the 
company. Didn't they maintain their own C compiler for a while to cut down on 
dependencies on the rest of Microsoft? 

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Configuring csv extension to use '; ' as column delimiter instead of ',' ?

2018-05-01 Thread R Smith


On 2018/05/01 8:21 PM, Peter Da Silva wrote:

On 5/1/18, 1:15 PM, "sqlite-users on behalf of R Smith" 
 wrote:

 On 1 May 2018, at 6:43pm, Peter Da Silva  
wrote:
 
 > CSV is an interchange format, it's for software to communicate with other software, so the syntax needs to be independent of the locale since you don't know if the sender and recipient are in the same locale. Field separator is syntax, so the locale settings should not have any effect on it.
 
 That is a very bold and idealistic claim.


I'm not claiming what you think I'm claiming. All I'm saying is that CSV is meant to 
communicate with software, not humans. Locales are there for humans. Using "the 
locale says list separates are semicolons" as a justification for sometimes using 
semicolons in an interchange format instead of commas doesn't make sense.


My point is that CSV was not necessarily "meant" to be what you say. Who 
exactly "meant" for it to be that? Because the official stuff makes no 
such claim or mention.


That said, I'm agreeing, it is nowadays very much used as a data 
exchange, no modern users read CSV files for a hobby, and Excel (or any 
other software) using the locale for "List separator" inside CSV files 
should probably turn a new page and stop doing that - I feel it will be 
better, I'm just not as convinced as you that my opinion on this is 
ultimate.




Do you actually have a reference for that being the reason for Excel using 
semicolons?


I have zero proof that it is WHY they do it, but I have proof that it IS 
what they do - the verification is rather easy, you can change your 
windows locale setting for List separator and see for yourself. (The 
record separator in turn affects the records, but mostly that is CRLF in 
all locales). You might need to restart Excel after changing locale 
settings.


Your point is however more towards the WHY question, and no, I am not 
sure. But, you know, it's Microsoft - after they made "locales" they 
probably wanted someone to use it, so they bullied Excel division into 
it. :)





___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Configuring csv extension to use '; ' as column delimiter instead of ',' ?

2018-05-01 Thread Peter Da Silva
On 5/1/18, 1:15 PM, "sqlite-users on behalf of R Smith" 
 wrote:

On 1 May 2018, at 6:43pm, Peter Da Silva  
wrote:

> CSV is an interchange format, it's for software to communicate with other 
software, so the syntax needs to be independent of the locale since you don't 
know if the sender and recipient are in the same locale. Field separator is 
syntax, so the locale settings should not have any effect on it.

That is a very bold and idealistic claim.

I'm not claiming what you think I'm claiming. All I'm saying is that CSV is 
meant to communicate with software, not humans. Locales are there for humans. 
Using "the locale says list separates are semicolons" as a justification for 
sometimes using semicolons in an interchange format instead of commas doesn't 
make sense. Do you actually have a reference for that being the reason for 
Excel using semicolons?

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Configuring csv extension to use '; ' as column delimiter instead of ',' ?

2018-05-01 Thread R Smith

On 1 May 2018, at 6:43pm, Peter Da Silva  wrote:


CSV is an interchange format, it's for software to communicate with other 
software, so the syntax needs to be independent of the locale since you don't 
know if the sender and recipient are in the same locale. Field separator is 
syntax, so the locale settings should not have any effect on it.


That is a very bold and idealistic claim. Not even the person who wrote 
the currently used CSV standard (RFC4180 anyway) claims such things - in 
fact he is very quick to point out that it isn't a standard but more of 
a "commonly used format" which did not have official documentation and 
so he decided to write up some.  But I am on your side of this war - I 
feel it "should" be standard, it's just that since it isn't a standard, 
it's hard to lord it over others doing differently.


There have been very many custom implementations (complete with 
documentation) for many of the CSV formats around, but it never really 
progressed to a standard - which is why I assume so much trouble exists.


Probably my favourite line from RFC4180 is to Implementors: "be 
conservative in what you do, be liberal in what you accept from others".



While on the subject, when I was doing the converter mentioned earlier, 
I had researched the CSV thing trying to figure out a rule-book (with 
very little success), but I did make this little test CSV file handy to 
push through all the CSV importers/readers to see who does it how.


I managed to find it now again and thought I would post it for fun. I 
have a result set that is correct according to RFC4180 (and yes, SQLite 
gets it right) but as a fun exercise to the reader, put this text in a 
file named something.csv and open it in some CSV readers.


For more fun, try to predict the outcome before starting.

What I have found is that there are nearly as many differences in the 
interpretation as there are CSV interpreters.



csv file--
Item,Value1, Value 2
1,This is a non-quoted string.,"This is a ""quoted"" one"
2,"This contains a comma ',' and some single quotes.",This one doesn't
3,"Magic disappearing value after quoted value" This text should be 
hidden, This text must be visible.

4,"This is all on one line","This string goes
over
3 lines."
5, "The value to my right is empty, see how Excel gets this wrong!",
6, The,End
eof--

PS: According to the RFC, the last record of the last row may or may not 
contain an ending . I always add one since it makes appending the 
file easier later.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Configuring csv extension to use '; ' as column delimiter instead of ',' ?

2018-05-01 Thread Simon Slavin
On 1 May 2018, at 6:43pm, Peter Da Silva  wrote:

> CSV is an interchange format, it's for software to communicate with other 
> software, so the syntax needs to be independent of the locale since you don't 
> know if the sender and recipient are in the same locale. Field separator is 
> syntax, so the locale settings should not have any effect on it.

The problem is not a file format with semicolons in.  The problem is calling it 
"CSV" or allowing such files to be saved with ".csv" extensions.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Configuring csv extension to use '; ' as column delimiter instead of ',' ?

2018-05-01 Thread Peter Da Silva
To another post hating on Excel - Excel has many flaws, but this is not 
one of them, it's a fault of the list-separator setting in the Windows 
OS on which the Excel runs.

CSV is an interchange format, it's for software to communicate with other 
software, so the syntax needs to be independent of the locale since you don't 
know if the sender and recipient are in the same locale. Field separator is 
syntax, so the locale settings should not have any effect on it.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Configuring csv extension to use '; ' as column delimiter instead of ',' ?

2018-05-01 Thread R Smith



On 2018/05/01 4:20 PM, Simon Slavin wrote:

On 1 May 2018, at 3:01pm, Olivier Mascia  wrote:


My question was more generic, even though it didn't look that way: the well-known and (maybe too) 
much-used software tool named Excel tend to encourage people to export "CSV" files which 
are actually "SCSV" files (semi-colon separated values). So the need to script some data 
happens regularly.

What kind of insanity would lead someone to invent semi-colon delimited format 
when CSV already existed ?  I bet it was a badly-thought-out solution for 
European numeric formats which use a comma as a decimal point.  Tell whoever 
uses that format to stop.


*Sigh* If only...

There are only a couple of Countries using semi-colon list separators as 
far as I am aware, but the problem is that one of those is Holland - not 
exactly some third-World stow-away island that can be ignored. A lot of 
the things we make had to be Holland-friendly and so allowing 
semi-colon-happy CSV files became the norm.


I have successfully made a simple converter to read semi-colon CSV files 
(in fact, to detect which separator is used), and convert if needed, 
which does the trick, though it makes some important assumptions. (I'm 
willing to share if interested, but it's a Windoze CLI, mail me direct 
if needed).


To another post hating on Excel - Excel has many flaws, but this is not 
one of them, it's a fault of the list-separator setting in the Windows 
OS on which the Excel runs... It's also a flaw of software anyone may 
make, where it will produce one sort of output in one Country and a 
different kind in another, if you bother to use the locale settings. 
Also, the doubled-double-Quote char escaping is the law of the CSV 
standard, not of Excel, and backslash escaping is by no means more 
common or more or less correct. All of CSV and all of SQL use doubled-up 
quotes to escape strings - which must make it vastly more common than 
anything else.


To make matters worse, some enlightened Dutchmen realise this semi-colon 
insanity and so cleverly set their own computer's locale to use the 
comma in stead of the semi-colon. So now you cannot trust that it will 
use the semi-colon either - you HAVE to manually check and act accordingly.


Some other countries have other problems, like setting the decimal 
separator to a comma - imagine that in a CSV file - completely breaking 
it unless you cleverly Quote all the numeric values too, or ignore that 
local setting too  Just silly.



I agree - someone needs to be shot for this. I just can't really figure 
out who.




___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Configuring csv extension to use '; ' as column delimiter instead of ',' ?

2018-05-01 Thread Luuk


On 1-5-2018 16:20, Simon Slavin wrote:
> On 1 May 2018, at 3:01pm, Olivier Mascia  wrote:
>
>> My question was more generic, even though it didn't look that way: the 
>> well-known and (maybe too) much-used software tool named Excel tend to 
>> encourage people to export "CSV" files which are actually "SCSV" files 
>> (semi-colon separated values). So the need to script some data happens 
>> regularly.
> What kind of insanity would lead someone to invent semi-colon delimited 
> format when CSV already existed ?  I bet it was a badly-thought-out solution 
> for European numeric formats which use a comma as a decimal point.  Tell 
> whoever uses that format to stop.
>
> Seems like the best way to solve this would be to write a converter for 
> Windows which converts SCSV to CSV.  Then it could be used by all Excel users 
> instead of just SQLite users.  Take a look at
>
> 
>
> 
>
> and hack up a solution.  I'd do it myself but I don't use Windows.
>
> Simon.
>

You 'forgot' this link:
https://answers.microsoft.com/en-us/msoffice/forum/msoffice_excel-mso_windows8-mso_2010/sep-support-for-older-excel/293076a3-6593-4ceb-8167-d29aa3418773

;)


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Configuring csv extension to use '; ' as column delimiter instead of ',' ?

2018-05-01 Thread Peter Da Silva
Having tried to write a generic clean HANDLES ALL CSV reader for speedtables, I 
kind of want to burn Excel with nuclear fire, but that's a side issue. :)

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Configuring csv extension to use '; ' as column delimiter instead of ',' ?

2018-05-01 Thread dave
> From: sqlite-users 
> [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
> Behalf Of Olivier Mascia
> 
> Considering:
> 
> CREATE VIRTUAL TABLE temp.t1 USING csv(filename='thefile.csv');
> 
> Is there any way to teach the csv extension to use ';' 
> instead of ',' as the column delimiter, getting away from the 
> strict RFC4180 definition?
 
While on the topic of Excel generated CSV, I had occaision to do this a
year-and-a-half ago, so let me forewarn you of a couple other things:
*  quoting; excel will sometimes enclose a field in quotes, and sometimes
not, depending on the content therein
*  quoting; excel uses 'double quoting' for 'escaping'; i.e. two quotation
marks in a row are interpreted as a quote char, rather than the arguably
more common backslash style escaping
*  multi-line fields;  your excel document can have cells which contain
multiple lines (i.e. embedded carriage returns)

Anyway, all this led me in my case to write a state machine to read in a
logical 'line' of text, and crack it into fields.

That being said, just now taking a peek at the current implementaion in the
sqlite codebase: 

ext\misc\csv.c:197

Is a commment

/* Read a single field of CSV text.  Compatible with rfc4180 and extended
** with the option of having a separator other than ",".

So maybe that implementation is already enhanced to accommodate those cases,
but the capability is simply not exposed through parameters to the
CREATE VIRTUAL TABLE CSV (...)

So maybe one would just need to modify the csvtabCreate to process some
additional parameters and propagate those settings to the implementation.

-dave


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Configuring csv extension to use '; ' as column delimiter instead of ',' ?

2018-05-01 Thread Simon Slavin
On 1 May 2018, at 3:01pm, Olivier Mascia  wrote:

> My question was more generic, even though it didn't look that way: the 
> well-known and (maybe too) much-used software tool named Excel tend to 
> encourage people to export "CSV" files which are actually "SCSV" files 
> (semi-colon separated values). So the need to script some data happens 
> regularly.

What kind of insanity would lead someone to invent semi-colon delimited format 
when CSV already existed ?  I bet it was a badly-thought-out solution for 
European numeric formats which use a comma as a decimal point.  Tell whoever 
uses that format to stop.

Seems like the best way to solve this would be to write a converter for Windows 
which converts SCSV to CSV.  Then it could be used by all Excel users instead 
of just SQLite users.  Take a look at





and hack up a solution.  I'd do it myself but I don't use Windows.

Simon.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Configuring csv extension to use '; ' as column delimiter instead of ',' ?

2018-05-01 Thread Olivier Mascia
> Le 1 mai 2018 à 14:00, Simon Slavin  a écrit :
> 
>> CREATE VIRTUAL TABLE temp.t1 USING csv(filename='thefile.csv');
>> 
>> Is there any way to teach the csv extension to use ';' instead of ',' as the 
>> column delimiter, getting away from the strict RFC4180 definition?
> 
> The source code for the csv extension is in the archive.  Make your own 
> version called 'ssv'.
> 
> Or, if you can rely on strings in the file not having commas in them, use a 
> simple shell tool to replace all commas with semicolons.

Thanks a lot for the answer Simon, even though I knew of these solutions. :)
My question was more generic, even though it didn't look that way: the 
well-known and (maybe too) much-used software tool named Excel tend to 
encourage people to export "CSV" files which are actually "SCSV" files 
(semi-colon separated values). So the need to script some data happens 
regularly. Very easy for programmers like you and me. A bit harder for people 
just trained enough for some simple sqlite command-line interactions, they 
might not have the right tool / script-languages knowledge to do it easily or 
more importantly to do it right.
Nothing really important.  Just wanted to trigger some thinking.

Again, thank you.
-- 
Best Regards, Meilleures salutations, Met vriendelijke groeten,
Olivier Mascia


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Configuring csv extension to use '; ' as column delimiter instead of ',' ?

2018-05-01 Thread Simon Slavin
On 1 May 2018, at 11:11am, Olivier Mascia  wrote:

> CREATE VIRTUAL TABLE temp.t1 USING csv(filename='thefile.csv');
> 
> Is there any way to teach the csv extension to use ';' instead of ',' as the 
> column delimiter, getting away from the strict RFC4180 definition?

The source code for the csv extension is in the archive.  Make your own version 
called 'ssv'.

Or, if you can rely on strings in the file not having commas in them, use a 
simple shell tool to replace all commas with semicolons.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Configuring csv extension to use '; ' as column delimiter instead of ',' ?

2018-05-01 Thread Olivier Mascia
Considering:

CREATE VIRTUAL TABLE temp.t1 USING csv(filename='thefile.csv');

Is there any way to teach the csv extension to use ';' instead of ',' as the 
column delimiter, getting away from the strict RFC4180 definition?

-- 
Best Regards, Meilleures salutations, Met vriendelijke groeten,
Olivier Mascia


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users