Re: [sqlite] Importing data into SQLite - text files are not really portable

2009-09-09 Thread P Kishor
On Wed, Sep 9, 2009 at 10:33 AM, Jim Showalter <j...@jimandlisa.com> wrote:
> Microsoft also supports an XML standard for import/export. Whenever
> possible, it should be used instead of CSV. It's not the
> line-terminators that are the problem--the problem is CSV itself.
> That's why Microsoft upgraded to supporting XML. They only support CSV
> for backward-compatibility reasons.

with reference to our problem here, how does exporting data to XML
help if one wants to then import it into sqlite3? As far as I can
tell, sqlite3 doesn't support importing XML, or does it?


>
> - Original Message -
> From: "Jean-Denis Muys" <jdm...@kleegroup.com>
> To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
> Sent: Wednesday, September 09, 2009 6:09 AM
> Subject: Re: [sqlite] Importing data into SQLite - text files are not
> really portable
>
>
>>
>> On 9/8/09 22:24 , various people wrote:
>>
>>> Unfortunately, the 3 main families of small computer operating
>>> systems
>>> have 3 different definitions of what a text file is...
>>>
>>> DOS/Windows (PC): lines are terminated with CR+LF
>>> Unix: lines are terminated with LF
>>> Macintosh: lines are terminated with CR
>> [...]
>>> FYI: Mac excel does not separate rows with \r, but inserts a ^M
>>> instead.
>> [...]
>>> From: "Kavita Raghunathan" <kavita.raghunat...@skyfiber.com>
>>>
>>> Yes, this works. Must have been my original csv file.
>>> I was using mac based excel and I'll now try using the windows
>>> based
>>> excel.
>>
>> For the record, the Mac has not been using CR line terminations for
>> many
>> years now (2001).
>>
>> Microsoft, in its not very high wisdom, proposes in the latest Mac
>> version
>> of Excel, 4 export formats that might correspond, with useless
>> names:
>>
>> 1- the main format at the top of the menu is named "Comma Separated
>> Values
>> (.csv)"
>> It's a comma -separated, CR-terminated format, with characters
>> encoded in
>> MacRoman ( http://en.wikipedia.org/wiki/Mac_OS_Roman).
>> As both line-termination and character-encoding conventions have
>> stopped
>> being used for many years, it's despicable from Micros__t to
>> continue to
>> name it as they do, and to propose it as the main choice.
>>
>> 2- secondary format named "Tab delimited text (.txt)"
>> Same as 1-, except the separator is now a tab character.
>> Totally useless format
>>
>> 3- secondary format named "Windows Comma-separated (.csv)"
>> Line terminations: CR-LF (Windows convention)
>> Character encoding: CP1502 (Windows extension to ISO-8859-1)
>> This is the most useful format, as it's likely to work for
>> interoperability
>> with the Windows world.
>>
>> 4- secondary format named "MS-Dos Comma-separated (.csv)"
>> Line terminations: CR
>> Character encoding: CP850 ( http://en.wikipedia.org/wiki/CP850)
>> A totally obsolete variant.
>>
>> Notably missing would be for example the *current* version: line
>> terminated
>> with LF and UTF8 encoding.
>>
>> I will stop there. Ranting on MS is bad for my nerves.
>>
>> Jean-Denis
>>
>>
>>
>>
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
Puneet Kishor
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Importing data into SQLite - text files are not really portable

2009-09-09 Thread Jim Showalter
Microsoft also supports an XML standard for import/export. Whenever 
possible, it should be used instead of CSV. It's not the 
line-terminators that are the problem--the problem is CSV itself. 
That's why Microsoft upgraded to supporting XML. They only support CSV 
for backward-compatibility reasons.

- Original Message - 
From: "Jean-Denis Muys" <jdm...@kleegroup.com>
To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
Sent: Wednesday, September 09, 2009 6:09 AM
Subject: Re: [sqlite] Importing data into SQLite - text files are not 
really portable


>
> On 9/8/09 22:24 , various people wrote:
>
>> Unfortunately, the 3 main families of small computer operating 
>> systems
>> have 3 different definitions of what a text file is...
>>
>> DOS/Windows (PC): lines are terminated with CR+LF
>> Unix: lines are terminated with LF
>> Macintosh: lines are terminated with CR
> [...]
>> FYI: Mac excel does not separate rows with \r, but inserts a ^M 
>> instead.
> [...]
>> From: "Kavita Raghunathan" <kavita.raghunat...@skyfiber.com>
>>
>> Yes, this works. Must have been my original csv file.
>> I was using mac based excel and I'll now try using the windows 
>> based
>> excel.
>
> For the record, the Mac has not been using CR line terminations for 
> many
> years now (2001).
>
> Microsoft, in its not very high wisdom, proposes in the latest Mac 
> version
> of Excel, 4 export formats that might correspond, with useless 
> names:
>
> 1- the main format at the top of the menu is named "Comma Separated 
> Values
> (.csv)"
> It's a comma -separated, CR-terminated format, with characters 
> encoded in
> MacRoman ( http://en.wikipedia.org/wiki/Mac_OS_Roman).
> As both line-termination and character-encoding conventions have 
> stopped
> being used for many years, it's despicable from Micros__t to 
> continue to
> name it as they do, and to propose it as the main choice.
>
> 2- secondary format named "Tab delimited text (.txt)"
> Same as 1-, except the separator is now a tab character.
> Totally useless format
>
> 3- secondary format named "Windows Comma-separated (.csv)"
> Line terminations: CR-LF (Windows convention)
> Character encoding: CP1502 (Windows extension to ISO-8859-1)
> This is the most useful format, as it's likely to work for 
> interoperability
> with the Windows world.
>
> 4- secondary format named "MS-Dos Comma-separated (.csv)"
> Line terminations: CR
> Character encoding: CP850 ( http://en.wikipedia.org/wiki/CP850)
> A totally obsolete variant.
>
> Notably missing would be for example the *current* version: line 
> terminated
> with LF and UTF8 encoding.
>
> I will stop there. Ranting on MS is bad for my nerves.
>
> Jean-Denis
>
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Importing data into SQLite - text files are not really portable

2009-09-09 Thread Jean-Denis Muys

On 9/8/09 22:24 , various people wrote:

> Unfortunately, the 3 main families of small computer operating systems
> have 3 different definitions of what a text file is...
> 
> DOS/Windows (PC): lines are terminated with CR+LF
> Unix: lines are terminated with LF
> Macintosh: lines are terminated with CR
[...] 
> FYI: Mac excel does not separate rows with \r, but inserts a ^M instead.
[...]
> From: "Kavita Raghunathan" 
> 
> Yes, this works. Must have been my original csv file.
> I was using mac based excel and I'll now try using the windows based
> excel.

For the record, the Mac has not been using CR line terminations for many
years now (2001).

Microsoft, in its not very high wisdom, proposes in the latest Mac version
of Excel, 4 export formats that might correspond, with useless names:

1- the main format at the top of the menu is named "Comma Separated Values
(.csv)"
It's a comma -separated, CR-terminated format, with characters encoded in
MacRoman ( http://en.wikipedia.org/wiki/Mac_OS_Roman).
As both line-termination and character-encoding conventions have stopped
being used for many years, it's despicable from Micros__t to continue to
name it as they do, and to propose it as the main choice.

2- secondary format named "Tab delimited text (.txt)"
Same as 1-, except the separator is now a tab character.
Totally useless format

3- secondary format named "Windows Comma-separated (.csv)"
Line terminations: CR-LF (Windows convention)
Character encoding: CP1502 (Windows extension to ISO-8859-1)
This is the most useful format, as it's likely to work for interoperability
with the Windows world.

4- secondary format named "MS-Dos Comma-separated (.csv)"
Line terminations: CR
Character encoding: CP850 ( http://en.wikipedia.org/wiki/CP850)
A totally obsolete variant.

Notably missing would be for example the *current* version: line terminated
with LF and UTF8 encoding.

I will stop there. Ranting on MS is bad for my nerves.

Jean-Denis




___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Importing data into SQLite - text files are not really portable

2009-09-08 Thread Robert Citek
Yes, one big, long line.

As for displaying, depending on the program you use, \r may get
displayed as ^M.  For example:

$ echo -ne '\r\n' | cat -A
^M$

$ echo -ne '\r\n' | od -An -abcx
  cr  nl
 015 012
  \r  \n
 0a0d

Regards,
- Robert

> -Original Message-
> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Kavita Raghunathan
> Sent: Tuesday, September 08, 2009 2:14 PM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Importing data into SQLite
>
> FYI: Mac excel does not separate rows with \r, but inserts a ^M instead.
> (I dont have a windows machine with me, I wonder if this is Mac specific)
> Sqlite does not like this because a large file with many rows appears as
> 1 huge infinite line to sqlite.
> Kavita
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Importing data into SQLite - text files are not really portable

2009-09-08 Thread Rich Shepard
On Tue, 8 Sep 2009, Ribeiro, Glauber wrote:

> Unfortunately, the 3 main families of small computer operating systems
> have 3 different definitions of what a text file is...
>
> This causes no end of trouble when moving text files between these kinds
> of systems.

   I've never worked with a Mac so I have no knowledge of them. However, for
years I have used dos2unix (and, occasionally unix2dos) to do the requisite
line termination changes. It's no big deal to run the text file through the
filter.

Rich

-- 
Richard B. Shepard, Ph.D.   |  IntegrityCredibility
Applied Ecosystem Services, Inc.|Innovation
 Voice: 503-667-4517  Fax: 503-667-8863
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Importing data into SQLite - text files are not really portable

2009-09-08 Thread Ribeiro, Glauber
Unfortunately, the 3 main families of small computer operating systems
have 3 different definitions of what a text file is...

DOS/Windows (PC): lines are terminated with CR+LF
Unix: lines are terminated with LF
Macintosh: lines are terminated with CR

This causes no end of trouble when moving text files between these kinds
of systems. 

Java takes the approach that any of (CR, LF, or CR+LF) is a valid line
terminator, but many or perhaps most non-Java software insists on having
the correct line terminator for the platform where it's running.

g


-Original Message-
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Kavita Raghunathan
Sent: Tuesday, September 08, 2009 2:14 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Importing data into SQLite

FYI: Mac excel does not separate rows with \r, but inserts a ^M instead.
(I dont
have a windows machine with me, I wonder if this is Mac specific)
Sqlite does not like this because a large file with many rows appears as
1 
huge infinite line to sqlite.
Kavita
- Original Message -
From: "Kavita Raghunathan" 
To: "General Discussion of SQLite Database" 
Sent: Tuesday, September 8, 2009 11:47:38 AM GMT -08:00 US/Canada
Pacific
Subject: Re: [sqlite] Importing data into SQLite

Yes, this works. Must have been my original csv file. 
I was using mac based excel and I'll now try using the windows based
excel.
Thanks to all for the awesome support.

Kavita


SQLite version 3.3.6
Enter ".help" for instructions
sqlite> create table test(name text, id integer);
sqlite> .separator ","
sqlite> .import data.csv test
sqlite> 
sqlite> select * from test
   ...> ;
"a",1
"b",2
"c",3
sqlite> 

 Original Message -
From: "Robert Citek" 
To: "General Discussion of SQLite Database" 
Sent: Monday, September 7, 2009 9:08:41 AM GMT -08:00 US/Canada Pacific
Subject: Re: [sqlite] Importing data into SQLite

On Sun, Sep 6, 2009 at 9:32 PM, Kavita
Raghunathan wrote:
> Timothy and all,
> When I try to import a .csv, I get a segmentation fault:
> 1) First I set .seperator to ,
> 2) Then I type .import  
> 3) I see "Segmentation fault"
>
> Any ideas ?

Here's an example of how it worked for me.

$ cat data.csv
"a",1
"b",2
"c",3

$ sqlite3 sample.db .schema
CREATE TABLE data (foo text, bar int);

$ sqlite3 -separator , sample.db '.imp "data.csv" "data" '

$ sqlite3 -header -column sample.db 'select * from data ;'
foo bar
--  --
"a" 1
"b" 2
"c" 3

More details here, including caveats:

http://www.sqlite.org/cvstrac/wiki?p=ImportingFiles

Personally, I prefer to used tab-delimited files and then import by
specifying the separator as a tab:

$ sqlite3 -separator $'\t' sample.db '.imp "data.tsv" "data" '

This takes advantage of the bash shell's use of $'\t' to encode a tab.

Regards,
- Robert
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users