Re: [sqlite] UTF8 and NUL

2018-01-26 Thread Gary R. Schmidt
On 27/01/2018 05:32, Peter Da Silva wrote: On 1/26/18, 12:31 PM, "sqlite-users on behalf of J Decker" wrote: ctrl-z was end of file text character in DOS (wrote char 26; not FF) DOS wasn't an operating system. That will come as a surprise to the people who used DOS/360 and DOS/VSE and th

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread petern
tyle-Wide-String is defined as a >> > > "bunch-a-non-zero-words-terminated-by-a-zero-word", then how is it >> > > possible to have a zero/null word "embedded" within a >> > C-Style-Wide-String? >> > > >> > > Given that SQLite3 i

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread J Decker
e String? > > > > > > > > Similarly, if a C-Style-Wide-String is defined as a > > > > "bunch-a-non-zero-words-terminated-by-a-zero-word", then how is it > > > > possible to have a zero/null word "embedded" within a > > > C-Style-Wide-String? > > &

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread petern
es C-Strings or > > > C-Style-Wide-Strings, then you cannot have zero/null bytes embedded in > > > those strings. > > > > > > You may of course argue that perhaps SQLite3 should use something other > > > than C-Style-Strings, however, this is not what seems

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread J Decker
posed. It > > seems to be proposing the use of some magical C-Style-String that is not > > actually a C-Style-String, without explicitly stating this. > > > > SQLite3 does handle non-C-Ctyle-Strings. They are called "blobs". > > > > --- > > The fact that th

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread petern
to Hell but only a Stairway to Heaven says > a lot about anticipated traffic volume. > > > >-Original Message- > >From: sqlite-users [mailto:sqlite-users- > >boun...@mailinglists.sqlite.org] On Behalf Of J Decker > >Sent: Friday, 26 January, 2018 17:18 > >

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread J Decker
r a decade or more as char. > --- > The fact that there's a Highway to Hell but only a Stairway to Heaven says > a lot about anticipated traffic volume. > > > >-Original Message- > >From: sqlite-users [mailto:sqlite-users- > >boun...@mailinglists.sqlite.org] On Be

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread Keith Medcalf
On Behalf Of J Decker >Sent: Friday, 26 January, 2018 17:18 >To: SQLite mailing list >Subject: Re: [sqlite] UTF8 and NUL > >On Fri, Jan 26, 2018 at 3:56 PM, Peter Da Silva < >peter.dasi...@flightaware.com> wrote: > >> On 2018-01-26, at 17:05, J Decker wrote:

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread J Decker
On Fri, Jan 26, 2018 at 3:56 PM, Peter Da Silva < peter.dasi...@flightaware.com> wrote: > On 2018-01-26, at 17:05, J Decker wrote: > > On Fri, Jan 26, 2018 at 1:21 PM, Peter Da Silva < > > peter.dasi...@flightaware.com> wrote: > >> Sqlite uses NUL as the string terminator internally, the publishe

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread Peter Da Silva
On 2018-01-26, at 17:05, J Decker wrote: > On Fri, Jan 26, 2018 at 1:21 PM, Peter Da Silva < > peter.dasi...@flightaware.com> wrote: >> Sqlite uses NUL as the string terminator internally, the published API >> specifies has stuff like this all over the place: >>> In those routines that have a fou

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread J Decker
On Fri, Jan 26, 2018 at 1:21 PM, Peter Da Silva < peter.dasi...@flightaware.com> wrote: > Sqlite uses NUL as the string terminator internally, the published API > specifies has stuff like this all over the place: > > > In those routines that have a fourth argument, its value is the number > of byt

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread Simon Slavin
On 26 Jan 2018, at 9:04pm, J Decker wrote: > I bet windows command line tools still use it because copy has /B and /A on > windows 10. Windows is indeed a problem. I don't know enough about it to know whether the above statement outlines the problem but Windows in general is terrifically diff

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread Peter Da Silva
Sqlite uses NUL as the string terminator internally, the published API specifies has stuff like this all over the place: > In those routines that have a fourth argument, its value is the number of > bytes in the parameter. To be clear: the value is the number of bytes in the > value, not the nu

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread J Decker
On Fri, Jan 26, 2018 at 11:41 AM, Peter Da Silva < peter.dasi...@flightaware.com> wrote: > On 1/26/18, 1:37 PM, "sqlite-users on behalf of J Decker" < > sqlite-users-boun...@mailinglists.sqlite.org on behalf of d3c...@gmail.com> > wrote: > >doesn't get 26 either. 0x1a > > 26 isn't EOF, it's SU

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread Peter Da Silva
On 1/26/18, 2:34 PM, "sqlite-users on behalf of J. King" wrote: > Do you have a point in making either statement? If you do, I'm really not > seeing it. The point is that apart from CP/M and derivatives like DOS, this kind of behavior is strictly a leftover from the '60s. And CP/M only had th

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread J. King
On 2018-01-26 15:13:46, "Peter Da Silva" wrote: On 1/26/18, 2:11 PM, "sqlite-users on behalf of John McKown" john.archie.mck...@gmail.com> wrote: In the distant past (CP/M-80), the filesystem meta data did not include the actual _length_ of the data for a text data file. Since DOS wasn't a

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread Peter Da Silva
On 1/26/18, 2:11 PM, "sqlite-users on behalf of John McKown" wrote: > In the distant past (CP/M-80), the filesystem meta data did not include the > actual _length_ of the data for a text data file. Since DOS wasn't an OS, then CP/M certainly wasn't.

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread John McKown
On Fri, Jan 26, 2018 at 1:41 PM, Peter Da Silva < peter.dasi...@flightaware.com> wrote: > On 1/26/18, 1:37 PM, "sqlite-users on behalf of J Decker" < > sqlite-users-boun...@mailinglists.sqlite.org on behalf of d3c...@gmail.com> > wrote: > >doesn't get 26 either. 0x1a > > 26 isn't EOF, it's SUB (su

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread Peter Da Silva
On 1/26/18, 1:37 PM, "sqlite-users on behalf of J Decker" wrote: >doesn't get 26 either. 0x1a 26 isn't EOF, it's SUB (substitute). It was used to represent untranslatable characters when converting (for example) EBCDIC to ASCII.

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread J Decker
On Fri, Jan 26, 2018 at 10:44 AM, Peter Da Silva < peter.dasi...@flightaware.com> wrote: > On 1/26/18, 12:40 PM, "sqlite-users on behalf of J Decker" < > sqlite-users-boun...@mailinglists.sqlite.org on behalf of d3c...@gmail.com> > wrote: > > reads the bytes and does things with them. the EOF wo

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread Peter Da Silva
On 1/26/18, 12:40 PM, "sqlite-users on behalf of J Decker" wrote: > reads the bytes and does things with them. the EOF would get returned with > fgetc() but not the character. fgetc() returns an int, not a byte. That EOF is -1, not 0xFF.

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread J Decker
On Fri, Jan 26, 2018 at 10:35 AM, Tim Streater wrote: > On 26 Jan 2018, at 18:12, Keith Medcalf wrote: > > > Actually, EOF (0xFF) *is* part of a text file, and is the byte in an > ASCII > > byte-stream that indicates end-of-file. > > First I've heard of that. Which systems did that then? EOF is

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread Tim Streater
On 26 Jan 2018, at 18:12, Keith Medcalf wrote: > Actually, EOF (0xFF) *is* part of a text file, and is the byte in an ASCII > byte-stream that indicates end-of-file. First I've heard of that. Which systems did that then? EOF is normally indicated by the file system, not by file data. -- Chee

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread Peter Da Silva
On 1/26/18, 12:31 PM, "sqlite-users on behalf of J Decker" wrote: > ctrl-z was end of file text character in DOS (wrote char 26; not FF) DOS wasn't an operating system.

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread J Decker
On Fri, Jan 26, 2018 at 10:22 AM, Peter Da Silva < peter.dasi...@flightaware.com> wrote: > On 1/26/18, 12:12 PM, "sqlite-users on behalf of Keith Medcalf" < > sqlite-users-boun...@mailinglists.sqlite.org on behalf of > kmedc...@dessus.com> wrote: > > Actually, EOF (0xFF) *is* part of a text file,

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread Peter Da Silva
On 1/26/18, 12:12 PM, "sqlite-users on behalf of Keith Medcalf" wrote: > Actually, EOF (0xFF) *is* part of a text file, and is the byte in an ASCII > byte-stream that indicates end-of-file. In the "old days" the bytes > following the last-byte in a stream and the end of a storage block > (se

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread Keith Medcalf
:) ). --- The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume. >-Original Message- >From: sqlite-users [mailto:sqlite-users- >boun...@mailinglists.sqlite.org] On Behalf Of Peter Da Silva >Sent: Friday, 26 Janua

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread J Decker
On Fri, Jan 26, 2018 at 5:55 AM, Peter Da Silva < peter.dasi...@flightaware.com> wrote: > What is the goal of this discussion? Changing the string terminator SQLite > uses? I think it's almost 50 years too late for that, but I'm sure that if > Unicode and UTF8 had been a thing in 1970 then C would

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread Peter Da Silva
On 1/26/18, 8:24 AM, "sqlite-users on behalf of Gary R. Schmidt" wrote: > But how would you differentiate EOF??? (Let me guess, 0. :-) ) End of file is not part of the contents of the file or a string. It's metadata.
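The point that end of file is metadata rather than a byte value can be illustrated with a small Python sketch. It uses an in-memory stream in place of a real file, and the helper name `read_all_bytes` is made up for this example. Bytes such as 0x1A (SUB, Ctrl-Z) and 0xFF come back as ordinary data; the end of the stream is signalled separately, by an empty read.

```python
import io

def read_all_bytes(stream):
    """Read a binary stream one byte at a time until it is exhausted."""
    values = []
    while True:
        chunk = stream.read(1)
        if chunk == b"":      # empty read: end of stream, not a data byte
            break
        values.append(chunk[0])
    return values

# 0x1A and 0xFF are just data here; neither one ends the stream.
data = bytes([0x41, 0x1A, 0xFF, 0x42])
values = read_all_bytes(io.BytesIO(data))
```

This mirrors the design of C's fgetc(), which returns an int so that the end-of-file indicator can lie outside the range of valid byte values.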

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread Gary R. Schmidt
On 27/01/2018 00:55, Peter Da Silva wrote: What is the goal of this discussion? Changing the string terminator SQLite uses? I think it's almost 50 years too late for that, but I'm sure that if Unicode and UTF8 had been a thing in 1970 then C would have selected FF as the string terminator. But

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread Peter Da Silva
What is the goal of this discussion? Changing the string terminator SQLite uses? I think it's almost 50 years too late for that, but I'm sure that if Unicode and UTF8 had been a thing in 1970 then C would have selected FF as the string terminator.

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread Clemens Ladisch
J Decker wrote: > U+009C 156 String Terminator ST "ST is used as the closing delimiter of a control string opened by APPLICATION PROGRAM COMMAND (APC), DEVICE CONTROL STRING (DCS), OPERATING SYSTEM COMMAND (OSC), PRIVACY MESSAGE (PM), or START OF STRING (SOS)." Regards, Clemens

Re: [sqlite] UTF8 and NUL

2018-01-26 Thread J Decker
https://en.wikipedia.org/wiki/List_of_Unicode_characters#Control_codes Even the Control codes within unicode aren't FF. U+009C 156 String Terminator ST literal bytes \xC2\x9c are string terminator ... Was thinking that like APC and ST were higher than that... more in the range of 0xF8-0xFF On

[sqlite] UTF8 and NUL

2018-01-25 Thread J Decker
NUL is a valid utf8 character but FF is never valid. (would be like a 36 bit length specification) and practically anything more than F8 is an invalid utf8 character. Other than BOM https://en.wikipedia.org/wiki/Byte_order_mark#UTF-8 EF BB BF 239 187 191 // EF - 80 | 3b - 80 | 3f ( 0xfeff ) Many W
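The byte-level claims in this post can be checked with a short Python sketch. Python's strict built-in UTF-8 decoder stands in for a validator, and the helper name `is_valid_utf8` is made up for the example. Under the current definition of UTF-8 (limited to U+10FFFF), the bytes 0xF5 through 0xFF never occur in well-formed text, which is a slightly wider range than the post mentions.

```python
import codecs

def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# NUL is an ordinary one-byte sequence encoding U+0000.
nul_ok = is_valid_utf8(b"\x00")

# 0xFF is rejected, as is every byte from 0xF5 upward.
ff_ok = is_valid_utf8(b"\xff")
high_bytes_ok = [is_valid_utf8(bytes([b])) for b in range(0xF5, 0x100)]

# The UTF-8 BOM is the three bytes EF BB BF, which decode to U+FEFF.
bom_bytes = codecs.BOM_UTF8
bom_char = bom_bytes.decode("utf-8")
```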

Re: [sqlite] UTF8-BOM and text encoding detection (was: UTF8-BOM not disregarded in CSV import)

2017-06-29 Thread Tim Streater
On 29 Jun 2017 at 08:01, Eric Grange wrote: >> The sender, however, could be lying, and this needs to be considered > > This is an orthogonal problem: if the sender is sending you data that is > not what it should be, then he could just as well be sending you > well-encoded and well-formed but in

Re: [sqlite] UTF8-BOM and text encoding detection (was: UTF8-BOM not disregarded in CSV import)

2017-06-29 Thread Eric Grange
> The sender, however, could be lying, and this needs to be considered This is an orthogonal problem: if the sender is sending you data that is not what it should be, then he could just as well be sending you well-encoded and well-formed but invalid data, or malware, or confidential/personal data

Re: [sqlite] UTF8-BOM and text encoding detection (was: UTF8-BOM not disregarded in CSV import)

2017-06-28 Thread Tim Streater
On 28 Jun 2017 at 14:20, Rowan Worth wrote: > On 27 June 2017 at 18:42, Eric Grange wrote: > >> So while in theory all the scenarios you describe are interesting, in >> practice seeing an utf-8 BOM provides an extremely >> high likeliness that a file will indeed be utf-8. Not always, but a memor

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-27 Thread Scott Robison
On Tue, Jun 27, 2017 at 4:18 AM, Richard Hipp wrote: > The CSV import feature of the SQLite command-line shell expects to > find UTF-8. It does not understand other encodings, and I have no > plans to add converters for alternative encodings any time soon. > > The latest version of trunk skips ov

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-27 Thread Mahmoud Al-Qudsi
Thank you. From: sqlite-users on behalf of Richard Hipp Sent: Tuesday, June 27, 2017 5:18:51 AM To: SQLite mailing list Subject: Re: [sqlite] UTF8-BOM not disregarded in CSV import The CSV import feature of the SQLite command-line shell expects to find UTF-8

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-27 Thread Richard Hipp
The CSV import feature of the SQLite command-line shell expects to find UTF-8. It does not understand other encodings, and I have no plans to add converters for alternative encodings any time soon. The latest version of trunk skips over a UTF-8 BOM at the beginning of the input file. -- D. Richa
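As an illustration of skipping a leading UTF-8 BOM before parsing CSV, here is a minimal Python sketch. It is not the SQLite shell's code; the function name `read_csv_skipping_bom` is hypothetical, and it assumes the input is UTF-8, as the post says the importer expects.

```python
import codecs
import csv
import io

def read_csv_skipping_bom(raw: bytes) -> list:
    """Parse UTF-8 CSV bytes, dropping a BOM only if it is the very first thing."""
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    text = raw.decode("utf-8")
    # newline="" lets the csv module handle line endings itself.
    return list(csv.reader(io.StringIO(text, newline="")))

with_bom = b"\xef\xbb\xbfid,name\r\n1,Tom\r\n"
without_bom = b"id,name\r\n1,Tom\r\n"
rows_a = read_csv_skipping_bom(with_bom)
rows_b = read_csv_skipping_bom(without_bom)
```

Without the BOM check, the first header cell of `with_bom` would come back as "\ufeffid", which is the symptom reported at the start of this thread.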

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-27 Thread Cezary H. Noweta
Hello, On 2017-06-26 17:26, Scott Robison wrote: +1 FAQ quote: Q: When a BOM is used, is it only in 16-bit Unicode text? A: No, a BOM can be used as a signature no matter how the Unicode text is transformed: UTF-16, UTF-8, or UTF-32. Q: How I should deal with BOMs? A: Here are some g

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-27 Thread Cezary H. Noweta
On 2017-06-26 15:01, jose isaias cabrera wrote: I have made a decision to always include the BOM in all my text files whether they are UTF8, UTF16 or UTF32 little or big endian. I think all of us should also. I'm sorry, if I introduced ambiguity, but I had described SQLite's and SQLite shell'

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-26 Thread Scott Robison
On Jun 26, 2017 9:02 AM, "Simon Slavin" wrote: There is no convention for "This software understands both UTF-16BE and UTF-16LE but nothing else.". If it handles any BOMs, it should handle all five. However, it can handle them by identifying, for example, UTF-32BE and returning an error indicat

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-26 Thread Peter da Silva
I didn’t mean to imply you had to scan the whole content for a BOM, but rather for illegal characters in the absence of a BOM. On 6/26/17, 10:02 AM, "sqlite-users on behalf of Simon Slavin" wrote: Folks, I’m sorry to interrupt but I’ve just woken up to 11 posts in this thread and I see a

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-26 Thread Scott Robison
On Jun 26, 2017 4:05 AM, "Rowan Worth" wrote: On 26 June 2017 at 16:55, Scott Robison wrote: > Byte Order Mark isn't perfectly descriptive when used with UTF-8. Neither > is dialing a cell phone. Language evolves. > It's not descriptive in the slightest because UTF-8's byte order is *specified

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-26 Thread Simon Slavin
Folks, I’m sorry to interrupt but I’ve just woken up to 11 posts in this thread and I see a lot of inaccurate 'facts' posted here. Rather than pick up on statements in individual posts (which would unfairly pick on some people as being less accurate than others) I’d like to post facts straight

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-26 Thread Peter da Silva
Just occurred to me: another problem with the BOM is that some people who are *not* writing UTF-8 are cargo-culting the BOM in anyway. So you may have to scan the whole file to see if it’s really UTF-8 anyway. You’re better off just assuming UTF-8 everywhere, generating an error (and backing ou

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-26 Thread jose isaias cabrera
At the bottom... -Original Message- From: Eric Grange Sent: Monday, June 26, 2017 3:09 AM To: SQLite mailing list Subject: Re: [sqlite] UTF8-BOM not disregarded in CSV import Alas, there is no end in sight to the pain for the Unicode decision to not make the BOM compulsory for UTF-8

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-26 Thread Peter da Silva
On 6/26/17, 2:09 AM, "sqlite-users on behalf of Eric Grange" wrote: > Alas, there is no end in sight to the pain for the Unicode decision to not > make the BOM compulsory for UTF-8. It’s not actually providing any “byte order” information. It’s only used for round-tripping conversion from oth

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-26 Thread Richard Damon
On 6/26/17 3:09 AM, Eric Grange wrote: Alas, there is no end in sight to the pain for the Unicode decision to not make the BOM compulsory for UTF-8. Making it optional or non-necessary basically made every single text file ambiguous, with non-trivial heuristics and implicit conventions required

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-26 Thread Eric Grange
>Easily solved by never including a superflous BOM in UTF-8 text And that easy option has worked beautifully for 20 years... not. Yes, BOM is a misnommer, yes it "wastes" 3 bytes, but in the real world "text files" have a variety of encodings. No BOM = you have to fire a whole suite of heuristics

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-26 Thread Rowan Worth
On 26 June 2017 at 16:55, Scott Robison wrote: > Byte Order Mark isn't perfectly descriptive when used with UTF-8. Neither > is dialing a cell phone. Language evolves. > It's not descriptive in the slightest because UTF-8's byte order is *specified by the encoding*. I'm not advocating one way

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-26 Thread Scott Robison
On Jun 25, 2017 1:16 PM, "Cezary H. Noweta" wrote: Certainly, there are no objections to extend an import's functionality in such a way that it ignores the initial 0xFEFF. However, an import should allow ZWNBSP as the first character, in its basic form, to be conforming to the standard. If we'

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-26 Thread Scott Robison
On Jun 26, 2017 1:47 AM, "Rowan Worth" wrote: On 26 June 2017 at 15:09, Eric Grange wrote: > Alas, there is no end in sight to the pain for the Unicode decision to not > make the BOM compulsory for UTF-8. > UTF-8 is byte oriented. The very concept of byte order is nonsense in this context as t

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-26 Thread Rowan Worth
On 26 June 2017 at 15:09, Eric Grange wrote: > Alas, there is no end in sight to the pain for the Unicode decision to not > make the BOM compulsory for UTF-8. > UTF-8 is byte oriented. The very concept of byte order is nonsense in this context as there are no multi-byte storage primitives to worr

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-26 Thread J Decker
On Sun, Jun 25, 2017 at 12:16 PM, Cezary H. Noweta wrote: > Hello, > > > The standard says: ``Only UTF-16/32 (even not UTF-16/32LE/BE) encoding > forms can contain BOM''. Let's conform to this. > > I concur with that. Since UTF-8 is only bytes; what would a BOM even change? certainly longer val

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-26 Thread Eric Grange
Alas, there is no end in sight to the pain for the Unicode decision to not make the BOM compulsory for UTF-8. Making it optional or non-necessary basically made every single text file ambiguous, with non-trivial heuristics and implicit conventions required instead, resulting in character corruptio

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-25 Thread Cezary H. Noweta
Hello, On 2017-06-23 22:12, Mahmoud Al-Qudsi wrote: I think you and I are on the same page here, Clemens? I abhor the BOM, but the question is whether or not SQLite will cater to the fact that the bigger names in the industry appear hell-bent on shoving it in users’ documents by default. Give

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-23 Thread Mahmoud Al-Qudsi
” commands, perhaps leeway can be shown in breaking with standards for the sake of compatibility and sanity? Mahmoud From: Clemens Ladisch Sent: Friday, June 23, 2017 2:25 AM To: sqlite-users@mailinglists.sqlite.org Subject: Re: [sqlite] UTF8-BOM not disregarded in CSV import Mahmoud Al-Qudsi wrote

Re: [sqlite] UTF8-BOM not disregarded in CSV import

2017-06-23 Thread Clemens Ladisch
Mahmoud Al-Qudsi wrote: > with `.import ……`, SQLite3 includes a BOM (UTF-8) as part of the first > column of the first record. The Unicode Standard 9.0 says in section 3.10: | When represented in UTF-8, the byte order mark turns into the byte | sequence . Its usage at the beginning of a UTF-8 data

[sqlite] UTF8-BOM not disregarded in CSV import

2017-06-21 Thread Mahmoud Al-Qudsi
Hello all, Let me start off with my apologies if this is a documented issue; I did search the fossil tickets but did not find anything for “BOM”. As of SQLite 3.19.3, under `.mode csv` and with `.import ……`, SQLite3 includes a BOM (UTF-8) as part of the first column of the first record. IMHO,

Re: [sqlite] UTF8 LIKE stranges

2017-05-23 Thread Clemens Ladisch
Vlczech - Tomáš Volf wrote: > CREATE TABLE people ( > firstname TEXT, > surname TEXT > ); > INSERT INTO people('Tomáš', 'Surname'); > > "SELECT * FROM people WHERE firstname LIKE ?" > For binding I use: sqlite3_bind_text(stmt, 1, name.c_str(), -1, > SQLITE_STATIC); SQLITE_STATIC works only i

[sqlite] UTF8 LIKE stranges

2017-05-23 Thread Vlczech - Tomáš Volf
Hello, I have some strange behaviour in a LIKE query in SQLite. Let's see a very simplified example: Let's have a table CREATE TABLE people ( firstname TEXT, surname TEXT ); and in it the following data: INSERT INTO people('Tomáš', 'Surname'); created by the sqlite3_exec() function. Then I use

Re: [sqlite] UTF8 LIKE stranges

2017-05-23 Thread Vlczech - Tomáš Volf
Sorry for the "spam"; I hope that the previous HTML form of the mail (with bullet lists) will be readable. To be sure, and for better readability in non-HTML clients, here is a plain text version of the previous mail: Hello, I have some strange behaviour in a LIKE query in SQLite. Let's see a very simplified examp

Re: [sqlite] UTF8 support?

2008-10-27 Thread Roger Binns
William Kyngesburye wrote: > So, sqlite supports UTF8 directly - UTF8 in, UTF8 out. No. SQLite supports Unicode internally. The APIs let you supply and receive Unicode strings in UTF8 and UTF16. The actual encoding serialized to disk depends on a
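The separation between the API encoding and the stored encoding can be sketched with Python's sqlite3 module. One documented setting that affects the on-disk text encoding is `PRAGMA encoding`; whether that is what the truncated sentence above goes on to name is an assumption. The helper name `database_encoding` is made up for the example.

```python
import sqlite3

def database_encoding(conn):
    """Ask SQLite which text encoding the main database uses."""
    return conn.execute("PRAGMA encoding").fetchone()[0]

# A fresh database opened through this API defaults to UTF-8.
default_db = sqlite3.connect(":memory:")
default_encoding = database_encoding(default_db)

# The pragma only has an effect before the database has any content.
utf16_db = sqlite3.connect(":memory:")
utf16_db.execute("PRAGMA encoding = 'UTF-16le'")
utf16_db.execute("CREATE TABLE t (s TEXT)")
utf16_db.execute("INSERT INTO t VALUES (?)", ("Tomáš",))

# The caller sees the same Unicode string whatever the storage encoding is.
round_trip = utf16_db.execute("SELECT s FROM t").fetchone()[0]
```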

Re: [sqlite] UTF8 support?

2008-10-27 Thread William Kyngesburye
On Oct 27, 2008, at 10:23 AM, MikeW wrote: > William Kyngesburye <[EMAIL PROTECTED]> writes: > >> >> Does SQlite support UTF8 directly? Or is this what the ICU extension >> is for? Does the sqlite3 shell program support UTF8? >> >> There is this spatialite extension which includes a modified sql

Re: [sqlite] UTF8 support?

2008-10-27 Thread MikeW
William Kyngesburye <[EMAIL PROTECTED]> writes: > > Does SQlite support UTF8 directly? Or is this what the ICU extension > is for? Does the sqlite3 shell program support UTF8? > > There is this spatialite extension which includes a modified sqlite3 > shell program that "implements full UNI

[sqlite] UTF8 support?

2008-10-27 Thread William Kyngesburye
Does SQlite support UTF8 directly? Or is this what the ICU extension is for? Does the sqlite3 shell program support UTF8? There is this spatialite extension which includes a modified sqlite3 shell program that "implements full UNICODE support". So I'm a little confused. - William Kyn

Re: [sqlite] utf8 decode

2006-09-20 Thread Roy Tam
Hi, I think you should provide sample data in order to dig out the problem. Regards, 2006/9/20, 卢炎君 <[EMAIL PROTECTED]>: Hi guys First of all, all data be complied as UTF-8 stored in my DB. Second, When I used sqlite browser tool (from sourceforge)to browsed my DB, the result of chine

[sqlite] utf8 decode

2006-09-20 Thread 卢炎君
Hi guys, First of all, all data stored in my DB is encoded as UTF-8. Second, when I used the sqlite browser tool (from SourceForge) to browse my DB, the Chinese characters were displayed correctly. Then I wrote a function which just calls sqlite3_column_text inside it. Demo like below: const char

Re: [sqlite] UTF8

2006-08-01 Thread Gerry Snyder
Cesar David Rodas Maldonado wrote: Thanks Daniel! Now i have another question! Is any way to serialize all the dates given a preference to SELECT a delay to the insert. I am building a Small Library in C & SQLite that will be under GPL, is something like Lucene. Please help me how to give a p

Re: [sqlite] UTF8

2006-07-27 Thread Cesar David Rodas Maldonado
Thanks Daniel! Now i have another question! Is any way to serialize all the dates given a preference to SELECT a delay to the insert. I am building a Small Library in C & SQLite that will be under GPL, is something like Lucene. Please help me how to give a preference to SELECT and a delay to INS

Re: [sqlite] UTF8

2006-07-27 Thread Daniel van Ham Colchete
Cesar David Rodas Maldonado wrote: > I wanted to ask how can i know if a given text is UTF8 or ISO-8859-1? Well, there might be a way if you only want to know if the text is UTF-8 or ISO-8859-1 (it means that you already know that is one is the other). There are some invalid UTF-8 sequences. If yo
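The approach described here (test for invalid UTF-8 sequences, given that the text is known to be one of the two encodings) might look like the following Python sketch. The function name `guess_encoding` is hypothetical. Any byte string is valid ISO-8859-1, so failing the UTF-8 check is the only signal available.

```python
def guess_encoding(data: bytes) -> str:
    """Guess between UTF-8 and ISO-8859-1, assuming it must be one of them."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "iso-8859-1"
    return "utf-8"

utf8_guess = guess_encoding("Tomáš".encode("utf-8"))
latin1_guess = guess_encoding("café".encode("iso-8859-1"))
ascii_guess = guess_encoding(b"plain ASCII")
```

Pure ASCII input is reported as UTF-8, which is harmless because the bytes mean the same thing in both encodings. The check is a heuristic: a Latin-1 text can happen to form valid UTF-8 sequences, although this is rare in natural-language text.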

Re: [sqlite] UTF8

2006-07-26 Thread Cesar David Rodas Maldonado
Thanks peter! :D On 7/26/06, Peter Cunderlik <[EMAIL PROTECTED]> wrote: > I wanted to ask how can i know if a given text is UTF8 or ISO-8859-1? If you need conversions, the simplest would be to do it manually using look-up tables. AFAIK none of the Latin-1 characters take more than 2 bytes in

Re: [sqlite] UTF8

2006-07-26 Thread Peter Cunderlik
I wanted to ask how can i know if a given text is UTF8 or ISO-8859-1? If you need conversions, the simplest would be to do it manually using look-up tables. AFAIK none of the Latin-1 characters take more than 2 bytes in UTF-8, so having 2*256 bytes long table won't hurt. If you want to decode s
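The look-up table idea can be sketched in Python as follows. The names `LATIN1_TO_UTF8` and `latin1_to_utf8` are made up for the example, and the table is built with Python's codecs rather than written out by hand. It also confirms that no Latin-1 character needs more than 2 bytes in UTF-8.

```python
# One table entry per Latin-1 byte value: its UTF-8 encoding (1 or 2 bytes).
LATIN1_TO_UTF8 = [
    bytes([b]).decode("iso-8859-1").encode("utf-8") for b in range(256)
]

def latin1_to_utf8(data: bytes) -> bytes:
    """Convert ISO-8859-1 bytes to UTF-8 using the look-up table."""
    return b"".join(LATIN1_TO_UTF8[b] for b in data)

longest_entry = max(len(entry) for entry in LATIN1_TO_UTF8)
converted = latin1_to_utf8(b"caf\xe9")
```

Bytes below 0x80 map to themselves, so ASCII text passes through unchanged.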

Re: [sqlite] UTF8

2006-07-26 Thread Cesar David Rodas Maldonado
I'm sorry! English is not my first language!! :D I wanted to ask how can i know if a given text is UTF8 or ISO-8859-1? Thanks and please forgive me for my english! :D On 7/26/06, Cory Nelson <[EMAIL PROTECTED]> wrote: ASCII is completely valid UTF-8, so no conversion is necessary. On 7/26/06

Re: [sqlite] UTF8

2006-07-26 Thread Cory Nelson
ASCII is completely valid UTF-8, so no conversion is necessary. On 7/26/06, Cesar David Rodas Maldonado <[EMAIL PROTECTED]> wrote: How can i know if a given text is UTF8 or ascii? and how can i convert between ascii to UTF8? -- Cory Nelson http://www.int64.org

[sqlite] UTF8

2006-07-26 Thread Cesar David Rodas Maldonado
How can i know if a given text is UTF8 or ascii? and how can i convert between ascii to UTF8?

Re: [sqlite] UTF8 question

2004-01-12 Thread William M. Droste
SQLite will store anything UTF-8 or ANSI as long as it's null terminated and escaped for '. The only place where encoding makes a difference is in functions like length. Will Steven Van Ingelgem wrote: If I have a table like: CREATE TABLE routing ( FIELD1 VARCHAR(40) ); what if I get a stri
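The question can be tried out with Python's sqlite3 module and a current SQLite 3 library (the thread dates from 2004, so behaviour of the version discussed then may differ). In this sketch the table comes from the question and the test strings are invented. SQLite does not enforce the 40 in VARCHAR(40), and length() on a text value counts characters, not bytes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE routing (FIELD1 VARCHAR(40))")

# 40 characters, but 80 bytes once encoded as UTF-8.
umlauts = "ü" * 40
conn.execute("INSERT INTO routing (FIELD1) VALUES (?)", (umlauts,))

stored, n_chars, n_bytes = conn.execute(
    "SELECT FIELD1, length(FIELD1), length(CAST(FIELD1 AS BLOB)) FROM routing"
).fetchone()

# A value longer than 40 characters is stored in full as well.
conn.execute("DELETE FROM routing")
conn.execute("INSERT INTO routing (FIELD1) VALUES (?)", ("x" * 50,))
long_chars = conn.execute("SELECT length(FIELD1) FROM routing").fetchone()[0]
```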

[sqlite] UTF8 question

2004-01-11 Thread Steven Van Ingelgem
If I have a table like: CREATE TABLE routing ( FIELD1 VARCHAR(40) ); what if I get a string which is in ANSI = 40chars, but in UTF8 > 40chars? (for example because it uses ü and such characters...) Does it get stored correctly? G00fy, (aka KaReL, aka Steven) Main Webpage : http://komma.cjb.