Re: [sqlite] SQLite3 - Search on text field with \0 binary data

2018-04-19 Thread J Decker
It's not that it's impossible.
https://github.com/mackyle/sqlite/compare/master...d3x0r:AllowInlineNulChars

Improves handling in TCL, sqlite shell, and passing complete composed SQL
strings to prepare...

Which is why I know if you don't use the broken C-flavored(tainted) SQL
String functions, storing and retrieiving STRINGS with NUL characters is
easily handled which is why I started with my first response of 'what
tool are you actually using?'

In the storage, both the length and data are stored. (have been since 3.0
or before)

On Thu, Apr 19, 2018 at 4:11 PM, J Decker <d3c...@gmail.com> wrote:

>
>
> On Thu, Apr 19, 2018 at 4:07 PM, Keith Medcalf <kmedc...@dessus.com>
> wrote:
>
>>
>> And what makes you think a "javascript string" is a "C string"?  While
>> the "string" part may be the same, "javascript" certainly does not equal
>> "C".  Just like you do not have issues with embedded zero-bytes in "pascal
>> strings".  Note that "pascal" != "C" even though "string" == "string".
>>
>>
> by the same reasoning that you apply saying SQL strings are C strings.
>
>
>> Note that the sqlite3_value_text returns the data (including embedded
>> zero bytes), but not the length.
>
>
> sqlite3_value_bytes() returns the number of bytes.
>
> 
>
>> If you pass the data returned thereby to a function expecting a C string
>> (zero terminated), it will terminate at the first zero byte encountered.
>> If you retrieve the length and the data separately and construct
>> pascal-style strings and pass them to functions expecting "pascal" style
>> strings, then the embedded zero is just "string data" (NB:  pascal is used
>> only as an example -- many X strings contain an embedded length for any
>> given value of X -- C strings do not).
>>
>> 
>
> Obviously "javascript" strings contain a length indicator and are not
>> zero-terminated.
>>
>> ---
>> The fact that there's a Highway to Hell but only a Stairway to Heaven
>> says a lot about anticipated traffic volume.
>>
>>
>> >-Original Message-
>> >From: sqlite-users [mailto:sqlite-users-
>> >boun...@mailinglists.sqlite.org] On Behalf Of J Decker
>> >Sent: Thursday, 19 April, 2018 16:41
>> >To: SQLite mailing list
>> >Subject: Re: [sqlite] SQLite3 - Search on text field with \0 binary
>> >data
>> >
>> >On Thu, Apr 19, 2018 at 3:37 PM, J Decker <d3c...@gmail.com> wrote:
>> >
>> >>
>> >>
>> >> On Thu, Apr 19, 2018 at 3:22 PM, Keith Medcalf
>> ><kmedc...@dessus.com>
>> >> wrote:
>> >>
>> >>>
>> >>> Actually, nothing in the C or C++ world will "go past" the NULL
>> >byte
>> >>> since the very definition of a C string is a "bunch-o-bytes that
>> >are
>> >>> non-zero followed by one that is".
>> >>>
>> >>> And sory for the double response; but if C/C++ couldn't handle a
>> >NUL
>> >character (the character is 1 L) then spidermonkey/chakra/V8 would
>> >have
>> >problems with NUL characters in javascript strings.  But it doesn't.
>> >Why
>> >is that?
>> >
>> >
>> >> that doesnt' mean you can use a custom token structure that
>> >contains both
>> >> the pointer and length of the data. (which it already has)
>> >> sure, using standard C api - strlen, etc sure... but sqlite uses a
>> >custom
>> >> function internally sqlite3stlren30  which can easily be extended
>> >to take
>> >> the length of the string; but wait, if it was saved, it wouldn't
>> >need to be
>> >> called, and a overall performance gain is created.
>> >>
>> >> the biggest problem is really the internal function
>> >'(something)printf'
>> >> which returns a char *, and has no space to return the length, like
>> >> snprintf would.
>> >>
>> >> and I can easily put nuls into a string
>> >>
>> >> char buf[256];
>> >> int len = snprintf( buf, 256, "put a nul %c here and here %c", 0, 0
>> >);
>> >> and the length returned would be 27.
>> >>
>> >>
>> >>> If you want to embed non UTF8 text you should be using a BLOB not
>> >TEXT.
>> >>> Text means "an array of non-zero characters termina

Re: [sqlite] SQLite3 - Search on text field with \0 binary data

2018-04-19 Thread J Decker
On Thu, Apr 19, 2018 at 4:07 PM, Keith Medcalf <kmedc...@dessus.com> wrote:

>
> And what makes you think a "javascript string" is a "C string"?  While the
> "string" part may be the same, "javascript" certainly does not equal "C".
> Just like you do not have issues with embedded zero-bytes in "pascal
> strings".  Note that "pascal" != "C" even though "string" == "string".
>
>
by the same reasoning that you apply saying SQL strings are C strings.


> Note that the sqlite3_value_text returns the data (including embedded zero
> bytes), but not the length.


sqlite3_value_bytes() returns the number of bytes.



> If you pass the data returned thereby to a function expecting a C string
> (zero terminated), it will terminate at the first zero byte encountered.
> If you retrieve the length and the data separately and construct
> pascal-style strings and pass them to functions expecting "pascal" style
> strings, then the embedded zero is just "string data" (NB:  pascal is used
> only as an example -- many X strings contain an embedded length for any
> given value of X -- C strings do not).
>
> 

Obviously "javascript" strings contain a length indicator and are not
> zero-terminated.
>
> ---
> The fact that there's a Highway to Hell but only a Stairway to Heaven says
> a lot about anticipated traffic volume.
>
>
> >-----Original Message-----
> >From: sqlite-users [mailto:sqlite-users-
> >boun...@mailinglists.sqlite.org] On Behalf Of J Decker
> >Sent: Thursday, 19 April, 2018 16:41
> >To: SQLite mailing list
> >Subject: Re: [sqlite] SQLite3 - Search on text field with \0 binary
> >data
> >
> >On Thu, Apr 19, 2018 at 3:37 PM, J Decker <d3c...@gmail.com> wrote:
> >
> >>
> >>
> >> On Thu, Apr 19, 2018 at 3:22 PM, Keith Medcalf
> ><kmedc...@dessus.com>
> >> wrote:
> >>
> >>>
> >>> Actually, nothing in the C or C++ world will "go past" the NULL
> >byte
> >>> since the very definition of a C string is a "bunch-o-bytes that
> >are
> >>> non-zero followed by one that is".
> >>>
> >>> And sory for the double response; but if C/C++ couldn't handle a
> >NUL
> >character (the character is 1 L) then spidermonkey/chakra/V8 would
> >have
> >problems with NUL characters in javascript strings.  But it doesn't.
> >Why
> >is that?
> >
> >
> >> that doesnt' mean you can use a custom token structure that
> >contains both
> >> the pointer and length of the data. (which it already has)
> >> sure, using standard C api - strlen, etc sure... but sqlite uses a
> >custom
> >> function internally sqlite3stlren30  which can easily be extended
> >to take
> >> the length of the string; but wait, if it was saved, it wouldn't
> >need to be
> >> called, and a overall performance gain is created.
> >>
> >> the biggest problem is really the internal function
> >'(something)printf'
> >> which returns a char *, and has no space to return the length, like
> >> snprintf would.
> >>
> >> and I can easily put nuls into a string
> >>
> >> char buf[256];
> >> int len = snprintf( buf, 256, "put a nul %c here and here %c", 0, 0
> >);
> >> and the length returned would be 27.
> >>
> >>
> >>> If you want to embed non UTF8 text you should be using a BLOB not
> >TEXT.
> >>> Text means "an array of non-zero characters terminated by a zero
> >byte" and
> >>> a BLOB means a "bag-o-bytes" of a specific size.
> >>>
> >>> Blob means binary; havihng to deal with a binary structure to
> >convert to
> >> a string and back is ridiculous when the interface already supports
> >storing
> >> and getting strings with \0 in them.
> >>
> >>
> >>> Things meants to work on C "strings" should always stop at the
> >zero
> >>> terminator.  Failure to do so can lead to AHBL.
> >>>
> >>>
> >> So don't use the standard library.  That was one of the first
> >htings I
> >> created for my MUD client; a smart text string class.  (I say class
> >in the
> >> generic term, not the literal, since it was written in C)
> >>
> >>
> >>> (Note, this applies to "wide" (as in word) and "fat" (as in double
> >word)
> >

Re: [sqlite] SQLite3 - Search on text field with \0 binary data

2018-04-19 Thread Keith Medcalf

And what makes you think a "javascript string" is a "C string"?  While the 
"string" part may be the same, "javascript" certainly does not equal "C".  Just 
like you do not have issues with embedded zero-bytes in "pascal strings".  Note 
that "pascal" != "C" even though "string" == "string".

Note that the sqlite3_value_text returns the data (including embedded zero 
bytes), but not the length.  If you pass the data returned thereby to a 
function expecting a C string (zero terminated), it will terminate at the first 
zero byte encountered.  If you retrieve the length and the data separately and 
construct pascal-style strings and pass them to functions expecting "pascal" 
style strings, then the embedded zero is just "string data" (NB:  pascal is 
used only as an example -- many X strings contain an embedded length for any 
given value of X -- C strings do not).

Obviously "javascript" strings contain a length indicator and are not 
zero-terminated.

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.


>-Original Message-
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of J Decker
>Sent: Thursday, 19 April, 2018 16:41
>To: SQLite mailing list
>Subject: Re: [sqlite] SQLite3 - Search on text field with \0 binary
>data
>
>On Thu, Apr 19, 2018 at 3:37 PM, J Decker <d3c...@gmail.com> wrote:
>
>>
>>
>> On Thu, Apr 19, 2018 at 3:22 PM, Keith Medcalf
><kmedc...@dessus.com>
>> wrote:
>>
>>>
>>> Actually, nothing in the C or C++ world will "go past" the NULL
>byte
>>> since the very definition of a C string is a "bunch-o-bytes that
>are
>>> non-zero followed by one that is".
>>>
>>> And sory for the double response; but if C/C++ couldn't handle a
>NUL
>character (the character is 1 L) then spidermonkey/chakra/V8 would
>have
>problems with NUL characters in javascript strings.  But it doesn't.
>Why
>is that?
>
>
>> that doesnt' mean you can use a custom token structure that
>contains both
>> the pointer and length of the data. (which it already has)
>> sure, using standard C api - strlen, etc sure... but sqlite uses a
>custom
>> function internally sqlite3stlren30  which can easily be extended
>to take
>> the length of the string; but wait, if it was saved, it wouldn't
>need to be
>> called, and a overall performance gain is created.
>>
>> the biggest problem is really the internal function
>'(something)printf'
>> which returns a char *, and has no space to return the length, like
>> snprintf would.
>>
>> and I can easily put nuls into a string
>>
>> char buf[256];
>> int len = snprintf( buf, 256, "put a nul %c here and here %c", 0, 0
>);
>> and the length returned would be 27.
>>
>>
>>> If you want to embed non UTF8 text you should be using a BLOB not
>TEXT.
>>> Text means "an array of non-zero characters terminated by a zero
>byte" and
>>> a BLOB means a "bag-o-bytes" of a specific size.
>>>
>>> Blob means binary; havihng to deal with a binary structure to
>convert to
>> a string and back is ridiculous when the interface already supports
>storing
>> and getting strings with \0 in them.
>>
>>
>>> Things meants to work on C "strings" should always stop at the
>zero
>>> terminator.  Failure to do so can lead to AHBL.
>>>
>>>
>> So don't use the standard library.  That was one of the first
>htings I
>> created for my MUD client; a smart text string class.  (I say class
>in the
>> generic term, not the literal, since it was written in C)
>>
>>
>>> (Note, this applies to "wide" (as in word) and "fat" (as in double
>word)
>>> and obese (as in quad word) strings as well.  They are a sequence
>of
>>> words/double-words/quad-words/ten-words (whatever) that are non-
>zero
>>> followed by one that is zero -- and the narrow/wide/fat/obese
>string ends
>>> at the zeo value).
>>>
>>>
>> utf8everywhere.org
>> No reason to use wide char.
>>
>>
>> get good, son. (sorry if that's overly offensive)
>>
>> ---
>>> The fact that there's a Highway to Hell but only a Stairway to
>Heaven
>>> says a lot about anticipated traffic volume.
>>>
>>
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLite3 - Search on text field with \0 binary data

2018-04-19 Thread J Decker
On Thu, Apr 19, 2018 at 3:37 PM, J Decker  wrote:

>
>
> On Thu, Apr 19, 2018 at 3:22 PM, Keith Medcalf 
> wrote:
>
>>
>> Actually, nothing in the C or C++ world will "go past" the NULL byte
>> since the very definition of a C string is a "bunch-o-bytes that are
>> non-zero followed by one that is".
>>
>> And sory for the double response; but if C/C++ couldn't handle a NUL
character (the character is 1 L) then spidermonkey/chakra/V8 would have
problems with NUL characters in javascript strings.  But it doesn't.  Why
is that?


> that doesnt' mean you can use a custom token structure that contains both
> the pointer and length of the data. (which it already has)
> sure, using standard C api - strlen, etc sure... but sqlite uses a custom
> function internally sqlite3stlren30  which can easily be extended to take
> the length of the string; but wait, if it was saved, it wouldn't need to be
> called, and a overall performance gain is created.
>
> the biggest problem is really the internal function '(something)printf'
> which returns a char *, and has no space to return the length, like
> snprintf would.
>
> and I can easily put nuls into a string
>
> char buf[256];
> int len = snprintf( buf, 256, "put a nul %c here and here %c", 0, 0 );
> and the length returned would be 27.
>
>
>> If you want to embed non UTF8 text you should be using a BLOB not TEXT.
>> Text means "an array of non-zero characters terminated by a zero byte" and
>> a BLOB means a "bag-o-bytes" of a specific size.
>>
>> Blob means binary; havihng to deal with a binary structure to convert to
> a string and back is ridiculous when the interface already supports storing
> and getting strings with \0 in them.
>
>
>> Things meants to work on C "strings" should always stop at the zero
>> terminator.  Failure to do so can lead to AHBL.
>>
>>
> So don't use the standard library.  That was one of the first htings I
> created for my MUD client; a smart text string class.  (I say class in the
> generic term, not the literal, since it was written in C)
>
>
>> (Note, this applies to "wide" (as in word) and "fat" (as in double word)
>> and obese (as in quad word) strings as well.  They are a sequence of
>> words/double-words/quad-words/ten-words (whatever) that are non-zero
>> followed by one that is zero -- and the narrow/wide/fat/obese string ends
>> at the zeo value).
>>
>>
> utf8everywhere.org
> No reason to use wide char.
>
>
> get good, son. (sorry if that's overly offensive)
>
> ---
>> The fact that there's a Highway to Hell but only a Stairway to Heaven
>> says a lot about anticipated traffic volume.
>>
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLite3 - Search on text field with \0 binary data

2018-04-19 Thread J Decker
On Thu, Apr 19, 2018 at 3:22 PM, Keith Medcalf  wrote:

>
> Actually, nothing in the C or C++ world will "go past" the NULL byte since
> the very definition of a C string is a "bunch-o-bytes that are non-zero
> followed by one that is".
>
> that doesnt' mean you can use a custom token structure that contains both
the pointer and length of the data. (which it already has)
sure, using standard C api - strlen, etc sure... but sqlite uses a custom
function internally sqlite3stlren30  which can easily be extended to take
the length of the string; but wait, if it was saved, it wouldn't need to be
called, and a overall performance gain is created.

the biggest problem is really the internal function '(something)printf'
which returns a char *, and has no space to return the length, like
snprintf would.

and I can easily put nuls into a string

char buf[256];
int len = snprintf( buf, 256, "put a nul %c here and here %c", 0, 0 );
and the length returned would be 27.


> If you want to embed non UTF8 text you should be using a BLOB not TEXT.
> Text means "an array of non-zero characters terminated by a zero byte" and
> a BLOB means a "bag-o-bytes" of a specific size.
>
> Blob means binary; havihng to deal with a binary structure to convert to a
string and back is ridiculous when the interface already supports storing
and getting strings with \0 in them.


> Things meants to work on C "strings" should always stop at the zero
> terminator.  Failure to do so can lead to AHBL.
>
>
So don't use the standard library.  That was one of the first htings I
created for my MUD client; a smart text string class.  (I say class in the
generic term, not the literal, since it was written in C)


> (Note, this applies to "wide" (as in word) and "fat" (as in double word)
> and obese (as in quad word) strings as well.  They are a sequence of
> words/double-words/quad-words/ten-words (whatever) that are non-zero
> followed by one that is zero -- and the narrow/wide/fat/obese string ends
> at the zeo value).
>
>
utf8everywhere.org
No reason to use wide char.


get good, son. (sorry if that's overly offensive)

---
> The fact that there's a Highway to Hell but only a Stairway to Heaven says
> a lot about anticipated traffic volume.
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLite3 - Search on text field with \0 binary data

2018-04-19 Thread Keith Medcalf

Actually, nothing in the C or C++ world will "go past" the NULL byte since the 
very definition of a C string is a "bunch-o-bytes that are non-zero followed by 
one that is".

If you want to embed non UTF8 text you should be using a BLOB not TEXT.  Text 
means "an array of non-zero characters terminated by a zero byte" and a BLOB 
means a "bag-o-bytes" of a specific size.

Things meants to work on C "strings" should always stop at the zero terminator. 
 Failure to do so can lead to AHBL.

(Note, this applies to "wide" (as in word) and "fat" (as in double word) and 
obese (as in quad word) strings as well.  They are a sequence of 
words/double-words/quad-words/ten-words (whatever) that are non-zero followed 
by one that is zero -- and the narrow/wide/fat/obese string ends at the zeo 
value).

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.


>-Original Message-
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of J Decker
>Sent: Thursday, 19 April, 2018 14:09
>To: SQLite mailing list
>Subject: Re: [sqlite] SQLite3 - Search on text field with \0 binary
>data
>
>are you testing this in the sqlite3 shell or in your own program?
>You should have no issue getting thisdata,
>
>sqlite3_column_text followed by sqlite3_column_bytes  (the byte count
>is
>set when the data is fetched in some format).
>
>without modification, you can either excape \0 with'||char(0)||'
>...
>
>"My data \0with binary".
>"My data "||char(0)||" with binary".
>
>or use a parameter binding to get the value stored in the database.
>
>The row with a nul should be stored in the database; but the command
>line
>shell will not return the right values for selects; it will truncate
>that
>returned values at the \0.
>
>
>On Thu, Apr 19, 2018 at 5:27 AM, MARCHAND Loïc
><loic.march...@tagginfo.com>
>wrote:
>
>> I index a file in a SQLite DB.
>> I create my table with this:
>> CREATE TABLE Record (RecordID INTEGER,Data TEXT,PRIMARY KEY
>(RecordID))
>>
>> I read a file, and for each line I add a row on the table. Each
>line can
>> have binary data at end. It's not a problem for many chars, but \0
>char
>> make a problem.
>> If I have a line like this : "My data \0with binary".
>> When I try to get data after the \0 not worked (SELECT substr(Data,
>11, 5)
>> FROM Record return an empty string or SELECT substr(Data, 4, 10)
>FROM
>> Record return data)
>>
>> When I try to search a data (SELECT Data FROM Record WHERE Data
>LIKE
>> '%binar%') return 0 rows returned.
>>
>> How can I solve this problem ? I try to replace \0 by an other char
>> sequence, but it's not a good idea because can I have this sequence
>in my
>> file.
>> Thank you
>> Loïc
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-
>users
>>
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLite3 - Search on text field with \0 binary data

2018-04-19 Thread J Decker
On Thu, Apr 19, 2018 at 1:49 PM, David Raymond <david.raym...@tomtom.com>
wrote:

> After a little testing, of the core functions:
>
> Affected by \x00:
> substr
> like
> length
> quote
> replace when you're trying to replace the \x00
>
> Not affected by \x00:
> plain ol select
> instr
> lower
> upper
> trim
> ltrim
> rtrim
> replace when you're not replacing the \x00
> (works and replaces bits after the \x00 as well)
> ||
>
> At the moment I can't find anything in the documentation that covers this.
>
>
length has a note about terminating at nul.

I do know it's not an issue with the SQL standard; only implementations of
the standard.

The SQL standard is quite clear...

As mentioned here...
http://sqlite.1065341.n5.nabble.com/sqlite-command-line-tool-fails-to-dump-data-tp100196p100218.html




http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt

spaces are used to separate syntactic elements. Multiple spaces and
 line breaks are treated as a single space. Apart from those symbols

 to which special functions were given above,
*other characters and  character strings in a formula stand for
themselves. In *addition,
 if the symbols to the right of the definition operator in a produc-

 tion consist entirely of BNF symbols, then those symbols stand for
 themselves and do not take on their special meaning.


For every portion of the string enclosed in square brackets,
either delete the brackets and their contents or change the
brackets to braces.


(from SQL 92) By this, I shouldn't also be able to use ~, `, Γειά σου Κόσμ,
Привет мир, or any other UNICODE character. (that is if you say things not
listed are " cannot contain embedded ;"


There is no specification that \0 means anything (in SQL).


So instead of like maybe you could use instr() instead?
> And it looks like messing with cast might do it to make substr work
>
> select cast(substr(cast(Data as blob), 11, 5) as text) from Record;
> select Data from Record where instr(Data, 'binar');
>
> Seems a little odd, as even when I declare the Data field as blob to begin
> with substr still doesn't work, but calling on the cast value does.
>
>
> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of MARCHAND Loïc
> Sent: Thursday, April 19, 2018 8:27 AM
> To: sqlite-users@mailinglists.sqlite.org
> Subject: [sqlite] SQLite3 - Search on text field with \0 binary data
>
> I index a file in a SQLite DB.
> I create my table with this:
> CREATE TABLE Record (RecordID INTEGER,Data TEXT,PRIMARY KEY (RecordID))
>
> I read a file, and for each line I add a row on the table. Each line can
> have binary data at end. It's not a problem for many chars, but \0 char
> make a problem.
> If I have a line like this : "My data \0with binary".
> When I try to get data after the \0 not worked (SELECT substr(Data, 11, 5)
> FROM Record return an empty string or SELECT substr(Data, 4, 10) FROM
> Record return data)
>
> When I try to search a data (SELECT Data FROM Record WHERE Data LIKE
> '%binar%') return 0 rows returned.
>
> How can I solve this problem ? I try to replace \0 by an other char
> sequence, but it's not a good idea because can I have this sequence in my
> file.
> Thank you
> Loïc
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLite3 - Search on text field with \0 binary data

2018-04-19 Thread David Raymond
After a little testing, of the core functions:

Affected by \x00:
substr
like
length
quote
replace when you're trying to replace the \x00

Not affected by \x00:
plain ol select
instr
lower
upper
trim
ltrim
rtrim
replace when you're not replacing the \x00
(works and replaces bits after the \x00 as well)
||

At the moment I can't find anything in the documentation that covers this.

So instead of like maybe you could use instr() instead?
And it looks like messing with cast might do it to make substr work

select cast(substr(cast(Data as blob), 11, 5) as text) from Record;
select Data from Record where instr(Data, 'binar');

Seems a little odd, as even when I declare the Data field as blob to begin with 
substr still doesn't work, but calling on the cast value does.


-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of MARCHAND Loïc
Sent: Thursday, April 19, 2018 8:27 AM
To: sqlite-users@mailinglists.sqlite.org
Subject: [sqlite] SQLite3 - Search on text field with \0 binary data

I index a file in a SQLite DB.
I create my table with this:
CREATE TABLE Record (RecordID INTEGER,Data TEXT,PRIMARY KEY (RecordID))

I read a file, and for each line I add a row on the table. Each line can have 
binary data at end. It's not a problem for many chars, but \0 char make a 
problem.
If I have a line like this : "My data \0with binary".
When I try to get data after the \0 not worked (SELECT substr(Data, 11, 5) FROM 
Record return an empty string or SELECT substr(Data, 4, 10) FROM Record return 
data)

When I try to search a data (SELECT Data FROM Record WHERE Data LIKE '%binar%') 
return 0 rows returned.

How can I solve this problem ? I try to replace \0 by an other char sequence, 
but it's not a good idea because can I have this sequence in my file.
Thank you
Loïc
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLite3 - Search on text field with \0 binary data

2018-04-19 Thread Simon Slavin
On 19 Apr 2018, at 1:27pm, MARCHAND Loïc  wrote:

> CREATE TABLE Record (RecordID INTEGER,Data TEXT,PRIMARY KEY (RecordID))
> 
> I read a file, and for each line I add a row on the table. Each line can have 
> binary data at end. It's not a problem for many chars, but \0 char make a 
> problem.
> If I have a line like this : "My data \0with binary".

Although this may not be causing your problem, you should declare "Data BLOB" 
not "Data TEXT".  Although SQLite itself doesn't care, some SQLite libraries 
may interpret TEXT as meaning that you want the data handled as C text, with \0 
meaning a terminator.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLite3 - Search on text field with \0 binary data

2018-04-19 Thread J Decker
are you testing this in the sqlite3 shell or in your own program?
You should have no issue getting thisdata,

sqlite3_column_text followed by sqlite3_column_bytes  (the byte count is
set when the data is fetched in some format).

without modification, you can either excape \0 with'||char(0)||'...

"My data \0with binary".
"My data "||char(0)||" with binary".

or use a parameter binding to get the value stored in the database.

The row with a nul should be stored in the database; but the command line
shell will not return the right values for selects; it will truncate that
returned values at the \0.


On Thu, Apr 19, 2018 at 5:27 AM, MARCHAND Loïc 
wrote:

> I index a file in a SQLite DB.
> I create my table with this:
> CREATE TABLE Record (RecordID INTEGER,Data TEXT,PRIMARY KEY (RecordID))
>
> I read a file, and for each line I add a row on the table. Each line can
> have binary data at end. It's not a problem for many chars, but \0 char
> make a problem.
> If I have a line like this : "My data \0with binary".
> When I try to get data after the \0 not worked (SELECT substr(Data, 11, 5)
> FROM Record return an empty string or SELECT substr(Data, 4, 10) FROM
> Record return data)
>
> When I try to search a data (SELECT Data FROM Record WHERE Data LIKE
> '%binar%') return 0 rows returned.
>
> How can I solve this problem ? I try to replace \0 by an other char
> sequence, but it's not a good idea because can I have this sequence in my
> file.
> Thank you
> Loïc
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] SQLite3 - Search on text field with \0 binary data

2018-04-19 Thread MARCHAND Loïc
I index a file in a SQLite DB.
I create my table with this:
CREATE TABLE Record (RecordID INTEGER,Data TEXT,PRIMARY KEY (RecordID))

I read a file, and for each line I add a row on the table. Each line can have 
binary data at end. It's not a problem for many chars, but \0 char make a 
problem.
If I have a line like this : "My data \0with binary".
When I try to get data after the \0 not worked (SELECT substr(Data, 11, 5) FROM 
Record return an empty string or SELECT substr(Data, 4, 10) FROM Record return 
data)

When I try to search a data (SELECT Data FROM Record WHERE Data LIKE '%binar%') 
return 0 rows returned.

How can I solve this problem ? I try to replace \0 by an other char sequence, 
but it's not a good idea because can I have this sequence in my file.
Thank you
Loïc
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users