On Fri, Jan 29, 2016 at 1:53 PM, Cecil Westerhof <cldwesterhof at gmail.com> wrote:
> 2016-01-29 10:12 GMT+01:00 Richard Hipp <drh at sqlite.org>: > > On 1/29/16, Cecil Westerhof <cldwesterhof at gmail.com> wrote: > > > In H2 I can find records containing non-ascii characters with: > > > SELECT * > > > FROM <TABLE> > > > WHERE STRINGENCODE(<COLUMN>) LIKE '%\\u%' > > > > > > Is something like this also possible with SQLite? > > > > Perhaps something like this: > > SELECT * FROM <table> > > WHERE <column> GLOB ('*[^'||char(1,45,127)||']*'); > > I tried it and it looks like it works. But I do not understand how it > works. Could you enlighten me? > char() returns text with 1 (unicode) char per unicode code point integer used as argument to this function. So char(1,45,127) return first-ascii-char - (i.e. dash) last-ascii-char (unicode is a superset of ascii). So the same way a-z is the character range between a and z, i.e. all lower-case letters NUL - DEL is the full range of ascii characters. (char(1) = NUL, char(127) = DEL) [^...] is the negation of the characters inside. So [^NUL-DEL] is all non-ascii (unicode) characters. It matches a single character. And *[^NULL-DEL]* allows that 1 character to be matched anywhere. * is the kleen star, which matches 0 or more arbitrary characters. Note that glob above is not unicode aware per se. It operates on UTF-8 encoded strings, and in that encoding, all non-ascii characters are encoded into multiple bytes, which (the bytes) all are in the 128-255 range. So it works indeed. > Note, however, that the "*[" combination in a GLOB pattern in SQLite > > is inefficient. So the above might be slow for a large amount of text. > because the kleen star is "griddy" by default. > > Perhaps a better approach would be to create an application-defined > > function to do the search. > > > > SELECT * FROM <table> WHERE contains_non_ascii(<column>); > > > > ?I am working (in this case) with bash. That is not possible then I > think.Or is it? > You can create a C shared library SQLite extension, that you .load in the sqlite3 shell. So that works in bash too, but you need to do some C programming. --DD