On Friday, 6 January, 2017 12:49, James K. Lowden <jklow...@schemamania.org> wrote:
> On Fri, 6 Jan 2017 10:23:06 +1100
> "dandl" <da...@andl.org> wrote:
>
> > Unix globbing for Linux is defined here:
> > http://man7.org/linux/man-pages/man7/glob.7.html. AFAICT Sqlite does
> > not implement this behaviour.
>
> A quick scan of SQLite sources shows only references to the glob
> function, no implementation. In func.c, we find
>
>     LIKEFUNC(glob, 2, &globInfo, SQLITE_FUNC_LIKE|SQLITE_FUNC_CASE),
>
> It looks to me like SQLite imports glob(3) as its default
> implementation. Have you an example for which a glob pattern behaves
> differently in SQLite versus C?
>
> (For those following along at home, bear in mind that glob(3) need not
> necessarily be what your favorite shell uses.)
>
> If indeed SQLite is using the glob function from libc, ISTM it's
> perfectly sufficient to refer to glob(7) for syntax, since that's the
> documentation for the controlling implementation.

SQLite does not use the glob function from the standard library -- the function is defined in func.c.

Both "glob" and "like" call the same function, likeFunc, with different sets of user_data. likeFunc does a bunch of validation and then calls patternCompare, which actually implements the like and glob functionality. How like and glob work is documented in the preface to patternCompare.

like implements the standard SQL LIKE using % (0 or more characters) and _ (exactly 1 character) as wildcard matches.

glob implements unix globbing using * (0 or more) and ? (exactly 1) as wildcard matches. "Sets" of characters are indicated by squockets (square brackets -- []). Unlike the standard unix glob, however, it uses ^ rather than ! to invert the sense of a set. Since it is unicode, a character is [\u0000-\u10FFFF], so [^1-7] matches any one of the remaining unicode characters.
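The wildcard and set rules described above can be checked directly. What follows is only a minimal sketch, assuming Python's bundled sqlite3 module is available; the helper name sql_match and the sample strings are made up for illustration and are not part of SQLite.

```python
import sqlite3

# In-memory database; no tables are needed to evaluate an expression.
conn = sqlite3.connect(":memory:")

def sql_match(operator, text, pattern):
    """Evaluate `text <operator> pattern` in SQLite (hypothetical helper)."""
    if operator not in ("LIKE", "GLOB"):
        raise ValueError("operator must be LIKE or GLOB")
    # The operator is a checked keyword; text and pattern are bound parameters.
    (result,) = conn.execute(f"SELECT ? {operator} ?", (text, pattern)).fetchone()
    return bool(result)

examples = [
    ("LIKE", "abc", "a%"),         # % is 0 or more characters
    ("LIKE", "abc", "a_c"),        # _ is exactly 1 character
    ("GLOB", "abc", "a*"),         # * is 0 or more characters
    ("GLOB", "abc", "a?c"),        # ? is exactly 1 character
    ("GLOB", "a8c", "a[^1-7]c"),   # ^ inverts the set in SQLite
    ("GLOB", "a8c", "a[!1-7]c"),   # ! is just another member of the set
    ("GLOB", "a!c", "a[!1-7]c"),   # so a literal ! matches here
]

for operator, text, pattern in examples:
    print(f"{text!r} {operator} {pattern!r} ->", sql_match(operator, text, pattern))
```

The last two rows are the ones that differ from a unix shell: in SQLite the ! inside the brackets is treated as an ordinary member of the set rather than as an inversion marker.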
Thus in unix/linux one may pronounce "match anything where one character is not one of the digits 1 through 7" as

  *[!1-7]*

whereas one would pronounce the same request to SQLite as

  *[^1-7]*

This of course matches any string that is not composed entirely of the characters 1 through 7 (which is not the same as a string containing no characters 1 through 7) -- and the string must be at least 1 character long.

If one wanted to match strings that contain a 1 through 7 anywhere within, then one would pronounce

  *[1-7]*

on both unix/linux and to SQLite.

Were one to want a glob that excluded all strings that contain the digits 1 through 7 anywhere within, then one would pronounce, in SQLite,

  WHERE NOT x GLOB '*[1-7]*'

-- though this would now also match 0 length strings.

There is no way to "invert" the match-sense of a glob pattern within the pattern itself. That is, one cannot use '^*[1-7]*' as an equivalent to the above inversion of the results of a positive match. GLOB patterns only search for a positive match, not an exclusion. [^stuf] removes the given characters or range from the characters that a ? would match: it does not exclude strings containing s, t, u or f, but rather matches exactly one character that is any unicode character other than s, t, u or f -- in other words, a "somewhat limited ?".

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
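For anyone wanting to try the patterns from the message above, here is a minimal sketch, assuming Python's bundled sqlite3 module. The table t, the column x, the helper rows_where and the sample strings are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x TEXT)")

# Sample strings chosen to separate the cases discussed in the message.
samples = ["", "123", "12a", "abc", "8", "^x1"]
conn.executemany("INSERT INTO t (x) VALUES (?)", [(s,) for s in samples])

def rows_where(where_clause):
    """Return the x values satisfying a fixed WHERE clause, in insertion order."""
    cursor = conn.execute(f"SELECT x FROM t WHERE {where_clause} ORDER BY rowid")
    return [x for (x,) in cursor]

# At least one character that is not 1 through 7.
has_other_char = rows_where("x GLOB '*[^1-7]*'")

# At least one character that is 1 through 7.
has_digit = rows_where("x GLOB '*[1-7]*'")

# No character 1 through 7 anywhere; the empty string qualifies too.
no_digit = rows_where("NOT x GLOB '*[1-7]*'")

# A leading ^ outside brackets is a literal character, not an inversion.
caret_outside = rows_where("x GLOB '^*[1-7]*'")

for label, result in [
    ("*[^1-7]*", has_other_char),
    ("*[1-7]*", has_digit),
    ("NOT *[1-7]*", no_digit),
    ("^*[1-7]*", caret_outside),
]:
    print(f"{label:12} {result}")
```

Note that '12a' satisfies both of the first two patterns, which is why '*[^1-7]*' cannot stand in for the NOT ... GLOB form.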