On Tuesday, 1 October, 2019 11:22, Jim and Carol Ingram 
<ingram-ho...@mysteriousgrace.org> wrote:

>I'm trying to use SQLite3.exe (the command-line executable) to do some
>data mining of a database generated by a commercial software product that
>incorporates the .dll version (I can see sqlite3.dll in its Program Files
>folder).  That commercial software has a menu option to make a backup
>copy of the database, and that's what I'm attempting to mine.  I'm only using
>SELECT against it.

>Many of my attempted queries, especially those with JOINs on TEXT columns
>fail with an "Error: no such collation sequence: LOCALE_NOCASE" message.

This means that the column is using a collation sequence named LOCALE_NOCASE 
and that you do not have that collation sequence loaded, ie, that "there is no 
such collation sequence".

>The "LOCALE_NOCASE" error message is obviously caused by the Keyword
>column in the Keywords table that is declared as COLLATE LOCALE_NOCASE.

Yes.  The column is also indexed.  The index is also unuseable without the 
collation sequence.

>Searching the SQLite and third-party websites for "LOCALE_NOCASE" suggests that
>this particular collation is not built-in to SQLite, but may have been
>developed by the commercial software programmers.  Since I am accessing the
>database from the SQLite3 command line executable (and not the software's 
>.dll), I
>seem to have no access to collations that begin with "LOCALE_".  

>QUESTION ONE:  Are these "LOCALE_..." collations custom collations developed by
>the software programmers, or are they something I just haven't discovered yet
>in the SQLite documentation?

They are custom collation names.  Whether or not it is a custom collation that 
is any different from the built-in NOCASE collation can only be answered by the 
programmer who used that collation name.

>I've developed a workaround.  Starting with the Keywords table schema, I
>CREATE TABLE a "myKeywords" table with the same columns but omitting the
>COLLATE LOCALE_NOCASE declaration.  Then I INSERT INTO the new
>"myKeywords" table all of the data from the original Keywords table.  
>My original query then runs without error against the new "myKeywords" table.  
>But this is a pretty onerous approach.

>I have noted by trial and error that modifying the original query's last
>line to include COLLATE NOCASE ("group by k.Keyword COLLATE NOCASE;")
>makes the query work against the original Keywords table without error.  I
>suspect this is the correct way to address the error message rather than my
>admittedly messy and time-consuming workaround.  

Yes, this will work.  You are overriding the collation sequence defined for the 
column in the table definition and using the builtin NOCASE (ASCII only) case 
insensitive collation.  This means that the index on k.keyword will not be used 
since it is the wrong collation.

>QUESTION TWO:  Can anyone verify that this is indeed the acceptable way to 
>address the error
>message, or provide a more acceptable way to do it?

It will work however the indexes created using the original collation will be 
unuseable.

>And finally, QUESTION THREE: Can anyone point me to a thorough and
>detailed tutorial of the whole SQLite COLLATE subject, including creation of
>custom collations and loading them into the SQLite command-line executable if
>possible?  The documentation seems really thin in this area!

https://sqlite.org/c3ref/create_collation.html
https://sqlite.org/datatype3.html#collation

A collation is simply a sorting function.  It receives two character strings A 
and B and returns their sort order.  If A < B then -1 is returned.  If A == B 
then 0 is returned.  If A > B then 1 is returned.  The default collation is 
BINARY which is nothing more than strcmp(A,B) (ie, sort in binary order).

Since this is on Windows I would suspect that LOCALE_NOCASE is simply a 
collation which uses the builtin Win32 CompareStringW function (with some 
combination of flags) to allow handling of characters other than just ASII.

>TIA for any suggestions anyone can provide.

There are a couple of simple options that I can see:

(1) dump the database, modify the dump to get rid of the custom collation 
(perhaps use the builtin NOCASE collation, perhaps the data is not multilingual 
but is plain ASCII), and then reload the .dump file into a new database, and 
use that (the new database will be using an ASCII-only NOCASE collation, so if 
you have non-ASCII characters this may produce different results than your 
application)

sqlite3 broken_database.db .dump > dumpfile.sql
edit dumpfile.sql and change LOCALE_NOCASE to NOCASE
sqlite3 newdatabase.db < dumpfile.sql

(2) look in the application directories for another DLL that implements the 
collation sequence, and load it into your command shell using .load <dllname>

(3) use the version of the command shell that loads sqlite3.dll rather than 
having it statically linked into the executable.  This probably requires the 
most fiddling and actually getting the collation loaded depends on if it has 
been builtin to a custom sqlite3.dll or part of the application, or is 
somewhere else (like in another DLL).

Generally (2) is the most likely to work if you can find the extension that the 
application loads in the application directories.  Otherwise, you will have to 
dump/modify/reload the database to get rid of the custom collation.  
Alternatively, you could update the database schema (sqlite_master) table and 
change the collation name, then re-index the database (because the indexes will 
contain data from the old collation, not the one you changed it to).

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.



_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to