Firstly, thank you all for your answers,

egrep gets to work with a flat text file, which it can easily read
sequentially and get optimal performance from the OS's file
buffering/cache management.  It only needs to read a piece of the file
and scan for patterns, repeating until done.  The only structure it
needs to be aware of is line breaks, but that is so simple it can be
folded into the pattern scan itself.

You are mostly describing fgrep. Egrep does a regular expression search thus honouring column layout in the example I gave earlier. (FYI fgrep is 50% faster than egrep than egrep in this query)

The points are two:
1) Could sqlite structure allow for faster full column scan queries?
   More than 1000% slower performance seems way too much difference.
2) an index could (and should) be used when using LIKE 'asdf'

Well I actually solved this problem by using
PRAGMA case_sensitive_like = 1;
Now the index is being used (for all non-english characters).

But the online manual states:

"
if case_sensitive_like mode is enabled then the column must use the default BINARY collating sequence, if case_sensitive_like mode is disabled then the column must use the b uilt-in NOCASE collating sequence.

 ....
NOCASE - The same as binary, except the 26 upper case characters used by the English language are folded to their lower case equivalents before the comparison is performed.
"

so as I understand it there's no reason that the index was not used in the NOCASE scenario. It should do case-insensitive matches for the 26 latin characters and case-sensitive (binary) for all the others.

Or I'm still missing some essential sql parameter. Or perhaps my version is too old.

Thank you all for your time,
All the best,
-Spiros

-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------

Reply via email to