Firstly, thank you all for your answers,
egrep gets to work with a flat text file, which it can easily read
sequentially and get optimal performance from the OS's file
buffering/cache management. It only needs to read a piece of the file
and scan for patterns, repeating until done. The only structure it
needs to be aware of is line breaks, but that is so simple it can be
folded into the pattern scan itself.
You are mostly describing fgrep. Egrep does a regular expression search
thus honouring column layout in the example I gave earlier. (FYI fgrep
is 50% faster than egrep than egrep in this query)
The points are two:
1) Could sqlite structure allow for faster full column scan queries?
More than 1000% slower performance seems way too much difference.
2) an index could (and should) be used when using LIKE 'asdf'
Well I actually solved this problem by using
PRAGMA case_sensitive_like = 1;
Now the index is being used (for all non-english characters).
But the online manual states:
"
if case_sensitive_like mode is enabled then the column must use the
default BINARY collating sequence,
if case_sensitive_like mode is disabled then the column must use the b
uilt-in NOCASE collating sequence.
....
NOCASE - The same as binary, except the 26 upper case characters used
by the English language are folded to their lower case equivalents
before the comparison is performed.
"
so as I understand it there's no reason that the index was not used in
the NOCASE scenario. It should do case-insensitive matches for the 26
latin characters and case-sensitive (binary) for all the others.
Or I'm still missing some essential sql parameter. Or perhaps my version
is too old.
Thank you all for your time,
All the best,
-Spiros
-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------