Thank you all for your replies, @RSmith:
My question is about the _specific_ case: SELECT * FROM ATable WHERE AColumn LIKE 'FixedSearchPart' || 'SomeWildCard' where the right side of the LIKE operator is a constant that has a fixed part at the beginning and a wildcard after that. The optimization is that the index is used to produce a rowset in which AColumn starts with 'FixedSearchPart' and then rows are filtered according to 'SomeWildCard'. For instance, in SELECT * FROM ATable WHERE AColumn LIKE 'Constantine%Yann%' As I understand it, the index scan will use the string 'Constantine' as argument and then the full string 'Constantine%Yann%' will be used to further filter the rowset. Of course any other case that has no fixed part at the start of the right-side string will have to fall back to a full scan and filter. Maybe the parameter notation was a little confusing but from the description you can deduce that it will not contain a wildcard. Essentially, the user will be asking for the rows where AColumn BEGINS WITH a string. @Jean-Christophe: Thank you for the extension. I will certainly have a look at it, but I already have the comparison algorithm (specific to latin/greek) from another case. So the only thing I have to do is to tailor it inside a custom collation. @Yuriy: Yes, the current interface for custom functions does not provide any information on what the arguments are, so it is impossible to deduce the collation of arguments inside the function body. That's why this has to be implemented by the SqLite query optimizer itself. I was thinking about something like: - The query optimizer understands that the left-side of the LIKE argument is a column with an index and a collation, and the right side is a literal with a fixed left part (after parameter substitution). - It takes the fixed part, appends a very high-order (preferably non-printable) character to it and use the index to materialize this clause: AColumn BETWEEN 'FixedPart' AND 'FixedPart' || <HighOrderChar>. If we were using ANSI I would suggest something like: AColumn BETWEEN 'FixedPart' AND 'FixedPart' || (char)0xFF but for UTF-8 I am not sure what to suggest. It would have to be a UTF8 character value that is guaranteed to be greater than any other usable character in any code page. I am not sure whether there actually is one. -Then it applies the whole right-side argument to this rowset using LIKE pattern matching to produce the correct result. @Simon: Your proposal is very clever but it has two weaknesses: - You are assuming that 'z' is the higher order character that can appear in a value. This is not the case; for instance greek characters have higher order than 'z'. This can be fixed (only for latin/greek) by using the highest order greek character 'ώ' (accented omega) instead of 'z'; but I would prefer a very high-order non-printable one instead. - It assumes that the column has a maximum width. Since SQLite does not have maximum widths for columns of TEXT affinity (although a max width can be implied and imposed by the app itself), improbable as it may be, there could be a row with the value :SearchString||'zzzzzzzzzzzz'||'a' which will erroneously not make it into the result set. Again, the very high-order non-printable character would solve this. So it can be a solution for a specific case with a given max value and a known set of code pages but it cannot be a general solution. --Constantine. _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users