On Tue, Mar 22, 2011 at 2:25 PM, Dan Kennedy <[email protected]> wrote:
> On 03/22/2011 04:26 PM, Max Vlasov wrote:
> > Hi,
> >
> > recently I finally started experimenting with virtual tables and there's
> at
> > least one thing I can not understand.
> >
> > As I see xBestIndex/xFilter were developed to allow fast searching if the
> > implementation is able to do this. But there's also sql language that
> allows
> > very exotic queries. Some of them may be recognized by the
> implementation,
> > some not. If the former, one just can rely on sqlite double checking and
> > just do full scan. But there are also cases when it looks like
> recognition
> > is not possible. For example
> >
> > SELECT * FROM vtest where id> random()
> >
> > in this case xBestIndex just assumes some constant as the expression, so
> the
> > one who implements just can't detect probably unresolved query and thinks
> > that it can search quickly (binary search, for example). The call to
> xFilter
> > just passes first random value and sqlite will never call it again for
> the
> > same enumeration. So xFilter thinks this is the constant value used in
> the
> > query and jumps to the first correct row row never planning to jump back.
> > But this is actually a misleading action since in real world sqlite calls
> > random on every row and the rows bypassed are actually important and can
> be
> > evaluated to true. I mentioned random(), but there may be other cases,
> for
> > example when other fields are part of expressions.
>
> SQLite assumes that the result of each expression in the WHERE
> clause depends only on its inputs. If the input arguments are
> the same, the output should be do. Since random() has no inputs,
> SQLite figures that it must always return the same value.
>
> You can see a similar effect with:
>
> CREATE TABLE t1(a PRIMARY KEY, b);
> SELECT * FROM t1 WHERE a > random(); -- random() evaluated once.
> SELECT * FROM t1 WHERE +a > random(); -- random() evaluated many times
>
Dan, thanks, I double-checked your information and (ironically) I see that
the problem is with "the double check" :)
As I see now, sqlite does a great job that probably won't require any
additional steps for the problem I posted. So if the expression is not
"simple" in the terms I used, it just won't supply any constraint to
xBestIndex so automatically forcing full-scan. But if the double-check is
on, sqlite seems like actually checks random() for every result row and this
actually can give non-correct result.
Although I can not confirm the assumption with the numbers, but I also
checked this hypothesis with another "dynamic" expression using milliseconds
SELECT * FROM vtest WHERE (id = cast(strftime('%f','now')*1000 as integer))
and for a comparatively large dataset the value passed in xFilter is always
different to one returned if I just use full scan and double-checking (for
example 15719 vs 18984).
So it seems like virtual tables double checker always evaluates the
expression used for every row. One can live with that just by disabling
double-checking or not using such dynamics at all. I'm not sure whether such
a minor thing should be fixed in the core.
Thanks
Max Vlasov.
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users