On Tue, May 13, 2014 at 2:16 AM, Gilles Dubuc <[email protected]> wrote:

> I noticed the IS NOT NULL version appear. Forcing the index in that case
>> may not help; I suspect not-null will just cause an index scan and double
>> the overhead compared to the table scan (because the secondary index isn't
>> clustered).
>>
>
> So what's the best thing to do for now? Not adding "IS NOT NULL" and
> letting it do a full table scan? That global query is indeed the last one
> that remains noticeably slow (it's running over 120+ million rows).
>

There are zero NULL values for "wiki" in those tables, so forcing the
secondary index won't help and will be slightly slower than a table scan on
the clustered primary key (extra level of indirection for each row touched).

I suggest removing the FORCE INDEX clauses entirely from both forms of the
query. Just keep the conditions pushed down to the unioned tables for now.

When filtering by a wiki name, the ix_*_wiki_timestamp indexes are chosen
regardless.

When not filtering by wiki, the optimizer should be free to choose the
is_*_timestamp index if possible (ie, if the tables ever hold a lot of data
older than the 30 day range).
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics

Reply via email to