Sean,

First, thanks for killing those queries.

Second, I've cc-ed mobile tech to talk about this problem.  This query is
running periodically on a scheduler.  It is re-computing all data every
time it runs.  The mobile folks developed logic in the scheduler to
time-box some queries, but problems with Event Logging data and the
incredibly frequent need to re-run after data outages and backfills has
left this kind of monster query as the only sane option.  So our current
reality is that the only sane option is insane.

Indexing these tables to make this insane query run is like the old "Why
don't you turn in your crazy brother who thinks he's a chicken?  I would,
but I need the eggs." joke.  Yes we do need the eggs, but no we're not
going to get them from this query.

Marcel wrote a new scheduler that handles failures much better and is much
easier to work with in terms of re-runs.  That's the only option to keep
this kind of data generation alive.  Until we get that running, we are
going to cancel this and other non-timeboxed queries.  This patch:
https://gerrit.wikimedia.org/r/#/c/210364/ has mobile folks on it and
Sean.  Sean, if this problem happens again before we take action, just
merge this patch and that will stop the bleeding.

In the long run, we are trying to migrate this data to a scalable analytics
platform where you can run queries like this and the system will cleverly
re-compute only if the underlying data changes.  Druid, Pipeline DB, etc.
are candidates.  The intermediate step to that will be moving the Event
Logging pipeline to Kafka and Hadoop, which we are in the process of doing
right now.


On Tue, May 12, 2015 at 2:12 AM, Sean Pringle <[email protected]>
wrote:

> I just killed 100+ 3-day unindexed research queries on dbstore1002.
> All replication streams were lagging by nearly 1 day, and /tmp was
> hundreds of GB.
>
> This seems similar to the problem from ~2 weeks ago, when old /tmp did
> fill up. The queries were of the following form (but with some
> variation). We need some indexing scheme for  MobileWebEditing*
> tables, or to come up with a new approach.
>
> SELECT
>     Month.Date,
>     COALESCE(Web.Web, 0) AS Web
>
> -- http://stackoverflow.com/a/6871220/365238
> -- ... using MariaDB 10 SEQUENCE engine instead of
> information_schema.columns
> FROM (
>     SELECT DATE_FORMAT(
>         ADDDATE(CURDATE() - INTERVAL 30 - 1 DAY, @num:=@num+1),
>         '%Y-%m-%d'
>     ) AS Date
>     FROM seq_1_to_100, (SELECT @num:=-1) num LIMIT 30
> ) AS Month
> LEFT JOIN (
>     SELECT
>         DATE(timestamp) AS Date,
>         SUM(1) AS Web
>
>     FROM (SELECT timestamp, wiki, event_username, event_action,
> event_namespace, event_userEditCount FROM MobileWebEditing_5644223
> UNION SELECT timestamp, wiki, event_username, event_action,
> event_namespace, event_userEditCount FROM MobileWebEditing_6077315
> UNION SELECT timestamp, wiki, event_username, event_action,
> event_namespace, event_userEditCount from MobileWebEditing_6637866
> UNION SELECT timestamp, wiki, event_username, event_action,
> event_namespace, event_userEditCount from MobileWebEditing_7675117
> UNION SELECT timestamp, wiki, event_username, event_action,
> event_namespace, event_userEditCount from MobileWebEditing_8599025) as
> MobileWebEditing WHERE
>         event_action = 'error' AND
>         wiki != 'testwiki'
>     GROUP BY Date
> ) AS Web ON Month.Date = Web.Date;
>
> EXPLAIN:
>
>
> +------+--------------+--------------------------+--------+---------------+---------+---------+------------+----------+----------------------------------------------+
> | id   | select_type  | table                    | type   |
> possible_keys | key     | key_len | ref        | rows     | Extra
>                                   |
>
> +------+--------------+--------------------------+--------+---------------+---------+---------+------------+----------+----------------------------------------------+
> |    1 | PRIMARY      | <derived2>               | ALL    | NULL
>    | NULL    | NULL    | NULL       |       30 |
>                        |
> |    1 | PRIMARY      | <derived4>               | ref    | key0
>    | key0    | 4       | Month.Date |   563154 | Using where
>                        |
> |    4 | DERIVED      | <derived5>               | ALL    | NULL
>    | NULL    | NULL    | NULL       | 56315405 | Using where; Using
> temporary; Using filesort |
> |    5 | DERIVED      | MobileWebEditing_5644223 | ALL    | NULL
>    | NULL    | NULL    | NULL       |  1152600 |
>                        |
> |    6 | UNION        | MobileWebEditing_6077315 | ALL    | NULL
>    | NULL    | NULL    | NULL       |   685212 |
>                        |
> |    7 | UNION        | MobileWebEditing_6637866 | ALL    | NULL
>    | NULL    | NULL    | NULL       |  1528269 |
>                        |
> |    8 | UNION        | MobileWebEditing_7675117 | ALL    | NULL
>    | NULL    | NULL    | NULL       |  1663281 |
>                        |
> |    9 | UNION        | MobileWebEditing_8599025 | ALL    | NULL
>    | NULL    | NULL    | NULL       | 51286043 |
>                        |
> | NULL | UNION RESULT | <union5,6,7,8,9>         | ALL    | NULL
>    | NULL    | NULL    | NULL       |     NULL |
>                        |
> |    2 | DERIVED      | <derived3>               | system | NULL
>    | NULL    | NULL    | NULL       |        1 |
>                        |
> |    2 | DERIVED      | seq_1_to_100             | index  | NULL
>    | PRIMARY | 8       | NULL       |      100 | Using index
>                        |
> |    3 | DERIVED      | NULL                     | NULL   | NULL
>    | NULL    | NULL    | NULL       |     NULL | No tables used
>                        |
>
> +------+--------------+--------------------------+--------+---------------+---------+---------+------------+----------+----------------------------------------------+
>
>
> ---
> DBA @ WMF
>
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics

Reply via email to