Sean, fully agreed this is a problem.  The way I see it, I don't think
there's an easy solution.  This query is made with a template that fills in
different versions of an Event Logging schema.  This is bad for more than
just this performance reason:

* when a new schema revision is implemented, either dozens of queries have
to be updated or we have to live with this crappy performance
* when an old schema revision table is deleted, queries break regardless of
approach
* when querying ad-hoc, this is yet another stumbling block to getting at
the data

We are currently focused on moving Event Logging to Kafka, and I think that
the way we handle Event Logging analytics will change too much too soon for
us to address this.  If it's a problem for the database, I'd rather just
disable the queries for now.  Let me know and I'll make a judgement call to
either go through and fix them all or disable them all.

On Wed, Jun 24, 2015 at 8:43 AM, Sean Pringle <[email protected]>
wrote:

> Below are a couple examples of recent slow queries running on
> analytics-store. They both use subqueries and UNION. Neither push the
> WHERE conditions down to the subqueries, which means indexes are not
> used. If you include WHERE clauses in both inner and outer queries you
> will achieve index usage, especially for these sorts of eventlogging
> queries on event_action or wiki.
>
> MySQL / MariaDB query optimizer does do constant propagation, but
> UNION tends to trip it up in all sorts of exciting ways. Be verbose,
> abandon DRY principle, and make your life easier :-)
>
> Current query:
>
> SELECT Month.Date, COALESCE(Web.Web, 0) AS Web FROM (
>     SELECT DATE_FORMAT( ADDDATE(CURDATE() - INTERVAL 30 - 1 DAY,
> @num:=@num+1), '%Y-%m-%d' ) AS Date
>         FROM seq_1_to_100, (SELECT @num:=-1) num LIMIT 30 ) AS Month LEFT
> JOIN (
>         SELECT DATE(timestamp) AS Date,
>         SUM(1) AS Web FROM (
>             SELECT timestamp, wiki, event_username, event_action,
> event_namespace, event_userEditCount FROM MobileWebEditing_5644223
>             UNION SELECT timestamp, wiki, event_username,
> event_action, event_namespace, event_userEditCount
>                 FROM MobileWebEditing_6077315
>             UNION SELECT timestamp, wiki, event_username,
> event_action, event_namespace, event_userEditCount
>                 from MobileWebEditing_6637866
>             UNION SELECT timestamp, wiki, event_username,
> event_action, event_namespace, event_userEditCount
>                 from MobileWebEditing_7675117
>             UNION SELECT timestamp, wiki, event_username,
> event_action, event_namespace, event_userEditCount
>                 from MobileWebEditing_8599025
>         ) as MobileWebEditing
>     WHERE event_namespace != 0 AND event_action = 'success' AND wiki
> != 'testwiki'
>     GROUP BY Date
> ) AS Web ON Month.Date = Web.Date;
>
> Version with WHERE clauses pushed down to subqueries:
>
> SELECT Month.Date, COALESCE(Web.Web, 0) AS Web FROM (
>     SELECT DATE_FORMAT( ADDDATE(CURDATE() - INTERVAL 30 - 1 DAY,
> @num:=@num+1), '%Y-%m-%d' ) AS Date
>         FROM seq_1_to_100, (SELECT @num:=-1) num LIMIT 30 ) AS Month LEFT
> JOIN (
>         SELECT DATE(timestamp) AS Date,
>         SUM(1) AS Web FROM (
>             SELECT timestamp, wiki, event_username, event_action,
> event_namespace, event_userEditCount FROM MobileWebEditing_5644223
>             UNION SELECT timestamp, wiki, event_username,
> event_action, event_namespace, event_userEditCount
>                 FROM MobileWebEditing_6077315 WHERE event_namespace !=
> 0 AND event_action = 'success' AND wiki != 'testwiki'
>             UNION SELECT timestamp, wiki, event_username,
> event_action, event_namespace, event_userEditCount
>                 from MobileWebEditing_6637866 WHERE event_namespace !=
> 0 AND event_action = 'success' AND wiki != 'testwiki'
>             UNION SELECT timestamp, wiki, event_username,
> event_action, event_namespace, event_userEditCount
>                 from MobileWebEditing_7675117 WHERE event_namespace !=
> 0 AND event_action = 'success' AND wiki != 'testwiki'
>             UNION SELECT timestamp, wiki, event_username,
> event_action, event_namespace, event_userEditCount
>                 from MobileWebEditing_8599025 WHERE event_namespace !=
> 0 AND event_action = 'success' AND wiki != 'testwiki'
>         ) as MobileWebEditing
>     WHERE event_namespace != 0 AND event_action = 'success' AND wiki
> != 'testwiki'
>     GROUP BY Date
> ) AS Web ON Month.Date = Web.Date;
>
> Current query (hours):
>
> SELECT day,
> `Attempted saves` / `Loads` AS "All wikis",
> `Attempted saves - dewiki` / `Loads - dewiki` AS "dewiki",
> `Attempted saves - enwiki` / `Loads - enwiki` AS "enwiki",
> `Attempted saves - eswiki` / `Loads - eswiki` AS "eswiki",
> `Attempted saves - frwiki` / `Loads - frwiki` AS "frwiki",
> `Attempted saves - ptwiki` / `Loads - plwiki` AS "plwiki",
> `Attempted saves - plwiki` / `Loads - ptwiki` AS "ptwiki" FROM (
>     SELECT DATE( timestamp ) AS day,
>     SUM( IF( event_action = 'ready', 1, 0 ) ) AS "Loads",
>     SUM( IF( event_action = 'ready' AND wiki = 'dewiki', 1, 0 ) ) AS
> "Loads - dewiki",
>     SUM( IF( event_action = 'ready' AND wiki = 'enwiki', 1, 0 ) ) AS
> "Loads - enwiki",
>     SUM( IF( event_action = 'ready' AND wiki = 'eswiki', 1, 0 ) ) AS
> "Loads - eswiki",
>     SUM( IF( event_action = 'ready' AND wiki = 'frwiki', 1, 0 ) ) AS
> "Loads - frwiki",
>     SUM( IF( event_action = 'ready' AND wiki = 'plwiki', 1, 0 ) ) AS
> "Loads - plwiki",
>     SUM( IF( event_action = 'ready' AND wiki = 'ptwiki', 1, 0 ) ) AS
> "Loads - ptwiki",
>     SUM( IF( event_action = 'saveAttempt', 1, 0 ) ) AS "Attempted saves",
>     SUM( IF( event_action = 'saveAttempt' AND wiki = 'dewiki', 1, 0 )
> ) AS "Attempted saves - dewiki",
>     SUM( IF( event_action = 'saveAttempt' AND wiki = 'enwiki', 1, 0 )
> ) AS "Attempted saves - enwiki",
>     SUM( IF( event_action = 'saveAttempt' AND wiki = 'eswiki', 1, 0 )
> ) AS "Attempted saves - eswiki",
>     SUM( IF( event_action = 'saveAttempt' AND wiki = 'frwiki', 1, 0 )
> ) AS "Attempted saves - frwiki",
>     SUM( IF( event_action = 'saveAttempt' AND wiki = 'plwiki', 1, 0 )
> ) AS "Attempted saves - plwiki",
>     SUM( IF( event_action = 'saveAttempt' AND wiki = 'ptwiki', 1, 0 )
> ) AS "Attempted saves - ptwiki" FROM (
>         SELECT event_action, wiki, timestamp
>         FROM Edit_10604157
>         UNION ALL
>         SELECT event_action, wiki, timestamp
>         FROM Edit_10676603
>     ) AS valid_edit_schema_revisions
>     GROUP BY day
>     ORDER BY day
> ) as internalQuery;
>
> Version with WHERE clauses pushed down to subqueries:
>
> SELECT day,
> `Attempted saves` / `Loads` AS "All wikis",
> `Attempted saves - dewiki` / `Loads - dewiki` AS "dewiki",
> `Attempted saves - enwiki` / `Loads - enwiki` AS "enwiki",
> `Attempted saves - eswiki` / `Loads - eswiki` AS "eswiki",
> `Attempted saves - frwiki` / `Loads - frwiki` AS "frwiki",
> `Attempted saves - ptwiki` / `Loads - plwiki` AS "plwiki",
> `Attempted saves - plwiki` / `Loads - ptwiki` AS "ptwiki" FROM (
>     SELECT DATE( timestamp ) AS day,
>     SUM( IF( event_action = 'ready', 1, 0 ) ) AS "Loads",
>     SUM( IF( event_action = 'ready' AND wiki = 'dewiki', 1, 0 ) ) AS
> "Loads - dewiki",
>     SUM( IF( event_action = 'ready' AND wiki = 'enwiki', 1, 0 ) ) AS
> "Loads - enwiki",
>     SUM( IF( event_action = 'ready' AND wiki = 'eswiki', 1, 0 ) ) AS
> "Loads - eswiki",
>     SUM( IF( event_action = 'ready' AND wiki = 'frwiki', 1, 0 ) ) AS
> "Loads - frwiki",
>     SUM( IF( event_action = 'ready' AND wiki = 'plwiki', 1, 0 ) ) AS
> "Loads - plwiki",
>     SUM( IF( event_action = 'ready' AND wiki = 'ptwiki', 1, 0 ) ) AS
> "Loads - ptwiki",
>     SUM( IF( event_action = 'saveAttempt', 1, 0 ) ) AS "Attempted saves",
>     SUM( IF( event_action = 'saveAttempt' AND wiki = 'dewiki', 1, 0 )
> ) AS "Attempted saves - dewiki",
>     SUM( IF( event_action = 'saveAttempt' AND wiki = 'enwiki', 1, 0 )
> ) AS "Attempted saves - enwiki",
>     SUM( IF( event_action = 'saveAttempt' AND wiki = 'eswiki', 1, 0 )
> ) AS "Attempted saves - eswiki",
>     SUM( IF( event_action = 'saveAttempt' AND wiki = 'frwiki', 1, 0 )
> ) AS "Attempted saves - frwiki",
>     SUM( IF( event_action = 'saveAttempt' AND wiki = 'plwiki', 1, 0 )
> ) AS "Attempted saves - plwiki",
>     SUM( IF( event_action = 'saveAttempt' AND wiki = 'ptwiki', 1, 0 )
> ) AS "Attempted saves - ptwiki" FROM (
>         SELECT event_action, wiki, timestamp
>         FROM Edit_10604157
>         WHERE event_action in ('saveAttempt', 'ready')
>         UNION ALL
>         SELECT event_action, wiki, timestamp
>         FROM Edit_10676603
>         WHERE event_action in ('saveAttempt', 'ready')
>     ) AS valid_edit_schema_revisions
>     GROUP BY day
>     ORDER BY day
> ) as internalQuery;
>
> --
> DBA @ WMF
>
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics

Reply via email to