Sean, fully agreed this is a problem. The way I see it, I don't think there's an easy solution. This query is made with a template that fills in different versions of an Event Logging schema. This is bad for more than just this performance reason:
* when a new schema revision is implemented, either dozens of queries have to be updated or we have to live with this crappy performance * when an old schema revision table is deleted, queries break regardless of approach * when querying ad-hoc, this is yet another stumbling block to getting at the data We are currently focused on moving Event Logging to Kafka, and I think that the way we handle Event Logging analytics will change too much too soon for us to address this. If it's a problem for the database, I'd rather just disable the queries for now. Let me know and I'll make a judgement call to either go through and fix them all or disable them all. On Wed, Jun 24, 2015 at 8:43 AM, Sean Pringle <[email protected]> wrote: > Below are a couple examples of recent slow queries running on > analytics-store. They both use subqueries and UNION. Neither push the > WHERE conditions down to the subqueries, which means indexes are not > used. If you include WHERE clauses in both inner and outer queries you > will achieve index usage, especially for these sorts of eventlogging > queries on event_action or wiki. > > MySQL / MariaDB query optimizer does do constant propagation, but > UNION tends to trip it up in all sorts of exciting ways. Be verbose, > abandon DRY principle, and make your life easier :-) > > Current query: > > SELECT Month.Date, COALESCE(Web.Web, 0) AS Web FROM ( > SELECT DATE_FORMAT( ADDDATE(CURDATE() - INTERVAL 30 - 1 DAY, > @num:=@num+1), '%Y-%m-%d' ) AS Date > FROM seq_1_to_100, (SELECT @num:=-1) num LIMIT 30 ) AS Month LEFT > JOIN ( > SELECT DATE(timestamp) AS Date, > SUM(1) AS Web FROM ( > SELECT timestamp, wiki, event_username, event_action, > event_namespace, event_userEditCount FROM MobileWebEditing_5644223 > UNION SELECT timestamp, wiki, event_username, > event_action, event_namespace, event_userEditCount > FROM MobileWebEditing_6077315 > UNION SELECT timestamp, wiki, event_username, > event_action, event_namespace, event_userEditCount > from MobileWebEditing_6637866 > UNION SELECT timestamp, wiki, event_username, > event_action, event_namespace, event_userEditCount > from MobileWebEditing_7675117 > UNION SELECT timestamp, wiki, event_username, > event_action, event_namespace, event_userEditCount > from MobileWebEditing_8599025 > ) as MobileWebEditing > WHERE event_namespace != 0 AND event_action = 'success' AND wiki > != 'testwiki' > GROUP BY Date > ) AS Web ON Month.Date = Web.Date; > > Version with WHERE clauses pushed down to subqueries: > > SELECT Month.Date, COALESCE(Web.Web, 0) AS Web FROM ( > SELECT DATE_FORMAT( ADDDATE(CURDATE() - INTERVAL 30 - 1 DAY, > @num:=@num+1), '%Y-%m-%d' ) AS Date > FROM seq_1_to_100, (SELECT @num:=-1) num LIMIT 30 ) AS Month LEFT > JOIN ( > SELECT DATE(timestamp) AS Date, > SUM(1) AS Web FROM ( > SELECT timestamp, wiki, event_username, event_action, > event_namespace, event_userEditCount FROM MobileWebEditing_5644223 > UNION SELECT timestamp, wiki, event_username, > event_action, event_namespace, event_userEditCount > FROM MobileWebEditing_6077315 WHERE event_namespace != > 0 AND event_action = 'success' AND wiki != 'testwiki' > UNION SELECT timestamp, wiki, event_username, > event_action, event_namespace, event_userEditCount > from MobileWebEditing_6637866 WHERE event_namespace != > 0 AND event_action = 'success' AND wiki != 'testwiki' > UNION SELECT timestamp, wiki, event_username, > event_action, event_namespace, event_userEditCount > from MobileWebEditing_7675117 WHERE event_namespace != > 0 AND event_action = 'success' AND wiki != 'testwiki' > UNION SELECT timestamp, wiki, event_username, > event_action, event_namespace, event_userEditCount > from MobileWebEditing_8599025 WHERE event_namespace != > 0 AND event_action = 'success' AND wiki != 'testwiki' > ) as MobileWebEditing > WHERE event_namespace != 0 AND event_action = 'success' AND wiki > != 'testwiki' > GROUP BY Date > ) AS Web ON Month.Date = Web.Date; > > Current query (hours): > > SELECT day, > `Attempted saves` / `Loads` AS "All wikis", > `Attempted saves - dewiki` / `Loads - dewiki` AS "dewiki", > `Attempted saves - enwiki` / `Loads - enwiki` AS "enwiki", > `Attempted saves - eswiki` / `Loads - eswiki` AS "eswiki", > `Attempted saves - frwiki` / `Loads - frwiki` AS "frwiki", > `Attempted saves - ptwiki` / `Loads - plwiki` AS "plwiki", > `Attempted saves - plwiki` / `Loads - ptwiki` AS "ptwiki" FROM ( > SELECT DATE( timestamp ) AS day, > SUM( IF( event_action = 'ready', 1, 0 ) ) AS "Loads", > SUM( IF( event_action = 'ready' AND wiki = 'dewiki', 1, 0 ) ) AS > "Loads - dewiki", > SUM( IF( event_action = 'ready' AND wiki = 'enwiki', 1, 0 ) ) AS > "Loads - enwiki", > SUM( IF( event_action = 'ready' AND wiki = 'eswiki', 1, 0 ) ) AS > "Loads - eswiki", > SUM( IF( event_action = 'ready' AND wiki = 'frwiki', 1, 0 ) ) AS > "Loads - frwiki", > SUM( IF( event_action = 'ready' AND wiki = 'plwiki', 1, 0 ) ) AS > "Loads - plwiki", > SUM( IF( event_action = 'ready' AND wiki = 'ptwiki', 1, 0 ) ) AS > "Loads - ptwiki", > SUM( IF( event_action = 'saveAttempt', 1, 0 ) ) AS "Attempted saves", > SUM( IF( event_action = 'saveAttempt' AND wiki = 'dewiki', 1, 0 ) > ) AS "Attempted saves - dewiki", > SUM( IF( event_action = 'saveAttempt' AND wiki = 'enwiki', 1, 0 ) > ) AS "Attempted saves - enwiki", > SUM( IF( event_action = 'saveAttempt' AND wiki = 'eswiki', 1, 0 ) > ) AS "Attempted saves - eswiki", > SUM( IF( event_action = 'saveAttempt' AND wiki = 'frwiki', 1, 0 ) > ) AS "Attempted saves - frwiki", > SUM( IF( event_action = 'saveAttempt' AND wiki = 'plwiki', 1, 0 ) > ) AS "Attempted saves - plwiki", > SUM( IF( event_action = 'saveAttempt' AND wiki = 'ptwiki', 1, 0 ) > ) AS "Attempted saves - ptwiki" FROM ( > SELECT event_action, wiki, timestamp > FROM Edit_10604157 > UNION ALL > SELECT event_action, wiki, timestamp > FROM Edit_10676603 > ) AS valid_edit_schema_revisions > GROUP BY day > ORDER BY day > ) as internalQuery; > > Version with WHERE clauses pushed down to subqueries: > > SELECT day, > `Attempted saves` / `Loads` AS "All wikis", > `Attempted saves - dewiki` / `Loads - dewiki` AS "dewiki", > `Attempted saves - enwiki` / `Loads - enwiki` AS "enwiki", > `Attempted saves - eswiki` / `Loads - eswiki` AS "eswiki", > `Attempted saves - frwiki` / `Loads - frwiki` AS "frwiki", > `Attempted saves - ptwiki` / `Loads - plwiki` AS "plwiki", > `Attempted saves - plwiki` / `Loads - ptwiki` AS "ptwiki" FROM ( > SELECT DATE( timestamp ) AS day, > SUM( IF( event_action = 'ready', 1, 0 ) ) AS "Loads", > SUM( IF( event_action = 'ready' AND wiki = 'dewiki', 1, 0 ) ) AS > "Loads - dewiki", > SUM( IF( event_action = 'ready' AND wiki = 'enwiki', 1, 0 ) ) AS > "Loads - enwiki", > SUM( IF( event_action = 'ready' AND wiki = 'eswiki', 1, 0 ) ) AS > "Loads - eswiki", > SUM( IF( event_action = 'ready' AND wiki = 'frwiki', 1, 0 ) ) AS > "Loads - frwiki", > SUM( IF( event_action = 'ready' AND wiki = 'plwiki', 1, 0 ) ) AS > "Loads - plwiki", > SUM( IF( event_action = 'ready' AND wiki = 'ptwiki', 1, 0 ) ) AS > "Loads - ptwiki", > SUM( IF( event_action = 'saveAttempt', 1, 0 ) ) AS "Attempted saves", > SUM( IF( event_action = 'saveAttempt' AND wiki = 'dewiki', 1, 0 ) > ) AS "Attempted saves - dewiki", > SUM( IF( event_action = 'saveAttempt' AND wiki = 'enwiki', 1, 0 ) > ) AS "Attempted saves - enwiki", > SUM( IF( event_action = 'saveAttempt' AND wiki = 'eswiki', 1, 0 ) > ) AS "Attempted saves - eswiki", > SUM( IF( event_action = 'saveAttempt' AND wiki = 'frwiki', 1, 0 ) > ) AS "Attempted saves - frwiki", > SUM( IF( event_action = 'saveAttempt' AND wiki = 'plwiki', 1, 0 ) > ) AS "Attempted saves - plwiki", > SUM( IF( event_action = 'saveAttempt' AND wiki = 'ptwiki', 1, 0 ) > ) AS "Attempted saves - ptwiki" FROM ( > SELECT event_action, wiki, timestamp > FROM Edit_10604157 > WHERE event_action in ('saveAttempt', 'ready') > UNION ALL > SELECT event_action, wiki, timestamp > FROM Edit_10676603 > WHERE event_action in ('saveAttempt', 'ready') > ) AS valid_edit_schema_revisions > GROUP BY day > ORDER BY day > ) as internalQuery; > > -- > DBA @ WMF > > _______________________________________________ > Analytics mailing list > [email protected] > https://lists.wikimedia.org/mailman/listinfo/analytics >
_______________________________________________ Analytics mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/analytics
