Replacing with mobile-l... Who's best to make updates?
On Tuesday, May 12, 2015, Dan Andreescu <[email protected]> wrote: > Sean, > > First, thanks for killing those queries. > > Second, I've cc-ed mobile tech to talk about this problem. This query is > running periodically on a scheduler. It is re-computing all data every > time it runs. The mobile folks developed logic in the scheduler to > time-box some queries, but problems with Event Logging data and the > incredibly frequent need to re-run after data outages and backfills has > left this kind of monster query as the only sane option. So our current > reality is that the only sane option is insane. > > Indexing these tables to make this insane query run is like the old "Why > don't you turn in your crazy brother who thinks he's a chicken? I would, > but I need the eggs." joke. Yes we do need the eggs, but no we're not > going to get them from this query. > > Marcel wrote a new scheduler that handles failures much better and is much > easier to work with in terms of re-runs. That's the only option to keep > this kind of data generation alive. Until we get that running, we are > going to cancel this and other non-timeboxed queries. This patch: > https://gerrit.wikimedia.org/r/#/c/210364/ has mobile folks on it and > Sean. Sean, if this problem happens again before we take action, just > merge this patch and that will stop the bleeding. > > In the long run, we are trying to migrate this data to a scalable > analytics platform where you can run queries like this and the system will > cleverly re-compute only if the underlying data changes. Druid, Pipeline > DB, etc. are candidates. The intermediate step to that will be moving the > Event Logging pipeline to Kafka and Hadoop, which we are in the process of > doing right now. > > > On Tue, May 12, 2015 at 2:12 AM, Sean Pringle <[email protected] > <javascript:_e(%7B%7D,'cvml','[email protected]');>> wrote: > >> I just killed 100+ 3-day unindexed research queries on dbstore1002. >> All replication streams were lagging by nearly 1 day, and /tmp was >> hundreds of GB. >> >> This seems similar to the problem from ~2 weeks ago, when old /tmp did >> fill up. The queries were of the following form (but with some >> variation). We need some indexing scheme for MobileWebEditing* >> tables, or to come up with a new approach. >> >> SELECT >> Month.Date, >> COALESCE(Web.Web, 0) AS Web >> >> -- http://stackoverflow.com/a/6871220/365238 >> -- ... using MariaDB 10 SEQUENCE engine instead of >> information_schema.columns >> FROM ( >> SELECT DATE_FORMAT( >> ADDDATE(CURDATE() - INTERVAL 30 - 1 DAY, @num:=@num+1), >> '%Y-%m-%d' >> ) AS Date >> FROM seq_1_to_100, (SELECT @num:=-1) num LIMIT 30 >> ) AS Month >> LEFT JOIN ( >> SELECT >> DATE(timestamp) AS Date, >> SUM(1) AS Web >> >> FROM (SELECT timestamp, wiki, event_username, event_action, >> event_namespace, event_userEditCount FROM MobileWebEditing_5644223 >> UNION SELECT timestamp, wiki, event_username, event_action, >> event_namespace, event_userEditCount FROM MobileWebEditing_6077315 >> UNION SELECT timestamp, wiki, event_username, event_action, >> event_namespace, event_userEditCount from MobileWebEditing_6637866 >> UNION SELECT timestamp, wiki, event_username, event_action, >> event_namespace, event_userEditCount from MobileWebEditing_7675117 >> UNION SELECT timestamp, wiki, event_username, event_action, >> event_namespace, event_userEditCount from MobileWebEditing_8599025) as >> MobileWebEditing WHERE >> event_action = 'error' AND >> wiki != 'testwiki' >> GROUP BY Date >> ) AS Web ON Month.Date = Web.Date; >> >> EXPLAIN: >> >> >> +------+--------------+--------------------------+--------+---------------+---------+---------+------------+----------+----------------------------------------------+ >> | id | select_type | table | type | >> possible_keys | key | key_len | ref | rows | Extra >> | >> >> +------+--------------+--------------------------+--------+---------------+---------+---------+------------+----------+----------------------------------------------+ >> | 1 | PRIMARY | <derived2> | ALL | NULL >> | NULL | NULL | NULL | 30 | >> | >> | 1 | PRIMARY | <derived4> | ref | key0 >> | key0 | 4 | Month.Date | 563154 | Using where >> | >> | 4 | DERIVED | <derived5> | ALL | NULL >> | NULL | NULL | NULL | 56315405 | Using where; Using >> temporary; Using filesort | >> | 5 | DERIVED | MobileWebEditing_5644223 | ALL | NULL >> | NULL | NULL | NULL | 1152600 | >> | >> | 6 | UNION | MobileWebEditing_6077315 | ALL | NULL >> | NULL | NULL | NULL | 685212 | >> | >> | 7 | UNION | MobileWebEditing_6637866 | ALL | NULL >> | NULL | NULL | NULL | 1528269 | >> | >> | 8 | UNION | MobileWebEditing_7675117 | ALL | NULL >> | NULL | NULL | NULL | 1663281 | >> | >> | 9 | UNION | MobileWebEditing_8599025 | ALL | NULL >> | NULL | NULL | NULL | 51286043 | >> | >> | NULL | UNION RESULT | <union5,6,7,8,9> | ALL | NULL >> | NULL | NULL | NULL | NULL | >> | >> | 2 | DERIVED | <derived3> | system | NULL >> | NULL | NULL | NULL | 1 | >> | >> | 2 | DERIVED | seq_1_to_100 | index | NULL >> | PRIMARY | 8 | NULL | 100 | Using index >> | >> | 3 | DERIVED | NULL | NULL | NULL >> | NULL | NULL | NULL | NULL | No tables used >> | >> >> +------+--------------+--------------------------+--------+---------------+---------+---------+------------+----------+----------------------------------------------+ >> >> >> --- >> DBA @ WMF >> >> _______________________________________________ >> Analytics mailing list >> [email protected] >> <javascript:_e(%7B%7D,'cvml','[email protected]');> >> https://lists.wikimedia.org/mailman/listinfo/analytics >> > >
_______________________________________________ Mobile-l mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/mobile-l
