Has anyone taken responsibility for fixing this?

On Sun, May 4, 2014 at 5:58 AM, Sean Pringle <[email protected]> wrote:

> Hi All
>
> The query form listed inline below runs periodically on s1-analytics-slave
> and now analytics-store. It generates 30 days worth of data without gaps by
> joining a table known to have more than 30 rows.
>
> That old trick is perfectly OK. However the table chosen is
> information_schema.columns and that's NOT OK :-)
>
> The problem is that the metadata in some information_schema tables must be
> materialized every time they're accessed. In this case the .frm file for
> *every table in the system* must be opened and checked before the query
> runs[1]. Yes, every table.
>
> This was probably always slow on s1-analytics-slave with enwiki + log +
> personal tables. It's even slower now that analytics-store holds all wikis.
>
> SELECT
>     Month.Date,
>     COALESCE(Web.Web, 0) AS Web
> FROM
>     (SELECT
>         DATE_FORMAT(ADDDATE(CURDATE() - INTERVAL 30 - 1 DAY,
> @num:=@num+1), '%Y-%m-%d') AS Date
>         FROM information_schema.columns, (SELECT @num:=-1) num LIMIT 30)
> AS Month
> LEFT JOIN
>     (SELECT DATE(timestamp) AS Date, SUM(1) AS Web FROM (
>         SELECT timestamp, wiki, event_username, event_action,
> event_namespace, event_userEditCount
>             FROM MobileWebEditing_5644223
>         UNION
>         SELECT timestamp, wiki, event_username, event_action,
> event_namespace, event_userEditCount
>             FROM MobileWebEditing_6077315
>         UNION
>         SELECT timestamp, wiki, event_username, event_action,
> event_namespace, event_userEditCount
>             from MobileWebEditing_6637866
>         UNION
>         SELECT timestamp, wiki, event_username, event_action,
> event_namespace, event_userEditCount
>             from MobileWebEditing_7675117
>         ) as MobileWebEditing
>     WHERE
>         event_action = 'error'
>         AND wiki != 'testwiki'
>     GROUP BY Date
> ) AS Web
> ON Month.Date = Web.Date;
>
> MariaDB 10 has an SQL trick for generating sequences. Hackish, but simple:
>
> SELECT seq FROM seq_1_to_5;
>
> +-----+
> | seq |
> +-----+
> |   1 |
> |   2 |
> |   3 |
> |   4 |
> |   5 |
> +-----+
>
> https://mariadb.com/kb/en/sequence/
>
> Alternatively, we can put a real sequence table somewhere handy with, say,
> 1000 integers.
>
> BR
> Sean
>
> [1] The file accesses can be alleviated by having large table [definition]
> cache(s), however then we're talking hundreds of thousands more open file
> handles which is a whole new ceiling waiting to be hit :-)
>
> --
> DBA @ WMF
>
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
>
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics

Reply via email to