T hanks Bruce that was really smart! On Thu, Sep 29, 2016 at 3:26 PM, Bruce Myers <[email protected]> wrote:
> The query is using the tl_namespace index (slow) instead of the tl_from > index (fast). Since you can't force an index in a view, I tried removing > the 'and tl_namespace = 10' qualifier and it ran in 40 seconds. It used the > tl_from index. Since templatelinks usually point to the 10 namespace > anyway, there shouldn't be to many, if any, false positives by leaving the > qualifier out. > > > On 09/28/2016 03:03 PM, Huji Lee wrote: > > I wanted to revive this thread: can someone please help me find a way to > boost that query? > > On Fri, Sep 23, 2016 at 12:39 PM, Huji Lee <[email protected]> wrote: > >> Jaime, your hunch seems to be correct. I ran SHOW EXPLAIN on that query >> every few minutes until it timed out, and the number rows it was querying >> increased drastically. It very quickly reached 2399220000, with the last >> value I go before timeout being 3693530000. That is just way too many. >> >> T139090 could be the cause; it went into effect the day after last >> successful query, and it does affect indexes used by the query. >> >> One thing that can obviously make my query faster is to force it to run >> the joins in a particular order. I tried to enforce it through creating >> temporary tables in memory, but it failed: https://quarry.wmflabs.org/que >> ry/12719 >> >> Another approach would be to create indexes. However, my permissions >> don't allow me to see what indexes currently exist (is there a way around >> that?) so I cannot decide what new indexes can be added. >> >> On Fri, Sep 23, 2016 at 12:56 PM, Jaime Crespo <[email protected]> >> wrote: >> >>> So I cannot give you very specific advise, but based on the EXPLAIN, it >>> seems that you may be trying to read too many rows, up to >>> 100K * 300K * 300K (10 000 M) and then sort them, which is probably >>> going to be very slow. >>> >>> If they used to work in the past there could be 2 probable explanations: >>> a lot of rows have been inserted recently on one or several of the tables >>> (e.g. a new template or category with many members) or the indexes have >>> changed. Note that redoing the queries is something that we have to do >>> constantly in production because rows quantities change. I also can think >>> of this change that happened recently on production, but cannot say for >>> sure if it is related or could affect you negatively: >>> >>> https://phabricator.wikimedia.org/T139090 >>> >>> Maybe that affects your query and it is as easy to fix as reordering >>> your columns/changing slightly the filters or its order. If someone has a >>> suggestion to make it faster, that requires a change to labs, remember that >>> labs replicas are not static, and new indexes can be added if needed by the >>> community, and I will gladly apply them myself. >>> >>> On Fri, Sep 23, 2016 at 6:02 PM, Huji Lee <[email protected]> wrote: >>> >>>> With two connections to the DB via Terminal, it is possible! Here is >>>> what I got; it matches what you sent above too: >>>> >>>> >>>> >>>> MariaDB [fawiki_p]> show processlist; >>>> +---------+--------+-------------------+----------+--------- >>>> +------+----------------------------+----------------------- >>>> ------------------------------------------------------------ >>>> -------------------+----------+ >>>> | Id | User | Host | db | Command | Time | >>>> State | Info >>>> >>>> | Progress | >>>> +---------+--------+-------------------+----------+--------- >>>> +------+----------------------------+----------------------- >>>> ------------------------------------------------------------ >>>> -------------------+----------+ >>>> | 3556437 | s51403 | 10.68.23.58:53391 | fawiki_p | Query | 19 | >>>> Queried about 5450000 rows | select page_title, count(ll_lang) from page >>>> join category on page_title = cat_title left join catego | 0.000 | >>>> | 3556865 | s51403 | 10.68.23.58:54083 | fawiki_p | Query | 0 | >>>> init | show processlist >>>> | >>>> 0.000 | >>>> +---------+--------+-------------------+----------+--------- >>>> +------+----------------------------+----------------------- >>>> ------------------------------------------------------------ >>>> -------------------+----------+ >>>> 2 rows in set (0.00 sec) >>>> >>>> MariaDB [fawiki_p]> show explain for 3556437; >>>> +------+-------------+---------------+--------+------------- >>>> ---------------------------+-----------------------------+-- >>>> -------+------------------------+--------+------------------ >>>> ----------------------------+ >>>> | id | select_type | table | type | >>>> possible_keys | key | >>>> key_len | ref | rows | Extra >>>> | >>>> +------+-------------+---------------+--------+------------- >>>> ---------------------------+-----------------------------+-- >>>> -------+------------------------+--------+------------------ >>>> ----------------------------+ >>>> | 1 | SIMPLE | page | ref | >>>> name_title,page_redirect_namespace_len | page_redirect_namespace_len | >>>> 5 | const,const | 364177 | Using where; Using temporary; >>>> Using filesort | >>>> | 1 | SIMPLE | category | eq_ref | >>>> cat_title | cat_title | >>>> 257 | fawiki.page.page_title | 1 | Using >>>> index | >>>> | 1 | SIMPLE | categorylinks | ref | >>>> cl_timestamp,cl_sortkey | cl_timestamp | >>>> 257 | fawiki.page.page_title | 12 | Using where; Using >>>> index | >>>> | 1 | SIMPLE | templatelinks | ref | >>>> tl_from,tl_namespace | tl_namespace | >>>> 4 | const | 390622 | Using where; Using >>>> index | >>>> | 1 | SIMPLE | langlinks | ref | >>>> ll_from | ll_from | >>>> 4 | fawiki.page.page_id | 104911 | Using >>>> index | >>>> +------+-------------+---------------+--------+------------- >>>> ---------------------------+-----------------------------+-- >>>> -------+------------------------+--------+------------------ >>>> ----------------------------+ >>>> >>>> >>>> On Fri, Sep 23, 2016 at 11:58 AM, Huji Lee <[email protected]> wrote: >>>> >>>>> Oh, I misread your comment. >>>>> >>>>> I doubt SHOW EXPLAIN would work via Quarry, so I am trying to get it >>>>> to work via terminal access to Labs. I wonder though if it works for a >>>>> query that fails to complete. >>>>> >>>>> On Fri, Sep 23, 2016 at 11:53 AM, Jaime Crespo <[email protected]> >>>>> wrote: >>>>> >>>>>> Huji, note I suggested SHOW EXPLAIN, not EXPLAIN; it is a different >>>>>> command, check the link I provided and report if that doesn't work for >>>>>> you. >>>>>> >>>>>> On Fri, Sep 23, 2016 at 5:31 PM, Huji Lee <[email protected]> >>>>>> wrote: >>>>>> >>>>>>> See https://quarry.wmflabs.org/query/12718 and the error message >>>>>>> returned. >>>>>>> >>>>>>> EXPLAIN only works if you have access to the underlying tables. I >>>>>>> only have access to the VIEWs built on top of actual wiki tables (for >>>>>>> good >>>>>>> reason, same applies to most of us) so I get an error when I try >>>>>>> EXPLAIN. >>>>>>> >>>>>>> On Fri, Sep 23, 2016 at 11:24 AM, Jaime Crespo < >>>>>>> [email protected]> wrote: >>>>>>> >>>>>>>> Explain (you should be able to run SHOW EXPLAIN on your own >>>>>>>> queries: http://s.petrunia.net/blog/?p=89): >>>>>>>> >>>>>>>> EXPLAIN select page_title, count(ll_lang) from page join category >>>>>>>> on page_title = cat_title left join categorylinks on page_title = cl_to >>>>>>>> left join templatelinks on tl_from = page_id and tl_namespace = 10 and >>>>>>>> tl_title in ( 'رده_خالی' , 'رده_بهتر', 'رده_ابهامزدایی', >>>>>>>> 'رده_ردیابیکردن' ) left join langlinks on page_id = ll_from where >>>>>>>> page_namespace = 14 and page_is_redirect = 0 and cl_to is null and >>>>>>>> tl_title >>>>>>>> is null group by page_title order by 2, 1 limit 5000\G >>>>>>>> *************************** 1. row *************************** >>>>>>>> id: 1 >>>>>>>> select_type: SIMPLE >>>>>>>> table: page >>>>>>>> type: ref >>>>>>>> possible_keys: name_title,page_redirect_namespace_len >>>>>>>> key: page_redirect_namespace_len >>>>>>>> key_len: 5 >>>>>>>> ref: const,const >>>>>>>> rows: 364273 >>>>>>>> Extra: Using where; Using temporary; Using filesort >>>>>>>> *************************** 2. row *************************** >>>>>>>> id: 1 >>>>>>>> select_type: SIMPLE >>>>>>>> table: category >>>>>>>> type: eq_ref >>>>>>>> possible_keys: cat_title >>>>>>>> key: cat_title >>>>>>>> key_len: 257 >>>>>>>> ref: fawiki.page.page_title >>>>>>>> rows: 1 >>>>>>>> Extra: Using index >>>>>>>> *************************** 3. row *************************** >>>>>>>> id: 1 >>>>>>>> select_type: SIMPLE >>>>>>>> table: categorylinks >>>>>>>> type: ref >>>>>>>> possible_keys: cl_timestamp,cl_sortkey >>>>>>>> key: cl_timestamp >>>>>>>> key_len: 257 >>>>>>>> ref: fawiki.page.page_title >>>>>>>> rows: 12 >>>>>>>> Extra: Using where; Using index >>>>>>>> *************************** 4. row *************************** >>>>>>>> id: 1 >>>>>>>> select_type: SIMPLE >>>>>>>> table: templatelinks >>>>>>>> type: ref >>>>>>>> possible_keys: tl_from,tl_namespace >>>>>>>> key: tl_namespace >>>>>>>> key_len: 4 >>>>>>>> ref: const >>>>>>>> rows: 390610 >>>>>>>> Extra: Using where; Using index >>>>>>>> *************************** 5. row *************************** >>>>>>>> id: 1 >>>>>>>> select_type: SIMPLE >>>>>>>> table: langlinks >>>>>>>> type: ref >>>>>>>> possible_keys: ll_from >>>>>>>> key: ll_from >>>>>>>> key_len: 4 >>>>>>>> ref: fawiki.page.page_id >>>>>>>> rows: 104910 >>>>>>>> Extra: Using index >>>>>>>> >>>>>>>> On Fri, Sep 23, 2016 at 5:16 PM, Huji Lee <[email protected]> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> I have a query [1] which I run weekly to identify empty >>>>>>>>> categories; those that don't have interwiki links and stay empty for a >>>>>>>>> while are then deleted by a sysop. >>>>>>>>> >>>>>>>>> The query ran just fine every week until Sep 2nd. [2] Since then, >>>>>>>>> the query times out and the page doesn't get updated. >>>>>>>>> >>>>>>>>> The query is no that complex (a SELECT with five JOINs) and used >>>>>>>>> to finish in about 3-5 minutes. I don't have EXPLAIN access on Labs >>>>>>>>> so I >>>>>>>>> cannot tell what is slowing it down. Can someone kindly take a look >>>>>>>>> and >>>>>>>>> advise why this suddenly stopped working? >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Huji >>>>>>>>> >>>>>>>>> [1] https://quarry.wmflabs.org/query/3760 >>>>>>>>> [2] https://fa.wikipedia.org/w/index.php?title=%D9%88%DB%8C%DA%A >>>>>>>>> 9%DB%8C%E2%80%8C%D9%BE%D8%AF%DB%8C%D8%A7:%DA%AF%D8%B2%D8%A7% >>>>>>>>> D8%B1%D8%B4_%D8%AF%DB%8C%D8%AA%D8%A7%D8%A8%DB%8C%D8%B3/%D8%B >>>>>>>>> 1%D8%AF%D9%87%E2%80%8C%D9%87%D8%A7%DB%8C_%D8%AE%D8%A7%D9%84% >>>>>>>>> DB%8C&action=history&uselang=en >>>>>>>>> >>>>>>>>> _______________________________________________ >>>>>>>>> Labs-l mailing list >>>>>>>>> [email protected] >>>>>>>>> https://lists.wikimedia.org/mailman/listinfo/labs-l >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Jaime Crespo >>>>>>>> <http://wikimedia.org> >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> Labs-l mailing list >>>>>>>> [email protected] >>>>>>>> https://lists.wikimedia.org/mailman/listinfo/labs-l >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> _______________________________________________ >>>>>>> Labs-l mailing list >>>>>>> [email protected] >>>>>>> https://lists.wikimedia.org/mailman/listinfo/labs-l >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Jaime Crespo >>>>>> <http://wikimedia.org> >>>>>> >>>>>> _______________________________________________ >>>>>> Labs-l mailing list >>>>>> [email protected] >>>>>> https://lists.wikimedia.org/mailman/listinfo/labs-l >>>>>> >>>>>> >>>>> >>>> >>>> _______________________________________________ >>>> Labs-l mailing list >>>> [email protected] >>>> https://lists.wikimedia.org/mailman/listinfo/labs-l >>>> >>>> >>> >>> >>> -- >>> Jaime Crespo >>> <http://wikimedia.org> >>> >>> _______________________________________________ >>> Labs-l mailing list >>> [email protected] >>> https://lists.wikimedia.org/mailman/listinfo/labs-l >>> >>> >> > > > _______________________________________________ > Labs-l mailing > [email protected]https://lists.wikimedia.org/mailman/listinfo/labs-l > > > > _______________________________________________ > Labs-l mailing list > [email protected] > https://lists.wikimedia.org/mailman/listinfo/labs-l > >
_______________________________________________ Labs-l mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/labs-l
