Running an EXPLAIN on all three reveal identical indices being used.  They 
should theoretically all run equally fast.

Cyberpower678
English Wikipedia Account Creation Team
English Wikipedia Administrator
Global User Renamer

> On Aug 7, 2019, at 22:15, Huji Lee <[email protected]> wrote:
> 
> I have a question for the MySQL/MariaDB experts.
> 
> Short and sweet
> 
> How is that Query 1 <https://quarry.wmflabs.org/query/38237> runs in seconds 
> and so does Query 2 <https://quarry.wmflabs.org/query/38243>, but Query 3 
> <https://quarry.wmflabs.org/query/38244> -- which is essentially the same 
> thing except it tries to bring columns from both sides of the join -- takes 
> forever to run?
> 
> Details
> 
> I have a bot that uses this script 
> <https://github.com/PersianWikipedia/fawikibot/blob/master/HujiBot/findproxy.py>
>  to identify and block IPs associated with open proxies. To be parsimonious 
> with the blocks, it only blocks the said IP and not its associated range. 
> However, many proxy IPs belong to a web hosting range and it would be better 
> to block the entire range. The goal of the query is to find all active blocks 
> made by my bot, sort them in order of IP address, and emulate the LEAD() 
> function -- which we still don't have because we have not upgraded to MariaDB 
> 10.2 on Labs servers -- to make it easy to find cases where two consecutive 
> IPs start with the same two octets (like 100.24.X.Y and 100.24.C.D) so that I 
> can manually investigate those in more detail.
> 
> The nested SELECT that is repeated twice simply generates a list of all 
> active blocks by my bot. Query 1 shows all of them (88 rows) and Query 2 
> shows only the rows in which the LEAD subquery row has a rownumber that is 
> equal to 1 + that of a row in the original 88-row data. Of note, this also 
> has 88 rows, though one of the rows is all NULLs because for the last row of 
> the data we should not find a match in the LEAD subqeury.
> 
> Anyhow, Query 3 simply aims to put the actual data and the LEAD subquery data 
> side by side and that is where things fall apart somehow. I cannot run an 
> EXPLAIN on this through Quarry, because Quarry does not like SQL variables :/ 
> and I have no idea how else to diagnose this problem.
> 
> Thanks!
> 
> Huji
> _______________________________________________
> Wikimedia Cloud Services mailing list
> [email protected] (formerly [email protected])
> https://lists.wikimedia.org/mailman/listinfo/cloud

_______________________________________________
Wikimedia Cloud Services mailing list
[email protected] (formerly [email protected])
https://lists.wikimedia.org/mailman/listinfo/cloud

Reply via email to