Running an EXPLAIN on all three reveal identical indices being used. They should theoretically all run equally fast.
Cyberpower678 English Wikipedia Account Creation Team English Wikipedia Administrator Global User Renamer > On Aug 7, 2019, at 22:15, Huji Lee <[email protected]> wrote: > > I have a question for the MySQL/MariaDB experts. > > Short and sweet > > How is that Query 1 <https://quarry.wmflabs.org/query/38237> runs in seconds > and so does Query 2 <https://quarry.wmflabs.org/query/38243>, but Query 3 > <https://quarry.wmflabs.org/query/38244> -- which is essentially the same > thing except it tries to bring columns from both sides of the join -- takes > forever to run? > > Details > > I have a bot that uses this script > <https://github.com/PersianWikipedia/fawikibot/blob/master/HujiBot/findproxy.py> > to identify and block IPs associated with open proxies. To be parsimonious > with the blocks, it only blocks the said IP and not its associated range. > However, many proxy IPs belong to a web hosting range and it would be better > to block the entire range. The goal of the query is to find all active blocks > made by my bot, sort them in order of IP address, and emulate the LEAD() > function -- which we still don't have because we have not upgraded to MariaDB > 10.2 on Labs servers -- to make it easy to find cases where two consecutive > IPs start with the same two octets (like 100.24.X.Y and 100.24.C.D) so that I > can manually investigate those in more detail. > > The nested SELECT that is repeated twice simply generates a list of all > active blocks by my bot. Query 1 shows all of them (88 rows) and Query 2 > shows only the rows in which the LEAD subquery row has a rownumber that is > equal to 1 + that of a row in the original 88-row data. Of note, this also > has 88 rows, though one of the rows is all NULLs because for the last row of > the data we should not find a match in the LEAD subqeury. > > Anyhow, Query 3 simply aims to put the actual data and the LEAD subquery data > side by side and that is where things fall apart somehow. I cannot run an > EXPLAIN on this through Quarry, because Quarry does not like SQL variables :/ > and I have no idea how else to diagnose this problem. > > Thanks! > > Huji > _______________________________________________ > Wikimedia Cloud Services mailing list > [email protected] (formerly [email protected]) > https://lists.wikimedia.org/mailman/listinfo/cloud
_______________________________________________ Wikimedia Cloud Services mailing list [email protected] (formerly [email protected]) https://lists.wikimedia.org/mailman/listinfo/cloud
