Amazingly, the problematic query never finishes executing. Still trying to figure out what is going on here. Any help is appreciated.
On Wed, Aug 7, 2019 at 10:21 PM Maximilian Doerr <[email protected]> wrote: > Running an EXPLAIN on all three reveal identical indices being used. They > should theoretically all run equally fast. > > Cyberpower678 > English Wikipedia Account Creation Team > English Wikipedia Administrator > Global User Renamer > > On Aug 7, 2019, at 22:15, Huji Lee <[email protected]> wrote: > > I have a question for the MySQL/MariaDB experts. > > *Short and sweet* > > How is that Query 1 <https://quarry.wmflabs.org/query/38237> runs in > seconds and so does Query 2 <https://quarry.wmflabs.org/query/38243>, but > Query > 3 <https://quarry.wmflabs.org/query/38244> -- which is essentially the > same thing except it tries to bring columns from both sides of the join -- > takes forever to run? > > *Details* > > I have a bot that uses this script > <https://github.com/PersianWikipedia/fawikibot/blob/master/HujiBot/findproxy.py> > to identify and block IPs associated with open proxies. To be parsimonious > with the blocks, it only blocks the said IP and not its associated range. > However, many proxy IPs belong to a web hosting range and it would be > better to block the entire range. The goal of the query is to find all > active blocks made by my bot, sort them in order of IP address, and emulate > the LEAD() function -- which we still don't have because we have not > upgraded to MariaDB 10.2 on Labs servers -- to make it easy to find cases > where two consecutive IPs start with the same two octets (like 100.24.X.Y > and 100.24.C.D) so that I can manually investigate those in more detail. > > The nested SELECT that is repeated twice simply generates a list of all > active blocks by my bot. Query 1 shows all of them (88 rows) and Query 2 > shows only the rows in which the LEAD subquery row has a rownumber that is > equal to 1 + that of a row in the original 88-row data. Of note, this also > has 88 rows, though one of the rows is all NULLs because for the last row > of the data we should not find a match in the LEAD subqeury. > > Anyhow, Query 3 simply aims to put the actual data and the LEAD subquery > data side by side and that is where things fall apart somehow. I cannot run > an EXPLAIN on this through Quarry, because Quarry does not like SQL > variables :/ and I have no idea how else to diagnose this problem. > > Thanks! > > Huji > _______________________________________________ > Wikimedia Cloud Services mailing list > [email protected] (formerly [email protected]) > https://lists.wikimedia.org/mailman/listinfo/cloud > > > _______________________________________________ > Wikimedia Cloud Services mailing list > [email protected] (formerly [email protected]) > https://lists.wikimedia.org/mailman/listinfo/cloud
_______________________________________________ Wikimedia Cloud Services mailing list [email protected] (formerly [email protected]) https://lists.wikimedia.org/mailman/listinfo/cloud
