Hey, One important optimization you can use and it's often missed out (and it's going to be needed more as we normalize more tables) is join decomposition. It basically means you don't join and query but do two (or several) queries separately in your code. This might seem counter intuitive but it's pretty useful for several reasons (like MySQL will cache the normalizing table and answer faster or you yourself can cache some parts). You can read more about those in "High Performance MySQL" book.
HTH On Mon, Jun 3, 2019 at 3:43 AM Jesse Plamondon-Willard < [email protected]> wrote: > Hello, > > Some Toolforge SQL queries have much worse performance when updated for > the new comment table. > > For example, see one previous SQL query > <https://tools.wmflabs.org/sql-optimizer?use=enwiki_p&sql=SELECT%0D%0A++++++++++++++++++++user_id%2C%0D%0A++++++++++++++++++++user_registration%2C%0D%0A++++++++++++++++++++DATE_FORMAT%28user_registration%2C+%22%25Y-%25m-%25d+%25H%3A%25i%22%29+AS+registration%2C%0D%0A++++++++++++++++++++user_editcount%2C%0D%0A++++++++++++++++++++GROUP_CONCAT%28ug_group+SEPARATOR+%22%2C+%22%29+AS+user_groups%2C%0D%0A++++++++++++++++++++ipb_by_text%2C%0D%0A++++++++++++++++++++%27%27+AS+ipb_reason%2C+--+column+deleted%0D%0A++++++++++++++++++++DATE_FORMAT%28ipb_timestamp%2C+%22%25Y-%25m-%25d+%25H%3A%25i%22%29+AS+ipb_timestamp%2C%0D%0A++++++++++++++++++++ipb_deleted%2C%0D%0A++++++++++++++++++++COALESCE%28DATE_FORMAT%28ipb_expiry%2C+%22%25Y-%25m-%25d+%25H%3A%25i%22%29%2C+ipb_expiry%29+AS+ipb_expiry%0D%0A++++++++++++++++FROM%0D%0A++++++++++++++++++++user%0D%0A++++++++++++++++++++LEFT+JOIN+user_groups+ON+user_id+%3D+ug_user%0D%0A++++++++++++++++++++LEFT+JOIN+ipblocks_ipindex+ON+user_id+%3D+ipb_user%0D%0A++++++++++++++++WHERE+user_name+%3D+%27Pathoschild%27%0D%0A++++++++++++++++LIMIT+1> > and its updated version > <https://tools.wmflabs.org/sql-optimizer?use=enwiki_p&sql=SELECT%0D%0A++++++++++++++++++++user_id%2C%0D%0A++++++++++++++++++++user_registration%2C%0D%0A++++++++++++++++++++DATE_FORMAT%28user_registration%2C+%22%25Y-%25m-%25d+%25H%3A%25i%22%29+AS+registration%2C%0D%0A++++++++++++++++++++user_editcount%2C%0D%0A++++++++++++++++++++GROUP_CONCAT%28ug_group+SEPARATOR+%22%2C+%22%29+AS+user_groups%2C%0D%0A++++++++++++++++++++ipb_by_text%2C%0D%0A++++++++++++++++++++comment_text+AS+ipb_reason%2C%0D%0A++++++++++++++++++++DATE_FORMAT%28ipb_timestamp%2C+%22%25Y-%25m-%25d+%25H%3A%25i%22%29+AS+ipb_timestamp%2C%0D%0A++++++++++++++++++++ipb_deleted%2C%0D%0A++++++++++++++++++++COALESCE%28DATE_FORMAT%28ipb_expiry%2C+%22%25Y-%25m-%25d+%25H%3A%25i%22%29%2C+ipb_expiry%29+AS+ipb_expiry%0D%0A++++++++++++++++FROM%0D%0A++++++++++++++++++++user%0D%0A++++++++++++++++++++LEFT+JOIN+user_groups+ON+user_id+%3D+ug_user%0D%0A++++++++++++++++++++LEFT+JOIN+ipblocks_ipindex+ON+user_id+%3D+ipb_user%0D%0A++++++++++++++++++++LEFT+JOIN+comment+ON+ipb_reason_id+%3D+comment_id%0D%0A++++++++++++++++WHERE+user_name+%3D+%27Pathoschild%27%0D%0A++++++++++++++++LIMIT+1> > which just adds a join on the comment table. The change adds ten dependent > subqueries and increases Stalktoy load times from ≈15 seconds to ≈245 > seconds, and the same change for another query was enough to have my > tools temporarily rate-limited <https://phabricator.wikimedia.org/T217853> > . > > I haven't found a way to update those queries efficiently. Is there an > optimisation I'm missing? Why does the comment view need subqueries on ten > other tables, and is there an alternate version without the subqueries for > cases where we just need to join by ID? > > -- > Jesse Plamondon-Willard (Pathoschild) > _______________________________________________ > Wikimedia Cloud Services mailing list > [email protected] (formerly [email protected]) > https://lists.wikimedia.org/mailman/listinfo/cloud -- Amir (he/him)
_______________________________________________ Wikimedia Cloud Services mailing list [email protected] (formerly [email protected]) https://lists.wikimedia.org/mailman/listinfo/cloud
