On Tue, 13 Oct 2020 at 13:15, Konstantin Ryabitsev <konstan...@linuxfoundation.org> wrote: > I may be venting some frustration, but I'm now 4 hours into a > 2.1 -> 2.2 upgrade. The following query from 0039 has now been running > over 3 hours: > > COMMAND: Query > TIME: 10921 > STATE: Sending data > INFO: DELETE a FROM patchwork_seriesreference a > INNER JOIN patchwork_seriesreference b > WHERE > a.id < b.id > AND a.project_id = b.project_id > AND a.msgid = b.msgid > > That table has 220,000 rows, but it still seems strange that it's taking > so long to execute. As there is no way to know if it's about to be > finished or will take another few hours to run, I'm not willing to kill > it yet. > > Would adding an index across (id, project_id, msgid) make this query > faster to run?
I can confirm that adding a temporary index on (project_id, msgid) and then rerunning the above query makes it complete in less than 2 seconds. So, the migration should do the following: add the project field to the model create temporary non-unique index across msgid, project_id run the duplicate-removal query above delete temporary index create the unique index -K -- Konstantin Ryabitsev Director, IT Core Projects The Linux Foundation Montréal, Québec _______________________________________________ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork