On Tue, Aug 02, 2011 at 04:17:12PM +0200, Lukas Fleischer wrote: > On Tue, Aug 02, 2011 at 07:45:40AM -0500, Dan McGee wrote: > > On Tue, Aug 2, 2011 at 7:41 AM, Lukas Fleischer > > <[email protected]> wrote: > > > On Tue, Aug 02, 2011 at 07:13:05AM -0500, Dan McGee wrote: > > >> On Sun, Jul 31, 2011 at 12:05 PM, Lukas Fleischer > > >> <[email protected]> wrote: > > >> > This allows for merging comments and votes of deleted packages into > > >> > another one which is useful if a package needs to be renamed. > > >> > > > >> > Signed-off-by: Lukas Fleischer <[email protected]> > > >> > --- > > >> > web/lib/pkgfuncs.inc.php | 24 +++++++++++++++++++++++- > > >> > 1 files changed, 23 insertions(+), 1 deletions(-) > > >> > > > >> > diff --git a/web/lib/pkgfuncs.inc.php b/web/lib/pkgfuncs.inc.php > > >> > index bb5a592..a81ee01 100644 > > >> > --- a/web/lib/pkgfuncs.inc.php > > >> > +++ b/web/lib/pkgfuncs.inc.php > > >> > @@ -659,10 +659,11 @@ function pkg_flag ($atype, $ids, $action = True) > > >> > { > > >> > * > > >> > * @param string $atype Account type, output of account_from_sid > > >> > * @param array $ids Array of package IDs to delete > > >> > + * @param int $mergepkgid Package to merge the deleted ones into > > >> > * > > >> > * @return string Translated error or success message > > >> > */ > > >> > -function pkg_delete ($atype, $ids) { > > >> > +function pkg_delete ($atype, $ids, $mergepkgid) { > > >> > if (!$atype) { > > >> > return __("You must be logged in before you can delete > > >> > packages."); > > >> > } > > >> > @@ -678,6 +679,27 @@ function pkg_delete ($atype, $ids) { > > >> > } > > >> > > > >> > $dbh = db_connect(); > > >> > + > > >> > + if ($mergepkgid) { > > >> > + /* Merge comments */ > > >> > + $q = "UPDATE IGNORE PackageComments "; > > >> Not a fan of this. This is not in any SQL standard so this would be > > >> another thing needing adjusted later if someone tried to run using > > >> non-MySQL. It is also extremely hacky IMO as you can end up > > >> suppressing real errors, and read this gem from the manual: > > >> "Rows for which columns are updated to values that would cause > > >> data conversion errors are updated to the closest valid values > > >> instead." > > >> Which just plain scares me. > > >> > > >> I'm not even sure why you are doing IGNORE on the comments merge > > >> (these can never conflict?), but for the votes merge, simply do a > > >> * UPDATE where it doesn't already exist. > > >> * DELETE all remaining. > > > > > > Well, I actually can't remember what I was thinking when I wrote the > > > comments query. I probably just copy-pasted it. That's the best excuse I > > > can come up with, yeah :) > > > > > > I'll fix this and think of a better filter criteria for the votes merge > > > query. I don't think we need the delete all remaining (duplicate) votes > > > though, as we do remove the associated packages anyway and "ON DELETE" > > > should do the rest. > > > > Ahh, true on the auto-cascade DELETE thoughts. If you just add a > > little more logic to the WHERE clause you should be able to make the > > UPDATES selective enough to avoid the duplicates issue. > > Yeah. Well, the issue here is that there might also be duplicates within > the deleted packages (e.g. if a user voted on two of the packages that > we want to merge but didn't vote on the target package itself), so for > each of the affected votes we would have to: > > 1) Check if the user already voted on the target package. If so, discard > the vote. An alternative approach is to exclude all votes that > originate from a user who has voted on the target package as well > right from the start. > > 2) Check if there's another vote from the same user on any of the other > packages in the remove queue. If so, only move one of them to the new > package. Maybe we could also group by user and remove such duplicates > beforehand. > > We also have the limitation that MySQL doesn't allow to UPDATE a table > and SELECT from the same table in a subquery. So we will either have do > use deep nested subqueries, a self join or a temporary table here. > > Another idea that just came into my mind is that we could also extract > all potentially affected votes, group them by user, delete all records > but one from each group and finally update the remaining record to match > the target package's ID. Not sure if that can be implemented as a single > SQL query. > > I'll think about how to implement that in detail when ever I get around > to doing it...
I tried implementing that in a single query and ended up with a bunch of nested subqueries which were rather confusing. Maybe we should really consider using a temporary table here - particularly since this will be a function that will be rarely used and performance shouldn't be that significant here (as opposed to maintainability). If anyone can come up with a simple query that covers all aspects, suggestions welcome!
