On Wed, Aug 10, 2011 at 5:48 PM, Lukas Fleischer <[email protected]> wrote: > On Wed, Aug 10, 2011 at 05:35:46PM -0500, Dan McGee wrote: >> On Wed, Aug 10, 2011 at 4:51 PM, Lukas Fleischer >> <[email protected]> wrote: >> > On Tue, Aug 02, 2011 at 04:17:12PM +0200, Lukas Fleischer wrote: >> >> On Tue, Aug 02, 2011 at 07:45:40AM -0500, Dan McGee wrote: >> >> > On Tue, Aug 2, 2011 at 7:41 AM, Lukas Fleischer >> >> > <[email protected]> wrote: >> >> > > On Tue, Aug 02, 2011 at 07:13:05AM -0500, Dan McGee wrote: >> >> > >> On Sun, Jul 31, 2011 at 12:05 PM, Lukas Fleischer >> >> > >> <[email protected]> wrote: >> >> > >> > This allows for merging comments and votes of deleted packages into >> >> > >> > another one which is useful if a package needs to be renamed. >> >> > >> > >> >> > >> > Signed-off-by: Lukas Fleischer <[email protected]> >> >> > >> > --- >> >> > >> > web/lib/pkgfuncs.inc.php | 24 +++++++++++++++++++++++- >> >> > >> > 1 files changed, 23 insertions(+), 1 deletions(-) >> >> > >> > >> >> > >> > diff --git a/web/lib/pkgfuncs.inc.php b/web/lib/pkgfuncs.inc.php >> >> > >> > index bb5a592..a81ee01 100644 >> >> > >> > --- a/web/lib/pkgfuncs.inc.php >> >> > >> > +++ b/web/lib/pkgfuncs.inc.php >> >> > >> > @@ -659,10 +659,11 @@ function pkg_flag ($atype, $ids, $action = >> >> > >> > True) { >> >> > >> > * >> >> > >> > * @param string $atype Account type, output of account_from_sid >> >> > >> > * @param array $ids Array of package IDs to delete >> >> > >> > + * @param int $mergepkgid Package to merge the deleted ones into >> >> > >> > * >> >> > >> > * @return string Translated error or success message >> >> > >> > */ >> >> > >> > -function pkg_delete ($atype, $ids) { >> >> > >> > +function pkg_delete ($atype, $ids, $mergepkgid) { >> >> > >> > if (!$atype) { >> >> > >> > return __("You must be logged in before you can >> >> > >> > delete packages."); >> >> > >> > } >> >> > >> > @@ -678,6 +679,27 @@ function pkg_delete ($atype, $ids) { >> >> > >> > } >> >> > >> > >> >> > >> > $dbh = db_connect(); >> >> > >> > + >> >> > >> > + if ($mergepkgid) { >> >> > >> > + /* Merge comments */ >> >> > >> > + $q = "UPDATE IGNORE PackageComments "; >> >> > >> Not a fan of this. This is not in any SQL standard so this would be >> >> > >> another thing needing adjusted later if someone tried to run using >> >> > >> non-MySQL. It is also extremely hacky IMO as you can end up >> >> > >> suppressing real errors, and read this gem from the manual: >> >> > >> "Rows for which columns are updated to values that would cause >> >> > >> data conversion errors are updated to the closest valid values >> >> > >> instead." >> >> > >> Which just plain scares me. >> >> > >> >> >> > >> I'm not even sure why you are doing IGNORE on the comments merge >> >> > >> (these can never conflict?), but for the votes merge, simply do a >> >> > >> * UPDATE where it doesn't already exist. >> >> > >> * DELETE all remaining. >> >> > > >> >> > > Well, I actually can't remember what I was thinking when I wrote the >> >> > > comments query. I probably just copy-pasted it. That's the best >> >> > > excuse I >> >> > > can come up with, yeah :) >> >> > > >> >> > > I'll fix this and think of a better filter criteria for the votes >> >> > > merge >> >> > > query. I don't think we need the delete all remaining (duplicate) >> >> > > votes >> >> > > though, as we do remove the associated packages anyway and "ON DELETE" >> >> > > should do the rest. >> >> > >> >> > Ahh, true on the auto-cascade DELETE thoughts. If you just add a >> >> > little more logic to the WHERE clause you should be able to make the >> >> > UPDATES selective enough to avoid the duplicates issue. >> >> >> >> Yeah. Well, the issue here is that there might also be duplicates within >> >> the deleted packages (e.g. if a user voted on two of the packages that >> >> we want to merge but didn't vote on the target package itself), so for >> >> each of the affected votes we would have to: >> >> >> >> 1) Check if the user already voted on the target package. If so, discard >> >> the vote. An alternative approach is to exclude all votes that >> >> originate from a user who has voted on the target package as well >> >> right from the start. >> >> >> >> 2) Check if there's another vote from the same user on any of the other >> >> packages in the remove queue. If so, only move one of them to the new >> >> package. Maybe we could also group by user and remove such duplicates >> >> beforehand. >> >> >> >> We also have the limitation that MySQL doesn't allow to UPDATE a table >> >> and SELECT from the same table in a subquery. So we will either have do >> >> use deep nested subqueries, a self join or a temporary table here. >> >> >> >> Another idea that just came into my mind is that we could also extract >> >> all potentially affected votes, group them by user, delete all records >> >> but one from each group and finally update the remaining record to match >> >> the target package's ID. Not sure if that can be implemented as a single >> >> SQL query. >> >> >> >> I'll think about how to implement that in detail when ever I get around >> >> to doing it... >> > >> > I tried implementing that in a single query and ended up with a bunch of >> > nested subqueries which were rather confusing. Maybe we should really >> > consider using a temporary table here - particularly since this will be >> > a function that will be rarely used and performance shouldn't be that >> > significant here (as opposed to maintainability). >> > >> > If anyone can come up with a simple query that covers all aspects, >> > suggestions welcome! >> >> -- original query >> UPDATE PackageVotes >> SET PackageID = ? >> WHERE PackageID IN (?, ?, ?); >> >> -- new queries >> -- only update votes that wouldn't exist after modification >> UPDATE PackageVotes >> SET PackageID = ?merge >> WHERE PackageID IN (?others) >> AND UsersID NOT IN ( >> -- this silly double SELECT works around shitty MySQL not >> letting you reference the updated table in the update, as it forces >> materialization of the subquery >> SELECT * FROM ( >> SELECT UsersID >> FROM PackageVotes >> WHERE PackageID = ?merge >> ) temp >> ); > > That's what I had when I wrote the query for the very first time but > that won't work if we merge more than one package at once. Read my > next-to-last email in this thread for some more details.
Oh damn it, I thought I set up my test data right- I'll see what I can tweak. >> >> -- all remaining un-updated votes are dupes, delete them >> DELETE FROM PackageVotes >> WHERE PackageID IN (?, ?, ?); > > No need to do this. ON DELETE CASCADE, remember? :) I know, but explicit >> implicit, especially once you wrap it in a transaction, I don't know. -Dan
