[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
Sam Reed (reedy) s...@reedyboy.net changed:
What|Removed |Added
Blocks||40867
--
You are receiving this mail because: You are on the CC list for the bug.
___
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
Bartosz Dziewoński matma@gmail.com changed:
What|Removed |Added
Status|REOPENED|RESOLVED
Resolution|--- |FIXED
Assignee|jforres...@wikimedia.org|sprin...@wikimedia.org
--- Comment #20 from Bartosz Dziewoński matma@gmail.com ---
That was bug 52077. Closing this.
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
--- Comment #18 from Steven Walling swall...@wikimedia.org ---
(In reply to comment #16)
> Much better, but I'm still seeing some issues:
> Looking for 500 blanking tags gives 498 blanking plus 2 labeled as just mobile edit.
> http://en.wikipedia.org/w/api.php?action=query&list=recentchanges&rctag=blanking&rclimit=500&rcprop=user%7Ccomment%7Ctitle%7Ctags%7Ctimestamp
There are other strange things going on with tags...
http://en.wikipedia.org/wiki/Wikipedia_talk:Tags#Incorrect_tagging
Not sure if it's related or if we should file a separate bug for incorrect tagging. I think mobile is also suffering from this issue (or was as of yesterday).
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
--- Comment #19 from Bartosz Dziewoński matma@gmail.com ---
Whatever is causing that (maybe just a misconfigured local filter?), it's most likely not related to this bug.
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
--- Comment #17 from Robert Rohde ro...@robertrohde.com ---
As a follow up, the two problematic tags I note in Comment 16 are both recent. It is possible they have a different underlying cause than the previous corruption. For example, this might represent a logic error in how the mobile edit tag is being recorded.
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
--- Comment #12 from Sean Pringle sprin...@wikimedia.org ---
As Robert suggested in comment 8, the rebuild process missed some rows where revisions had multiple tags. The script has been fixed and will run in batches on enwiki today. More info shortly...
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
--- Comment #13 from Sean Pringle sprin...@wikimedia.org ---
Btw, change_tag still looks complete to me; the binlog shows no problems there. Should just be the tag_summary rebuild logic at fault.
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
--- Comment #14 from Sean Pringle sprin...@wikimedia.org ---
Rebuild #2 of tag_summary has completed and the reports in comment 8 look better (to me). Anyone care to verify...
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
--- Comment #15 from James Forrester jforres...@wikimedia.org ---
(In reply to comment #14)
> Rebuild #2 of tag_summary has completed and the reports in comment 8 look better (to me). Anyone care to verify...
Appears to work for me, yes. Might be worth waiting for others to weigh in, but from my POV this is fixed.
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
--- Comment #16 from Robert Rohde ro...@robertrohde.com ---
Much better, but I'm still seeing some issues:
Looking for 500 blanking tags gives 498 blanking plus 2 labeled as just mobile edit.
http://en.wikipedia.org/w/api.php?action=query&list=recentchanges&rctag=blanking&rclimit=500&rcprop=user%7Ccomment%7Ctitle%7Ctags%7Ctimestamp
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
--- Comment #9 from Andre Klapper aklap...@wikimedia.org ---
(In reply to comment #8)
> An API query of 200 revisions flagged as blanking:
> While this query returns 200 entries, we find that only 188 of them report as actually having the blanking tag.
That's still the case today.
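The check described in comment 8 (200 entries returned, only 188 actually reporting the tag) can be repeated offline once the API result has been saved. The following is a minimal sketch, not the script anyone on this bug used: it assumes the entries have already been parsed into dictionaries with an `rcid` and a `tags` list, and the helper name `entries_missing_tag` is made up for illustration. The first two sample rows use the rcids quoted in comment 8; the third row is invented.

```python
def entries_missing_tag(entries, expected_tag):
    """Return the entries whose reported tags do not include expected_tag."""
    return [e for e in entries if expected_tag not in e.get("tags", [])]


# Assumed shape of parsed recentchanges entries (illustrative only).
sample = [
    {"rcid": 590123889, "tags": ["visualeditor"]},      # from comment 8
    {"rcid": 590032703, "tags": ["mobile edit"]},       # from comment 8
    {"rcid": 1, "tags": ["blanking", "mobile edit"]},   # hypothetical healthy row
]

missing = entries_missing_tag(sample, "blanking")
print(len(sample), "returned,", len(missing), "missing the expected tag")
print([e["rcid"] for e in missing])
```

Any entry that a query for one tag returns without that tag in its own tag list points at a tag_summary row that disagrees with change_tag.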
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
Greg Grossmeier g...@wikimedia.org changed:
What|Removed |Added
Priority|Highest |High
Assignee|sprin...@wikimedia.org |jforres...@wikimedia.org
--- Comment #10 from Greg Grossmeier g...@wikimedia.org ---
Lowering priority a bit since I don't think there is data loss here (the table that was used to recreate the data still exists).
James: Assigning to you to determine the priority for getting around to fixing this data (since it affects VE related data, and you know what metrics are being tracked).
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
--- Comment #11 from Sean Pringle sprin...@wikimedia.org ---
Am investigating whether the tag_summary rebuild was conceptually flawed with regard to revisions with multiple tags, or not. Also dumping enwiki binlogs on a slave (we have a month's worth) and pulling out all change_tag queries. Will reload them offline and join against a copy of change_tag to prove whether it is, in fact, completely intact.
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
--- Comment #1 from Bartosz Dziewoński matma@gmail.com ---
Only seems to affect en.wp right now (works correctly on pl.wp and mw.org, for example).
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
Greg Grossmeier g...@wikimedia.org changed:
What|Removed |Added
Priority|Unprioritized |Highest
CC||g...@wikimedia.org
Severity|normal |major
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
Rob Lanphier ro...@wikimedia.org changed:
What|Removed |Added
CC||ro...@wikimedia.org
--- Comment #2 from Rob Lanphier ro...@wikimedia.org ---
Sean and Asher narrowed this down to a problem with the schema change tool that we use, and are working on a strategy to fix the data. This looks like it's strictly a db-related problem that once fixed should stay fixed (assuming we don't try another similar schema migration before an upstream fix is made to the migration tool).
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
Rob Lanphier ro...@wikimedia.org changed:
What|Removed |Added
Assignee|wikibugs-l@lists.wikimedia.org |sprin...@wikimedia.org
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
--- Comment #3 from Bartosz Dziewoński matma@gmail.com ---
Was it determined if any other databases apart from en.wp's one were affected?
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
--- Comment #4 from Sam Reed (reedy) s...@reedyboy.net ---
(In reply to comment #3)
> Was it determined if any other databases apart from en.wp's one were affected?
Not sure. The wikis that potentially may have this issue are:
+ 'arwiki' => true,
+ 'commonswiki' => true,
+ 'cswiki' => true,
+ 'dewiki' => true,
+ 'elwiki' => true,
+ 'enwiki' => true,
+ 'enwikisource' => true,
+ 'enwiktionary' => true,
+ 'eswiki' => true,
+ 'etwiki' => true,
+ 'fawiki' => true,
+ 'fiwiki' => true,
+ 'frwiki' => true,
+ 'hewiki' => true,
+ 'huwiki' => true,
+ 'idwiki' => true,
+ 'itwiki' => true,
+ 'jawiki' => true,
+ 'ltwiki' => true,
+ 'mrwiki' => true,
+ 'nlwiki' => true,
+ 'plwiki' => true,
+ 'ptwiki' => true,
+ 'rowiki' => true,
+ 'ruwiki' => true,
+ 'simplewiki' => true,
+ 'svwiki' => true,
+ 'trwiki' => true,
+ 'ukwiki' => true,
+ 'zhwiki' => true,
cf bug 40867#c6
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
--- Comment #5 from Sean Pringle sprin...@wikimedia.org ---
Firstly, we've determined this problem occurred due to an (apparent) bug in pt-online-schema-change when using a combination of:

- A table without primary key
- A table with unique indexes that all include nullable columns
- An unfortunately timed REPLACE statement in normal db traffic

Posc does online table alteration by:

- Creating a copy of the table with altered schema
- Setting triggers on the original table to keep the copy updated
- Copying data across using a batch process

In this case, posc set a DELETE trigger on tag_summary using a poor UNIQUE index (ts_log_id) with low cardinality and a nullable field. Then during the batching process, an external REPLACE statement with ts_log_id=NULL caused far too many rows to be deleted in the temporary table being altered. Given that many rows in tag_summary have ts_log_id=NULL, the table was massively reduced in size.

Now to the fix:

We've checked the other wikis and found no problems; only enwiki was affected. Furthermore, only enwiki.tag_summary was affected. We've verified that enwiki.change_tag is complete and did not suffer the same problem. This was based on:

- Index cardinality and table size information collected before running the schema migration
- An investigation of the events in the binary log surrounding the migration period

Currently we are rebuilding tag_summary based on change_tag data. That will complete within 30 mins at the time of writing this comment.
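To make the failure mode above concrete, here is a small simulation using Python's built-in sqlite3 module. It is a sketch under stated assumptions, not the actual trigger that pt-online-schema-change generated: the table name `new_tag_summary`, the reduced column set, and the helper `rows_left_after_delete` are invented for illustration, and SQLite's `IS` operator stands in for a NULL-safe comparison. It only shows why matching the replaced row on a nullable, low-cardinality column removes far more than one row from the copy.

```python
import sqlite3


def rows_left_after_delete(key_columns):
    """Delete the copy of one replaced row, matching on key_columns only,
    and return how many rows remain in the copied table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE new_tag_summary "
        "(ts_rc_id INTEGER, ts_log_id INTEGER, ts_tags TEXT)"
    )
    # Five edit rows with ts_log_id NULL, plus one log row (all values made up).
    rows = [(rc_id, None, "blanking") for rc_id in range(1, 6)]
    rows.append((None, 77, "hypothetical-log-tag"))
    db.executemany("INSERT INTO new_tag_summary VALUES (?, ?, ?)", rows)

    # The row being replaced on the original table.
    old_row = {"ts_rc_id": 1, "ts_log_id": None}

    # "IS" is NULL-safe in SQLite, so NULL matches NULL here.
    where = " AND ".join(f"{col} IS ?" for col in key_columns)
    db.execute(
        f"DELETE FROM new_tag_summary WHERE {where}",
        [old_row[col] for col in key_columns],
    )
    (count,) = db.execute("SELECT COUNT(*) FROM new_tag_summary").fetchone()
    db.close()
    return count


# Matching only on the nullable ts_log_id wipes every row where it is NULL.
print(rows_left_after_delete(["ts_log_id"]))
# Matching on a column that identifies the row removes just that row.
print(rows_left_after_delete(["ts_rc_id"]))
```

In this toy table, one replaced row takes all five NULL rows with it when the match is on ts_log_id alone, which mirrors the "massively reduced in size" outcome described in the comment.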
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
--- Comment #6 from Sean Pringle sprin...@wikimedia.org ---
enwiki.tag_summary rebuild is complete.
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
Steven Walling swall...@wikimedia.org changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|--- |FIXED
--- Comment #7 from Steven Walling swall...@wikimedia.org ---
Just checked this on-wiki as well. Seems fixed.
[Bug 51254] tag_summary missing records
https://bugzilla.wikimedia.org/show_bug.cgi?id=51254
Robert Rohde ro...@robertrohde.com changed:
What|Removed |Added
Status|RESOLVED|REOPENED
CC||ro...@robertrohde.com
Resolution|FIXED |---
--- Comment #8 from Robert Rohde ro...@robertrohde.com ---
Sorry to add to what I'm sure was a bit of a hectic day for someone, but I'm still seeing lingering bits of corruption. Perhaps some sort of edge case that wasn't handled correctly by the rebuild? 99.9% of tags may be okay at this point, but here are some examples that still seem to be errors.

An API query of 200 revisions flagged as blanking:
http://en.wikipedia.org/w/api.php?action=query&list=recentchanges&rctag=blanking&rclimit=200&rcprop=user%7Ccomment%7Ctitle%7Ctags%7Ctimestamp|ids&rccontinue=2013-07-12T22:20:40Z|589061595

While this query returns 200 entries, we find that only 188 of them report as actually having the blanking tag. The remainder are things like

rcid=590123889 timestamp=2013-07-12T14:30:16Z <tag>visualeditor</tag>
rcid=590032703 timestamp=2013-07-12T00:33:31Z <tag>mobile edit</tag>

where some other tag is reported but the expected blanking tag is not.

For another example of this issue see the API query for the visualeditor-needcheck tag:
http://en.wikipedia.org/w/api.php?action=query&list=recentchanges&rctag=visualeditor-needcheck&rclimit=200&rcprop=user%7Ccomment%7Ctitle%7Ctags%7Ctimestamp|ids

This tag should only be applied if the visualeditor tag is also present, but we observe that most of the results have either visualeditor or visualeditor-needcheck but not both. A few entries even have other tags entirely.

What appears to have happened is that the rebuild didn't correctly handle cases where a single revision was subject to multiple tags. Instead it looks as though the rebuilt table applies at most one tag to each of the historical revisions. Most of the time that's okay since few revisions actually have multiple tags, but it still leaves a bit of corruption and missing data in the rare cases where a revision is expected to have multiple tags.
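The difference between a rebuild that keeps at most one tag per revision and one that keeps all of them can be sketched in a few lines. This is an illustration only and not the rebuild script used on enwiki: it assumes change_tag can be read as (revision id, tag) pairs and that a summary row stores its tags as one comma-separated string, and both function names are hypothetical.

```python
from collections import defaultdict


def rebuild_summary_lossy(change_tag_rows):
    """Flawed rebuild: a later tag for the same revision overwrites the earlier one."""
    return {rev_id: tag for rev_id, tag in change_tag_rows}


def rebuild_summary(change_tag_rows):
    """Rebuild that groups every tag of a revision into one summary value."""
    tags_by_rev = defaultdict(list)
    for rev_id, tag in change_tag_rows:
        if tag not in tags_by_rev[rev_id]:
            tags_by_rev[rev_id].append(tag)
    return {rev_id: ",".join(tags) for rev_id, tags in tags_by_rev.items()}


# Made-up rows: revision 1 carries two tags, revision 2 carries one.
rows = [(1, "blanking"), (1, "mobile edit"), (2, "visualeditor")]

print(rebuild_summary_lossy(rows))
print(rebuild_summary(rows))
```

With the lossy version, revision 1 ends up with only "mobile edit", which matches the symptom above: a query for blanking still finds the revision through change_tag, but the summary no longer reports the blanking tag.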