[Wikidata-bugs] [Maniphest] [Commented On] T239470: Check the success of the terms migration (does it have holes)

2019-12-09 Thread Addshore
Addshore added a comment. I guess that is mainly down to the data being 2 weeks old, plus the constant stream of edits fixing the issue and the maintenance script running. TASK DETAIL https://phabricator.wikimedia.org/T239470 EMAIL PREFERENCES

[Wikidata-bugs] [Maniphest] [Commented On] T239470: Check the success of the terms migration (does it have holes)

2019-12-09 Thread Ladsgroup
Ladsgroup added a comment. I ran some checks on your numbers. It seems some ranges are fairly clean (only ~30% have issues), while others are more than 95% affected. These are the numbers separated by millions (1 means up to Q1Mio):
1: 364510
2: 415854
3: 380372
4: 302745
5:
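The per-million breakdown above can be sketched as a simple bucketing of item IDs, where bucket 1 covers Q1 up to Q1,000,000, bucket 2 covers the next million, and so on. This is a minimal illustration of the assumed bucketing logic, not the actual query Ladsgroup ran:

```python
from collections import Counter

def bucket_by_million(item_ids):
    """Group numeric item IDs into 1-million-wide buckets; bucket 1 covers Q1..Q1,000,000."""
    counts = Counter((item_id - 1) // 1_000_000 + 1 for item_id in item_ids)
    return dict(sorted(counts.items()))

# IDs 500,000 and 999,999 fall into bucket 1; 1,000,001 into bucket 2.
print(bucket_by_million([500_000, 999_999, 1_000_001]))  # {1: 2, 2: 1}
```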

[Wikidata-bugs] [Maniphest] [Commented On] T239470: Check the success of the terms migration (does it have holes)

2019-12-09 Thread Addshore
Addshore added a comment. So, these snapshots were taken roughly 2 weeks ago now, and the number of items that appear to have issues is 49,021,987 (a non-trivial number). There will always be some false positives here, as each snapshot of each table is taken in series, so some of the

[Wikidata-bugs] [Maniphest] [Commented On] T239470: Check the success of the terms migration (does it have holes)

2019-12-09 Thread alaa_wmde
alaa_wmde added a comment.
> I ran the rebuild script over one of the bugged items, and this correctly fixed the entries in the new store.
Not surprising, yet great news :)
> Is this probably all left over stuff from the deadlocks issue?
Yes, I'm guessing a combination of

[Wikidata-bugs] [Maniphest] [Commented On] T239470: Check the success of the terms migration (does it have holes)

2019-12-09 Thread Addshore
Addshore added a comment. I ran the rebuild script over one of the bugged items, and this correctly fixed the entries in the new store. Is this probably all left over stuff from the deadlocks issue? @Ladsgroup, where did the re-run of the rebuild script start from again? Maybe we

[Wikidata-bugs] [Maniphest] [Commented On] T239470: Check the success of the terms migration (does it have holes)

2019-12-09 Thread Stashbot
Stashbot added a comment. Mentioned in SAL (#wikimedia-operations) [2019-12-09T10:46:10Z] T239470 addshore@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildItemTerms.php --wiki wikidatawiki --from-id=1007 --to-id=1007

[Wikidata-bugs] [Maniphest] [Commented On] T239470: Check the success of the terms migration (does it have holes)

2019-11-29 Thread Addshore
Addshore added a comment. Aaaah, WRITE_BOTH... TASK DETAIL https://phabricator.wikimedia.org/T239470

[Wikidata-bugs] [Maniphest] [Commented On] T239470: Check the success of the terms migration (does it have holes)

2019-11-29 Thread Addshore
Addshore added a comment. Item 70 million is in the tables...

mysql:research@dbstore1005.eqiad.wmnet [wikidatawiki]> select count(*) from wbt_item_terms where wbit_item_id = 7000;
+----------+
| count(*) |
+----------+
|       48 |
+----------+
1 row in set

[Wikidata-bugs] [Maniphest] [Commented On] T239470: Check the success of the terms migration (does it have holes)

2019-11-29 Thread Ladsgroup
Ladsgroup added a comment. In T239470#5701655, @Addshore wrote:
> Looks like there are 2,709,497 missing items while comparing items ids 0 to 70,000,000..
> Dump of those IDs coming soon!
Wait, the script is done until 46Mio and

[Wikidata-bugs] [Maniphest] [Commented On] T239470: Check the success of the terms migration (does it have holes)

2019-11-29 Thread Addshore
Addshore added a comment. Looks like there are 2,709,497 missing items while comparing item IDs 0 to 70,000,000. Dump of those IDs coming soon! TASK DETAIL https://phabricator.wikimedia.org/T239470

[Wikidata-bugs] [Maniphest] [Commented On] T239470: Check the success of the terms migration (does it have holes)

2019-11-29 Thread Addshore
Addshore added a comment. Using the sqooped tables... Looking at the first 10 million items:

// Find the diff
spark.sql("""
  SELECT term_entity_id as old, wbit_item_id as new
  FROM (
    SELECT DISTINCT term_entity_id
    FROM joal.wikibase_wb_terms
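The truncated Spark query above appears to diff the old wb_terms snapshot against the new wbt_item_terms snapshot to find items present in the old store but missing from the new one. A hedged, standalone sketch of that anti-join idea in plain Python/sqlite3 (table and column names are taken from the thread; the data is invented toy input, and this is not the actual query that was run):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Toy stand-ins for the sqooped snapshot tables named in the thread.
cur.execute("CREATE TABLE wikibase_wb_terms (term_entity_id INTEGER)")
cur.execute("CREATE TABLE wikibase_wbt_item_terms (wbit_item_id INTEGER)")
cur.executemany("INSERT INTO wikibase_wb_terms VALUES (?)", [(1,), (2,), (3,)])
cur.executemany("INSERT INTO wikibase_wbt_item_terms VALUES (?)", [(1,), (3,)])

# Items in the old store with no matching row in the new store (LEFT anti-join).
rows = cur.execute("""
    SELECT DISTINCT old.term_entity_id
    FROM wikibase_wb_terms AS old
    LEFT JOIN wikibase_wbt_item_terms AS new
           ON new.wbit_item_id = old.term_entity_id
    WHERE new.wbit_item_id IS NULL
""").fetchall()

missing_ids = [row[0] for row in rows]
print(missing_ids)  # [2]
```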