jcrespo added a comment.
I forgot to say we suspect the same thing happens on other s3 hosts, but these 2 previous wikis create so many errors that it is difficult to say until we fix these 2.
TASK DETAIL: https://phabricator.wikimedia.org/T171027
Catrope added a comment.
Thanks @jcrespo , @Reedy and @Ladsgroup for taking action.
In T171027#3668493, @jcrespo wrote:
Notification for users: We are going to disable wikidata recentchanges (meaning, changes on pages on other wikis coming from changes done on wikidata; the recentchanges at
gerritbot added a comment.
Change 383107 had a related patch set uploaded (by Reedy; owner: Reedy):
[operations/mediawiki-config@master] Disable inject recent changes on all client wikis
https://gerrit.wikimedia.org/r/383107
Stashbot added a comment.
Mentioned in SAL (#wikimedia-operations) [2017-10-09T10:50:52Z] Synchronized wmf-config/Wikibase-production.php: Disable injecing RC records for commonswiki and ruwiki (T171027) (duration: 00m 47s)
gerritbot added a comment.
Change 383100 merged by jenkins-bot:
[operations/mediawiki-config@master] Disable injecing RC records for commonswiki and ruwiki
https://gerrit.wikimedia.org/r/383100
gerritbot added a comment.
Change 383100 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[operations/mediawiki-config@master] Disable injecing RC records for commonswiki and ruwiki
https://gerrit.wikimedia.org/r/383100
Stashbot added a comment.
Mentioned in SAL (#wikimedia-operations) [2017-10-09T10:38:33Z] Synchronized php-1.31.0-wmf.2/extensions/Wikidata/extensions/Wikibase/client: Re-instate $wgWBClientSettings[injectRecentChanges] (T171027) (duration: 00m 56s)
gerritbot added a comment.
Change 383093 merged by jenkins-bot:
[mediawiki/extensions/Wikidata@wmf/1.31.0-wmf.2] Re-instate $wgWBClientSettings['injectRecentChanges']
https://gerrit.wikimedia.org/r/383093
gerritbot added a comment.
Change 383093 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[mediawiki/extensions/Wikidata@wmf/1.31.0-wmf.2] Re-instate $wgWBClientSettings['injectRecentChanges']
https://gerrit.wikimedia.org/r/383093
jcrespo added a comment.
BTW, the decision was already mentioned as the 4th option among @Catrope's suggestions: "Disable Wikidata RC on large wikis until we have a more scalable implementation of the feature". I think nobody predicted how bad things were at the time.
Ladsgroup added a comment.
Let's merge and deploy yours on Monday
gerritbot added a comment.
Change 382990 abandoned by Ladsgroup:
Add config var for disabling inject RC records
https://gerrit.wikimedia.org/r/382990
Reedy added a comment.
I've done this as a revert in https://gerrit.wikimedia.org/r/#/c/382989/ as we still use the config in wmf-config for the repo wikis
gerritbot added a comment.
Change 382990 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[mediawiki/extensions/Wikibase@master] Add config var for disabling inject RC records
https://gerrit.wikimedia.org/r/382990
Ladsgroup added a comment.
Yeah, I'm building a new one, hold on
Reedy added a comment.
https://github.com/wikimedia/mediawiki-extensions-Wikibase/commit/cb9cdd33f7414b74eb54c1c2ee59d1004f98aff0#diff-02c5d304c0c716a8c29613958c112f79 removed the remnants after
Reedy added a comment.
In T171027#3667193, @Bawolff wrote:
I believe that the setting you are looking for is $wgWBClientSettings['injectRecentChanges']
I can see it mentioned in Wikibase.php in mw-config, but I can't see it anywhere in the actual code (unless it's obfuscated).
It's definitely
Reedy added a comment.
I believe the wikibase-InjectRCRecords job is "at fault" (to the extent that it's the one pushing stuff into the recent changes table).
If we override it in CommonSettings, replacing the array value with a NullJob, that should hopefully neuter it.
Failing that, stopping
jcrespo added a comment.
We can do it on commons and ruwiki, which are probably the most affected ones, plus some on s3.
Bawolff added a comment.
In T171027#3667090, @jcrespo wrote:
The introduction of wikidata events on recentchanges has converted the "light" recentchanges table into a monolithic 500GB table:
commonswiki]> SHOW TABLE STATUS like 'recentchanges'\G
*** 1. row
Ladsgroup added a comment.
I'm around to help with anything about the problem.
Mattflaschen-WMF added a comment.
In T171027#3628964, @Bawolff wrote:
They're not fully redundant, since rc_type for Wikidata is RC_EXTERNAL (from core, thus not Wikidata-specific).
Afaict only wikidata uses it.
That's true (at least for WMF), but as long as it's RC_EXTERNAL in core (not even
matej_suchanek added a comment.
rc_type is almost unused in Wikibase (not used for filtering).
Bawolff added a comment.
They're not fully redundant, since rc_type for Wikidata is RC_EXTERNAL (from core, thus not Wikidata-specific).
Afaict only wikidata uses it. Flow uses a different number. Personally I think it would have made more sense to stick with numbers and have each ext pick a
Mattflaschen-WMF added a comment.
Adding rc_type or rc_source (which one? these two fields seem largely redundant with each other) to the start of some indexes, so that queries that don't need to look at Wikidata rows can filter them out efficiently
They're not fully redundant, since rc_type for
jcrespo added a comment.
@Catrope Unfortunately it will take quite a bit of time to research and investigate the index, and I'm not able to attend to this right now due to several other infrastructure fires currently going on. I already commented on some possibilities and I am open to other
Catrope added a comment.
Thanks for the detailed analysis @Bawolff and @jcrespo . My $0.02:
Given the overwhelming number of Wikidata entries in the RC table (95% or more on some wikis, which surprised me a lot), I think that's the main problem that we should tackle. @Bawolff's split query
jcrespo added a comment.
After testing some indexes, I do not see a huge improvement: we can reduce from scanning 100M rows to 18M, but there can always be a combination of query parameters that does not filter many rows on recentchanges. Paging by id (or timestamp) is the only reliable solution
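The "paging by id" idea mentioned above can be sketched roughly as follows. This is only a minimal illustration using Python's sqlite3 module and a toy in-memory table, not the actual MediaWiki query: the function names, the page size, and the sample data are assumptions made for the example, and only two columns (rc_id, rc_title) are modelled.

```python
import sqlite3


def fetch_page(conn, after_id, page_size):
    """Return up to page_size rows whose rc_id is greater than after_id."""
    cur = conn.execute(
        "SELECT rc_id, rc_title FROM recentchanges "
        "WHERE rc_id > ? ORDER BY rc_id LIMIT ?",
        (after_id, page_size),
    )
    return cur.fetchall()


def iterate_all(conn, page_size):
    """Yield every row, one page at a time, resuming from the last rc_id seen."""
    last_id = 0
    while True:
        page = fetch_page(conn, last_id, page_size)
        if not page:
            return
        yield from page
        # The next page starts strictly after the last id of this one,
        # so no OFFSET (and no rescan of earlier rows) is needed.
        last_id = page[-1][0]


# Toy stand-in for the recentchanges table (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recentchanges (rc_id INTEGER PRIMARY KEY, rc_title TEXT)"
)
conn.executemany(
    "INSERT INTO recentchanges (rc_id, rc_title) VALUES (?, ?)",
    [(i, f"Page_{i}") for i in range(1, 26)],
)

rows = list(iterate_all(conn, page_size=10))
```

Each request then touches at most one page of rows through the primary key, regardless of how the other filters behave; whether this fits the real watchlist and recent changes queries is not something this sketch tries to establish.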
jcrespo added a comment.
I am testing with a new index on dbstore1002; meanwhile I had a chat with Bawolff and he mentioned that rc used to be a small table where many indexes and inefficient scanning were possible because it was a much smaller summary of revision. Apparently with the latest
jcrespo added a comment.
This is my view of the issue, based on the comments above.
Short term:
If the number of watchlist items for the user is less than N (N to be determined), join Watchlist -> recentchanges, then "suffer" a small in-memory sort (I believe this is the current situation)