GWicke has uploaded a new change for review.
https://gerrit.wikimedia.org/r/295027
Change subject: For discussion: Reduce purge volume by moving dependent purges
to RefreshLinksJob
......................................................................
For discussion: Reduce purge volume by moving dependent purges to
RefreshLinksJob
On edit, we currently create
a) an HTMLCacheUpdateJob, which
- updates page_touched for all pages that transclude the edited page, and
- performs a CDN purge of all those pages.
b) a RefreshLinksJob, which will re-render the same set of pages, but won't
purge them.
RefreshLinksJob is significantly more expensive than HTMLCacheUpdateJob, and
can take a long time when a template used in millions of pages is edited.
HTMLCacheUpdateJob, on the other hand, only performs relatively cheap database
queries, and quickly sends out a lot of CDN purges.
The chief advantage of this scheme is timely CDN purges, which ensure that
anonymous users quickly pick up template edits across the site. Disadvantages
are:
- The quick processing of CDN purges can result in bursts of very high purge
rates, and minimizes the chances of coalescing purges from several quick
edits into a single CDN purge.
- High-volume anonymous traffic will trigger parser cache misses after the
HTMLCacheUpdateJob has executed. This leads to higher latency for users, and
can create spikes in the load on app servers, databases, memcached, etc.
This patch addresses these issues by moving CDN purges from HTMLCacheUpdateJob
to RefreshLinksJob. As a consequence, the following changes are expected:
- CDN purges should be less bursty, as RefreshLinksJob processing rates are
more limited.
- Multiple edits to the same popular template should result in only a single
CDN purge for the vast majority of pages using the template, as subsequent
page_touched increments will abort earlier refreshlinks jobs and their
purges. This is expected to reduce the overall rate of CDN purges
significantly.
- By purging the CDN only after the parser cache has been updated, anonymous
traffic is no longer going to hit parser cache misses with the associated
latency increase, and won't cause a spike in load on the infrastructure from
a high rate of re-renders.
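The coalescing effect described in the second point can be illustrated with a
toy model. This is a minimal sketch, not MediaWiki code: runRefreshJobs(), the
job array layout and the timestamps are all hypothetical, and it assumes that
all three edits land before any of the queued jobs runs.

```php
<?php
// Toy model, not MediaWiki code: every name here is hypothetical.

/**
 * Run queued refresh jobs in order and return the titles that get purged.
 *
 * @param array $pageTouched map of title => latest page_touched timestamp
 * @param array $jobs list of [ 'title' => string, 'rootTimestamp' => int ]
 * @return string[] titles purged, in execution order
 */
function runRefreshJobs( array $pageTouched, array $jobs ): array {
	$purged = [];
	foreach ( $jobs as $job ) {
		$title = $job['title'];
		// A later edit has bumped page_touched past this job's timestamp:
		// the job is stale, so skip both the re-render and the purge.
		if ( $pageTouched[$title] > $job['rootTimestamp'] ) {
			continue;
		}
		// The re-render would happen here, followed by a single purge.
		$purged[] = $title;
	}
	return $purged;
}

// Three quick template edits at t=100, 101 and 102 each queue a job for
// the same two dependent pages; page_touched ends up at 102 for both.
$pageTouched = [ 'Page_A' => 102, 'Page_B' => 102 ];
$jobs = [];
foreach ( [ 100, 101, 102 ] as $ts ) {
	foreach ( [ 'Page_A', 'Page_B' ] as $title ) {
		$jobs[] = [ 'title' => $title, 'rootTimestamp' => $ts ];
	}
}

$purged = runRefreshJobs( $pageTouched, $jobs );
echo count( $jobs ) . " jobs queued, " . count( $purged ) . " purges sent\n";
```

In this model six jobs are queued but only the two belonging to the last edit
send a purge. With purges in HTMLCacheUpdateJob, each of the three edits would
purge both pages.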
However, the downside is clearly that the purging of dependent pages is going
to be delayed, in line with the pace of RefreshLinksJob processing. This will
only affect anonymous users, as authenticated users will still trigger
immediate re-renders based on page_touched. It will also not affect the edited
pages themselves, as those are still purged immediately.
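To make the trade-off concrete, here is a simplified sketch of which render a
request sees before and after the RefreshLinksJob for a dependent page has run.
It is a model under stated assumptions, not the actual request path:
servedRenderTime() and the array keys are hypothetical, authenticated requests
are assumed to bypass the CDN, and an on-demand re-render is represented by the
page_touched timestamp.

```php
<?php
// Toy model of the trade-off; every name below is hypothetical.

/**
 * Return the timestamp of the render a request would be served.
 */
function servedRenderTime( bool $isAnonymous, array $page ): int {
	// Anonymous requests are answered from the CDN copy until it is purged.
	if ( $isAnonymous && $page['cdnCopyTime'] !== null ) {
		return $page['cdnCopyTime'];
	}
	// Otherwise the cached render is checked against page_touched, and an
	// outdated render is regenerated on the spot.
	if ( $page['parserCacheTime'] < $page['pageTouched'] ) {
		return $page['pageTouched'];
	}
	return $page['parserCacheTime'];
}

// Template edited at t=200; the dependent page's job has not run yet.
$pending = [ 'pageTouched' => 200, 'parserCacheTime' => 150, 'cdnCopyTime' => 150 ];
// After the job: render refreshed, CDN copy purged.
$refreshed = [ 'pageTouched' => 200, 'parserCacheTime' => 200, 'cdnCopyTime' => null ];

$anonBefore = servedRenderTime( true, $pending );
$userBefore = servedRenderTime( false, $pending );
$anonAfter = servedRenderTime( true, $refreshed );

echo "anonymous before job: $anonBefore\n";
echo "authenticated before job: $userBefore\n";
echo "anonymous after job: $anonAfter\n";
```

Until the job runs, the anonymous request keeps getting the older CDN copy,
while the authenticated request already triggers a fresh render.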
I believe that considering the performance and stability benefits of this
change, this is a reasonable trade-off to make. However, this is a judgment
call, which is why I am posting this patch for discussion.
Change-Id: Idb2867e2d90b11aa1bf2f249058b24c0c0a92036
---
M includes/jobqueue/jobs/HTMLCacheUpdateJob.php
M includes/jobqueue/jobs/RefreshLinksJob.php
2 files changed, 9 insertions(+), 18 deletions(-)
git pull ssh://gerrit.wikimedia.org:29418/mediawiki/core refs/changes/27/295027/1
diff --git a/includes/jobqueue/jobs/HTMLCacheUpdateJob.php b/includes/jobqueue/jobs/HTMLCacheUpdateJob.php
index 7acbdf2..14cd38b 100644
--- a/includes/jobqueue/jobs/HTMLCacheUpdateJob.php
+++ b/includes/jobqueue/jobs/HTMLCacheUpdateJob.php
@@ -131,24 +131,6 @@
__METHOD__
);
}
- // Get the list of affected pages (races only mean something else did the purge)
- $titleArray = TitleArray::newFromResult( $dbw->select(
- 'page',
- [ 'page_namespace', 'page_title' ],
- [ 'page_id' => $pageIds, 'page_touched' => $dbw->timestamp( $touchTimestamp ) ],
- __METHOD__
- ) );
-
- // Update CDN
- $u = CdnCacheUpdate::newFromTitles( $titleArray );
- $u->doUpdate();
-
- // Update file cache
- if ( $wgUseFileCache ) {
- foreach ( $titleArray as $title ) {
- HTMLFileCache::clearFileCache( $title );
- }
- }
}
public function workItemCount() {
diff --git a/includes/jobqueue/jobs/RefreshLinksJob.php b/includes/jobqueue/jobs/RefreshLinksJob.php
index 8870569..495d648 100644
--- a/includes/jobqueue/jobs/RefreshLinksJob.php
+++ b/includes/jobqueue/jobs/RefreshLinksJob.php
@@ -264,6 +264,15 @@
InfoAction::invalidateCache( $title );
+ // Update CDN
+ $u = CdnCacheUpdate::newSimplePurge( $title );
+ $u->doUpdate();
+
+ // Update file cache
+ if ( $wgUseFileCache ) {
+ HTMLFileCache::clearFileCache( $title );
+ }
+
return true;
}
--
To view, visit https://gerrit.wikimedia.org/r/295027
Gerrit-MessageType: newchange
Gerrit-Change-Id: Idb2867e2d90b11aa1bf2f249058b24c0c0a92036
Gerrit-PatchSet: 1
Gerrit-Project: mediawiki/core
Gerrit-Branch: master
Gerrit-Owner: GWicke <[email protected]>