Krinkle has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/263566

Change subject: [WIP] Implement /w/static.php
......................................................................

[WIP] Implement /w/static.php

Currently, requests to static/php-{version}/resources/* are handled by
HHVM-static. Per T99096, we want this script to handle them instead, so that
the MediaWiki version is chosen by multiversion based on the hostname rather
than hardcoded in the path.

This allows:
* Keeping the cache intact when wikis move from one version to another
  (currently it is trashed every few days).
* Hash validation to avoid cache poisoning during deployment (see T47877).

This fixes:
* (bug) Requests for static/* are cached for too long. Changes to existing
  files effectively don't get deployed because the URL doesn't change and the
  cache expiry is too long.
* (enhancement) Requests for static/* are invalidated every week because of
  branch names.

WIP-TODO:
* Update MediaWiki core to output hashes in static urls.
* Fix static.php to more closely match behaviour of HHVM-static.
  - Header content-type is different (application/x-javascript instead of
    application/javascript).
  - Header access-control-allow-origin is missing (why? what adds this?).
  - Header x-content-type-options:nosniff gets added (why? is this a problem?).
  - Gzipped response became two or three bytes smaller (why? is gzip done at
    a different level?).

TODO:
* (After deploying this change) Set $wgResourceBasePath to '/w/static' in
  wmf-config.
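
A minimal sketch of how the first WIP-TODO item and the TODO above could fit
together, assuming core appends the first 8 hex characters of the file's SHA-1
as the "v" query parameter, which is what static.php in this change compares
against. The helper name exampleStaticUrl() and its parameters are
hypothetical; they are not part of MediaWiki core or of this change.

```php
<?php
/**
 * Hypothetical helper, for illustration only.
 *
 * @param string $branchDir Branch directory that holds the file
 * @param string $path File path relative to the branch directory
 * @param string $basePath Assumed value of $wgResourceBasePath
 * @param int $hashSize Number of hex characters of the SHA-1 to expose
 * @return string URL path, with "?v=<hash>" when the file is readable
 */
function exampleStaticUrl( $branchDir, $path, $basePath = '/w/static', $hashSize = 8 ) {
	$file = "$branchDir/$path";
	if ( !is_file( $file ) || !is_readable( $file ) ) {
		// No hash available: emit an unversioned URL. static.php in this
		// change answers such requests with the short (5 minute) expiry.
		return "$basePath/$path";
	}
	return "$basePath/$path?v=" . substr( sha1_file( $file ), 0, $hashSize );
}
```

A request for the resulting URL would match on the first branch whose copy of
the file has the same hash prefix, and only then receive the 30-day expiry.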

Bug: T99096
Change-Id: I1860061a6dcff65df450056df51cc7ced6183ae8
---
M multiversion/MWWikiversions.php
M multiversion/updateBranchPointers
A w/static.php
3 files changed, 162 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/operations/mediawiki-config refs/changes/66/263566/1

diff --git a/multiversion/MWWikiversions.php b/multiversion/MWWikiversions.php
index 0c90579..1bd3b7b 100644
--- a/multiversion/MWWikiversions.php
+++ b/multiversion/MWWikiversions.php
@@ -95,4 +95,11 @@
                }
                return $dbs;
        }
+
+       /**
+        * @return array List of available branch directory paths (php-*)
+        */
+       public static function getAvailableBranchDirs() {
+               return glob( MEDIAWIKI_DEPLOYMENT_DIR . '/php-*', GLOB_ONLYDIR ) ?: array();
+       }
 }
diff --git a/multiversion/updateBranchPointers b/multiversion/updateBranchPointers
index a7bfe08..2751332 100755
--- a/multiversion/updateBranchPointers
+++ b/multiversion/updateBranchPointers
@@ -23,19 +23,16 @@
 }
 
 function updateBranchPointers( $dryRun = false, $all = false ) {
-       $branchDirs = glob( MEDIAWIKI_DEPLOYMENT_DIR . '/php-*', GLOB_ONLYDIR );
+       $branchDirs = MWWikiversions::getAvailableBranchDirs();
 
-       if ( !is_array( $branchDirs ) || count( $branchDirs ) < 1 ) {
+       if ( !$branchDirs ) {
                fwrite( STDERR, __FUNCTION__ . ': no deployment branch directories found in ' . MEDIAWIKI_DEPLOYMENT_DIR . "\n" );
                exit( 1 );
        }
 
-       // Order directories using version_compare.
-       // The native sort is lexographical which gives the wrong result for
-       // collections such as [1.23wmf9, 1.23wmf10].
+       // Default sort is lexicographical which incorrectly sorts e.g. [1.23wmf9, 1.23wmf10]
        usort( $branchDirs, 'version_compare' );
        $branches = array();
-
        if ( $all ) {
                foreach ( $branchDirs as $target ) {
                        $dir = explode( '-', basename( $target ) );
diff --git a/w/static.php b/w/static.php
new file mode 100644
index 0000000..d542b15
--- /dev/null
+++ b/w/static.php
@@ -0,0 +1,152 @@
+<?php
+/**
+ * Serve static files in a multiversion-friendly way.
+ *
+ * See https://phabricator.wikimedia.org/T99096 for design requirements.
+ *
+ * Overview:
+ *
+ * - multiversion requires the MediaWiki script directory (/w) to be shared
+ *   across all domains. Files in /w are generic and load the real MediaWiki
+ *   entry point for the version currently configured for the host name.
+ * - MediaWiki configuration sets $wgResourceBasePath to "/w/static".
+ * - Apache configuration rewrites "/w/static/*" to /w/static.php (this file).
+ * - static.php streams the file from the appropriate MediaWiki branch directory.
+ *
+ * In addition to the above, this file also looks in older MediaWiki branch
+ * directories in order to support references from our static HTML cache for 30 days.
+ * Whilst responses from static may also be cached, they are not linked or guaranteed.
+ * As such, this file must be able to respond to requests for older resources as well.
+ */
+require_once './MWVersion.php';
+require getMediaWiki( 'includes/WebStart.php' );
+
+function staticShowError( $message ) {
+       header( 'Content-Type: text/plain; charset=utf-8' );
+       echo "$message\n";
+}
+
+/**
+ * Stream file from disk to web response
+ * Based on StreamFile::stream()
+ * @param string $filePath
+ * @param int $maxAge Time in seconds to cache successful response
+ */
+function staticStreamFile( $filePath, $maxAge = 500 ) {
+       $stat = stat( $filePath );
+       if ( !$stat ) {
+               header( 'HTTP/1.1 404 Not Found' );
+               staticShowError( 'Unknown file path' );
+               return;
+       }
+
+       $ctype = StreamFile::contentTypeFromPath( $filePath, /* safe: not for upload */ false );
+       if ( !$ctype || $ctype === 'unknown/unknown' ) {
+               header( 'HTTP/1.1 400 Bad Request' );
+               staticShowError( 'Invalid file type' );
+               return;
+       }
+
+       header( 'Last-Modified: ' . wfTimestamp( TS_RFC2822, $stat['mtime'] ) );
+       header( "Content-Type: $ctype" );
+       $maxAge = (int) $maxAge;
+       header( "Cache-Control: public, s-maxage=$maxAge, max-age=$maxAge" );
+
+       if ( !empty( $_SERVER['HTTP_IF_MODIFIED_SINCE'] ) ) {
+               $ims = preg_replace( '/;.*$/', '', $_SERVER['HTTP_IF_MODIFIED_SINCE'] );
+               if ( wfTimestamp( TS_UNIX, $stat['mtime'] ) <= strtotime( $ims ) ) {
+                       ini_set( 'zlib.output_compression', 0 );
+                       header( 'HTTP/1.1 304 Not Modified' );
+                       return;
+               }
+       }
+
+       header( 'Content-Length: ' . $stat['size'] );
+       readfile( $filePath );
+}
+
+function respondStaticFile() {
+       global $wgScriptPath;
+
+       if ( !isset( $_SERVER['REQUEST_URI'] ) ) {
+               header( 'HTTP/1.1 500 Internal Server Error' );
+               staticShowError( 'Invalid request' );
+               return;
+       }
+
+       // Strip query parameters
+       $uriPath = parse_url( $_SERVER['REQUEST_URI'], PHP_URL_PATH );
+
+       // Strip prefix
+       $urlPrefix = "$wgScriptPath/static/";
+       if ( strpos( $uriPath, $urlPrefix ) !== 0 ) {
+               header( 'HTTP/1.1 400 Bad Request' );
+               staticShowError( 'Bad request' );
+               return;
+       }
+       $path = substr( $uriPath, strlen( $urlPrefix ) );
+
+       $cacheLong = 30 * 24 * 3600; // 30 days
+       $cacheShort = 5 * 60; // 5 minutes
+       $hashSize = 8;
+
+       // Validation hash
+       $hash = isset( $_GET['v'] ) ? $_GET['v'] : false;
+       $fallback = false;
+       $maxAge = $cacheShort;
+
+       // Get branch dirs and sort with newest first
+       $branchDirs = MWWikiversions::getAvailableBranchDirs();
+       usort( $branchDirs, function ( $a, $b ) {
+               return version_compare( $b, $a );
+       } );
+
+       // Try each version in descending order
+       // - Requests without a validation hash will get the latest version.
+       //   (If the file no longer exists in the latest version, it will correctly
+       //   fall back to the newest version that still has the file.)
+       // - Requests with validation hash get the first match. If none found, falls back to the newest
+       //   version that has the file. Cache expiry is shortened in that case to allow eventual-consistency and
+       //   avoid cache poisoning (see T47877).
+       foreach ( $branchDirs as $branchDir ) {
+               // Use realpath() to prevent path escalation through e.g. "../"
+               $filePath = realpath( "$branchDir/$path" );
+               if ( !$filePath ) {
+                       continue;
+               }
+
+               if ( strpos( $filePath, "$branchDir/" ) !== 0 ) {
+                       header( 'HTTP/1.1 400 Bad Request' );
+                       staticShowError( 'Bad request' );
+                       return;
+               }
+
+               if ( file_exists( $filePath ) ) {
+                       if ( $hash ) {
+                               // Set fallback to the newest existing version.
+                               if ( !$fallback ) {
+                                       $fallback = $branchDir;
+                               }
+                               $sha1 = sha1_file( $filePath );
+                               if ( substr( $sha1, 0, $hashSize ) !== substr( $hash, 0, $hashSize ) ) {
+                                       continue;
+                               }
+                               // Cache hash-validated responses for long
+                               $maxAge = $cacheLong;
+                       }
+                       staticStreamFile( $filePath, $maxAge );
+                       return;
+               }
+       }
+
+       if ( !$fallback ) {
+               header( 'HTTP/1.1 404 Not Found' );
+               staticShowError( 'Unknown file' );
+               return;
+       }
+
+       staticStreamFile( "$fallback/$path", $maxAge );
+}
+
+wfResetOutputBuffers();
+respondStaticFile();
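
Two details of the lookup in static.php can be illustrated in isolation: the
newest-first ordering of branches, and the check that a resolved path stays
inside a branch directory. This is a sketch under stated assumptions, not the
code from the change. The version names come from the comment in
updateBranchPointers; the /srv/... paths and the helper exampleIsInsideDir()
are hypothetical.

```php
<?php
// Version names taken from the comment in updateBranchPointers.
$versions = array( '1.23wmf9', '1.23wmf10' );

// A descending string sort compares byte by byte, so it wrongly treats
// "1.23wmf9" as the newest version ("9" sorts after "1").
$lexicalNewestFirst = $versions;
rsort( $lexicalNewestFirst, SORT_STRING );

// Newest first by version: pass the arguments to version_compare() swapped.
$newestFirst = $versions;
usort( $newestFirst, function ( $a, $b ) {
	return version_compare( $b, $a );
} );

/**
 * Hypothetical helper: true only if $filePath lies below $dir.
 * Appending "/" keeps "php-1.23wmf10" from matching the prefix "php-1.23wmf1".
 */
function exampleIsInsideDir( $filePath, $dir ) {
	$dir = rtrim( $dir, '/' ) . '/';
	return strpos( $filePath, $dir ) === 0;
}

// A bare prefix test accepts a file from a sibling branch directory:
$barePrefixMatches = strpos( '/srv/php-1.23wmf10/x.js', '/srv/php-1.23wmf1' ) === 0;
// The helper rejects it, while still accepting a file in the directory itself:
$siblingInside = exampleIsInsideDir( '/srv/php-1.23wmf10/x.js', '/srv/php-1.23wmf1' );
$ownFileInside = exampleIsInsideDir( '/srv/php-1.23wmf1/x.js', '/srv/php-1.23wmf1' );
```

Here $lexicalNewestFirst[0] is '1.23wmf9' while $newestFirst[0] is '1.23wmf10',
and only the separator-terminated comparison tells the two sibling directories
apart.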

-- 
To view, visit https://gerrit.wikimedia.org/r/263566

Gerrit-MessageType: newchange
Gerrit-Change-Id: I1860061a6dcff65df450056df51cc7ced6183ae8
Gerrit-PatchSet: 1
Gerrit-Project: operations/mediawiki-config
Gerrit-Branch: master
Gerrit-Owner: Krinkle <[email protected]>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
