jenkins-bot has submitted this change and it was merged.
Change subject: Hook Thumbor into the custom rewrite.py Swift middleware
......................................................................
Hook Thumbor into the custom rewrite.py Swift middleware
This is almost the same rewrite.py used in production. Requests now
go from Varnish to Swift, which, when it doesn't have the file it needs,
requests it from Thumbor, which generates it on the fly.
Thumbor is responsible for storing the thumbnail in swift and serving
it to rewrite.py for it to bubble back up the request chain.
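As a rough illustration of the request chain described above, here is a
minimal sketch in Python. It is not code from this change: FakeThumbor,
serve_thumbnail and the dict standing in for Swift are hypothetical names
used only to show the "look in Swift first, fall back to Thumbor, which
stores the result" behaviour.

```python
class FakeThumbor(object):
    """Hypothetical stand-in for Thumbor: renders a thumbnail and stores it."""

    def __init__(self, store):
        self.store = store  # dict standing in for the Swift thumbnail container
        self.calls = 0

    def generate(self, path):
        self.calls += 1
        body = ("thumbnail for " + path).encode("utf-8")
        # Per the commit message, Thumbor (not rewrite.py) writes to Swift.
        self.store[path] = body
        return body


def serve_thumbnail(path, store, thumbor):
    """Return the thumbnail from the store, asking Thumbor only on a miss."""
    body = store.get(path)
    if body is None:
        body = thumbor.generate(path)
    return body
```

With this shape, a second request for the same path is answered from the
store and never reaches the thumbnailer.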
I had to disable some strange hardcoded proxying (where the image
scaler proxies itself?) and add support for headers that are meaningful.
The xkey header in particular is set up so that it's generated in Thumbor
and stored in Swift permanently. This way, when an existing thumbnail
has to be pulled from Swift, it will have an xkey. As a result, Varnish
never needs to compute xkeys; it only consumes them, which helps
mitigate encoding issues that might happen.
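The header pass-through can be sketched as follows. The header list is the
one used in handle404 in rewrite.py; the function name
copy_passthrough_headers and the sample header values are assumptions made
for illustration, and the case-insensitive lookup is a simplification of
what the upstream response object does.

```python
# Headers that rewrite.py copies from the thumbnailer response.
PASSTHROUGH_HEADERS = (
    'Content-Length', 'Content-Disposition', 'Last-Modified',
    'Accept-Ranges', 'XKey', 'Engine', 'Server',
    'Processing-Time', 'Processing-Utime',
)


def copy_passthrough_headers(upstream_headers):
    """Return (name, value) pairs for allowed headers that are present and non-empty."""
    lowered = dict((name.lower(), value)
                   for name, value in upstream_headers.items())
    copied = []
    for name in PASSTHROUGH_HEADERS:
        value = lowered.get(name.lower())
        if value:
            copied.append((name, value))
    return copied
```

Anything not in the list, and any listed header with an empty value, is
dropped rather than forwarded.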
Overall this greatly simplifies the VCL, which is kept dumb. Note that
this doesn't handle Swift shards yet, nor multiwiki setups, both of which
will be required for something like this to work in production.
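For reference, the path-to-container mapping that rewrite.py applies to
regular uploads can be sketched like this. The regular expression and the
container naming are taken from the file added below; the helper name
container_for and its sharded_containers parameter are hypothetical, and
the example paths assume the "/wiki" prefix that the VCL in this change
adds.

```python
import re

# Same pattern as the "regular uploads" match in rewrite.py.
UPLOAD_RE = re.compile(
    r'^/(?P<proj>[^/]+)/(?P<lang>[^/]+)/((?P<zone>transcoded|thumb)/)?'
    r'(?P<path>((temp|archive)/)?[0-9a-f]/(?P<shard>[0-9a-f]{2})/.+)$'
)


def container_for(path, sharded_containers=()):
    """Map a request path to (container, object path), or None if it doesn't match."""
    match = UPLOAD_RE.match(path)
    if match is None:
        return None
    # No zone in the URL means the "public" zone.
    zone = match.group('zone') or 'public'
    container = "%s-%s-local-%s" % (
        match.group('proj'), match.group('lang'), zone)
    # Only containers listed as sharded get the 2-hex-digit suffix.
    if container in sharded_containers:
        container += ".%s" % match.group('shard')
    return container, match.group('path')
```

With an empty shard list, as configured in this change, the shard suffix
is never appended.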
Bug: T126521
Change-Id: I13d53401b1c4875ff74a111336fb58f35570d862
---
M puppet/hieradata/common.yaml
M puppet/modules/role/templates/swift/apache2.conf.erb
M puppet/modules/role/templates/thumbor/local_repo.php.erb
A puppet/modules/swift/files/SwiftMedia/wmf/__init__.py
A puppet/modules/swift/files/SwiftMedia/wmf/rewrite.py
M puppet/modules/swift/manifests/init.pp
M puppet/modules/swift/templates/proxy-server.conf.erb
M puppet/modules/swift/templates/ring.conf.erb
M puppet/modules/thumbor/manifests/init.pp
M puppet/modules/thumbor/templates/thumbor.conf.erb
M puppet/modules/thumbor/templates/varnish.vcl.erb
11 files changed, 446 insertions(+), 44 deletions(-)
Approvals:
BryanDavis: Looks good to me, approved
Filippo Giunchedi: Looks good to me, but someone else must approve
jenkins-bot: Verified
diff --git a/puppet/hieradata/common.yaml b/puppet/hieradata/common.yaml
index f485068..9e0581f 100644
--- a/puppet/hieradata/common.yaml
+++ b/puppet/hieradata/common.yaml
@@ -378,6 +378,8 @@
swift::account_cfg_file: /etc/swift/account-server.conf
swift::object_cfg_file: /etc/swift/object-server.conf
swift::container_cfg_file: /etc/swift/container-server.conf
+swift::thumbnail_container: images-local-thumb
+swift::public_container: images-local-public
trafficserver::deploy_dir: "%{hiera('mwv::services_dir')}/trafficserver"
trafficserver::version: 6.0.0
diff --git a/puppet/modules/role/templates/swift/apache2.conf.erb b/puppet/modules/role/templates/swift/apache2.conf.erb
index 209ab7e..abbf623 100644
--- a/puppet/modules/role/templates/swift/apache2.conf.erb
+++ b/puppet/modules/role/templates/swift/apache2.conf.erb
@@ -1,4 +1,4 @@
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
LogLevel trace8
-ProxyPassMatch "^/images/(?!thumb/)(.*)$" "http://127.0.0.1:<%= scope['::swift::port'] %>/v1/AUTH_<%= scope['::swift::project'] %>/wiki-local-public/$1$2"
\ No newline at end of file
+ProxyPassMatch "^/images/(?!thumb/)(.*)$" "http://127.0.0.1:<%= scope['::swift::port'] %>/v1/AUTH_<%= scope['::swift::project'] %>/wiki-<%= scope['::swift::public_container'] %>/$1$2"
\ No newline at end of file
diff --git a/puppet/modules/role/templates/thumbor/local_repo.php.erb b/puppet/modules/role/templates/thumbor/local_repo.php.erb
index 59e3af2..0f1c2fd 100644
--- a/puppet/modules/role/templates/thumbor/local_repo.php.erb
+++ b/puppet/modules/role/templates/thumbor/local_repo.php.erb
@@ -23,6 +23,9 @@
'tif' => 'http://127.0.0.1:6081' . $wgUploadPath . '/thumb',
),
),
+ 'public' => array(
+ 'container' => '<%= scope['::swift::public_container'] %>',
+ ),
),
'hashLevels' => $wgHashedUploadDirectory ? 2 : 0,
'thumbScriptUrl' => $wgThumbnailScriptPath,
diff --git a/puppet/modules/swift/files/SwiftMedia/wmf/__init__.py b/puppet/modules/swift/files/SwiftMedia/wmf/__init__.py
new file mode 100644
index 0000000..1a72d32
--- /dev/null
+++ b/puppet/modules/swift/files/SwiftMedia/wmf/__init__.py
@@ -0,0 +1 @@
+__version__ = '1.1.0'
diff --git a/puppet/modules/swift/files/SwiftMedia/wmf/rewrite.py b/puppet/modules/swift/files/SwiftMedia/wmf/rewrite.py
new file mode 100644
index 0000000..b82b339
--- /dev/null
+++ b/puppet/modules/swift/files/SwiftMedia/wmf/rewrite.py
@@ -0,0 +1,384 @@
+# Portions Copyright (c) 2010 OpenStack, LLC.
+# Everything else Copyright (c) 2011 Wikimedia Foundation, Inc.
+# all of it licensed under the Apache Software License, included by reference.
+
+# unit test is in test_rewrite.py. Tests are referenced by numbered comments.
+
+import webob
+import webob.exc
+import re
+from eventlet.green import urllib2
+import time
+import urlparse
+from swift.common.utils import get_logger
+from swift.common.wsgi import WSGIContext
+
+
+class DumbRedirectHandler(urllib2.HTTPRedirectHandler):
+ def http_error_301(self, req, fp, code, msg, headers):
+ return None
+
+ def http_error_302(self, req, fp, code, msg, headers):
+ return None
+
+
+class _WMFRewriteContext(WSGIContext):
+ """
+ Rewrite Media Store URLs so that swift knows how to deal with them.
+ """
+
+ def __init__(self, rewrite, conf):
+ WSGIContext.__init__(self, rewrite.app)
+ self.app = rewrite.app
+ self.logger = rewrite.logger
+
+ self.account = conf['account'].strip()
+ self.thumbhost = conf['thumbhost'].strip()
+ self.user_agent = conf['user_agent'].strip()
+ self.bind_port = conf['bind_port'].strip()
+ self.shard_container_list = [item.strip() for item in conf['shard_container_list'].split(',')]
+ # this parameter controls whether URLs sent to the thumbhost are sent as is (eg. upload/proj/lang/) or with the site/lang
+ # converted and only the path sent back (eg en.wikipedia/thumb).
+ self.backend_url_format = conf['backend_url_format'].strip() # asis, sitelang
+
+ def handle404(self, reqorig, url, container, obj):
+ """
+ Return a webob.Response which fetches the thumbnail from the thumb
+ host and returns it. Note also that the thumb host might write it out
+ to Swift so it won't 404 next time.
+ """
+ # go to the thumb media store for unknown files
+ reqorig.host = self.thumbhost
+ # upload doesn't like our User-agent, otherwise we could call it
+ # using urllib2.url()
+ proxy_handler = urllib2.ProxyHandler({'http': self.thumbhost})
+ redirect_handler = DumbRedirectHandler()
+
+ if self.backend_url_format == 'sitelang':
+ opener = urllib2.build_opener(redirect_handler, proxy_handler)
+ else:
+ opener = urllib2.build_opener(redirect_handler)
+
+ # Pass on certain headers from the caller squid to the scalers
+ opener.addheaders = []
+ if reqorig.headers.get('User-Agent') is not None:
+ opener.addheaders.append(('User-Agent', reqorig.headers.get('User-Agent')))
+ else:
+ opener.addheaders.append(('User-Agent', self.user_agent))
+ for header_to_pass in ['X-Forwarded-For', 'X-Forwarded-Proto',
+ 'Accept', 'Accept-Encoding', 'X-Original-URI']:
+ if reqorig.headers.get(header_to_pass) is not None:
+ opener.addheaders.append((header_to_pass, reqorig.headers.get(header_to_pass)))
+
+ # At least in theory, we shouldn't be handing out links to originals
+ # that we don't have (or in the case of thumbs, can't generate).
+ # However, someone may have a formerly valid link to a file, so we
+ # should do them the favor of giving them a 404.
+ try:
+ # break apach the url, url-encode it, and put it back together
+ urlobj = list(urlparse.urlsplit(reqorig.url))
+ # encode the URL but don't encode %s and /s
+ urlobj[2] = urllib2.quote(urlobj[2], '%/')
+ encodedurl = urlparse.urlunsplit(urlobj)
+
+ # if sitelang, we're supposed to mangle the URL so that
+ # http://upload.wikimedia.org/wikipedia/commons/thumb/a/a2/Little_kitten_.jpg/330px-Little_kitten_.jpg
+ # changes to http://commons.wikipedia.org/w/thumb_handler.php/a/a2/Little_kitten_.jpg/330px-Little_kitten_.jpg
+ if self.backend_url_format == 'sitelang':
+ match = re.match(r'^http://(?P<host>[^/]+)/(?P<proj>[^-/]+)/(?P<lang>[^/]+)/thumb/(?P<path>.+)', encodedurl)
+ if match:
+ proj = match.group('proj')
+ lang = match.group('lang')
+ # and here are all the legacy special cases, imported from thumb_handler.php
+ if(proj == 'wikipedia'):
+ if(lang in ['meta', 'commons', 'internal', 'grants']):
+ proj = 'wikimedia'
+ if(lang in ['mediawiki']):
+ lang = 'www'
+ proj = 'mediawiki'
+ hostname = '%s.%s.org' % (lang, proj)
+ if(proj == 'wikipedia' and lang == 'sources'):
+ #yay special case
+ hostname = 'wikisource.org'
+ # ok, replace the URL with just the part starting with thumb/
+ # take off the first two parts of the path (eg /wikipedia/commons/); make sure the string starts with a /
+ encodedurl = 'http://%s/w/thumb_handler.php/%s' % (hostname, match.group('path'))
+ # add in the X-Original-URI with the swift got (minus the hostname)
+ opener.addheaders.append(('X-Original-URI', list(urlparse.urlsplit(reqorig.url))[2]))
+ else:
+ # ASSERT this code should never be hit since only thumbs should call the 404 handler
+ self.logger.warn("non-thumb in 404 handler! encodedurl = %s" % encodedurl)
+ resp = webob.exc.HTTPNotFound('Unexpected error')
+ return resp
+ else:
+ # log the result of the match here to test and make sure it's sane before enabling the config
+ match = re.match(r'^http://(?P<host>[^/]+)/(?P<proj>[^-/]+)/(?P<lang>[^/]+)/thumb/(?P<path>.+)', encodedurl)
+ if match:
+ proj = match.group('proj')
+ lang = match.group('lang')
+ self.logger.warn("sitelang match has proj %s lang %s encodedurl %s" % (proj, lang, encodedurl))
+ else:
+ self.logger.warn("no sitelang match on encodedurl: %s" % encodedurl)
+
+ # ok, call the encoded url
+ upcopy = opener.open(encodedurl)
+ except urllib2.HTTPError, error:
+ # copy the urllib2 HTTPError into a webob HTTPError class as-is
+
+ class CopiedHTTPError(webob.exc.HTTPError):
+ code = error.code
+ title = error.msg
+
+ def html_body(self, environ):
+ return self.detail
+
+ def __init__(self):
+ super(CopiedHTTPError, self).__init__(
+ detail="".join(error.readlines()),
+ headers=error.hdrs.items())
+
+ resp = CopiedHTTPError()
+ return resp
+ except urllib2.URLError, error:
+ msg = 'There was a problem while contacting the image scaler: %s' % \
+ error.reason
+ resp = webob.exc.HTTPServiceUnavailable(msg)
+ return resp
+
+ # get the Content-Type.
+ uinfo = upcopy.info()
+ c_t = uinfo.gettype()
+ content_length = uinfo.getheader('Content-Length', None)
+ # sometimes Last-Modified isn't present; use now() when that happens.
+ try:
+ last_modified = time.mktime(uinfo.getdate('Last-Modified'))
+ except TypeError:
+ last_modified = time.mktime(time.localtime())
+
+ resp = webob.Response(app_iter=upcopy, content_type=c_t)
+ # add in the headers if we've got them
+ for header in ['Content-Length', 'Content-Disposition', 'Last-Modified', 'Accept-Ranges', 'XKey', 'Engine', 'Server', 'Processing-Time', 'Processing-Utime']:
+ if(uinfo.getheader(header)):
+ resp.headers.add(header, uinfo.getheader(header))
+
+ # also add CORS; see also our CORS middleware
+ resp.headers.add('Access-Control-Allow-Origin', '*')
+
+ return resp
+
+ def handle_request(self, env, start_response):
+ req = webob.Request(env)
+
+ # Double (or triple, etc.) slashes in the URL should be ignored; collapse them. fixes T34864
+ req.path_info = re.sub(r'/{2,}', '/', req.path_info)
+
+ # Keep a copy of the original request so we can ask the scalers for it
+ reqorig = req.copy()
+
+ # Containers have 5 components: project, language, repo, zone, and shard.
+ # If there's no zone in the URL, the zone is assumed to be 'public' (for b/c).
+ # Shard is optional (and configurable), and is only used for large containers.
+ #
+ # Projects are wikipedia, wikinews, etc.
+ # Languages are en, de, fr, commons, etc.
+ # Repos are local, timeline, etc.
+ # Zones are public, thumb, temp, etc.
+ # Shard is extracted from "hash paths" in the URL and is 2 hex digits.
+ #
+ # These attributes are mapped to container names in the form of either:
+ # (a) proj-lang-repo-zone (if not sharded)
+ # (b) proj-lang-repo-zone.shard (if sharded)
+ # (c) global-data-repo-zone (if not sharded)
+ # (d) global-data-repo-zone.shard (if sharded)
+ #
+ # Rewrite wiki-global URLs of these forms:
+ # (a) http://upload.wikimedia.org/math/<relpath>
+ # => http://msfe/v1/AUTH_<hash>/global-data-math-render/<relpath>
+ # (b) http://upload.wikimedia.org/<proj>/<lang>/math/<relpath> (legacy)
+ # => http://msfe/v1/AUTH_<hash>/global-data-math-render/<relpath>
+ #
+ # Rewrite wiki-relative URLs of these forms:
+ # (a) http://upload.wikimedia.org/<proj>/<lang>/<relpath>
+ # => http://msfe/v1/AUTH_<hash>/<proj>-<lang>-local-public/<relpath>
+ # (b) http://upload.wikimedia.org/<proj>/<lang>/archive/<relpath>
+ # => http://msfe/v1/AUTH_<hash>/<proj>-<lang>-local-public/archive/<relpath>
+ # (c) http://upload.wikimedia.org/<proj>/<lang>/thumb/<relpath>
+ # => http://msfe/v1/AUTH_<hash>/<proj>-<lang>-local-thumb/<relpath>
+ # (d) http://upload.wikimedia.org/<proj>/<lang>/thumb/archive/<relpath>
+ # => http://msfe/v1/AUTH_<hash>/<proj>-<lang>-local-thumb/archive/<relpath>
+ # (e) http://upload.wikimedia.org/<proj>/<lang>/thumb/temp/<relpath>
+ # => http://msfe/v1/AUTH_<hash>/<proj>-<lang>-local-thumb/temp/<relpath>
+ # (f) http://upload.wikimedia.org/<proj>/<lang>/transcoded/<relpath>
+ # => http://msfe/v1/AUTH_<hash>/<proj>-<lang>-local-transcoded/<relpath>
+ # (g) http://upload.wikimedia.org/<proj>/<lang>/timeline/<relpath>
+ # => http://msfe/v1/AUTH_<hash>/<proj>-<lang>-timeline-render/<relpath>
+
+ # regular uploads
+ match = re.match(r'^/(?P<proj>[^/]+)/(?P<lang>[^/]+)/((?P<zone>transcoded|thumb)/)?(?P<path>((temp|archive)/)?[0-9a-f]/(?P<shard>[0-9a-f]{2})/.+)$', req.path)
+ if match:
+ proj = match.group('proj')
+ lang = match.group('lang')
+ repo = 'local' # the upload repo name is "local"
+ # Get the repo zone (if not provided that means "public")
+ zone = (match.group('zone') if match.group('zone') else 'public')
+ # Get the object path relative to the zone (and thus container)
+ obj = match.group('path') # e.g. "archive/a/ab/..."
+ shard = match.group('shard')
+
+ # timeline renderings
+ if match is None:
+ # /wikipedia/en/timeline/a876297c277d80dfd826e1f23dbfea3f.png
+ match = re.match(r'^/(?P<proj>[^/]+)/(?P<lang>[^/]+)/(?P<repo>timeline)/(?P<path>.+)$', req.path)
+ if match:
+ proj = match.group('proj') # wikipedia
+ lang = match.group('lang') # en
+ repo = match.group('repo') # timeline
+ zone = 'render'
+ obj = match.group('path') # a876297c277d80dfd826e1f23dbfea3f.png
+ shard = ''
+
+ # math renderings
+ if match is None:
+ # /math/c/9/f/c9f2055dadfb49853eff822a453d9ceb.png
+ # /wikipedia/en/math/c/9/f/c9f2055dadfb49853eff822a453d9ceb.png (legacy)
+ match = re.match(r'^(/(?P<proj>[^/]+)/(?P<lang>[^/]+))?/(?P<repo>math)/(?P<path>(?P<shard1>[0-9a-f])/(?P<shard2>[0-9a-f])/.+)$', req.path)
+
+ if match:
+ proj = 'global'
+ lang = 'data'
+ repo = match.group('repo') # math
+ zone = 'render'
+ obj = match.group('path') # c/9/f/c9f2055dadfb49853eff822a453d9ceb.png
+ shard = match.group('shard1') + match.group('shard2') # c9
+
+ # score renderings
+ if match is None:
+ # /score/j/q/jqn99bwy8777srpv45hxjoiu24f0636/jqn99bwy.png
+ # /score/override-midi/8/i/8i9pzt87wtpy45lpz1rox8wusjkt7ki.ogg
+ match = re.match(r'^/(?P<repo>score)/(?P<path>.+)$', req.path)
+ if match:
+ proj = 'global'
+ lang = 'data'
+ repo = match.group('repo') # score
+ zone = 'render'
+ obj = match.group('path') # j/q/jqn99bwy8777srpv45hxjoiu24f0636/jqn99bwy.png
+ shard = ''
+
+ if match is None:
+ match = re.match(r'^/monitoring/(?P<what>.+)$', req.path)
+ if match:
+ what = match.group('what')
+ if what == 'frontend':
+ headers = {'Content-Type': 'application/octet-stream'}
+ resp = webob.Response(headers=headers, body="OK\n")
+ elif what == 'backend':
+ req.host = '127.0.0.1:%s' % self.bind_port
+ req.path_info = "/v1/%s/monitoring/backend" % self.account
+
+ app_iter = self._app_call(env)
+ status = self._get_status_int()
+ headers = self._response_headers
+
+ resp = webob.Response(status=status, headers=headers, app_iter=app_iter)
+ else:
+ resp = webob.exc.HTTPNotFound('Monitoring type not found "%s"' % (req.path))
+ return resp(env, start_response)
+
+ if match is None:
+ match = re.match(r'^/(?P<path>[^/]+)?$', req.path)
+ # /index.html /favicon.ico /robots.txt etc.
+ # serve from a default "root" container
+ if match:
+ path = match.group('path')
+ if not path:
+ path = 'index.html'
+
+ req.host = '127.0.0.1:%s' % self.bind_port
+ req.path_info = "/v1/%s/root/%s" % (self.account, path)
+
+ app_iter = self._app_call(env)
+ status = self._get_status_int()
+ headers = self._response_headers
+
+ resp = webob.Response(status=status, headers=headers, app_iter=app_iter)
+ return resp(env, start_response)
+
+ # Internally rewrite the URL based on the regex it matched...
+ if match:
+ # Get the per-project "conceptual" container name, e.g. "<proj><lang><repo><zone>"
+ container = "%s-%s-%s-%s" % (proj, lang, repo, zone)
+ # Add 2-digit shard to the container if it is supposed to be sharded.
+ # We may thus have an "actual" container name like "<proj><lang><repo><zone>.<shard>"
+ if container in self.shard_container_list:
+ container += ".%s" % shard
+
+ # Save a url with just the account name in it.
+ req.path_info = "/v1/%s" % (self.account)
+ port = self.bind_port
+ req.host = '127.0.0.1:%s' % port
+ url = req.url[:]
+ # Create a path to our object's name.
+ req.path_info = "/v1/%s/%s/%s" % (self.account, container, urllib2.unquote(obj))
+ #self.logger.warn("new path is %s" % req.path_info)
+
+ # do_start_response just remembers what it got called with,
+ # because our 404 handler will generate a different response.
+ app_iter = self._app_call(env)
+ status = self._get_status_int()
+ headers = self._response_headers
+
+ if 200 <= status < 300 or status == 304:
+ # We have it! Just return it as usual.
+ #headers['X-Swift-Proxy']= `headers`
+ return webob.Response(status=status, headers=headers,
+ app_iter=app_iter)(env, start_response)
+ elif status == 404:
+ # only send thumbs to the 404 handler; just return a 404 for everything else.
+ if repo == 'local' and zone == 'thumb':
+ resp = self.handle404(reqorig, url, container, obj)
+ return resp(env, start_response)
+ else:
+ resp = webob.exc.HTTPNotFound('File not found: %s' % req.path)
+ return resp(env, start_response)
+ elif status == 401:
+ # if the Storage URL is invalid or has expired we'll get this error.
+ resp = webob.exc.HTTPUnauthorized('Token may have timed out')
+ return resp(env, start_response)
+ else:
+ resp = webob.exc.HTTPNotImplemented('Unknown Status: %s' % (status))
+ return resp(env, start_response)
+ else:
+ resp = webob.exc.HTTPNotFound('Regexp failed to match URI: "%s"' % (req.path))
+ return resp(env, start_response)
+
+
+class WMFRewrite(object):
+ def __init__(self, app, conf):
+ self.app = app
+ self.conf = conf
+ self.logger = get_logger(conf)
+
+ def __call__(self, env, start_response):
+ # end-users should only do GET/HEAD, nothing else needs a rewrite
+ if env['REQUEST_METHOD'] not in ('HEAD', 'GET'):
+ return self.app(env, start_response)
+
+ # do nothing on authenticated and authentication requests
+ path = env['PATH_INFO']
+ if path.startswith('/auth') or path.startswith('/v1/AUTH_'):
+ return self.app(env, start_response)
+
+ context = _WMFRewriteContext(self, self.conf)
+ return context.handle_request(env, start_response)
+
+
+def filter_factory(global_conf, **local_conf):
+ conf = global_conf.copy()
+ conf.update(local_conf)
+
+ def wmfrewrite_filter(app):
+ return WMFRewrite(app, conf)
+
+ return wmfrewrite_filter
+
+# vim: set expandtab tabstop=4 shiftwidth=4 autoindent:
diff --git a/puppet/modules/swift/manifests/init.pp b/puppet/modules/swift/manifests/init.pp
index 71f949f..8e49550 100644
--- a/puppet/modules/swift/manifests/init.pp
+++ b/puppet/modules/swift/manifests/init.pp
@@ -34,6 +34,12 @@
# [*container_cfg_file*]
# Swift container server configuration file. The file will be generated by Puppet.
#
+# [*thumbnail_container*]
+# Swift container where thumbnails will be stored.
+#
+# [*public_container*]
+# Swift container where originals will be stored.
+#
class swift (
$storage_dir,
$port,
@@ -45,6 +51,8 @@
$account_cfg_file,
$object_cfg_file,
$container_cfg_file,
+ $thumbnail_container,
+ $public_container,
) {
include ::apache::mod::proxy
include ::apache::mod::proxy_http
@@ -55,6 +63,7 @@
require_package('swift-object')
require_package('swift-proxy')
require_package('python-swiftclient')
+ require_package('python-webob')
user { 'swift':
ensure => present,
@@ -100,6 +109,15 @@
mode => '0644',
}
+ file { '/usr/local/lib/python2.7/dist-packages/wmf/':
+ owner => 'root',
+ group => 'root',
+ mode => '0444',
+ source => 'puppet:///modules/swift/SwiftMedia/wmf/',
+ recurse => 'remote',
+ require => Package['python-webob'],
+ }
+
swift::ring { $account_cfg_file:
ring_type => 'account',
cfg_file => $account_cfg_file,
@@ -133,18 +151,19 @@
File["${storage_dir}/1"],
File[$cfg_file],
File[$proxy_cfg_file],
+ File['/usr/local/lib/python2.7/dist-packages/wmf/'],
Ring[$account_cfg_file],
Ring[$object_cfg_file],
Ring[$container_cfg_file],
],
}
- swift::container { 'wiki-local-public':
+ swift::container { "wiki-${public_container}":
public => true,
require => Exec['swift-init'],
}
- swift::container { 'wiki-local-thumb':
+ swift::container { "wiki-${thumbnail_container}":
public => true,
require => Exec['swift-init'],
}
diff --git a/puppet/modules/swift/templates/proxy-server.conf.erb b/puppet/modules/swift/templates/proxy-server.conf.erb
index f0eb79a..321a052 100644
--- a/puppet/modules/swift/templates/proxy-server.conf.erb
+++ b/puppet/modules/swift/templates/proxy-server.conf.erb
@@ -7,7 +7,7 @@
log_facility = LOG_LOCAL1
[pipeline:main]
-pipeline = healthcheck cache tempauth proxy-server
+pipeline = rewrite healthcheck cache tempauth proxy-server
[app:proxy-server]
use = egg:swift#proxy
@@ -24,3 +24,21 @@
[filter:cache]
use = egg:swift#memcache
+
+[filter:rewrite]
+# the auth system turns our login and key into an account / token pair.
+# the account remains valid forever, but the token times out.
+account = AUTH_<%= scope['::swift::project'] %>
+# the name of the scaler cluster.
+thumbhost = 127.0.0.1:8888
+# upload doesn't like our User-agent (Python-urllib/2.6), otherwise we could call it using urllib2.urlopen()
+user_agent = Mozilla/5.0
+# this list is the containers that should be sharded
+shard_container_list = ''
+# backend_url_format controls whether we pass the URL through to the thumbhost unmolested
+# or mangle it to be consumed by mediawiki. ms5 takes URLs unmolested, mediawiki wants them
+# transformed to something more palatable (specifically, turning http://upload/proj/lang/ into http://lang.proj/
+# valid options are 'asis' (leave it alone) and 'sitelang' (change upload to lang.site.org)
+backend_url_format = 'asis'
+
+paste.filter_factory = wmf.rewrite:filter_factory
\ No newline at end of file
diff --git a/puppet/modules/swift/templates/ring.conf.erb b/puppet/modules/swift/templates/ring.conf.erb
index 487ca3e..7a136a2 100644
--- a/puppet/modules/swift/templates/ring.conf.erb
+++ b/puppet/modules/swift/templates/ring.conf.erb
@@ -12,6 +12,8 @@
[app:<%= @ring_type %>-server]
use = egg:swift#<%= @ring_type %>
+# Only for object-server
+allowed_headers = content-disposition, content-encoding, x-delete-at, x-object-manifest, x-content-duration, xkey
[<%= @ring_type %>-replicator]
vm_test_mode = yes
diff --git a/puppet/modules/thumbor/manifests/init.pp b/puppet/modules/thumbor/manifests/init.pp
index 464587e..fceb8f7 100644
--- a/puppet/modules/thumbor/manifests/init.pp
+++ b/puppet/modules/thumbor/manifests/init.pp
@@ -61,6 +61,9 @@
# For pycurl, a dependency of thumbor
require_package('libcurl4-gnutls-dev')
+ # For lxml, a dependency of thumbor-plugins
+ require_package('libxml2-dev', 'libxslt1-dev')
+
$statsd_host = 'localhost'
$statsd_prefix = 'Thumbor'
@@ -93,6 +96,8 @@
# Needs to be an explicit dependency, for the packages pointing to git repos
Package['git'],
Package['libcurl4-gnutls-dev'],
+ Package['libxml2-dev'],
+ Package['libxslt1-dev'],
],
timeout => 600, # This venv can be particularly long to download and setup
}
@@ -148,12 +153,6 @@
host => '127.0.0.1',
port => $::swift::port,
onlyif => 'req.url ~ "^/images/.*"',
- }
-
- varnish::backend { 'thumbor':
- host => '127.0.0.1',
- port => '8888',
- onlyif => 'req.url ~ "^/images/thumb/.*"',
}
varnish::config { 'thumbor':
diff --git a/puppet/modules/thumbor/templates/thumbor.conf.erb b/puppet/modules/thumbor/templates/thumbor.conf.erb
index 3f78b64..e06ccf0 100644
--- a/puppet/modules/thumbor/templates/thumbor.conf.erb
+++ b/puppet/modules/thumbor/templates/thumbor.conf.erb
@@ -594,8 +594,8 @@
SWIFT_HOST = '127.0.0.1:<%= scope['::swift::port'] %>'
SWIFT_API_PATH = '/v1/AUTH_<%= scope['::swift::project'] %>/'
SWIFT_AUTH_PATH = '/auth/v1.0'
-SWIFT_ORIGINAL_CONTAINER = 'wiki-local-public'
-SWIFT_THUMBNAIL_CONTAINER = 'wiki-local-thumb'
+SWIFT_ORIGINAL_CONTAINER = 'wiki-<%= scope['::swift::public_container'] %>'
+SWIFT_THUMBNAIL_CONTAINER = 'wiki-<%= scope['::swift::thumbnail_container'] %>'
SWIFT_USER = '<%= scope['::swift::project'] %>:<%= scope['::swift::user'] %>'
SWIFT_KEY = '<%= scope['::swift::key'] %>'
diff --git a/puppet/modules/thumbor/templates/varnish.vcl.erb b/puppet/modules/thumbor/templates/varnish.vcl.erb
index 3aaaae6..1ba5d18 100644
--- a/puppet/modules/thumbor/templates/varnish.vcl.erb
+++ b/puppet/modules/thumbor/templates/varnish.vcl.erb
@@ -12,16 +12,6 @@
}
sub vcl_recv {
- if (req.restarts == 0) {
- set req.backend_hint = swift;
- } else {
- set req.backend_hint = thumbor;
- # Restore original URL which has been rewritten for swift
- set req.url = req.http.X-Url;
- # This is to avoid overwriting the xkey served by Thumbor
- set req.http.X-Url = "";
- }
-
set req.http.X-Forwarded-For = client.ip;
# Since we expose varnish on the default port (6081) we need to rewrite
@@ -45,21 +35,15 @@
}
}
+ # Some oddity coming from mediawiki's swift code prefixing containers with the wiki
+ # name, which we end up having to compensate for here
+ if (req.url ~ "^/images/") {
+ set req.url = "/wiki" + req.url;
+ }
+
# Reject any methods that aren't expected to work in the context of thumbnails
if (req.method != "GET" && req.method != "HEAD") {
return (synth(405, "Method not allowed"));
- }
-
- # Swift needs the URLs rewritten, it can't do that alone
- if (req.backend_hint == swift) {
- # Save original URL for thumbor's sake
- set req.http.X-Url = req.url;
-
- if (req.url ~ "^/images/thumb/") {
- set req.url = regsub(req.url, "^/images/thumb/(.*)", "/v1/AUTH_<%= scope['::swift::project'] %>/wiki-local-thumb/\1");
- } else {
- set req.url = regsub(req.url, "^/images/(.*)", "/v1/AUTH_<%= scope['::swift::project'] %>/wiki-local-public/\1");
- }
}
return(hash);
@@ -82,19 +66,9 @@
}
}
- # If the thumbnail is a miss in swift, request it from thumbor
- if (resp.status == 404 && req.backend_hint == swift) {
- return(restart);
- }
-
# Let things go to the default-subs.vcl vcl_deliver
}
sub vcl_backend_response {
set beresp.http.Access-Control-Allow-Origin = "*";
-
- # Swift can't set the xkey itself, we have to help
- if (bereq.http.X-Url ~ "^/images/thumb/") {
- set beresp.http.xkey = regsub(bereq.http.X-Url, "^/images/thumb/[^/]+/[^/]+/[^/]+/.*-([0-9a-zA-Z]+)[\.0-9a-zA-Z]+$", "\1");
- }
}
\ No newline at end of file
--
To view, visit https://gerrit.wikimedia.org/r/279373
Gerrit-MessageType: merged
Gerrit-Change-Id: I13d53401b1c4875ff74a111336fb58f35570d862
Gerrit-PatchSet: 4
Gerrit-Project: mediawiki/vagrant
Gerrit-Branch: master
Gerrit-Owner: Gilles <[email protected]>
Gerrit-Reviewer: BryanDavis <[email protected]>
Gerrit-Reviewer: Dduvall <[email protected]>
Gerrit-Reviewer: Faidon Liambotis <[email protected]>
Gerrit-Reviewer: Filippo Giunchedi <[email protected]>
Gerrit-Reviewer: Gilles <[email protected]>
Gerrit-Reviewer: jenkins-bot <>