Repository: trafficserver
Updated Branches:
  refs/heads/master 04c6fc93b -> 362c8b692

docs: Fix Metalink RFC number, s/plugins.config/plugin.config, and minor edits

Project: http://git-wip-us.apache.org/repos/asf/trafficserver/repo
Commit: http://git-wip-us.apache.org/repos/asf/trafficserver/commit/362c8b69
Tree: http://git-wip-us.apache.org/repos/asf/trafficserver/tree/362c8b69
Diff: http://git-wip-us.apache.org/repos/asf/trafficserver/diff/362c8b69

Branch: refs/heads/master
Commit: 362c8b692bb28c6bbd1232cad5477539dff4539f
Parents: 04c6fc9
Author: Jack Bates <[email protected]>
Authored: Mon Mar 10 10:29:07 2014 -0700
Committer: James Peach <[email protected]>
Committed: Mon Mar 10 11:26:17 2014 -0700

----------------------------------------------------------------------
 doc/reference/plugins/metalink.en.rst | 164 ++++++++++++++++-------------
 plugins/experimental/metalink/README | 29 +++--
 2 files changed, 102 insertions(+), 91 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/trafficserver/blob/362c8b69/doc/reference/plugins/metalink.en.rst
----------------------------------------------------------------------
diff --git a/doc/reference/plugins/metalink.en.rst b/doc/reference/plugins/metalink.en.rst
index 7b804cc..8ba10dc 100644
--- a/doc/reference/plugins/metalink.en.rst
+++ b/doc/reference/plugins/metalink.en.rst
@@ -1,113 +1,125 @@
-.. Licensed to the Apache Software Foundation (ASF) under one
- or more contributor license agreements. See the NOTICE file
- distributed with this work for additional information
- regarding copyright ownership. The ASF licenses this file
- to you under the Apache License, Version 2.0 (the
- "License"); you may not use this file except in compliance
- with the License. You may obtain a copy of the License at
-
+.. Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed
+ with this work for additional information regarding copyright
+ ownership.
The ASF licenses this file to you under the Apache + License, Version 2.0 (the "License"); you may not use this file + except in compliance with the License. You may obtain a copy of + the License at + http://www.apache.org/licenses/LICENSE-2.0 - - Unless required by applicable law or agreed to in writing, - software distributed under the License is distributed on an - "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY - KIND, either express or implied. See the License for the - specific language governing permissions and limitations - under the License. + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied. See the License for the specific language governing + permissions and limitations under the License. .. _metalink-plugin: + Metalink plugin =============== -The `metalink` plugin implements the -`Metalink <http://en.wikipedia.org/wiki/Metalink>`_ -protocol in order to try not to download the same file twice. This -improves cache efficiency and speeds up user downloads. +The `metalink` plugin implements the `Metalink`_ download description +format in order to try not to download the same file twice. This +improves cache efficiency and speeds up users' downloads. -Take standard headers and knowledge about objects in the cache and -potentially rewrite those headers so that a client will use a URL -that is already cached instead of one that isn't. - -The `metalink` headers are specified in :rfc:`6429` and :rfc:`3230` -and are sent by various download redirectors or content distribution -networks. +It takes standard headers and knowledge about objects in the cache and +potentially rewrites those headers so that a client will use a URL +that's already cached instead of one that isn't. 
The headers are +specified in :rfc:`6249` (Metalink/HTTP: Mirrors and Hashes) and +:rfc:`3230` (Instance Digests in HTTP) and are sent by various +download redirectors or content distribution networks. A lot of download sites distribute the same files from many different -mirrors and users don't know which mirrors are already cached. These +mirrors and users don't know which mirrors are already cached. These sites often present users with a simple download button, but the -button doesn't predictably access the same mirror, or a mirror that -is already cached. To users it seems like the download works sometimes +button doesn't predictably access the same mirror, or a mirror that's +already cached. To users it seems like the download works sometimes (takes seconds) and not others (takes hours), which is frustrating. An extreme example of this happens when users share a limited, possibly unreliable internet connection, as is common in parts of Africa for example. -How it works + +How it Works ------------ -When the `metalink` plugin sees a response with a ``Location: ...`` header and a -``Digest: SHA-256=...`` header, it checks to see if the URL in the Location -header is already cached. If it isn't, then it tries to find a URL -that is cached to use instead. It looks in the cache for some object -that matches the digest in the Digest header and if it finds -something, then it rewites the ``Location`` header with the URL from -that object. +When the plugin sees a response with a :mailheader:`Location: ...` +header and a :mailheader:`Digest: SHA-256=...` header, it checks if +the URL in the :mailheader:`Location` header is already cached. If it +isn't, then it tries to find a URL that is cached to use instead. It +looks in the cache for some object that matches the digest in the +:mailheader:`Digest` header and if it succeeds, then it rewites the +:mailheader:`Location` header with that object's URL. 
+ +This way a client should get sent to a URL that's already cached and +won't download the file again. -That way a client should get sent to a URL that's already cached -and the user won't end up downloading the file again. Installation ------------ -`metalink` is a global plugin. It is enabled by adding it to your -:file:`plugin.config` file. There are no options. +The `metalink` plugin is a global plugin. Enable it by adding +``metalink.so`` to your :file:`plugin.config` file. There are no +options. + Implementation Status --------------------- -The `metalink` plugin implements the ``TS_HTTP_SEND_RESPONSE_HDR_HOOK`` -hook to check and potentially rewrite the ``Location: ...`` and -``Digest: SHA-256=...`` headers after responses are cached. It -doesn't do it before they're cached because the contents of the -cache can change after responses are cached. It uses :c:func:`TSCacheRead` -to check if the URL in the ``Location: ...`` header is already -cached. In future, the plugin should also check if the URL is fresh -or not. - -The plugin implements ``TS_HTTP_READ_RESPONSE_HDR_HOOK`` and a null -transform to compute the SHA-256 digest for content as it's added -to the cache, then uses :c:func:`TSCacheWrite` to associate the -digest with the request URL. This adds a new cache object where the -key is the digest and the object is the request URL. +The plugin implements the ``TS_HTTP_SEND_RESPONSE_HDR_HOOK`` hook to +check and potentially rewrite the :mailheader:`Location: ...` and +:mailheader:`Digest: SHA-256=...` headers after responses are cached. +It doesn't do it before they're cached because the contents of the +cache can change after responses are cached. It uses +:c:func:`TSCacheRead` to check if the URL in the +:mailheader:`Location: ...` header is already cached. In future, the +plugin should also check if the URL is fresh or not. 
+ +The plugin implements the ``TS_HTTP_READ_RESPONSE_HDR_HOOK`` hook and +`a null transform`_ to compute the SHA-256 digest for content as it's +added to the cache. It uses SHA256_Init(), SHA256_Update(), and +SHA256_Final() from OpenSSL to compute the digest, then it uses +:c:func:`TSCacheWrite` to associate the digest with the request URL. +This adds a new cache object where the key is the digest and the +object is the request URL. To check if the cache already contains content that matches a digest, -the plugin must call :c:func:`TSCacheRead` with the digest as the -key, read the URL stored in the resultant object, and then call -:c:func:`TSCacheRead` again with this URL as the key. This is +the plugin must call :c:func:`TSCacheRead` with the digest as the key, +read the URL stored in the resultant object, and then call +:c:func:`TSCacheRead` again with this URL as the key. This is probably inefficient and should be improved. -An early version of the plugin scanned ``Link: <...>; rel=duplicate`` -headers. If the URL in the ``Location: ...`` header was not already -cached, it scanned ``Link: <...>; rel=duplicate`` headers for a URL -that was. The ``Digest: SHA-256=...`` header is superior because it -will find content that already exists in the cache in every case -that a ``Link: <...>; rel=duplicate`` header would, plus in cases -where the URL is not listed among the ``Link: <...>; rel=duplicate`` -headers, maybe because the content was downloaded from a URL not -participating in the content distribution network, or maybe because -there are too many mirrors to list in ``Link: <...>; rel=duplicate`` +An early version of the plugin scanned :mailheader:`Link: <...>; +rel=duplicate` headers. If the URL in the :mailheader:`Location: ...` +header wasn't already cached, it scanned :mailheader:`Link: <...>; +rel=duplicate` headers for a URL that was. 
The :mailheader:`Digest: +SHA-256=...` header is superior because it will find content that +already exists in the cache in every case that a :mailheader:`Link: +<...>; rel=duplicate` header would, plus in cases where the URL is not +listed among the :mailheader:`Link: <...>; rel=duplicate` headers, +maybe because the content was downloaded from a URL not participating +in the content distribution network, or maybe because there are too +many mirrors to list in :mailheader:`Link: <...>; rel=duplicate` headers. -The ``Digest: SHA-256=...`` header is also more efficient than ``Link: -<...>; rel=duplicate`` headers because it involves a constant number -of cache lookups. :rfc:`6249` requires a ``Digest: SHA-256=...`` header -or ``Link: <...>; rel=duplicate`` headers MUST be ignored: +The :mailheader:`Digest: SHA-256=...` header is also more efficient +than :mailheader:`Link: <...>; rel=duplicate` headers because it +involves a constant number of cache lookups. RFC 6249 requires a +:mailheader:`Digest: SHA-256=...` header or :mailheader:`Link: <...>; +rel=duplicate` headers MUST be ignored: + + If Instance Digests are not provided by the Metalink servers, the + :mailheader:`Link` header fields pertaining to this specification + MUST be ignored. + + Metalinks contain whole file hashes as described in Section 6, and + MUST include SHA-256, as specified in [FIPS-180-3]. - If Instance Digests are not provided by the Metalink servers, the - Link header fields pertaining to this specification MUST be ignored. - Metalinks contain whole file hashes as described in Section 6, - and MUST include SHA-256, as specified in [FIPS-180-3]. +.. _Metalink: http://en.wikipedia.org/wiki/Metalink +.. 
_a null transform: + /sdk/http-transformation-plugin/sample-null-transformation-plugin http://git-wip-us.apache.org/repos/asf/trafficserver/blob/362c8b69/plugins/experimental/metalink/README ---------------------------------------------------------------------- diff --git a/plugins/experimental/metalink/README b/plugins/experimental/metalink/README index 7f56dd8..46fdd7c 100644 --- a/plugins/experimental/metalink/README +++ b/plugins/experimental/metalink/README @@ -1,5 +1,5 @@ - Metalink + Metalink Try not to download the same file twice. Improve cache efficiency and speed up downloads. @@ -7,15 +7,15 @@ Take standard headers and knowledge about objects in the cache and potentially rewrite those headers so that a client will use a URL that's already cached instead of one that isn't. The headers are - specified in [RFC 6429] (Metalink/HTTP: Mirrors and Hashes) and + specified in [RFC 6249] (Metalink/HTTP: Mirrors and Hashes) and [RFC 3230] (Instance Digests in HTTP) and are sent by various download redirectors or content distribution networks. 1. Who Cares? - More important than saving a little bit of bandwidth, this saves - users from frustration. + More important than saving a little bandwidth, this saves users + from frustration. A lot of download sites distribute the same files from many different mirrors and users don't know which mirrors are already @@ -41,8 +41,8 @@ header is already cached. If it isn't, then it tries to find a URL that is cached to use instead. It looks in the cache for some object that matches the digest in the Digest header and if it - succeeds, then it rewites the Location header with the URL from - that object. + succeeds, then it rewites the Location header with that object's + URL. This way a client should get sent to a URL that's already cached and won't download the file again. @@ -50,17 +50,17 @@ 3. How to Use it - Just build the plugin and add it to your plugins.config file. 
+ Just build the plugin and add it to your plugin.config file. The code is distributed along with recent versions of Traffic Server, in the "plugins/experimental/metalink" directory. To build - it, pass the "--enable-experimental-plugins" option to the Traffic - Server configure script when you build Traffic Server: + it, pass the "--enable-experimental-plugins" option to the + configure script when you build Traffic Server: <pre>$ ./configure --enable-experimental-plugins</pre> When you're done building Traffic Server, add "metalink.so" to your - plugins.config file to start using the plugin. + plugin.config file to start using the plugin. 4. Read More @@ -68,12 +68,11 @@ More details are on the [wiki page] in the Traffic Server wiki. - [RFC 6429] http://tools.ietf.org/html/rfc6249 + [RFC 6249] http://tools.ietf.org/html/rfc6249 - [RFC 3230] http://tools.ietf.org/html/rfc3230 + [RFC 3230] http://tools.ietf.org/html/rfc3230 [How to cache openSUSE repositories with Squid] - http://wiki.jessen.ch/index/How_to_cache_openSUSE_repositories_with_Squid + http://wiki.jessen.ch/index/How_to_cache_openSUSE_repositories_with_Squid - [wiki page] - https://cwiki.apache.org/confluence/display/TS/Metalink + [wiki page] https://cwiki.apache.org/confluence/display/TS/Metalink
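The "How it Works" text in the diff above describes a two-step lookup: check whether the Location URL is cached, and if not, map the digest to a URL and check that URL instead. A minimal Python sketch of that decision follows. It models the cache as a plain set and dictionary; the names `rewrite_location`, `url_cache`, and `digest_index` are hypothetical and are not part of the plugin or of the Traffic Server API, which the real plugin uses from C.

```python
# Illustrative model only. The real plugin works against the Traffic
# Server cache; here `url_cache` is a set of cached URLs and
# `digest_index` maps a digest value to the URL it was cached under.
# All three names below are hypothetical.

def rewrite_location(headers, url_cache, digest_index):
    """Return headers, with Location pointed at a cached URL if one is known."""
    location = headers.get("Location")
    digest = headers.get("Digest", "")
    if location is None or not digest.startswith("SHA-256="):
        return headers  # both headers are needed before anything is rewritten
    if location in url_cache:
        return headers  # the redirect target is already cached
    # First lookup: digest -> URL recorded when that content was cached.
    candidate = digest_index.get(digest[len("SHA-256="):])
    # Second lookup: confirm the recorded URL is still in the cache.
    if candidate is not None and candidate in url_cache:
        rewritten = dict(headers)  # leave the caller's headers untouched
        rewritten["Location"] = candidate
        return rewritten
    return headers


if __name__ == "__main__":
    cache = {"http://mirror-a.example/file.iso"}
    index = {"abc=": "http://mirror-a.example/file.iso"}
    response = {
        "Location": "http://mirror-b.example/file.iso",
        "Digest": "SHA-256=abc=",
    }
    print(rewrite_location(response, cache, index)["Location"])
```

The sketch makes the cost noted in the documentation visible: every rewrite needs one lookup by digest and a second lookup by URL.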

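The documentation in the diff says the plugin computes the SHA-256 digest incrementally as content is added to the cache, using SHA256_Init(), SHA256_Update(), and SHA256_Final() from OpenSSL. The sketch below shows the same incremental pattern with Python's standard `hashlib` and formats the result as a `Digest` header value, assuming the base64 encoding used for SHA-256 instance digests. The function name `sha256_digest_header` is hypothetical, and this is not the plugin's code.

```python
import base64
import hashlib


def sha256_digest_header(chunks):
    """Build a Digest header value from an iterable of byte chunks."""
    hasher = hashlib.sha256()      # plays the role of SHA256_Init()
    for chunk in chunks:
        hasher.update(chunk)       # plays the role of SHA256_Update()
    raw = hasher.digest()          # plays the role of SHA256_Final()
    return "SHA-256=" + base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    # Content arriving in pieces yields the same digest as the whole body.
    print(sha256_digest_header([b"hello ", b"world"]))
```

Hashing chunk by chunk is what lets a transform compute the digest while the body streams through, without buffering the whole file.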