[pubsubhubbub] r382 committed - Edited wiki page through web user interface.

codesite-noreply Wed, 14 Jul 2010 13:30:42 -0700

Revision: 382
Author: codemonkeydan
Date: Wed Jul 14 13:17:25 2010
Log: Edited wiki page through web user interface.
http://code.google.com/p/pubsubhubbub/source/detail?r=382


Modified:
 /wiki/HubsAndFeedProxies.wiki

=======================================
--- /wiki/HubsAndFeedProxies.wiki       Wed Jul 14 13:16:20 2010
+++ /wiki/HubsAndFeedProxies.wiki       Wed Jul 14 13:17:25 2010
@@ -18,13 +18,11 @@

  * *When using conditional redirects, hubs should be redirected too.* If a publisher has set up their feed to conditionally redirect to a feed proxy, this redirect should apply to all hubs as well; this ensures no one but the feed proxy will receive the non-proxied feed content. Generally speaking, this is what publisher platforms do anyway, so this doesn't require a change to anything.
  * *When a hub crawls a proxied feed, the proxy should treat it as a ping.* Hubs generally crawl feeds when they receive a ping; so assuming the proxy can trust the hub, any time a hub crawls a proxied feed, the proxy can assume that the hub was pinged, and can therefore treat the crawl as a ping, and go crawl the source feed. For conditionally redirected feeds, this allows feed proxies to receive a notification that the source feed has changed without being directly pinged by the publisher's platform -- the platform just pings the hub, which crawls the source feed URL, and is redirected to the proxied feed. (Note that this requires that the proxy is capable of identifying a request as having originated from a hub. We'll discuss how that could work below.)
  * *When hub links appear in the source feed, proxies should subscribe to them.* In cases where the source feed URL differs from the proxied feed URL, and the feed is not conditionally redirected, the feed proxy can receive updates to the source feed by subscribing to the hub(s) that show up in the source feed. The feed proxy can treat these updates as pings, and go crawl the source feed to get the latest content (if the proxy requires the full source feed, rather than just the deltas provided by the hub.)
- * *Feed proxies should ignore updates from hubs that came from their own servers.* Since a feed proxy isn't able to tell (and shouldn't care) whether a source feed URL will conditionally redirect hubs to the proxy, it should subscribe to hubs for updates for all source feeds that have hub links in them. This means that when the publisher conditionally redirects requests for the source feed URL to the feed proxy, the proxy will receive updates from the hub for proxied feed content. Proxies should insert an element into the feed (Atom) or channel (RSS) element in the proxied feed content that allows them to identify that content they receive from hubs has already been proxied by their service; when this element is present, the feed proxy can simply ignore the update, rather than
-treating it as a ping (which would lead to a ping loop.) In FeedBurner's case, this is accomplished using the "feedburner:info" element.
+ * *Feed proxies should ignore updates from hubs that came from their own servers.* Since a feed proxy isn't able to tell (and shouldn't care) whether a source feed URL will conditionally redirect hubs to the proxy, it should subscribe to hubs for updates for all source feeds that have hub links in them. This means that when the publisher conditionally redirects requests for the source feed URL to the feed proxy, the proxy will receive updates from the hub for proxied feed content. Proxies should insert an element into the feed (Atom) or channel (RSS) element in the proxied feed content that allows them to identify that content they receive from hubs has already been proxied by their service; when this element is present, the feed proxy can simply ignore the update, rather than treating it as a ping (which would lead to a ping loop.) In FeedBurner's case, this is accomplished using the "feedburner:info" element.
  * *Pings should be rate-limited.* In order to prevent abuse / DOSing of source feeds, and to avoid ping loops involving hubs and proxies, feed proxies should limit the rate at which pings are accepted. Ideally this would be implemented as an exponential backoff, allowing for multiple updates in a short time period, while still preventing abuse and ping loops.
  * *All hub links that appear in the source feed should be carried over to the proxied feed.* A publisher that wants their source feed to be available for subscribers through a particular hub probably also wants the proxied version of the feed to be available through that hub. So the feed proxy should include all hub links that appeared in the source feed. It can choose to add other hub links; for example, FeedBurner always adds the reference hub to the feeds it proxies. Ideally this behavior should be configurable by publishers.
  * *Hubs should accept subscription requests for proxied feeds; subscribers should try another hub if they don't.* Since the feed proxy should copy over hub links that appear in the source feed, hubs should accept subscription requests for proxied feeds, which may be served under a different domain than the source feed. If a hub doesn't allow subscription requests for the proxied feed, then a subscriber should try another hub (assuming there is another one in the proxied feed.)
  * *When feeds update, proxies should ping all hubs in the proxied feed (apart from a hub that is crawling it).* When a hub link appears in a proxied feed, that tells subscribers that they can receive updates to the feed by subscribing to that link. To make sure that hub actually receives updates, the feed proxy should send pings to any hub links that appear in the proxied feed. However, when a hub crawls a proxied feed URL, it will already receive an updated version of the feed; so a feed proxy should not send a ping to a hub when it gets crawled by that hub and notices that the proxied feed has changed. (This helps avoid ping loops.) Note that this requires that the proxy is capable of telling which hub link in the proxied feed not to send a ping for, based on the headers in the hub's request. We'll discuss how that could work below.

  * *When a hub receives a ping for a feed URL, it should crawl and update subscribers for all feeds that have the same Atom ID or channel link.* Since the same feed content can be served under different URLs, when a hub receives a ping for a URL, it should crawl that URL, look at the atom ID (Atom) or channel link (RSS) in the resulting feed, and crawl and update subscribers for all other URLs that resolve to feeds with the same Atom ID / channel link. (This requires the hub to maintain a mapping from Atom ID / channel link -> <feed URLs>.) A hub can also match up Atom and RSS versions of the same feed based on the alternate link in Atom matching the channel link in RSS. Crawling all the URLs for a given atom ID or channel link eliminates the need for a feed proxy to send a ping for every URL under which a feed may be served. *It is necessary for the hub to crawl each URL, since different URLs may resolve to different variants of the same feed.*
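
The mapping from Atom ID / channel link to feed URLs mentioned in the last rule could be kept roughly as follows. This is only a minimal in-memory sketch, not how any particular hub implements it: the class name `FeedIdentityIndex`, its methods, and the example URLs are hypothetical, and a real hub would persist the mapping and still crawl each returned URL separately.

```python
from collections import defaultdict


class FeedIdentityIndex:
    """Maps a feed identity (Atom ID or RSS channel link) to every URL seen serving it.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        self._urls_by_identity = defaultdict(set)

    def record(self, identity, feed_url):
        """Remember that crawling feed_url produced a feed with this identity."""
        self._urls_by_identity[identity].add(feed_url)

    def urls_to_crawl(self, identity, pinged_url):
        """Return the pinged URL plus every other URL known for the same identity.

        The hub is expected to crawl each returned URL on its own, since
        different URLs may resolve to different variants of the same feed.
        """
        self.record(identity, pinged_url)
        return sorted(self._urls_by_identity[identity])


index = FeedIdentityIndex()
# Earlier crawl: the source URL served a feed with this (made-up) Atom ID.
index.record("tag:example.com,2010:blog", "http://example.com/feed")
# A ping arrives for the proxied URL, which turns out to carry the same Atom ID.
to_crawl = index.urls_to_crawl("tag:example.com,2010:blog",
                               "http://feeds.example.net/blog")
```

Here `to_crawl` contains both the source URL and the proxied URL, so a single ping leads the hub to refresh subscribers of either one.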

When things are implemented according to the rules described above, then feed proxies receive updates to source feeds from hubs (whether they are conditionally redirected or not), and hubs receive updates from feed proxies for proxied feed content.
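
On the proxy side, two of the rules above (ignore hub updates that already carry the proxy's own marker element, and rate-limit accepted pings with an exponential backoff) can be combined into one decision. The sketch below is a rough illustration under stated assumptions: the marker namespace `http://proxy.example/ns`, the element name `info`, the names `already_proxied`, `PingBackoff` and `should_crawl_source`, and the delay values are all made up here, and the doubling-with-reset policy is just one possible reading of "exponential backoff".

```python
import xml.etree.ElementTree as ET

# Hypothetical marker element the proxy inserts into feeds it serves.
MARKER_TAG = "{http://proxy.example/ns}info"


def already_proxied(feed_xml):
    """True if the marker is a child of <feed> (Atom) or <channel> (RSS)."""
    root = ET.fromstring(feed_xml)
    containers = [root]
    channel = root.find("channel")  # present only in RSS documents
    if channel is not None:
        containers.append(channel)
    return any(c.find(MARKER_TAG) is not None for c in containers)


class PingBackoff:
    """Per-feed limiter: the required gap between accepted pings doubles
    while pings keep arriving and resets after a quiet period."""

    def __init__(self, base_delay=1.0, max_delay=3600.0, quiet_period=3600.0):
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.quiet_period = quiet_period
        self._state = {}  # feed_url -> (time of last accepted ping, current delay)

    def accept(self, feed_url, now):
        state = self._state.get(feed_url)
        if state is None:
            self._state[feed_url] = (now, self.base_delay)
            return True
        last, delay = state
        if now - last < delay:
            return False  # too soon after the last accepted ping
        if now - last >= self.quiet_period:
            delay = self.base_delay  # feed has been quiet, start over
        else:
            delay = min(delay * 2, self.max_delay)
        self._state[feed_url] = (now, delay)
        return True


def should_crawl_source(feed_xml, source_url, backoff, now):
    """Treat a hub update as a ping unless it is our own proxied content
    or the feed is currently rate-limited."""
    if already_proxied(feed_xml):
        return False
    return backoff.accept(source_url, now)
```

Passing `now` explicitly keeps the limiter easy to test; in a running service the caller might supply `time.monotonic()`.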
