Thank you for your responses. This was *very* helpful for me and, I hope, for others trying to understand the reasoning behind a fat-pings-only approach.
I remain with some concerns around feed-format-agnostic hubs and heavy
payloads, but I actually need to sit down and do more homework on this. It
may well be the case, as many posters here argued, that these concerns are
not relevant in practice.

Alex

On Wed, Oct 28, 2009 at 11:58 AM, Bob Wyman <[email protected]> wrote:
> On Wed, Oct 28, 2009 at 12:39 AM, Julien <[email protected]> wrote:
>> Not sure why large feeds would be
>> different from smaller feeds.
>
> A feed may be "large" because it has many entries. If only one entry in the
> feed has changed, it is inefficient to copy all the entries of the feed,
> since you will discard most of them.
> A feed may be "large" because it contains even a small number of "large"
> entries. It will still be inefficient to receive the entire feed, since you
> will be discarding, as previously seen, a large number of bytes.
>
> Receiving only updated entries is more efficient, since you don't suffer the
> waste that is inherent in polling multi-entry feed documents that contain
> previously seen entries.
>
> bob wyman
>
>
> On Wed, Oct 28, 2009 at 12:39 AM, Julien <[email protected]> wrote:
>>
>> Hey,
>>
>>
>> On Oct 27, 11:07 am, Brett Slatkin <[email protected]> wrote:
>> > Hey Alex,
>> >
>> > On Tue, Oct 27, 2009 at 9:56 AM, Alex Barth <[email protected]> wrote:
>> > > How do you guys see the advantages/disadvantages of POSTing feed data
>> > > in these scenarios:
>> > >
>> > > 1. Hub does not serve delta feed. In my mind, this can be
>> > > interesting for 3 reasons: a) building simple hubs that don't inspect
>> > > feeds at all, b) building hubs that are completely agnostic to their
>> > > feed formats, c) hubs convert feed to standard format, subscribers
>> > > pull the first feed data from hub, not from original publisher (heck,
>> > > how do the superfeedr guys do that?)
>> >
>> > I think (a) isn't too compelling.
>> > We're going to have a few, very
>> > well-tested hub implementations that people can run or use as a hosted
>> > service.
>> Agreed... and also, I think the protocol was built to stay simple.
>> Based on that, we should avoid having lighter (non-compatible)
>> implementations.
>> However, I think the "diffing" should not be part of the protocol
>> itself, but stay "vague", or at least open to other data than RSS/Atom.
>> >
>> > We would like (b) to be part of the core spec eventually, with other
>> > secondary specs that explain how to do differential updates for
>> > secondary content types (if necessary).
>> Agreed!
>> >
>> > For (c), Superfeedr is acting as a federated hub, meaning they
>> > subscribe to all other hubs' updates and proxy them to their
>> > subscribers. This allows for composition and data transformation.
>> Yes... that is exactly what we do. And more than just "formats", we
>> also map other protocols into PubSubHubbub, like RSSCloud or (soon...)
>> SUP, but also streams from apps such as identica/twitter... etc.
>>
>> > > c) hubs convert feed to standard format, subscribers
>> > > pull the first feed data from hub, not from original publisher (heck,
>> > > how do the superfeedr guys do that?)
>> Well, we just parse the new content and map it into a consistent
>> form. Then, we push the updates to our subscribers. In our case, we do
>> not store anything, which means that (as the protocol works anyway)
>> nobody pulls from us.
>>
>> >
>> > > 2. Large data sets (i.e. DC's 2009 crime feed has
>> > > 1.2MB) http://data.octo.dc.gov/
>> >
>> > I think distributing just the changes is significantly more efficient
>> > for large feeds. Instead of pushing 1.2MB each time the feed changes
>> > to 1000+ subscribers, you can just send the newest 2KB update.
>> >
>> > Combined with the Atom Tombstones draft spec
>> > (http://www.ietf.org/id/draft-snell-atompub-tombstones-06.txt), we
>> > should be able to get Hubbub to communicate new and deleted content in
>> > the same way.
>>
>> Not sure why large feeds would be different from smaller feeds.
>>
>> >
>> > > 3. Many and often-changing subscribers - wouldn't this lead to
>> > > unnecessarily sending large POST requests to subscribers that actually
>> > > don't exist anymore?
>> >
>> > Subscriptions in the hub have a lease period and must be checked for
>> > validity every so often. This allows the hub to prune old/bad
>> > subscribers that aren't receiving the feed anymore. Again, this lets
>> > the data flow be streamlined to the minimum bandwidth possible.
>> >
>> > Hope that helps,
>> >
>> > -Brett
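P.S. For anyone following along, here is a minimal sketch of the "delta" idea Bob and Brett describe: a hub keeps the last-seen copy of a feed and pushes only entries that are new or updated, keyed on their Atom `id` and `updated` values. The function name and the entry data below are my own invention for illustration; this is not code from the spec or from any hub implementation.

```python
# Hypothetical sketch: compute which entries of a freshly fetched feed
# should actually be pushed to subscribers. Entries are represented as
# dicts carrying Atom-style 'id' and 'updated' fields; ids/timestamps
# below are made up for the example.

def delta_entries(previous, current):
    """Return only the entries of `current` that are new or changed
    relative to `previous`."""
    seen = {e["id"]: e["updated"] for e in previous}
    return [e for e in current
            if e["id"] not in seen or seen[e["id"]] != e["updated"]]

previous = [
    {"id": "urn:entry:1", "updated": "2009-10-27T10:00:00Z"},
    {"id": "urn:entry:2", "updated": "2009-10-27T11:00:00Z"},
]
current = [
    {"id": "urn:entry:1", "updated": "2009-10-27T10:00:00Z"},  # unchanged
    {"id": "urn:entry:2", "updated": "2009-10-28T09:00:00Z"},  # edited
    {"id": "urn:entry:3", "updated": "2009-10-28T09:30:00Z"},  # new
]

changed = delta_entries(previous, current)
print([e["id"] for e in changed])  # -> ['urn:entry:2', 'urn:entry:3']
```

Note how the POST body to each subscriber would carry only the two changed entries rather than the whole feed document, which is where the bandwidth savings in the 1.2MB-feed example come from.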
