Thank you for your responses. This was *very* helpful for me and, I hope, for others trying to understand the reasoning behind a fat-pings-only approach.
I remain with some concerns around feed-format-agnostic hubs and heavy
payloads, but I actually need to sit down and do more homework on this. It
may well be the case, as many posters here argued, that these concerns are
not relevant in practice.

Alex

On Wed, Oct 28, 2009 at 11:58 AM, Bob Wyman <[email protected]> wrote:
> On Wed, Oct 28, 2009 at 12:39 AM, Julien <[email protected]> wrote:
>> Not sure why large feeds would be
>> different from smaller feeds.
>
> A feed may be "large" because it has many entries. If only one entry in the
> feed has changed, it is inefficient to copy all the entries of the feed,
> since you will discard most of them.
> A feed may be "large" because it contains even a small number of "large"
> entries. It will still be inefficient to receive the entire feed, since you
> will be discarding, as previously seen, a large number of bytes.
>
> Receiving only updated entries is more efficient, since you don't suffer the
> waste that is inherent in polling multi-entry feed documents that contain
> previously seen entries.
>
> bob wyman
>
>
> On Wed, Oct 28, 2009 at 12:39 AM, Julien <[email protected]> wrote:
>>
>> Hey,
>>
>>
>> On Oct 27, 11:07 am, Brett Slatkin <[email protected]> wrote:
>> > Hey Alex,
>> >
>> > On Tue, Oct 27, 2009 at 9:56 AM, Alex Barth <[email protected]> wrote:
>> > > How do you guys see the advantages/disadvantages of POSTing feed data
>> > > in these scenarios:
>> > >
>> > > 1. Hub does not serve delta feed. In my mind, this can be
>> > > interesting for 3 reasons: a) building simple hubs that don't inspect
>> > > feeds at all, b) building hubs that are completely agnostic to their
>> > > feed formats, c) hubs convert feed to standard format, subscribers
>> > > pull the first feed data from hub, not from original publisher (heck,
>> > > how do the superfeedr guys do that?)
>> >
>> > I think (a) isn't too compelling.
>> > We're going to have a few, very
>> > well-tested hub implementations that people can run or use as a hosted
>> > service.
>> Agreed... and also, I think the protocol was built to stay simple.
>> Based on that, we should avoid having lighter (non-compatible)
>> implementations.
>> However, I think the "diffing" should not be part of the protocol
>> itself, but stay "vague", or at least open to other data than RSS/Atom.
>> >
>> > We would like (b) to be part of the core spec eventually, with other
>> > secondary specs that explain how to do differential updates for
>> > secondary content types (if necessary).
>> Agreed!
>> >
>> > For (c), Superfeedr is acting as a federated hub, meaning they
>> > subscribe to all other hubs' updates and proxy them to their
>> > subscribers. This allows for composition and data transformation.
>> Yes... that is exactly what we do. And more than just "formats", we
>> also map other protocols into PubSubHubbub, like RSSCloud or (soon...)
>> SUP, but also streams from apps such as identica/twitter... etc.
>>
>> > > c) hubs convert feed to standard format, subscribers
>> > > pull the first feed data from hub, not from original publisher (heck,
>> > > how do the superfeedr guys do that?)
>> Well, we just parse the new content and map it into a consistent
>> form. Then, we push the updates to our subscribers. In our case, we do
>> not store anything, which means that (as the protocol works anyway)
>> nobody pulls from us.
>>
>> >
>> > > 2. Large data sets (i.e. DC's 2009 crime feed has
>> > > 1.2MB) http://data.octo.dc.gov/
>> >
>> > I think distributing just the changes is significantly more efficient
>> > for large feeds. Instead of pushing 1.2MB each time the feed changes
>> > to 1000+ subscribers, you can just send the newest 2KB update.
>> >
>> > Combined with the Atom Tombstones draft spec
>> > (http://www.ietf.org/id/draft-snell-atompub-tombstones-06.txt), we
>> > should be able to get Hubbub to communicate new and deleted content in
>> > the same way.
>>
>> Not sure why large feeds would be different from smaller feeds.
>>
>> >
>> > > 3. Many and often-changing subscribers - wouldn't this lead to
>> > > unnecessarily sending large POST requests to subscribers that actually
>> > > don't exist anymore?
>> >
>> > Subscriptions in the hub have a lease period and must be checked for
>> > validity every so often. This allows the hub to prune old/bad
>> > subscribers that aren't receiving the feed anymore. Again, this lets
>> > the data flow be streamlined to the minimum bandwidth possible.
>> >
>> > Hope that helps,
>> >
>> > -Brett
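P.S. For anyone following along, here is a minimal sketch of the "delta" idea Bob and Brett describe: a hub keeps the last-seen copy of a feed and pushes only entries that are new or updated, keyed on their Atom `id` and `updated` values. The function name and the entry data below are my own invention for illustration; this is not code from the spec or from any hub implementation.

```python
# Hypothetical sketch: compute which entries of a freshly fetched feed
# should actually be pushed to subscribers. Entries are represented as
# dicts carrying Atom-style 'id' and 'updated' fields; ids/timestamps
# below are made up for the example.

def delta_entries(previous, current):
    """Return only the entries of `current` that are new or changed
    relative to `previous`."""
    seen = {e["id"]: e["updated"] for e in previous}
    return [e for e in current
            if e["id"] not in seen or seen[e["id"]] != e["updated"]]

previous = [
    {"id": "urn:entry:1", "updated": "2009-10-27T10:00:00Z"},
    {"id": "urn:entry:2", "updated": "2009-10-27T11:00:00Z"},
]
current = [
    {"id": "urn:entry:1", "updated": "2009-10-27T10:00:00Z"},  # unchanged
    {"id": "urn:entry:2", "updated": "2009-10-28T09:00:00Z"},  # edited
    {"id": "urn:entry:3", "updated": "2009-10-28T09:30:00Z"},  # new
]

changed = delta_entries(previous, current)
print([e["id"] for e in changed])  # -> ['urn:entry:2', 'urn:entry:3']
```

Note how the POST body to each subscriber would carry only the two changed entries rather than the whole feed document, which is where the bandwidth savings in the 1.2MB-feed example come from.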
