Actually, PSHB does use fat pings: when a Publisher notifies a Hub that a feed 
has changed, subscribers are POSTed the delta of that feed.
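
To make "delta" concrete, here is a rough sketch of how a hub might work out 
which entries to POST. It is not taken from the spec or from any hub 
implementation: the function name, the dict-based entries and the "id" key are 
all assumptions made for illustration.

```python
def new_entries(previously_seen_ids, current_entries):
    """Return only the entries the hub has not seen before.

    Assumption: each entry is a dict carrying a unique "id" (for Atom this
    would correspond to the entry's <id> element). A real hub may also need
    to detect entries that were updated in place, which this sketch ignores.
    """
    return [entry for entry in current_entries
            if entry["id"] not in previously_seen_ids]


# Hypothetical usage: the hub remembers the ids it has already delivered.
seen = {"tag:example.org,2009:1"}
feed = [
    {"id": "tag:example.org,2009:1", "title": "Old post"},
    {"id": "tag:example.org,2009:2", "title": "New post"},
]
delta = new_entries(seen, feed)  # only the second entry is POSTed
```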

As to efficiency, I think the caching argument is slightly off track. Serving 
a cached delta feed and sending a delta feed (which presumably is generated 
just once) are fairly equivalent. There are, however, differences in how the 
request is served on the backend: by a proxy or by a full application. In most 
cases, I'd assume the second (sending the delta) is offloaded as much as 
possible to a basic task rather than run through an application proper, so as 
to lower the request cost.

To your second point, Subscribers should never process updates synchronously. 
Updates should be dumped immediately to a job queue for asynchronous 
processing. This spreads the processing load more evenly over time instead of 
clumping it together, which I gather is what you're against. So it's: receive 
update, verify it is an update (input validation), dump update to queue, and 
respond with a 200 code.
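
A minimal sketch of that receive/validate/queue/respond sequence, using only 
the Python standard library. Everything named here (receive_update, 
looks_like_feed, the in-process job_queue) is hypothetical, and returning 400 
for input that fails validation is my own assumption rather than anything the 
spec requires.

```python
import queue
import xml.etree.ElementTree as ET

# Hypothetical in-process queue. A real subscriber would more likely hand off
# to a persistent job queue so that queued updates survive a restart.
job_queue = queue.Queue()


def looks_like_feed(body: bytes) -> bool:
    """Cheap input validation: well-formed XML with an Atom or RSS root."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return False
    # Strip any namespace: "{http://www.w3.org/2005/Atom}feed" -> "feed".
    local_name = root.tag.rsplit("}", 1)[-1]
    return local_name in ("feed", "rss")


def receive_update(body: bytes, jobs: queue.Queue = job_queue) -> int:
    """Handle one POST from the hub and return the HTTP status to send.

    No feed processing happens here; the body is only validated and queued.
    """
    if not looks_like_feed(body):
        return 400  # assumption: reject anything that is not a feed
    jobs.put(body)  # dump the update to the queue
    return 200      # acknowledge straight away


def drain(jobs: queue.Queue, process) -> int:
    """Worker side: process whatever is queued, outside the request cycle."""
    handled = 0
    while True:
        try:
            body = jobs.get_nowait()
        except queue.Empty:
            return handled
        process(body)
        handled += 1
```

The callback only ever pays for a parse and a queue insert, so the hub gets 
its 200 quickly; the expensive work happens whenever a worker calls drain().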

So, I think overall it's still quite an efficient system. The main thing is 
making sure each party is being efficient about it, which is, of course, an 
implementation point the specification won't be commenting on. I think this 
will be the biggest mental block over time - web developers are pretty bad at 
thinking asynchronously ;).

Paddy

 Pádraic Brady

http://blog.astrumfutura.com
http://www.survivethedeepend.com
OpenID Europe Foundation Irish Representative





________________________________
From: Alexis Richardson <[email protected]>
To: [email protected]
Sent: Mon, October 26, 2009 4:29:43 PM
Subject: [pubsubhubbub] Re: Are fat pings efficient?


Alex

PSHB is not using fat pings.  There are use cases for fat pings that
are under discussion, but fat pings are not in the spec at this time.

alexis


On Mon, Oct 26, 2009 at 4:12 PM, Alex Barth <[email protected]> wrote:
>
> I am *very* excited about the pubsubhubbub work I'm seeing. I consider
> making it a mainstay of our aggregation infrastructure.
>
> Reading the spec and some of the issues on project page, my main
> question is:
>
> Why does PuSH POST the entire feed to subscribers?
>
> To me it would seem more efficient that the hub exposes the updated
> feed on a URL and then POSTs only this URL to the subscribers. The
> subscribers would then GET the feed from the hub.
>
> The amount of data to be posted would be a fraction, the updated feed
> hosted by the hub could be cached with a reverse proxy like Varnish or
> Squid. Subscribers could queue URLs neatly, then work them off
> asynchronously.
>
> Further, allowing POSTing a URL where updated data can be fetched
> would open Pubsubhubbub to be applied in fields where the data feeds
> are large (look at http://data.gov).
>
> What are the reasons behind the design decision on PuSH posting fat
> pings? Is there an option to post light pings that I am overlooking?
> Are there threads I should be reading up?
>
> Alex
>
> --
> I'm one of the geeks at http://developmentseed.org and as such I do a
> lot of work with aggregation for news tracking and Open Data in
> Drupal. Recently we launched an open source news tracker called
> Managing News http://managingnews.com. I maintain and have helped
> maintain 3 aggregators for Drupal (e. g. http://drupal.org/project/feedapi
> and its reincarnation: http://drupal.org/project/feeds).
>
