I am *very* excited about the pubsubhubbub work I'm seeing. I am
considering making it a mainstay of our aggregation infrastructure.

Reading the spec and some of the issues on the project page, my main
question is:

Why does PuSH POST the entire feed to subscribers?

To me it would seem more efficient if the hub exposed the updated
feed at a URL and then POSTed only this URL to the subscribers. The
subscribers would then GET the feed from the hub.

The amount of data POSTed would be a fraction of the full feed, and the
updated feed hosted by the hub could be cached with a reverse proxy like
Varnish or Squid. Subscribers could queue the URLs neatly, then work
them off asynchronously.
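
To make the idea concrete, here is a rough sketch in Python of what I
imagine on the subscriber side. None of this is in the spec: the ping
body being a bare URL, and the names handle_light_ping, fetch_feed and
work_off_queue, are all my own assumptions for illustration.

```python
import queue
from urllib.parse import urlparse
from urllib.request import urlopen


def handle_light_ping(body, pending):
    """Called when the hub POSTs a light ping.

    Assumption: the POST body is nothing but the URL of the updated feed.
    """
    url = body.decode("utf-8").strip()
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("light ping did not contain an http(s) URL")
    pending.put(url)  # cheap: no feed parsing inside the request handler
    return url


def fetch_feed(url):
    """GET the updated feed from the hub (or from a cache in front of it)."""
    with urlopen(url, timeout=10) as response:
        return response.read()


def work_off_queue(pending, fetch=fetch_feed):
    """Drain the queued URLs later, outside the request that delivered the ping."""
    feeds = {}
    while True:
        try:
            url = pending.get_nowait()
        except queue.Empty:
            break
        feeds[url] = fetch(url)
    return feeds
```

The ping handler only validates and queues the URL, so the hub gets its
response quickly; the actual GETs happen whenever the subscriber runs
work_off_queue, for example from a cron job.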

Further, POSTing only a URL where the updated data can be fetched
would open Pubsubhubbub up to fields where the data feeds are large
(look at http://data.gov).

What are the reasons behind the design decision to have PuSH post fat
pings? Is there an option to post light pings that I am overlooking?
Are there threads I should be reading up on?

Alex

--
I'm one of the geeks at http://developmentseed.org and as such I do a
lot of work with aggregation for news tracking and Open Data in
Drupal. Recently we launched an open source news tracker called
Managing News http://managingnews.com. I maintain and have helped
maintain 3 aggregators for Drupal (e.g. http://drupal.org/project/feedapi
and its reincarnation: http://drupal.org/project/feeds).
