I have mixed feelings about this. It feels like a layering violation for a hub to start parsing and mangling content that passes through it. Ideally, as a consumer, I want to get the same content whether I'm polling or using PubSubHubbub; otherwise I have to figure out what to do with the differences when I receive the same item in both contexts.
A particular hub might choose to offer normalization as a value-add service to feed consumers. That makes sense in a case like Superfeedr, where the purpose of the service is to make life simpler for consumers by having them deal with only one hub and one format. But for a general-purpose hub like the one Google is running, I'd rather see it just pass things through verbatim.
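To make the consumer-side cost of verbatim pass-through concrete, here is a minimal sketch of what a subscriber would need at the very least: look at the root element of the fat ping body and branch on the format. It assumes the body is either an Atom or an RSS 2.0 document; the helper name `extract_entries` and the choice to fall back from `guid` to `link` are my own for illustration, not anything the protocol defines.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"


def extract_entries(body):
    """Return (format, [(id, title), ...]) for an Atom or RSS 2.0 payload.

    Hypothetical helper: only these two formats are handled here.
    """
    root = ET.fromstring(body)

    if root.tag == ATOM_NS + "feed":
        entries = [
            (e.findtext(ATOM_NS + "id"), e.findtext(ATOM_NS + "title"))
            for e in root.findall(ATOM_NS + "entry")
        ]
        return "atom", entries

    if root.tag == "rss":
        entries = []
        for item in root.findall("channel/item"):
            # guid is optional in RSS 2.0, so fall back to link (an assumption).
            ident = item.findtext("guid") or item.findtext("link")
            entries.append((ident, item.findtext("title")))
        return "rss", entries

    # RSS 1.0 (rdf:RDF) and the older RSS variants would each need a branch.
    raise ValueError("unsupported feed format: " + root.tag)
```

Every additional format is another branch like these, and that is the work a normalizing hub would take off the consumer.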
In the normal case the hub is chosen by the publisher, so its "customer" is the publisher. In the Superfeedr case, though, a particular consumer decides to use Superfeedr's hub for all feeds, regardless of which "real" hub those feeds use or whether they support PubSubHubbub at all. In this unusual case the hub's "customer" is the consumer rather than the publisher, so it makes sense for the hub to provide normalization and other such features on its consumer end.
I wonder if it would be worth making some kind of nomenclature distinction between a publisher-serving hub and a consumer-serving hub. Their behavior is often different, and in practice a particular transaction may actually involve one of each, each serving a slightly different role even though both speak the same protocol.
rcade wrote:
One of the design philosophies that I like about PubSubHubbub is the decision to put as much of the complexity in the hub as possible, making it simple for feed publishers and readers to support the protocol. For this reason, when I created a Java application that receives updates, I was surprised when some fat pings arrived in RSS 2.0 format instead of Atom. Is there a possibility that fat pings will be normalized to Atom, no matter what format the originating feed employs? If you normalize the pings, PuSH clients only have to support one feed format. If you don't, clients have to support Atom, RSS 0.9, RSS 0.91 (Netscape), RSS 0.91 (UserLand), RSS 0.92, RSS 1.0 and RSS 2.0. Although all current feed-reading applications must support all of those formats, if PuSH normalizes pings, it makes it possible for future applications to be developed that only have to support Atom. That would be a nice selling point for developers.
