Hey Brett,

I will review it this week and hope to send you some feedback.

Kang

On Tue, Nov 2, 2010 at 5:49 AM, Brett Slatkin <[email protected]> wrote:

> Hey all,
>
> I wanted to ping on this thread. Has anyone had a chance to review
> this proposal? Does it sound sane? Would it help anyone out there deal
> with the load they're seeing from various Track APIs (like Superfeedr
> track?).
>
> Thanks,
>
> -Brett
>
> On Wed, Oct 6, 2010 at 7:45 PM, Brett Slatkin <[email protected]> wrote:
> > Hey all,
> >
> > Had some ideas I've been kicking around in discussions with various
> > folks. Would love some early feedback.
> >
> >
> > == Background
> >
> > The plan right now is to ditch the "Aggregated Content Distribution"
> > section of the spec (see
> > http://code.google.com/p/pubsubhubbub/issues/detail?id=105). There is
> > a variety of issues with it and it's never been deployed. However, I
> > believe there is still a need for efficient aggregated delivery that
> > follows from Bob Wyman's ideas about content filtering
> > (http://groups.google.com/group/pubsubhubbub/msg/820f7f29b7c22d46).
> >
> > Take the Google Buzz Track API for example
> > (http://code.google.com/apis/buzz/v1/using_rest.html#activity-track).
> > Let's say you have these two Track subscriptions registered (both
> > PubSubHubbub topics):
> >
> > https://www.googleapis.com/buzz/v1/activities/track?q=Bilbo
> > https://www.googleapis.com/buzz/v1/activities/track?q=Baggins
> >
> > An item comes through that matches both terms (a post with the author
> > "Bilbo Baggins"). Your PuSH subscriber will receive *two* copies of
> > that message, one for each subscription, each on a different callback
> > URL that was registered when you set up the PuSH subscription. This
> > gets much worse as the number of Track queries and potential overlaps
> > increases; it's *especially* awful for geographic queries which
> > intrinsically overlap.
> >
> > Bob's solution is to deliver a single copy of the "Bilbo Baggins" post
> > but annotate it with *which queries* it matched. I like this idea, but
> > I want to 1) change how we express the annotation, 2) make it easy for
> > existing clients to migrate to the new scheme, 3) not add any new
> > parameters (e.g., "hub.filter") to the PuSH protocol.
> >
> >
> > == The Proposal
> >
> > PubSubHubbub-enabled feeds will declare a new aggregation relation
> > ("http://pubsubhubbub.org/aggregation"). The "href" is picked by the
> > publisher and is a statement of "things with this aggregation URL I
> > can batch together into aggregated delivery." For example, with the
> > Buzz Track API feeds we could do:
> >
> > <feed>
> >  <link rel="self"
> > href="https://www.googleapis.com/buzz/v1/activities/track?q=Bilbo"/>
> >  <link rel="http://pubsubhubbub.org/aggregation"
> > href="https://www.googleapis.com/buzz/v1/activities/combined"/>
> >  ...
> > </feed>
> >
> > Subscribers would see this new "rel" link and know that they could
> > subscribe to that new topic
> > ("https://www.googleapis.com/buzz/v1/activities/combined") to get
> > aggregated delivery. What does it mean to get aggregated delivery?
> > Essentially, *all* of the subscriber's existing subscriptions with
> > that same "aggregation" link value would *STOP* delivering, and
> > instead the subscriber would get POSTs on a *single* callback that
> > look like this:
> >
> > POST /my-aggregated-callback HTTP/1.1
> > Link: <https://www.googleapis.com/buzz/v1/activities/combined>;
> > rel="http://pubsubhubbub.org/aggregation",
> >        <https://www.googleapis.com/buzz/v1/activities/track?q=Bilbo>;
> > rel="self",
> >        <https://www.googleapis.com/buzz/v1/activities/track?q=Baggins>;
> > rel="self"
> > X-Hub-Signature: ...
> >
> > <feed>
> >  <link rel="self"
> > href="https://www.googleapis.com/buzz/v1/activities/combined"/>
> >  <link rel="http://pubsubhubbub.org/aggregation"
> > href="https://www.googleapis.com/buzz/v1/activities/combined"/>
> >  ...
> > </feed>
> >
> > Thus you will only get one copy of each item. The list of queries
> > matched will be in the Link header so users know why they're getting
> > the item.
> >
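
A minimal sketch of how a subscriber might read the matched queries out of the Link header on an aggregated delivery. The function name matched_topics and the regex-based parsing are illustrative assumptions, not part of the proposal or of any PuSH library; a real parser would need to handle the full Link header grammar (extra parameters, unquoted rel values, and so on).

```python
import re

# Matches one `<url>; rel="value"` entry of an HTTP Link header.
# Assumption: each entry carries exactly one quoted rel parameter.
LINK_ENTRY = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')

AGGREGATION_REL = "http://pubsubhubbub.org/aggregation"


def matched_topics(link_header):
    """Return (aggregation_url, list of rel="self" topic URLs)."""
    aggregation = None
    topics = []
    for url, rel in LINK_ENTRY.findall(link_header):
        if rel == AGGREGATION_REL:
            aggregation = url
        elif rel == "self":
            topics.append(url)
    return aggregation, topics


# Header value from the example delivery, joined onto one logical line.
header = (
    '<https://www.googleapis.com/buzz/v1/activities/combined>; '
    'rel="http://pubsubhubbub.org/aggregation", '
    '<https://www.googleapis.com/buzz/v1/activities/track?q=Bilbo>; '
    'rel="self", '
    '<https://www.googleapis.com/buzz/v1/activities/track?q=Baggins>; '
    'rel="self"'
)
aggregation, topics = matched_topics(header)
```

With the example header, topics holds the Bilbo and Baggins track URLs, so the subscriber can attribute the single delivered item to both queries.
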
> > This proposal would fundamentally decouple subscription verification
> > from event delivery. If the subscriber adds a new PuSH subscription
> > with the same "aggregation" link value, non-obviously it will use the
> > normal callback URL for PuSH verification but send all content
> > delivery to the aggregated callback. Unsubscription will also use a
> > separate callback URL for verification. If the subscriber unsubscribes
> > from the aggregation URL, then all of the subscriptions will revert
> > back to the old way of doing things.
> >
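
One way to picture the decoupling described above is a hub that keeps two lookups: per-topic callbacks, still used for verification, and an optional aggregated callback keyed by aggregation URL, used for delivery. The Hub class, its method names, and the example.com callbacks are hypothetical, and the sketch models a single subscriber for brevity.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Subscription:
    topic: str
    callback: str               # normal per-topic callback
    aggregation: Optional[str]  # aggregation link declared by the topic's feed


class Hub:
    """Toy model of one subscriber's state on a hub (hypothetical)."""

    def __init__(self):
        self.subs: Dict[str, Subscription] = {}
        self.aggregated: Dict[str, str] = {}  # aggregation URL -> callback

    def subscribe(self, topic, callback, aggregation=None):
        self.subs[topic] = Subscription(topic, callback, aggregation)

    def subscribe_aggregation(self, aggregation, callback):
        self.aggregated[aggregation] = callback

    def unsubscribe_aggregation(self, aggregation):
        self.aggregated.pop(aggregation, None)

    def verification_callback(self, topic):
        # Verification always goes to the topic's own callback.
        return self.subs[topic].callback

    def delivery_callback(self, topic):
        # Delivery goes to the aggregated callback when one is registered
        # for this topic's aggregation URL, else to the normal callback.
        sub = self.subs[topic]
        return self.aggregated.get(sub.aggregation, sub.callback)


AGG = "https://www.googleapis.com/buzz/v1/activities/combined"
topic = "https://www.googleapis.com/buzz/v1/activities/track?q=Bilbo"

hub = Hub()
hub.subscribe(topic, "http://example.com/cb/bilbo", aggregation=AGG)
before = hub.delivery_callback(topic)
hub.subscribe_aggregation(AGG, "http://example.com/my-aggregated-callback")
during = hub.delivery_callback(topic)
hub.unsubscribe_aggregation(AGG)
after = hub.delivery_callback(topic)
```

In this model, delivery moves to the aggregated callback only while the aggregation subscription exists, and reverts to the per-topic callback once it is removed; the verification callback never changes.
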
> >
> > == Open questions
> >
> > Random list of questions:
> >
> > * What granularity do you use to move the existing subscriptions to
> > the aggregated endpoint? Does the publisher do it by domain, by URL
> > prefix, by some other token?
> > * Should the "self" links in the aggregated delivery be for the feeds
> > you subscribed to, or should you instead pass through the callback
> > URLs that *would* have been used for normal delivery? The latter
> > approach could be useful for subscribers who put context data into
> > their callback URLs.
> > * Will this allow us to finally put the Topic header in content
> > delivery as users have requested a million times?
> > (http://code.google.com/p/pubsubhubbub/issues/detail?id=79)
> > * Can this scheme be reused for aggregated delivery across different
> > sites, so subscribers get fewer POSTs?
> >
> >
> > Thanks for reading!
> >
> > -Brett
> >
>



-- 
Stay hungry,Stay foolish.

Twitter: http://twitter.com/lookon | Buzz:
http://www.google.com/profiles/areyoulookon | Blog:
http://throw-dice.appspot.com
