Luckily, probably 80% of subscribers will be doing something so simple that it's not even an issue. I do agree Hubs should have a timeout to encourage good practice. I don't think keeping the connection open is a big deal if you're doing it right. Hell, some people let you keep a connection open indefinitely, even "at scale" (see the Twitter Stream API). You'll see in my upcoming hub implementation...
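[Editor's note: a minimal sketch of the hub-side delivery with a hard timeout discussed in this thread. The callback URL, payload, and `deliver` helper are hypothetical, not from any real hub implementation; non-2xx responses and timeouts are simply treated as delivery failures.]

```python
# Hypothetical hub-side delivery: POST the delta to a subscriber's
# callback and give up if it doesn't acknowledge within the timeout.
import urllib.request


def deliver(callback_url: str, payload: bytes, timeout: float = 10.0) -> bool:
    """Return True if the subscriber acknowledged with a 2xx in time."""
    req = urllib.request.Request(
        callback_url,
        data=payload,
        headers={"Content-Type": "application/atom+xml"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except Exception:
        # Timeout, connection error, or a 4xx/5xx (urllib raises HTTPError):
        # the subscriber's problem to solve, not the hub's.
        return False
```

With this shape, a slow subscriber costs the hub at most `timeout` seconds per delivery attempt, which is the point being argued below.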
On Mon, Oct 26, 2009 at 1:41 PM, Pádraic Brady <[email protected]> wrote:

> It's sort of an expectations game, with the question being what the Hub
> expects. Ideally, it's expecting to POST a delta, get a 2xx response, and
> move on to the next Subscriber. If, however, the Subscriber acts
> synchronously, then the Hub carries the cost of maintaining a connection
> while the Subscriber does all the update processing work before sending a
> 2xx response.
>
> Should the Hub be stuck waiting for a response because the Subscriber is
> doing work with absolutely no impact on the expected 2xx response?
> Personally, I don't think so. That clashes with a web developer's instinct
> to treat all work within a single request as essential to the response,
> which is why most will (I agree absolutely) not use a separate queue. That
> doesn't make it correct or efficient, though. Subscribers should give the
> Hub the expected response once its request needs are met - i.e. the
> Subscriber received the update and verified it as valid. Anything outside
> that is not essential to the Hub response.
>
> I think it's important because synchronous processing will land Hubs with
> the impact of a Subscriber's ill-advised practices - clumped requests
> taking forever since the server is bogged down in swap, inefficient
> database ops, slow processing, etc. If I were running a Hub, I'd paint a
> 10-second max timeout on my connections and make it abundantly clear to
> Subscribers that not meeting that timeout is their problem to solve.
>
> Maybe I'm being harsh, though ;). I just don't like building in practices
> that let poor implementations get away with bogging down other parties for
> no good reason. It's practically begging people to do the wrong thing
> because it's actively tolerated. As a wise man once said, programming
> really is the one discipline where we seem unfathomably obsessed with
> making life easier for the less skilled of its members.
>
> Paddy
>
> Pádraic Brady
>
> http://blog.astrumfutura.com
> http://www.survivethedeepend.com
> OpenID Europe Foundation Irish Representative <http://www.openideurope.eu/>
>
> ------------------------------
> *From:* Jeff Lindsay <[email protected]>
> *To:* [email protected]
> *Sent:* Mon, October 26, 2009 6:14:40 PM
> *Subject:* [pubsubhubbub] Re: Are fat pings efficient?
>
>> To your second point, Subscribers should never synchronously process
>> updates. They should be dumped immediately to a job queue for
>> asynchronous processing. This will help spread the processing load more
>> evenly over time instead of being clumped together, which I gather is
>> what you're against. So it's receive update, verify it is an update
>> (input validation), dump update to queue, and respond with a 200 code.
>
> Actually, I don't see anything wrong with handling the event
> synchronously. While it's courteous to the hub, hubs will HAVE to be able
> to handle this because that's just how most people will do it. From the
> subscriber perspective, a job queue is unnecessary because their web
> server should already be handling the request asynchronously. Apache is
> generally already a big worker pool using incoming HTTP requests as the
> job queue.
>
> --
> Jeff Lindsay
> http://webhooks.org -- Make the web more programmable
> http://shdh.org -- A party for hackers and thinkers
> http://tigdb.com -- Discover indie games
> http://progrium.com -- More interesting things

--
Jeff Lindsay
http://webhooks.org -- Make the web more programmable
http://shdh.org -- A party for hackers and thinkers
http://tigdb.com -- Discover indie games
http://progrium.com -- More interesting things
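[Editor's note: a minimal sketch of the "receive, verify, enqueue, respond 200" subscriber flow debated above. The function names are hypothetical, the validation is deliberately cheap, and the in-process `queue.Queue` stands in for whatever job system a real subscriber would use.]

```python
# Hypothetical subscriber-side handling of a fat ping: validate the
# input, defer the real work, and acknowledge the hub immediately.
import queue

# In-process stand-in for a real job queue; a worker elsewhere drains it.
jobs: "queue.Queue[bytes]" = queue.Queue()


def looks_like_atom(body: bytes) -> bool:
    """Cheap input validation: is this plausibly an Atom delta?"""
    return body.lstrip().startswith(b"<") and b"<feed" in body


def handle_ping(body: bytes) -> int:
    """Return the HTTP status code the subscriber should send the hub."""
    if not looks_like_atom(body):
        return 400  # reject garbage without doing any work
    jobs.put(body)  # defer the expensive processing to a worker
    return 200      # acknowledge immediately so the hub can move on
```

Whether the dequeued work then runs in a separate worker process or in the same Apache child (Jeff's point about the web server already being a worker pool) is exactly the trade-off the two sides of this thread disagree on.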
