[pubsubhubbub] Re: Spec 0.4 review

Roman Tue, 19 Jun 2012 08:17:29 -0700

Thanks for your reply, Julien. I added [email protected] to CC,
I hope you don't mind.


On Mon, Jun 18, 2012 at 5:05 PM, Julien Genestoux <
[email protected]> wrote:

> Roman.
>
> Please see below for inline responses...
>
> On Mon, Jun 18, 2012 at 2:10 PM, Roman <[email protected]> wrote:
>
>> Hi Julien,
>>
>> I went through the 0.4 spec and wanted to ask a few questions:
>>
>> *1.* There are a few 'url' and 'urls' scattered around. I think they
>> should be changed to 'URL' and 'URLs' respectively.
>>
>
> You are right. Fixed.
>
>
>>
>> *2. Section 4.1*. *The subscriber's callback URL SHOULD be unique for
>> each subscription.*
>>
>> Why is it necessary? Why is it a good thing?
>>
>
> So, it's not necessary (but my English nuances may be off). If it were
> necessary, I'd have put MUST.
> Now, it's good for a lot of reasons; the simplest one is that it makes
> debugging much easier.
> It is generally extremely convenient...
>

I think it's better to leave it up to the subscriber. As far as the hub is
concerned, there is no preference for unique URLs. And I can imagine simple
subscribers that work fine with the same callback for all topics.
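As an illustration of such a simple subscriber, here is a minimal sketch of a
single callback that serves every topic and tells verification requests apart
by hub.topic. It assumes the verification GET carries hub.mode, hub.topic and
hub.challenge, and that the subscriber answers 2xx with the challenge as the
body to confirm or 404 to refuse. The function name, the PENDING_TOPICS set
and the second topic URL are hypothetical.

```python
from urllib.parse import parse_qs

# Hypothetical set of topics this subscriber has asked the hub for.
PENDING_TOPICS = {"http://foo.com/atom.xml", "http://bar.example/feed"}


def handle_verification(query_string, pending=PENDING_TOPICS):
    """Return (status, body) for a verification GET on the shared callback."""
    params = parse_qs(query_string)
    mode = params.get("hub.mode", [""])[0]
    topic = params.get("hub.topic", [""])[0]
    challenge = params.get("hub.challenge", [""])[0]

    # Confirm only requests for topics we actually asked for.
    if mode in ("subscribe", "unsubscribe") and topic in pending and challenge:
        return 200, challenge  # echo the challenge back to the hub
    return 404, ""  # refuse anything we do not recognise
```

Because the topic is part of the request, one callback URL is enough here;
unique callbacks only become useful when the subscriber wants to route or
debug per subscription.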

>> *3. Section 5.1.* *Subscriber Sends Subscription Request*
>>
>> What is the rationale behind removing hub.verify request parameter?
>>
> Simplification. I believe this is generally quite confusing for many of the
> people we helped implement the protocol.
> Now, we had to choose between sync and async. Sync may not always be
> enforceable because the publisher may take time to accept the
> subscription... Now, when using async, we lose the ability to quickly know
> whether the subscription worked or failed, but since we introduced
> 5.2.2, I believe we have everything in order.
>

Makes sense.

>> *4. Section 5.1.* *Subscribers MAY also include additional HTTP Query
>> params, as well as HTTP Headers if they are required by the hub, or the
>> publisher.*
>>
>> By the publisher? Should the hub forward the headers and extra query
>> parameters to the publisher?
>>
> Yes, but that's outside of the spec (as the whole publisher <-> hub
> relationship).
>

I think it's better to remove "or the publisher" because it's confusing.


>
> Also, s/HTTP Query params/request parameters/.
>>
> Changed.
>
>
>>
>> *5. Section 5.1.* *This header should indicate on behalf of which user
>> the subscription is being performed.*
>>
>> The use of 'should' here is confusing because it doesn't say that
>> either party is supposed to do something.
>>
> Well, the subscriber should include that data... (specifically because the
> hub/publisher may refuse the subscription if it's missing).
> The hub may or may not forward that data to the publisher when the latter
> requires it (in the context of subscription validation).
>
> I would rephrase it as follows:
>>
>>  In the context of social web applications, it is considered good
>> practice to include a From HTTP header (as described in section 14.22 of
>> Hypertext Transfer Protocol [RFC2616]) to indicate on behalf of which user
>> the subscription is being performed.
>>
> Replaced...
>
>
>>  *6. Section 5.1.1.* *The topic URL MUST be the one advertised by the
>> publisher in a Hub Link Header Header during the discovery phase.*
>>
>> s/Hub Link/Self Link/
>>
> Fixed.
>
>
>> *7. Section 5.1.1.* *The topic URL can otherwise be free-form following
>> the URI spec [RFC3986].*
>>
>> What does free-form mean here?
>>
> Took it from the previous version of the spec... written by Brett, Brad
> and Mart. Not sure what they exactly meant, besides that there is no
> "semantic" requirement as long as it's a URI.
>
>
>>
>> *8. Section 5.2.* *Hub Verifies Intent of the Subscriber*
>>
>> What's the rationale behind removing hub.verify_token request parameter?
>>
> It's useless. (It was initially introduced to simplify the subscriber's
> ability to map verifications of intent to subscriptions... which can be
> easily done with a different callback for each subscription.)
>

Makes sense, although I expect most hubs to support hub.verify_token for
backward compatibility.

>> *9. Section 5.2.1.* *Verification Details*
>>
>> Should the hub retry in case of 404? How about other errors?
>>
> I don't think the hub should retry. My rationale behind this is that
> subscribers should be the ones to make sure their subscriptions go through
> fine... not the hub.
>

I'm still not sure what the best retry strategy is here. A no-retry policy
makes the hub's automatic refreshes unreliable and pretty much useless.

- A simple subscriber makes a permanent subscription to
http://foo.com/atom.xml.
- The hub verifies the subscription and everything is fine.
- A week later the hub decides to reverify the subscription, but unfortunately
the subscriber is temporarily down.
- The subscription is removed.

This means every subscriber MUST resubscribe periodically. Maybe that's not
a bad thing, but the spec gives mixed signals.

> Also, retrying would cause security issues (it becomes easier to make DOS
> attacks for example).
>

DOS can be mitigated by limiting the number of retries (for example, 3
retries over 6 hours).
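
A bounded retry policy of this kind could be sketched as follows. This is
only an illustration under stated assumptions: the even two-hour spacing, the
helper names, and the injected `attempt` and `sleep` callables are all
hypothetical, and nothing here is mandated by the spec.

```python
from datetime import timedelta

MAX_RETRIES = 3                    # "3 retries"
RETRY_WINDOW = timedelta(hours=6)  # "over 6 hours"


def retry_delays(max_retries=MAX_RETRIES, window=RETRY_WINDOW):
    """Evenly spaced delays so that all retries fit inside the window."""
    if max_retries <= 0:
        return []
    return [window / max_retries] * max_retries


def verify_with_retries(attempt, delays=None, sleep=lambda delay: None):
    """Run attempt() once, then once more after each delay until it succeeds.

    attempt: callable returning True when the subscriber confirmed.
    sleep:   callable used to wait; a real hub would schedule a task instead.
    """
    if delays is None:
        delays = retry_delays()
    if attempt():
        return True
    for delay in delays:
        sleep(delay)
        if attempt():
            return True
    return False  # give up; the hub drops or ignores the subscription
```

The cap keeps the total number of requests per subscription fixed (four at
most with these values), which is what limits the DOS exposure.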

>> *10. Section 5.2.2.* *The subscription may be intentionally denied by the
>> hub at any point (even if it was previously accepted). The Subscriber
>> SHOULD then consider that the subscription is not possible anymore.*
>>
>> What does 'intentionally denied' mean here? Could you explain?
>>
> Well, the hub should be allowed to deny a previously accepted
> subscription... even if the subscriber hasn't issued a (re)subscription.
> The rationale is that the publisher or a hub may decide to ban a
> subscriber. That would happen for example with social applications: a user
> may 'block' another user from following them.
>

Let's drop 'intentionally' from the spec.

Also, s/may/MAY/.

It's not clear what the last sentence means: *The Subscriber SHOULD then
consider that the subscription is not possible anymore.* Does it mean the
subscriber must not try to resubscribe? That would be weird.


>
>> *11. Section 5.2.2.* *If (and when), the subscription is accepted, the
>> hub will inform the subscriber by sending an HTTP [RFC2616] GET request to
>> the subscriber's callback URL as given in the subscription request.*
>>
>> The verb 'will' is not clear here. Perhaps, MUST?
>>
> Fixed.
>
>
>>
>> What worries me with this protocol is that hubs will issue two GET
>> requests to the subscribers on each subscription.
>>
>
>
>> The alternative is to merge two requests into one:
>> a. Subscriber sends subscription request.
>> b. (optional) Hub validates request.
>> c. If validation was performed and has failed, hub sends subscriber
>> notification with hub.mode = "denied".
>> d. Else, hub sends verification request to the subscriber.
>>
>
> That seems decent... my only worry is again for DOS. An ill-intended
> subscriber may very well issue subscription
> requests at a very high frequency. But I guess that it can/should easily
> be filtered at another level by the hub.
>
>
>>
>> Also, it's reasonable to allow hubs to send notifications with hub.mode =
>> "denied" at any point, even after the subscription was set up and working
>> for a while. Then the flow can be simplified:
>>
>> a. Subscriber sends subscription request.
>> b. Hub sends verification request to the subscriber. If it fails, the
>> subscription is ignored.
>> c. (optional) Hub validates request at some point in the future.
>> d. If validation failed, hub sends subscriber notification with hub.mode
>> = "denied".
>> e. If validation succeeded, c-e can repeat.
>>
>
> That works too conceptually. However, in this case, I'm worried about
> 'mixed signals' :)
> As a subscriber may assume that the subscription is accepted, even though
> it was technically
> never accepted by the publisher...
>

>
>> There doesn't seem to be a reason to notify subscriber with hub.mode =
>> "accepted". The subscriber can assume that the subscription is working
>> properly until it receives hub.mode = "denied".
>>
> I don't like that... because there is a 'silent failure' scenario here. If
> for some reason the hub goes down and never validates subscriptions/informs
> the subscriber of failed subscription, then the subscriber assumes that
> everything is fine.
>

If you consider the fact that the hub may deny subscriptions at any point,
the silent-failure case is no different from what can already happen.

Consider the following scenario:
- Subscriber subscribes to http://foo.com/atom.xml.
- Everything is fine, the subscriber is notified with hub.mode = "accepted"
or hub.mode = "subscribe".
- At some later point, the hub decides to deny the subscription.

Now the subscriber thinks that everything is fine, although the
subscription is denied. It's exactly the same situation as the one you are
trying to avoid.

Also, a hub that follows the second protocol from my suggestion isn't really
violating the first protocol. The hub may do validation after telling the
subscriber that the subscription is accepted because the hub is allowed to run
validation at any point.
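
From the subscriber's side, the argument can be sketched as a small state
tracker in which hub.mode = "denied" is handled the same way whether it
arrives before or after verification. The class, its state names, and the
`resubscribe` helper are assumptions made for illustration; only the
hub.mode values come from the discussion above.

```python
class Subscription:
    """Subscriber-side view of one subscription (hypothetical sketch)."""

    def __init__(self, topic):
        self.topic = topic
        self.state = "pending"  # request sent, nothing heard from the hub yet

    def on_hub_request(self, mode):
        """Update the state from the hub.mode of an incoming hub request."""
        if mode == "denied":
            # Valid in any state: the hub may deny before or after verifying.
            self.state = "denied"
        elif mode == "subscribe" and self.state != "denied":
            # Verification (or a later reverification) of a wanted subscription.
            self.state = "active"
        return self.state

    def resubscribe(self):
        """Start over after a denial by issuing a new subscription request."""
        self.state = "pending"
        return self.state
```

With this shape the subscriber does not need to know which of the two flows
the hub follows, since both reduce to the same two transitions.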

Does it make sense?


>
>> What do you think?
>>
>
> The first option seems reasonable.
>
>
>> *12. Section 5.3.* *Automatic Subscription Refreshing*
>>
>> This is the same as in the 0.3 spec, but still. Does this section apply
>> only to the case of infinite subscriptions? Or must the hub also refresh
>> finite subscriptions?
>>
> Yes, hubs SHOULD refresh, for convenience... but subscribers MUST refresh
> too, whichever comes first.
> I don't want to force subscribers to keep track of lease terminations if
> they only want permanent subscriptions...
>
>
>> And one more question. How should 404 be handled when renewing
>> subscriptions? Should hub retry? Is it different from the initial
>> verification request?
>>
> Not different...
>
> I really appreciate your help! I hope you don't mind my comments...
>
>
>>
>> Roman.
>>
>
>
