Re: [pubsubhubbub] Re: deleted an published entry scenario

John Panzer Thu, 25 Feb 2010 12:03:57 -0800

FYI.  I am up against the wall on defining this for Salmon, so I'm going to
just spec use of draft-snell-atompub-tombstones-06.txt for this (even though
it's expired as of Dec 10, yay!).  I hope that it's true that deleted-entry
is trivial to support in PubSubHubbub.


If anyone wants to object, I'll be throwing up a draft and announcing it
soon.
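
For readers following along, here is a rough sketch of what consuming tombstones on the subscriber side might look like. It is an illustration only, not text from the Salmon or PubSubHubbub specs. The namespace URI and the `ref` / `when` attribute names are what I recall from the tombstones draft and should be checked against it; the function name `split_notification` and the sample feed are made up for this example.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Assumption: namespace as recalled from draft-snell-atompub-tombstones; verify against the draft.
TOMBSTONE_NS = "http://purl.org/atompub/tombstones/1.0"


def split_notification(feed_xml):
    """Return (live_entry_ids, tombstones) for one Atom feed document.

    tombstones is a list of (ref, when) pairs taken from at:deleted-entry
    elements that are direct children of the feed.
    """
    root = ET.fromstring(feed_xml)
    live = [
        entry.findtext(f"{{{ATOM_NS}}}id")
        for entry in root.findall(f"{{{ATOM_NS}}}entry")
    ]
    tombstones = [
        (deleted.get("ref"), deleted.get("when"))
        for deleted in root.findall(f"{{{TOMBSTONE_NS}}}deleted-entry")
    ]
    return live, tombstones


# Hypothetical notification body: one live entry and one tombstone.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:at="http://purl.org/atompub/tombstones/1.0">
  <id>tag:example.org,2010:feed</id>
  <entry>
    <id>tag:example.org,2010:entry-1</id>
    <title>Still here</title>
  </entry>
  <at:deleted-entry ref="tag:example.org,2010:entry-2"
                    when="2010-02-25T20:03:57Z"/>
</feed>"""

live_ids, tombstones = split_notification(SAMPLE)
print(live_ids)
print(tombstones)
```

Note that the hub does not need to understand the tombstone element here; it is carried like any other feed content, and only the subscriber interprets it.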

On Wed, Oct 7, 2009 at 9:32 PM, Nicholas Granado <[email protected]> wrote:

> Rad. So anyone have any thoughts on a format that could snap into this?
> Would a different format require the hub to at least know how to diff it?
>
>
> On Wed, Oct 7, 2009 at 7:49 PM, Matthew Terenzio <[email protected]> wrote:
>
>> It makes sense to me that some formats have this baked in. If I understand
>> Bob correctly, Atom explicitly does not address it, so other formats will
>> leave it up to applications or elsewhere in the flow. But it seemed that
>> most agreed that it didn't belong in PSHB.
>>
>>
>> On Wed, Oct 7, 2009 at 9:39 PM, Nicholas Granado <[email protected]> wrote:
>>
>>> It makes sense that PSHB's spec should be about the pub / sub / hub
>>> interaction, and leave it up to the transported format (Atom) to deal
>>> with changes in the feed's state (I know it's touchy to call it
>>> "state"... sorry). I've seen a scenario where the publisher and subscriber
>>> get out of sync. So in that scenario the publisher should fire the hook with
>>> the update (a deletion), and the subscriber could then parse for that update
>>> (deletion or new entry)? I could see that, if this were done in the right
>>> way, no updates to the hub would be needed; it would be more the format that
>>> would need the love/support. And the complexity could be offloaded to the
>>> application to keep track of deleted or new entries, which makes sense in
>>> terms of blogs. Is this completely off base? Or is this somewhat on the
>>> right track with what everyone else is proposing?
>>>
>>> Nick
>>>
>>> On Wed, Oct 7, 2009 at 12:06 PM, Matthew Terenzio <[email protected]> wrote:
>>>
>>> At this point I'm not going to continue despite it being an interesting
>>> thread so far because it's not my style and it appears to be drifting away
>>> from productive discussion. But I'm sorry you had such a tough time with it.
>>> I guess that's the way it works in lots of industries where money and fame
>>> are involved.
>>>
>>>
>>> On Wed, Oct 7, 2009 at 2:58 PM, Bob Wyman <[email protected]> wrote:
>>>
>>> On Wed, Oct 7, 2009 at 12:40 PM, Matthew Terenzio <[email protected]> wrote:
>>> > yet so many Atom users actually created
>>> > Atom documents that aggregators used
>>> > in a very similar way to RSS.
>>> A great many people put a great deal of effort into making Atom better
>>> than the previous formats. This involved, as I've indicated, thinking
>>> through a great many use cases that were not well handled by the many
>>> flavors of RSS. However, the community has never really been able to benefit
>>> from the work due to the heavy political pressure to maintain backwards
>>> compatibility with the legacy RSS format. Thus, we saw many feed producers
>>> who generated both RSS and Atom feeds and, because it is easier to do, they
>>> ended up implementing the "lowest common denominator" for both feed formats.
>>> We also see that virtually every tool that consumes either RSS or Atom also
>>> consumes the other. Thus, since virtually everyone that produces feeds
>>> produces Atom and virtually everyone that reads feeds reads Atom, there is
>>> simply no technical reason for anyone to continue to support the legacy RSS
>>> format. Continued support for RSS does nothing but prevent innovation and
>>> progress in this space. This is a high price to pay in order to support the
>>> ego of a single individual...
>>>
>>> bob wyman
>>>
>>>
>>> On Wed, Oct 7, 2009 at 12:40 PM, Matthew Terenzio <[email protected]> wrote:
>>>
>>> That's interesting that you felt that way and yet so many Atom users
>>> actually created Atom documents that aggregators used in a very similar way
>>> to RSS. You would have thought that a different paradigm would have emerged
>>> similar to XMPP. Maybe this time around.
>>>
>>>
>>> On Wed, Oct 7, 2009 at 12:14 PM, Bob Wyman <[email protected]> wrote:
>>>
>>> On Wed, Oct 7, 2009 at 10:59 AM, Matthew Terenzio <[email protected]> wrote:
>>> > But to say there is no use case for knowing
>>> > the current state of the feed (if that is what
>>> > you were saying) seems to be over-reaching
>>> > even if it wouldn't help in this case.
>>> The "current state of the feed" is, by definition in Atom, irrelevant.
>>> Atom is about entries, not feed documents. Feed documents are simply
>>> collections of entries that have, at some time, been associated with the
>>> "feed." (Note: A "feed document" is a concrete object. A "Feed" is a
>>> conceptual thing -- a potentially un-ending stream of entries.) While in
>>> common usage, the entries in a feed document will be the most recent subset
>>> of entries associated with the feed and those entries will normally be
>>> inserted into the feed document in the order that they were created or
>>> updated, these artifacts of "normal" usage are defined in Atom as having no
>>> semantic content. I realize that this probably seems like a fairly subtle
>>> point, however, it was the need to address this kind of subtlety that was a
>>> primary motivator for the definition of Atom in the first place. Issues like
>>> this are not, for instance, dealt with in the definition of RSS...
>>> (Grumble...)
>>>
>>> It is perhaps important to remember that when we were defining Atom, we
>>> had in mind (among many other things) systems that worked in precisely the
>>> same manner that PSHB does. PSHB is, after all, simply an HTTP REST
>>> implementation of a subset of the capabilities that we were then delivering
>>> based on XMPP/PubSub, or even before that with BEEP/APEX PubSub... As a
>>> result of our experience with this pattern of application, we knew that if
>>> the "current state of the feed" had meaning, then it would introduce all
>>> sorts of undesirable and usually unnecessary complexity into these systems.
>>> Thus, we defined the problem out of existence by saying that it is entries
>>> that matter, not feeds. The presence or absence of an entry in a feed
>>> document at any specific time is irrelevant and so is the order of entries
>>> within a feed document or the co-occurrence of entries in a feed document.
>>> This massively reduces the complexity of PSHB-like systems and, in fact,
>>> allows them to gain greater efficiencies and utility since they can focus
>>> just on distributing entries without having to worry about distributing all
>>> kinds of information about feed state.
>>>
>>> Now, while it is really useful to establish the base principles that Atom
>>> does, it is recognized that there are often *application* requirements for
>>> an ability to "retract" or "remove from circulation" some entry or the
>>> information contained in an entry. Often, this can be accomplished by simply
>>> inserting into the feed an updated version of the entry. (Perhaps the title,
>>> body, and summary now all read: "deleted"...) For applications that need
>>> some stronger semantic for "deletion" or "retraction," it might make sense
>>> to define an application specific extension that explicitly flags things as
>>> retracted. For instance, you might be publishing "Offers to sell" or "Offers
>>> to buy" in Atom. At some point you want to be able to explicitly retract
>>> your offer -- perhaps because you sold all available units. You might also
>>> want to be able to "expire" your offers after some specific amount of time
>>> -- whether or not you actually bought or sold anything.
>>>
>>> While retractions, cancellations, expirations, etc. are all wonderfully
>>> useful ideas, it turns out that it is very difficult to define a single
>>> model for these things that will apply to all cases. Thus, Atom doesn't
>>> address these issues and leaves it as a problem for applications and
>>> extensions layered on top of Atom. I suggest that PSHB should take the same
>>> approach. PSHB should focus on providing the means by which entries flow
>>> between publishers and subscribers -- it should leave interpretation of the
>>> entries up to other services and/or applications.
>>>
>>> bob wyman
>>>
>>>
>>> On Wed, Oct 7, 2009 at 10:59 AM, Matthew Terenzio <[email protected]> wrote:
>>>
>>> If an item in the feed is removed and you fetch it within the given
>>> window, it won't be there.
>>>
>>> If I store a cache of the feed on my server and update it when there is a
>>> change, the entry will no longer be on my server either.
>>>
>>> Surely there must be some aggregators that have worked like this, no?
>>>
>>> You are very much right that it is not the same as deletion, and that the
>>> life of an entry would be independent of the feed even if deletion were
>>> available in the spec, because not everyone might support it or, as you
>>> suggested, the entry might have moved downstream.
>>>
>>> But to say there is no use case for knowing the current state of the feed
>>> (if that is what you were saying) seems to be over-reaching even if it
>>> wouldn't help in this case.
>>>
>>>
>>>
>>> On Tue, Oct 6, 2009 at 12:16 PM, Bob Wyman <[email protected]> wrote:
>>>
>>> On Tue, Oct 6, 2009 at 9:20 AM, Niko Sams <[email protected]> wrote:
>>> > If PSHB doesn't support deletion, then I must
>>> > fetch the original feed on every notification -
>>> > and ignore the supplied atom feed completely.
>>> Why would you "fetch the original feed on every notification"? What
>>> information would you get by doing that?
>>> Atom provides no means to mark an item as deleted. Thus, reading the feed
>>> won't tell you what is "deleted."
>>>
>>> I'm assuming that you realize that the mere removal of an item from a
>>> feed is *not* the same as deletion. In this context, a "deletion" is really
>>> more like a "retraction." The contents of a feed document are only a sliding
>>> window on the virtual "feed" of all entries published to the feed over time.
>>> The presence or absence of an entry in any particular feed document does not
>>> carry information. The "life" of an entry is independent of its presence
>>> within any particular feed document.
>>>
>>> What do you learn by fetching the original feed? (Note: The atom format
>>> spec would say: "Nothing!")
>>>
>>> bob wyman
>>>
>>>
>>> On Tue, Oct 6, 2009 at 9:20 AM, Niko Sams <[email protected]> wrote:
>>>
>>>
>>> Hi,
>>>
>>> > Deletion in this kind of system is exceptionally difficult. This is why we
>>> > left any form of deletion out of the Atom spec itself. Please don't go down
>>> > this path without a great deal of careful consideration... PSHB is getting
>>> > more and more complicated all the time. Do you really want to deal with the
>>> > mess that will be created if folk think you're trying to handle arbitrarily
>>> > complex distributed synchronization issues including deletions?
>>> If PSHB doesn't support deletion, then I must fetch the original feed on every
>>> notification - and ignore the supplied atom feed completely.
>>> Even if it is difficult - it is very important.
>>>
>>> Niko
>>>
>>
>
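
A minimal sketch of the entry-centric model Bob Wyman describes in the quoted thread: the subscriber keeps a store keyed by entry id, merges each feed document into it, and never treats an entry's absence from a later document as a deletion. The names `Entry` and `EntryStore` are hypothetical, and comparing `updated` values as strings assumes every timestamp uses the same RFC 3339 UTC form ending in "Z".

```python
from dataclasses import dataclass


@dataclass
class Entry:
    entry_id: str  # value of atom:id
    updated: str   # atom:updated; assumed uniform UTC "Z" form so strings sort by time
    title: str


class EntryStore:
    """Subscriber-side store that tracks entries, not feed state."""

    def __init__(self):
        self._entries = {}

    def merge(self, feed_document):
        """Fold one feed document (an iterable of Entry) into the store."""
        for entry in feed_document:
            known = self._entries.get(entry.entry_id)
            # Keep the newest version of each entry we have seen.
            if known is None or entry.updated > known.updated:
                self._entries[entry.entry_id] = entry
        # Deliberately no removal step: a feed document is a sliding window,
        # so an entry missing from it tells us nothing.

    def ids(self):
        return sorted(self._entries)

    def get(self, entry_id):
        return self._entries.get(entry_id)


store = EntryStore()
# First document carries entries 1 and 2.
store.merge([
    Entry("tag:example.org,2009:1", "2009-10-06T09:00:00Z", "First"),
    Entry("tag:example.org,2009:2", "2009-10-06T10:00:00Z", "Second"),
])
# Later document: entry 1 has slid out of the window, entry 2 was updated.
store.merge([
    Entry("tag:example.org,2009:2", "2009-10-07T08:00:00Z", "Second (edited)"),
    Entry("tag:example.org,2009:3", "2009-10-07T09:00:00Z", "Third"),
])
print(store.ids())
```

Under this model a retraction has to arrive as an explicit signal (an updated entry or an extension element), which is why the format, rather than the hub, is where deletion support would live.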
