Re: [pubsubhubbub] Re: FYI: EventedAPI

Jeff Lindsay Tue, 29 Nov 2011 16:42:49 -0800

Okay, I better understand your position and perspective on this. Btw, are
you in the area (SF)? It would be interesting to discuss topic vs content
based subscriptions in person because I have thought/worked with it a lot,
but not in those terms.


-jeff

On Tue, Nov 29, 2011 at 2:06 PM, Bob Wyman <[email protected]> wrote:

>
>
> On Mon, Nov 28, 2011 at 8:45 PM, Jeff Lindsay <[email protected]> wrote:
>
>>> The idea was that the hub should publish Atom entries and only Atom
>>> entries. Of course, the entries would contain atom.source elements to show
>>> the feeds with which they were associated. Also, the hub should do
>>> de-duping to ensure that any particular entry isn't sent more than once.
>>>
>>
>> Yeah, I get the reasoning behind Atom and I understand its more general
>> use. The problem is in order to make something useful and easy to adopt,
>> you need to really facilitate what people are already doing and are
>> familiar with. Not everybody wants to work with Atom, despite all its
>> benefits. Having Atom as a representation or as a possible payload is
>> great, but depending on its semantics, forcing it to be required for PSHB
>> to be useful is not a great idea... or at least not a pragmatic one IMO.
>>
>>
>>> We could build all the above things very easily based on systems that
>>> publish Atom feeds and allow content-based (query-based) subscriptions.
>>>
>>
>> Call me crazy, but I'm in love with the Unix philosophy of doing one
>> thing well and designing for composition of more complex systems from
>> simple parts.
>>
> "Designing for composition of more complex systems from simple parts" is
> an excellent goal. The problem is that in order to facilitate composition,
> you must have some idea of what kinds of complex systems you're going to
> compose. Given the application domain under discussion (Publish/Subscribe
> even if some other name is used), the problem here is that we know from
> many long years of experience that it is difficult to build a content-based
> system on top of a topic-based system yet it is trivial to build a
> topic-based system on top of a content-based system. It is important where
> you start when designing systems. Things get path-dependent very quickly.
>
> The problem is that design decisions made to facilitate topic-based system
> construction tend to make harder the job of building content based systems.
> Take, for example, the regular discussion of "firehoses" which are almost
> always a common subject of discussion with topic-based systems but are
> generally irrelevant when discussing content-based systems. A firehose,
> which adds complexity to the topic-based implementation, is almost always
> needed when people want to do any kind of content-based work on top of a
> topic-based system. (That can include either real-time filtering or dumping
> of data into a database for later "content-based" retrieval or searching.)
> A firehose is simply a mechanism to de-mux or merge together the many
> topic-based streams that were created in order to provide a topic-based
> subscription model. If you start with a topic-based system, you almost
> always need to construct firehoses in order to make content-based routing
> possible. On the other hand, if you start with a content-based system and
> have "topic" as an attribute of each published item, then it is trivial to
> create "topic" streams since they are simply single-attribute subscriptions
> keyed on the "topic" attribute.
>
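As a rough illustration of the firehose pattern described above (a sketch under assumptions, not code from any actual PSHB hub): the hypothetical `TopicHub` below only accepts subscriptions by exact topic name, so a consumer that wants to filter on content first has to subscribe to every topic it knows about, merge those streams, and only then filter. The names `TopicHub`, `firehose`, and the topic strings are all invented for this example.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Item = Dict[str, str]
Callback = Callable[[Item], None]


class TopicHub:
    """Hypothetical topic-based hub: subscribers name exactly one topic."""

    def __init__(self) -> None:
        self._subs: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        self._subs[topic].append(callback)

    def publish(self, topic: str, item: Item) -> None:
        for callback in self._subs[topic]:
            callback(item)


def firehose(hub: TopicHub, topics: List[str], sink: Callback) -> None:
    """Merge many per-topic streams into one by subscribing to each topic."""
    for topic in topics:
        # Tag each item with its topic so the merged stream stays filterable.
        hub.subscribe(topic, lambda item, t=topic: sink({**item, "topic": t}))


matches: List[Item] = []


def keep_if_mentions_atom(item: Item) -> None:
    # Content-based filtering has to happen downstream of the firehose.
    if "atom" in item["content"].lower():
        matches.append(item)


hub = TopicHub()
firehose(hub, ["feed-a", "feed-b"], keep_if_mentions_atom)
hub.publish("feed-a", {"content": "Atom entries as containers"})
hub.publish("feed-b", {"content": "RSS only"})
hub.publish("feed-c", {"content": "Atom, but on a topic the firehose missed"})
```

In this sketch the firehose has to be told every topic up front, so the item published on "feed-c" never reaches the filter even though its content matches.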
> If you start with a content-based model but want topic-based, then instead
> of subscribing to topic "foobar" you assume that all published items have
> an attribute named "topic" and you subscribe to "topic = 'foobar'". A
> "topic-based" system is thus nothing more than the most simple use of a
> content-based system. Of course, the advantage of using a trivial
> content-based interface to emulate a topic-based system is that you can
> then easily expand the capability of the base system to support more
> complex filters or queries. You can go from just a single attribute and
> exact-match to allowing full Boolean expressions, etc. without making a
> significant change to the subscription interface -- the changes are only to
> the subscription query syntax and those changes can all produce proper
> supersets of the trivial syntax.
>
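A minimal sketch of the argument above, assuming published items are plain attribute dictionaries and a subscription is just a predicate over them. `ContentHub`, `topic_is`, `all_of`, and `content_contains` are hypothetical names made up for this example, not part of PSHB or any library. It shows a "topic" subscription expressed as a single-attribute exact match, and a Boolean AND added without changing the subscribe interface.

```python
from typing import Callable, Dict, List, Tuple

Item = Dict[str, str]            # a published item is a bag of attributes
Query = Callable[[Item], bool]   # a subscription is a predicate over items
Callback = Callable[[Item], None]


class ContentHub:
    """Hypothetical content-based hub: every subscription is a query."""

    def __init__(self) -> None:
        self._subs: List[Tuple[Query, Callback]] = []

    def subscribe(self, query: Query, callback: Callback) -> None:
        self._subs.append((query, callback))

    def publish(self, item: Item) -> None:
        # Deliver the item to every subscriber whose query matches it.
        for query, callback in self._subs:
            if query(item):
                callback(item)


def topic_is(topic: str) -> Query:
    """The 'topic-based' case: an exact match on a single attribute."""
    return lambda item: item.get("topic") == topic


def content_contains(text: str) -> Query:
    """Substring match on the (assumed) 'content' attribute."""
    return lambda item: text in item.get("content", "")


def all_of(*queries: Query) -> Query:
    """Boolean AND over queries; the single-attribute form is a special case."""
    return lambda item: all(q(item) for q in queries)


hub = ContentHub()
by_topic: List[Item] = []
by_topic_and_content: List[Item] = []

hub.subscribe(topic_is("foobar"), by_topic.append)
hub.subscribe(
    all_of(topic_is("foobar"), content_contains("hello")),
    by_topic_and_content.append,
)

hub.publish({"topic": "foobar", "content": "hello world"})
hub.publish({"topic": "foobar", "content": "goodbye"})
hub.publish({"topic": "other", "content": "hello again"})
```

Here the plain topic subscriber receives both "foobar" items, while the combined query receives only the one whose content also matches. The `subscribe` signature is the same in both cases.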
> What I wonder is what, if any, benefit comes from baking "topic-based"
> into the subscription interface? Given that the alternative provides such
> flexibility down the road, what significant advantage do you get from
> limiting the system's expressiveness up-front?
>
>
>> Queries and filters, to me, are out of the scope of this protocol,
>> despite being very useful.
>>
> If you see my reasoning in the paragraphs above, you won't be surprised
> that I claim that in order to build a topic-based system, you already need
> to build "Queries and Filters."  The only difference is that if you build
> something like PSHB, you are building a very simple filter language that
> happens to be hard to extend. When people subscribe to topic
> "http://example.com/feed", it is EXACTLY the same, semantically, as
> subscribing using the query "topic = 'http://example.com/feed'"... There
> is no significant introduction of complexity that results from going from
> topic-based to content-based -- only a much easier path to doing more
> interesting things in the future. (i.e. "topic='http://example.com/feed' AND
> content='foobar'" is just a step away...)
>
>
>> The reason is that anybody can create a subscriber or relay (perhaps even
>> a hub) that happens to do that filtering in its implementation.
>>
> Yes, anyone can build yet another aggregator to either consume firehoses
> or construct them and then filter them. But, just because a thing can be
> done, doesn't mean that we should insist that it be done -- unless there is
> a good reason not to allow alternatives. In this case, I can't see that
> there is. Building the basic system using the model of a trivial
> content-based system doesn't make it any more difficult to build other hubs
> or relays that can do arbitrary processing; however, it gives us the option
> of allowing a single system, with a standard interface, to do both the
> simple and the complex work in an integrated and more efficient manner.
>
>>
>> That said, I'm assuming this was more just to defend Atom and
>> content-based subscriptions, to which I would say: those examples should be
>> possible *if* you use Atom as your content container and have access to or
>> can build a subscription querier node. But it should also be possible if
>> the content is *not* Atom using the same approach of putting the filtering
>> in an intermediate node (or potentially being an implementation detail of a
>> hub).
>>
>> I just think the core should be simple and neutral, allowing more
>> specialized extensions, additions, and combinability. And for that, my
>> experience (and general observations) suggest that we should focus on
>> content-type neutral HTTP-based mechanisms.
>>
>> -jeff
>>
>>
>>>
>>> bob wyman
>>>
>>>
>>> On Mon, Nov 28, 2011 at 6:33 PM, Julien Genestoux <
>>> [email protected]> wrote:
>>>
>>>> Jeff, do you think you could help getting the folks at GitHub, Twilio,
>>>> FreshBooks, Pusher to come in here and participate? What would they love to
>>>> see in and out of PubSubHubbub so that it fits their needs?
>>>>
>>>> Bob, that's an interesting point. You said you wanted PSHB to be about
>>>> entries rather than feeds. I'm not sure I understand this. I guess you
>>>> would still need to subscribe to an endpoint that would emit a collection
>>>> of entries, right?
>>>>
>>>> Julien
>>>>
>>>>
>>>>
>>>> On Tue, Nov 29, 2011 at 12:16 AM, Bob Wyman <[email protected]> wrote:
>>>>
>>>>> On Mon, Nov 28, 2011 at 5:31 PM, Julien <[email protected]> wrote:
>>>>>
>>>>> > PubSubHubbub is currently too
>>>>> > much oriented toward data feeds
>>>>> Personally, I think that PSHB "went wrong" when folk insisted that it
>>>>> support RSS instead of just Atom. In the Atom format we had gone to great
>>>>> trouble to ensure that "entry" was a top-level item and that entries had
>>>>> the same semantics whether they were inside feeds or on their own. (Not the
>>>>> case with RSS.) One of the reasons that I worked to make this the case was
>>>>> that I've been wanting to do pubsub with arbitrary content for many
>>>>> years... The idea was that an Atom entry is a reasonable wrapper or
>>>>> container for just about any content you might want to publish. (MIME types
>>>>> distinguish the content type.) Thus, a system for syndicating Atom entries
>>>>> could be used to reasonably syndicate just about anything. But, when
>>>>> support for RSS feeds came into the PSHB spec, all sorts of things got
>>>>> confused... PSHB should have been about the entries, not the feeds...
>>>>>
>>>>> bob wyman
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Nov 28, 2011 at 5:31 PM, Julien <[email protected]> wrote:
>>>>>
>>>>>> Jeff, thanks for sharing so quickly :)
>>>>>> I perfectly agree and acknowledge that PubSubHubbub is currently too
>>>>>> much oriented toward data feeds, and content in general, while it's
>>>>>> just a sub-case.
>>>>>> I also think the "realtime" aspect of things doesn't matter that much,
>>>>>> and is just a consequence of the "push" design. When you trigger
>>>>>> events, there is no reason to do it later rather than sooner.
>>>>>>
>>>>>> The spec should evolve into something that works as well for events as
>>>>>> for content.
>>>>>> It should be "subscribe to a web resource, get events". [this can be
>>>>>> decorated in any way people want to work with feeds, with publisher/
>>>>>> hubs merged or distinct, with no data... etc.]
>>>>>>
>>>>>> Julien
>>>>>>
>>>>>> On Nov 28, 11:21 pm, Jeff Lindsay <[email protected]> wrote:
>>>>>> > On Mon, Nov 28, 2011 at 2:02 PM, Julien Genestoux <[email protected]> wrote:
>>>>>> > > Jeff, please do share your feelings. Help us make PubSubHubbub better!
>>>>>> > > Bob, obviously PubSubHubbub should be less about blogging and/or news. I
>>>>>> > > started a thread about supporting any kind of arbitrary data, and this is
>>>>>> > > what I had in mind as a way to support any kind of content, and any type of
>>>>>> > > updates (with or without payload).
>>>>>> >
>>>>>> > To this point, my main feeling is that, yes, PSHB is focused too much on
>>>>>> > content. While I think this is useful (as it's been the primary use case),
>>>>>> > it's not a wide enough net to really have critical mass as a project. I
>>>>>> > originally thought it was good that it was very focused and didn't solve
>>>>>> > *my* particular problems. I also thought it was good it focused on a
>>>>>> > tangible goal of making feeds more realtime. However, I think time has
>>>>>> > shown it was not enough to be a big enough deal to sustain momentum as a
>>>>>> > project.
>>>>>> >
>>>>>> > The problem is that this general problem PSHB solves has many different
>>>>>> > views/perspectives/languages. For example, it can be message oriented and
>>>>>> > talk about pubsub. Or it can be event oriented and talk about events etc
>>>>>> > (the perspective used by Phil and them). Or it can even be thought of as
>>>>>> > callbacks or hooks (webhooks). There are other similar concepts with
>>>>>> > different language as well: updates/notifications, observers, etc. The two
>>>>>> > main ones seem to be events vs messages/pubsub, and I'm not sure which one
>>>>>> > is generally considered more general than the other. Ultimately, technically,
>>>>>> > they're more or less the same thing, but I think the framing makes a *big*
>>>>>> > difference.
>>>>>> >
>>>>>> > Anyway, that's the start of my ideas around this.
>>>>>> >
>>>>>> > -jeff
>>>>>> >
>>>>>> > > Julien
>>>>>> >
>>>>>> > > On Mon, Nov 28, 2011 at 9:33 PM, Bob Wyman <[email protected]> wrote:
>>>>>> >
>>>>>> > >> The site http://www.mostlybaked.com/ provides a number of quick sketches
>>>>>> > >> of applications that are things that I personally think should work well
>>>>>> > >> over PSHB if the focus of PSHB became less about blogging and more about
>>>>>> > >> the general case of publishing and subscribing to streams of data on the
>>>>>> > >> Internet. Also, Phil often talks about the kinds of things that he'd like
>>>>>> > >> to do with the EventedAPI on his blog. ex:
>>>>>> > >> http://www.windley.com/archives/2011/11/personal_event_networks_and_v...
>>>>>> >
>>>>>> > >> bob wyman
>>>>>> >
>>>>>> > >> On Mon, Nov 28, 2011 at 1:16 PM, Bob Wyman <[email protected]> wrote:
>>>>>> >
>>>>>> > >>> See: http://www.eventedapi.org/spec
>>>>>> >
>>>>>> > >>> As we consider what can be done to move PubSubHubbub forward, it might
>>>>>> > >>> make sense to take a look at some other protocols that folk have defined to
>>>>>> > >>> determine if there is anything in them that PubSubHubbub should
>>>>>> > >>> implement or if they do things better than PSHB does. The folk at Kynetx
>>>>>> > >>> (http://apps.kynetx.com/) have been building up a PSHB-like system for
>>>>>> > >>> some time now... I'm not sure I understand why PSHB wouldn't, in fact,
>>>>>> > >>> serve their needs.
>>>>>> >
>>>>>> > >>> bob wyman
>>>>>> >
>>>>>> > --
>>>>>> > Jeff Lindsay
>>>>>> > http://progrium.com
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>> --
>> Jeff Lindsay
>> http://progrium.com
>>
>
>


-- 
Jeff Lindsay
http://progrium.com
