Okay, I understand your position and perspective on this better now. By the way, are you in the area (SF)? It would be interesting to discuss topic- vs. content-based subscriptions in person, because I have thought about and worked with this a lot, but not in those terms.
-jeff

On Tue, Nov 29, 2011 at 2:06 PM, Bob Wyman <[email protected]> wrote:

> On Mon, Nov 28, 2011 at 8:45 PM, Jeff Lindsay <[email protected]> wrote:
>
>>> The idea was that the hub should publish Atom entries and only Atom
>>> entries. Of course, the entries would contain atom:source elements to show
>>> the feeds with which they were associated. Also, the hub should do
>>> de-duping to ensure that any particular entry isn't sent more than once.
>>
>> Yeah, I get the reasoning behind Atom and I understand its more general
>> use. The problem is in order to make something useful and easy to adopt,
>> you need to really facilitate what people are already doing and are
>> familiar with. Not everybody wants to work with Atom, despite all its
>> benefits. Having Atom as a representation or as a possible payload is
>> great, but depending on its semantics, forcing it to be required for PSHB
>> to be useful is not a great idea... or at least not a pragmatic one IMO.
>>
>>> We could build all the above things very easily based on systems that
>>> publish Atom feeds and allow content-based (query-based) subscriptions.
>>
>> Call me crazy, but I'm in love with the Unix philosophy of doing one
>> thing well and designing for composition of more complex systems from
>> simple parts.
>
> "Designing for composition of more complex systems from simple parts" is
> an excellent goal. The problem is that in order to facilitate composition,
> you must have some idea of what kinds of complex systems you're going to
> compose. Given the application domain under discussion (Publish/Subscribe
> even if some other name is used), the problem here is that we know from
> many long years of experience that it is difficult to build a content-based
> system on top of a topic-based system yet it is trivial to build a
> topic-based system on top of a content-based system. It is important where
> you start when designing systems. Things get path-dependent very quickly.
>
> The problem is that design decisions made to facilitate topic-based system
> construction tend to make harder the job of building content-based systems.
> Take, for example, the regular discussion of "firehoses" which are almost
> always a common subject of discussion with topic-based systems but are
> generally irrelevant when discussing content-based systems. A firehose,
> which adds complexity to the topic-based implementation, is almost always
> needed when people want to do any kind of content-based work on top of a
> topic-based system. (That can include either real-time filtering or dumping
> of data into a database for later "content-based" retrieval or searching.)
> A firehose is simply a mechanism to de-mux or merge together the many
> topic-based streams that were created in order to provide a topic-based
> subscription model. If you start with a topic-based system, you almost
> always need to construct firehoses in order to make content-based routing
> possible. On the other hand, if you start with a content-based system and
> have "topic" as an attribute of each published item, then it is trivial to
> create "topic" streams since they are simply single-attribute subscriptions
> keyed on the "topic" attribute.
>
> If you start with a content-based model but want topic-based, then instead
> of subscribing to topic "foobar" you assume that all published items have
> an attribute named "topic" and you subscribe to "topic = 'foobar'". A
> "topic-based" system is thus nothing more than the most simple use of a
> content-based system. Of course, the advantage of using a trivial
> content-based interface to emulate a topic-based system is that you can
> then easily expand the capability of the base system to support more
> complex filters or queries. You can go from just a single attribute and
> exact-match to allowing full Boolean expressions, etc.
> without making a significant change to the subscription interface -- the
> changes are only to the subscription query syntax and those changes can all
> produce proper supersets of the trivial syntax.
>
> What I wonder is what, if any, benefit comes from baking "topic-based"
> into the subscription interface? Given that the alternative provides such
> flexibility down the road, what significant advantage do you get from
> limiting the system's expressiveness up-front?
>
>> Queries and filters, to me, are out of the scope of this protocol,
>> despite being very useful.
>
> If you see my reasoning in the paragraphs above, you won't be surprised
> that I claim that in order to build a topic-based system, you already need
> to build "Queries and Filters." The only difference is that if you build
> something like PSHB, you are building a very simple filter language that
> happens to be hard to extend. When people subscribe to topic
> "http://example.com/feed" it is EXACTLY the same, semantically, as
> subscribing using the query "topic = 'http://example.com/feed'"... There
> is no significant introduction of complexity that results from going from
> topic-based to content-based -- only a much easier path to doing more
> interesting things in the future. (i.e. "topic='http://example.com/feed' AND
> content='foobar'" is just a step away...)
>
>> The reason is that anybody can create a subscriber or relay (perhaps even
>> a hub) that happens to do that filtering in its implementation.
>
> Yes, anyone can build yet another aggregator to either consume firehoses
> or construct them and then filter them. But, just because a thing can be
> done, doesn't mean that we should insist that it be done -- unless there is
> a good reason not to allow alternatives. In this case, I can't see that
> there are.
> Building the basic system using the model of a trivial content-based
> system doesn't make it any more difficult to build other hubs or relays
> that can do arbitrary processing, however, it gives us the option of
> allowing a single system, with a standard interface, to do both the
> simple and the complex work in an integrated and more efficient manner.
>
>> That said, I'm assuming this was more just to defend Atom and
>> content-based subscriptions, to which I would say: those examples should be
>> possible *if* you use Atom as your content container and have access to or
>> can build a subscription querier node. But it should also be possible if
>> the content is *not* Atom using the same approach of putting the filtering
>> in an intermediate node (or potentially being an implementation detail of a
>> hub).
>>
>> I just think the core should be simple and neutral, allowing more
>> specialized extensions, additions, and combinability. And for that, my
>> experience (and general observations) suggest that we should focus on
>> content-type neutral HTTP-based mechanisms.
>>
>> -jeff
>>
>>> bob wyman
>>>
>>> On Mon, Nov 28, 2011 at 6:33 PM, Julien Genestoux <[email protected]> wrote:
>>>
>>>> Jeff, do you think you could help getting the folks at GitHub, Twilio,
>>>> FreshBooks, Pusher to come in here and participate? What would they love to
>>>> see in and out of PubSubHubbub so that it fits their needs?
>>>>
>>>> Bob, that's an interesting point. You said you wanted PSHB to be about
>>>> entries rather than feeds. I'm not sure I understand this. I guess you
>>>> would still need to subscribe to an endpoint that would emit a collection
>>>> of entries, right?
>>>>
>>>> Julien
>>>>
>>>> On Tue, Nov 29, 2011 at 12:16 AM, Bob Wyman <[email protected]> wrote:
>>>>
>>>>> On Mon, Nov 28, 2011 at 5:31 PM, Julien <[email protected]> wrote:
>>>>>
>>>>> > PubSubHubbub is currently too
>>>>> > much oriented toward data feeds
>>>>>
>>>>> Personally, I think that PSHB "went wrong" when folk insisted that it
>>>>> support RSS instead of just Atom. In the Atom format we had gone to great
>>>>> trouble to ensure that "entry" was a top-level item and that entries had
>>>>> the same semantics whether they were inside feeds or on their own. (Not
>>>>> the case with RSS.) One of the reasons that I worked to make this the
>>>>> case was that I've been wanting to do pubsub with arbitrary content for
>>>>> many years... The idea was that an Atom entry is a reasonable wrapper or
>>>>> container for just about any content you might want to publish. (MIME
>>>>> types distinguish the content type.) Thus, a system for syndicating Atom
>>>>> entries could be used to reasonably syndicate just about anything. But,
>>>>> when support for RSS feeds came into the PSHB spec, all sorts of things
>>>>> got confused... PSHB should have been about the entries, not the feeds...
>>>>>
>>>>> bob wyman
>>>>>
>>>>> On Mon, Nov 28, 2011 at 5:31 PM, Julien <[email protected]> wrote:
>>>>>
>>>>>> Jeff, thanks for sharing so quickly :)
>>>>>> I perfectly agree and acknowledge that PubSubHubbub is currently too
>>>>>> much oriented toward data feeds, and content in general, while it's
>>>>>> just a sub-case.
>>>>>> I also think the "realtime" aspect of things doesn't matter that much,
>>>>>> and is just a consequence of the "push" design. When you trigger
>>>>>> events, there is no reason to do it later rather than sooner.
>>>>>>
>>>>>> The spec should evolve into something that works as well for events as
>>>>>> for content.
>>>>>> It should be "subscribe to a web resource, get events".
>>>>>> [this can be decorated in any way people want to work with feeds, with
>>>>>> publisher/hubs merged or distinct, with no data... etc.]
>>>>>>
>>>>>> Julien
>>>>>>
>>>>>> On Nov 28, 11:21 pm, Jeff Lindsay <[email protected]> wrote:
>>>>>> > On Mon, Nov 28, 2011 at 2:02 PM, Julien Genestoux <[email protected]> wrote:
>>>>>> > > Jeff, please do share your feelings. Help us make PubSubHubbub better!
>>>>>> > > Bob, obviously PubSubHubbub should be less about blogging and/or news. I
>>>>>> > > started a thread about supporting any kind of arbitrary data, and this is
>>>>>> > > what I had in mind as a way to support any kind of content, and any type of
>>>>>> > > updates (with or without payload).
>>>>>> >
>>>>>> > To this point, my main feeling is that, yes, PSHB is focused too much on
>>>>>> > content. While I think this is useful (as it's been the primary use case),
>>>>>> > it's not a wide enough net to really have critical mass as a project. I
>>>>>> > originally thought it was good that it was very focused and didn't solve
>>>>>> > *my* particular problems. I also thought it was good it focused on a
>>>>>> > tangible goal of making feeds more realtime. However, I think time has
>>>>>> > shown it was not enough to be a big enough deal to sustain momentum as a
>>>>>> > project.
>>>>>> >
>>>>>> > The problem is that this general problem PSHB solves has many different
>>>>>> > views/perspectives/languages. For example, it can be message oriented and
>>>>>> > talk about pubsub. Or it can be event oriented and talk about events etc
>>>>>> > (the perspective used by Phil and them). Or it can even be thought of as
>>>>>> > callbacks or hooks (webhooks). There are other similar concepts with
>>>>>> > different language as well: updates/notifications, observers, etc.
>>>>>> > The two main ones seem to be events vs messages/pubsub, and I'm not sure
>>>>>> > which one is generally considered more general than the other. Ultimately,
>>>>>> > technically, they're more or less the same thing, but I think the framing
>>>>>> > makes a *big* difference.
>>>>>> >
>>>>>> > Anyway, that's the start of my ideas around this.
>>>>>> >
>>>>>> > -jeff
>>>>>> >
>>>>>> > > Julien
>>>>>> >
>>>>>> > > On Mon, Nov 28, 2011 at 9:33 PM, Bob Wyman <[email protected]> wrote:
>>>>>> >
>>>>>> > >> The site http://www.mostlybaked.com/ provides a number of quick sketches
>>>>>> > >> of applications that are things that I personally think should work well
>>>>>> > >> over PSHB if the focus of PSHB became less about blogging and more about
>>>>>> > >> the general case of publishing and subscribing to streams of data on the
>>>>>> > >> Internet. Also, Phil often talks about the kinds of things that he'd like
>>>>>> > >> to do with the EventedAPI on his blog. ex:
>>>>>> > >> http://www.windley.com/archives/2011/11/personal_event_networks_and_v...
>>>>>> >
>>>>>> > >> bob wyman
>>>>>> >
>>>>>> > >> On Mon, Nov 28, 2011 at 1:16 PM, Bob Wyman <[email protected]> wrote:
>>>>>> >
>>>>>> > >>> See: http://www.eventedapi.org/spec
>>>>>> >
>>>>>> > >>> As we consider what can be done to move PubSubHubbub forward, it might
>>>>>> > >>> make sense to take a look at some other protocols that folk have defined to
>>>>>> > >>> determine if there is anything in them that PubSubHubbub should
>>>>>> > >>> implement or if they do things better than PSHB does. The folk at Kynetx
>>>>>> > >>> (http://apps.kynetx.com/) have been building up a PSHB-like system for
>>>>>> > >>> some time now...
>>>>>> > >>> I'm not sure I understand why PSHB wouldn't, in fact,
>>>>>> > >>> serve their needs.
>>>>>> >
>>>>>> > >>> bob wyman
>>>>>> >
>>>>>> > --
>>>>>> > Jeff Lindsay
>>>>>> > http://progrium.com
>>
>> --
>> Jeff Lindsay
>> http://progrium.com
>

--
Jeff Lindsay
http://progrium.com
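
For anyone following the topic vs. content argument in the quoted thread, here is a minimal sketch of Bob's claim that a topic subscription is just the query "topic = '...'" on a content-based hub. Everything in it is hypothetical and for illustration only: ContentHub, subscribe, subscribe_topic and publish are made-up names, not part of the PSHB spec or any real hub, and items are modeled as plain attribute dictionaries rather than Atom entries.

```python
from typing import Callable, Dict, List, Tuple

Item = Dict[str, str]
Predicate = Callable[[Item], bool]
Callback = Callable[[Item], None]


class ContentHub:
    """Hypothetical in-memory hub: every subscription is a predicate over item attributes."""

    def __init__(self) -> None:
        self._subs: List[Tuple[Predicate, Callback]] = []

    def subscribe(self, predicate: Predicate, callback: Callback) -> None:
        # Content-based subscription: deliver any item the predicate accepts.
        self._subs.append((predicate, callback))

    def subscribe_topic(self, topic: str, callback: Callback) -> None:
        # Topic-based subscription is the single-attribute query "topic = '<topic>'".
        self.subscribe(lambda item: item.get("topic") == topic, callback)

    def publish(self, item: Item) -> None:
        for predicate, callback in self._subs:
            if predicate(item):
                callback(item)


hub = ContentHub()
by_topic: List[Item] = []
by_query: List[Item] = []
firehose: List[Item] = []

feed = "http://example.com/feed"
hub.subscribe_topic(feed, by_topic.append)
# Same interface, richer query: topic = feed AND content = 'foobar'.
hub.subscribe(
    lambda i: i.get("topic") == feed and i.get("content") == "foobar",
    by_query.append,
)
# A "firehose" is the subscription whose predicate accepts everything.
hub.subscribe(lambda i: True, firehose.append)

hub.publish({"topic": feed, "content": "hello"})
hub.publish({"topic": feed, "content": "foobar"})
hub.publish({"topic": "http://example.com/other", "content": "foobar"})

print(len(by_topic), len(by_query), len(firehose))  # prints: 2 1 3
```

In this sketch the firehose needs no extra machinery, which is the asymmetry Bob describes. Whether filtering belongs in the hub or in an intermediate node, as Jeff argues, is a separate design question that the sketch does not settle.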
