On Mon, Nov 28, 2011 at 8:45 PM, Jeff Lindsay <[email protected]> wrote:

>> The idea was that the hub should publish Atom entries and only Atom
>> entries. Of course, the entries would contain atom.source elements to show
>> the feeds with which they were associated. Also, the hub should do
>> de-duping to ensure that any particular entry isn't sent more than once.
>>
>
> Yeah, I get the reasoning behind Atom and I understand it's more general
> use. The problem is in order to make something useful and easy to adopt,
> you need to really facilitate what people are already doing and are
> familiar with. Not everybody wants to work with Atom, despite all its
> benefits. Having Atom as a representation or as a possible payload is
> great, but depending on its semantics, forcing it to be required for PSHB
> to be useful is not a great idea... or at least a pragmatic one IMO.
>
>
>> We could build all the above things very easily based on systems that
>> publish Atom feeds and allow content-based (query-based) subscriptions.
>>
>
> Call me crazy, but I'm in love with the Unix philosophy of doing one thing
> well and designing for composition of more complex systems from simple
> parts.
>
"Designing for composition of more complex systems from simple parts" is an
excellent goal. The problem is that in order to facilitate composition, you
must have some idea of what kinds of complex systems you're going to
compose. Given the application domain under discussion (Publish/Subscribe
even if some other name is used), the problem here is that we know from
many long years of experience that it is difficult to build a content-based
system on top of a topic-based system yet it is trivial to build a
topic-based system on top of a content-based system. It is important where
you start when designing systems. Things get path-dependent very quickly.

The problem is that design decisions made to facilitate topic-based system
construction tend to make the job of building content-based systems harder.
Take, for example, "firehoses," which are a regular subject of discussion
with topic-based systems but are generally irrelevant when discussing
content-based systems. A firehose,
which adds complexity to the topic-based implementation, is almost always
needed when people want to do any kind of content-based work on top of a
topic-based system. (That can include either real-time filtering or dumping
of data into a database for later "content-based" retrieval or searching.)
A firehose is simply a mechanism to mux, or merge together, the many
topic-based streams that were created in order to provide a topic-based
subscription model. If you start with a topic-based system, you almost
always need to construct firehoses in order to make content-based routing
possible. On the other hand, if you start with a content-based system and
have "topic" as an attribute of each published item, then it is trivial to
create "topic" streams since they are simply single-attribute subscriptions
keyed on the "topic" attribute.
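
As a rough illustration of that asymmetry, here is a minimal Python sketch
of a firehose bolted onto a topic-based hub. The `TopicHub` class, the
`build_firehose` helper, and the topic names are all hypothetical and are
not taken from the PSHB spec. The point is only that content filtering
needs every topic stream merged first, and that any topic left out of the
merge is silently missed.

```python
from collections import defaultdict


class TopicHub:
    """Hypothetical topic-based hub: each subscription names exactly one topic."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, item):
        for callback in self.subscribers[topic]:
            callback(item)


def build_firehose(hub, topics, sink):
    """Merge many topic streams into one by subscribing to every listed topic."""
    for topic in topics:
        # Tag each item with its topic; the merged stream would lose it otherwise.
        hub.subscribe(topic, lambda item, t=topic: sink({"topic": t, **item}))


hub = TopicHub()
matches = []


def content_filter(item):
    # The "content-based" work happens outside the hub, on the merged stream.
    if "foobar" in item["content"]:
        matches.append(item)


build_firehose(hub, ["feed-a", "feed-b"], content_filter)
hub.publish("feed-a", {"content": "foobar news"})
hub.publish("feed-b", {"content": "unrelated"})
hub.publish("feed-c", {"content": "foobar missed"})  # topic not in the firehose
```

After these calls `matches` holds only the item from "feed-a". The item
published on "feed-c" also mentions "foobar", but it never reaches the
filter because that topic was not part of the merge.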

If you start with a content-based model but want topic-based, then instead
of subscribing to topic "foobar" you assume that all published items have
an attribute named "topic" and you subscribe to "topic = 'foobar'". A
"topic-based" system is thus nothing more than the most simple use of a
content-based system. Of course, the advantage of using a trivial
content-based interface to emulate a topic-based system is that you can
then easily expand the capability of the base system to support more
complex filters or queries. You can go from just a single attribute and
exact-match to allowing full Boolean expressions, etc. without making a
significant change to the subscription interface -- the changes are only to
the subscription query syntax and those changes can all produce proper
supersets of the trivial syntax.
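
The argument above can be sketched in a few lines of Python. This is only
an illustration under assumptions: `ContentHub`, `attr_equals`,
`attr_contains`, and `all_of` are hypothetical names, and published items
are assumed to be plain dictionaries carrying a "topic" attribute.

```python
class ContentHub:
    """Hypothetical content-based hub: a subscription is a predicate over items."""

    def __init__(self):
        self.subscriptions = []  # list of (predicate, callback) pairs

    def subscribe(self, predicate, callback):
        self.subscriptions.append((predicate, callback))

    def publish(self, item):
        for predicate, callback in self.subscriptions:
            if predicate(item):
                callback(item)


def attr_equals(name, value):
    """Exact match on one attribute: the 'topic-based' case."""
    return lambda item: item.get(name) == value


def attr_contains(name, text):
    """Substring match on one attribute."""
    return lambda item: text in item.get(name, "")


def all_of(*predicates):
    """Boolean AND over any number of predicates."""
    return lambda item: all(p(item) for p in predicates)


hub = ContentHub()
by_topic, by_topic_and_content = [], []

# Topic-style subscription: a single-attribute, exact-match query.
hub.subscribe(attr_equals("topic", "foobar"), by_topic.append)

# Richer subscription through the same interface; only the query changed.
hub.subscribe(
    all_of(attr_equals("topic", "foobar"), attr_contains("content", "hello")),
    by_topic_and_content.append,
)

hub.publish({"topic": "foobar", "content": "hello world"})
hub.publish({"topic": "foobar", "content": "goodbye"})
hub.publish({"topic": "other", "content": "hello again"})
```

Here `by_topic` receives both "foobar" items and `by_topic_and_content`
receives only the first one. The `subscribe` call has the same shape in
both cases, which is the sense in which the topic-only form is the
simplest use of the content-based interface.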

What I wonder is what, if any, benefit comes from baking "topic-based" into
the subscription interface? Given that the alternative provides such
flexibility down the road, what significant advantage do you get from
limiting the system's expressiveness up-front?


> Queries and filters, to me, are out of the scope of this protocol, despite
> being very useful.
>
If you see my reasoning in the paragraphs above, you won't be surprised
that I claim that in order to build a topic-based system, you already need
to build "Queries and Filters."  The only difference is that if you build
something like PSHB, you are building a very simple filter language that
happens to be hard to extend. When people subscribe to topic
"http://example.com/feed" it is EXACTLY the same, semantically, as
subscribing using the query "topic = 'http://example.com/feed'"... There is
no significant introduction of complexity that results from going from
topic-based to content-based -- only a much easier path to doing more
interesting things in the future. (i.e. "topic='http://example.com/feed' AND
content='foobar'" is just a step away...)
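
To show how small that step is, here is a minimal Python sketch of a query
syntax in which a bare topic subscription is rewritten to its one-clause
query form. The syntax, `parse_query`, `topic_to_query`, and `matches` are
assumptions made for illustration and are not part of PSHB. The sketch
supports only exact-match clauses joined by " AND ", and it does not handle
values that themselves contain " AND " or a single quote.

```python
import re

# One clause of the assumed syntax: attribute = 'value'
CLAUSE = re.compile(r"\s*(\w+)\s*=\s*'([^']*)'\s*")


def parse_query(query):
    """Parse "a = 'x' AND b = 'y'" into a list of (attribute, value) pairs."""
    clauses = []
    for part in query.split(" AND "):
        match = CLAUSE.fullmatch(part)
        if match is None:
            raise ValueError(f"unsupported clause: {part!r}")
        clauses.append((match.group(1), match.group(2)))
    return clauses


def topic_to_query(topic_url):
    """Rewrite a topic-style subscription as the equivalent one-clause query."""
    return f"topic = '{topic_url}'"


def matches(query, item):
    """True if every clause of the query exactly matches the item."""
    return all(item.get(attr) == value for attr, value in parse_query(query))


item = {"topic": "http://example.com/feed", "content": "foobar"}

simple = topic_to_query("http://example.com/feed")
extended = "topic='http://example.com/feed' AND content='foobar'"

print(matches(simple, item))
print(matches(extended, item))
print(matches(extended, {"topic": "http://example.com/feed", "content": "x"}))
```

The topic-only query and the two-clause query go through the same parser
and the same matcher; the second simply has one more clause.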


> The reason is that anybody can create a subscriber or relay (perhaps even
> a hub) that happens to do that filtering in its implementation.
>
Yes, anyone can build yet another aggregator to either consume firehoses or
construct them and then filter them. But just because a thing can be done
doesn't mean that we should insist that it be done -- unless there is a
good reason not to allow alternatives. In this case, I can't see that there
is one. Building the basic system using the model of a trivial content-based
system doesn't make it any more difficult to build other hubs or relays
that can do arbitrary processing. However, it gives us the option of
allowing a single system, with a standard interface, to do both the simple
and the complex work in an integrated and more efficient manner.

>
> That said, I'm assuming this was more just to defend Atom and
> content-based subscriptions, to which I would say: those examples should be
> possible *if* you use Atom as your content container and have access to or
> can build a subscription querier node. But it should also be possible if
> the content is *not* Atom using the same approach of putting the filtering
> in an intermediate node (or potentially being an implementation detail of a
> hub).
>
> I just think the core should be simple and neutral, allowing more
> specialized extensions, additions, and combinability. And for that, my
> experience (and general observations) suggest that we should focus on
> content-type neutral HTTP-based mechanisms.
>
> -jeff
>
>
>>
>> bob wyman
>>
>>
>> On Mon, Nov 28, 2011 at 6:33 PM, Julien Genestoux <
>> [email protected]> wrote:
>>
>>> Jeff, do you think you could help getting the folks at GitHub, Twilio,
>>> FreshBooks, Pusher to come in here and participate? What would they love to
>>> see in and out of PubSubHubbub so that it fits their needs?
>>>
>>> Bob, that's an interesting point. You said you wanted PSHB to be about
>>> entries rather than feeds. I'm not sure I understand this. I guess you
>>> would still need to subscribe to an endpoint that would emit a collection
>>> of entries, right?
>>>
>>> Julien
>>>
>>>
>>>
>>> On Tue, Nov 29, 2011 at 12:16 AM, Bob Wyman <[email protected]> wrote:
>>>
>>>> On Mon, Nov 28, 2011 at 5:31 PM, Julien <[email protected]>
>>>>  wrote:
>>>>
>>>> > PubSubHubbub is currently too
>>>> > much oriented toward data feeds
>>>> Personally, I think that PSHB "went wrong" when folk insisted that it
>>>> support RSS instead of just Atom. In the Atom format we had gone to great
>>>> trouble to ensure that "entry" was a top-level item and that entries had
>>>> the same semantics whether they were inside feeds or on their own. (Not the
>>>> case with RSS.) One of the reasons that I worked to make this the case was
>>>> that I've been wanting to do pubsub with arbitrary content for many
>>>> years... The idea was that an Atom entry is a reasonable wrapper or
>>>> container for just about any content you might want to publish. (MIME types
>>>> distinguish the content type.) Thus, a system for syndicating Atom entries
>>>> could be used to reasonably syndicate just about anything. But, when
>>>> support for RSS feeds came into the PSHB spec, all sorts of things got
>>>> confused... PSHB should have been about the entries, not the feeds...
>>>>
>>>> bob wyman
>>>>
>>>>
>>>>
>>>> On Mon, Nov 28, 2011 at 5:31 PM, Julien <[email protected]> wrote:
>>>>
>>>>> Jeff, thanks for sharing so quickly :)
>>>>> I perfectly agree and acknowledge that PubSubHubbub is currently too
>>>>> much oriented toward data feeds, and content in general, while it's
>>>>> just a sub-case.
>>>>> I also think the "realtime" aspect of things doesn't matter that much,
>>>>> and is just a consequence of the "push" design. When you trigger
>>>>> events, there is no reason to do it later than sooner.
>>>>>
>>>>> The spec should evolve into something that works as well for events as
>>>>> for content.
>>>>> It should be "subscribe to a web resource, get events". [this can be
>>>>> decorated in any way people want to work with feeds, with publisher/
>>>>> hubs merged or distinct, with no data... etc.]
>>>>>
>>>>> Julien
>>>>>
>>>>>
>>>>> On Nov 28, 11:21 pm, Jeff Lindsay <[email protected]> wrote:
>>>>> > On Mon, Nov 28, 2011 at 2:02 PM, Julien Genestoux <
>>>>> >
>>>>> > [email protected]> wrote:
>>>>> > > Jeff, please do share your feelings. Help us make PubSubHubbub better!
>>>>> > > Bob, obviously PubSubHubbub should be less about blogging and/or news. I
>>>>> > > started a thread about supporting any kind of arbitrary data, and this is
>>>>> > > what I had in mind as a way to support any kind of content, and any type of
>>>>> > > updates (with or without payload).
>>>>> >
>>>>> > To this point, my main feeling is that, yes, PSHB is focused too much on
>>>>> > content. While I think this is useful (as it's been the primary use case),
>>>>> > it's not a wide enough net to really have critical mass as a project. I
>>>>> > originally thought it was good that it was very focused and didn't solve
>>>>> > *my* particular problems. I also thought it was good it focused on a
>>>>> > tangible goal of making feeds more realtime. However, I think time has
>>>>> > shown it was not enough to be a big enough deal to sustain momentum as a
>>>>> > project.
>>>>> >
>>>>> > The problem is that this general problem PSHB solves has many different
>>>>> > views/perspectives/languages. For example, it can be message oriented and
>>>>> > talk about pubsub. Or it can be event oriented and talk about events etc
>>>>> > (the perspective used by Phil and them). Or it can even be thought of as
>>>>> > callbacks or hooks (webhooks). There are other similar concepts with
>>>>> > different language as well: updates/notifications, observers, etc. The two
>>>>> > main ones seem to be events vs messages/pubsub, and I'm not sure which one
>>>>> > is generally considered more general than the other. Ultimately, technically,
>>>>> > they're more or less the same thing, but I think the framing makes a *big*
>>>>> > difference.
>>>>> >
>>>>> > Anyway, that's the start of my ideas around this.
>>>>> >
>>>>> > -jeff
>>>>> >
>>>>> > > Julien
>>>>> >
>>>>> > > On Mon, Nov 28, 2011 at 9:33 PM, Bob Wyman <[email protected]> wrote:
>>>>> >
>>>>> > >> The site http://www.mostlybaked.com/ provides a number of quick sketches
>>>>> > >> of applications that are things that I personally think should work well
>>>>> > >> over PSHB if the focus of PSHB became less about blogging and more about
>>>>> > >> the general case of publishing and subscribing to streams of data on the
>>>>> > >> Internet. Also, Phil often talks about the kinds of things that he'd like
>>>>> > >> to do with the EventedAPI on his blog. ex:
>>>>> > >> http://www.windley.com/archives/2011/11/personal_event_networks_and_v...
>>>>> >
>>>>> > >> bob wyman
>>>>> >
>>>>> > >> On Mon, Nov 28, 2011 at 1:16 PM, Bob Wyman <[email protected]> wrote:
>>>>> >
>>>>> > >>> See: http://www.eventedapi.org/spec
>>>>> >
>>>>> > >>> As we consider what can be done to move PubSubHubbub forward, it might
>>>>> > >>> make sense to take a look at some other protocols that folk have defined to
>>>>> > >>> determine if there is anything in them that PubSubHubbub should
>>>>> > >>> implement or if they do things better than PSHB does. The folk at Kynetx
>>>>> > >>> (http://apps.kynetx.com/) have been building up a PSHB-like system for
>>>>> > >>> some time now... I'm not sure I understand why PSHB wouldn't, in fact,
>>>>> > >>> serve their needs.
>>>>> >
>>>>> > >>> bob wyman
>>>>> >
>>>>> > --
>>>>> > Jeff Lindsay http://progrium.com
>>>>>
>>>>
>>>>
>>>
>>
>
>
> --
> Jeff Lindsay
> http://progrium.com
>
