Re: [PubSub] collection node definition

Robin Collier Sat, 21 Nov 2009 07:08:11 -0800

> From: [email protected]
> To: [email protected]
> Date: Mon, 16 Nov 2009 18:01:18 +0100
> Subject: Re: [PubSub] collection node definition
> 
> On Sat, 2009-11-14 at 21:00 -0500, Robin Collier wrote:
> > [..]
> >
> > > The code-as-node model has several advantages over collections as
> > > they are defined now. It allows for more dynamic associations, or
> > > even content-based subscriptions (prospective search, like
> > > Collecta). You don't need to make the associations explicit,
> > > because the logic is in the system.
> > > 
> > While quite powerful, doesn't this make a system quite custom in
> > nature, to the point where discovery of capabilities and
> > configuration becomes quite useless?  It also implies access to the
> > backend system to be able to insert the custom logic.  In the end,
> > wouldn't this only be useful in a very closed system?  It strikes me
> > that this would not be too useful to an open system where you would
> > not want the users to be able to insert code on a server.
> 
> I assume you mean generic vs. custom where you talk about open vs.
> closed, because we are talking about a protocol here. Even though most
> of our specifications go into detail about possible business rules in
> implementations, the focus is still about the relation between inputs
> and outputs. This is especially true for publish-subscribe.
> 
Actually, I was referring to an open vs. closed system deployment.  Even 
though we are talking about a protocol, it is important to realize how it is
expected to be used.  If the expectation is an open system, i.e. accessible
by the general public (which would be a common case for an IM-based system),
then you can expect certain limitations on what users are able to define
and customize so the integrity of the system can be maintained.  If the
creation of nodes and collections of nodes is allowed as a service to the 
general public, then the rules and constraints need to be well defined.

On the other hand, in a closed system, or one in which the provider defines
and creates nodes and collections for general consumption, such
rules and constraints do not need to exist, except to allow different
implementations to work together (which is the point of the spec and true in
both scenarios).  So this is where I believe what you propose makes sense
and allows for very powerful custom logic, but I don't think that replaces
the truly 'open' case.

Personally, I see collections as a simple organizational tool for nodes
and I don't think they need to be any more than that.  My own messaging
background is in the enterprise using JMS, and that spec also allows for 
such a grouping (hierarchical only), although it is considered optional.
(Sidenote: This optional part kind of stinks, though, since it means you
cannot make your code vendor-neutral, as there is no discovery mechanism
for such functionality.)

> So yes, code-as-node is custom by nature. The assumption is that for
> systems that would benefit most from the concept of collections, having
> static configurations for parent-child node association is cumbersome at
> best. For most applications, it doesn't really matter what the precise
> associations of nodes are. The application just wants to subscribe to a
> particular set of updates.
> 
> Continuing with the example of modeling blog posts as leaf nodes and the
> whole blog as a collection, you would need to reconfigure either the
> parent or child node to associate the new blog post. If you do in-band
> publishing (i.e. you use a generic pubsub service and actively post to
> it from whatever blogging backend you have), you can also include the
> parent node when creating the new node for the blog post.
> 
> In that model, all associations need be stored explicitly and probably
> be kept in sync with the publishing entity (e.g. a web site service that
> keeps your blog). This becomes prohibitive with larger numbers,
> especially if you request the current configuration of the parent node.
> 
> At Mediamatic we have first used a generic publish-subscribe service
> implementation (Idavoll) to publish all objects (things) in our CMS as
> Atom entry documents. Then we realized that we couldn't feasibly
> implement anything like collections using that model. The current
> incarnation functions as an XMPP/PubSub interface to our backend,
> through API calls in both directions. Our collections are really
> code-as-node leaf nodes. It avoids all data duplication of the generic
> model and is more flexible. On top of that, we don't need to remember
> the associations, as this can be calculated at run time.
> 
> The only down side of the latter model is that you need some kind of
> outbox system for retrieving the last n items.
> 
> 
> > I guess I am thinking that this capability should be determined by
> > server implementations as an extended capability, and not necessarily
> > as part of the spec itself.
> 
> After talking to a bunch of people about modeling their application, I
> have the idea that a generic solution for collections is not practical
> for anything but toy projects. I'd like to be proven wrong on this.
> Please come up with useful, real-world examples where static node
> configuration is required and feasible to implement.
> 
> In any case, while I'd like to support the smaller use-case, I think
> that implementing the whole scheme of recording associations and
> traversing DAGs at publish time, along with checking authorization and
> more is not worth the trouble. My suggested alternative, though, was a
> breeze to implement. So I have to disagree here.
> 
> Collections as I see them are really just abstractions of content-based
> pubsub systems (hi Bob Wyman!), where you basically assign a fixed name
> (node identifier) to a particular query into the notification plasm. I
> am still interested in explicitly defining the minimally subscribe-able
> unit (like a blog post), so I want to pass along a specific node from
> where a notification originates, though.
> 

That is an interesting concept.  Correct me if I am wrong, but this sounds
an awful lot like a view in a relational database.  I am not sure I would
consider this to be a collection, though; it seems to me like another
concept, which would be better called an aggregation node.  I guess I would
distinguish them by defining a collection node as a collection of nodes,
whereas an aggregation node is a collection of items from multiple nodes.
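
To make the distinction concrete, here is a rough sketch in Python.  It is
only an illustration of the two ideas as I understand them, not something
taken from the spec or from any implementation; every name in it (Item,
CollectionNode, AggregationNode, notify) is made up for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass(frozen=True)
class Item:
    node: str     # hypothetical: id of the leaf node the item was published to
    payload: str  # hypothetical: the published content


@dataclass
class CollectionNode:
    """A collection of nodes: child associations are stored explicitly."""
    name: str
    children: Set[str] = field(default_factory=set)

    def matches(self, item: Item) -> bool:
        # Membership depends only on which node the item came from.
        return item.node in self.children


@dataclass
class AggregationNode:
    """A collection of items: a fixed name for a query, much like a view."""
    name: str
    query: Callable[[Item], bool]

    def matches(self, item: Item) -> bool:
        # Membership is computed at publish time; nothing is stored.
        return self.query(item)


def notify(item: Item, nodes: List) -> List[str]:
    """Return the names of the nodes whose subscribers would be notified."""
    return [n.name for n in nodes if n.matches(item)]


blog = CollectionNode("blog", {"post-1"})
about_xmpp = AggregationNode("about-xmpp", lambda i: "xmpp" in i.payload.lower())

print(notify(Item("post-1", "Hello world"), [blog, about_xmpp]))
print(notify(Item("post-2", "XMPP news"), [blog, about_xmpp]))
```

In this sketch the collection has to be reconfigured before items from
"post-2" reach its subscribers, while the aggregation node picks up matching
items from any node without storing an association.  Both still know the
originating node of each item, which would cover the point about passing
along the node a notification comes from.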

> ralphm
> 
                                          