On Thu, 2009-09-17 at 00:10 +0200, Fabio Forno wrote:
> 2009/9/16 Peter Saint-Andre:
> >> Agreed. The only reason I champion collection nodes as I do is that
> >> I don't know what "node as code" means. It might be better, but in the
> >> mean time the only solution I have is collection nodes.
> >
> > Ralph? ;-)
> >
>
> I try to guess ;) Though I don't see a direct connection with
> collection nodes, this is a concept we are exploring thanks to Ralph's
> implementation in wokkel which allows implementing custom logic behind
> a node. At present we all think to pubsub nodes as simple dispatchers
> that follow rules defined by node configuration and affiliations, i.e.
> we publish an event and, by applying those rules, the event is
> delivered as it is to a set of subscribers, always to the same ones
> for any event. Things become more interesting when we can customize
> the behavior of a node by writing our code processing a publish or
> delete of an item, so that we can transform the item itself before
> delivering it, aggregate the events, implement our delivery policy
> (e.g. content based delivery, or things like delivering to just the
> highest priority online subscriber). Perhaps the possibility of doing
> content based delivery is something which is similar to collection
> nodes, but this would require to specify some selection options in the
> subscription.

Yeah, mostly this. The connection with collection nodes is that they
both allow for subscribing to a node that will cause the subscriber to
receive notifications from events that can also be subscribed to more
specifically.

Let's make that more concrete:

Say I model my blog posts as individual nodes (e.g. 'blog:fosdem_2010').
You can subscribe to each of those separately, getting updates whenever
a post changes. Assume the payload is an Atom Entry Document. This in
itself is nice, and has use cases like keeping remote copies in sync,
e.g. to show summaries.

A problem with the above is that you have to know about the existence of
a particular post. You won't be notified about new ones. What you would
like is to subscribe to a node that will either yield notifications of
the fact that a new node (blog post) exists, or direct notifications of
the blog post itself.

In come collections. A collection node (as it is currently defined)
allows for associating nodes to it. E.g. the node 'blog:fosdem_2009'
would be associated with the collection node 'blog'. A subscriber can
choose between two models of subscription: 'nodes' or 'items'. The
former will send notifications about changed associations (new blog
posts), while the latter will simply make notifications from the
associated nodes also go to the subscriber of the collection. In that
case, the notifications carry a 'Collection' SHIM header that holds the
actual node subscribed to.

Another solution is 'code-as-node'. Basically the system will magically
also send out notifications to subscribers of 'blog' whenever
'blog:fosdem_2010' gets updated, or when 'blog:oscon_2010' appears. As
it is currently done (at least by me), 'blog' looks like a leaf node,
just like the others.

The code-as-node model has several advantages over collections as they
are defined now. It allows for more dynamic associations, or even
content-based subscriptions (prospective search, like Collecta).
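
To make the contrast a bit more tangible, here is a minimal Python
sketch of both models as a toy in-memory service. Everything in it
(ToyPubSub, Notification, the 'via' field, the example address) is
hypothetical and made up for illustration; it is neither wokkel's API
nor what XEP-0248 specifies. It only models the 'items' subscription
type, and 'via' merely stands in for whatever header would carry the
subscribed-to node.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class Notification:
    recipient: str  # subscriber address
    node: str       # node the item was actually published to
    via: str        # node the recipient subscribed to (assumed field)
    item: str       # payload, e.g. a serialized Atom entry


class ToyPubSub:
    """Toy in-memory model for illustration; not a real XMPP service."""

    def __init__(self) -> None:
        self.subscribers: Dict[str, Set[str]] = {}
        # Collection model: collection name -> explicitly associated nodes.
        self.members: Dict[str, Set[str]] = {}
        # Code-as-node model: node name -> predicate over (node, item).
        self.rules: Dict[str, Callable[[str, str], bool]] = {}

    def subscribe(self, node: str, jid: str) -> None:
        self.subscribers.setdefault(node, set()).add(jid)

    def associate(self, collection: str, node: str) -> None:
        # Every association has to be made explicitly.
        self.members.setdefault(collection, set()).add(node)

    def add_code_node(self, node: str,
                      rule: Callable[[str, str], bool]) -> None:
        # The rule decides at publish time which events match.
        self.rules[node] = rule

    def publish(self, node: str, item: str) -> List[Notification]:
        # Start with the node itself, then add every collection and
        # every code node that covers this publish.
        targets = {node}
        targets |= {c for c, nodes in self.members.items() if node in nodes}
        targets |= {n for n, rule in self.rules.items() if rule(node, item)}
        return [
            Notification(jid, node, via, item)
            for via in sorted(targets)
            for jid in sorted(self.subscribers.get(via, ()))
        ]


service = ToyPubSub()
# 'blog' as code: matches any node whose name starts with 'blog:'.
service.add_code_node("blog", lambda node, item: node.startswith("blog:"))
service.subscribe("blog", "reader@example.org")
service.subscribe("blog:fosdem_2010", "reader@example.org")

notes = service.publish("blog:fosdem_2010", "<entry/>")
new_post = service.publish("blog:oscon_2010", "<entry/>")
```

In this sketch, publishing to 'blog:oscon_2010' reaches the 'blog'
subscriber without any association having been set up, while the reader
subscribed to both 'blog' and 'blog:fosdem_2010' gets the same item
twice, once per subscription.
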
You don't need to make the associations explicit, because the logic is
in the system. On the other hand, there is no way to detect duplicates
other than looking at the payload, or maybe through service-wide unique
item identifiers.

Having some implementation experience with both, I am thinking we should
try to define collections more loosely in XEP-0248, allowing for making
'code-as-node' type nodes that act as a collection. I.e. notifications
would be sent out as the 'original' node, but include information on the
subscriptions that caused the notifications to be sent to the recipient.
Currently, I believe a combination of 'Collection' and 'SubID' headers
is ambiguous in some cases. It would be nice if we could simply send
along combinations of subscribed-to-node and SubID. Maybe in a new SHIM
header.

Also, pubsub is often useless without a way to retrieve previously
published items. For this, item requests would need to be allowed for
collection nodes. An implementation can decide for itself how to
actually implement this, but caching the last n items sent out for a
particular node comes to mind. In complicated use-cases, per
subscription. Think inboxes with a sliding window. As noted by Brian
Cully back in June [1], we would need to be able to represent items from
different nodes in one response. I could also imagine having an empty
result for such a query, triggering the sending of notifications for the
matching items asynchronously. This also prevents very large stanzas.

Some of the other concerns I noted earlier in this thread still need to
be looked into.

ralphm

[1] http://mail.jabber.org/pipermail/pubsub/2009-June/000227.html
