Hi,

On Mon, Jun 18, 2012 at 9:02 PM, Michael Dürig wrote:
> - We can implement the polling approach (using a 0 timeout) but also have
> the option to do blocking. Since this can be directly delegated to the
> Microkernel (waitForCommit) the added complexity for this is minimal.

The complexity is still there; it's just one level below. Personally I'd
rather drop the waitForCommit() method unless we really do have a hard
use case for that functionality. The JCR spec doesn't mandate it.

> - With the polling approach we offload more complexity to the consumer of
> the API since one has to decide on a reasonable poll interval which might be
> a difficult trade off between latency and server load.

In my experience the client is usually in the best position to judge how
timely it needs the observation events to be. The majority of
observation use cases I know don't need immediate, sub-second triggering
of events. Poll intervals of once a second, once a minute or once an
hour are typically perfectly fine. In fact using blocking calls for
observation is even a bit troublesome, as it can easily lead to tricky
race conditions when many listeners get triggered at exactly the same
time.

> - ChangeSet is just a container carrying the trees as they were after and
> before the change. So this is very close to the diffing approach you
> describe only a bit more explicit. Also ChangeSet is the place where
> additional information like change set meta data could live. I'm close to
> certain that we will need something along these lines (i.e. userData,
> timestamps, user who initiated that change, session id of the originating
> session).

The reason why I worry about the ChangeSet concept is that it implies
that each commit() produces a separate ChangeSet that then gets
delivered to each observation listener for processing. This is
troublesome for two key reasons:

1) Performance: Consider a large cluster that supports lots of
concurrent writes hitting all cluster nodes. We should be able to
support at least hundreds or thousands of commits per second on such
systems, and ideally the only limit here would be the amount of
available hardware.
With the ChangeSet concept each of those commits would result in a
separate waitForChanges() return value, which would cause event queues
to grow without bound if any one of the listeners can't keep up with
the stream of incoming changes. The poll+diff approach avoids that
problem since a listener only sees the combined set of changes across
the polling interval.

2) Linearity: Our overall design explicitly allows concurrent commits
that are only later merged together. This makes the concept of a
"previous" or "following" ChangeSet somewhat troublesome. You could
avoid that trouble by interpreting all concurrent commits from another
cluster node as a single merge ChangeSet, but then you already lose
per-commit metadata. Again the poll+diff approach avoids this problem
since it doesn't care how and from where changes entered the latest
visible state of the tree.

> - The approach aligns neatly with the JCR features: implement observation
> using blocking calls and implement journalling by using non blocking calls.

There's no concept of blocking calls for observation in JCR.

BR,

Jukka Zitting
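
For illustration only, the poll+diff idea discussed above could be sketched roughly as follows. This is not Oak or MicroKernel code: SnapshotStore and PollingListener are hypothetical names invented for this sketch, and a flat path-to-value map stands in for the content tree. The point it tries to show is that a listener keeps a single reference to the last state it saw, so any number of commits between two polls collapses into one combined diff and nothing queues up per commit.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory store; each commit publishes a new immutable snapshot. */
final class SnapshotStore {
    private volatile Map<String, String> head = Collections.emptyMap();

    /** Sets one path (or removes it when value is null) and publishes the result. */
    synchronized void commit(String path, String value) {
        Map<String, String> next = new HashMap<>(head);
        if (value == null) {
            next.remove(path);
        } else {
            next.put(path, value);
        }
        head = Collections.unmodifiableMap(next);
    }

    /** Returns the latest visible snapshot. */
    Map<String, String> head() {
        return head;
    }
}

/** Hypothetical listener that remembers only the last snapshot it has seen. */
final class PollingListener {
    private final SnapshotStore store;
    private Map<String, String> lastSeen;

    PollingListener(SnapshotStore store) {
        this.store = store;
        this.lastSeen = store.head();
    }

    /** Returns path -> "added" | "changed" | "removed" since the previous poll. */
    Map<String, String> poll() {
        Map<String, String> current = store.head();
        Map<String, String> changes = new TreeMap<>();
        for (Map.Entry<String, String> e : current.entrySet()) {
            String old = lastSeen.get(e.getKey());
            if (old == null) {
                changes.put(e.getKey(), "added");
            } else if (!old.equals(e.getValue())) {
                changes.put(e.getKey(), "changed");
            }
        }
        for (String path : lastSeen.keySet()) {
            if (!current.containsKey(path)) {
                changes.put(path, "removed");
            }
        }
        // Only the latest state is retained; intermediate commits are not kept.
        lastSeen = current;
        return changes;
    }
}
```

In this sketch the caller picks the poll interval (for example by scheduling poll() once a second or once a minute), and a path that is added and removed again between two polls never shows up at all. The trade-off is the one noted above: per-commit metadata is not available from the diff.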
