Re: Efficiently process observation event for local changes
fwiw: I think separating queues for listeners interested in local events from a queue for listeners interested in global events is a very promising approach.

Cheers
Michael

> On 23 Mar 2015, at 16:03, Chetan Mehrotra wrote:
> […]
Re: Efficiently process observation event for local changes
Related to this, I've created https://issues.apache.org/jira/browse/OAK-2683, which is about an issue that happens when the observation queue limit is reached.

Cheers,
Stefan

On 3/23/15 4:03 PM, "Chetan Mehrotra" wrote:
> […]
Re: Efficiently process observation event for local changes
After discussing this further with Marcel and Michael we came to the conclusion that we can achieve similar performance by making use of the persistent cache for storing the diff. This would require a slight change in the way we interpret the diff JSOP. It should not require any change in the current logic related to observation event generation. Opened OAK-2669 to track that.

One thing that we might still want to do is to use a separate queue size for listeners interested in local events only versus those which can work with external events. On a system like AEM there are ~180 listeners which listen for external changes and ~20 which only listen to local changes. So it makes sense to have bigger queues for such listeners.

Chetan Mehrotra

On Mon, Mar 23, 2015 at 4:09 PM, Michael Dürig wrote:
> […]
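The idea behind OAK-2669 - reuse the diff already computed at commit time by storing it in a persistent cache keyed by the revision pair - can be sketched roughly as below. The class and method names are hypothetical, not DocumentMK's actual API, and a real implementation would sit on Oak's persistent cache rather than an in-memory map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

// Hypothetical sketch of a diff cache keyed by a (fromRev, toRev) pair.
// All names are illustrative, not DocumentMK's actual API; the real
// implementation (OAK-2669) stores JSOP diffs in Oak's persistent cache.
class DiffCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final BiFunction<String, String, String> differ; // recomputes on a miss

    DiffCache(BiFunction<String, String, String> differ) {
        this.differ = differ;
    }

    private static String key(String fromRev, String toRev) {
        return fromRev + "->" + toRev;
    }

    // Called at commit time: remember the diff we already computed locally.
    void put(String fromRev, String toRev, String jsopDiff) {
        cache.put(key(fromRev, toRev), jsopDiff);
    }

    // Called by observation: use the cached diff, or recompute on a miss.
    String getOrCompute(String fromRev, String toRev) {
        return cache.computeIfAbsent(key(fromRev, toRev),
                k -> differ.apply(fromRev, toRev));
    }
}
```

The cache-miss path falls back to diffing again, which matches the "best effort" framing of the proposal.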
Re: Efficiently process observation event for local changes
On 23.3.15 11:03, Stefan Egli wrote:
> Going one step further we could also discuss completely moving the
> handling of the 'observation queues' to an actual messaging system.
> […]

Definitely something to try out, given someone finds the time for it. ;-) Mind you that some time ago I implemented persisting events to Apache Kafka [1], which wasn't greeted with great enthusiasm though...

OTOH the same concern regarding pushing the bottleneck to IO applies here. Furthermore, filtering the persisted events through access control is something we still need to figure out, as AC is a) session scoped and b) depends on the tree hierarchy.

Michael

[1] https://github.com/mduerig/oak-kafka
Re: Efficiently process observation event for local changes
Going one step further we could also discuss completely moving the handling of the 'observation queues' to an actual messaging system. Whether this would be embedded in an oak instance or shared between instances in an oak cluster might be a different question (the embedded variant would have fewer implications on the overall oak model, esp. also timing-wise). But the observation model quite exactly matches the publish-subscribe semantics - it actually matches pub-sub more than it fits the 'cache semantics' to me.

.. just saying ..

On 3/23/15 10:47 AM, "Michael Dürig" wrote:
> […]
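The pub-sub shape Stefan describes could look roughly like the toy sketch below: the repository acts as the single publisher and each listener gets its own bounded queue. All names are illustrative; an actual embedding of a messaging system (as in the oak-kafka experiment mentioned elsewhere in the thread) would replace the in-memory queues.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Toy publish-subscribe shape for observation: the repository is the one
// publisher, each listener a subscriber with its own bounded queue.
// Illustrative only; not an Oak, JMS or Kafka API.
class ObservationBus {
    private final List<BlockingQueue<String>> subscribers = new ArrayList<>();

    BlockingQueue<String> subscribe(int queueSize) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(queueSize);
        subscribers.add(queue);
        return queue;
    }

    // Deliver a change to every subscriber. offer() silently drops on a
    // full queue -- the point where Oak instead collapses events.
    void publish(String change) {
        for (BlockingQueue<String> queue : subscribers) {
            queue.offer(change);
        }
    }
}
```

A slow subscriber only backs up its own queue here, which is exactly the isolation the thread is after.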
Re: Efficiently process observation event for local changes
On 23.3.15 5:04, Chetan Mehrotra wrote:
> B - Proposed Changes
> ---
>
> 1. Move the notion of listening to local events to the Observer level - so
> upon any new change detected we only push the change to a given queue if it
> is local and the bound listener is only interested in local changes.
> Currently we push all changes, which later do get filtered out, but we can
> avoid doing that at the first level itself and keep queue content limited
> to local changes only.

I think there is no change needed in the Observer API itself as you can already figure out from the passed CommitInfo whether a commit is external or not. BTW please take care with the term "local" as there is also the concept of "session local" commits.

> 2. Attach the calculated diff as part of the commit info which is attached
> to the given change. […]
>
> 3. For listeners which are only interested in local events we can use a
> different queue size limit, i.e. allow larger queues for such listeners.
>
> Later we can also look into using a journal (or persistent queue) for
> local event processing.

Definitely something to try out. A few points to consider:

* There doesn't seem to be too much of a difference to me whether this is routed via a cache or directly attached to commits. Either way it adds additional memory requirements and churn, which need to be managed.

* When introducing persisted queuing we need to be careful not to just move the bottleneck to IO.

* An eventual implementation should not break the fundamental design. Either hide it in the implementation or find a clean way to put this into the overall design.

Michael
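Michael's point that the Observer API already suffices can be illustrated with a small decorator. The types below are simplified stand-ins, not the real org.apache.jackrabbit.oak.spi.commit classes; in the Oak of this era an external commit is recognizable from the CommitInfo handed to contentChanged() (historically a null CommitInfo signalled an external change).

```java
// Simplified stand-ins for Oak's Observer/CommitInfo pair, not the real
// org.apache.jackrabbit.oak.spi.commit classes. An external commit is
// recognizable from the CommitInfo handed to contentChanged()
// (historically null signalled an external change).
interface Observer {
    void contentChanged(String root, CommitInfo info);
}

class CommitInfo {
    final boolean external;
    CommitInfo(boolean external) { this.external = external; }
}

// Decorator that forwards only local (non-external) commits, so the queue
// behind it never fills up with external changes the listener would drop.
class LocalOnlyObserver implements Observer {
    private final Observer delegate;

    LocalOnlyObserver(Observer delegate) { this.delegate = delegate; }

    @Override
    public void contentChanged(String root, CommitInfo info) {
        if (info != null && !info.external) { // filter at the first level
            delegate.contentChanged(root, info);
        }
    }
}
```

This realizes proposal 1 without touching the Observer interface: the filtering happens before anything is queued.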
Re: Efficiently process observation event for local changes
Just to clarify - by mistake I used 'dropped' above. There is no way for a normally running system to drop events. What I meant was that when the queue starts getting full, the BackgroundObserver collapses individual changes into bigger ones. In doing that, events which were considered local would now be treated as external, due to which listeners which are only interested in local events would miss those changes.

So again, Oak does not lose/drop events. What it does do is convert a local event and treat it as an external event.

Chetan Mehrotra

On Mon, Mar 23, 2015 at 9:34 AM, Chetan Mehrotra wrote:
> […]
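The collapsing behaviour described above - the queue fills up, changes get merged, and the merged change can no longer be attributed to a local commit - can be sketched as follows. This is illustrative only; the real logic lives in Oak's BackgroundObserver, and all names here are made up.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of the collapsing behaviour: when the bounded queue is full, the
// newest entries are merged into one item that is marked external, so a
// local-only listener will skip it. Illustrative only; the real logic
// lives in Oak's BackgroundObserver.
class CollapsingQueue {
    static class Change {
        final String diff;
        final boolean external;
        Change(String diff, boolean external) {
            this.diff = diff;
            this.external = external;
        }
    }

    private final Deque<Change> queue = new ArrayDeque<>();
    private final int limit;

    CollapsingQueue(int limit) { this.limit = limit; }

    void add(Change c) {
        if (queue.size() < limit) {
            queue.addLast(c);
        } else {
            // Full: merge the last queued change with the new one and mark
            // the merged change external -- this is where "local" is lost.
            Change last = queue.removeLast();
            queue.addLast(new Change(last.diff + c.diff, true));
        }
    }

    Change poll() { return queue.pollFirst(); }
    int size() { return queue.size(); }
}
```

No event is dropped, but after a merge the local-only listener can no longer see the change, which is exactly the failure mode the thread describes.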
Efficiently process observation event for local changes
Hi Team,

Currently in some cases we are seeing issues where observation events for local changes are getting dropped due to the observation queue getting full, especially on MongoMK based deployments. This happens mainly due to slowness in generating the diff on Mongo.

Some of the observation listeners are only interested in local events. For example, the listener registered by the Workflow component on an AEM instance listens for local events only, to trigger workflows. So for such types of listeners we have the following requirements.

A - Design Constraints
--

1. Workflow is relying ONLY on local changes.

2. Workflow misses processing assets if the observation queue becomes full, and this happens because the observation logic starts merging the events.

3. At least for DocumentMK, for every local change done we have already calculated the diff, which we currently push to the DiffCache upon commit.

4. The observation logic eventually knows that a given listener (and thus the wrapping observer) is only interested in local events.

So in essence we already have the diff, but we pass it to the observer via the diff cache (at least for DocumentMK). Would it make sense to change the current approach slightly?

B - Proposed Changes
---

1. Move the notion of listening to local events to the Observer level - so upon any new change detected we only push the change to a given queue if it is local and the bound listener is only interested in local changes. Currently we push all changes, which later do get filtered out, but this way we avoid doing that at the first level itself and keep queue content limited to local changes only.

2. Attach the calculated diff as part of the commit info which is attached to the given change. This would allow eliminating the chance of a cache miss altogether and would ensure observation is not delayed due to slow processing of the diff. This can be done on a best effort basis: if the diff is too large then we do not attach it, and in that case we diff again.

3. For listeners which are only interested in local events we can use a different queue size limit, i.e. allow larger queues for such listeners.

Later we can also look into using a journal (or persistent queue) for local event processing.

Chetan Mehrotra

PS 1 - Local events vs Global events
-

It's important to distinguish the importance of local vs global events. With global events we can miss out on fine level changes, i.e. it is ok to merge the events. However for local events it is not possible to miss such events, as many a time listeners are only interested in changes happening on the same cluster node. If such events are lost then no other cluster node would process such a change. In AEM, for instance, this would result in an asset not getting processed.

Further, for local changes the cluster node already has the diff (for DocumentMK) and hence generating events for such changes would not be costly, while for global changes generating the diff would be costly. So it makes sense to make a clear separation between these 2 kinds of observers and ensure that observers listening for local changes do not suffer due to slowness in generating the diff for global changes.
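Proposal 3 amounts to picking the queue capacity per listener. A minimal sketch follows, with a hypothetical factory and made-up size limits; the ~180/~20 split echoes the AEM listener counts mentioned in the thread.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of proposal 3: choose the queue capacity per listener depending
// on whether it only consumes local events. The factory and the size
// limits are hypothetical, not actual Oak configuration; the split echoes
// the ~180 external vs ~20 local-only listener counts from the thread.
class QueueFactory {
    static final int DEFAULT_LIMIT = 1000;      // for the many global listeners
    static final int LOCAL_ONLY_LIMIT = 10000;  // roomier for local-only ones

    static BlockingQueue<String> queueFor(boolean localOnly) {
        int limit = localOnly ? LOCAL_ONLY_LIMIT : DEFAULT_LIMIT;
        return new ArrayBlockingQueue<>(limit);
    }
}
```

Since local-only queues receive far fewer events per entry (no external churn), a larger capacity for them costs comparatively little memory.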