[jira] [Commented] (OAK-5636) potential NPE in ReplicaSetInfo

2017-02-13 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15863754#comment-15863754
 ] 

Stefan Egli commented on OAK-5636:
--

+1, patch looks good to me

> potential NPE in ReplicaSetInfo
> ---
>
> Key: OAK-5636
> URL: https://issues.apache.org/jira/browse/OAK-5636
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: core, mongomk
>Affects Versions: 1.6.0
>Reporter: Julian Reschke
>Assignee: Tomek Rękawek
>Priority: Critical
> Fix For: 1.8, 1.6.1
>
> Attachments: OAK-5636.patch
>
>
> seen in log:
> {noformat}
> java.lang.NullPointerException: null
> at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:192) ~[oak-run-1.8-SNAPSHOT.jar:1.8-SNAPSHOT]
> at com.google.common.collect.SingletonImmutableSet.<init>(SingletonImmutableSet.java:47) ~[oak-run-1.8-SNAPSHOT.jar:1.8-SNAPSHOT]
> at com.google.common.collect.ImmutableSet.of(ImmutableSet.java:94) ~[oak-run-1.8-SNAPSHOT.jar:1.8-SNAPSHOT]
> at org.apache.jackrabbit.oak.plugins.document.mongo.replica.ReplicaSetInfo.updateRevisions(ReplicaSetInfo.java:264) ~[oak-run-1.8-SNAPSHOT.jar:1.8-SNAPSHOT]
> at org.apache.jackrabbit.oak.plugins.document.mongo.replica.ReplicaSetInfo.updateReplicaStatus(ReplicaSetInfo.java:182) ~[oak-run-1.8-SNAPSHOT.jar:1.8-SNAPSHOT]
> at org.apache.jackrabbit.oak.plugins.document.mongo.replica.ReplicaSetInfo.updateLoop(ReplicaSetInfo.java:145) ~[oak-run-1.8-SNAPSHOT.jar:1.8-SNAPSHOT]
> at org.apache.jackrabbit.oak.plugins.document.mongo.replica.ReplicaSetInfo.run(ReplicaSetInfo.java:134) ~[oak-run-1.8-SNAPSHOT.jar:1.8-SNAPSHOT]
> at java.lang.Thread.run(Unknown Source) [na:1.8.0_121]
> {noformat}
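
The trace shows ImmutableSet.of(...) rejecting a null element passed in from ReplicaSetInfo.updateRevisions. As a rough illustration of a defensive guard, the sketch below uses the JDK's Set.of (Java 9+), which rejects null in the same way, as a stand-in for Guava. The class and method names are hypothetical, and this is not the content of OAK-5636.patch.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical helper for illustration only; not taken from OAK-5636.patch.
final class RevisionSetSketch {
    // Returns an empty set instead of throwing when the value is missing.
    // Set.of(null) would throw a NullPointerException, like ImmutableSet.of(null).
    static <T> Set<T> singletonOrEmpty(T value) {
        return value == null ? Collections.<T>emptySet() : Set.of(value);
    }
}
```

Whether an empty set is the right fallback depends on how the caller interprets a missing revision, which the trace alone does not tell us.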



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (OAK-5619) withIncludeAncestorsRemove reports unrelated top-level node deletion

2017-02-13 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15863858#comment-15863858
 ] 

Stefan Egli commented on OAK-5619:
--

Thx [~mduerig], I'll adjust the javadoc accordingly as well.

> withIncludeAncestorsRemove reports unrelated top-level node deletion
> 
>
> Key: OAK-5619
> URL: https://issues.apache.org/jira/browse/OAK-5619
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: jcr
>Affects Versions: 1.6.0
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Critical
> Fix For: 1.6.1
>
> Attachments: OAK-5619.patch, OAK-5619.patch2
>
>
> withIncludeAncestorsRemove includes deletion of all parents of the registered 
> paths. When registering an include path {{/a/b/c}}, this thus triggers an 
> event if {{/a}} is deleted. When registering an include glob path {{**/foo}}, 
> any parent path deletion will be reported.
> There is currently a bug whereby an include path {{/a/b/c}} results in any 
> parent deletion being reported. This likely stems from the fact that for glob 
> paths any parent path deletion will be reported.





[jira] [Updated] (OAK-5619) withIncludeAncestorsRemove reports unrelated top-level node deletion

2017-02-13 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-5619:
-
Fix Version/s: (was: 1.6.1)
   1.8
   1.7.0

* fixed in trunk: http://svn.apache.org/viewvc?rev=1782797&view=rev

> withIncludeAncestorsRemove reports unrelated top-level node deletion
> 
>
> Key: OAK-5619
> URL: https://issues.apache.org/jira/browse/OAK-5619
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: jcr
>Affects Versions: 1.6.0
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Critical
> Fix For: 1.7.0, 1.8
>
> Attachments: OAK-5619.patch, OAK-5619.patch2





[jira] [Resolved] (OAK-5619) withIncludeAncestorsRemove reports unrelated top-level node deletion

2017-02-13 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli resolved OAK-5619.
--
Resolution: Fixed

> withIncludeAncestorsRemove reports unrelated top-level node deletion
> 
>
> Key: OAK-5619
> URL: https://issues.apache.org/jira/browse/OAK-5619
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: jcr
>Affects Versions: 1.6.0
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Critical
> Fix For: 1.7.0, 1.8, 1.6.1
>
> Attachments: OAK-5619.patch, OAK-5619.patch2





[jira] [Updated] (OAK-5619) withIncludeAncestorsRemove reports unrelated top-level node deletion

2017-02-13 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-5619:
-
Fix Version/s: 1.6.1

* merged to 1.6 branch: http://svn.apache.org/viewvc?rev=1782801&view=rev

> withIncludeAncestorsRemove reports unrelated top-level node deletion
> 
>
> Key: OAK-5619
> URL: https://issues.apache.org/jira/browse/OAK-5619
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: jcr
>Affects Versions: 1.6.0
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Critical
> Fix For: 1.7.0, 1.8, 1.6.1
>
> Attachments: OAK-5619.patch, OAK-5619.patch2





[jira] [Updated] (OAK-5619) withIncludeAncestorsRemove reports unrelated top-level node deletion

2017-02-09 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-5619:
-
Summary: withIncludeAncestorsRemove reports unrelated top-level node 
deletion  (was: withIncludeAncestorsRemove reports unrelated parent deletion)

> withIncludeAncestorsRemove reports unrelated top-level node deletion
> 
>
> Key: OAK-5619
> URL: https://issues.apache.org/jira/browse/OAK-5619
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: jcr
>Affects Versions: 1.6.0
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Critical
> Fix For: 1.6.1





[jira] [Commented] (OAK-5619) withIncludeAncestorsRemove reports unrelated top-level node deletion

2017-02-09 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15859414#comment-15859414
 ] 

Stefan Egli commented on OAK-5619:
--

Note that this doesn't affect arbitrary unrelated nodes, only nodes that share 
a parent with one of the ancestors (including {{/}}). I.e. if you have a 
listener on {{/a/b/c/d}}, then the deletion of {{/unrelated}} is reported, as 
well as the deletion of {{/a/b/unrelated}}.
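
For a non-glob include path, the intended behaviour described in this issue amounts to a plain ancestor test: only the deletion of a real ancestor of the registered path should be reported. A minimal sketch of that check follows. The class and method names are hypothetical, glob paths are deliberately left out, and this is not the code from the attached patches.

```java
// Hypothetical sketch of the expected matching rule; not the code from OAK-5619.patch.
final class AncestorRemoveSketch {
    // True if 'deleted' is the include path itself or one of its ancestors.
    static boolean isAncestorOrSelf(String deleted, String includePath) {
        if (deleted.equals("/")) {
            return true; // the root is an ancestor of every path
        }
        return includePath.equals(deleted) || includePath.startsWith(deleted + "/");
    }
}
```

With a listener on /a/b/c/d this reports the deletion of /a and /a/b, but not of /unrelated or /a/b/unrelated, which are the two cases the bug wrongly reported.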

> withIncludeAncestorsRemove reports unrelated top-level node deletion
> 
>
> Key: OAK-5619
> URL: https://issues.apache.org/jira/browse/OAK-5619
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: jcr
>Affects Versions: 1.6.0
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Critical
> Fix For: 1.6.1





[jira] [Commented] (OAK-5626) ChangeProcessor doesn't reset 'blocking' flag when items from queue gets removed and commit-rate-limiter is null

2017-02-10 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15861275#comment-15861275
 ] 

Stefan Egli commented on OAK-5626:
--

I'd argue we should go for the simpler time check just around that log.warn. It 
only happens on dequeue of a full queue (i.e. when it was compacting), so it 
doesn't happen normally anyway, only under a lot of stress.
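
A time check around the log.warn could look roughly like the sketch below. The class name, the method name and the idea of passing the current time in as a parameter (to keep the sketch testable) are assumptions for illustration; real code would read the clock itself and call the logger when the method returns true.

```java
// Hypothetical sketch of a time-based throttle around a warning.
final class ThrottledWarnSketch {
    private final long minIntervalMillis;
    private long lastWarnMillis;
    private boolean warnedOnce;

    ThrottledWarnSketch(long minIntervalMillis) {
        this.minIntervalMillis = minIntervalMillis;
    }

    // Returns true if the caller should emit the warning now.
    synchronized boolean shouldWarn(long nowMillis) {
        if (warnedOnce && nowMillis - lastWarnMillis < minIntervalMillis) {
            return false; // still inside the quiet period
        }
        warnedOnce = true;
        lastWarnMillis = nowMillis;
        return true;
    }
}
```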

> ChangeProcessor doesn't reset 'blocking' flag when items from queue gets 
> removed and commit-rate-limiter is null
> 
>
> Key: OAK-5626
> URL: https://issues.apache.org/jira/browse/OAK-5626
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: core
>Reporter: Vikas Saurabh
>Assignee: Vikas Saurabh
>Priority: Minor
>
> Following up on conversation at \[0]:
> {{ChangeProcessor#queueSizeChanged}} \[1] sets the blocking flag to true if the 
> queue size limit is hit (or exceeded). The warning "Revision queue is full. Further 
> revisions will be compacted." is logged only when it *wasn't* blocking.
> BUT, when the queue empties, the blocking flag is reset inside the if block for 
> commitRateLimiter!=null. That means an event chain like: 
> # qFull
> # log warn
> # qEmpties
> # qFull 
> won't log the WARN after step (4)
> \[0]: http://markmail.org/message/hgein5g3ohyjhw5n
> \[1]: 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L307
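
The event chain above can be replayed with a small model of the flag. The sketch below resets the flag whenever the queue drops below its limit, independently of any commit rate limiter, which is one way to read the behaviour the issue asks for. All names are hypothetical and the warning is only counted, not logged; this is not the actual ChangeProcessor code.

```java
// Hypothetical model of the 'blocking' flag; not the actual ChangeProcessor code.
final class BlockingFlagSketch {
    private final int queueLimit;
    private boolean blocking;
    int warnCount; // stands in for the "Revision queue is full" log.warn

    BlockingFlagSketch(int queueLimit) {
        this.queueLimit = queueLimit;
    }

    // Called when an item was added and the queue now has 'queueSize' entries.
    void added(int queueSize) {
        if (queueSize >= queueLimit) {
            if (!blocking) {
                warnCount++; // warn only on the transition into the full state
            }
            blocking = true;
        }
    }

    // Called when an item was removed; the reset does not depend on a rate limiter.
    void removed(int queueSize) {
        if (queueSize < queueLimit) {
            blocking = false;
        }
    }
}
```

Replaying qFull, qEmpties, qFull against this model yields two warnings, whereas a reset guarded by commitRateLimiter != null would yield only one when no limiter is configured.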





[jira] [Commented] (OAK-5626) ChangeProcessor doesn't reset 'blocking' flag when items from queue gets removed and commit-rate-limiter is null

2017-02-10 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15861226#comment-15861226
 ] 

Stefan Egli commented on OAK-5626:
--

bq. My concern is that this would lead to almost every commit leading to those 
WARN flooding the logs.
Agreed, I had the same thought.
bq. But, this part lies in fairly critical section - I'm not sure of getting 
time can be expensive or not?
Agreed, not sure if it's a problem though.
But one possible alternative might be to move the time check to the 
[{{removed()}}|https://github.com/apache/jackrabbit-oak/blob/4eac76dcbb262f10af9202cdcbc3e95dee40107a/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L303]
 case. It would get a bit imprecise and more tricky to implement, but 
{{removed()}} is called asynchronously and is therefore less time critical.

So the above could e.g. be implemented "modCnt" style using {{logCnt}}, 
{{suppressCnt}} and {{lastSuppressTime}} as follows: 
* {{logCnt}} tracks number of those log.warns and log.warn is only issued if 
{{logCnt<=suppressCnt}}, in which case it then does {{logCnt = suppressCnt + 
1}} (thus logCnt is maintained by the "add" thread only)
* the "remove" thread takes note of logCnt incrementing, measures time 
({{lastSuppressTime}}), and would increment suppressCnt only after eg 5min (so 
time is only ever checked in the remove thread). 

Not super nice, but possible
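
The counter scheme above might be sketched as follows. The field names logCnt, suppressCnt and lastSuppressTime come from the comment; everything else (class name, method names, passing the time in as a parameter, using -1 as "no timer running") is an assumption made for illustration.

```java
// Hypothetical sketch of the logCnt/suppressCnt idea; not an actual Oak implementation.
final class WarnSuppressionSketch {
    private final long suppressMillis;
    private volatile int logCnt;        // written by the "add" thread only
    private volatile int suppressCnt;   // written by the "remove" thread only
    private long lastSuppressTime = -1; // "remove" thread only; -1 means no timer running

    WarnSuppressionSketch(long suppressMillis) {
        this.suppressMillis = suppressMillis;
    }

    // "add" thread: returns true if the warning should be logged; never reads the clock.
    boolean onQueueFull() {
        if (logCnt <= suppressCnt) {
            logCnt = suppressCnt + 1;
            return true;
        }
        return false;
    }

    // "remove" thread: the only place where time is checked.
    void onRemoved(long nowMillis) {
        if (logCnt <= suppressCnt) {
            return; // nothing was logged since the last release
        }
        if (lastSuppressTime < 0) {
            lastSuppressTime = nowMillis; // first notice of the new warning
        } else if (nowMillis - lastSuppressTime >= suppressMillis) {
            suppressCnt = logCnt; // allow the next warning
            lastSuppressTime = -1;
        }
    }
}
```

The quiet period here starts when the remove thread first notices the warning, not when it was logged, which is the imprecision mentioned above.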

> ChangeProcessor doesn't reset 'blocking' flag when items from queue gets 
> removed and commit-rate-limiter is null
> 
>
> Key: OAK-5626
> URL: https://issues.apache.org/jira/browse/OAK-5626
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: core
>Reporter: Vikas Saurabh
>Assignee: Vikas Saurabh
>Priority: Minor





[jira] [Updated] (OAK-5619) withIncludeAncestorsRemove reports unrelated top-level node deletion

2017-02-10 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-5619:
-
Attachment: OAK-5619.patch2

Attached [^OAK-5619.patch2] which is the same as the previous patch but adds 
one more testcase plus adds some clarification to a javadoc

> withIncludeAncestorsRemove reports unrelated top-level node deletion
> 
>
> Key: OAK-5619
> URL: https://issues.apache.org/jira/browse/OAK-5619
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: jcr
>Affects Versions: 1.6.0
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Critical
> Fix For: 1.6.1
>
> Attachments: OAK-5619.patch, OAK-5619.patch2





[jira] [Updated] (OAK-5619) withIncludeAncestorsRemove reports unrelated top-level node deletion

2017-02-09 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-5619:
-
Attachment: OAK-5619.patch

Attached [^OAK-5619.patch] which contains a suggested patch.

> withIncludeAncestorsRemove reports unrelated top-level node deletion
> 
>
> Key: OAK-5619
> URL: https://issues.apache.org/jira/browse/OAK-5619
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: jcr
>Affects Versions: 1.6.0
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Critical
> Fix For: 1.6.1
>
> Attachments: OAK-5619.patch





[jira] [Comment Edited] (OAK-5619) withIncludeAncestorsRemove reports unrelated top-level node deletion

2017-02-09 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15859302#comment-15859302
 ] 

Stefan Egli edited comment on OAK-5619 at 2/9/17 4:28 PM:
--

Added a test case to trunk (currently disabled) with which this can be 
reproduced: http://svn.apache.org/viewvc?rev=1782304&view=rev
EDIT: extended the test case to cover more different cases: 
http://svn.apache.org/viewvc?rev=1782350&view=rev


was (Author: egli):
Added a test case to trunk (currently disabled) with which this can be 
reproduced: http://svn.apache.org/viewvc?rev=1782304&view=rev

> withIncludeAncestorsRemove reports unrelated top-level node deletion
> 
>
> Key: OAK-5619
> URL: https://issues.apache.org/jira/browse/OAK-5619
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: jcr
>Affects Versions: 1.6.0
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Critical
> Fix For: 1.6.1





[jira] [Commented] (OAK-3707) Register composite commit hook with whiteboard

2016-08-16 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422659#comment-15422659
 ] 

Stefan Egli commented on OAK-3707:
--

[~edivad], what's your opinion on backporting this to 1.0.x/1.2.x branches?

> Register composite commit hook with whiteboard
> --
>
> Key: OAK-3707
> URL: https://issues.apache.org/jira/browse/OAK-3707
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>Affects Versions: 1.3.11
>Reporter: Davide Giannella
>Assignee: Davide Giannella
> Fix For: 1.4, 1.3.13
>
> Attachments: OAK-3707-1.patch
>
>
> During repository initialisation, register the composite of the CommitHook 
> with the whiteboard.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (OAK-4153) segment's compareAgainstBaseState wont call childNodeDeleted when deleting last and adding n nodes

2016-08-22 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli resolved OAK-4153.
--
Resolution: Fixed

> segment's compareAgainstBaseState wont call childNodeDeleted when deleting 
> last and adding n nodes
> --
>
> Key: OAK-4153
> URL: https://issues.apache.org/jira/browse/OAK-4153
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segmentmk
>Affects Versions: 1.2.13, 1.0.29, 1.4.1, 1.5.0
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Critical
> Fix For: 1.4.7, 1.2.19, 1.0.34, 1.5.1
>
> Attachments: OAK-4153-2.patch, OAK-4153-3.patch, OAK-4153.patch, 
> OAK-4153.simplified.patch
>
>
> {{SegmentNodeState.compareAgainstBaseState}} fails to call 
> {{NodeStateDiff.childNodeDeleted}} when for the same parent the only child is 
> deleted and at the same time multiple new, different children are added.
> Reason is that the [current 
> code|https://github.com/apache/jackrabbit-oak/blob/a9ce70b61567ffe27529dad8eb5d38ced77cf8ad/oak-segment/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentNodeState.java#L558]
>  for '{{afterChildName == MANY_CHILD_NODES}}' *and* '{{beforeChildName == 
> ONE_CHILD_NODE}}' does not handle all cases: it assumes that 'after' contains 
> the 'before' child and doesn't handle the situation where the 'before' child 
> has gone.
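
The missing case can be shown on plain child-name sets. In the sketch below 'before' has exactly one child and 'after' has many; the point is the explicit check whether the old child is still present. Names are hypothetical, events are returned as strings instead of calling NodeStateDiff, and the comparison of a surviving child's content is omitted. This is not the SegmentNodeState code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the one-child-before / many-children-after case.
final class OneToManyDiffSketch {
    static List<String> diff(String beforeChild, Set<String> afterChildren) {
        List<String> events = new ArrayList<>();
        // The case the bug missed: the single 'before' child is gone in 'after'.
        if (!afterChildren.contains(beforeChild)) {
            events.add("deleted:" + beforeChild);
        }
        for (String name : afterChildren) {
            if (!name.equals(beforeChild)) {
                events.add("added:" + name);
            }
        }
        return events;
    }
}
```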





[jira] [Commented] (OAK-4153) segment's compareAgainstBaseState wont call childNodeDeleted when deleting last and adding n nodes

2016-08-22 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15430543#comment-15430543
 ] 

Stefan Egli commented on OAK-4153:
--

[~edivad], I see what you mean. However, is that really necessary, as it changes 
the 'truth' about when this was indeed fixed in 1.5 (1.5.1)? As 1.5 is an 
unstable release I'm not sure how important the release notes are, but yes, it 
would probably have been better to open a separate ticket for the backports.

> segment's compareAgainstBaseState wont call childNodeDeleted when deleting 
> last and adding n nodes
> --
>
> Key: OAK-4153
> URL: https://issues.apache.org/jira/browse/OAK-4153
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segmentmk
>Affects Versions: 1.2.13, 1.0.29, 1.4.1, 1.5.0
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Critical
> Fix For: 1.5.1, 1.4.7, 1.2.19, 1.0.34
>
> Attachments: OAK-4153-2.patch, OAK-4153-3.patch, OAK-4153.patch, 
> OAK-4153.simplified.patch





[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-02 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15458826#comment-15458826
 ] 

Stefan Egli commented on OAK-4581:
--

One more comment re
bq. I expect unbounded queues to have adverse effects on the hotness of the 
various caches.
As pointed out, I think we should do pre-filtering. But we should probably also 
do pre-calculation of the actual events we want to deliver. Assuming that the 
listener's filters are "good", it would be efficient to go through the whole 
filter and basically store only pre-manufactured events as a result. This would 
mean that once an event is persisted it is no longer dependent on _any_ cache. 
But it also means that what is persisted will _actually_ be consumed later 
on, as the raw and final event itself would basically be stored, with no later 
filtering whatsoever. 
Doing the filtering as early as possible would also add load to the system at 
commit time and thus result in natural throttling, as the computation of all 
the events as they will be delivered to the listeners then basically becomes 
part of the commit.
(_two birds with one stone_)

> Persistent local journal for more reliable event generation
> ---
>
> Key: OAK-4581
> URL: https://issues.apache.org/jira/browse/OAK-4581
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: core
>Reporter: Chetan Mehrotra
>Assignee: Stefan Egli
>  Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4581.v0.patch
>
>
> As discussed in OAK-2683 "hitting the observation queue limit" has multiple 
> drawbacks. Quite a bit of work is done to make diff generation faster. 
> However there are still chances of the event queue getting filled up. 
> This issue is meant to implement a persistent event journal. The idea here being:
> # NodeStore would push the diff into a persistent store via a synchronous 
> observer
> # Observers which are meant to handle such events in an async way (by virtue of 
> being wrapped in BackgroundObserver) would instead pull the events from this 
> persisted journal
> h3. A - What is persisted
> h4. 1 - Serialized Root States and CommitInfo
> In this approach we just persist the root states in serialized form. 
> * DocumentNodeStore - This means storing the root revision vector
> * SegmentNodeStore - {color:red}Q1 - What does the serialized form of the 
> SegmentNodeStore root state look like{color} - Possibly the RecordId of the 
> "root" state
> Note that with OAK-4528 DocumentNodeStore can rely on the persisted remote 
> journal to determine the affected paths, which reduces the need for 
> persisting the complete diff locally.
> Event generation logic would then "deserialize" the persisted root states and 
> then generate the diff as currently done via NodeState comparison
> h4. 2 - Serialized commit diff and CommitInfo
> In this approach we can save the diff in JSOP form. The diff only contains 
> information about affected paths, similar to what is currently being stored in 
> the DocumentNodeStore journal
> h4. CommitInfo
> The commit info would also need to be serialized. So it needs to be ensured that 
> whatever is stored there can be serialized or recalculated
> h3. B - How it is persisted
> h4. 1 - Use a secondary segment NodeStore
> OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. 
> [~mreutegg] suggested that for persisted local journal we can also utilize a 
> SegmentNodeStore instance. Care needs to be taken for compaction. Either via 
> generation approach or relying on online compaction
> h4. 2 - Make use of write ahead log implementations
> [~ianeboston] suggested that we can make use of some write ahead log 
> implementation like [1], [2] or [3]
> h3. C - How changes get pulled
> Some points to consider for event generation logic
> # Would need a way to keep pointers to journal entries on a per-listener basis. 
> This would allow each Listener to "pull" content changes and generate the diff at 
> its own speed while keeping in-memory overhead low
> # The journal should survive restarts
> [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html
> [2] 
> https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal
> [3] 
> https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog
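
The "pointer per listener" idea from section C can be pictured with a small in-memory model: one shared, append-only list of entries plus a cursor per listener, so that a slow listener pulls at its own pace without holding up a fast one. The names are hypothetical, and an in-memory list stands in for the persisted journal, so surviving restarts is not covered here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of a journal with one cursor per listener.
final class JournalCursorSketch {
    private final List<String> entries = new ArrayList<>();       // stand-in for the persisted journal
    private final Map<String, Integer> cursors = new HashMap<>(); // next index to read, per listener

    // Called by the synchronous observer on commit.
    synchronized void append(String entry) {
        entries.add(entry);
    }

    // Called by a listener; returns at most 'max' entries it has not seen yet.
    synchronized List<String> pull(String listenerId, int max) {
        int from = cursors.getOrDefault(listenerId, 0);
        int to = Math.min(entries.size(), from + max);
        List<String> batch = new ArrayList<>(entries.subList(from, to));
        cursors.put(listenerId, to);
        return batch;
    }
}
```

A real implementation would additionally need to persist the cursors and to discard entries that every listener has consumed.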





[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-05 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15464458#comment-15464458
 ] 

Stefan Egli commented on OAK-4581:
--

I wouldn't rule out that there are better fits, for sure. The advantage of 
using tarMk is that we'd use our own stuff for another use case too.
* _there will be a performance impact as the lists grows_ : we can store in 
sub-folders named with a time pattern, similar to what Sling jobs do
* _garbage collections is difficult and expensive_ : gc would be done in a 
generational approach: rewrite the queue in a new tarMk once certain thresholds 
are reached (ideally this could be done when the queue is empty, so no rewrite 
would be necessary). But gc on that tarMk incarnation would be turned off 
completely.
* _adding and removing elements from the list will cause rewrites of buckets 
instead of just a single append, remove_ : the idea is to store in batches, so 
I was assuming this shouldn't be that bad

Overall I'd find it simpler, but if people agree that we shouldn't use tarMk for 
this, then sure, let's switch.
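
To illustrate the sub-folder idea from the first bullet: entries could be grouped under a path derived from their timestamp, so that no single folder grows without bound. The sketch below is only an assumption of what such a pattern might look like. The root path and the yyyy/MM/dd/HH/mm layout are made up for the example and are not taken from Sling or from any patch.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch: derive a per-minute bucket path from a timestamp.
final class TimeBucketSketch {
    private static final DateTimeFormatter PATTERN =
            DateTimeFormatter.ofPattern("yyyy/MM/dd/HH/mm").withZone(ZoneOffset.UTC);

    // Returns e.g. <root>/2016/09/05/00/00 for an entry created in that minute (UTC).
    static String bucketPath(String root, long epochMillis) {
        return root + "/" + PATTERN.format(Instant.ofEpochMilli(epochMillis));
    }
}
```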

> Persistent local journal for more reliable event generation
> ---
>
> Key: OAK-4581
> URL: https://issues.apache.org/jira/browse/OAK-4581
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: core
>Reporter: Chetan Mehrotra
>Assignee: Stefan Egli
>  Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4581.v0.patch





[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-05 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15464465#comment-15464465
 ] 

Stefan Egli commented on OAK-4581:
--

Waiting with further implementations until we agree on how to proceed

> Persistent local journal for more reliable event generation
> ---
>
> Key: OAK-4581
> URL: https://issues.apache.org/jira/browse/OAK-4581
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: core
>Reporter: Chetan Mehrotra
>Assignee: Stefan Egli
>  Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4581.v0.patch
>
>
> As discussed in OAK-2683, "hitting the observation queue limit" has multiple 
> drawbacks. Quite a bit of work has been done to make diff generation faster. 
> However, there is still a chance of the event queue filling up. 
> This issue is meant to implement a persistent event journal. The idea is:
> # NodeStore would push the diff into a persistent store via a synchronous 
> observer
> # Observers which are meant to handle such events asynchronously (by virtue of 
> being wrapped in BackgroundObserver) would instead pull the events from this 
> persisted journal
> h3. A - What is persisted
> h4. 1 - Serialized Root States and CommitInfo
> In this approach we just persist the root states in serialized form. 
> * DocumentNodeStore - This means storing the root revision vector
> * SegmentNodeStore - {color:red}Q1 - What does the serialized form of 
> SegmentNodeStore root state look like{color} - Possibly the RecordId of the 
> "root" state
> Note that with OAK-4528 DocumentNodeStore can rely on persisted remote 
> journal to determine the affected paths. Which reduces the need for 
> persisting complete diff locally.
> Event generation logic would then "deserialize" the persisted root states and 
> then generate the diff as currently done via NodeState comparison
> h4. 2 - Serialized commit diff and CommitInfo
> In this approach we can save the diff in JSOP form. The diff only contains 
> information about affected paths, similar to what is currently being stored in 
> the DocumentNodeStore journal
> h4. CommitInfo
> The commit info would also need to be serialized. So it needs to be ensured that 
> whatever is stored there can be serialized or recalculated
> h3. B - How it is persisted
> h4. 1 - Use a secondary segment NodeStore
> OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. 
> [~mreutegg] suggested that for persisted local journal we can also utilize a 
> SegmentNodeStore instance. Care needs to be taken for compaction. Either via 
> generation approach or relying on online compaction
> h4. 2 - Make use of write ahead log implementations
> [~ianeboston] suggested that we can make use of some write ahead log 
> implementation like [1], [2] or [3]
> h3. C - How changes get pulled
> Some points to consider for event generation logic
> # Would need a way to keep pointers to journal entries on a per-listener basis. 
> This would allow each listener to "pull" content changes and generate the diff at 
> its own speed while keeping in-memory overhead low
> # The journal should survive restarts
> [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html
> [2] 
> https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal
> [3] 
> https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog
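
To make point 1 of section C more concrete, here is a minimal in-memory sketch of per-listener journal pointers. {{ListenerJournal}} and all of its methods are hypothetical names chosen for illustration; they are not part of Oak or of the attached patch. Persistence across restarts (point 2) is deliberately left out, and journal entries are plain objects rather than serialized root states.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical append-only journal with one read pointer per listener. */
final class ListenerJournal<E> {
    private final List<E> entries = new ArrayList<>();
    private final Map<String, Integer> cursors = new HashMap<>();

    /** Called on the commit path: append only, no diffing here. */
    synchronized void append(E entry) {
        entries.add(entry);
    }

    /** Registers a listener; it sees only entries appended from now on. */
    synchronized void register(String listenerId) {
        cursors.put(listenerId, entries.size());
    }

    /** Returns up to max unread entries and advances the listener's pointer. */
    synchronized List<E> pull(String listenerId, int max) {
        Integer from = cursors.get(listenerId);
        if (from == null) {
            throw new IllegalArgumentException("unknown listener: " + listenerId);
        }
        int to = Math.min(entries.size(), from + max);
        List<E> batch = new ArrayList<>(entries.subList(from, to));
        cursors.put(listenerId, to);
        return batch;
    }

    /** Number of entries the listener has not pulled yet. */
    synchronized int backlog(String listenerId) {
        Integer from = cursors.get(listenerId);
        return from == null ? 0 : entries.size() - from;
    }
}
```

A slow listener simply accumulates a larger backlog without affecting the others; a real implementation would also have to trim entries that every listener has already consumed.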





[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-05 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1546#comment-1546
 ] 

Stefan Egli commented on OAK-4581:
--

So could we adjust the retention time dynamically (is there an API) when the 
queue grows?



[jira] [Commented] (OAK-4739) lease: immediate renew after long renew call

2016-09-02 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15457947#comment-15457947
 ] 

Stefan Egli commented on OAK-4739:
--

Using the _recovery lock_ we should be able to distinguish the case where an 
instance failed to update the lease (e.g. due to a network hiccup) but no other 
instance has noticed this yet (including discovery-lite) from the case where 
another instance has noticed it. Basically we have clear state boundaries between 
a lease being valid and an instance being in recovery state. If this is the case 
(and I believe it is) then we could indeed do a retry in case the lease renewal 
fails but the recovery lock has not yet been acquired, no?

> lease: immediate renew after long renew call
> 
>
> Key: OAK-4739
> URL: https://issues.apache.org/jira/browse/OAK-4739
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk
>Affects Versions: 1.5.8
>Reporter: Martin Böttcher
>  Labels: resilience
>
> A single temporary network issue can shut down the DocumentStore. We observed 
> the following situation:
> # org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfo.renewLease was 
> called (this is done regularly and is completely normal)
> # the network had a temporary issue (of whatever kind)
> # the database call terminated only after a long time (the default db 
> maxWaitTime is 120 seconds).
> # org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfo.renewLease 
> decides that the current lease is too old (>120 seconds, which is the default for 
> the oak.documentMK.leaseDurationSeconds property), sets a leaseCheckFailed 
> variable and throws an Exception
> # because leaseCheckFailed is set, all following attempts (if any) will 
> immediately throw an Exception, too.
> I'd recommend making the ClusterNodeInfo code more robust so that at least 
> one retry will be made.
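
The retry idea from the comment and the description can be sketched as follows. This is only an illustration under stated assumptions: {{LeaseRenewer}}, {{renewWithRetry}} and the two suppliers are hypothetical and do not reflect the actual {{ClusterNodeInfo}} code; the "renew" supplier stands in for the database call and the second supplier for a check whether another instance has acquired the recovery lock.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: one retry unless recovery has already started. */
final class LeaseRenewer {

    /**
     * @param renew                attempts to renew the lease, true on success
     * @param recoveryLockAcquired true if another instance holds the recovery lock
     * @return true if the lease was renewed by the first or the second attempt
     */
    static boolean renewWithRetry(BooleanSupplier renew,
                                  BooleanSupplier recoveryLockAcquired) {
        if (renew.getAsBoolean()) {
            return true;
        }
        if (recoveryLockAcquired.getAsBoolean()) {
            // Someone else has noticed the expired lease; retrying is not safe.
            return false;
        }
        // Nobody has noticed yet, so a single retry is attempted.
        return renew.getAsBoolean();
    }
}
```

The point of the check between the two attempts is that the retry only happens while the instance is still on the "lease valid" side of the state boundary described in the comment.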





[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-02 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15457991#comment-15457991
 ] 

Stefan Egli commented on OAK-4581:
--

[~mduerig], thanks for the comments!
bq. Do we know that we need to go off-heap with that queue?
Agreed, the entries are normally cheap, but generally speaking it's the open 
map mechanism that can make them unbounded and thus larger. But even assuming they 
are cheap on average, a sufficiently high traffic burst can overwhelm even 
highly optimized listener logic, in which case the queues grow and you'll get 
an OutOfMemoryError. The benefit of persisting the queues (once they become big, 
that is) applies to such rare special cases only: you'd have to construct a case 
that forces an OutOfMemoryError without this patch but not with it.
bq. I expect unbounded queues to have adverse effects on the hotness of the 
various caches.
Right. There are some ideas left that haven't been mentioned or fleshed out much 
yet: on the one hand we should do _pre-filtering_ of events, such that only events 
that are indeed meant to go to a listener end up on its queue. The listener 
shouldn't have to filter afterwards anymore. Currently we're putting events on 
each listener's queue and only filter after the fact. If queues become large, 
this becomes an issue precisely because of cache inefficiencies: 
a lot of computation is then spent purely to figure out whether a 
listener needs an entry or not (as it can't find what it needs in the cache 
anymore). With pre-filtering this would no longer be an issue. 
What would be left though is the cache inefficiency for actual events that 
listeners _want_. There we might optimize by including a bit more info in 
what we persist, perhaps the actual diff if it's not too big, etc.
bq. Any thoughts on how unbounded queues should interact with gc?
One approach that we currently target is to checkpoint the oldest entry, such 
that we prevent gc from removing it (assuming checkpoints are respected).
bq. However I dislike having to cope with serialising the open CommitInfo 
class. At least we should rely on a general purpose library here. 
Open for alternatives for sure! I was assuming that we need to store the 
CommitInfo obj, as that's what persisting is mostly about. And if something in 
there is not serializable, then we're lost and have to skip it (we can warn 
loudly though). What exactly were you thinking of as alternatives?
bq. I don't think PersistedBlockingQueue should use a node store as its 
back-end.
I'm probably not getting the entirety of this point. I guess one argument for 
reusing the TarMK is that it's something we have and know we can use, but we can 
certainly use something else. Regarding GC, the idea was to _not_ rely on 
GCing that observation TarMK but to use generations of TarMK stores, similar to 
how that's done in the persistent cache: we'd throw away a whole TarMK set once 
we have switched to a new one.


[jira] [Resolved] (OAK-4728) tarmk's FileStoreBuilder.build should use mkdirs instead of mkdir

2016-08-31 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli resolved OAK-4728.
--
Resolution: Fixed

changed to {{mkdirs}} in http://svn.apache.org/viewvc?rev=1758610&view=rev

> tarmk's FileStoreBuilder.build should use mkdirs instead of mkdir
> -
>
> Key: OAK-4728
> URL: https://issues.apache.org/jira/browse/OAK-4728
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: Segment Tar 0.0.10
>Reporter: Stefan Egli
>Assignee: Stefan Egli
> Fix For: Segment Tar 0.0.12
>
>
> [FileStoreBuilder.build|https://github.com/apache/jackrabbit-oak/blob/2b6c2f5340f3b6485dda5c493f6343d232c883e9/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/FileStoreBuilder.java#L338]
>  uses {{mkdir}} which can be problematic when using non-standard directories 
> such as is perhaps intended with OAK-4655. Using {{mkdirs}} instead is more 
> robust.
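
The difference between the two {{java.io.File}} methods can be seen in a small stand-alone example. {{MkdirDemo}} and the directory names are made up for illustration and have nothing to do with the actual {{FileStoreBuilder}} code; the example only assumes that a temporary directory can be created.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

/** Illustrates File.mkdir() versus File.mkdirs() for a path with a missing parent. */
final class MkdirDemo {

    /** Returns { result of mkdir(), result of mkdirs() } for two nested paths. */
    static boolean[] compare() {
        File base;
        try {
            base = Files.createTempDirectory("mkdir-demo").toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // In both cases the parent directory ("a" or "b") does not exist yet.
        File viaMkdir = new File(base, "a/segmentstore");
        File viaMkdirs = new File(base, "b/segmentstore");

        boolean first = viaMkdir.mkdir();    // fails: parent "a" is missing
        boolean second = viaMkdirs.mkdirs(); // creates "b" and "b/segmentstore"
        return new boolean[] { first, second };
    }
}
```

{{mkdir}} only creates the last path element, so it returns false whenever an intermediate directory is missing, while {{mkdirs}} creates the whole chain.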





[jira] [Created] (OAK-4728) tarmk's FileStoreBuilder.build should use mkdirs instead of mkdir

2016-08-31 Thread Stefan Egli (JIRA)
Stefan Egli created OAK-4728:


 Summary: tarmk's FileStoreBuilder.build should use mkdirs instead 
of mkdir
 Key: OAK-4728
 URL: https://issues.apache.org/jira/browse/OAK-4728
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: segment-tar
Affects Versions: Segment Tar 0.0.10
Reporter: Stefan Egli
Assignee: Stefan Egli
 Fix For: Segment Tar 0.0.12


[FileStoreBuilder.build|https://github.com/apache/jackrabbit-oak/blob/2b6c2f5340f3b6485dda5c493f6343d232c883e9/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/FileStoreBuilder.java#L338]
 uses {{mkdir}} which can be problematic when using non-standard directories 
such as is perhaps intended with OAK-4655. Using {{mkdirs}} instead is more 
robust.





[jira] [Assigned] (OAK-4655) Enable configuring multiple segment nodestore instances in same setup

2016-08-31 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli reassigned OAK-4655:


Assignee: Stefan Egli

> Enable configuring multiple segment nodestore instances in same setup
> -
>
> Key: OAK-4655
> URL: https://issues.apache.org/jira/browse/OAK-4655
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: segment-tar, segmentmk
>Reporter: Chetan Mehrotra
>Assignee: Stefan Egli
> Fix For: 1.6
>
>
> With OAK-4369 and OAK-4490 it's now possible to configure a new 
> SegmentNodeStore to act as a secondary nodestore (OAK-4180). Recently, for a few 
> other features, we have seen a requirement to configure a SegmentNodeStore just 
> for storage purposes. For example:
> # OAK-4180 - Enables use of SegmentNodeStore as a secondary store to 
> complement DocumentNodeStore
> #* Always uses BlobStore from primary DocumentNodeStore
> #* Compaction to be enabled
> # OAK-4654 - Enable use of SegmentNodeStore for private mount in a 
> multiplexing nodestore setup
> #* Might use its own blob store
> #* Compaction might be disabled as it would be read only
> # OAK-4581 - Proposes to make use of SegmentNodeStore for storing event queue 
> offline
> In all these setups we need to configure a SegmentNodeStore which has the 
> following aspects
> # NodeStore instance is not directly exposed but exposed via 
> {{NodeStoreProvider}} interface with {{role}} service property specifying the 
> intended usage
> # NodeStore here is not fully functional i.e. it would not be configured with 
> std observers, would not be used by ContentRepository etc
> # It needs to be ensured that any JMX MBean registered accounts for "role" so 
> that there is no collision
> With the existing SegmentNodeStoreService we can only configure one nodestore. To 
> support the above cases we need an OSGi config factory based implementation which 
> enables creation of multiple SegmentNodeStore instances (each with a different 
> directory and different settings)





[jira] [Updated] (OAK-4655) Enable configuring multiple segment nodestore instances in same setup

2016-08-31 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-4655:
-
Attachment: OAK-4655.v1.patch

Attaching [^OAK-4655.v1.patch] which is a suggestion for a 
{{SegmentNodeStoreFactory}} (one for both oak-segment and oak-segment-tar - 
they're basically twins, perhaps we don't need both..) that registers 
{{NodeStoreProviders}} with the corresponding {{role}} set (the role coming 
from the config for {{SegmentNodeStoreFactory}}).



[jira] [Updated] (OAK-4581) Persistent local journal for more reliable event generation

2016-08-31 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-4581:
-
Attachment: OAK-4581.v0.patch

Attached [^OAK-4581.v0.patch] (which is based on 
[OAK-4655.v1.patch|https://issues.apache.org/jira/secure/attachment/12826406/OAK-4655.v1.patch]
 that introduces support for multiple SegmentNodeStores).

This is an ongoing effort, but wanted to share progress early. Here's the 
status:
* introduces a {{NodeStateSerializer}} that NodeStores should somehow implement in 
order to map {{NodeState->String}} and vice versa (for storage of the event).
* introduces an {{EventQueueFactory}} that is used by the BackgroundObserver 
instead of hardcoding creation of a {{newArrayBlockingQueue}} (it still does 
the latter for sling compatibility cases, but that we should get rid of)
** one version of this is the above {{newArrayBlockingQueue}} - i.e. an in-memory 
queue that can still be used for a few cases
** the new version though is {{PersistedEventQueueFactory}} that creates 
{{PersistedBlockingQueue}} where the storing magic will happen.
*** this last one is in its early stages - currently it just showcases using a 
secondary/tertiary.. store for persistence. (But storing is quite unoptimized 
there atm).
* so again, the main idea is that the {{BackgroundObserver}} remains largely 
unchanged - it still works on a {{queue}} and wouldn't notice what's behind the 
queue. The logic for storing/retrieving via persistence is hidden behind a 
{{BlockingQueue}} implementation, that's the main point here I think.

The consequences that this introduces will be:
* we can only store values in the CommitInfo that are serializable - others 
would have to be skipped/get lost
* creators of {{new Jcr()/new Oak()}} would pass the corresponding 
{{EventQueueFactory}} - thus the mapping would be done in the 'jcr factory 
service'. The EventQueueFactory will then propagate down the Repository/Session 
chain to the BackgroundObserver.

I'll continue working on the {{PersistedBlockingQueue}} including testing but 
would appreciate some early feedback about this approach.
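
To illustrate the main idea (the observer only sees a queue and does not know what is behind it), here is a rough sketch. The name {{EventQueueFactory}} is taken from the comment above, but the method signature shown here, as well as {{InMemoryEventQueueFactory}} and {{QueueBackedObserver}}, are assumptions made for this example; the attached patch may well look different.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Assumed shape of the factory; the signature in the patch may differ. */
interface EventQueueFactory {
    <E> BlockingQueue<E> createQueue(int capacity);
}

/** In-memory variant: a plain bounded ArrayBlockingQueue. */
final class InMemoryEventQueueFactory implements EventQueueFactory {
    @Override
    public <E> BlockingQueue<E> createQueue(int capacity) {
        return new ArrayBlockingQueue<>(capacity);
    }
}

/** Stand-in for the observer: it depends only on the BlockingQueue interface. */
final class QueueBackedObserver<E> {
    private final BlockingQueue<E> queue;

    QueueBackedObserver(EventQueueFactory factory, int capacity) {
        this.queue = factory.createQueue(capacity);
    }

    /** Returns false if the queue is full, as BlockingQueue.offer does. */
    boolean offer(E event) {
        return queue.offer(event);
    }

    /** Returns the next event, or null if the queue is empty. */
    E poll() {
        return queue.poll();
    }
}
```

A persisted variant would implement the same factory interface and return a {{BlockingQueue}} that spills to storage, so the consumer code above would stay unchanged.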


[jira] [Assigned] (OAK-4796) filter events before adding to ChangeProcessor's queue

2016-09-12 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli reassigned OAK-4796:


Assignee: Stefan Egli

> filter events before adding to ChangeProcessor's queue
> --
>
> Key: OAK-4796
> URL: https://issues.apache.org/jira/browse/OAK-4796
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Affects Versions: 1.5.9
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>  Labels: observation
> Fix For: 1.6
>
>
> Currently the 
> [ChangeProcessor.contentChanged|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L335]
>  is in charge of doing the event diffing and filtering and does so in a 
> pooled Thread, i.e. asynchronously, at a later stage independent of the 
> commit. This has the advantage that the commit is fast, but has the following 
> potentially negative effects:
> # events (in the form of ContentChange objects) occupy a slot of the queue 
> even if the listener is not interested in them - any commit lands on every 
> listener's queue. This reduces the capacity of the queue for 'actual' events 
> to be delivered. It therefore increases the risk that the queue fills up, and 
> a full queue has various consequences such as losing the CommitInfo.
> # each event==ContentChange must later be evaluated, and for that a diff 
> must be calculated. Depending on runtime behavior that diff might be 
> expensive if it is no longer in the cache (documentMk specifically).
> As an improvement, this diffing+filtering could be done at an earlier stage 
> already, nearer to the commit, and in case the filter would ignore the event, 
> it would not have to be put into the queue at all, thus avoiding occupying a 
> slot and later potentially slower diffing.
> The suggestion is to implement this via the following algorithm:
> * During the commit, in a {{Validator}} the listener's filters are evaluated 
> - in an as-efficient-as-possible manner (Reason for doing it in a Validator 
> is that this doesn't add overhead as oak already goes through all changes for 
> other Validators). As a result a _list of potentially affected observers_ is 
> added to the {{CommitInfo}} (false positives are fine).
> ** Note that the above adds cost to the commit and must therefore be 
> carefully done and measured
> ** One potential measure could be to only do filtering when listener's queues 
> are larger than a certain threshold (eg 10)
> * The ChangeProcessor in {{contentChanged}} (in the one created in 
> [createObserver|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L224])
>  then checks the new commitInfo's _potentially affected observers_ list and 
> if it's not in the list, adds a {{NOOP}} token at the end of the queue. If 
> there's already a NOOP there, the two are collapsed (this way when a filter 
> is not affected it would have a NOOP at the end of the queue). If later on a 
> non-NOOP item is added, the NOOP's {{root}} is used as the {{previousRoot}} 
> for the newly added {{ContentChange}} obj.
> ** To achieve that, the ContentChange obj is extended to not only have the 
> "to" {{root}} pointer, but also the "from" {{previousRoot}} pointer which 
> currently is implicitly maintained.
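
The NOOP handling described above can be sketched roughly like this. Everything here is a simplification under stated assumptions: {{PrefilteredQueue}} is a hypothetical class, roots are plain strings standing in for {{NodeState}}, the _potentially affected observers_ list is a set of listener ids, and a trailing NOOP is kept in the queue when a real change follows it (the description does not say whether it should be dropped).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;

/** Hypothetical per-listener queue with collapsing NOOP entries. */
final class PrefilteredQueue {

    static final class ContentChange {
        final String previousRoot; // "from" pointer
        final String root;         // "to" pointer
        final boolean noop;        // true if the listener was not affected

        ContentChange(String previousRoot, String root, boolean noop) {
            this.previousRoot = previousRoot;
            this.root = root;
            this.noop = noop;
        }
    }

    private final String listenerId;
    private final Deque<ContentChange> queue = new ArrayDeque<>();
    private String lastRoot;

    PrefilteredQueue(String listenerId, String initialRoot) {
        this.listenerId = listenerId;
        this.lastRoot = initialRoot;
    }

    void contentChanged(String root, Set<String> potentiallyAffected) {
        ContentChange tail = queue.peekLast();
        if (potentiallyAffected.contains(listenerId)) {
            // Real change: lastRoot equals the trailing NOOP's root if there is one.
            queue.addLast(new ContentChange(lastRoot, root, false));
        } else if (tail != null && tail.noop) {
            // Collapse two consecutive NOOPs into one spanning both commits.
            queue.removeLast();
            queue.addLast(new ContentChange(tail.previousRoot, root, true));
        } else {
            queue.addLast(new ContentChange(lastRoot, root, true));
        }
        lastRoot = root;
    }

    int size() {
        return queue.size();
    }

    ContentChange last() {
        return queue.peekLast();
    }
}
```

With this shape, any number of consecutive commits that do not affect the listener occupy a single slot, and the next real change still carries the correct "from" root.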





[jira] [Updated] (OAK-4796) filter events before adding to ChangeProcessor's queue

2016-09-12 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-4796:
-
Fix Version/s: 1.6



[jira] [Updated] (OAK-4796) filter events before adding to ChangeProcessor's queue

2016-09-12 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-4796:
-
Affects Version/s: 1.5.9

> filter events before adding to ChangeProcessor's queue
> --
>
> Key: OAK-4796
> URL: https://issues.apache.org/jira/browse/OAK-4796
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: jcr
> Affects Versions: 1.5.9
> Reporter: Stefan Egli
> Assignee: Stefan Egli
> Labels: observation
> Fix For: 1.6
>
>
> Currently the 
> [ChangeProcessor.contentChanged|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L335]
>  is in charge of doing the event diffing and filtering and does so in a 
> pooled Thread, ie asynchronously, at a later stage independent from the 
> commit. This has the advantage that the commit is fast, but has the following 
> potentially negative effects:
> # events (in the form of ContentChange Objects) occupy a slot of the queue 
> even if the listener is not interested in it - any commit lands on any 
> listener's queue. This reduces the capacity of the queue for 'actual' events 
> to be delivered. It therefore increases the risk that the queue fills - and 
> when full has various consequences such as losing the CommitInfo etc.
> # each event==ContentChange later on must be evaluated, and for that a diff 
> must be calculated. Depending on runtime behavior that diff might be 
> expensive if no longer in the cache (documentMk specifically).
> As an improvement, this diffing+filtering could be done at an earlier stage 
> already, nearer to the commit, and in case the filter would ignore the event, 
> it would not have to be put into the queue at all, thus avoiding occupying a 
> slot and later potentially slower diffing.
> The suggestion is to implement this via the following algorithm:
> * During the commit, in a {{Validator}} the listener's filters are evaluated 
> - in an as-efficient-as-possible manner (Reason for doing it in a Validator 
> is that this doesn't add overhead as oak already goes through all changes for 
> other Validators). As a result a _list of potentially affected observers_ is 
> added to the {{CommitInfo}} (false positives are fine).
> ** Note that the above adds cost to the commit and must therefore be 
> carefully done and measured
> ** One potential measure could be to only do filtering when listener's queues 
> are larger than a certain threshold (eg 10)
> * The ChangeProcessor in {{contentChanged}} (in the one created in 
> [createObserver|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L224])
>  then checks the new commitInfo's _potentially affected observers_ list and 
> if it's not in the list, adds a {{NOOP}} token at the end of the queue. If 
> there's already a NOOP there, the two are collapsed (this way when a filter 
> is not affected it would have a NOOP at the end of the queue). If later on a 
> no-NOOP item is added, the NOOP's {{root}} is used as the {{previousRoot}} 
> for the newly added {{ContentChange}} obj.
> ** To achieve that, the ContentChange obj is extended to not only have the 
> "to" {{root}} pointer, but also the "from" {{previousRoot}} pointer which 
> currently is implicitly maintained.





[jira] [Created] (OAK-4796) filter events before adding to ChangeProcessor's queue

2016-09-12 Thread Stefan Egli (JIRA)
Stefan Egli created OAK-4796:


Summary: filter events before adding to ChangeProcessor's queue
Key: OAK-4796
URL: https://issues.apache.org/jira/browse/OAK-4796
Project: Jackrabbit Oak
Issue Type: Improvement
Components: jcr
Reporter: Stefan Egli


Currently the 
[ChangeProcessor.contentChanged|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L335]
 is in charge of doing the event diffing and filtering and does so in a pooled 
Thread, ie asynchronously, at a later stage independent from the commit. This 
has the advantage that the commit is fast, but has the following potentially 
negative effects:
# events (in the form of ContentChange Objects) occupy a slot of the queue even 
if the listener is not interested in it - any commit lands on any listener's 
queue. This reduces the capacity of the queue for 'actual' events to be 
delivered. It therefore increases the risk that the queue fills - and when full 
has various consequences such as losing the CommitInfo etc.
# each event==ContentChange later on must be evaluated, and for that a diff 
must be calculated. Depending on runtime behavior that diff might be expensive 
if no longer in the cache (documentMk specifically).

As an improvement, this diffing+filtering could be done at an earlier stage 
already, nearer to the commit, and in case the filter would ignore the event, 
it would not have to be put into the queue at all, thus avoiding occupying a 
slot and later potentially slower diffing.

The suggestion is to implement this via the following algorithm:

* During the commit, in a {{Validator}} the listener's filters are evaluated - 
in an as-efficient-as-possible manner (Reason for doing it in a Validator is 
that this doesn't add overhead as oak already goes through all changes for 
other Validators). As a result a _list of potentially affected observers_ is 
added to the {{CommitInfo}} (false positives are fine).
** Note that the above adds cost to the commit and must therefore be carefully 
done and measured
** One potential measure could be to only do filtering when listener's queues 
are larger than a certain threshold (eg 10)
* The ChangeProcessor in {{contentChanged}} (in the one created in 
[createObserver|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L224])
 then checks the new commitInfo's _potentially affected observers_ list and if 
it's not in the list, adds a {{NOOP}} token at the end of the queue. If there's 
already a NOOP there, the two are collapsed (this way when a filter is not 
affected it would have a NOOP at the end of the queue). If later on a no-NOOP 
item is added, the NOOP's {{root}} is used as the {{previousRoot}} for the 
newly added {{ContentChange}} obj.
** To achieve that, the ContentChange obj is extended to not only have the "to" 
{{root}} pointer, but also the "from" {{previousRoot}} pointer which currently 
is implicitly maintained.
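
The NOOP handling described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Oak code: roots are plain strings instead of NodeState objects, the class and method names (NoopQueueSketch, Entry, contentChanged with a boolean flag) are hypothetical, and the sketch assumes a trailing NOOP stays in the queue when a real change follows it.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative only: roots are plain strings, not Oak NodeState objects. */
final class NoopQueueSketch {

    /** One queue slot: a change from previousRoot to root, or a NOOP marker. */
    static final class Entry {
        final String previousRoot; // explicit "from" root
        final String root;         // "to" root
        final boolean noop;        // true when the listener was prefiltered out

        Entry(String previousRoot, String root, boolean noop) {
            this.previousRoot = previousRoot;
            this.root = root;
            this.noop = noop;
        }
    }

    private final Deque<Entry> queue = new ArrayDeque<>();
    private String lastRoot; // root of the most recently seen commit

    NoopQueueSketch(String initialRoot) {
        this.lastRoot = initialRoot;
    }

    /** 'affected' stands for "this listener is in the potentially affected list". */
    void contentChanged(String root, boolean affected) {
        Entry tail = queue.peekLast();
        if (!affected && tail != null && tail.noop) {
            // Collapse: extend the trailing NOOP instead of using another slot.
            queue.pollLast();
            queue.addLast(new Entry(tail.previousRoot, root, true));
        } else {
            // A real change, or the first NOOP after one. lastRoot equals the
            // trailing NOOP's root when there is one, so it serves as previousRoot.
            queue.addLast(new Entry(lastRoot, root, !affected));
        }
        lastRoot = root;
    }

    int size() {
        return queue.size();
    }

    Entry last() {
        return queue.peekLast();
    }

    public static void main(String[] args) {
        NoopQueueSketch q = new NoopQueueSketch("r0");
        q.contentChanged("r1", false);
        q.contentChanged("r2", false); // collapsed into the existing NOOP
        q.contentChanged("r3", true);  // real change, diffed from r2
        System.out.println(q.size() + " entries, last diff "
                + q.last().previousRoot + " -> " + q.last().root);
    }
}
```

However many unaffected commits arrive in a row, they occupy a single slot, and the next real change still knows which root to diff from.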





[jira] [Comment Edited] (OAK-4796) filter events before adding to ChangeProcessor's queue

2016-09-29 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532623#comment-15532623
 ] 

Stefan Egli edited comment on OAK-4796 at 9/29/16 3:24 PM:
---

As discussed offline with Marcel, I'll work on a patch for the 2nd variant, so 
that we can compare the complexity/result. 

Also realized that the moment when filtering is applied is critical: 
prefiltering (in a CommitHook or Observer) might be applied with a filter A, 
which could potentially be changed to A' before the event is delivered. 
Currently though before delivering, the new filter A' would be applied. This is 
wrong. We need to either do prefiltering or postfiltering and can't mix the 
two. Therefore for prefiltering it's essential to pass around the applied 
filter in the ContentChange obj and use that later at delivery time.
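
A minimal sketch of that idea, assuming a much simplified model: changes are reduced to a path string, filters are plain predicates, and all names (FilterSnapshotSketch, QueuedChange, deliverAll) are hypothetical rather than taken from Oak. Each queued entry remembers the filter that was in effect when it was prefiltered, and delivery reuses that filter even if the listener's filter has since been replaced.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Predicate;

/** Illustrative only: not Oak's ChangeProcessor or FilterProvider. */
final class FilterSnapshotSketch {

    /** A queued change together with the filter used for prefiltering. */
    static final class QueuedChange {
        final String path;
        final Predicate<String> appliedFilter;

        QueuedChange(String path, Predicate<String> appliedFilter) {
            this.path = path;
            this.appliedFilter = appliedFilter;
        }
    }

    private final Queue<QueuedChange> queue = new ArrayDeque<>();
    private Predicate<String> currentFilter;

    FilterSnapshotSketch(Predicate<String> initialFilter) {
        this.currentFilter = initialFilter;
    }

    /** Replaces the filter; entries already queued keep their old one. */
    void setFilter(Predicate<String> filter) {
        this.currentFilter = filter;
    }

    /** Prefilter with the current filter and remember it on the entry. */
    void contentChanged(String path) {
        Predicate<String> filter = currentFilter;
        if (filter.test(path)) {
            queue.add(new QueuedChange(path, filter));
        }
    }

    /** Postfilter each entry with the filter it was prefiltered with. */
    List<String> deliverAll() {
        List<String> delivered = new ArrayList<>();
        QueuedChange change;
        while ((change = queue.poll()) != null) {
            if (change.appliedFilter.test(change.path)) {
                delivered.add(change.path);
            }
        }
        return delivered;
    }
}
```

With this shape, switching from filter A to A' affects only changes that arrive after the switch, so prefiltering and postfiltering never disagree for a given entry.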


was (Author: egli):
As discussed offline with Marcel, I'll work on a patch for the 2nd variant, so 
that we can compare the complexity/result.

> filter events before adding to ChangeProcessor's queue
> --
>
> Key: OAK-4796
> URL: https://issues.apache.org/jira/browse/OAK-4796
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: jcr
> Affects Versions: 1.5.9
> Reporter: Stefan Egli
> Assignee: Stefan Egli
> Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4796.patch
>
>
> Currently the 
> [ChangeProcessor.contentChanged|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L335]
>  is in charge of doing the event diffing and filtering and does so in a 
> pooled Thread, ie asynchronously, at a later stage independent from the 
> commit. This has the advantage that the commit is fast, but has the following 
> potentially negative effects:
> # events (in the form of ContentChange Objects) occupy a slot of the queue 
> even if the listener is not interested in it - any commit lands on any 
> listener's queue. This reduces the capacity of the queue for 'actual' events 
> to be delivered. It therefore increases the risk that the queue fills - and 
> when full has various consequences such as losing the CommitInfo etc.
> # each event==ContentChange later on must be evaluated, and for that a diff 
> must be calculated. Depending on runtime behavior that diff might be 
> expensive if no longer in the cache (documentMk specifically).
> As an improvement, this diffing+filtering could be done at an earlier stage 
> already, nearer to the commit, and in case the filter would ignore the event, 
> it would not have to be put into the queue at all, thus avoiding occupying a 
> slot and later potentially slower diffing.
> The suggestion is to implement this via the following algorithm:
> * During the commit, in a {{Validator}} the listener's filters are evaluated 
> - in an as-efficient-as-possible manner (Reason for doing it in a Validator 
> is that this doesn't add overhead as oak already goes through all changes for 
> other Validators). As a result a _list of potentially affected observers_ is 
> added to the {{CommitInfo}} (false positives are fine).
> ** Note that the above adds cost to the commit and must therefore be 
> carefully done and measured
> ** One potential measure could be to only do filtering when listener's queues 
> are larger than a certain threshold (eg 10)
> * The ChangeProcessor in {{contentChanged}} (in the one created in 
> [createObserver|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L224])
>  then checks the new commitInfo's _potentially affected observers_ list and 
> if it's not in the list, adds a {{NOOP}} token at the end of the queue. If 
> there's already a NOOP there, the two are collapsed (this way when a filter 
> is not affected it would have a NOOP at the end of the queue). If later on a 
> no-NOOP item is added, the NOOP's {{root}} is used as the {{previousRoot}} 
> for the newly added {{ContentChange}} obj.
> ** To achieve that, the ContentChange obj is extended to not only have the 
> "to" {{root}} pointer, but also the "from" {{previousRoot}} pointer which 
> currently is implicitly maintained.





[jira] [Commented] (OAK-4898) Allow for external changes to have a CommitInfo attached

2016-10-06 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15551192#comment-15551192
 ] 

Stefan Egli commented on OAK-4898:
--

+1
Besides allowing the additional information mentioned, I think we should change 
the logic so that CommitInfo is never null, even in the overflow case, for the 
same reason: it would allow passing state between/to Observers.

> Allow for external changes to have a CommitInfo attached
> 
>
> Key: OAK-4898
> URL: https://issues.apache.org/jira/browse/OAK-4898
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core
> Reporter: Chetan Mehrotra
> Fix For: 1.6
>
>
> Currently the observation logic relies on the fact that a null CommitInfo 
> means that the changes come from another cluster node, i.e. are external. 
> We should change this semantic and provide a different way to indicate that 
> changes are external. This would allow a NodeStore implementation to still 
> pass in a CommitInfo which captures useful information about the commit, such 
> as a brief summary of what got changed, which can be used for prefiltering 
> (OAK-4796)





[jira] [Comment Edited] (OAK-4796) filter events before adding to ChangeProcessor's queue

2016-10-05 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15548765#comment-15548765
 ] 

Stefan Egli edited comment on OAK-4796 at 10/5/16 2:11 PM:
---

bq. This is wrong. We need to either do prefiltering or postfiltering and can't 
mix the two. Therefore for prefiltering it's essential to pass around the 
applied filter in the ContentChange obj and use that later at delivery time.
Coming back to this point, there seem to be some issues with this based on the 
current design: Prior to prefiltering we had only postfiltering. And changing 
the FilterProvider was applied immediately - basically on all elements in the 
queue. With prefiltering this is, as pointed out, not correct: those elements 
in the queue already have gone through prefiltering, so postfiltering should be 
done with the same FilterProvider. Which means, the ChangeProcessor - which is 
in charge of postfiltering - should not use the FilterProvider set on its 
instance, but use the same that was used for prefiltering. Therefore the 
ChangeProcessor needs to be given the FilterProvider for each change that it 
processes. The way it receives changes though is via the 
Observer.contentChanged. Therefore about the only feasible place to pass the 
FilterProvider from BackgroundObserver to ChangeProcessor is via the CommitInfo.

Thing now is that for external and overflow entries the CommitInfo is null. So 
I'd say, as long as that's the case it's very hard to implement correctly 
switching the filter.

Unless this switch is done correctly, the only thing that can be said is that: 
when a filter is changed it is undefined for which changes both filters are 
applied (if the queue is not empty when switching).


was (Author: egli):
bq. This is wrong. We need to either do prefiltering or postfiltering and can't 
mix the two. Therefore for prefiltering it's essential to pass around the 
applied filter in the ContentChange obj and use that later at delivery time.
Coming back to this point, there seem to be some issues with this based on the 
current design: Prior to prefiltering we had only postfiltering. And changing 
the FilterProvider was applied immediately - basically on all elements in the 
queue. With prefiltering this is, as pointed out, not correct: those elements 
in the queue already have gone through prefiltering, so postfiltering should be 
done with the same FilterProvider. Which means, the ChangeProcessor - which is 
in charge of postfiltering - should not use the FilterProvider set on its 
instance, but use the same that was used for prefiltering. Therefore the 
ChangeProcessor needs to be given the FilterProvider for each change that it 
processes. The way it receives changes though is via the 
Observer.contentChanged. Therefore about the only feasible place to pass the 
FilterProvider from BackgroundObserver to ChangeProcessor is via the CommitInfo.

Thing now is that for external and overflow entries the CommitInfo is null. So 
I'd say, as long as that's the case it's very hard to implement correctly 
switching the filter.

Unless this switch is done correctly, the only thing that can be said is that: 
when a filter is changed it is undefined if the old, the new or both filters 
are applied to entries in the queue.

> filter events before adding to ChangeProcessor's queue
> --
>
> Key: OAK-4796
> URL: https://issues.apache.org/jira/browse/OAK-4796
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: jcr
> Affects Versions: 1.5.9
> Reporter: Stefan Egli
> Assignee: Stefan Egli
> Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4796.changeSet.patch, OAK-4796.patch
>
>
> Currently the 
> [ChangeProcessor.contentChanged|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L335]
>  is in charge of doing the event diffing and filtering and does so in a 
> pooled Thread, ie asynchronously, at a later stage independent from the 
> commit. This has the advantage that the commit is fast, but has the following 
> potentially negative effects:
> # events (in the form of ContentChange Objects) occupy a slot of the queue 
> even if the listener is not interested in it - any commit lands on any 
> listener's queue. This reduces the capacity of the queue for 'actual' events 
> to be delivered. It therefore increases the risk that the queue fills - and 
> when full has various consequences such as losing the CommitInfo etc.
> # each event==ContentChange later on must be evaluated, and for that a diff 
> must be calculated. Depending on runtime behavior that diff might be 
> expensive if no longer in the cache (documentMk 

[jira] [Commented] (OAK-4796) filter events before adding to ChangeProcessor's queue

2016-10-05 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15548765#comment-15548765
 ] 

Stefan Egli commented on OAK-4796:
--

bq. This is wrong. We need to either do prefiltering or postfiltering and can't 
mix the two. Therefore for prefiltering it's essential to pass around the 
applied filter in the ContentChange obj and use that later at delivery time.
Coming back to this point, there seem to be some issues with this based on the 
current design: Prior to prefiltering we had only postfiltering. And changing 
the FilterProvider was applied immediately - basically on all elements in the 
queue. With prefiltering this is, as pointed out, not correct: those elements 
in the queue already have gone through prefiltering, so postfiltering should be 
done with the same FilterProvider. Which means, the ChangeProcessor - which is 
in charge of postfiltering - should not use the FilterProvider set on its 
instance, but use the same that was used for prefiltering. Therefore the 
ChangeProcessor needs to be given the FilterProvider for each change that it 
processes. The way it receives changes though is via the 
Observer.contentChanged. Therefore about the only feasible place to pass the 
FilterProvider from BackgroundObserver to ChangeProcessor is via the CommitInfo.

Thing now is that for external and overflow entries the CommitInfo is null. So 
I'd say, as long as that's the case it's very hard to implement correctly 
switching the filter.

Unless this switch is done correctly, the only thing that can be said is that: 
when a filter is changed it is undefined if the old, the new or both filters 
are applied to entries in the queue.

> filter events before adding to ChangeProcessor's queue
> --
>
> Key: OAK-4796
> URL: https://issues.apache.org/jira/browse/OAK-4796
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: jcr
> Affects Versions: 1.5.9
> Reporter: Stefan Egli
> Assignee: Stefan Egli
> Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4796.changeSet.patch, OAK-4796.patch
>
>
> Currently the 
> [ChangeProcessor.contentChanged|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L335]
>  is in charge of doing the event diffing and filtering and does so in a 
> pooled Thread, ie asynchronously, at a later stage independent from the 
> commit. This has the advantage that the commit is fast, but has the following 
> potentially negative effects:
> # events (in the form of ContentChange Objects) occupy a slot of the queue 
> even if the listener is not interested in it - any commit lands on any 
> listener's queue. This reduces the capacity of the queue for 'actual' events 
> to be delivered. It therefore increases the risk that the queue fills - and 
> when full has various consequences such as losing the CommitInfo etc.
> # each event==ContentChange later on must be evaluated, and for that a diff 
> must be calculated. Depending on runtime behavior that diff might be 
> expensive if no longer in the cache (documentMk specifically).
> As an improvement, this diffing+filtering could be done at an earlier stage 
> already, nearer to the commit, and in case the filter would ignore the event, 
> it would not have to be put into the queue at all, thus avoiding occupying a 
> slot and later potentially slower diffing.
> The suggestion is to implement this via the following algorithm:
> * During the commit, in a {{Validator}} the listener's filters are evaluated 
> - in an as-efficient-as-possible manner (Reason for doing it in a Validator 
> is that this doesn't add overhead as oak already goes through all changes for 
> other Validators). As a result a _list of potentially affected observers_ is 
> added to the {{CommitInfo}} (false positives are fine).
> ** Note that the above adds cost to the commit and must therefore be 
> carefully done and measured
> ** One potential measure could be to only do filtering when listener's queues 
> are larger than a certain threshold (eg 10)
> * The ChangeProcessor in {{contentChanged}} (in the one created in 
> [createObserver|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L224])
>  then checks the new commitInfo's _potentially affected observers_ list and 
> if it's not in the list, adds a {{NOOP}} token at the end of the queue. If 
> there's already a NOOP there, the two are collapsed (this way when a filter 
> is not affected it would have a NOOP at the end of the queue). If later on a 
> no-NOOP item is added, the NOOP's {{root}} is used as 

[jira] [Comment Edited] (OAK-4796) filter events before adding to ChangeProcessor's queue

2016-10-05 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15548765#comment-15548765
 ] 

Stefan Egli edited comment on OAK-4796 at 10/5/16 2:14 PM:
---

bq. This is wrong. We need to either do prefiltering or postfiltering and can't 
mix the two. Therefore for prefiltering it's essential to pass around the 
applied filter in the ContentChange obj and use that later at delivery time.
Coming back to this point, there seem to be some issues with this based on the 
current design: Prior to prefiltering we had only postfiltering. And changing 
the FilterProvider was applied immediately - basically on all elements in the 
queue. With prefiltering this is, as pointed out, not correct: those elements 
in the queue already have gone through prefiltering, so postfiltering should be 
done with the same FilterProvider. Which means, the ChangeProcessor - which is 
in charge of postfiltering - should not use the FilterProvider set on its 
instance, but use the same that was used for prefiltering. Therefore the 
ChangeProcessor needs to be given the FilterProvider for each change that it 
processes. The way it receives changes though is via the 
Observer.contentChanged. Therefore about the only feasible place to pass the 
FilterProvider from BackgroundObserver to ChangeProcessor is via the CommitInfo.

Thing now is that for external and overflow entries the CommitInfo is null. So 
I'd say, as long as that's the case it's very hard to implement correctly 
switching the filter.

Unless this switch is done correctly, the only thing that can be said is that: 
when a filter is changed and the queue is not empty, then both filters are 
applied. However the listener doesn't know anything about the queue internals, 
so cannot draw any conclusions based on that.


was (Author: egli):
bq. This is wrong. We need to either do prefiltering or postfiltering and can't 
mix the two. Therefore for prefiltering it's essential to pass around the 
applied filter in the ContentChange obj and use that later at delivery time.
Coming back to this point, there seem to be some issues with this based on the 
current design: Prior to prefiltering we had only postfiltering. And changing 
the FilterProvider was applied immediately - basically on all elements in the 
queue. With prefiltering this is, as pointed out, not correct: those elements 
in the queue already have gone through prefiltering, so postfiltering should be 
done with the same FilterProvider. Which means, the ChangeProcessor - which is 
in charge of postfiltering - should not use the FilterProvider set on its 
instance, but use the same that was used for prefiltering. Therefore the 
ChangeProcessor needs to be given the FilterProvider for each change that it 
processes. The way it receives changes though is via the 
Observer.contentChanged. Therefore about the only feasible place to pass the 
FilterProvider from BackgroundObserver to ChangeProcessor is via the CommitInfo.

Thing now is that for external and overflow entries the CommitInfo is null. So 
I'd say, as long as that's the case it's very hard to implement correctly 
switching the filter.

Unless this switch is done correctly, the only thing that can be said is that: 
when a filter is changed it is undefined for which changes both filters are 
applied (if the queue is not empty when switching).

> filter events before adding to ChangeProcessor's queue
> --
>
> Key: OAK-4796
> URL: https://issues.apache.org/jira/browse/OAK-4796
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: jcr
> Affects Versions: 1.5.9
> Reporter: Stefan Egli
> Assignee: Stefan Egli
> Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4796.changeSet.patch, OAK-4796.patch
>
>
> Currently the 
> [ChangeProcessor.contentChanged|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L335]
>  is in charge of doing the event diffing and filtering and does so in a 
> pooled Thread, ie asynchronously, at a later stage independent from the 
> commit. This has the advantage that the commit is fast, but has the following 
> potentially negative effects:
> # events (in the form of ContentChange Objects) occupy a slot of the queue 
> even if the listener is not interested in it - any commit lands on any 
> listener's queue. This reduces the capacity of the queue for 'actual' events 
> to be delivered. It therefore increases the risk that the queue fills - and 
> when full has various consequences such as losing the CommitInfo etc.
> # each event==ContentChange later on must be evaluated, and for that a diff 
> must be calculated. Depending on 

[jira] [Updated] (OAK-4796) filter events before adding to ChangeProcessor's queue

2016-10-05 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-4796:
-
Attachment: OAK-4796.changeSet.patch

Attaching a second variant of the patch ([^OAK-4796.changeSet.patch]) which is 
based on Chetan's suggestion: it composes a set of changes (parent-paths, 
propertyNames, nodeTypes, nodeNames) in an Editor (actually, I've used a 
ValidatorProvider/Validator pair), stores it as a property in the CommitContext 
of the CommitInfo, and evaluates it in oak-jcr's ChangeProcessor. This patch 
also includes some minimal statistics in the consolidated listener stats that 
show how many commits were either skipped (because the feature was disabled or 
the CommitInfo was null etc), included or excluded. The ObservationTest passes 
with prefiltering enabled; however, I still plan to add some more specific 
testing. 
I would welcome a review of this second approach (compared to the first, which 
was EventFilter-based). /cc [~mreutegg], [~chetanm], [~mduerig], [~catholicon]
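
To illustrate the change-set idea, here is a rough sketch of how such a summary might be evaluated against one listener. It is not the code from the attached patch: the class names (ChangeSetPrefilterSketch, ChangeSet, Interest) are hypothetical, only parent paths and property names are modelled (node types and node names would be handled analogously), and a listener restricted to property names is assumed to care only about property events. False positives are acceptable here, since postfiltering still runs later.

```java
import java.util.Collections;
import java.util.Set;

/** Illustrative only: a simplified stand-in for change-set based prefiltering. */
final class ChangeSetPrefilterSketch {

    /** Mirrors the three counters mentioned for the listener stats. */
    enum Result { SKIPPED, INCLUDED, EXCLUDED }

    /** Summary of a commit, collected while the commit is validated. */
    static final class ChangeSet {
        final Set<String> parentPaths;
        final Set<String> propertyNames;

        ChangeSet(Set<String> parentPaths, Set<String> propertyNames) {
            this.parentPaths = parentPaths;
            this.propertyNames = propertyNames;
        }
    }

    /** What one listener cares about; an empty set means "no restriction". */
    static final class Interest {
        final Set<String> pathPrefixes;
        final Set<String> propertyNames;

        Interest(Set<String> pathPrefixes, Set<String> propertyNames) {
            this.pathPrefixes = pathPrefixes;
            this.propertyNames = propertyNames;
        }
    }

    static boolean isUnder(String path, String prefix) {
        return prefix.equals("/") || path.equals(prefix) || path.startsWith(prefix + "/");
    }

    static Result prefilter(ChangeSet changes, Interest interest) {
        if (changes == null) {
            return Result.SKIPPED; // no summary available, fall back to postfiltering
        }
        if (!interest.pathPrefixes.isEmpty()) {
            boolean pathHit = false;
            for (String path : changes.parentPaths) {
                for (String prefix : interest.pathPrefixes) {
                    if (isUnder(path, prefix)) {
                        pathHit = true;
                    }
                }
            }
            if (!pathHit) {
                return Result.EXCLUDED;
            }
        }
        if (!interest.propertyNames.isEmpty()
                && Collections.disjoint(changes.propertyNames, interest.propertyNames)) {
            return Result.EXCLUDED;
        }
        return Result.INCLUDED; // may be a false positive
    }
}
```

Only commits that come back as EXCLUDED can be dropped before they reach the listener's queue; SKIPPED and INCLUDED commits take the normal path.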

> filter events before adding to ChangeProcessor's queue
> --
>
> Key: OAK-4796
> URL: https://issues.apache.org/jira/browse/OAK-4796
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: jcr
> Affects Versions: 1.5.9
> Reporter: Stefan Egli
> Assignee: Stefan Egli
> Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4796.changeSet.patch, OAK-4796.patch
>
>
> Currently the 
> [ChangeProcessor.contentChanged|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L335]
>  is in charge of doing the event diffing and filtering and does so in a 
> pooled Thread, ie asynchronously, at a later stage independent from the 
> commit. This has the advantage that the commit is fast, but has the following 
> potentially negative effects:
> # events (in the form of ContentChange Objects) occupy a slot of the queue 
> even if the listener is not interested in it - any commit lands on any 
> listener's queue. This reduces the capacity of the queue for 'actual' events 
> to be delivered. It therefore increases the risk that the queue fills - and 
> when full has various consequences such as losing the CommitInfo etc.
> # each event==ContentChange later on must be evaluated, and for that a diff 
> must be calculated. Depending on runtime behavior that diff might be 
> expensive if no longer in the cache (documentMk specifically).
> As an improvement, this diffing+filtering could be done at an earlier stage 
> already, nearer to the commit, and in case the filter would ignore the event, 
> it would not have to be put into the queue at all, thus avoiding occupying a 
> slot and later potentially slower diffing.
> The suggestion is to implement this via the following algorithm:
> * During the commit, in a {{Validator}} the listener's filters are evaluated 
> - in an as-efficient-as-possible manner (Reason for doing it in a Validator 
> is that this doesn't add overhead as oak already goes through all changes for 
> other Validators). As a result a _list of potentially affected observers_ is 
> added to the {{CommitInfo}} (false positives are fine).
> ** Note that the above adds cost to the commit and must therefore be 
> carefully done and measured
> ** One potential measure could be to only do filtering when listener's queues 
> are larger than a certain threshold (eg 10)
> * The ChangeProcessor in {{contentChanged}} (in the one created in 
> [createObserver|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L224])
>  then checks the new commitInfo's _potentially affected observers_ list and 
> if it's not in the list, adds a {{NOOP}} token at the end of the queue. If 
> there's already a NOOP there, the two are collapsed (this way when a filter 
> is not affected it would have a NOOP at the end of the queue). If later on a 
> non-NOOP item is added, the NOOP's {{root}} is used as the {{previousRoot}} 
> for the newly added {{ContentChange}} obj.
> ** To achieve that, the ContentChange obj is extended to not only have the 
> "to" {{root}} pointer, but also the "from" {{previousRoot}} pointer which 
> currently is implicitly maintained.
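
The NOOP collapsing described in the last bullets can be sketched roughly as follows. This is only a minimal illustration, not the code from the attached patches: roots are plain Strings standing in for NodeStates, and the names {{ContentChange}} fields, {{CollapsingQueue}} and {{contentChanged(newRoot, listenerAffected)}} are hypothetical. It also assumes that a trailing NOOP is dropped once a real change arrives, which the description leaves open.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical queue entry: a change from previousRoot to root, or a NOOP placeholder. */
final class ContentChange {
    final String previousRoot; // explicit "from" pointer
    final String root;         // "to" pointer
    final boolean noop;        // true if the listener's filter was not affected

    ContentChange(String previousRoot, String root, boolean noop) {
        this.previousRoot = previousRoot;
        this.root = root;
        this.noop = noop;
    }
}

/** Sketch of a per-listener queue that collapses consecutive NOOP tokens. */
final class CollapsingQueue {
    private final Deque<ContentChange> queue = new ArrayDeque<>();
    private String lastRoot; // root of the most recently seen commit

    CollapsingQueue(String initialRoot) {
        this.lastRoot = initialRoot;
    }

    void contentChanged(String newRoot, boolean listenerAffected) {
        ContentChange tail = queue.peekLast();
        if (!listenerAffected) {
            // Collapse with an existing trailing NOOP so it keeps occupying one slot only.
            String from = lastRoot;
            if (tail != null && tail.noop) {
                from = tail.previousRoot;
                queue.removeLast();
            }
            queue.addLast(new ContentChange(from, newRoot, true));
        } else {
            // Assumption: the trailing NOOP is dropped; its root (== lastRoot)
            // becomes the previousRoot of the real change.
            if (tail != null && tail.noop) {
                queue.removeLast();
            }
            queue.addLast(new ContentChange(lastRoot, newRoot, false));
        }
        lastRoot = newRoot;
    }

    int size() {
        return queue.size();
    }

    ContentChange peekLast() {
        return queue.peekLast();
    }
}
```

With this shape, any number of uninteresting commits costs a listener at most one queue slot, and the next interesting commit is diffed only against the root of the last skipped one.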



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-10-05 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15549104#comment-15549104
 ] 

Stefan Egli commented on OAK-4581:
--

Summarizing those numerous recent threaded comments, I believe we have 
consensus that the goal is to get rid of external use of BackgroundObserver 
(not in scope of this ticket, but a consequence) and that we do the *persistence 
on the oak-jcr/ChangeProcessor level*. This makes the persistence independent 
from cache and GC problems, works fine with filter changes, but has the 
downside that it requires larger temporary storage (as events are 
'exploded'). The actual file format of the stored events is somewhat of an 
orthogonal/detail question, but can be something like a flat file.
Unless I hear objections I'm looking at following up on these assumptions in the 
next few days.
/cc [~chetanm], [~mduerig], [~mreutegg], [~reschke], [~catholicon]

> Persistent local journal for more reliable event generation
> ---
>
> Key: OAK-4581
> URL: https://issues.apache.org/jira/browse/OAK-4581
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: core
>Reporter: Chetan Mehrotra
>Assignee: Stefan Egli
>  Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4581.v0.patch
>
>
> As discussed in OAK-2683 "hitting the observation queue limit" has multiple 
> drawbacks. Quite a bit of work is done to make diff generation faster. 
> However there are still chances of event queue getting filled up. 
> This issue is meant to implement a persistent event journal. Idea here being
> # NodeStore would push the diff into a persistent store via a synchronous 
> observer
> # Observers which are meant to handle such events in an async way (by virtue of 
> being wrapped in BackgroundObserver) would instead pull the events from this 
> persisted journal
> h3. A - What is persisted
> h4. 1 - Serialized Root States and CommitInfo
> In this approach we just persist the root states in serialized form. 
> * DocumentNodeStore - This means storing the root revision vector
> * SegmentNodeStore - {color:red}Q1 - What does the serialized form of 
> SegmentNodeStore root state look like{color} - Possibly the RecordId of 
> "root" state
> Note that with OAK-4528 DocumentNodeStore can rely on persisted remote 
> journal to determine the affected paths. Which reduces the need for 
> persisting complete diff locally.
> Event generation logic would then "deserialize" the persisted root states and 
> then generate the diff as currently done via NodeState comparison
> h4. 2 - Serialized commit diff and CommitInfo
> In this approach we can save the diff in JSOP form. The diff only contains 
> information about affected paths, similar to what is currently being stored in 
> the DocumentNodeStore journal
> h4. CommitInfo
> The commit info would also need to be serialized. So it needs to be ensured that 
> whatever is stored there can be serialized or recalculated
> h3. B - How it is persisted
> h4. 1 - Use a secondary segment NodeStore
> OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. 
> [~mreutegg] suggested that for persisted local journal we can also utilize a 
> SegmentNodeStore instance. Care needs to be taken for compaction. Either via 
> generation approach or relying on online compaction
> h4. 2 - Make use of write ahead log implementations
> [~ianeboston] suggested that we can make use of some write ahead log 
> implementation like [1], [2] or [3]
> h3. C - How changes get pulled
> Some points to consider for event generation logic
> # Would need a way to keep pointers to journal entries on a per listener basis. 
> This would allow each Listener to "pull" content changes and generate the diff at 
> its own speed while keeping the in-memory overhead low
> # The journal should survive restarts
> [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html
> [2] 
> https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal
> [3] 
> https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog
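
To make section C a bit more concrete, here is a minimal sketch of a journal that keeps one pointer per listener and survives restarts. It is an illustration under stated assumptions only: entries are single-line strings (for example a serialized root revision), storage is a flat file plus one small pointer file per listener, and {{LocalJournalSketch}}, {{append}} and {{pull}} are hypothetical names that do not exist in Oak. Compaction, concurrency and crash safety are ignored.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical flat-file journal with a persisted read pointer per listener. */
final class LocalJournalSketch {
    private final Path dir;
    private final Path entries;

    LocalJournalSketch(Path dir) {
        this.dir = dir;
        this.entries = dir.resolve("journal.log");
        try {
            Files.createDirectories(dir);
            if (Files.notExists(entries)) {
                Files.createFile(entries);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Called from the synchronous observer: appends one single-line entry. */
    void append(String serializedEntry) {
        try {
            Files.write(entries, Collections.singletonList(serializedEntry),
                    StandardCharsets.UTF_8, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns all entries the given listener has not seen yet and advances its pointer. */
    List<String> pull(String listenerId) {
        try {
            List<String> all = Files.readAllLines(entries, StandardCharsets.UTF_8);
            Path pointer = dir.resolve(listenerId + ".pos");
            int position = 0;
            if (Files.exists(pointer)) {
                String stored = new String(Files.readAllBytes(pointer), StandardCharsets.UTF_8);
                position = Integer.parseInt(stored.trim());
            }
            List<String> pending = new ArrayList<>(all.subList(position, all.size()));
            // Persist the new pointer so that a restart resumes where the listener left off.
            Files.write(pointer, String.valueOf(all.size()).getBytes(StandardCharsets.UTF_8));
            return pending;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because each listener only stores a position, a slow listener does not hold anything in memory; the cost is that the journal file can only be trimmed up to the smallest pointer.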





[jira] [Commented] (OAK-4796) filter events before adding to ChangeProcessor's queue

2016-10-06 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15551799#comment-15551799
 ] 

Stefan Egli commented on OAK-4796:
--

Many thanks for this thorough review, [~chetanm]! I'll dig through them in 
detail. Agreeing on most points, except perhaps re
bq. Probably we can keep things simple here and focus on included paths 
not sure what exactly you mean here: we've collected nodeTypes too, so are you 
suggesting not to filter by nodeType? if yes, why?

> filter events before adding to ChangeProcessor's queue
> --
>
> Key: OAK-4796
> URL: https://issues.apache.org/jira/browse/OAK-4796
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Affects Versions: 1.5.9
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>  Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4796.changeSet.patch, OAK-4796.patch


[jira] [Updated] (OAK-4907) Collect changes (paths, nts, props..) of a commit in a validator

2016-10-10 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-4907:
-
Attachment: OAK-4907.patch

Attaching [^OAK-4907.patch] which contains mainly:
* ChangeSet : the data object holding actual items changed (paths, names, 
types, properties)
* ChangeCollectorProvider: a ValidatorProvider that can be hooked into 
Oak to have the above ChangeSets generated and stored in CommitContexts
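
For readers without the patch at hand, the idea of such a data object can be sketched as below. This is not the ChangeSet class from the patch; {{ChangeSetSketch}} and its methods are hypothetical, and the path check is a deliberately simple prefix test (it ignores, for example, changes to ancestors of the included path).

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical holder for the items touched by one commit. */
final class ChangeSetSketch {
    private final Set<String> paths = new LinkedHashSet<>();
    private final Set<String> nodeNames = new LinkedHashSet<>();
    private final Set<String> nodeTypes = new LinkedHashSet<>();
    private final Set<String> propertyNames = new LinkedHashSet<>();

    /** Records an added, changed or removed node. */
    void addNode(String path, String primaryType) {
        paths.add(path);
        nodeNames.add(path.substring(path.lastIndexOf('/') + 1));
        nodeTypes.add(primaryType);
    }

    /** Records a changed property on the node at nodePath. */
    void addProperty(String nodePath, String propertyName) {
        paths.add(nodePath);
        propertyNames.add(propertyName);
    }

    Set<String> getNodeNames() {
        return Collections.unmodifiableSet(nodeNames);
    }

    Set<String> getNodeTypes() {
        return Collections.unmodifiableSet(nodeTypes);
    }

    Set<String> getPropertyNames() {
        return Collections.unmodifiableSet(propertyNames);
    }

    /** Prefilter check: could a listener registered on includedPath see any of these changes? */
    boolean mayAffectPath(String includedPath) {
        for (String path : paths) {
            if (includedPath.equals("/") || path.equals(includedPath)
                    || path.startsWith(includedPath + "/")) {
                return true;
            }
        }
        return false;
    }
}
```

An Observer could consult such an object before queuing a change and skip listeners for which the check returns false.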

> Collect changes (paths, nts, props..) of a commit in a validator
> 
>
> Key: OAK-4907
> URL: https://issues.apache.org/jira/browse/OAK-4907
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core
>Affects Versions: 1.5.11
>Reporter: Stefan Egli
>Assignee: Stefan Egli
> Fix For: 1.6
>
> Attachments: OAK-4907.patch
>
>
> It would be useful to collect a set of changes of a commit (eg in a 
> validator) that could later be used in an Observer for eg prefiltering.
> Such a change collector should collect paths, nodetypes, properties, 
> node-names (and perhaps more at a later stage) of all changes and store the 
> result in the CommitInfo's CommitContext.
> Note that this is a result of 
> [discussions|https://issues.apache.org/jira/browse/OAK-4796?focusedCommentId=15550962=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15550962]
>  around design in OAK-4796





[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-21 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509287#comment-15509287
 ] 

Stefan Egli commented on OAK-4581:
--

/cc [~mreutegg]

> Persistent local journal for more reliable event generation
> ---
>
> Key: OAK-4581
> URL: https://issues.apache.org/jira/browse/OAK-4581
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: core
>Reporter: Chetan Mehrotra
>Assignee: Stefan Egli
>  Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4581.v0.patch


[jira] [Assigned] (OAK-4655) Enable configuring multiple segment nodestore instances in same setup

2016-09-21 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli reassigned OAK-4655:


Assignee: Tomek Rękawek  (was: Stefan Egli)

[~tomek.rekawek], looks good, more generic than my version which had 
'observation' hardcoded. Am assigning the ticket to you then, thx!

> Enable configuring multiple segment nodestore instances in same setup
> -
>
> Key: OAK-4655
> URL: https://issues.apache.org/jira/browse/OAK-4655
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: segment-tar, segmentmk
>Reporter: Chetan Mehrotra
>Assignee: Tomek Rękawek
> Fix For: 1.6
>
> Attachments: OAK-4655.v1.patch, OAK-4655.v2.patch
>
>
> With OAK-4369 and OAK-4490 it's now possible to configure a new 
> SegmentNodeStore to act as a secondary nodestore (OAK-4180). Recently, for a few 
> other features, we have seen a requirement to configure a SegmentNodeStore just for 
> storage purposes. For example:
> # OAK-4180 - Enables use of SegmentNodeStore as a secondary store to 
> complement DocumentNodeStore
> #* Always uses BlobStore from primary DocumentNodeStore
> #* Compaction to be enabled
> # OAK-4654 - Enable use of SegmentNodeStore for private mount in a 
> multiplexing nodestore setup
> #* Might use its own blob store
> #* Compaction might be disabled as it would be read only
> # OAK-4581 - Proposes to make use of SegmentNodeStore for storing event queue 
> offline
> In all these setups we need to configure a SegmentNodeStore which has the 
> following aspects
> # NodeStore instance is not directly exposed but exposed via 
> {{NodeStoreProvider}} interface with {{role}} service property specifying the 
> intended usage
> # NodeStore here is not fully functional i.e. it would not be configured with 
> std observers, would not be used by ContentRepository etc
> # It needs to be ensured that any JMX MBean registered accounts for "role" so 
> that there is no collision
> With existing SegmentNodeStoreService we can only configure 1 nodestore. To 
> support above cases we need an OSGi config factory based implementation which 
> enables creation of multiple SegmentNodeStore instances (each with different 
> directory and different settings)
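
The requirements above (one instance per role, separate directories, no JMX name collisions) can be illustrated with a small sketch. It uses no OSGi or Oak APIs; {{StoreFactorySketch}}, {{StoreInstance}} and the JMX domain {{org.example.oak}} are made up for illustration, and a real config factory would create actual SegmentNodeStore instances instead of plain descriptors.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

/** Hypothetical descriptor of one configured store instance. */
final class StoreInstance {
    final String role;
    final String directory;
    final ObjectName mbeanName;

    StoreInstance(String role, String directory, ObjectName mbeanName) {
        this.role = role;
        this.directory = directory;
        this.mbeanName = mbeanName;
    }
}

/** Sketch of a factory that allows several instances, keyed by role. */
final class StoreFactorySketch {
    private final Map<String, StoreInstance> byRole = new LinkedHashMap<>();

    StoreInstance create(String role, String directory) {
        if (byRole.containsKey(role)) {
            throw new IllegalStateException("role already configured: " + role);
        }
        for (StoreInstance existing : byRole.values()) {
            if (existing.directory.equals(directory)) {
                throw new IllegalStateException("directory already in use: " + directory);
            }
        }
        ObjectName name;
        try {
            // The role is part of the MBean name so that instances do not collide.
            name = new ObjectName("org.example.oak:type=SegmentNodeStore,role="
                    + ObjectName.quote(role));
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
        StoreInstance instance = new StoreInstance(role, directory, name);
        byRole.put(role, instance);
        return instance;
    }
}
```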





[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-21 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509914#comment-15509914
 ] 

Stefan Egli commented on OAK-4581:
--

One additional note: should we choose to go the I-B route (queue in 
ChangeProcessor), then this improvement will not become usable by Sling's 
ResourceChangeListener - as that has switched to using an OakResourceListener 
(based on Oak's recommendation to do so, see SLING-3279) and is directly based on 
BackgroundObserver...

> Persistent local journal for more reliable event generation
> ---
>
> Key: OAK-4581
> URL: https://issues.apache.org/jira/browse/OAK-4581
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: core
>Reporter: Chetan Mehrotra
>Assignee: Stefan Egli
>  Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4581.v0.patch


[jira] [Updated] (OAK-4796) filter events before adding to ChangeProcessor's queue

2016-09-19 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-4796:
-
Attachment: OAK-4796.patch

Attaching patch ([^OAK-4796.patch]) which contains the functionality as 
described, here's the implementation again in detail:
* An Observer can now voluntarily implement an extension interface 
{{ObserverValidatorProvider}} which has a {{Validator getRootValidator()}} 
method
* The new ObservationFilterValidatorProvider is the main integrator: it has a 
reference to all such ObserverValidatorProviders and hooks them into the 
validators via a CompositeValidator
* There's now an extension to BackgroundObserver called 
{{PrefilteringBackgroundObserver}} which implements the new 
ObserverValidatorProvider and does so by mapping the Validator interface to the 
EventFilter interface.
* The BackgroundObserver handles the new 
{{"oak.observation.observerFiltersEvaluated"}} and 
{{"oak.observation.interestedObservers"}} properties that are set on the 
CommitContext and if it figures a listener does *not* need any event for a 
commit, marks the *next non filtered* ContentChange with a new 
{{noopPreviousRoot}} (which is a pointer to the last filtered root)
** This noopPreviousRoot is then used to send out a new 
{{CommitInfo.NOOP_CHANGE}} token, which indicates to Observers that they should 
ignore this contentChanged call (but update the root/previousRoot accordingly).
* oak-jcr's ChangeProcessor now uses such a PrefilteringBackgroundObserver and 
does the last piece of the filtering puzzle: it filters the entire change via 
{{includeCommit}} (which applies things like isExternal/isInternal)
* to limit potential performance effects the ChangeProcessor has a flag that 
controls the size of the queue after which prefiltering will be done. Default 20.
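
As a rough illustration of how the interested-observers set and the queue-size flag interact, consider the sketch below. It is not taken from the patch: the interface {{PrefilterableObserver}}, the classes {{PathPrefixObserver}} and {{InterestedObservers}}, and the use of a plain set of changed paths instead of a Validator are all assumptions made for brevity. It also assumes that prefiltering starts once the queue size reaches the threshold.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical view of an observer that can be prefiltered during the commit. */
interface PrefilterableObserver {
    String id();
    int queueSize();
    boolean matches(Set<String> changedPaths);
}

/** Simple example observer that is interested in everything below a path prefix. */
final class PathPrefixObserver implements PrefilterableObserver {
    private final String id;
    private final String prefix;
    private final int queueSize;

    PathPrefixObserver(String id, String prefix, int queueSize) {
        this.id = id;
        this.prefix = prefix;
        this.queueSize = queueSize;
    }

    public String id() {
        return id;
    }

    public int queueSize() {
        return queueSize;
    }

    public boolean matches(Set<String> changedPaths) {
        for (String path : changedPaths) {
            if (path.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}

final class InterestedObservers {
    /** Mirrors the default of 20 mentioned above. */
    static final int DEFAULT_PREFILTER_QUEUE_SIZE = 20;

    /** Computes the ids that would be stored in the commit context for one commit. */
    static Set<String> compute(List<PrefilterableObserver> observers,
                               Set<String> changedPaths, int threshold) {
        Set<String> interested = new LinkedHashSet<>();
        for (PrefilterableObserver observer : observers) {
            // Below the threshold the filter is not evaluated at all, to keep commits cheap;
            // such observers are simply treated as interested (a false positive is fine).
            boolean prefilter = observer.queueSize() >= threshold;
            if (!prefilter || observer.matches(changedPaths)) {
                interested.add(observer.id());
            }
        }
        return interested;
    }
}
```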

Additionally, the patch can be used in 'test' mode, in which case a flag is set 
but BackgroundObserver doesn't evaluate it - instead the ChangeProcessor checks 
if the flag would have been correct. This test mode would be removed after 
enough confidence, but could be used in IT testing to verify that filtering 
would have been done correctly.

Pending tasks: 
* more test cases, IT
* performance testing

I would appreciate feedback/review of those having been involved in 
observation. /cc [~mduerig], [~chetanm], [~catholicon], [~mreutegg]

> filter events before adding to ChangeProcessor's queue
> --
>
> Key: OAK-4796
> URL: https://issues.apache.org/jira/browse/OAK-4796
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Affects Versions: 1.5.9
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>  Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4796.patch

[jira] [Commented] (OAK-4796) filter events before adding to ChangeProcessor's queue

2016-09-21 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509153#comment-15509153
 ] 

Stefan Egli commented on OAK-4796:
--

thx for the reviews! I'd look at fixing those points.
Before that would be good if we could decide which approach to take: the one 
suggested via the patch or the one suggested by Chetan.

> filter events before adding to ChangeProcessor's queue
> --
>
> Key: OAK-4796
> URL: https://issues.apache.org/jira/browse/OAK-4796
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Affects Versions: 1.5.9
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>  Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4796.patch


[jira] [Commented] (OAK-4759) Queueless change processor

2016-09-20 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15506601#comment-15506601
 ] 

Stefan Egli commented on OAK-4759:
--

Regarding {{ConsolidatedChanges}}: I was wondering if we could not separate out 
the 'ignores commitInfo' part and introduce a similar marker interface, perhaps 
called {{IgnoresCommitInfo}} (point 2 from [suggestion on 
list|http://markmail.org/thread/3blnp3lmsc24nbea]): any listener flagged with 
that would set the {{ignoresEventInfo}} flag on the JackrabbitEventFilter and 
it would essentially always get {{null}} as the CommitInfo.
That would of course allow further optimizations under the hood. Eg in your 
case if a listener had both {{IgnoresCommitInfo}} and {{ConsolidatedChanges}} 
set, it would allow a queueless change processor...
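
A minimal sketch of what I mean, with both marker interfaces declared locally as stand-ins (these are not the interfaces from the attached jackrabbit-api.patch, and {{ListenerTraits}} is a hypothetical helper):

```java
/** Stand-in marker: the listener never looks at the CommitInfo. */
interface IgnoresCommitInfo {
}

/** Stand-in marker: the listener accepts consolidated (batched) changes. */
interface ConsolidatedChanges {
}

/** Example listener that opts into both behaviours. */
final class ExternalOnlyListener implements IgnoresCommitInfo, ConsolidatedChanges {
}

/** Example listener without any marker. */
final class PlainListener {
}

final class ListenerTraits {
    /** Listeners with this marker would always be handed null as the CommitInfo. */
    static boolean ignoresCommitInfo(Object listener) {
        return listener instanceof IgnoresCommitInfo;
    }

    /** Only listeners carrying both markers could be served without a queue. */
    static boolean queuelessEligible(Object listener) {
        return listener instanceof IgnoresCommitInfo
                && listener instanceof ConsolidatedChanges;
    }
}
```

Keeping the two markers separate means a listener can give up the CommitInfo without also agreeing to consolidated delivery.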

> Queueless change processor
> --
>
> Key: OAK-4759
> URL: https://issues.apache.org/jira/browse/OAK-4759
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, documentmk, jcr
>Reporter: Marcel Reutegger
>  Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4759.patch, OAK-4759.patch, jackrabbit-api.patch, 
> jackrabbit-api.patch
>
>
> The initial proposal for this improvement was:
> {quote}
> Change processing for listeners that are only interested in external
> events could be simplified because there is no commit info for
> external changes. The basic idea is that the node store implementation
> may be able to optimize batch processing of multiple external changes
> and does not need to process each external change individually. The
> DocumentNodeStore would use the journal to identify external changes
> and need to come up with a way to ignore overlapping local changes.
> With this new feature, expensive listeners that process local as well
> as external events could be split into two separate listeners, each
> optimized for the type of changes.
> {quote}
> Later the proposal was changed into a more general queueless change processor. 
> See first comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-20 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15506514#comment-15506514
 ] 

Stefan Egli commented on OAK-4581:
--

I'd like to move this ticket forward and believe we need a few decisions on the 
approach:

h4. I - Who to persist for
There are different possibilities as to where the persisted queue should sit:
h5. A - BackgroundObserver
In this class of solutions the BackgroundObserver's queue is persisted - and that can 
logically only be based on {{NodeState}}. This will thus support any type of 
downstream Observer including NodeObserver etc. Being based on Observer it 
requires GC-prevention.
Here's a list of concrete subvariants:
h6. 1 - store serialized root state
This seamlessly serializes and stores {{NodeState}} objects. Later on they are 
read and used for diffing. Which means the data must still be available to do 
the actual diff. This can be achieved by increasing the GC retention period one 
way or another. What's also important here is that the caches aren't polluted 
with these late diffs - ie they should probably not be stored in the cache in 
this late-delivery case.
h6. 2 - store serialized diff (and root state)
Besides serializing the {{NodeState}} this variant (also) stores the diff. This 
speeds up later diffing as the diff is then already there (it probably must be 
stored in 'a cache' temporarily, but only temporarily as it will only be used 
for one event, likely). This variant is still dependent on preventing GC 
though, as we're still on the Observer level, which works on {{NodeState}}.
h6. 3 - base it on the journal
Alternatively the journal is equipped with more diff-like information (perhaps 
the full diff, perhaps only part of it), also see OAK-4586. Otherwise this 
has the same characteristics as I-A-1 and I-A-2: GC must still be prevented, we're 
still on the Observation/NodeState level. It will be implementation dependent, 
as Segment doesn't have the same type of journal as Document has.
h5. B - ChangeProcessor
In this class of solutions the queue is handled on the ChangeProcessor level (not in 
BackgroundObserver), thus no longer based on NodeState, but now independent; 
the format just must be suitable for calculation and later delivery via 
onEvent. Being independent of NodeState allows it to become independent of GC and 
cache-hotness issues. However, it's important to note that this class of 
solutions targets concrete EventListeners, not Observers in general!
h6. 1 - store serialized events
The ChangeProcessor calculates events as if for delivery to onEvent, but just 
persists the events as is. This will bloat the amount of data stored and 
increase I/O. However, later delivery is trivial as all the events are already 
there, they just have to be read and onEvent called.
h6. 2 - store serialized diff
The ChangeProcessor stores the serialized diff in a form that can later be 
processed by the EventFilter and result in events for delivery to 
EventListener.onEvent. (This would then be independent from the NodeState)
h6. 3 - base it on the journal
If the journal contains the complete diff such that ChangeProcessor can 
evaluate the filters and deliver, then the journal could be enough (however 
that might be tricky to achieve). Also, this will be implementation dependent, 
as Segment doesn't have the same type of journal as Document has.
In any case, additionally the CommitInfo must be stored somewhere, either also 
in the Journal or per ChangeProcessor.
h4. II - Serializing CommitInfo
Not sure if we have many options here, I think it's just something we have to 
do. And if Oak code prevents serialization, then we can fix it. If it's 
upper/application-layer code that causes problems, we can't do much other than 
issue a warning.
h4. III - Storage Layer
This depends a bit on the actual solution chosen. If we base it eg on journal, 
then a lot comes from there already. If we persist flat events, then surely an 
extra storage is needed.
h5. A - Use a SegmentNodeStore
* would be straightforward but has issues as mentioned by Michael.

h5. B - Use internals of SegmentNodeStore, eg SegmentWriter
* might be much more optimal, but adds dependencies on internals of tarmk.

h5. C - store as JSON in a flat file
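
To make variant III-C a bit more concrete, here is a minimal sketch of an append-only flat file holding one JSON object per line. {{FlatFileEventLog}} and its field names ({{type}}, {{path}}) are hypothetical; nothing here is existing Oak code, and a real version would need a proper JSON library plus rotation and cleanup:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical append-only event log: one JSON object per line.
final class FlatFileEventLog {
    private final Path file;

    FlatFileEventLog(Path file) {
        this.file = file;
    }

    // Builds a single-line JSON object. Only backslash and double quote are
    // escaped; control characters such as newlines are NOT handled here.
    static String toJsonLine(String type, String path) {
        return "{\"type\":\"" + escape(type) + "\",\"path\":\"" + escape(path) + "\"}";
    }

    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Appends one event; the file is created on first use.
    void append(String type, String path) throws IOException {
        Files.write(file, List.of(toJsonLine(type, path)), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    // Reads back all persisted lines in the order they were written.
    List<String> readAll() throws IOException {
        return Files.exists(file)
                ? Files.readAllLines(file, StandardCharsets.UTF_8)
                : List.of();
    }
}
```

The line-per-event layout keeps later delivery trivial (read a line, build the event, call onEvent), at the price of the extra I/O mentioned under I-B-1.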

[~mduerig], [~chetanm], [~catholicon], [~tmueller], which variant should we 
implement?

> Persistent local journal for more reliable event generation
> ---
>
> Key: OAK-4581
> URL: https://issues.apache.org/jira/browse/OAK-4581
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: core
>Reporter: Chetan Mehrotra
>Assignee: Stefan Egli
>  Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4581.v0.patch
>
>
> As discussed in OAK-2683 "hitting the observation queue limit" has multiple 
> drawbacks. Quite a 

[jira] [Commented] (OAK-4796) filter events before adding to ChangeProcessor's queue

2016-09-20 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15506572#comment-15506572
 ] 

Stefan Egli commented on OAK-4796:
--

[~chetanm], I see, your approach is completely different. Main differences I 
see:
# ObserverValidatorProvider approach:
* 100% filtering of local and external events (whereas the external part is not 
yet implemented, actually, but would be similar)
* for external events the diffing is done as today, so no performance 
improvements there. But we can also filter entire external events for the 
individual listeners as for local ones, just at a different location (in the 
backgroundRead somewhere)
# Extracted-Data approach:
* not 100% filtering, but perhaps close
* makes diffing for other instances in the cluster cheaper

so... let's decide which one to go for. 

> filter events before adding to ChangeProcessor's queue
> --
>
> Key: OAK-4796
> URL: https://issues.apache.org/jira/browse/OAK-4796
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Affects Versions: 1.5.9
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>  Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4796.patch
>
>
> Currently the 
> [ChangeProcessor.contentChanged|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L335]
>  is in charge of doing the event diffing and filtering and does so in a 
> pooled Thread, ie asynchronously, at a later stage independent from the 
> commit. This has the advantage that the commit is fast, but has the following 
> potentially negative effects:
> # events (in the form of ContentChange Objects) occupy a slot of the queue 
> even if the listener is not interested in it - any commit lands on any 
> listener's queue. This reduces the capacity of the queue for 'actual' events 
> to be delivered. It therefore increases the risk that the queue fills - and 
> when full has various consequences such as losing the CommitInfo etc.
> # each event==ContentChange later on must be evaluated, and for that a diff 
> must be calculated. Depending on runtime behavior that diff might be 
> expensive if no longer in the cache (documentMk specifically).
> As an improvement, this diffing+filtering could be done at an earlier stage 
> already, nearer to the commit, and in case the filter would ignore the event, 
> it would not have to be put into the queue at all, thus avoiding occupying a 
> slot and later potentially slower diffing.
> The suggestion is to implement this via the following algorithm:
> * During the commit, in a {{Validator}} the listener's filters are evaluated 
> - in an as-efficient-as-possible manner (Reason for doing it in a Validator 
> is that this doesn't add overhead as oak already goes through all changes for 
> other Validators). As a result a _list of potentially affected observers_ is 
> added to the {{CommitInfo}} (false positives are fine).
> ** Note that the above adds cost to the commit and must therefore be 
> carefully done and measured
> ** One potential measure could be to only do filtering when listener's queues 
> are larger than a certain threshold (eg 10)
> * The ChangeProcessor in {{contentChanged}} (in the one created in 
> [createObserver|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L224])
>  then checks the new commitInfo's _potentially affected observers_ list and 
> if it's not in the list, adds a {{NOOP}} token at the end of the queue. If 
> there's already a NOOP there, the two are collapsed (this way when a filter 
> is not affected it would have a NOOP at the end of the queue). If later on a 
> non-NOOP item is added, the NOOP's {{root}} is used as the {{previousRoot}} 
> for the newly added {{ContentChange}} obj.
> ** To achieve that, the ContentChange obj is extended to not only have the 
> "to" {{root}} pointer, but also the "from" {{previousRoot}} pointer which 
> currently is implicitly maintained.
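
The NOOP handling described above could be sketched roughly as follows. This is an illustration only: {{NoopCollapsingQueue}} and {{Entry}} are hypothetical names, root states are modelled as plain strings instead of {{NodeState}}, and keeping the NOOP in the queue once a real change follows it is an assumption the description does not settle:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of a per-listener queue with NOOP collapsing.
final class NoopCollapsingQueue {

    // Stand-in for ContentChange, extended with an explicit "from" pointer.
    static final class Entry {
        final String previousRoot; // "from" state; null for a NOOP token
        final String root;         // "to" state
        final boolean noop;

        Entry(String previousRoot, String root, boolean noop) {
            this.previousRoot = previousRoot;
            this.root = root;
            this.noop = noop;
        }
    }

    private final Deque<Entry> queue = new ArrayDeque<>();
    private String lastRoot; // root of the most recent commit seen so far

    NoopCollapsingQueue(String initialRoot) {
        this.lastRoot = initialRoot;
    }

    // affected == false means the listener is not in the commit's list of
    // potentially affected observers.
    void contentChanged(String root, boolean affected) {
        Entry tail = queue.peekLast();
        if (!affected) {
            if (tail != null && tail.noop) {
                queue.pollLast(); // collapse two consecutive NOOPs into one
            }
            queue.addLast(new Entry(null, root, true));
        } else {
            // If the tail is a NOOP, its root becomes the previousRoot.
            String from = (tail != null) ? tail.root : lastRoot;
            queue.addLast(new Entry(from, root, false));
        }
        lastRoot = root;
    }

    int size() {
        return queue.size();
    }

    Entry peekLast() {
        return queue.peekLast();
    }
}
```

With this shape, any number of uninteresting commits occupies at most one slot at the tail of the queue, which is the capacity gain the description is after.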





[jira] [Created] (OAK-4677) stop oak-core bundle only transiently on lease failure

2016-08-17 Thread Stefan Egli (JIRA)
Stefan Egli created OAK-4677:


 Summary: stop oak-core bundle only transiently on lease failure
 Key: OAK-4677
 URL: https://issues.apache.org/jira/browse/OAK-4677
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: documentmk
Affects Versions: 1.5.8, 1.4.6
Reporter: Stefan Egli
Assignee: Stefan Egli
 Fix For: 1.4.7, 1.5.9


Since OAK-3397 the oak-core bundle is stopped (via {{bundle.stop();}}) when the 
lease with the document store times out (ie lease failed to be updated in 
time). Using {{bundle.stop();}} leads to an unwanted side-effect, namely that 
it _changes and persists the autostart_ settings of the bundle. Ie on next 
startup the oak-core bundle will not automatically start.

Using {{bundle.stop(Bundle.STOP_TRANSIENT);}} would avoid this and achieve 
exactly what was the original intention: to stop the bundle (temporarily) until 
restart.
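
The difference between the two calls can be illustrated with a small model. {{FakeBundle}} is a hypothetical stand-in written for this sketch so that it runs without an OSGi framework; only the {{stop()}} / {{stop(int)}} shape and the {{STOP_TRANSIENT}} constant mirror {{org.osgi.framework.Bundle}}:

```java
// Hypothetical stand-in for org.osgi.framework.Bundle, modelling only the
// interaction between stop options and the persisted autostart setting.
final class FakeBundle {
    // Same value as Bundle.STOP_TRANSIENT in the OSGi API.
    static final int STOP_TRANSIENT = 0x00000001;

    private boolean active = true;
    private boolean autostart = true; // persisted across framework restarts

    // Plain stop(): equivalent to stop(0), which also persists "stopped".
    void stop() {
        stop(0);
    }

    void stop(int options) {
        active = false;
        if ((options & STOP_TRANSIENT) == 0) {
            autostart = false; // the unwanted side-effect described above
        }
    }

    boolean isActive() {
        return active;
    }

    boolean startsOnNextLaunch() {
        return autostart;
    }
}
```

In the real lease-failure handling the only change would be passing {{Bundle.STOP_TRANSIENT}} to {{stop}}; the model just shows why the autostart setting survives in that case.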





[jira] [Commented] (OAK-1312) Bundle nodes into a document

2016-08-17 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15424175#comment-15424175
 ] 

Stefan Egli commented on OAK-1312:
--

Suggestions for an alternative name for _bundling nodes_:
* _collapse_ nodes
* _embed_ nodes

> Bundle nodes into a document
> 
>
> Key: OAK-1312
> URL: https://issues.apache.org/jira/browse/OAK-1312
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, documentmk
>Reporter: Marcel Reutegger
>Assignee: Chetan Mehrotra
>  Labels: performance
> Fix For: 1.6
>
>
> For very fine grained content with many nodes and only few properties per 
> node it would be more efficient to bundle multiple nodes into a single 
> MongoDB document. Mostly reading would benefit because there are fewer 
> roundtrips to the backend. At the same time storage footprint would be lower 
> because metadata overhead is per document.
> Feature branch - 
> https://github.com/chetanmeh/jackrabbit-oak/compare/trunk...chetanmeh:OAK-1312





[jira] [Resolved] (OAK-4677) stop oak-core bundle only transiently on lease failure

2016-08-17 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli resolved OAK-4677.
--
Resolution: Fixed

* [fixed in trunk|http://svn.apache.org/viewvc?rev=1756584=rev]
* [fixed in 1.4-branch|http://svn.apache.org/viewvc?rev=1756585=rev]

> stop oak-core bundle only transiently on lease failure
> --
>
> Key: OAK-4677
> URL: https://issues.apache.org/jira/browse/OAK-4677
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk
>Affects Versions: 1.4.6, 1.5.8
>Reporter: Stefan Egli
>Assignee: Stefan Egli
> Fix For: 1.4.7, 1.5.9
>
>
> Since OAK-3397 the oak-core bundle is stopped (via {{bundle.stop();}}) when 
> the lease with the document store times out (ie lease failed to be updated in 
> time). Using {{bundle.stop();}} leads to an unwanted side-effect, namely that 
> it _changes and persists the autostart_ settings of the bundle. Ie on next 
> startup the oak-core bundle will not automatically start.
> Using {{bundle.stop(Bundle.STOP_TRANSIENT);}} would avoid this and achieve 
> exactly what was the original intention: to stop the bundle (temporarily) 
> until restart.





[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-08-29 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15445633#comment-15445633
 ] 

Stefan Egli commented on OAK-4581:
--

Re {{A.1 - Serialized Root States}} I see the following options:
# Store entire diff in a json format
# Assume that incoming NodeState in {{contentChanged(NodeState,CommitInfo)}} is 
still the current root, then create a checkpoint and store the checkpoint Id.
# a combination of 1. and 2. above: for small diffs store the diff, for large 
diffs, create a checkpoint
# or introduce a more lightweight checkpoint mechanism which allows 
_serializing a NodeState_ (compared to {{checkpoint}}, which is heavier as 
it stores an id->nodeState mapping including properties).

[~chetanm], wdyt? I think having the option to do 4. would be good, but afaics 
it would require a change in either the {{NodeState}} or the {{NodeStore}} APIs. I 
fear that the assumption made in 2. would perhaps not always hold and is a bit 
of a weak link - which would mean that the only other option would be 1., which is 
expensive.
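
Option 3 (store small diffs inline, fall back to a checkpoint for large ones) could look roughly like this. {{RootStateSerializer}}, {{Persisted}} and the entry-count threshold are hypothetical; the {{Supplier}} stands in for whatever would create the checkpoint (for example a call to {{NodeStore.checkpoint}}), and joining entries with newlines is only a placeholder for a real JSON encoding:

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of option 3: inline diff for small changes,
// checkpoint reference for large ones.
final class RootStateSerializer {

    // What ends up in the persisted queue for one content change.
    static final class Persisted {
        final String kind;    // "diff" or "checkpoint"
        final String payload; // serialized diff, or the checkpoint id

        Persisted(String kind, String payload) {
            this.kind = kind;
            this.payload = payload;
        }
    }

    private final int maxInlineDiffEntries;
    private final Supplier<String> checkpointCreator;

    RootStateSerializer(int maxInlineDiffEntries, Supplier<String> checkpointCreator) {
        this.maxInlineDiffEntries = maxInlineDiffEntries;
        this.checkpointCreator = checkpointCreator;
    }

    Persisted persist(List<String> diffEntries) {
        if (diffEntries.size() <= maxInlineDiffEntries) {
            // Small change: cheap to store and no checkpoint needed.
            return new Persisted("diff", String.join("\n", diffEntries));
        }
        // Large change: storing the diff would be expensive, so only a
        // checkpoint id is kept. This inherits the assumption from option 2
        // that the incoming state is still the current root.
        return new Persisted("checkpoint", checkpointCreator.get());
    }
}
```

The threshold trades storage and I/O for small commits against checkpoint overhead for large ones; a sensible value would have to be measured.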

> Persistent local journal for more reliable event generation
> ---
>
> Key: OAK-4581
> URL: https://issues.apache.org/jira/browse/OAK-4581
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: core
>Reporter: Chetan Mehrotra
>Assignee: Stefan Egli
>  Labels: observation
> Fix For: 1.6
>
>
> As discussed in OAK-2683 "hitting the observation queue limit" has multiple 
> drawbacks. Quite a bit of work is done to make diff generation faster. 
> However there are still chances of event queue getting filled up. 
> This issue is meant to implement a persistent event journal. Idea here being
> # NodeStore would push the diff into a persistent store via a synchronous 
> observer
> # Observers which are meant to handle such events in async way (by virtue of 
> being wrapped in BackgroundObserver) would instead pull the events from this 
> persisted journal
> h3. A - What is persisted
> h4. 1 - Serialized Root States and CommitInfo
> In this approach we just persist the root states in serialized form. 
> * DocumentNodeStore - This means storing the root revision vector
> * SegmentNodeStore - {color:red}Q1 - What does the serialized form of a 
> SegmentNodeStore root state look like{color} - Possibly the RecordId of 
> "root" state
> Note that with OAK-4528 DocumentNodeStore can rely on persisted remote 
> journal to determine the affected paths. Which reduces the need for 
> persisting complete diff locally.
> Event generation logic would then "deserialize" the persisted root states and 
> then generate the diff as currently done via NodeState comparison
> h4. 2 - Serialized commit diff and CommitInfo
> In this approach we can save the diff in JSOP form. The diff only contains 
> information about affected paths. Similar to what is currently being stored in 
> DocumentNodeStore journal
> h4. CommitInfo
> The commit info would also need to be serialized. So it needs to be ensured that 
> whatever is stored there can be serialized or recalculated
> h3. B - How it is persisted
> h4. 1 - Use a secondary segment NodeStore
> OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. 
> [~mreutegg] suggested that for persisted local journal we can also utilize a 
> SegmentNodeStore instance. Care needs to be taken for compaction. Either via 
> generation approach or relying on online compaction
> h4. 2 - Make use of write ahead log implementations
> [~ianeboston] suggested that we can make use of some write ahead log 
> implementation like [1], [2] or [3]
> h3. C - How changes get pulled
> Some points to consider for event generation logic
> # Would need a way to keep pointers to journal entries on a per-listener basis. 
> This would allow each Listener to "pull" content changes and generate the diff 
> at its own speed while keeping in-memory overhead low
> # The journal should survive restarts
> [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html
> [2] 
> https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal
> [3] 
> https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog
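
Point C.1 of the description (per-listener pointers into the journal) can be sketched as follows. {{LocalJournal}} is a hypothetical in-memory model: entries are plain strings and nothing is persisted, so surviving restarts (point C.2) would additionally require storing both the entries and the cursors:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of a journal with one cursor per listener.
final class LocalJournal {
    private final List<String> entries = new ArrayList<>();
    private final Map<String, Integer> cursors = new HashMap<>();

    // Called synchronously on commit: append the change to the journal.
    void append(String entry) {
        entries.add(entry);
    }

    // A new listener starts at the current end of the journal.
    void register(String listenerId) {
        cursors.putIfAbsent(listenerId, entries.size());
    }

    // Each listener pulls at its own speed; only its cursor advances.
    List<String> pull(String listenerId, int max) {
        Integer from = cursors.get(listenerId);
        if (from == null) {
            throw new IllegalArgumentException("unknown listener: " + listenerId);
        }
        int to = Math.min(entries.size(), from + max);
        List<String> batch = new ArrayList<>(entries.subList(from, to));
        cursors.put(listenerId, to);
        return batch;
    }
}
```

Because slow listeners only hold a cursor rather than their own copy of the pending changes, the memory cost per listener stays constant regardless of how far behind it is.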





[jira] [Assigned] (OAK-4581) Persistent local journal for more reliable event generation

2016-08-23 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli reassigned OAK-4581:


Assignee: Stefan Egli

I'll be looking at this one and will try to come up with a patch



[jira] [Created] (OAK-4717) TarNodeStore.checkpoint methods represent endless loop

2016-08-29 Thread Stefan Egli (JIRA)
Stefan Egli created OAK-4717:


 Summary: TarNodeStore.checkpoint methods represent endless loop
 Key: OAK-4717
 URL: https://issues.apache.org/jira/browse/OAK-4717
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: upgrade
Affects Versions: 1.5.8
Reporter: Stefan Egli
Assignee: Tomek Rękawek


Noticed that in 
[TarNodeStore|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/cli/node/TarNodeStore.java#L88]
 all checkpoint related methods are endless loops. [~tomek.rekawek], is that 
intentional or a bug?





[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-27 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15525978#comment-15525978
 ] 

Stefan Egli commented on OAK-4581:
--

Created SLING-6070 for invoking reportChanges after child traversal finished



[jira] [Commented] (OAK-4855) Expose actual listener.toString in consolidated listener mbean

2016-09-27 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15525741#comment-15525741
 ] 

Stefan Egli commented on OAK-4855:
--

[~chetanm], could you pls have a look at OAK-4855 and JCR-4032? I think it 
would be a simple but useful, small extension. Thx!

> Expose actual listener.toString in consolidated listener mbean
> --
>
> Key: OAK-4855
> URL: https://issues.apache.org/jira/browse/OAK-4855
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Affects Versions: 1.5.10
>Reporter: Stefan Egli
>Assignee: Stefan Egli
> Fix For: 1.5.11
>
> Attachments: OAK-4855.patch
>
>
> With SLING-6056 more listeners (in the form of ResourceChangeListeners) will 
> be mapped 1:1 to either BackgroundObservers or JCR EventListeners. That 
> means, they will also be exposed in the consolidated listeners stats. Without 
> any change though, all that can be seen in those stats is the name of that 
> 'bridge/mapper' listener (ie either JcrResourceListener or 
> OakResourceListener), since currently all that is exposed is 
> {{getClass().toString()}} - and the actual ResourceChangeListener sitting 2 
> steps behind is not visible.
> In JCR-4032 I'm suggesting to introduce a {{getToString()}} to the 
> EventListenerMBean, and once that would be available, this could be exposed 
> in the ConsolidatedListenerMBeanImpl.
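
A small sketch of why exposing {{toString()}} helps. {{BridgeListener}} and {{ListenerStats}} are hypothetical stand-ins for the bridge listener and the MBean; only the {{getToString()}} name is taken from the JCR-4032 suggestion:

```java
// Hypothetical bridge listener wrapping the "real" listener behind it.
final class BridgeListener {
    private final Object delegate;

    BridgeListener(Object delegate) {
        this.delegate = delegate;
    }

    // Surfaces the wrapped listener, which the class name alone cannot do.
    @Override
    public String toString() {
        return "BridgeListener[" + delegate + "]";
    }
}

// Hypothetical stats view, standing in for the listener MBean.
final class ListenerStats {
    private final Object listener;

    ListenerStats(Object listener) {
        this.listener = listener;
    }

    // What is exposed today: identical for every bridged listener.
    String getClassName() {
        return listener.getClass().toString();
    }

    // Proposed addition: lets the bridge reveal what sits behind it.
    String getToString() {
        return listener.toString();
    }
}
```

With many bridged listeners, {{getClassName()}} yields the same value for all of them, while {{getToString()}} distinguishes them as long as the bridge includes its delegate in {{toString()}}.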





[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-27 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15525942#comment-15525942
 ] 

Stefan Egli commented on OAK-4581:
--

* re 'maps of added/removed/changed items' : I think we might be able to remove 
these, or let's say drastically reduce their size: by knowing that the events are 
coming in in a breadth first manner, the JcrResourceListener could build up and 
report changes whenever a child traversal finished - no need to wait with 
calling reportChanges until the very end (as the OakResourceListener sends out 
events also after child traversal finished).
* re 'osgiEventQueue' : that has gone with SLING-5163 / 
[here|https://github.com/apache/sling/commit/9c424dfca6a802a6b66b4b7981a313c5856a0e1f]
* Re 'Access control' : that's probably still an open issue indeed, however it 
is the case for both OakResourceListener *and* JcrResourceListener, so that's 
not an argument against JcrResourceListener



[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-29 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532190#comment-15532190
 ] 

Stefan Egli commented on OAK-4581:
--

[~cziegeler], so IIUC then support for these changed/added/removed properties 
is the reason why these maps have to first be filled in the 
JcrResourceListener, and this is causing memory issues if we're talking about a 
huge change. If we no longer propagate property changes in the JRL, then we 
could avoid these maps. Do I understand you correctly that you're suggesting to 
remove this support then? This would indeed allow us to get rid of the 
OakResourceListener.

> Persistent local journal for more reliable event generation
> ---
>
> Key: OAK-4581
> URL: https://issues.apache.org/jira/browse/OAK-4581
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: core
>Reporter: Chetan Mehrotra
>Assignee: Stefan Egli
>  Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4581.v0.patch
>
>
> As discussed in OAK-2683 "hitting the observation queue limit" has multiple 
> drawbacks. Quite a bit of work is done to make diff generation faster. 
> However, there is still a chance of the event queue filling up. 
> This issue is meant to implement a persistent event journal. The idea is:
> # NodeStore would push the diff into a persistent store via a synchronous 
> observer
> # Observers which are meant to handle such events in an async way (by virtue of 
> being wrapped in BackgroundObserver) would instead pull the events from this 
> persisted journal
> h3. A - What is persisted
> h4. 1 - Serialized Root States and CommitInfo
> In this approach we just persist the root states in serialized form. 
> * DocumentNodeStore - This means storing the root revision vector
> * SegmentNodeStore - {color:red}Q1 - What does the serialized form of the 
> SegmentNodeStore root state look like?{color} - Possibly the RecordId of the 
> "root" state
> Note that with OAK-4528 DocumentNodeStore can rely on the persisted remote 
> journal to determine the affected paths, which reduces the need for 
> persisting the complete diff locally.
> Event generation logic would then "deserialize" the persisted root states and 
> then generate the diff as currently done via NodeState comparison
> h4. 2 - Serialized commit diff and CommitInfo
> In this approach we can save the diff in JSOP form. The diff only contains 
> information about affected paths, similar to what is currently being stored in 
> the DocumentNodeStore journal
> h4. CommitInfo
> The commit info would also need to be serialized. So it needs to be ensured 
> that whatever is stored there can be serialized or recalculated
> h3. B - How it is persisted
> h4. 1 - Use a secondary segment NodeStore
> OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. 
> [~mreutegg] suggested that for persisted local journal we can also utilize a 
> SegmentNodeStore instance. Care needs to be taken with compaction, either via 
> a generation approach or by relying on online compaction
> h4. 2 - Make use of write ahead log implementations
> [~ianeboston] suggested that we can make use of some write ahead log 
> implementation like [1], [2] or [3]
> h3. C - How changes get pulled
> Some points to consider for event generation logic
> # Would need a way to keep pointers to journal entries on a per-listener basis. 
> This would allow each Listener to "pull" content changes and generate the diff 
> at its own speed while keeping in-memory overhead low
> # The journal should survive restarts
> [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html
> [2] 
> https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal
> [3] 
> https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog
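
The per-listener pointers from section C of the description can be sketched roughly as follows. This is only an in-memory illustration, not Oak code: {{LocalJournal}}, {{append}}, {{register}} and {{pull}} are hypothetical names, the entries are plain strings standing in for serialized root states or JSOP diffs, and a real implementation would have to persist both entries and pointers so that they survive restarts.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: an append-only journal with one read pointer per
// listener. Not an Oak API; everything is kept in memory for illustration.
class LocalJournal {
    private final List<String> entries = new ArrayList<>();
    private final Map<String, Integer> cursors = new HashMap<>();

    // Would be called by the synchronous observer for every commit.
    synchronized void append(String entry) {
        entries.add(entry);
    }

    // A newly registered listener starts at the current end of the journal.
    synchronized void register(String listenerId) {
        cursors.put(listenerId, entries.size());
    }

    // Returns up to 'max' unread entries and advances this listener's pointer.
    synchronized List<String> pull(String listenerId, int max) {
        Integer from = cursors.get(listenerId);
        if (from == null) {
            throw new IllegalArgumentException("unknown listener: " + listenerId);
        }
        if (max < 0) {
            throw new IllegalArgumentException("max must not be negative");
        }
        int to = from + Math.min(entries.size() - from, max);
        List<String> batch = new ArrayList<>(entries.subList(from, to));
        cursors.put(listenerId, to);
        return batch;
    }
}
```

Because each listener only holds an index into the journal, a slow listener costs no extra memory per pending commit; the price is that entries can only be discarded once every pointer has moved past them.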





[jira] [Commented] (OAK-4859) Warn if lease update is invoked with large delay

2016-09-28 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15529441#comment-15529441
 ] 

Stefan Egli commented on OAK-4859:
--

I wasn't too much aware of OAK-4770 details, good to know!
So OAK-4770 measures how long one particular lease update takes, while I 
was thinking of measuring the time between 2 calls to {{renewLease()}}.

> Warn if lease update is invoked with large delay
> 
>
> Key: OAK-4859
> URL: https://issues.apache.org/jira/browse/OAK-4859
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk
>Affects Versions: 1.5.10
>Reporter: Stefan Egli
> Fix For: 1.6
>
>
> DocumentMk's lease mechanism is built on the assumption that the lease is 
> periodically updated by each instance. If the update doesn't happen within a 
> certain time - and the instance hasn't crashed - there's a risk that the 
> instance's own lease fails. It is therefore important that the lease update 
> happens without (large) delay according to the configured period.
> One pattern where this doesn't happen is when the VM is under heavy load due 
> to JVM-Full-GC cycles. It seems likely that a memory problem doesn't normally 
> happen instantly, but slowly builds up. Based on this assumption we could 
> introduce a check that compares the actual time since the last lease update 
> with the desired period. If these two diverge _a lot_ then we can at least 
> issue a log.warn. This might help to identify this type of lease failure more 
> easily and perhaps find root causes earlier.





[jira] [Created] (OAK-4859) Warn if lease update is invoked with large delay

2016-09-28 Thread Stefan Egli (JIRA)
Stefan Egli created OAK-4859:


 Summary: Warn if lease update is invoked with large delay
 Key: OAK-4859
 URL: https://issues.apache.org/jira/browse/OAK-4859
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: documentmk
Affects Versions: 1.5.10
Reporter: Stefan Egli
 Fix For: 1.6


DocumentMk's lease mechanism is built on the assumption that the lease is 
periodically updated by each instance. If the update doesn't happen within a 
certain time - and the instance hasn't crashed - there's a risk that the 
instance's own lease fails. It is therefore important that the lease update 
happens without (large) delay according to the configured period.
One pattern where this doesn't happen is when the VM is under heavy load due to 
JVM-Full-GC cycles. It seems likely that a memory problem doesn't normally 
happen instantly, but slowly builds up. Based on this assumption we could 
introduce a check that compares the actual time since the last lease update 
with the desired period. If these two diverge _a lot_ then we can at least 
issue a log.warn. This might help to identify this type of lease failure more 
easily and perhaps find root causes earlier.
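
A minimal sketch of such a check, under stated assumptions: {{LeaseDelayCheck}}, {{onLeaseUpdate}} and the {{warnFactor}} threshold are hypothetical names chosen for illustration and are not part of DocumentMk. The method returns the warning text instead of logging it so that the sketch stays self-contained; real code would pass the message to log.warn.

```java
// Hypothetical sketch, not the DocumentMk implementation: tracks the time
// between consecutive lease updates and reports when the gap is much longer
// than the configured period.
class LeaseDelayCheck {
    private final long periodMillis;
    private final double warnFactor;   // assumption: defines what "a lot" means
    private long lastUpdateMillis = -1;

    LeaseDelayCheck(long periodMillis, double warnFactor) {
        this.periodMillis = periodMillis;
        this.warnFactor = warnFactor;
    }

    // Call at the start of every lease update. Returns a warning message if
    // the gap since the previous call exceeds warnFactor * period, else null.
    String onLeaseUpdate(long nowMillis) {
        String warning = null;
        if (lastUpdateMillis >= 0) {
            long elapsed = nowMillis - lastUpdateMillis;
            if (elapsed > periodMillis * warnFactor) {
                warning = "lease update invoked after " + elapsed
                        + "ms, expected about every " + periodMillis + "ms";
            }
        }
        lastUpdateMillis = nowMillis;
        return warning;
    }
}
```

The timestamp is passed in rather than read inside the method, which keeps the check testable; a caller would probably derive it from a monotonic source such as {{System.nanoTime()}} so that wall-clock adjustments do not trigger false warnings.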





[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-27 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15526714#comment-15526714
 ] 

Stefan Egli commented on OAK-4581:
--

[~cziegeler], re
bq. Now, an additional reason at least for Sling was that the event bridge we 
have was reading all added/changed nodes to get the resource type property.
I can't find this dependency in the code, do you know where that's coded in 
JcrResourceListener? IIUC then the reason for having these 
addedEvents/changedEvents/removedEvents maps in onEvent is to be able to build 
a JcrResourceChanged obj that contains all changed properties (unrelated to 
resource type - but perhaps that's hidden somewhere down deep..)



[jira] [Comment Edited] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-27 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15526714#comment-15526714
 ] 

Stefan Egli edited comment on OAK-4581 at 9/27/16 5:02 PM:
---

[~cziegeler], re
bq. Now, an additional reason at least for Sling was that the event bridge we 
have was reading all added/changed nodes to get the resource type property.
-I can't find this dependency in the code, do you know where that's coded in 
JcrResourceListener? IIUC then the reason for having these 
addedEvents/changedEvents/removedEvents maps in onEvent is to be able to build 
a JcrResourceChanged obj that contains all changed properties (unrelated to 
resource type - but perhaps that's hidden somewhere down deep..)-
EDIT: Are you referring to 
[OsgiObservationBridge.sendOsgiEvent:131|https://github.com/apache/sling/blob/9c424dfca6a802a6b66b4b7981a313c5856a0e1f/bundles/resourceresolver/src/main/java/org/apache/sling/resourceresolver/impl/observation/OsgiObservationBridge.java#L131]
 ? IIUC that reads the current resource explicitly to get the resourceType. So 
sure, having the resource type as part of the event (as OAK-4853 would provide) 
would be a handy thing, even if slightly unexpected. However, I fail to see how 
this justifies the addedEvents/changedEvents/removedEvents in 
JcrResourceListener. IIUC the reason for building these maps is to be able to 
generate _correct_ JcrResourceChange objs - as they must contain all properties 
of a particular node - and those come in separate {{Events}}. So if the goal is 
to prevent those maps from becoming huge (or to not have them at all), then this 
has nothing to do with the resource type in my view. If we shouldn't rely on 
the breadth-first traversal, then the only alternative would be to get all 
properties that have been changed/added/removed on the corresponding {{Event}} 
of the node (which is slightly different from OAK-4853). wdyt?
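
To illustrate why such maps grow with the size of a change, here is a rough sketch of collecting property-level events per node. It is not the JcrResourceListener code; {{PropertyChangeCollector}} and its methods are hypothetical names, and paths are handled as plain strings.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not the Sling JcrResourceListener: property events
// arrive one by one, so they are grouped by node path until the whole batch
// has been seen. The map holds one entry per changed node.
class PropertyChangeCollector {
    private final Map<String, Set<String>> changedByNode = new LinkedHashMap<>();

    // propertyPath is for example "/content/a/jcr:title"
    void onPropertyChanged(String propertyPath) {
        int slash = propertyPath.lastIndexOf('/');
        String nodePath = slash <= 0 ? "/" : propertyPath.substring(0, slash);
        String name = propertyPath.substring(slash + 1);
        changedByNode.computeIfAbsent(nodePath, k -> new LinkedHashSet<>()).add(name);
    }

    // All property names collected so far, keyed by node path.
    Map<String, Set<String>> changes() {
        return changedByNode;
    }
}
```

If the node's {{Event}} already carried the full set of changed property names, a listener could emit one change object per event and would not need to keep this map at all.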


was (Author: egli):
[~cziegeler], re
bq. Now, an additional reason at least for Sling was that the event bridge we 
have was reading all added/changed nodes to get the resource type property.
I can't find this dependency in the code, do you know where that's coded in 
JcrResourceListener? IIUC then the reason for having these 
addedEvents/changedEvents/removedEvents maps in onEvent is to be able to build 
a JcrResourceChanged obj that contains all changed properties (unrelated to 
resource type - but perhaps that's hidden somewhere down deep..)


[jira] [Commented] (OAK-4855) Expose actual listener.toString in consolidated listener mbean

2016-10-03 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15541827#comment-15541827
 ] 

Stefan Egli commented on OAK-4855:
--

right, there's one "toString" too many. I'll wait for the next jackrabbit 
release and will then finish this one up.

> Expose actual listener.toString in consolidated listener mbean
> --
>
> Key: OAK-4855
> URL: https://issues.apache.org/jira/browse/OAK-4855
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Affects Versions: 1.5.10
>Reporter: Stefan Egli
>Assignee: Stefan Egli
> Fix For: 1.6
>
> Attachments: OAK-4855.patch
>
>
> With SLING-6056 more listeners (in the form of ResourceChangeListeners) will 
> be mapped 1:1 to either BackgroundObservers or JCR EventListeners. That 
> means, they will also be exposed in the consolidated listeners stats. Without 
> any change though, all that can be seen in those stats is the name of that 
> 'bridge/mapper' listener (ie either JcrResourceListener or 
> OakResourceListener), since currently all that is exposed is 
> {{getClass().toString()}} - and the actual ResourceChangeListener sitting 2 
> steps behind is not visible.
> In JCR-4032 I'm suggesting to introduce a {{getToString()}} to the 
> EventListenerMBean, and once that would be available, this could be exposed 
> in the ConsolidatedListenerMBeanImpl.





[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-26 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15522996#comment-15522996
 ] 

Stefan Egli commented on OAK-4581:
--

that sounds like a new class of listener that would want 'at-least-once' 
delivery in a cluster. Something probably useful, but I'm not sure if that fits 
into the observation umbrella of the JCR API. I think that's orthogonal to this 
ticket (somewhat) and could probably be handled separately?



[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-26 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15522461#comment-15522461
 ] 

Stefan Egli commented on OAK-4581:
--

[~catholicon], thx for your feedback! Re
bq. I thought this issue was about persisting change pointers
Agreed. And when looking at persisting NodeState(+CommitInfo) the revGC 
issue comes up (you must later be able to do the diff, and revGC must not clean 
up things before the persisted observation queues have been processed). And 
from this resulted the idea to not persist NodeState but the actual, calculated 
Event (even though that would bloat the storage, as it would become much 
simpler). However, this now again conflicts with support for any type of 
BackgroundObserver, not only ChangeProcessor. So I think the latter question 
becomes central now, and if we want to support any BackgroundObserver we need 
to persist NodeState and prevent revGC from cleaning up too early.
bq. Afaics, we still want remain wary of infinite storage of pointers
bq. Sure if we're saying that revGC deleted node states 
Exactly. There's a dilemma: we want to prevent revGC from being 'paused' for too 
long just because of observation - but if it isn't paused then such a slow or 
overwhelmed listener would lose events. We have to make a choice, it's a 
binary thing. Perhaps we have to cut off events after a certain time (eg after 
exactly 24 hours to fit into the segment-tar's default generation cycle)? (The 
advantage of persisting events would have been that it wouldn't have needed 
such a cut-off..)
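
The cut-off idea could look roughly like this. It is a sketch only: {{BoundedAgeQueue}} and its methods are hypothetical names, the entries are plain strings, and the maximum age is whatever the caller configures (for example 24 hours).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of the cut-off: queued changes older than a fixed age
// are dropped, so revision GC never has to wait longer than that age.
class BoundedAgeQueue {
    private static final class Entry {
        final long createdMillis;
        final String change;

        Entry(long createdMillis, String change) {
            this.createdMillis = createdMillis;
            this.change = change;
        }
    }

    private final Deque<Entry> queue = new ArrayDeque<>();
    private final long maxAgeMillis;

    BoundedAgeQueue(long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
    }

    void add(long nowMillis, String change) {
        queue.addLast(new Entry(nowMillis, change));
    }

    // Drops entries older than maxAgeMillis and returns how many were lost.
    int expire(long nowMillis) {
        int dropped = 0;
        while (!queue.isEmpty()
                && nowMillis - queue.peekFirst().createdMillis > maxAgeMillis) {
            queue.removeFirst();
            dropped++;
        }
        return dropped;
    }

    int size() {
        return queue.size();
    }
}
```

Returning the number of dropped entries lets the caller at least log that a listener has lost events, which is the trade-off described above.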


[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-26 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15523087#comment-15523087
 ] 

Stefan Egli commented on OAK-4581:
--

bq. 3. those observers get events post reboot
Not sure that's really the case. I would argue that after a restart these 
persisted events get deleted. IMO what 'persist' in the context of this ticket 
emphasizes is only additional memory at runtime, not that it survives restarts.



[jira] [Updated] (OAK-4855) Expose actual listener.toString in consolidated listener mbean

2016-09-27 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-4855:
-
Attachment: OAK-4855.patch

Attaching patch [^OAK-4855.patch]

> Expose actual listener.toString in consolidated listener mbean
> --
>
> Key: OAK-4855
> URL: https://issues.apache.org/jira/browse/OAK-4855
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Affects Versions: 1.5.10
>Reporter: Stefan Egli
>Assignee: Stefan Egli
> Fix For: 1.5.11
>
> Attachments: OAK-4855.patch
>
>
> With SLING-6056 more listeners (in the form of ResourceChangeListeners) will 
> be mapped 1:1 to either BackgroundObservers or JCR EventListeners. That 
> means, they will also be exposed in the consolidated listeners stats. Without 
> any change though, all that can be seen in those stats is the name of that 
> 'bridge/mapper' listener (ie either JcrResourceListener or 
> OakResourceListener), since currently all that is exposed is 
> {{getClass().toString()}} - and the actual ResourceChangeListener sitting 2 
> steps behind is not visible.
> In JCR-4032 I'm suggesting to introduce a {{getToString()}} to the 
> EventListenerMBean, and once that would be available, this could be exposed 
> in the ConsolidatedListenerMBeanImpl.





[jira] [Created] (OAK-4855) Expose actual listener.toString in consolidated listener mbean

2016-09-27 Thread Stefan Egli (JIRA)
Stefan Egli created OAK-4855:


 Summary: Expose actual listener.toString in consolidated listener 
mbean
 Key: OAK-4855
 URL: https://issues.apache.org/jira/browse/OAK-4855
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: jcr
Affects Versions: 1.5.10
Reporter: Stefan Egli
Assignee: Stefan Egli
 Fix For: 1.5.11


With SLING-6056 more listeners (in the form of ResourceChangeListeners) will be 
mapped 1:1 to either BackgroundObservers or JCR EventListeners. That means, 
they will also be exposed in the consolidated listeners stats. Without any 
change though, all that can be seen in those stats is the name of that 
'bridge/mapper' listener (ie either JcrResourceListener or 
OakResourceListener), since currently all that is exposed is 
{{getClass().toString()}} - and the actual ResourceChangeListener sitting 2 
steps behind is not visible.
In JCR-4032 I'm suggesting to introduce a {{getToString()}} to the 
EventListenerMBean, and once that would be available, this could be exposed in 
the ConsolidatedListenerMBeanImpl.
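
To illustrate the difference between the two ways of describing a listener, here is a minimal sketch. All class names in it are hypothetical stand-ins (they are not the real Sling or Oak classes), and it only assumes that the bridge listener's toString() delegates to the listener it wraps.

```java
// Hypothetical stand-ins for illustration; these are not the real Sling or Oak classes.
interface ResourceChangeListenerSketch { }

// Plays the role of the 'bridge/mapper' listener (e.g. JcrResourceListener).
class BridgeListenerSketch {
    private final ResourceChangeListenerSketch delegate;

    BridgeListenerSketch(ResourceChangeListenerSketch delegate) {
        this.delegate = delegate;
    }

    // Delegating toString() makes the listener behind the bridge visible.
    @Override
    public String toString() {
        return "BridgeListenerSketch[" + delegate + "]";
    }
}

class ListenerStatsSketch {
    // What the description says is exposed today: only the bridge's class.
    static String describeByClass(Object listener) {
        return listener.getClass().toString();
    }

    // What the issue proposes to expose instead: the listener's own toString().
    static String describeByToString(Object listener) {
        return listener.toString();
    }
}
```

With describeByClass every bridged listener looks the same in the stats, whereas describeByToString includes whatever the wrapped listener reports about itself.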





[jira] [Commented] (OAK-4796) filter events before adding to ChangeProcessor's queue

2016-09-29 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532623#comment-15532623
 ] 

Stefan Egli commented on OAK-4796:
--

As discussed offline with Marcel, I'll work on a patch for the 2nd variant, so 
that we can compare the complexity/result.

> filter events before adding to ChangeProcessor's queue
> --
>
> Key: OAK-4796
> URL: https://issues.apache.org/jira/browse/OAK-4796
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Affects Versions: 1.5.9
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>  Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4796.patch
>
>
> Currently the 
> [ChangeProcessor.contentChanged|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L335]
>  is in charge of doing the event diffing and filtering and does so in a 
> pooled Thread, ie asynchronously, at a later stage independent from the 
> commit. This has the advantage that the commit is fast, but has the following 
> potentially negative effects:
> # events (in the form of ContentChange Objects) occupy a slot of the queue 
> even if the listener is not interested in it - any commit lands on any 
> listener's queue. This reduces the capacity of the queue for 'actual' events 
> to be delivered. It therefore increases the risk that the queue fills - and 
> when full has various consequences such as losing the CommitInfo etc.
> # each event==ContentChange later on must be evaluated, and for that a diff 
> must be calculated. Depending on runtime behavior that diff might be 
> expensive if no longer in the cache (documentMk specifically).
> As an improvement, this diffing+filtering could be done at an earlier stage 
> already, nearer to the commit, and in case the filter would ignore the event, 
> it would not have to be put into the queue at all, thus avoiding occupying a 
> slot and later potentially slower diffing.
> The suggestion is to implement this via the following algorithm:
> * During the commit, in a {{Validator}} the listener's filters are evaluated 
> - in an as-efficient-as-possible manner (Reason for doing it in a Validator 
> is that this doesn't add overhead as oak already goes through all changes for 
> other Validators). As a result a _list of potentially affected observers_ is 
> added to the {{CommitInfo}} (false positives are fine).
> ** Note that the above adds cost to the commit and must therefore be 
> carefully done and measured
> ** One potential measure could be to only do filtering when listener's queues 
> are larger than a certain threshold (eg 10)
> * The ChangeProcessor in {{contentChanged}} (in the one created in 
> [createObserver|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L224])
>  then checks the new commitInfo's _potentially affected observers_ list and 
> if it's not in the list, adds a {{NOOP}} token at the end of the queue. If 
> there's already a NOOP there, the two are collapsed (this way when a filter 
> is not affected it would have a NOOP at the end of the queue). If later on a 
> no-NOOP item is added, the NOOP's {{root}} is used as the {{previousRoot}} 
> for the newly added {{ContentChange}} obj.
> ** To achieve that, the ContentChange obj is extended to not only have the 
> "to" {{root}} pointer, but also the "from" {{previousRoot}} pointer which 
> currently is implicitly maintained.
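
The queue handling described in the last bullet could be sketched roughly as follows. This is only an illustration under stated assumptions: root states are modelled as plain strings, the class and method names are hypothetical, and the "potentially affected" decision is passed in as a boolean instead of being read from the CommitInfo.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of the proposed queue; root states are plain strings here.
class NoopCollapsingQueueSketch {

    static final class Entry {
        final String previousRoot; // explicit "from" state
        final String root;         // "to" state
        final boolean noop;        // true if the listener was not affected

        Entry(String previousRoot, String root, boolean noop) {
            this.previousRoot = previousRoot;
            this.root = root;
            this.noop = noop;
        }
    }

    private final Deque<Entry> queue = new ArrayDeque<>();
    private String lastRoot;

    NoopCollapsingQueueSketch(String initialRoot) {
        this.lastRoot = initialRoot;
    }

    void contentChanged(String root, boolean potentiallyAffected) {
        Entry tail = queue.peekLast();
        if (!potentiallyAffected && tail != null && tail.noop) {
            // Collapse: replace the trailing NOOP with one that ends at the new root.
            queue.removeLast();
            queue.addLast(new Entry(tail.previousRoot, root, true));
        } else {
            // For a real change that follows a NOOP, lastRoot is that NOOP's root.
            queue.addLast(new Entry(lastRoot, root, !potentiallyAffected));
        }
        lastRoot = root;
    }

    int size() {
        return queue.size();
    }

    Entry last() {
        return queue.peekLast();
    }
}
```

In this sketch any number of consecutive unaffected commits occupies a single slot, and the next real change carries its own previousRoot instead of relying on the preceding call.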





[jira] [Updated] (OAK-4916) Add support for excluding commits to BackgroundObserver

2016-10-10 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-4916:
-
Attachment: OAK-4916.patch

Attaching [^OAK-4916.patch] which implements the {{isExcluded}} subclass hook 
as well as the {{NOOP_CHANGED}} CommitInfo in BackgroundObserver as described, 
including test cases.

> Add support for excluding commits to BackgroundObserver
> ---
>
> Key: OAK-4916
> URL: https://issues.apache.org/jira/browse/OAK-4916
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core
>Affects Versions: 1.5.11
>Reporter: Stefan Egli
>Assignee: Stefan Egli
> Fix For: 1.6
>
> Attachments: OAK-4916.patch
>
>
> As part of pre-filtering commits it would be useful to have support in the 
> BackgroundObserver (in general) for excluding certain commits from being 
> added to the (BackgroundObserver's) queue, thus keeping the queue smaller. 
> The actual filtering is up to subclasses.
> The suggested implementation is as follows:
> * a new method {{isExcluded}} is introduced which represents a subclass hook 
> for filtering
> * excluded commits are not added to the queue
> * when multiple commits are excluded consecutively, they are collapsed
> * the first non-excluded commit (ContentChange) added to the queue is marked 
> with the last non-excluded root state as the 'previous root'
> * downstream Observers are notified of the exclusion of a commit via a 
> special CommitInfo {{NOOP_CHANGE}}: this instructs it to exclude this change 
> while at the same time 'fast-forwarding' the root state to the new one.
> ** this extra token is one way of solving the problem that 
> {{Observer.contentChanged}} represents a diff between two states but does not 
> transport the 'from' state explicitly - that is implicitly taken from the 
> previous call to {{contentChanged}}. Thus using such a gap token 
> ({{NOOP_CHANGE}}) seems to be the only way to instruct Observers to skip a 
> change.
> To repeat: whoever extends BackgroundObserver with filtering must be aware of 
> the new {{NOOP_CHANGE}} token. Anyone not doing filtering will not get any 
> {{NOOP_CHANGE}} tokens though.
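
A rough sketch of the subclass hook and the 'previous root' bookkeeping from the list above. It is not the attached patch: the class is hypothetical, root states and commit infos are plain strings, and the notification of downstream Observers via the special CommitInfo is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simplification of a queueing observer with an exclusion hook.
abstract class ExcludingObserverSketch {

    static final class ContentChange {
        final String previousRoot;
        final String root;

        ContentChange(String previousRoot, String root) {
            this.previousRoot = previousRoot;
            this.root = root;
        }
    }

    private final List<ContentChange> queue = new ArrayList<>();
    private String lastIncludedRoot;

    ExcludingObserverSketch(String initialRoot) {
        this.lastIncludedRoot = initialRoot;
    }

    // Subclass hook: return true to keep this commit out of the queue.
    protected abstract boolean isExcluded(String root, String commitInfo);

    final void contentChanged(String root, String commitInfo) {
        if (isExcluded(root, commitInfo)) {
            // Excluded commits never reach the queue, so runs of them collapse to nothing.
            return;
        }
        // The last non-excluded root becomes the 'previous root' of this change.
        queue.add(new ContentChange(lastIncludedRoot, root));
        lastIncludedRoot = root;
    }

    List<ContentChange> queue() {
        return queue;
    }
}
```

A subclass only has to implement isExcluded; the queue itself stays unaware of why a commit was dropped.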





[jira] [Created] (OAK-5023) add applyOnOwnNode property to OakEventFilter

2016-10-27 Thread Stefan Egli (JIRA)
Stefan Egli created OAK-5023:


 Summary: add applyOnOwnNode property to OakEventFilter
 Key: OAK-5023
 URL: https://issues.apache.org/jira/browse/OAK-5023
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: jcr
Affects Versions: 1.5.12
Reporter: Stefan Egli
Assignee: Stefan Egli


There seems to be a rather frequent use case of observation around which would 
like to create a filter on a _child_ rather than on a _parent_: consider the 
case when you'd like to filter for the removal of a node that has a particular 
nodeType. This can't be achieved atm as the nodeType is applicable to the 
parent of the node that changes, not the node itself (ie child).

Therefore suggesting the introduction of a flag similar to the following:
{code}
boolean applyOnOwnNode;
{code}
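
To make the intended effect of such a flag concrete, here is a small sketch. The class, the method and the map of path to node type are assumptions made purely for illustration; only the flag name comes from the suggestion above.

```java
import java.util.Map;

// Hypothetical node type check; nodeTypes maps a node path to its node type.
class NodeTypeFilterSketch {

    static boolean matches(Map<String, String> nodeTypes, String changedPath,
                           String wantedType, boolean applyOnOwnNode) {
        // Without the flag the node type condition is evaluated on the parent.
        String target = applyOnOwnNode ? changedPath : parentOf(changedPath);
        return wantedType.equals(nodeTypes.get(target));
    }

    static String parentOf(String path) {
        int idx = path.lastIndexOf('/');
        return idx <= 0 ? "/" : path.substring(0, idx);
    }
}
```

With the flag set, a filter for a particular node type matches the changed (e.g. removed) node itself rather than the node above it.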





[jira] [Updated] (OAK-5019) Support glob patterns through OakEventFilter

2016-10-27 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-5019:
-
Description: 
(Originally reported as JCR-4044, but moved to Oak as a result of introducing 
the OakEventFilter in OAK-5013. From the original description: )

In the Sling project, we would like to register JCR listeners based on glob 
patterns as defined in
https://jackrabbit.apache.org/oak/docs/apidocs/org/apache/jackrabbit/oak/plugins/observation/filter/GlobbingPathFilter.html

So basically instead (or in addition) to specifying an absolute path, defining 
patterns.

[Discussion 
thread|https://lists.apache.org/thread.html/300f84574bbb039cebe35aab1afc21e043560a1c0471e456a2f5c0be@%3Cdev.jackrabbit.apache.org%3E]
/cc [~cziegeler]

  was:
(Originally reported as JCR-4044, but moved to Oak as a result of introducing 
the OakEventFilter in OAK-5013. From the original description:)

In the Sling project, we would like to register JCR listeners based on glob 
patterns as defined in
https://jackrabbit.apache.org/oak/docs/apidocs/org/apache/jackrabbit/oak/plugins/observation/filter/GlobbingPathFilter.html

So basically instead (or in addition) to specifying an absolute path, defining 
patterns.

[Discussion 
thread|https://lists.apache.org/thread.html/300f84574bbb039cebe35aab1afc21e043560a1c0471e456a2f5c0be@%3Cdev.jackrabbit.apache.org%3E]
/cc [~cziegeler]


> Support glob patterns through OakEventFilter
> 
>
> Key: OAK-5019
> URL: https://issues.apache.org/jira/browse/OAK-5019
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Affects Versions: 1.5.12
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>
> (Originally reported as JCR-4044, but moved to Oak as a result of introducing 
> the OakEventFilter in OAK-5013. From the original description: )
> In the Sling project, we would like to register JCR listeners based on glob 
> patterns as defined in
> https://jackrabbit.apache.org/oak/docs/apidocs/org/apache/jackrabbit/oak/plugins/observation/filter/GlobbingPathFilter.html
> So basically instead (or in addition) to specifying an absolute path, 
> defining patterns.
> [Discussion 
> thread|https://lists.apache.org/thread.html/300f84574bbb039cebe35aab1afc21e043560a1c0471e456a2f5c0be@%3Cdev.jackrabbit.apache.org%3E]
> /cc [~cziegeler]





[jira] [Updated] (OAK-5023) add applyOnOwnNode property to OakEventFilter

2016-10-27 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-5023:
-
Description: 
(Originally reported as JCR-4048, but moved to Oak as a result of introducing 
the OakEventFilter in OAK-5013. From the original description: )

There seems to be a rather frequent use case of observation around which would 
like to create a filter on a _child_ rather than on a _parent_: consider the 
case when you'd like to filter for the removal of a node that has a particular 
nodeType. This can't be achieved atm as the nodeType is applicable to the 
parent of the node that changes, not the node itself (ie child).

Therefore suggesting the introduction of a flag similar to the following:
{code}
boolean applyOnOwnNode;
{code}

  was:
There seems to be a rather frequent use case of observation around which would 
like to create a filter on a _child_ rather than on a _parent_: consider the 
case when you'd like to filter for the removal of a node that has a particular 
nodeType. This can't be achieved atm as the nodeType is applicable to the 
parent of the node that changes, not the node itself (ie child).

Therefore suggesting the introduction of a flag similar to the following:
{code}
boolean applyOnOwnNode;
{code}


> add applyOnOwnNode property to OakEventFilter
> -
>
> Key: OAK-5023
> URL: https://issues.apache.org/jira/browse/OAK-5023
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Affects Versions: 1.5.12
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>
> (Originally reported as JCR-4048, but moved to Oak as a result of introducing 
> the OakEventFilter in OAK-5013. From the original description: )
> There seems to be a rather frequent use case of observation around which 
> would like to create a filter on a _child_ rather than on a _parent_: 
> consider the case when you'd like to filter for the removal of a node that 
> has a particular nodeType. This can't be achieved atm as the nodeType is 
> applicable to the parent of the node that changes, not the node itself (ie 
> child).
> Therefore suggesting the introduction of a flag similar to the following:
> {code}
> boolean applyOnOwnNode;
> {code}





[jira] [Created] (OAK-5019) Support glob patterns through OakEventFilter

2016-10-27 Thread Stefan Egli (JIRA)
Stefan Egli created OAK-5019:


 Summary: Support glob patterns through OakEventFilter
 Key: OAK-5019
 URL: https://issues.apache.org/jira/browse/OAK-5019
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: jcr
Affects Versions: 1.5.12
Reporter: Stefan Egli
Assignee: Stefan Egli


(Originally reported as JCR-4044, but moved to Oak as a result of introducing 
the OakEventFilter in OAK-5013. From the original description:)

In the Sling project, we would like to register JCR listeners based on glob 
patterns as defined in
https://jackrabbit.apache.org/oak/docs/apidocs/org/apache/jackrabbit/oak/plugins/observation/filter/GlobbingPathFilter.html

So basically instead (or in addition) to specifying an absolute path, defining 
patterns.

[Discussion 
thread|https://lists.apache.org/thread.html/300f84574bbb039cebe35aab1afc21e043560a1c0471e456a2f5c0be@%3Cdev.jackrabbit.apache.org%3E]
/cc [~cziegeler]





[jira] [Commented] (OAK-5013) Introduce observation filter extension mechanism to Oak

2016-10-27 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15611199#comment-15611199
 ] 

Stefan Egli commented on OAK-5013:
--

agreed, I'll adjust the wording accordingly. "write-through" was meant to be 
only between OakEventListener and the underlying, not that it modifies the 
filter after registration.

> Introduce observation filter extension mechanism to Oak
> ---
>
> Key: OAK-5013
> URL: https://issues.apache.org/jira/browse/OAK-5013
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Affects Versions: 1.5.12
>Reporter: Stefan Egli
>Assignee: Stefan Egli
> Fix For: 1.5.13
>
> Attachments: OAK-5013.patch
>
>
> During [discussions|http://markmail.org/thread/fyngo4dwb7fvlqdj] regarding 
> extending JackrabbitEventFilter an interesting suggestion by [~mduerig] came 
> up that we could extend the JackrabbitEventFilter in oak explicitly, rather 
> than modifying it in Jackrabbit each time we add something oak-specific.
> We should introduce a builder/interface pair in oak to reflect that. 
> Users that would like to use such oak-specific extensions would wrap a 
> JackrabbitEventFilter and get an OakEventFilter that allows enabling these 
> extensions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (OAK-5021) Improve observation of files

2016-10-27 Thread Stefan Egli (JIRA)
Stefan Egli created OAK-5021:


 Summary: Improve observation of files
 Key: OAK-5021
 URL: https://issues.apache.org/jira/browse/OAK-5021
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: jcr
Affects Versions: 1.5.12
Reporter: Stefan Egli
Assignee: Stefan Egli


(Originally reported as JCR-4046, but moved to Oak as a result of introducing 
the OakEventFilter in OAK-5013. From the original description: )

A file in JCR is represented by at least two nodes, the nt:file node and a 
child node named jcr:content holding the contents of the file (and metadata).
This has the consequence that if the contents of a file change, a change event 
for the jcr:content node is reported - but not for the nt:file node.
This makes creating listeners that listen for changes in files complicated, as 
you can't use the file name to filter - especially with glob patterns (see 
JCR-4044 - now OAK-5019) this becomes troublesome.
In addition, whenever you get a change for a jcr:content node, you have to 
check if the parent is a nt:file node and decide based on the result.
It would be great to have a flag on the JackrabbitEventFilter to enable smarter 
reporting just for nt:files: if a property on jcr:content is changed, a change 
to the nt:file node is reported.
See also SLING-6163 and OAK-4940

/cc [~cziegeler]
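
The 'smarter reporting' for nt:file could look roughly like the following sketch. The class, the method and the path-to-node-type map are hypothetical; the sketch only shows the mapping of a jcr:content change to its nt:file parent.

```java
import java.util.Map;

// Hypothetical helper; nodeTypes maps a node path to its node type.
class FileChangeMapperSketch {

    private static final String JCR_CONTENT = "/jcr:content";

    // Returns the path that would be reported for a change on the given node.
    static String reportedPath(Map<String, String> nodeTypes, String changedNodePath) {
        if (changedNodePath.endsWith(JCR_CONTENT)) {
            String parent = changedNodePath.substring(
                    0, changedNodePath.length() - JCR_CONTENT.length());
            if ("nt:file".equals(nodeTypes.get(parent))) {
                // Report the change on the file node instead of on jcr:content.
                return parent;
            }
        }
        return changedNodePath;
    }
}
```

A listener that filters on the file name (or a glob on it) would then see the nt:file path directly and would not have to inspect the parent itself.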





[jira] [Created] (OAK-5022) add includeSubtreeOnDelete flag to OakEventFilter

2016-10-27 Thread Stefan Egli (JIRA)
Stefan Egli created OAK-5022:


 Summary: add includeSubtreeOnDelete flag to OakEventFilter
 Key: OAK-5022
 URL: https://issues.apache.org/jira/browse/OAK-5022
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: jcr
Affects Versions: 1.5.12
Reporter: Stefan Egli
Assignee: Stefan Egli


(Originally reported as JCR-4037, but moved to Oak as a result of introducing 
the OakEventFilter in OAK-5013. From the original description: )

In some cases of observation it would be useful to receive events for the child 
nodes or properties of a parent/grandparent that was deleted. Currently (in Oak 
at least) one only receives a deleted event for the node that was deleted and 
none for its children.

Suggesting to (re)introduce a flag, eg as follows to the JackrabbitEventFilter:
{code}
boolean includeSubtreeOnRemove;
{code}

(Open for any better name of course)
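
As an illustration of what the flag would change, here is a minimal sketch. The class and method are hypothetical, and the content is modelled as a plain collection of node paths that existed before the removal.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical helper; existingPaths lists the node paths present before the removal.
class SubtreeRemovalSketch {

    static List<String> removalEvents(Collection<String> existingPaths,
                                      String removedPath,
                                      boolean includeSubtreeOnRemove) {
        List<String> events = new ArrayList<>();
        // The removed node itself is always reported.
        events.add(removedPath);
        if (includeSubtreeOnRemove) {
            String prefix = removedPath.endsWith("/") ? removedPath : removedPath + "/";
            for (String path : existingPaths) {
                if (path.startsWith(prefix)) {
                    // With the flag, every node below the removed one is reported too.
                    events.add(path);
                }
            }
        }
        return events;
    }
}
```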



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (OAK-5020) Improved support for node removals

2016-10-27 Thread Stefan Egli (JIRA)
Stefan Egli created OAK-5020:


 Summary: Improved support for node removals
 Key: OAK-5020
 URL: https://issues.apache.org/jira/browse/OAK-5020
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: jcr
Affects Versions: 1.5.12
Reporter: Stefan Egli
Assignee: Stefan Egli


(Originally reported as JCR-4045, but moved to Oak as a result of introducing 
the OakEventFilter in OAK-5013. From the original description: )

If a listener is subscribed for removal events of a subtree, e.g. /a/b/c/d, it 
gets removal events for everything in that tree.
However, if /a/b is removed, the listener is not informed at all, which makes 
the listener state inconsistent/invalid.
I suggest adding a new flag to the JackrabbitEventFilter; if that is enabled, 
the listener will get remove events of all the parent nodes - if the listener 
is interested in remove events of any kind.
/cc [~cziegeler]
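
The /a/b/c/d example can be sketched as a simple path check. The class, the method and the flag name are hypothetical and only meant to show which removals such a flag would additionally deliver.

```java
// Hypothetical path check for removal events; not an existing Oak or Jackrabbit API.
class AncestorRemovalSketch {

    static boolean isInterested(String subscribedPath, String removedPath,
                                boolean includeAncestorsRemove) {
        // Removals of the subscribed node or anything below it are always delivered.
        if (removedPath.equals(subscribedPath)
                || removedPath.startsWith(subscribedPath + "/")) {
            return true;
        }
        // With the flag, removals of any ancestor of the subscribed node are delivered too.
        String ancestorPrefix = removedPath.endsWith("/") ? removedPath : removedPath + "/";
        return includeAncestorsRemove && subscribedPath.startsWith(ancestorPrefix);
    }
}
```

For a listener on /a/b/c/d, a removal of /a/b is only delivered when the flag is set, while a removal of an unrelated sibling such as /a/bb is never delivered.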





[jira] [Updated] (OAK-5013) Introduce observation filter extension mechanism to Oak

2016-10-26 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-5013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-5013:
-
Fix Version/s: 1.5.13

> Introduce observation filter extension mechanism to Oak
> ---
>
> Key: OAK-5013
> URL: https://issues.apache.org/jira/browse/OAK-5013
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Affects Versions: 1.5.12
>Reporter: Stefan Egli
>Assignee: Stefan Egli
> Fix For: 1.5.13
>
> Attachments: OAK-5013.patch
>
>
> During [discussions|http://markmail.org/thread/fyngo4dwb7fvlqdj] regarding 
> extending JackrabbitEventFilter an interesting suggestion by [~mduerig] came 
> up that we could extend the JackrabbitEventFilter in oak explicitly, rather 
> than modifying it in Jackrabbit each time we add something oak-specific.
> We should introduce a builder/interface pair in oak to reflect that. 
> Users that would like to use such oak-specific extensions would wrap a 
> JackrabbitEventFilter and get an OakEventFilter that allows enabling these 
> extensions.





[jira] [Updated] (OAK-5013) Introduce observation filter extension mechanism to Oak

2016-10-26 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-5013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-5013:
-
Attachment: OAK-5013.patch

Attached [^OAK-5013.patch] shows a static factory and an abstract base class 
{{OakEventFilter}} which could serve as an API. Later changes to the 
{{OakEventFilter}} could then be done in a way that existing client code would 
not have to be touched ("@ProviderType")

> Introduce observation filter extension mechanism to Oak
> ---
>
> Key: OAK-5013
> URL: https://issues.apache.org/jira/browse/OAK-5013
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Affects Versions: 1.5.12
>Reporter: Stefan Egli
>Assignee: Stefan Egli
> Attachments: OAK-5013.patch
>
>
> During [discussions|http://markmail.org/thread/fyngo4dwb7fvlqdj] regarding 
> extending JackrabbitEventFilter an interesting suggestion by [~mduerig] came 
> up that we could extend the JackrabbitEventFilter in oak explicitly, rather 
> than modifying it in Jackrabbit each time we add something oak-specific.
> We should introduce a builder/interface pair in oak to reflect that. 
> Users that would like to use such oak-specific extensions would wrap a 
> JackrabbitEventFilter and get an OakEventFilter that allows enabling these 
> extensions.





[jira] [Created] (OAK-5013) Introduce observation filter extension mechanism to Oak

2016-10-26 Thread Stefan Egli (JIRA)
Stefan Egli created OAK-5013:


 Summary: Introduce observation filter extension mechanism to Oak
 Key: OAK-5013
 URL: https://issues.apache.org/jira/browse/OAK-5013
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: jcr
Affects Versions: 1.5.12
Reporter: Stefan Egli
Assignee: Stefan Egli


During [discussions|http://markmail.org/thread/fyngo4dwb7fvlqdj] regarding 
extending JackrabbitEventFilter an interesting suggestion by [~mduerig] came up 
that we could extend the JackrabbitEventFilter in oak explicitly, rather than 
modifying it in Jackrabbit each time we add something oak-specific.
We should introduce a builder/interface pair in oak to reflect that. 
Users that would like to use such oak-specific extensions would wrap a 
JackrabbitEventFilter and get an OakEventFilter that allows enabling these 
extensions.
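
A minimal sketch of the wrapping idea, assuming a static wrap method and a fluent flag setter. Both classes below are hypothetical stand-ins (the base class plays the role of JackrabbitEventFilter and models a single setting), so this shows the shape of the mechanism rather than an actual API.

```java
// Hypothetical stand-in for JackrabbitEventFilter; only one setting is modelled.
class BaseEventFilterSketch {
    private String absPath = "/";

    String getAbsPath() {
        return absPath;
    }

    BaseEventFilterSketch setAbsPath(String absPath) {
        this.absPath = absPath;
        return this;
    }
}

// Hypothetical Oak-side extension that wraps the base filter and adds Oak-only flags.
class OakEventFilterSketch {
    private final BaseEventFilterSketch delegate;
    private boolean applyOnOwnNode;

    private OakEventFilterSketch(BaseEventFilterSketch delegate) {
        this.delegate = delegate;
    }

    static OakEventFilterSketch wrap(BaseEventFilterSketch base) {
        return new OakEventFilterSketch(base);
    }

    // Oak-specific extension; the wrapped base filter class stays untouched.
    OakEventFilterSketch withApplyOnOwnNode() {
        this.applyOnOwnNode = true;
        return this;
    }

    boolean isApplyOnOwnNode() {
        return applyOnOwnNode;
    }

    // Settings of the wrapped filter remain visible through the wrapper.
    String getAbsPath() {
        return delegate.getAbsPath();
    }
}
```

New Oak-only options would then be added to the wrapper, without changing the wrapped filter class each time.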





[jira] [Updated] (OAK-4908) Best-effort prefiltering in ChangeProcessor based on ChangeSet

2016-11-09 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-4908:
-
Fix Version/s: 1.6

> Best-effort prefiltering in ChangeProcessor based on ChangeSet
> --
>
> Key: OAK-4908
> URL: https://issues.apache.org/jira/browse/OAK-4908
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core, jcr
>Affects Versions: 1.5.11
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>  Labels: review
> Fix For: 1.6, 1.5.13
>
> Attachments: OAK-4908.patch, OAK-4908.v2.patch, OAK-4908.v3.patch, 
> OAK-4908.v4.patch, OAK-4908.v5.patch
>
>
> This is a subtask as a result of 
> [discussions|https://issues.apache.org/jira/browse/OAK-4796?focusedCommentId=15550962=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15550962]
>  around design in OAK-4796:
> Based on the ChangeSet provided with OAK-4907 in the CommitContext, the 
> ChangeProcessor should do a best-effort prefiltering of commits before they 
> get added to the (BackgroundObserver's) queue.
> This consists of the following parts:
> * -the support for optionally excluding commits from being added to the queue 
> in the BackgroundObserver- EDIT: factored that out into OAK-4916
> * -the BackgroundObserver signaling downstream Observers that a change should 
> be excluded via a {{NOOP_CHANGE}} CommitInfo- EDIT: factored that out into 
> OAK-4916
> * the ChangeProcessor using OAK-4907's ChangeSet of the CommitContext for 
> best-effort prefiltering - and handling the {{NOOP_CHANGED}} CommitInfo 
> introduced in OAK-4916
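
The best-effort character of such a prefilter can be sketched as below. This is a deliberately simplified illustration: the class and method are hypothetical, only changed paths are considered, and a null set stands for "no change information available". A real prefilter has to stay conservative, since wrongly excluding a commit would lose events.

```java
import java.util.Set;

// Hypothetical path-only prefilter; not the ChangeProcessor implementation.
class PrefilterSketch {

    // Returns true only if the commit can safely be kept out of the listener's queue.
    static boolean excludeCommit(Set<String> changedPaths, String listenerRoot) {
        if (changedPaths == null) {
            // Without change information no prefiltering is possible: include the commit.
            return false;
        }
        String prefix = listenerRoot.endsWith("/") ? listenerRoot : listenerRoot + "/";
        for (String path : changedPaths) {
            if (path.equals(listenerRoot) || path.startsWith(prefix)) {
                // Something at or below the listener's path changed: include the commit.
                return false;
            }
        }
        return true;
    }
}
```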





[jira] [Updated] (OAK-4916) Add support for excluding commits to BackgroundObserver

2016-11-09 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-4916:
-
Fix Version/s: 1.6

> Add support for excluding commits to BackgroundObserver
> ---
>
> Key: OAK-4916
> URL: https://issues.apache.org/jira/browse/OAK-4916
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core
>Affects Versions: 1.5.11
>Reporter: Stefan Egli
>Assignee: Stefan Egli
> Fix For: 1.6, 1.5.13
>
> Attachments: FilteringObserver.patch, OAK-4916.patch, 
> OAK-4916.v2.patch
>
>
> As part of pre-filtering commits it would be useful to have support in the 
> BackgroundObserver (in general) for excluding certain commits from being 
> added to the (BackgroundObserver's) queue, thus keeping the queue smaller. 
> The actual filtering is up to subclasses.
> The suggested implementation is as follows:
> * a new method {{isExcluded}} is introduced which represents a subclass hook 
> for filtering
> * excluded commits are not added to the queue
> * when multiple commits are excluded consecutively, they are collapsed
> * the first non-excluded commit (ContentChange) added to the queue is marked 
> with the last non-excluded root state as the 'previous root'
> * downstream Observers are notified of the exclusion of a commit via a 
> special CommitInfo {{NOOP_CHANGE}}: this instructs it to exclude this change 
> while at the same time 'fast-forwarding' the root state to the new one.
> ** this extra token is one way of solving the problem that 
> {{Observer.contentChanged}} represents a diff between two states but does not 
> transport the 'from' state explicitly - that is implicitly taken from the 
> previous call to {{contentChanged}}. Thus using such a gap token 
> ({{NOOP_CHANGE}}) seems to be the only way to instruct Observers to skip a 
> change.
> To repeat: whoever extends BackgroundObserver with filtering must be aware of 
> the new {{NOOP_CHANGE}} token. Anyone not doing filtering will not get any 
> {{NOOP_CHANGE}} tokens though.
> NOTE: See [comment further 
> below|https://issues.apache.org/jira/browse/OAK-4916?focusedCommentId=15572165=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15572165]
>  with a new suggested approach, which doesn't use NOOP_CHANGED but introduces 
> a new FilteringAwareObserver instead.





[jira] [Updated] (OAK-4796) filter events before adding to ChangeProcessor's queue

2016-11-09 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-4796:
-
Fix Version/s: 1.6

> filter events before adding to ChangeProcessor's queue
> --
>
> Key: OAK-4796
> URL: https://issues.apache.org/jira/browse/OAK-4796
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Affects Versions: 1.5.9
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>  Labels: observation
> Fix For: 1.6, 1.5.13
>
> Attachments: OAK-4796.changeSet.patch, OAK-4796.patch
>
>
> Currently the 
> [ChangeProcessor.contentChanged|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L335]
>  is in charge of doing the event diffing and filtering and does so in a 
> pooled Thread, ie asynchronously, at a later stage independent from the 
> commit. This has the advantage that the commit is fast, but has the following 
> potentially negative effects:
> # events (in the form of ContentChange Objects) occupy a slot of the queue 
> even if the listener is not interested in it - any commit lands on any 
> listener's queue. This reduces the capacity of the queue for 'actual' events 
> to be delivered. It therefore increases the risk that the queue fills - and 
> when full has various consequences such as losing the CommitInfo etc.
> # each event==ContentChange later on must be evaluated, and for that a diff 
> must be calculated. Depending on runtime behavior that diff might be 
> expensive if no longer in the cache (documentMk specifically).
> As an improvement, this diffing+filtering could be done at an earlier stage 
> already, nearer to the commit, and in case the filter would ignore the event, 
> it would not have to be put into the queue at all, thus avoiding occupying a 
> slot and later potentially slower diffing.
> The suggestion is to implement this via the following algorithm:
> * During the commit, in a {{Validator}} the listener's filters are evaluated 
> - in an as-efficient-as-possible manner (Reason for doing it in a Validator 
> is that this doesn't add overhead as oak already goes through all changes for 
> other Validators). As a result a _list of potentially affected observers_ is 
> added to the {{CommitInfo}} (false positives are fine).
> ** Note that the above adds cost to the commit and must therefore be 
> carefully done and measured
> ** One potential measure could be to only do filtering when listener's queues 
> are larger than a certain threshold (eg 10)
> * The ChangeProcessor in {{contentChanged}} (in the one created in 
> [createObserver|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L224])
>  then checks the new commitInfo's _potentially affected observers_ list and 
> if it's not in the list, adds a {{NOOP}} token at the end of the queue. If 
> there's already a NOOP there, the two are collapsed (this way when a filter 
> is not affected it would have a NOOP at the end of the queue). If later on a 
> no-NOOP item is added, the NOOP's {{root}} is used as the {{previousRoot}} 
> for the newly added {{ContentChange}} obj.
> ** To achieve that, the ContentChange obj is extended to not only have the 
> "to" {{root}} pointer, but also the "from" {{previousRoot}} pointer which 
> currently is implicitly maintained.





[jira] [Updated] (OAK-4907) Collect changes (paths, nts, props..) of a commit in a validator

2016-11-09 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-4907:
-
Fix Version/s: 1.6

> Collect changes (paths, nts, props..) of a commit in a validator
> 
>
> Key: OAK-4907
> URL: https://issues.apache.org/jira/browse/OAK-4907
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core
>Affects Versions: 1.5.11
>Reporter: Stefan Egli
>Assignee: Stefan Egli
> Fix For: 1.6, 1.5.13
>
> Attachments: OAK-4907.patch, OAK-4907.v2.patch
>
>
> It would be useful to collect a set of changes of a commit (eg in a 
> validator) that could later be used in an Observer for eg prefiltering.
> Such a change collector should collect paths, nodetypes, properties, 
> node-names (and perhaps more at a later stage) of all changes and store the 
> result in the CommitInfo's CommitContext.
> Note that this is a result of 
> [discussions|https://issues.apache.org/jira/browse/OAK-4796?focusedCommentId=15550962=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15550962]
>  around design in OAK-4796
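
A change collector along these lines might be sketched as follows. The class and its single callback are hypothetical; a real validator would be driven by the commit's diff and would store the result in the CommitContext, which is not modelled here.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical collector of what a commit touched; not the Oak validator API.
class ChangeCollectorSketch {
    private final Set<String> paths = new LinkedHashSet<>();
    private final Set<String> nodeTypes = new LinkedHashSet<>();
    private final Set<String> nodeNames = new LinkedHashSet<>();
    private final Set<String> propertyNames = new LinkedHashSet<>();

    // Called once per changed property of a node.
    void propertyChanged(String nodePath, String nodeType, String propertyName) {
        paths.add(nodePath);
        nodeTypes.add(nodeType);
        nodeNames.add(nameOf(nodePath));
        propertyNames.add(propertyName);
    }

    // Last path segment, e.g. "a" for "/content/a".
    static String nameOf(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    Set<String> getPaths() {
        return Collections.unmodifiableSet(paths);
    }

    Set<String> getNodeTypes() {
        return Collections.unmodifiableSet(nodeTypes);
    }

    Set<String> getNodeNames() {
        return Collections.unmodifiableSet(nodeNames);
    }

    Set<String> getPropertyNames() {
        return Collections.unmodifiableSet(propertyNames);
    }
}
```

Because only sets are kept, the collected data stays small even when a commit touches the same node many times.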





[jira] [Commented] (OAK-5072) ChangeCollectorProvider should enable metatype support

2016-11-05 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15639557#comment-15639557
 ] 

Stefan Egli commented on OAK-5072:
--

+1, 
thx, looks good (just a very minor typo fixed in 1768208)

> ChangeCollectorProvider should enable metatype support
> --
>
> Key: OAK-5072
> URL: https://issues.apache.org/jira/browse/OAK-5072
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Minor
> Fix For: 1.6, 1.5.13
>
>
> {{ChangeCollectorProvider}} exposes some OSGi config but does not have 
> metatype support enabled. To allow configuring those config params, it should 
> be enabled.





[jira] [Commented] (OAK-5066) Provide a config option to disable lease check at DocumentNodeStoreService level

2016-11-05 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15639611#comment-15639611
 ] 

Stefan Egli commented on OAK-5066:
--

[~chetanm], shouldn't {{setLeaseCheck}} be set to true by default? I think you 
might have to invert that flag there.

> Provide a config option to disable lease check at DocumentNodeStoreService 
> level
> 
>
> Key: OAK-5066
> URL: https://issues.apache.org/jira/browse/OAK-5066
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Minor
> Fix For: 1.6
>
> Attachments: OAK-5066-v1.patch
>
>
> Currently it's not possible to completely disable the lease check at 
> DocumentNodeStoreService. The system property 
> {{oak.documentMK.disableLeaseCheck}} only disables some logic in 
> ClusterNodeInfo, but the lease check wrapper still gets used.
> We should modify the logic to also avoid wrapping if this property is enabled.
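The inversion between a "disable" style option and an "enable" style setter can be sketched as follows. This is only an illustration under assumptions: {{LeaseCheckConfigSketch}} and {{leaseCheckEnabled}} are hypothetical names, and only the system property name comes from the issue text.

```java
// Hypothetical sketch; class and method names are illustrative, not the Oak API.
final class LeaseCheckConfigSketch {

    // Maps a "disable" style option onto an "enable" style flag.
    // With the option unset (false), the lease check stays on.
    static boolean leaseCheckEnabled(boolean disableLeaseCheck) {
        return !disableLeaseCheck;
    }

    public static void main(String[] args) {
        // Boolean.getBoolean returns false when the system property is absent,
        // so the default outcome is an enabled lease check.
        boolean disabled = Boolean.getBoolean("oak.documentMK.disableLeaseCheck");
        System.out.println("lease check enabled: " + leaseCheckEnabled(disabled));
    }
}
```

Passing the "disable" value straight through to a {{setLeaseCheck}} style setter without the negation would turn the check off by default, which is the concern raised in the comment above.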





[jira] [Commented] (OAK-4908) Best-effort prefiltering in ChangeProcessor based on ChangeSet

2016-11-07 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15646805#comment-15646805
 ] 

Stefan Egli commented on OAK-4908:
--

[~mreutegg], why are you reverting? Was just going to fix OAK-5082

> Best-effort prefiltering in ChangeProcessor based on ChangeSet
> --
>
> Key: OAK-4908
> URL: https://issues.apache.org/jira/browse/OAK-4908
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core, jcr
>Affects Versions: 1.5.11
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Blocker
>  Labels: review
> Fix For: 1.5.13
>
> Attachments: OAK-4908.patch, OAK-4908.v2.patch, OAK-4908.v3.patch, 
> OAK-4908.v4.patch
>
>
> This is a subtask as a result of 
> [discussions|https://issues.apache.org/jira/browse/OAK-4796?focusedCommentId=15550962&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15550962]
>  around design in OAK-4796:
> Based on the ChangeSet provided with OAK-4907 in the CommitContext, the 
> ChangeProcessor should do a best-effort prefiltering of commits before they 
> get added to the (BackgroundObserver's) queue.
> This consists of the following parts:
> * -the support for optionally excluding commits from being added to the queue 
> in the BackgroundObserver- EDIT: factored that out into OAK-4916
> * -the BackgroundObserver signaling downstream Observers that a change should 
> be excluded via a {{NOOP_CHANGE}} CommitInfo- EDIT: factored that out into 
> OAK-4916
> * the ChangeProcessor using OAK-4907's ChangeSet of the CommitContext for 
> best-effort prefiltering - and handling the {{NOOP_CHANGED}} CommitInfo 
> introduced in OAK-4916
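The "best-effort" part can be illustrated with a small sketch: a commit may only be excluded when it is certain that the listener is not interested, and every uncertain case falls back to including it. The names below ({{PrefilterSketch}}, {{excludes}}) are hypothetical, and treating a missing ChangeSet as {{null}} is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: names are illustrative and not taken from Oak.
final class PrefilterSketch {

    // Returns true only if the commit can safely be skipped for this listener.
    // changedNodeTypes == null means no ChangeSet is available for the commit.
    static boolean excludes(Set<String> changedNodeTypes, Set<String> listenerNodeTypes) {
        if (changedNodeTypes == null) {
            return false; // nothing known about the commit: must not exclude
        }
        if (listenerNodeTypes.isEmpty()) {
            return false; // listener has no node type restriction
        }
        for (String nt : changedNodeTypes) {
            if (listenerNodeTypes.contains(nt)) {
                return false; // possible match: let the later diff decide
            }
        }
        return true; // no overlap: the listener cannot be affected
    }

    public static void main(String[] args) {
        Set<String> changed = new HashSet<>(Arrays.asList("nt:file", "nt:resource"));
        Set<String> listener = Collections.singleton("nt:unstructured");
        System.out.println(excludes(changed, listener));
        System.out.println(excludes(null, listener));
    }
}
```

False positives (commits that are included but later produce no events) are acceptable here; false negatives are not, which is why every uncertain branch returns {{false}}.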





[jira] [Commented] (OAK-4908) Best-effort prefiltering in ChangeProcessor based on ChangeSet

2016-11-08 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15646862#comment-15646862
 ] 

Stefan Egli commented on OAK-4908:
--

you mean the failing test of OAK-5082 or something else? One failing test 
shouldn't cause too much drama if it's going to be fixed quickly

> Best-effort prefiltering in ChangeProcessor based on ChangeSet
> --
>
> Key: OAK-4908
> URL: https://issues.apache.org/jira/browse/OAK-4908
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core, jcr
>Affects Versions: 1.5.11
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Blocker
>  Labels: review
> Fix For: 1.5.13
>
> Attachments: OAK-4908.patch, OAK-4908.v2.patch, OAK-4908.v3.patch, 
> OAK-4908.v4.patch
>
>
> This is a subtask as a result of 
> [discussions|https://issues.apache.org/jira/browse/OAK-4796?focusedCommentId=15550962&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15550962]
>  around design in OAK-4796:
> Based on the ChangeSet provided with OAK-4907 in the CommitContext, the 
> ChangeProcessor should do a best-effort prefiltering of commits before they 
> get added to the (BackgroundObserver's) queue.
> This consists of the following parts:
> * -the support for optionally excluding commits from being added to the queue 
> in the BackgroundObserver- EDIT: factored that out into OAK-4916
> * -the BackgroundObserver signaling downstream Observers that a change should 
> be excluded via a {{NOOP_CHANGE}} CommitInfo- EDIT: factored that out into 
> OAK-4916
> * the ChangeProcessor using OAK-4907's ChangeSet of the CommitContext for 
> best-effort prefiltering - and handling the {{NOOP_CHANGED}} CommitInfo 
> introduced in OAK-4916





[jira] [Commented] (OAK-5082) Test failure: testDerivedMixin(org.apache.jackrabbit.core.observation.MixinTest)

2016-11-08 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15646954#comment-15646954
 ] 

Stefan Egli commented on OAK-5082:
--

OAK-4908 is currently reverted, however a fix for OAK-5082 would have been (in 
ObservationManagerImpl.addEventListener):
{code}
// OAK-5082 : node type filtering should not only be direct but include derived types
// one easy way to solve this is to 'explode' the node types at start by including
// all subtypes of every registered node type
//TODO: what if there are multiple hierarchy levels, does subTypes return those too?
HashSet<String> explodedNodeTypes = newHashSet();
if (validatedNodeTypeNames != null) {
    for (String nt : validatedNodeTypeNames) {
        NodeTypeIterator it = ntMgr.getNodeType(nt).getSubtypes();
        while (it.hasNext()) {
            String subnt = String.valueOf(it.next());
            explodedNodeTypes.add(subnt);
        }
        explodedNodeTypes.add(nt);
    }
}
{code}
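The 'explode' step can also be shown in isolation. The sketch below does not use the JCR API: it assumes a hypothetical map from each node type to its direct subtypes and walks it transitively, so multiple hierarchy levels are covered regardless of what the lookup returns. Regarding the TODO, as far as I know JCR 2.0 specifies {{NodeType.getSubtypes()}} as returning all subtypes in the hierarchy and {{getDeclaredSubtypes()}} as returning only the direct ones. All names in the sketch ({{ExplodeNodeTypesSketch}}, {{my:base}} etc.) are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: works on a plain map instead of the JCR NodeTypeManager.
final class ExplodeNodeTypesSketch {

    // Returns the given node types plus all their direct and indirect subtypes.
    static Set<String> explode(Set<String> names, Map<String, List<String>> directSubtypes) {
        Set<String> result = new TreeSet<>();
        Deque<String> todo = new ArrayDeque<>(names);
        while (!todo.isEmpty()) {
            String nt = todo.pop();
            if (result.add(nt)) { // visit each type only once
                List<String> subs = directSubtypes.get(nt);
                if (subs != null) {
                    todo.addAll(subs);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> hierarchy = new HashMap<>();
        hierarchy.put("my:base", Arrays.asList("my:mid"));
        hierarchy.put("my:mid", Arrays.asList("my:leaf"));
        System.out.println(explode(Collections.singleton("my:base"), hierarchy));
    }
}
```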

> Test failure: 
> testDerivedMixin(org.apache.jackrabbit.core.observation.MixinTest)
> 
>
> Key: OAK-5082
> URL: https://issues.apache.org/jira/browse/OAK-5082
> Project: Jackrabbit Oak
>  Issue Type: Test
>  Components: jcr
>Reporter: Amit Jain
>Assignee: Marcel Reutegger
>
> Failed on travis - 
> https://travis-ci.org/apache/jackrabbit-oak/builds/173972640 as well as 
> locally.





[jira] [Updated] (OAK-4908) Best-effort prefiltering in ChangeProcessor based on ChangeSet

2016-11-08 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-4908:
-
Attachment: OAK-4908.v5.patch

Attaching [^OAK-4908.v5.patch], which contains the v4 changes by [~mreutegg] 
plus the fix for OAK-5082: node types are exploded (expanded to include their 
subtypes) before being passed to the ChangeSetFilterImpl. Tests on 
oak-core/oak-jcr run fine with this; the -PintegrationTesting run is still pending.

> Best-effort prefiltering in ChangeProcessor based on ChangeSet
> --
>
> Key: OAK-4908
> URL: https://issues.apache.org/jira/browse/OAK-4908
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core, jcr
>Affects Versions: 1.5.11
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>  Labels: review
> Fix For: 1.6
>
> Attachments: OAK-4908.patch, OAK-4908.v2.patch, OAK-4908.v3.patch, 
> OAK-4908.v4.patch, OAK-4908.v5.patch
>
>
> This is a subtask as a result of 
> [discussions|https://issues.apache.org/jira/browse/OAK-4796?focusedCommentId=15550962&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15550962]
>  around design in OAK-4796:
> Based on the ChangeSet provided with OAK-4907 in the CommitContext, the 
> ChangeProcessor should do a best-effort prefiltering of commits before they 
> get added to the (BackgroundObserver's) queue.
> This consists of the following parts:
> * -the support for optionally excluding commits from being added to the queue 
> in the BackgroundObserver- EDIT: factored that out into OAK-4916
> * -the BackgroundObserver signaling downstream Observers that a change should 
> be excluded via a {{NOOP_CHANGE}} CommitInfo- EDIT: factored that out into 
> OAK-4916
> * the ChangeProcessor using OAK-4907's ChangeSet of the CommitContext for 
> best-effort prefiltering - and handling the {{NOOP_CHANGED}} CommitInfo 
> introduced in OAK-4916





[jira] [Commented] (OAK-4908) Best-effort prefiltering in ChangeProcessor based on ChangeSet

2016-11-08 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15646903#comment-15646903
 ] 

Stefan Egli commented on OAK-4908:
--

unscheduling from 1.5.13 then

> Best-effort prefiltering in ChangeProcessor based on ChangeSet
> --
>
> Key: OAK-4908
> URL: https://issues.apache.org/jira/browse/OAK-4908
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core, jcr
>Affects Versions: 1.5.11
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>  Labels: review
> Fix For: 1.6
>
> Attachments: OAK-4908.patch, OAK-4908.v2.patch, OAK-4908.v3.patch, 
> OAK-4908.v4.patch
>
>
> This is a subtask as a result of 
> [discussions|https://issues.apache.org/jira/browse/OAK-4796?focusedCommentId=15550962&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15550962]
>  around design in OAK-4796:
> Based on the ChangeSet provided with OAK-4907 in the CommitContext, the 
> ChangeProcessor should do a best-effort prefiltering of commits before they 
> get added to the (BackgroundObserver's) queue.
> This consists of the following parts:
> * -the support for optionally excluding commits from being added to the queue 
> in the BackgroundObserver- EDIT: factored that out into OAK-4916
> * -the BackgroundObserver signaling downstream Observers that a change should 
> be excluded via a {{NOOP_CHANGE}} CommitInfo- EDIT: factored that out into 
> OAK-4916
> * the ChangeProcessor using OAK-4907's ChangeSet of the CommitContext for 
> best-effort prefiltering - and handling the {{NOOP_CHANGED}} CommitInfo 
> introduced in OAK-4916





[jira] [Commented] (OAK-5082) Test failure: testDerivedMixin(org.apache.jackrabbit.core.observation.MixinTest)

2016-11-08 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15647174#comment-15647174
 ] 

Stefan Egli commented on OAK-5082:
--

bq. Does the ChangeSet also include mixin types?
yes: 
[here|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/observation/ChangeCollectorProvider.java#L204]

> Test failure: 
> testDerivedMixin(org.apache.jackrabbit.core.observation.MixinTest)
> 
>
> Key: OAK-5082
> URL: https://issues.apache.org/jira/browse/OAK-5082
> Project: Jackrabbit Oak
>  Issue Type: Test
>  Components: jcr
>Reporter: Amit Jain
>Assignee: Marcel Reutegger
>
> Failed on travis - 
> https://travis-ci.org/apache/jackrabbit-oak/builds/173972640 as well as 
> locally.





[jira] [Updated] (OAK-4908) Best-effort prefiltering in ChangeProcessor based on ChangeSet

2016-11-08 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-4908:
-
Priority: Major  (was: Blocker)

> Best-effort prefiltering in ChangeProcessor based on ChangeSet
> --
>
> Key: OAK-4908
> URL: https://issues.apache.org/jira/browse/OAK-4908
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core, jcr
>Affects Versions: 1.5.11
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>  Labels: review
> Fix For: 1.5.13
>
> Attachments: OAK-4908.patch, OAK-4908.v2.patch, OAK-4908.v3.patch, 
> OAK-4908.v4.patch
>
>
> This is a subtask as a result of 
> [discussions|https://issues.apache.org/jira/browse/OAK-4796?focusedCommentId=15550962&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15550962]
>  around design in OAK-4796:
> Based on the ChangeSet provided with OAK-4907 in the CommitContext, the 
> ChangeProcessor should do a best-effort prefiltering of commits before they 
> get added to the (BackgroundObserver's) queue.
> This consists of the following parts:
> * -the support for optionally excluding commits from being added to the queue 
> in the BackgroundObserver- EDIT: factored that out into OAK-4916
> * -the BackgroundObserver signaling downstream Observers that a change should 
> be excluded via a {{NOOP_CHANGE}} CommitInfo- EDIT: factored that out into 
> OAK-4916
> * the ChangeProcessor using OAK-4907's ChangeSet of the CommitContext for 
> best-effort prefiltering - and handling the {{NOOP_CHANGED}} CommitInfo 
> introduced in OAK-4916





[jira] [Updated] (OAK-4908) Best-effort prefiltering in ChangeProcessor based on ChangeSet

2016-11-08 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-4908:
-
Fix Version/s: (was: 1.5.13)
   1.6

> Best-effort prefiltering in ChangeProcessor based on ChangeSet
> --
>
> Key: OAK-4908
> URL: https://issues.apache.org/jira/browse/OAK-4908
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core, jcr
>Affects Versions: 1.5.11
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>  Labels: review
> Fix For: 1.6
>
> Attachments: OAK-4908.patch, OAK-4908.v2.patch, OAK-4908.v3.patch, 
> OAK-4908.v4.patch
>
>
> This is a subtask as a result of 
> [discussions|https://issues.apache.org/jira/browse/OAK-4796?focusedCommentId=15550962&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15550962]
>  around design in OAK-4796:
> Based on the ChangeSet provided with OAK-4907 in the CommitContext, the 
> ChangeProcessor should do a best-effort prefiltering of commits before they 
> get added to the (BackgroundObserver's) queue.
> This consists of the following parts:
> * -the support for optionally excluding commits from being added to the queue 
> in the BackgroundObserver- EDIT: factored that out into OAK-4916
> * -the BackgroundObserver signaling downstream Observers that a change should 
> be excluded via a {{NOOP_CHANGE}} CommitInfo- EDIT: factored that out into 
> OAK-4916
> * the ChangeProcessor using OAK-4907's ChangeSet of the CommitContext for 
> best-effort prefiltering - and handling the {{NOOP_CHANGED}} CommitInfo 
> introduced in OAK-4916





[jira] [Commented] (OAK-4908) Best-effort prefiltering in ChangeProcessor based on ChangeSet

2016-11-08 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15647227#comment-15647227
 ] 

Stefan Egli commented on OAK-4908:
--

if integration test runs fine locally (just started now) I'd suggest to still 
include this in today's 1.5.13, [~edivad], wdyt?

> Best-effort prefiltering in ChangeProcessor based on ChangeSet
> --
>
> Key: OAK-4908
> URL: https://issues.apache.org/jira/browse/OAK-4908
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core, jcr
>Affects Versions: 1.5.11
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>  Labels: review
> Fix For: 1.6
>
> Attachments: OAK-4908.patch, OAK-4908.v2.patch, OAK-4908.v3.patch, 
> OAK-4908.v4.patch, OAK-4908.v5.patch
>
>
> This is a subtask as a result of 
> [discussions|https://issues.apache.org/jira/browse/OAK-4796?focusedCommentId=15550962&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15550962]
>  around design in OAK-4796:
> Based on the ChangeSet provided with OAK-4907 in the CommitContext, the 
> ChangeProcessor should do a best-effort prefiltering of commits before they 
> get added to the (BackgroundObserver's) queue.
> This consists of the following parts:
> * -the support for optionally excluding commits from being added to the queue 
> in the BackgroundObserver- EDIT: factored that out into OAK-4916
> * -the BackgroundObserver signaling downstream Observers that a change should 
> be excluded via a {{NOOP_CHANGE}} CommitInfo- EDIT: factored that out into 
> OAK-4916
> * the ChangeProcessor using OAK-4907's ChangeSet of the CommitContext for 
> best-effort prefiltering - and handling the {{NOOP_CHANGED}} CommitInfo 
> introduced in OAK-4916





[jira] [Resolved] (OAK-4796) filter events before adding to ChangeProcessor's queue

2016-11-07 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli resolved OAK-4796.
--
   Resolution: Fixed
Fix Version/s: (was: 1.6)
   1.5.13

sub-tasks have been finished, closing this one

> filter events before adding to ChangeProcessor's queue
> --
>
> Key: OAK-4796
> URL: https://issues.apache.org/jira/browse/OAK-4796
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Affects Versions: 1.5.9
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>  Labels: observation
> Fix For: 1.5.13
>
> Attachments: OAK-4796.changeSet.patch, OAK-4796.patch
>
>
> Currently the 
> [ChangeProcessor.contentChanged|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L335]
>  is in charge of doing the event diffing and filtering and does so in a 
> pooled Thread, ie asynchronously, at a later stage independent from the 
> commit. This has the advantage that the commit is fast, but has the following 
> potentially negative effects:
> # events (in the form of ContentChange Objects) occupy a slot of the queue 
> even if the listener is not interested in it - any commit lands on any 
> listener's queue. This reduces the capacity of the queue for 'actual' events 
> to be delivered. It therefore increases the risk that the queue fills - and 
> when full has various consequences such as losing the CommitInfo etc.
> # each event==ContentChange later on must be evaluated, and for that a diff 
> must be calculated. Depending on runtime behavior that diff might be 
> expensive if no longer in the cache (documentMk specifically).
> As an improvement, this diffing+filtering could be done at an earlier stage 
> already, nearer to the commit, and in case the filter would ignore the event, 
> it would not have to be put into the queue at all, thus avoiding occupying a 
> slot and later potentially slower diffing.
> The suggestion is to implement this via the following algorithm:
> * During the commit, in a {{Validator}} the listener's filters are evaluated 
> - in an as-efficient-as-possible manner (Reason for doing it in a Validator 
> is that this doesn't add overhead as oak already goes through all changes for 
> other Validators). As a result a _list of potentially affected observers_ is 
> added to the {{CommitInfo}} (false positives are fine).
> ** Note that the above adds cost to the commit and must therefore be 
> carefully done and measured
> ** One potential measure could be to only do filtering when listener's queues 
> are larger than a certain threshold (eg 10)
> * The ChangeProcessor in {{contentChanged}} (in the one created in 
> [createObserver|https://github.com/apache/jackrabbit-oak/blob/f4f4e01dd8f708801883260481d37fdcd5868deb/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ChangeProcessor.java#L224])
>  then checks the new commitInfo's _potentially affected observers_ list and 
> if it's not in the list, adds a {{NOOP}} token at the end of the queue. If 
> there's already a NOOP there, the two are collapsed (this way when a filter 
> is not affected it would have a NOOP at the end of the queue). If later on a 
> no-NOOP item is added, the NOOP's {{root}} is used as the {{previousRoot}} 
> for the newly added {{ContentChange}} obj.
> ** To achieve that, the ContentChange obj is extended to not only have the 
> "to" {{root}} pointer, but also the "from" {{previousRoot}} pointer which 
> currently is implicitly maintained.
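The NOOP handling at the end of the queue can be sketched as below. This is an illustration under assumptions, not the ChangeProcessor implementation: roots are plain strings, {{NoopQueueSketch}} and {{Entry}} are hypothetical names, and removing the trailing NOOP once a real change arrives is an assumption the description does not spell out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: class and member names are illustrative, not Oak's.
final class NoopQueueSketch {

    static final class Entry {
        final String previousRoot; // "from" pointer
        final String root;         // "to" pointer
        final boolean noop;

        Entry(String previousRoot, String root, boolean noop) {
            this.previousRoot = previousRoot;
            this.root = root;
            this.noop = noop;
        }
    }

    private final Deque<Entry> queue = new ArrayDeque<>();
    private String lastRoot;

    NoopQueueSketch(String initialRoot) {
        this.lastRoot = initialRoot;
    }

    // Called per commit; potentiallyAffected comes from the commit-time filter.
    void contentChanged(String root, boolean potentiallyAffected) {
        Entry tail = queue.peekLast();
        boolean tailIsNoop = tail != null && tail.noop;
        if (tailIsNoop) {
            queue.pollLast(); // the trailing NOOP is replaced in both cases
        }
        if (!potentiallyAffected) {
            // collapse: keep the start of the NOOP run, move its end forward
            String from = tailIsNoop ? tail.previousRoot : lastRoot;
            queue.addLast(new Entry(from, root, true));
        } else {
            // the NOOP's root becomes the previousRoot of the real change
            String from = tailIsNoop ? tail.root : lastRoot;
            queue.addLast(new Entry(from, root, false));
        }
        lastRoot = root;
    }

    int size() {
        return queue.size();
    }

    Entry last() {
        return queue.peekLast();
    }

    public static void main(String[] args) {
        NoopQueueSketch q = new NoopQueueSketch("r0");
        q.contentChanged("r1", false);
        q.contentChanged("r2", false);
        q.contentChanged("r3", true);
        System.out.println(q.size() + " " + q.last().previousRoot + " -> " + q.last().root);
    }
}
```

However many unaffected commits arrive in a row, they occupy at most one slot, and the next real change is diffed only against the last skipped root.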





[jira] [Resolved] (OAK-4908) Best-effort prefiltering in ChangeProcessor based on ChangeSet

2016-11-07 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli resolved OAK-4908.
--
Resolution: Fixed

Committed to trunk in http://svn.apache.org/viewvc?rev=1768558&view=rev

> Best-effort prefiltering in ChangeProcessor based on ChangeSet
> --
>
> Key: OAK-4908
> URL: https://issues.apache.org/jira/browse/OAK-4908
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core, jcr
>Affects Versions: 1.5.11
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Blocker
>  Labels: review
> Fix For: 1.5.13
>
> Attachments: OAK-4908.patch, OAK-4908.v2.patch, OAK-4908.v3.patch
>
>
> This is a subtask as a result of 
> [discussions|https://issues.apache.org/jira/browse/OAK-4796?focusedCommentId=15550962&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15550962]
>  around design in OAK-4796:
> Based on the ChangeSet provided with OAK-4907 in the CommitContext, the 
> ChangeProcessor should do a best-effort prefiltering of commits before they 
> get added to the (BackgroundObserver's) queue.
> This consists of the following parts:
> * -the support for optionally excluding commits from being added to the queue 
> in the BackgroundObserver- EDIT: factored that out into OAK-4916
> * -the BackgroundObserver signaling downstream Observers that a change should 
> be excluded via a {{NOOP_CHANGE}} CommitInfo- EDIT: factored that out into 
> OAK-4916
> * the ChangeProcessor using OAK-4907's ChangeSet of the CommitContext for 
> best-effort prefiltering - and handling the {{NOOP_CHANGED}} CommitInfo 
> introduced in OAK-4916





[jira] [Commented] (OAK-4908) Best-effort prefiltering in ChangeProcessor based on ChangeSet

2016-11-08 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15647389#comment-15647389
 ] 

Stefan Egli commented on OAK-4908:
--

integration tests run fine and I hope it's fine to still commit this, which 
I've done now in http://svn.apache.org/viewvc?rev=1768673&view=rev

> Best-effort prefiltering in ChangeProcessor based on ChangeSet
> --
>
> Key: OAK-4908
> URL: https://issues.apache.org/jira/browse/OAK-4908
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core, jcr
>Affects Versions: 1.5.11
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>  Labels: review
> Fix For: 1.6
>
> Attachments: OAK-4908.patch, OAK-4908.v2.patch, OAK-4908.v3.patch, 
> OAK-4908.v4.patch, OAK-4908.v5.patch
>
>
> This is a subtask as a result of 
> [discussions|https://issues.apache.org/jira/browse/OAK-4796?focusedCommentId=15550962&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15550962]
>  around design in OAK-4796:
> Based on the ChangeSet provided with OAK-4907 in the CommitContext, the 
> ChangeProcessor should do a best-effort prefiltering of commits before they 
> get added to the (BackgroundObserver's) queue.
> This consists of the following parts:
> * -the support for optionally excluding commits from being added to the queue 
> in the BackgroundObserver- EDIT: factored that out into OAK-4916
> * -the BackgroundObserver signaling downstream Observers that a change should 
> be excluded via a {{NOOP_CHANGE}} CommitInfo- EDIT: factored that out into 
> OAK-4916
> * the ChangeProcessor using OAK-4907's ChangeSet of the CommitContext for 
> best-effort prefiltering - and handling the {{NOOP_CHANGED}} CommitInfo 
> introduced in OAK-4916





[jira] [Resolved] (OAK-4908) Best-effort prefiltering in ChangeProcessor based on ChangeSet

2016-11-08 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli resolved OAK-4908.
--
Resolution: Fixed

> Best-effort prefiltering in ChangeProcessor based on ChangeSet
> --
>
> Key: OAK-4908
> URL: https://issues.apache.org/jira/browse/OAK-4908
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core, jcr
>Affects Versions: 1.5.11
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>  Labels: review
> Fix For: 1.5.13
>
> Attachments: OAK-4908.patch, OAK-4908.v2.patch, OAK-4908.v3.patch, 
> OAK-4908.v4.patch, OAK-4908.v5.patch
>
>
> This is a subtask as a result of 
> [discussions|https://issues.apache.org/jira/browse/OAK-4796?focusedCommentId=15550962&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15550962]
>  around design in OAK-4796:
> Based on the ChangeSet provided with OAK-4907 in the CommitContext, the 
> ChangeProcessor should do a best-effort prefiltering of commits before they 
> get added to the (BackgroundObserver's) queue.
> This consists of the following parts:
> * -the support for optionally excluding commits from being added to the queue 
> in the BackgroundObserver- EDIT: factored that out into OAK-4916
> * -the BackgroundObserver signaling downstream Observers that a change should 
> be excluded via a {{NOOP_CHANGE}} CommitInfo- EDIT: factored that out into 
> OAK-4916
> * the ChangeProcessor using OAK-4907's ChangeSet of the CommitContext for 
> best-effort prefiltering - and handling the {{NOOP_CHANGED}} CommitInfo 
> introduced in OAK-4916





[jira] [Updated] (OAK-4908) Best-effort prefiltering in ChangeProcessor based on ChangeSet

2016-11-08 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-4908:
-
Fix Version/s: (was: 1.6)
   1.5.13

> Best-effort prefiltering in ChangeProcessor based on ChangeSet
> --
>
> Key: OAK-4908
> URL: https://issues.apache.org/jira/browse/OAK-4908
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core, jcr
>Affects Versions: 1.5.11
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>  Labels: review
> Fix For: 1.5.13
>
> Attachments: OAK-4908.patch, OAK-4908.v2.patch, OAK-4908.v3.patch, 
> OAK-4908.v4.patch, OAK-4908.v5.patch
>
>
> This is a subtask as a result of 
> [discussions|https://issues.apache.org/jira/browse/OAK-4796?focusedCommentId=15550962&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15550962]
>  around design in OAK-4796:
> Based on the ChangeSet provided with OAK-4907 in the CommitContext, the 
> ChangeProcessor should do a best-effort prefiltering of commits before they 
> get added to the (BackgroundObserver's) queue.
> This consists of the following parts:
> * -the support for optionally excluding commits from being added to the queue 
> in the BackgroundObserver- EDIT: factored that out into OAK-4916
> * -the BackgroundObserver signaling downstream Observers that a change should 
> be excluded via a {{NOOP_CHANGE}} CommitInfo- EDIT: factored that out into 
> OAK-4916
> * the ChangeProcessor using OAK-4907's ChangeSet of the CommitContext for 
> best-effort prefiltering - and handling the {{NOOP_CHANGED}} CommitInfo 
> introduced in OAK-4916




