[jira] [Commented] (OAK-4845) Regression: DefaultSyncContext does not sync membership to a local group

2016-09-26 Thread Alexander Klimetschek (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15523580#comment-15523580
 ] 

Alexander Klimetschek commented on OAK-4845:


There was a bug in the test code: the verification at the end of 
{{testMembershipForExistingLocalGroup}} that the user is a member of the group 
assumed a fixed iterator order.
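For illustration, an order-independent membership check could look like the
following minimal sketch (hypothetical names; the real test iterates the
members returned by the Jackrabbit user manager):

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

public class MembershipAssertion {
    // Hypothetical helper: drain an iterator of member ids into a set so the
    // assertion no longer depends on the order in which members are returned.
    static Set<String> toSet(Iterator<String> members) {
        Set<String> ids = new HashSet<>();
        members.forEachRemaining(ids::add);
        return ids;
    }

    public static void main(String[] args) {
        // passes for any iteration order of the declared members
        Iterator<String> members = List.of("otherGroup", "testUser").iterator();
        Set<String> ids = toSet(members);
        if (!ids.contains("testUser")) {
            throw new AssertionError("expected testUser to be a declared member");
        }
    }
}
```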

Updated [^OAK-4845.patch] (and [new commit on 
github|https://github.com/alexkli/jackrabbit-oak/commit/398ae15d5d23c5416087260ac8a00b1fefdbbfb0]).

> Regression: DefaultSyncContext does not sync membership to a local group
> 
>
> Key: OAK-4845
> URL: https://issues.apache.org/jira/browse/OAK-4845
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: auth-external
>Affects Versions: 1.5.3, 1.4.7
>Reporter: Alexander Klimetschek
>Priority: Critical
> Attachments: OAK-4845.patch, missing-commits-in-tests.patch
>
>
> OAK-4397 introduced a regression: it no longer allows syncing to a locally 
> existing group (one that does not belong to another IDP).
> Updating to 1.4.7 now gives "Existing authorizable 'X' is not a group from 
> this IDP 'foo'.", and the group is not synced and most importantly, 
> memberships for external users are not updated anymore. Looking at the group 
> in JCR, it does not have a {{rep:externalId}} at all, only a 
> {{rep:lastSynced}}.
> Code-wise, this is because the {{rep:externalId}} is only ever set in 
> {{createGroup()}}, i.e. when the external sync creates that group initially. 
> If the group is already present locally (but not owned by another IDP!), then 
> it won't be used, due to the new {{isSameIDP()}} check, which will also fail if 
> there is no {{rep:externalId}} property set.
> The use case is that we already define the group locally as part of our 
> application (including ACs etc.) and we want certain external users to be 
> added to it based on some of their settings on the external IDP side. FWIW, 
> we don't care about syncing the group's own properties, just the memberships.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)



[jira] [Comment Edited] (OAK-4845) Regression: DefaultSyncContext does not sync membership to a local group

2016-09-26 Thread Alexander Klimetschek (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514871#comment-15514871
 ] 

Alexander Klimetschek edited comment on OAK-4845 at 9/26/16 4:46 PM:
-

Attached a patch that fixes the issue by allowing local groups without a 
{{rep:externalId}} (it also includes the test above). It still detects and 
rejects a group added by another identity provider.
* [^OAK-4845.patch] (alternatively --[on 
github|https://github.com/alexkli/jackrabbit-oak/commit/42c855cb583a85ef19ab839bf10d398967923d8c]
 (outdated)-- [on 
github|https://github.com/alexkli/jackrabbit-oak/commit/398ae15d5d23c5416087260ac8a00b1fefdbbfb0])

I could imagine scenarios with two IDPs where folks might want users from one 
IDP to be added to groups from the other IDP. In that case, reverting OAK-4397 
would probably be the answer. I don't have a clear opinion on that, though.

I also noticed that some of the unit tests, especially 
{{testMembershipForExistingForeignGroup}}, on which I based the new test, do 
not make sure {{root.commit()}} is called after the sync; as a result, group 
membership changes are not yet visible to the {{user.declaredMemberOf()}} 
check at the end.
* fixed that in [^missing-commits-in-tests.patch] (alternatively [on 
github|https://github.com/alexkli/jackrabbit-oak/commit/38689f79843b11a1f81b56f74d6e9810d93f34c9])





[jira] [Updated] (OAK-4845) Regression: DefaultSyncContext does not sync membership to a local group

2016-09-26 Thread Alexander Klimetschek (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Klimetschek updated OAK-4845:
---
Attachment: (was: OAK-4845.patch)



[jira] [Updated] (OAK-4845) Regression: DefaultSyncContext does not sync membership to a local group

2016-09-26 Thread Alexander Klimetschek (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Klimetschek updated OAK-4845:
---
Attachment: OAK-4845.patch



[jira] [Commented] (OAK-4845) Regression: DefaultSyncContext does not sync membership to a local group

2016-09-26 Thread Alexander Klimetschek (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15524062#comment-15524062
 ] 

Alexander Klimetschek commented on OAK-4845:


I noticed that the removal of a lost membership is not applied in such a 
"local group" case either. This was always true, even before OAK-4397.

First, the [isSameIDP() check inside 
syncMemberships()|https://github.com/apache/jackrabbit-oak/blob/8c928feafc77c40d75142f69931b6ff3b5837f9d/oak-auth-external/src/main/java/org/apache/jackrabbit/oak/spi/security/authentication/external/basic/DefaultSyncContext.java#L480-L482]
 fills the {{declaredExternalGroups}} map, and the later [removal of lost 
memberships|https://github.com/apache/jackrabbit-oak/blob/8c928feafc77c40d75142f69931b6ff3b5837f9d/oak-auth-external/src/main/java/org/apache/jackrabbit/oak/spi/security/authentication/external/basic/DefaultSyncContext.java#L535-L536]
 is only done against this map (code links are Oak 1.4.2, but current trunk is 
identical). Hence any group that existed locally beforehand never gets a 
{{rep:externalId}} and is therefore ignored later during membership removal.
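To illustrate the flow described above (hypothetical names, not the actual 
DefaultSyncContext code): only groups that pass the same-IDP check enter the 
map, so a pre-existing local group without an external id is invisible to the 
later lost-membership removal.

```java
import java.util.HashMap;
import java.util.Map;

public class LostMembershipSketch {
    // Hypothetical model of syncMemberships(): only groups whose owning IDP
    // matches the syncing IDP are put into the declaredExternalGroups map.
    static Map<String, String> declaredExternalGroups(Map<String, String> groupIdp,
                                                      String syncingIdp) {
        Map<String, String> declared = new HashMap<>();
        for (Map.Entry<String, String> e : groupIdp.entrySet()) {
            if (syncingIdp.equals(e.getValue())) {   // isSameIDP() analogue
                declared.put(e.getKey(), e.getValue());
            }
        }
        return declared;
    }

    // The later lost-membership removal only considers groups in the map,
    // so a local-only group (idp == null) is never touched.
    static boolean removableOnLostMembership(Map<String, String> declared, String group) {
        return declared.containsKey(group);
    }

    public static void main(String[] args) {
        Map<String, String> groupIdp = new HashMap<>();
        groupIdp.put("external-group", "foo"); // synced from IDP "foo"
        groupIdp.put("local-group", null);     // pre-existing local group
        Map<String, String> declared = declaredExternalGroups(groupIdp, "foo");
        System.out.println(declared.keySet()); // only the external group
    }
}
```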

I see the need to make sure a membership is not removed accidentally if it 
"additionally" comes in from another IDP. OAK-4397 definitely made the entire 
behavior more consistent, in that you can't even add such memberships in the 
first place.

However, from all I can see, in our use case we have no option but to have 
this group exist locally beforehand (and there are definitely production 
systems in this state already). IMO it should be OK for a "local only" group 
with no {{rep:externalId}} to be upgraded to an IDP group by adding the 
{{rep:externalId}} the first time it is synced; from then on, all the checks 
stay in place, preventing another IDP from messing with it. This would be a 
slightly different patch.
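A rough sketch of the proposed decision, under the assumption that a group's 
owning IDP is derived from its {{rep:externalId}} (names are illustrative, not 
the actual Oak API):

```java
public class LocalGroupSync {
    enum Decision { SYNC, ADOPT_AND_SYNC, REJECT }

    // Hypothetical decision for an existing group found during sync:
    // repExternalIdpName is the IDP name parsed from rep:externalId (null if
    // the property is missing); syncingIdpName is the IDP performing the sync.
    static Decision decide(String repExternalIdpName, String syncingIdpName) {
        if (repExternalIdpName == null) {
            // local-only group: upgrade it by stamping rep:externalId on first sync
            return Decision.ADOPT_AND_SYNC;
        }
        return repExternalIdpName.equals(syncingIdpName)
                ? Decision.SYNC      // same IDP: normal membership sync
                : Decision.REJECT;   // owned by another IDP: leave untouched
    }

    public static void main(String[] args) {
        System.out.println(decide(null, "foo"));   // local group gets adopted
        System.out.println(decide("bar", "foo"));  // foreign group stays rejected
    }
}
```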



[jira] [Comment Edited] (OAK-4845) Regression: DefaultSyncContext does not sync membership to a local group

2016-09-26 Thread Alexander Klimetschek (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15524062#comment-15524062
 ] 

Alexander Klimetschek edited comment on OAK-4845 at 9/26/16 8:18 PM:
-

I noticed that the removal of a lost membership is not applied in such a 
"local group" case either. This was always true, even before OAK-4397.

First, the [isSameIDP() check inside 
syncMemberships()|https://github.com/apache/jackrabbit-oak/blob/8c928feafc77c40d75142f69931b6ff3b5837f9d/oak-auth-external/src/main/java/org/apache/jackrabbit/oak/spi/security/authentication/external/basic/DefaultSyncContext.java#L480-L482]
 fills the {{declaredExternalGroups}} map, and the later [removal of lost 
memberships|https://github.com/apache/jackrabbit-oak/blob/8c928feafc77c40d75142f69931b6ff3b5837f9d/oak-auth-external/src/main/java/org/apache/jackrabbit/oak/spi/security/authentication/external/basic/DefaultSyncContext.java#L535-L536]
 is only done against this map (code links are Oak 1.4.2, but current trunk is 
identical). Hence any group that existed locally beforehand never gets a 
{{rep:externalId}} and is therefore ignored later during membership removal.

I see the need to make sure a membership is not removed accidentally if it 
"additionally" comes in from another IDP. OAK-4397 definitely made the entire 
behavior more consistent, in that you can't even add such memberships in the 
first place.

However, from all I can see, in our use case we have no option but to have 
this group exist locally beforehand (and there are definitely production 
systems in this state already). IMO it should be OK for a "local only" group 
with no {{rep:externalId}} to be upgraded to an IDP group by adding the 
{{rep:externalId}} the first time it is synced; from then on, all the checks 
stay in place, preventing another IDP from messing with it. This would be a 
slightly different patch.




[jira] [Assigned] (OAK-4826) Auto removal of orphaned checkpoints

2016-09-26 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger reassigned OAK-4826:
-

Assignee: Marcel Reutegger

> Auto removal of orphaned checkpoints
> 
>
> Key: OAK-4826
> URL: https://issues.apache.org/jira/browse/OAK-4826
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core
>Reporter: Chetan Mehrotra
>Assignee: Marcel Reutegger
> Fix For: 1.6
>
>
> Currently, if orphaned checkpoints are present in a running system, they 
> prevent revision GC (compaction for the segment store) from being effective.
> So far the practice has been to use the {{oak-run checkpoints rm-unreferenced}} 
> command to clean them up manually. This was kept manual because it was not 
> possible to determine whether a given checkpoint is in use or not. 
> rm-unreferenced works on the assumption that checkpoints are only created by 
> AsyncIndexUpdate and hence can check whether a checkpoint is in use by 
> cross-checking with the {{:async}} state. Doing this in auto mode is risky, 
> as the {{checkpoint}} API can be used by any module.
> With OAK-2314 we also record some metadata like {{creator}} and {{name}}, 
> which can be used for auto cleanup. For example, in one running system the 
> following checkpoints are listed:
> {noformat}
> Mon Sep 19 18:02:09 EDT 2016  Sun Jun 16 18:02:09 EDT 2019
> r15744787d0a-1-1
>  
> creator=AsyncIndexUpdate
> name=fulltext-async
> thread=sling-default-4070-Registered Service.653
>  
> Mon Sep 19 18:02:09 EDT 2016  Sun Jun 16 18:02:09 EDT 2019
> r15744787d0a-0-1
>  
> creator=AsyncIndexUpdate
> name=async
> thread=sling-default-4072-Registered Service.656
>  
> Fri Aug 19 18:57:33 EDT 2016  Thu May 16 18:57:33 EDT 2019
> r156a50612e1-1-1
>  
> creator=AsyncIndexUpdate
> name=async
> thread=sling-default-10-Registered Service.654
>  
> Wed Aug 10 12:13:20 EDT 2016  Tue May 07 12:25:52 EDT 2019
> r156753ac38d-0-1
>  
> creator=AsyncIndexUpdate
> name=async
> thread=sling-default-6041-Registered Service.1966
> {noformat}
> As can be seen, the last two checkpoints are orphaned and would prevent 
> revision GC. For auto mode we can use the following heuristic:
> # List all current checkpoints
> # Keep only the latest checkpoint for a given {{creator}} and {{name}} 
> combo; older entries for the same pair (by creation time) can be considered 
> orphaned and deleted
> This logic can be implemented in 
> {{org.apache.jackrabbit.oak.checkpoint.Checkpoints}} and invoked by the 
> revision GC logic (both in DocumentNodeStore and SegmentNodeStore) to 
> determine the base revision to keep.
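The heuristic above could be sketched as follows (illustrative names, not the 
oak-run or Checkpoints API; timestamps are simplified to longs):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CheckpointCleanup {
    static final class Checkpoint {
        final String id, creator, name;
        final long created;
        Checkpoint(String id, String creator, String name, long created) {
            this.id = id; this.creator = creator; this.name = name; this.created = created;
        }
    }

    // Keep only the newest checkpoint per (creator, name) pair; older entries
    // for the same pair are treated as orphaned.
    static List<String> orphans(List<Checkpoint> all) {
        Map<String, Checkpoint> newest = new HashMap<>();
        for (Checkpoint c : all) {
            String key = c.creator + "|" + c.name;
            Checkpoint cur = newest.get(key);
            if (cur == null || c.created > cur.created) {
                newest.put(key, c);
            }
        }
        List<String> orphaned = new ArrayList<>();
        for (Checkpoint c : all) {
            if (newest.get(c.creator + "|" + c.name) != c) {
                orphaned.add(c.id);
            }
        }
        return orphaned;
    }

    public static void main(String[] args) {
        // the four checkpoints from the example listing above
        List<Checkpoint> cps = List.of(
            new Checkpoint("r15744787d0a-1-1", "AsyncIndexUpdate", "fulltext-async", 4),
            new Checkpoint("r15744787d0a-0-1", "AsyncIndexUpdate", "async", 4),
            new Checkpoint("r156a50612e1-1-1", "AsyncIndexUpdate", "async", 3),
            new Checkpoint("r156753ac38d-0-1", "AsyncIndexUpdate", "async", 2));
        System.out.println(orphans(cps)); // the two older "async" checkpoints
    }
}
```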





[jira] [Resolved] (OAK-4846) Avoid excessive disk space usage of mongodb, Jackrabbit OAK mongoMK

2016-09-26 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger resolved OAK-4846.
---
Resolution: Invalid

Please use the mailing list for questions. See the participate section here: 
http://jackrabbit.apache.org/oak/

> Avoid excessive disk space usage of mongodb, Jackrabbit OAK mongoMK
> ---
>
> Key: OAK-4846
> URL: https://issues.apache.org/jira/browse/OAK-4846
> Project: Jackrabbit Oak
>  Issue Type: Bug
>Reporter: jhonny Villarroel
>
> Hi Guys,
> Is there any configuration to avoid excessive disk space usage with MongoMK? 
> Currently my MongoDB data folder uses 1 TB, and I don't have a significant 
> quantity of documents uploaded :)
> Thanks so much in advance.





[jira] [Created] (OAK-4850) List checkpoints

2016-09-26 Thread Marcel Reutegger (JIRA)
Marcel Reutegger created OAK-4850:
-

 Summary: List checkpoints
 Key: OAK-4850
 URL: https://issues.apache.org/jira/browse/OAK-4850
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: core, documentmk, segmentmk
Reporter: Marcel Reutegger
Assignee: Marcel Reutegger
Priority: Minor
 Fix For: 1.6


Introduce a new method on {{NodeStore}} that lists the currently valid 
checkpoints:

{code}
/**
 * Returns all currently valid checkpoints.
 *
 * @return valid checkpoints.
 */
@Nonnull
Iterable<String> checkpoints();
{code}

The NodeStore interface already has methods to create and release a checkpoint, 
and to retrieve the root state for a checkpoint, but it is currently not 
possible to _list_ checkpoints. Using the checkpoint facility as designed right 
now can lead to a situation where a checkpoint is orphaned: some code created a 
checkpoint but was unable to store the reference because, for example, the 
system crashed. Orphaned checkpoints can affect garbage collection because they 
prevent it from cleaning up old data. Right now, this requires users to run 
tools like oak-run to get rid of those checkpoints.

As suggested in OAK-4826, client code should be able to automatically clean up 
unused checkpoints. This requires a method to list existing checkpoints.
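A minimal sketch of how client code might use the proposed method, assuming 
checkpoints are identified by String ids as in the existing NodeStore 
checkpoint methods (the interface name here is a stand-in, not the real API):

```java
import java.util.List;

// Stand-in for the proposed NodeStore addition.
interface NodeStoreCheckpoints {
    Iterable<String> checkpoints();
}

public class ListCheckpointsDemo {
    public static void main(String[] args) {
        // stub store with two live checkpoints (illustrative ids)
        NodeStoreCheckpoints store = () -> List.of("cp-1", "cp-2");
        // client code can now enumerate valid checkpoints, e.g. to detect orphans
        for (String id : store.checkpoints()) {
            System.out.println(id);
        }
    }
}
```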





[jira] [Updated] (OAK-4850) List checkpoints

2016-09-26 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-4850:
--
Attachment: OAK-4850.patch

Proposed changes.



[jira] [Updated] (OAK-4835) Provide generic option to interrupt online revision cleanup

2016-09-26 Thread Alex Parvulescu (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Parvulescu updated OAK-4835:
-
Component/s: documentmk

> Provide generic option to interrupt online revision cleanup
> ---
>
> Key: OAK-4835
> URL: https://issues.apache.org/jira/browse/OAK-4835
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk, segmentmk
>Reporter: Alex Parvulescu
>Assignee: Alex Parvulescu
>  Labels: compaction, gc
>
> JMX binding for stopping a running compaction process





[jira] [Commented] (OAK-4826) Auto removal of orphaned checkpoints

2016-09-26 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15522364#comment-15522364
 ] 

Marcel Reutegger commented on OAK-4826:
---

Depends upon OAK-4850.



[jira] [Updated] (OAK-4835) Provide generic option to interrupt online revision cleanup

2016-09-26 Thread Alex Parvulescu (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Parvulescu updated OAK-4835:
-
Summary: Provide generic option to interrupt online revision cleanup  (was: 
Provide option to interrupt online revision cleanup on segmentmk)



[jira] [Created] (OAK-4851) Update httpclient to 4.3.6 in Oak 1.4- branches

2016-09-26 Thread Tommaso Teofili (JIRA)
Tommaso Teofili created OAK-4851:


 Summary: Update httpclient to 4.3.6 in Oak 1.4- branches
 Key: OAK-4851
 URL: https://issues.apache.org/jira/browse/OAK-4851
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: solr
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
 Fix For: 1.5.10


It'd be good to update to latest version of httpclient (4.3.6) to incorporate 
the latest fixes.





[jira] [Commented] (OAK-4851) Update httpclient to 4.3.6 in Oak 1.4- branches

2016-09-26 Thread Tommaso Teofili (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15522669#comment-15522669
 ] 

Tommaso Teofili commented on OAK-4851:
--

Fixed by merging the commit from OAK-4764 into branches 1.0, 1.2 and 1.4.

> Update httpclient to 4.3.6 in Oak 1.4- branches
> ---
>
> Key: OAK-4851
> URL: https://issues.apache.org/jira/browse/OAK-4851
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: solr
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 1.0.34, 1.4.8, 1.2.20
>
>
> It'd be good to update to latest version of httpclient (4.3.6) to incorporate 
> the latest fixes.





[jira] [Resolved] (OAK-4851) Update httpclient to 4.3.6 in Oak 1.4- branches

2016-09-26 Thread Tommaso Teofili (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili resolved OAK-4851.
--
Resolution: Fixed



[jira] [Commented] (OAK-4844) Analyse effects of simplified record ids

2016-09-26 Thread Alex Parvulescu (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15522577#comment-15522577
 ] 

Alex Parvulescu commented on OAK-4844:
--

Test results using the patch: size is {{13GB}}
{noformat}
Total size:
13 GB in  56521 data segments
768 KB in  3 bulk segments
3 GB in maps (46450859 leaf and branch records)
1 GB in lists (55469092 list and bucket records)
3 GB in values (value and block records of 70765678 properties, 
3429/378684/0/1214419 small/medium/long/external blobs, 46258452/1862224/159 
small/medium/long strings)
161 MB in templates (16772712 template records)
2 GB in nodes (251591739 node records)
links to non existing segments: []
{noformat}
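For context on the footprint numbers under discussion: a record id pairs a segment identifier with an offset, and the gap between the compact and the simplified serialisation can be sketched roughly as below. The layouts (a 1-byte segment-reference index in the compact form, an inline 16-byte UUID in the full form) are illustrative assumptions, not the actual Oak wire format.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hedged sketch: why a "simplified" record id can cost ~18 bytes on disk
// while a compact form costs 3 bytes. Layouts are assumptions made for
// illustration only.
public class RecordIdFootprint {

    // Full form: the segment UUID inline (16 bytes) plus a 2-byte offset.
    static byte[] writeFull(UUID segmentId, short offset) {
        ByteBuffer buf = ByteBuffer.allocate(18);
        buf.putLong(segmentId.getMostSignificantBits());
        buf.putLong(segmentId.getLeastSignificantBits());
        buf.putShort(offset);
        return buf.array();
    }

    // Compact form: a 1-byte index into the segment's reference table plus
    // a 2-byte offset; the referenced UUID is stored once per segment.
    static byte[] writeCompact(byte segmentRefIndex, short offset) {
        ByteBuffer buf = ByteBuffer.allocate(3);
        buf.put(segmentRefIndex);
        buf.putShort(offset);
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] full = writeFull(UUID.randomUUID(), (short) 512);
        byte[] compact = writeCompact((byte) 7, (short) 512);
        System.out.println(full.length + " vs " + compact.length); // 18 vs 3
    }
}
```

Multiplied by the hundreds of millions of records reported above, a 15-byte difference per record id is what makes the on-disk effect worth analysing.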

> Analyse effects of simplified record ids
> 
>
> Key: OAK-4844
> URL: https://issues.apache.org/jira/browse/OAK-4844
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>  Labels: performance
> Fix For: Segment Tar 0.0.14
>
> Attachments: OAK-4844.patch
>
>
> OAK-4631 introduced a simplified serialisation for record ids. This causes 
> their footprint on disk to increase from 3 bytes to 18 bytes. OAK-4631 has 
> some initial analysis on the effect this is having on repositories as a 
> whole. 
> I'm opening this issue as a dedicated task to further look into mitigation 
> strategies (if necessary). 





[jira] [Commented] (OAK-4844) Analyse effects of simplified record ids

2016-09-26 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15522616#comment-15522616
 ] 

Michael Dürig commented on OAK-4844:


It would be good to also have the performance numbers so we can compare to 
https://issues.apache.org/jira/browse/OAK-4631?focusedCommentId=15417276=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15417276



[jira] [Updated] (OAK-4851) Update httpclient to 4.3.6 in Oak 1.4- branches

2016-09-26 Thread Tommaso Teofili (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated OAK-4851:
-
Fix Version/s: (was: 1.5.10)
   1.2.20
   1.4.8
   1.0.34



[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-26 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15522741#comment-15522741
 ] 

Marcel Reutegger commented on OAK-4581:
---

In my view it would be best to review the usage of Observers outside of Oak 
again and use JCR/Jackrabbit EventListeners whenever possible. The Oak 
observation package is not a public API and the export version is set to zero. 
I think that was a good decision, because we repeatedly run into situations 
where Oak Observer (or BackgroundObserver) usage outside of Oak lags behind 
current development. Examples include missing MBean support, default values 
for the queue length, or new callbacks on BackgroundObserver. IMO we should 
stop promoting the Oak Observer when we actually consider it internal.

SLING-3279 was created to leverage filters not available in JCR. However, the 
most useful filter with multiple paths is already available since Jackrabbit 
2.7.5 (JCR-3745) and gathering names of added/removed/changed properties is 
also possible with plain JCR Events.

[~mduerig] & [~cziegeler], do you remember the main reasons why Oak Observers 
were considered superior over JCR EventListeners?

If there are features missing, I would rather add them to the Jackrabbit API 
and make it available to other users as well and change the JCR Resource 
implementation to only rely on the Jackrabbit API.

Once we have a clear separation of Oak internal Observer usage and client code 
using JCR EventListeners, we can more easily optimize those two parts 
individually. The former is the clear responsibility of the repository and 
would produce events as efficiently and fast as possible, while the latter gets 
those events at its own pace. The current situation is IMO problematic because 
the two aspects are mixed.
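The decoupling described above - a synchronous observer on the repository side feeding listeners that consume at their own pace - can be sketched roughly as follows. This is a simplified, hypothetical type (the real BackgroundObserver in oak-core differs in detail, e.g. it collapses onto the latest root state rather than dropping the oldest entry):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hedged sketch of the producer/consumer split behind BackgroundObserver:
// the repository pushes each new root state synchronously, while the
// listener drains a bounded queue at its own pace. When the queue is full,
// the oldest pending entry is dropped here, so the listener diffs across a
// larger window instead of seeing every intermediate state.
public class BackgroundObserverSketch {
    private final BlockingQueue<String> queue; // stand-in for NodeState roots

    BackgroundObserverSketch(int capacity) {
        queue = new ArrayBlockingQueue<>(capacity);
    }

    // Called synchronously by the "repository" for every commit.
    void contentChanged(String rootState) {
        while (!queue.offer(rootState)) {
            queue.poll(); // queue full: collapse by dropping the oldest entry
        }
    }

    // Called by the listener thread whenever it is ready for more work.
    String poll() {
        return queue.poll();
    }

    public static void main(String[] args) {
        BackgroundObserverSketch bo = new BackgroundObserverSketch(2);
        bo.contentChanged("r1");
        bo.contentChanged("r2");
        bo.contentChanged("r3"); // overflows: r1 is dropped
        System.out.println(bo.poll()); // r2
        System.out.println(bo.poll()); // r3
    }
}
```

The bounded queue is exactly where the "hitting the observation queue limit" problem of OAK-2683 lives: once entries are dropped or collapsed, a plain in-memory queue cannot give listeners every change.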

> Persistent local journal for more reliable event generation
> ---
>
> Key: OAK-4581
> URL: https://issues.apache.org/jira/browse/OAK-4581
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: core
>Reporter: Chetan Mehrotra
>Assignee: Stefan Egli
>  Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4581.v0.patch
>
>
> As discussed in OAK-2683 "hitting the observation queue limit" has multiple 
> drawbacks. Quite a bit of work is done to make diff generation faster. 
> However, there are still chances of the event queue getting filled up. 
> This issue is meant to implement a persistent event journal. The idea here being:
> # NodeStore would push the diff into a persistent store via a synchronous 
> observer
> # Observers which are meant to handle such events in an async way (by virtue of 
> being wrapped in BackgroundObserver) would instead pull the events from this 
> persisted journal
> h3. A - What is persisted
> h4. 1 - Serialized Root States and CommitInfo
> In this approach we just persist the root states in serialized form. 
> * DocumentNodeStore - This means storing the root revision vector
> * SegmentNodeStore - {color:red}Q1 - What does the serialized form of a 
> SegmentNodeStore root state look like?{color} - Possibly the RecordId of the 
> "root" state
> Note that with OAK-4528 DocumentNodeStore can rely on the persisted remote 
> journal to determine the affected paths, which reduces the need to persist 
> the complete diff locally.
> Event generation logic would then "deserialize" the persisted root states and 
> generate the diff as currently done via NodeState comparison.
> h4. 2 - Serialized commit diff and CommitInfo
> In this approach we can save the diff in JSOP form. The diff only contains 
> information about the affected paths, similar to what is currently stored in 
> the DocumentNodeStore journal.
> h4. CommitInfo
> The commit info would also need to be serialized, so it needs to be ensured 
> that whatever is stored there can be serialized or recalculated.
> h3. B - How it is persisted
> h4. 1 - Use a secondary segment NodeStore
> OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. 
> [~mreutegg] suggested that for persisted local journal we can also utilize a 
> SegmentNodeStore instance. Care needs to be taken for compaction. Either via 
> generation approach or relying on online compaction
> h4. 2 - Make use of write ahead log implementations
> [~ianeboston] suggested that we can make use of some write ahead log 
> implementation like [1], [2] or [3]
> h3. C - How changes get pulled
> Some points to consider for event generation logic
> # Would need a way to keep pointers to journal entries on a per-listener basis. 
> This would allow each listener to "pull" content changes and generate diffs at 
> its own speed while keeping the in-memory overhead low
> # The journal should survive restarts
> [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html
> [2] 
> https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal
> [3] 
> https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog

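Point C of the proposal above - per-listener pointers into a shared journal - can be sketched as an append-only log with an independent cursor per listener. This is illustrative only (names are made up, and a real implementation would persist both the entries and the cursors, e.g. in a secondary SegmentNodeStore or a write ahead log, so they survive restarts):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hedged sketch of "how changes get pulled": one append-only journal of
// serialized root states, with an independent cursor per listener so each
// one pulls changes at its own pace. Kept in memory here for illustration;
// the proposal is to persist it.
public class JournalSketch {
    private final List<String> entries = new ArrayList<>();
    private final Map<String, Integer> cursors = new HashMap<>();

    void append(String serializedRootState) {
        entries.add(serializedRootState);
    }

    // Returns the next unseen entry for this listener and advances its
    // cursor, or returns null if the listener is caught up.
    String pull(String listenerId) {
        int pos = cursors.getOrDefault(listenerId, 0);
        if (pos >= entries.size()) {
            return null;
        }
        cursors.put(listenerId, pos + 1);
        return entries.get(pos);
    }

    public static void main(String[] args) {
        JournalSketch journal = new JournalSketch();
        journal.append("root@r1");
        journal.append("root@r2");
        System.out.println(journal.pull("slow")); // root@r1
        System.out.println(journal.pull("fast")); // root@r1 - independent cursor
        System.out.println(journal.pull("fast")); // root@r2
    }
}
```

A slow listener only delays its own cursor; it no longer fills a shared bounded queue, which is the core difference from the current BackgroundObserver behaviour.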
[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-26 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15522963#comment-15522963
 ] 

Vikas Saurabh commented on OAK-4581:


Btw, just to note (I'm unable to concretely argue whether we really want to do 
this or not) - no matter how we proceed, the storage would be cluster-id 
specific. So, do we want to cater for the case where some cluster node gets 
obliterated (with pending stored events) - would/should some other node pick 
those up? Adding the disclaimer again that it's a case that would hit us once 
we start to advertise a "stronger delivery guarantee for observation events" - 
not that we must fix it (we can just document that the guarantee is a 
best-effort thing).



[jira] [Comment Edited] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-26 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15522963#comment-15522963
 ] 

Vikas Saurabh edited comment on OAK-4581 at 9/26/16 12:53 PM:
--

Btw, just to note (I'm unable to concretely argue whether we really want to do 
this or not) - no matter how we proceed, the storage would be cluster-id 
specific. So, do we want to cater for the case where some cluster node gets 
obliterated (with pending stored events) - would/should some other node pick 
those up? Adding the disclaimer again that it's a case that would hit us once 
we start to advertise a "stronger guarantee of delivery of observation events" 
- not that we must fix it (we can just document that the guarantee is a 
best-effort thing).

If we go for it, then we'd also have to handle cases such as "cluster node 
crashed and burnt, never to come up again", "cluster node crashed for 2 hours 
but came back up again", "some other cluster node waited for the lease to 
expire and acquired the same cluster id as the one that crashed recently", 
etc. The further we go with solving this, the more we'd start to invade what 
sling jobs do today (a concrete notion of an assignee etc. with re-assignments 
as required).


was (Author: catholicon):
Btw, just to note (I'm unable to concretely argue if we really want to do this 
or not) - no matter how we proceed, the storage would be cluster-id specific. 
So, do we want to cater the case where some cluster node gets obliterated (with 
pending stored events) - would/should some other node pick that up? Adding the 
disclaimer again that it's a case that would hit us once we start to advertise 
"stronger guarantee delivery of observation events" - not that we must fix it 
(we can as document it that the guarantee is a best-effort thing).

If we go for it, then we'd also have to handle cases such as "cluster node 
crashed and burnt to never come up again", "cluster node crashed for 2 hours 
but got back up again", "some other cluster node waited for least and acquired 
the same cluster id as the one which got crashed", etc. The more we go ahead 
with solving this - it seems we'd start to invade into what sling jobs do today 
(have concrete notion of assignee etc with re-assignments as required)


[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-26 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15522996#comment-15522996
 ] 

Stefan Egli commented on OAK-4581:
--

That sounds like a new class of listener that would want 'at-least-once' 
delivery in a cluster. Probably something useful, but I'm not sure it fits 
into the observation umbrella of the JCR API. I think that's (somewhat) 
orthogonal to this ticket and could probably be handled separately?



[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-26 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15522946#comment-15522946
 ] 

Vikas Saurabh commented on OAK-4581:


Yeah, if we get around to a single way of doing observation - then I'd also 
like 1-B-1 + 3-C (same as what Marcel preferred).



[jira] [Comment Edited] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-26 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15522963#comment-15522963
 ] 

Vikas Saurabh edited comment on OAK-4581 at 9/26/16 12:52 PM:
--

Btw, just to note (I'm unable to concretely argue if we really want to do this 
or not) - no matter how we proceed, the storage would be cluster-id specific. 
So, do we want to cater the case where some cluster node gets obliterated (with 
pending stored events) - would/should some other node pick that up? Adding the 
disclaimer again that it's a case that would hit us once we start to advertise 
"stronger guarantee delivery of observation events" - not that we must fix it 
(we can as document it that the guarantee is a best-effort thing).

If we go for it, then we'd also have to handle cases such as "cluster node 
crashed and burnt to never come up again", "cluster node crashed for 2 hours 
but got back up again", "some other cluster node waited for the lease and acquired 
the same cluster id as the one which got crashed", etc. The more we go ahead 
with solving this - it seems we'd start to invade into what sling jobs do today 
(have concrete notion of assignee etc with re-assignments as required)


was (Author: catholicon):
Btw, just to note (I'm unable to concretely argue if we really want to do this 
or not) - no matter how we proceed, the storage would be cluster-id specific. 
So, do we want to cater the case where some cluster node gets obliterated (with 
pending stored events) - would/should some other node pick that up? Adding the 
disclaimer again that it's a case that would hit us once we start to advertise 
"stronger guarantee delivery of observation events" - not that we must fix it 
(we can as document it that the guarantee is a best-effort thing)


[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-26 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15523022#comment-15523022
 ] 

Vikas Saurabh commented on OAK-4581:


What I meant was that this issue is saying that for JCR observation, we'd be 
more resilient for:
# an event is generated for node1
# node1 rebounces before the event is dispatched to all observers
# those observers get the event post reboot

That opens up the case of at-least-once delivery for events that got generated 
for a given cluster node. My argument is that if we say we just handle a 
sub-set of cases, then the observation client still has to be resilient 
against loss of events - and if that observation client code is resilient, 
what's the point of us trying to be more resilient?

Afaict, the issue got created because it's tricky/hard(/impossible??) for an 
observation client to be resilient to loss of observation events (barring 
manual intervention). But yes, if we do this (just do the simple stuff and not 
give a pedantic guarantee of delivery), then we can have tooling to help with 
that manual intervention.

Again, I'm not saying we need to do this (pedantic delivery) - it's just 
something we should keep in mind about what the observation client might 
expect post this implementation.
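The at-least-once expectation under discussion amounts to advancing a listener's persisted cursor only after the handler has completed, so a crash mid-dispatch leads to redelivery rather than loss. A minimal sketch with made-up names (the cursor would be persisted in a real implementation):

```java
import java.util.List;
import java.util.function.Consumer;

// Hedged sketch of at-least-once dispatch: the cursor is advanced only
// AFTER the handler returns normally. If the process dies mid-dispatch,
// the entry is redelivered on restart - so handlers must tolerate
// duplicates (at-least-once, not exactly-once).
public class AtLeastOnceDispatcher {
    private int committedCursor = 0; // would be persisted in reality

    void dispatch(List<String> journal, Consumer<String> handler) {
        while (committedCursor < journal.size()) {
            String event = journal.get(committedCursor);
            handler.accept(event); // may throw / crash here...
            committedCursor++;     // ...so we only commit afterwards
        }
    }

    public static void main(String[] args) {
        AtLeastOnceDispatcher d = new AtLeastOnceDispatcher();
        List<String> journal = List.of("e1", "e2", "e3");
        try {
            // Simulated crash while handling e2: e1 is committed, e2 is not.
            d.dispatch(journal, e -> {
                if (e.equals("e2")) throw new RuntimeException("crash");
            });
        } catch (RuntimeException expected) {
        }
        // "Restart": e2 is delivered again, then e3.
        d.dispatch(journal, e -> System.out.println("handled " + e));
    }
}
```

The duplicate-delivery consequence is exactly why the comment above argues the client must stay somewhat resilient either way.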

> Persistent local journal for more reliable event generation
> ---
>
> Key: OAK-4581
> URL: https://issues.apache.org/jira/browse/OAK-4581
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: core
>Reporter: Chetan Mehrotra
>Assignee: Stefan Egli
>  Labels: observation
> Fix For: 1.6
>
> Attachments: OAK-4581.v0.patch
>
>
> As discussed in OAK-2683, "hitting the observation queue limit" has multiple 
> drawbacks. Quite a bit of work has been done to make diff generation faster. 
> However, there are still chances of the event queue getting filled up. 
> This issue is meant to implement a persistent event journal. Idea here being
> # NodeStore would push the diff into a persistent store via a synchronous 
> observer
> # Observers which are meant to handle such events in an async way (by virtue of 
> being wrapped in BackgroundObserver) would instead pull the events from this 
> persisted journal
> h3. A - What is persisted
> h4. 1 - Serialized Root States and CommitInfo
> In this approach we just persist the root states in serialized form. 
> * DocumentNodeStore - This means storing the root revision vector
> * SegmentNodeStore - {color:red}Q1 - What does the serialized form of a 
> SegmentNodeStore root state look like{color} - Possibly the RecordId of the 
> "root" state
> Note that with OAK-4528 DocumentNodeStore can rely on the persisted remote 
> journal to determine the affected paths, which reduces the need for 
> persisting the complete diff locally.
> Event generation logic would then "deserialize" the persisted root states and 
> then generate the diff as currently done via NodeState comparison
> h4. 2 - Serialized commit diff and CommitInfo
> In this approach we can save the diff in JSOP form. The diff only contains 
> information about the affected paths, similar to what is currently stored in 
> the DocumentNodeStore journal
> h4. CommitInfo
> The commit info would also need to be serialized. So it needs to be ensured 
> that whatever is stored there can be serialized or recalculated
> h3. B - How it is persisted
> h4. 1 - Use a secondary segment NodeStore
> OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. 
> [~mreutegg] suggested that for persisted local journal we can also utilize a 
> SegmentNodeStore instance. Care needs to be taken for compaction, either via 
> the generation approach or by relying on online compaction
> h4. 2 - Make use of write ahead log implementations
> [~ianeboston] suggested that we can make use of some write ahead log 
> implementation like [1], [2] or [3]
> h3. C - How changes get pulled
> Some points to consider for event generation logic
> # Would need a way to keep pointers to journal entries on a per-listener basis. 
> This would allow each Listener to "pull" content changes and generate the diff 
> at its own speed while keeping the in-memory overhead low
> # The journal should survive restarts
> [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html
> [2] 
> https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal
> [3] 
> https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog
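The per-listener pull model of section C can be sketched with plain Java collections. This is only an illustrative sketch under stated assumptions, not Oak code: `JournalSketch`, `append` and `pull` are hypothetical names, and the journal here is an in-memory list standing in for the persisted store.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of section C: a shared journal of serialized diffs,
// with one pointer per listener so each can pull at its own speed and the
// in-memory overhead stays low.
public class JournalSketch {
    private final List<String> journal = new ArrayList<>();       // persisted JSOP diffs (stand-in)
    private final Map<String, Integer> pointers = new HashMap<>(); // listener id -> next index to read

    void append(String diff) {
        journal.add(diff);
    }

    // Pull all entries the given listener has not yet seen and advance its pointer.
    List<String> pull(String listenerId) {
        int from = pointers.getOrDefault(listenerId, 0);
        List<String> batch = new ArrayList<>(journal.subList(from, journal.size()));
        pointers.put(listenerId, journal.size());
        return batch;
    }

    public static void main(String[] args) {
        JournalSketch j = new JournalSketch();
        j.append("+ \"/a\" : {}");
        j.append("^ \"/a/b\" : 1");
        System.out.println(j.pull("slowListener").size()); // prints 2
        j.append("- \"/a\"");
        System.out.println(j.pull("slowListener").size()); // prints 1
    }
}
```

A real implementation would additionally persist the pointers themselves so they survive restarts (point C2), which this sketch does not attempt.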



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-26 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15522461#comment-15522461
 ] 

Stefan Egli commented on OAK-4581:
--

[~catholicon], thx for your feedback! Re
bq. I thought this issue was about persisting change pointers
Agreed. And when looking at persisting NodeState(+CommitInfo), the revGC issue 
comes up (you must later be able to do the diff, and revGC must not clean up 
things before the persisted observation queues have been processed). From this 
resulted the idea to not persist the NodeState but the actual, calculated 
Event (even though that would bloat the storage, it would become much simpler). 
However, this again conflicts with support for any type of BackgroundObserver, 
not only ChangeProcessor. So I think the latter question becomes central now: 
if we want to support any BackgroundObserver, we need to persist NodeState and 
prevent revGC from cleaning up too early.
bq. Afaics, we still want remain wary of infinite storage of pointers
bq. Sure if we're saying that revGC deleted node states 
Exactly. There's a dilemma: we want to prevent revGC from being 'paused' for 
too long just because of observation - but if it isn't paused, then such a slow 
or overwhelmed listener would lose events. We have to make a choice; it's a 
binary thing. Perhaps we have to cut off events after a certain time (e.g. 
after exactly 24 hours, to fit into segment-tar's default generation cycle)? 
(The advantage of persisting events would have been that it wouldn't have 
needed such a cut-off..)
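The time-based cut-off floated above could look roughly like this. Again a hedged sketch, not Oak code: `CutoffJournal`, its fields and methods are hypothetical names, and the 24-hour retention window is just the example value from the comment.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a time-based cut-off: journal entries older than the
// retention window (e.g. 24h, matching segment-tar's default generation cycle)
// are discarded instead of pausing revision GC indefinitely. Listeners that
// fall behind the cut-off simply lose those events - the documented trade-off.
public class CutoffJournal {
    private static final Duration RETENTION = Duration.ofHours(24);
    private final Deque<long[]> entries = new ArrayDeque<>(); // [timestampMillis, entryId]

    void append(long entryId, Instant now) {
        entries.addLast(new long[] { now.toEpochMilli(), entryId });
    }

    // Drop everything outside the retention window; returns how many entries
    // were discarded (i.e. events a too-slow listener can no longer receive).
    int evictOlderThan(Instant now) {
        long limit = now.minus(RETENTION).toEpochMilli();
        int dropped = 0;
        while (!entries.isEmpty() && entries.peekFirst()[0] < limit) {
            entries.removeFirst();
            dropped++;
        }
        return dropped;
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        CutoffJournal journal = new CutoffJournal();
        Instant t0 = Instant.parse("2016-09-26T00:00:00Z");
        journal.append(1, t0);
        journal.append(2, t0.plus(Duration.ofHours(25)));
        // At t0+25h the first entry is outside the 24h window and gets dropped.
        System.out.println(journal.evictOlderThan(t0.plus(Duration.ofHours(25)))); // prints 1
        System.out.println(journal.size()); // prints 1
    }
}
```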


[jira] [Updated] (OAK-4835) Provide generic option to interrupt online revision cleanup

2016-09-26 Thread Alex Parvulescu (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Parvulescu updated OAK-4835:
-
Component/s: core

> Provide generic option to interrupt online revision cleanup
> ---
>
> Key: OAK-4835
> URL: https://issues.apache.org/jira/browse/OAK-4835
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, documentmk, segmentmk
>Reporter: Alex Parvulescu
>Assignee: Alex Parvulescu
>  Labels: compaction, gc
>
> JMX binding for stopping a running compaction process





[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-26 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15522942#comment-15522942
 ] 

Vikas Saurabh commented on OAK-4581:


bq. The advantage of persisting events would have been that it wouldn't have 
needed such a cut-off..
Umm... do you mean that we'd be ok with infinite storage with persisted events? 
Or that it de-couples the storage concern from GC, and hence the cut-off metric 
could be storage size (as opposed to the must-have time boundaries if we remain 
coupled)?



[jira] [Comment Edited] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-26 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15522942#comment-15522942
 ] 

Vikas Saurabh edited comment on OAK-4581 at 9/26/16 12:43 PM:
--

bq. The advantage of persisting events would have been that it wouldn't have 
needed such a cut-off..
Umm... do you mean that we'd be ok with infinite storage with persisted events? 
Or that it de-couples the storage concern from GC, and hence the cut-off metric 
could be something else (say, storage size vs. the must-have time boundaries if 
we remain coupled)?


was (Author: catholicon):
bq. The advantage of persisting events would have been that it wouldn't have 
needed such a cut-off..
Umm... do you mean that we'd be ok with infinite storage with persisted events? 
Or, that it de-couples storage concern with GC and hence the cut-off metric 
could be storage size (as against must-have time boundaries if we remain 
coupled)



[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-26 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15523087#comment-15523087
 ] 

Stefan Egli commented on OAK-4581:
--

bq. 3. those observers get events post reboot
Not sure that's really the case. I would argue that after a restart these 
persisted events get deleted. IMO what 'persist' in the context of this ticket 
emphasizes is only additional memory at runtime, not that it survives restarts.



[jira] [Commented] (OAK-4673) Segment Tar should implement a cold standby functionality

2016-09-26 Thread Andrei Dulceanu (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15523101#comment-15523101
 ] 

Andrei Dulceanu commented on OAK-4673:
--

[~frm], could you mark this issue as RESOLVED, since all its subtasks are now 
resolved/closed?

> Segment Tar should implement a cold standby functionality
> -
>
> Key: OAK-4673
> URL: https://issues.apache.org/jira/browse/OAK-4673
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Andrei Dulceanu
> Fix For: Segment Tar 0.0.20
>
>
> oak-tarmk-standby currently works only with oak-segment. The new module 
> oak-segment-tar should work with oak-tarmk-standby or at least expose a 
> compatible cold standby functionality.





[jira] [Commented] (OAK-4581) Persistent local journal for more reliable event generation

2016-09-26 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15523201#comment-15523201
 ] 

Vikas Saurabh commented on OAK-4581:


Hmm... I probably put too much weight on C2 in the description:
bq. The journal should survive restarts



[jira] [Comment Edited] (OAK-4844) Analyse effects of simplified record ids

2016-09-26 Thread Alex Parvulescu (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15522577#comment-15522577
 ] 

Alex Parvulescu edited comment on OAK-4844 at 9/26/16 9:35 AM:
---

test results using the patch: size is {{13GB}}
{noformat}
Total size:
13 GB in  56521 data segments
768 KB in  3 bulk segments
3 GB in maps (46450859 leaf and branch records)
1 GB in lists (55469092 list and bucket records)
3 GB in values (value and block records of 70765678 properties, 
3429/378684/0/1214419 small/medium/long/external blobs, 46258452/1862224/159 
small/medium/long strings)
161 MB in templates (16772712 template records)
2 GB in nodes (251591739 node records)
links to non existing segments: []
{noformat}


was (Author: alex.parvulescu):
test results using the patch: size is {{13GB}}
{noformat}
Total size:
13 GB in  56521 data segments
768 KB in  3 bulk segments
3 GB in maps (46450859 leaf and branch records)
1 GB in lists (55469092 list and bucket records)
3 GB in values (value and block records of 70765678 properties, 
3429/378684/0/1214419 small/medium/long/external blobs, 46258452/1862224/159 
small/medium/long strings)
161 MB in templates (16772712 template records)
2 GB in nodes (251591739 node records)
links to non existing segments: []
{noformat}

> Analyse effects of simplified record ids
> 
>
> Key: OAK-4844
> URL: https://issues.apache.org/jira/browse/OAK-4844
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>  Labels: performance
> Fix For: Segment Tar 0.0.14
>
> Attachments: OAK-4844.patch
>
>
> OAK-4631 introduced a simplified serialisation for record ids. This causes 
> their footprint on disk to increase from 3 bytes to 18 bytes. OAK-4631 has 
> some initial analysis on the effect this is having on repositories as a 
> whole. 
> I'm opening this issue as a dedicated task to further look into mitigation 
> strategies (if necessary). 





[jira] [Comment Edited] (OAK-4844) Analyse effects of simplified record ids

2016-09-26 Thread Alex Parvulescu (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15522577#comment-15522577
 ] 

Alex Parvulescu edited comment on OAK-4844 at 9/26/16 9:40 AM:
---

test results using the patch: size is {{13GB}}
{noformat}
Total size:
13 GB in  56521 data segments
768 KB in  3 bulk segments
3 GB in maps (46450859 leaf and branch records)
1 GB in lists (55469092 list and bucket records)
3 GB in values (value and block records of 70765678 properties, 
3429/378684/0/1214419 small/medium/long/external blobs, 46258452/1862224/159 
small/medium/long strings)
161 MB in templates (16772712 template records)
2 GB in nodes (251591739 node records)
{noformat}


was (Author: alex.parvulescu):
test results using the patch: size is {{13GB}}
{noformat}
Total size:
13 GB in  56521 data segments
768 KB in  3 bulk segments
3 GB in maps (46450859 leaf and branch records)
1 GB in lists (55469092 list and bucket records)
3 GB in values (value and block records of 70765678 properties, 
3429/378684/0/1214419 small/medium/long/external blobs, 46258452/1862224/159 
small/medium/long strings)
161 MB in templates (16772712 template records)
2 GB in nodes (251591739 node records)
links to non existing segments: []
{noformat}

> Analyse effects of simplified record ids
> 
>
> Key: OAK-4844
> URL: https://issues.apache.org/jira/browse/OAK-4844
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>  Labels: performance
> Fix For: Segment Tar 0.0.14
>
> Attachments: OAK-4844.patch
>
>
> OAK-4631 introduced a simplified serialisation for record ids. This causes 
> their footprint on disk to increase from 3 bytes to 18 bytes. OAK-4631 has 
> some initial analysis on the effect this is having on repositories as a 
> whole. 
> I'm opening this issue as a dedicated task to further look into mitigation 
> strategies (if necessary). 





[jira] [Resolved] (OAK-4813) Simplify the server side of cold standby

2016-09-26 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari resolved OAK-4813.
-
Resolution: Fixed
  Assignee: Francesco Mari  (was: Andrei Dulceanu)

Committed a slightly modified version (minor aesthetic adjustments) at r1762301.

> Simplify the server side of cold standby
> 
>
> Key: OAK-4813
> URL: https://issues.apache.org/jira/browse/OAK-4813
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Andrei Dulceanu
>Assignee: Francesco Mari
>Priority: Minor
> Fix For: Segment Tar 0.0.14
>
> Attachments: OAK-4813-01.patch
>
>
> With the changes introduced in OAK-4803, it would be nice to keep the 
> previous symmetry between the client and the server and thus remove the 
> {{FileStore}} reference from the latter.
> Per [~frm]'s suggestion from one of the comments in OAK-4803:
> bq. In the end, these are the only three lines where the FileStore is used in 
> the server, which already suggests that this separation of concerns exists - 
> at least at the level of the handlers.
> {code:java}
> p.addLast(new GetHeadRequestHandler(new DefaultStandbyHeadReader(store)));
> p.addLast(new GetSegmentRequestHandler(new 
> DefaultStandbySegmentReader(store)));
> p.addLast(new GetBlobRequestHandler(new DefaultStandbyBlobReader(store)));
> {code}





[jira] [Reopened] (OAK-4838) Move S3 classes to oak-blob-cloud module

2016-09-26 Thread Amit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Jain reopened OAK-4838:


> Move S3 classes to oak-blob-cloud module
> 
>
> Key: OAK-4838
> URL: https://issues.apache.org/jira/browse/OAK-4838
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: blob
>Reporter: Amit Jain
>Assignee: Amit Jain
>  Labels: technical_debt
> Fix For: 1.5.11
>
>
> Some S3 related classes are present in the oak-core module. These should be 
> moved to oak-blob-cloud.
> This would also flip the module dependency direction to oak-core -> oak-blob-cloud





[jira] [Reopened] (OAK-4848) Improve oak-blob-cloud tests

2016-09-26 Thread Amit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Jain reopened OAK-4848:


> Improve oak-blob-cloud tests
> 
>
> Key: OAK-4848
> URL: https://issues.apache.org/jira/browse/OAK-4848
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: blob
>Reporter: Amit Jain
>Assignee: Amit Jain
>  Labels: technical_debt
> Fix For: 1.5.11
>
>
> Most S3 DataStore tests inherit from Jackrabbit 2's TestCaseBase. To extend 
> them we should copy that class into Oak and let the S3DataStore tests inherit 
> from it. In doing so we can also reuse the DataStore initialization code 
> already available.





[jira] [Assigned] (OAK-4561) Avoid embedding Apache Commons Math in Segment Tar

2016-09-26 Thread Andrei Dulceanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu reassigned OAK-4561:


Assignee: Andrei Dulceanu  (was: Francesco Mari)

> Avoid embedding Apache Commons Math in Segment Tar
> --
>
> Key: OAK-4561
> URL: https://issues.apache.org/jira/browse/OAK-4561
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Andrei Dulceanu
> Fix For: Segment Tar 0.0.22
>
>
> Apache Commons Math is a relatively large dependency. If possible, embedding 
> it should be avoided, so as not to considerably increase the size of the 
> Segment Tar module.





[jira] [Resolved] (OAK-4673) Segment Tar should implement a cold standby functionality

2016-09-26 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari resolved OAK-4673.
-
   Resolution: Fixed
 Assignee: Francesco Mari  (was: Andrei Dulceanu)
Fix Version/s: (was: Segment Tar 0.0.20)
   Segment Tar 0.0.14

Resolving this issue, as its blockers seem to be fixed.

> Segment Tar should implement a cold standby functionality
> -
>
> Key: OAK-4673
> URL: https://issues.apache.org/jira/browse/OAK-4673
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Francesco Mari
> Fix For: Segment Tar 0.0.14
>
>
> oak-tarmk-standby currently works only with oak-segment. The new module 
> oak-segment-tar should work with oak-tarmk-standby or at least expose a 
> compatible cold standby functionality.





[jira] [Updated] (OAK-4852) Conditional update with relaxed cache consistency

2016-09-26 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-4852:
--
Labels: scalability  (was: )

> Conditional update with relaxed cache consistency
> -
>
> Key: OAK-4852
> URL: https://issues.apache.org/jira/browse/OAK-4852
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: core, documentmk
>Reporter: Marcel Reutegger
>  Labels: scalability
>
> With multiple DocumentNodeStore instances in a cluster, the root document can 
> become a bottleneck when many concurrent commits use it as a commit root.
> The DocumentStore API currently requires that all operations that update 
> documents immediately reflect all changes from other cluster nodes, either 
> when returning the before document or implicitly via a subsequent find or 
> query operation that may be served from the cache.
> However, the DocumentNodeStore only needs to see external changes on the root 
> node when it does a background read (or when it detects a collision). In 
> almost all cases it would be fine with an update operation that does not 
> immediately reflect external changes in the cache.
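A minimal sketch of the relaxed behaviour described above, using a plain shared map as a stand-in for the backend store; all names are illustrative, not the DocumentStore API:

```java
// Hypothetical sketch of a conditional update with relaxed cache
// consistency: updates change the shared backend atomically, but each
// node's cache is only reconciled by an explicit background read, so a
// cached read may be stale in between.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class RelaxedUpdateSketch {
    private final Map<String, Long> backend;                           // shared "store"
    private final Map<String, Long> cache = new ConcurrentHashMap<>(); // this node's cache

    public RelaxedUpdateSketch(Map<String, Long> backend) {
        this.backend = backend;
    }

    // Atomically increment a counter in the backend, but do NOT refresh the
    // local cache: the returned before-value may be stale or absent.
    public Long incrementRelaxed(String key) {
        backend.merge(key, 1L, Long::sum);
        return cache.get(key);
    }

    // Background read: reconcile this node's cache with the backend.
    public void backgroundRead(String key) {
        Long v = backend.get(key);
        if (v != null) {
            cache.put(key, v);
        }
    }

    public Long cached(String key) {
        return cache.get(key);
    }

    public static void main(String[] args) {
        Map<String, Long> shared = new ConcurrentHashMap<>();
        RelaxedUpdateSketch nodeA = new RelaxedUpdateSketch(shared);
        RelaxedUpdateSketch nodeB = new RelaxedUpdateSketch(shared);

        nodeA.incrementRelaxed("root");
        // Node B's cache does not see node A's change yet.
        if (nodeB.cached("root") != null) {
            throw new AssertionError("cache should still be empty");
        }
        // Only a background read makes the external change visible.
        nodeB.backgroundRead("root");
        if (nodeB.cached("root") != 1L) {
            throw new AssertionError("cache should be reconciled to 1");
        }
        System.out.println(nodeB.cached("root")); // prints 1
    }
}
```

Running the main method shows that node B observes node A's increment only after its background read, which mirrors the relaxation proposed for updates on the root document.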


