[jira] [Commented] (OAK-7083) CompositeDataStore - ReadOnly/ReadWrite Delegate Support

2018-02-22 Thread Amit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373932#comment-16373932
 ] 

Amit Jain commented on OAK-7083:


A point brought up by Chetan: is it a requirement that the Primary should not 
delete blobs used by the Secondary? I mean, it would still be strange for the 
Primary (a production system) to have to coordinate GC with a Secondary (a 
staging or test system) it has no knowledge of. Indexing would have to be 
independent anyway, with files in the Secondary's DataStore.

> CompositeDataStore - ReadOnly/ReadWrite Delegate Support
> 
>
> Key: OAK-7083
> URL: https://issues.apache.org/jira/browse/OAK-7083
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: blob, blob-cloud, blob-cloud-azure, blob-plugins
>Reporter: Matt Ryan
>Assignee: Matt Ryan
>Priority: Major
>
> Support a specific composite data store use case, which is the following:
> * One instance uses no composite data store, but instead is using a single 
> standard Oak data store (e.g. FileDataStore)
> * Another instance is created by snapshotting the first instance node store, 
> and then uses a composite data store to refer to the first instance's data 
> store read-only, and refers to a second data store as a writable data store
> One way this can be used is in creating a test or staging instance from a 
> production instance.  At creation, the test instance will look like 
> production, but any changes made to the test instance do not affect 
> production.  The test instance can be quickly created from production by 
> cloning only the node store, and not requiring a copy of all the data in the 
> data store.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-7083) CompositeDataStore - ReadOnly/ReadWrite Delegate Support

2018-02-22 Thread Amit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373920#comment-16373920
 ] 

Amit Jain commented on OAK-7083:


[~mattvryan]
 Well, the proposals you have outlined are not entirely different.
 Proposal 2 is the same as Proposal 1, except that it does not even make the 
code change and requires independent Mark/Sweep cycles. It still does not solve 
the problems of performance and misleading log messages.
{quote}In the normal shared data store use case I think the impact is that all 
of the connected repositories will try to run the sweep phase. The same blobs 
will be deleted by the first sweeper as would have been deleted before. It 
doesn't impact the ability to collect garbage.
{quote}
It surely does impact normal Shared DataStore deployments. The problem is that, 
since only one repository sweeps, the state-maintenance files (used to track 
whether every repository's sweep phase has run) will not be cleaned up. For a 
normal setup we want only one repository sweeping, which is fine, but when do we 
clean up those files here, and when do we remove the reference files? That is 
the problem. The second run will again see these reference files and start from 
a stale state, thus not taking new blobs into account.

If we say the repositories for the CompositeDataStore each have to run this 
independently:
 * Mark on all repos
 * Sweep

Then we don't have to make any change to the MarkSweepGarbageCollector, and the 
process will be the same as it currently is for normal deployments.

For proposal 3, yes, that would mean the Primary using the CompositeDataStore 
abstraction as well. But once it does, it does not require any complicated 
setup for the DataStore, GC, etc.

{quote}Any other proposals?{quote}
Essentially, I am OK with proposal 2 for a start, and then we can enhance it 
with the proposal I outlined of encoding the blob ids with the role/type of the 
DataStore. Could you please also respond to the clarifications I asked for 
above?

{quote}but may impact efficiency or give confusing log messages (which might be 
fixable){quote}
How is it fixable with no information?

In addition, I am beginning to think this particular use case might be slightly 
different from the CompositeDataStore where all the delegate DataStores are 
still managed by one repository. Here, the delegate DataStores are under the 
management of two different repositories.



[jira] [Commented] (OAK-7083) CompositeDataStore - ReadOnly/ReadWrite Delegate Support

2018-02-22 Thread Matt Ryan (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373665#comment-16373665
 ] 

Matt Ryan commented on OAK-7083:


Thanks [~amjain] for your comments and review so far.

Since there are a lot of questions I'm going to try to distill down to what I 
think are key issues and then work through the dependent issues as they come.

Let's consider first the proposals to handle garbage collection for composite 
data stores.  I think there are three currently.  For reference, my original 
proposal is:  Change the MarkSweepGarbageCollector so we don't remove any 
"references" files from the metadata area until all repositories connected to a 
data store have attempted the sweep phase.  I think the three proposals are:
 # Move forward with the change I proposed.
 # Require that every repository complete the "mark" phase before any 
repository can attempt a "sweep" phase.
 # Use my proposal but only for repositories using CompositeDataStore.

h3. Proposal 1

I believe the concern with proposal 1 is that production repositories sharing 
the same data store may run GC on completely different schedules.  We can't be 
sure that all repositories complete a mark phase before any repository attempts 
a sweep phase.  In the context of my proposal, I believe what this means is 
that blobs that should be deleted may take longer to delete than expected - for 
example, it may require a couple of invocations.

In the normal shared data store use case I think the impact is that all of the 
connected repositories will try to run the sweep phase.  The same blobs will be 
deleted by the first sweeper as would have been deleted before.  It doesn't 
impact the ability to collect garbage, but may impact efficiency or give 
confusing log messages (which might be fixable).

In the composite data store use case, since either repository may be able to 
delete blobs that the other repository cannot, collection may take multiple 
cycles. For example, assuming a production and a staging system: if the staging 
system deletes a node with a blob reference and then runs mark and then sweep, 
the sweep may fail because production hasn't run its mark phase yet (there is 
no "references" file from the production repo). Later, the production system 
would mark and then sweep, deleting its blobs but remaining unable to delete 
blobs on the staging side. However, with my change the "references" files 
remain, so the next time the staging system runs mark and sweep, the sweep will 
succeed (all the "references" files are still there) and it will delete the 
blob that became unreferenced earlier.

So eventually I think blobs that should be collected will end up collected, 
although it may take a while.
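The eventual-collection cycle just described can be modeled in a few lines. This is a hedged, in-memory sketch with hypothetical names ({{SharedMetadata}}, {{references-*}} and {{sweepComplete-*}} markers); it is not the actual MarkSweepGarbageCollector code, only an illustration of the "references files persist until every repository has swept" rule:

```java
import java.util.HashSet;
import java.util.Set;

// In-memory model (hypothetical, not the Oak API): a sweep is cancelled
// unless every repository has produced a "references" file, and the shared
// "references" files are removed only once every repository has recorded
// a "sweepComplete" marker.
class SharedMetadata {
    final Set<String> repositories = new HashSet<>();
    final Set<String> referencesFiles = new HashSet<>();
    final Set<String> sweepCompleteMarkers = new HashSet<>();

    void mark(String repoId) {
        referencesFiles.add("references-" + repoId);
    }

    boolean sweep(String repoId) {
        if (referencesFiles.size() < repositories.size()) {
            return false; // some repository has not run its mark phase yet
        }
        sweepCompleteMarkers.add("sweepComplete-" + repoId);
        if (sweepCompleteMarkers.size() == repositories.size()) {
            // The last repository to sweep cleans up the shared metadata.
            referencesFiles.clear();
            sweepCompleteMarkers.clear();
        }
        return true;
    }
}
```

Under this rule, a staging sweep that fires before production has marked simply returns without sweeping; once production has marked, staging can sweep while the "references" files stay in place for production's later sweep.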
h3. Proposal 2

If we require that every repository complete the "mark" phase before any 
repository can attempt a "sweep" phase, it won't eliminate the need for every 
repository to perform the sweep.  This is still needed because each repository 
has binaries that only can be deleted by that repository.

What it could do, hopefully, is coordinate the sweep phases so that less time 
elapses than in proposal 1.

However, I think you still have to answer the question: what does a repository 
do if it is ready to sweep but not all repositories have completed the mark 
phase? This is almost what we have now. If not every repository has completed 
the mark phase and one repository wants to sweep, what happens? I assume it 
just cancels the sweep until the next scheduled GC run, in which case I don't 
see how this is any better than proposal 1.
h3. Proposal 3

This proposal is to only use my GC changes with CompositeDataStore.  I'm not 
sure exactly what we mean by this.

We could say that it is only used in repositories that are using a 
CompositeDataStore.  This could be done, although it would probably require 
changing the node store code so that it obtains the garbage collector from a 
registered reference instead of instantiating it directly, and then having the 
different data stores register a garbage collector for use by the node store.  
It might complicate the dependency tree and other things depending on how the 
garbage collector becomes available to the node store (see the 
SegmentNodeStoreService code where the MarkSweepGarbageCollector is 
instantiated to see what I mean).

But it doesn't matter because this approach won't actually solve the problem, 
in my view.  The reason is that *both* of the systems participating have to use 
the same garbage collection algorithm.  In other words, if staging has the 
CompositeDataStore, it is going to rely upon the production system to write the 
"sweepComplete" metadata file and leave the "references" files in order for the 
staging system to successfully complete the "sweep" phase.  The production 
system isn't using CompositeDataStore, though, so if it is relying on 

[jira] [Comment Edited] (OAK-7198) Index rule with REGEX_ALL_PROPS includes relative node

2018-02-22 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373552#comment-16373552
 ] 

Vikas Saurabh edited comment on OAK-7198 at 2/22/18 10:14 PM:
--

Backported to 1.8 in [r1825101|https://svn.apache.org/r1825101].

This issue led to an OOM while doing {{--doc-traversal-mode}}-based oak-run 
indexing on mongo: because of this issue, {{ChildNodeStateProvider}} tried to 
find a child node named {{\^\[\^}} and read quite a big tree into memory from 
the dumped documents.


was (Author: catholicon):
Backported to 1.8 in [r1825101|https://svn.apache.org/r1825101].

This issue led to an OOM while doing {{--doc-traversal-mode}}-based oak-run 
indexing on mongo: because of this issue, {{ChildNodeStateProvider}} tried to 
find a child node named {{\^\[\^\}} and read quite a big tree into memory from 
the dumped documents.

> Index rule with REGEX_ALL_PROPS includes relative node
> --
>
> Key: OAK-7198
> URL: https://issues.apache.org/jira/browse/OAK-7198
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Affects Versions: 1.8.0
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
>Priority: Minor
> Fix For: 1.9.0, 1.10, 1.8.3
>
> Attachments: OAK-7198.patch
>
>
> A lucene index with an index rule that includes properties with 
> {{LuceneIndexConstants.REGEX_ALL_PROPS}} also includes some child node.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Updated] (OAK-7198) Index rule with REGEX_ALL_PROPS includes relative node

2018-02-22 Thread Vikas Saurabh (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Saurabh updated OAK-7198:
---
Labels:   (was: candidate_oak_1_8)



[jira] [Updated] (OAK-7198) Index rule with REGEX_ALL_PROPS includes relative node

2018-02-22 Thread Vikas Saurabh (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Saurabh updated OAK-7198:
---
Fix Version/s: 1.8.3



[jira] [Comment Edited] (OAK-7083) CompositeDataStore - ReadOnly/ReadWrite Delegate Support

2018-02-22 Thread Matt Ryan (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373250#comment-16373250
 ] 

Matt Ryan edited comment on OAK-7083 at 2/22/18 9:22 PM:
-

(From [~amjain] via oak-dev)
{quote} bq. The solution for {{SharedDataStore}} currently is to require all 
repositories to run a Mark phase then run the Sweep phase on one of them.
Yes. Sorry, I didn’t mention that. I was trying to be brief and ended up being 
unclear. In the situation I described above it is definitely running the mark 
phase first and then the sweep phase. The problem is still as I described - no 
matter which one runs sweep first, it cannot delete all the binaries that may 
possibly have been deleted on both systems.
{quote}
The problem arises because of how the systems are set up. For this particular 
problem there is no reason for the Secondary to even account for the Primary's 
datastore, as it should not and cannot delete anything in there.
{quote} bq. Besides there's a problem of the Sweep phase on the primary 
encountering blobs it does not know about (from the secondary) and which it 
cannot delete creating an unpleasant experience. As I understand the Primary 
could be a production system and having these sort of errors crop up would be 
problematic.
If they are regarded as errors, yes. Currently this logs a WARN level message 
(not an ERROR) which suggests that sometimes not all the binaries targeted for 
deletion will actually be deleted.
 So this might be an issue of setting clear expectations. But I do see the 
point.
{quote}
Yes, these are logged as WARN because they are not fatal, but empirically they 
are problematic and are questioned by customers. Apart from that, there is also 
a performance impact, since deletion is attempted for each binary, which incurs 
a penalty.
{quote} bq. Encode the blobs ids on the Secondary with the {{DataStore}} 
location/type with which we can distinguish the blob ids belonging to the 
respective {{DataStore}}s.
That’s a solution that only works in this very specific use case of 
{{CompositeDataStore}}. In the future if we were ever to want to support 
different scenarios we would then have to reconsider how it encodes blobs for 
each delegate. Would that mean that data written to a data store by the 
{{CompositeDataStore}} could not be read by another {{CompositeDataStore}} 
referencing the same delegate?
{quote}
But encoding of the blob ids is needed anyway, irrespective of GC, no? 
Otherwise, how does the {{CompositeDataStore}} redirect the CRUD calls to the 
respective DataStores? And I did not understand how encoding the blob id with 
information about the DataStore would preclude it from being read. It has to 
have the same semantics for the same delegate. It does preclude moving blobs 
from one subspace to another, but I don't think that's the use case anyway.
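For illustration, a role-encoded blob id could look like the sketch below. The "role:" prefix format, the class name, and the legacy fallback are assumptions made for this example only; Oak does not define such a format:

```java
// Hypothetical role-prefixed blob id ("role:rawId") letting a composite
// store route each CRUD call to the delegate that owns the blob.
class RoutedBlobId {
    final String role;   // e.g. "primary" or "secondary" (assumed names)
    final String rawId;  // delegate-local blob id

    RoutedBlobId(String role, String rawId) {
        this.role = role;
        this.rawId = rawId;
    }

    String encode() {
        return role + ":" + rawId;
    }

    static RoutedBlobId decode(String encoded) {
        int i = encoded.indexOf(':');
        if (i < 0) {
            // Legacy id without a prefix: assume it belongs to the primary.
            return new RoutedBlobId("primary", encoded);
        }
        return new RoutedBlobId(encoded.substring(0, i), encoded.substring(i + 1));
    }
}
```

Since decode and encode round-trip, two composite stores referencing the same delegate would interpret the same id identically, which is the "same semantics for the same delegate" property.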
{quote} bq. Secondary's Mark phase only redirects the Primary owned blobids to 
the references file in the Primary's {{DataStore}} (Primary's DataStore 
operating as Shared).
The {{DataStore}} has no knowledge of the garbage collection stages. So IIUC 
this would require creating a new garbage collector that is aware of composite 
data stores and has the ability to interact with the {{CompositeDataStore}} in 
a tightly coupled fashion. Either that or we would have to enhance the data 
store API (for example, add a new interface or extend an interface so it can be 
precisely controlled by the garbage collector). Or both.
{quote}
{{DataStore}} does not have knowledge of when GC is taking place, but it does 
have helper methods which are used by GC. Yes, I would think that the methods 
currently existing for the purpose of GC need to be enhanced, and the Composite 
would have some intelligence about the execution of some of them, e.g. delete 
and the metadata methods, using some information about the delegates.
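A rough sketch of that delegate-aware behavior, under the assumption of a simplified helper interface (hypothetical names, not Oak's actual SharedDataStore API): deletes are routed only to writable delegates, while metadata reads aggregate across all of them:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical GC-helper surface of a delegate data store.
interface GcDelegate {
    boolean readOnly();
    int deleteOlderThan(long timestamp);            // returns number deleted
    List<String> getMetadataRecords(String prefix);
}

// Composite routing: never issue deletes against a read-only (shared)
// delegate, but aggregate metadata (e.g. "references" files) from all.
class CompositeGcHelper {
    private final List<GcDelegate> delegates;

    CompositeGcHelper(List<GcDelegate> delegates) {
        this.delegates = delegates;
    }

    int deleteOlderThan(long timestamp) {
        int deleted = 0;
        for (GcDelegate d : delegates) {
            if (!d.readOnly()) {
                deleted += d.deleteOlderThan(timestamp);
            }
        }
        return deleted;
    }

    List<String> getMetadataRecords(String prefix) {
        List<String> all = new ArrayList<>();
        for (GcDelegate d : delegates) {
            all.addAll(d.getMetadataRecords(prefix));
        }
        return all;
    }
}
```

This routing would also avoid the WARN messages discussed earlier, since the Secondary's GC would never attempt deletions against the Primary's store.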
{quote} bq. Secondary executes GC for its {{DataStore}} independently and does 
not worry about the Shared blobids (already taken care of above).
Same issue - GC happens outside of the control of the {{DataStore}}.

It’s a good idea Amit - something I struggled with quite a while. I considered 
the same approach as well. But it tightly binds garbage collection to the data 
store, whereas now they are currently very loosely bound. GC leverages the 
{{DataStore}} APIs to do GC tasks (like reading and writing metadata files) but 
the {{DataStore}} doesn’t have any knowledge that GC is even happening.

So I don’t see how the {{CompositeDataStore}} could control execution of GC 
only on the independent data store.
{quote}
It does not control execution of the GC, but it does control the GC helper 
methods and uses info already available to it about the delegates. Also, we 
could simply have GC instances bound to each delegate {{DataStore}}. This also 
would be similar to a case where we use the {{CompositeDataStore}} for 
internally creating a 


[jira] [Commented] (OAK-7083) CompositeDataStore - ReadOnly/ReadWrite Delegate Support

2018-02-22 Thread Matt Ryan (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373241#comment-16373241
 ] 

Matt Ryan commented on OAK-7083:


 
{quote} bq. Now the problem comes when secondary tries to run the sweep phase. 
It will first try to verify that a references file exists for each repository 
file in DS_P - and fail. This fails because primary deleted its references file 
already. Thus secondary will cancel GC and thus blob C never ends up getting 
deleted. Note that secondary must delete C because it is the only repository 
that knows about C.
 bq. This same situation exists also if secondary sweeps first. If record D was 
created by primary after secondary was cloned, then D is deleted by primary, 
secondary never knows about blob D so it cannot delete it during the sweep 
phase - it can only be deleted by primary.

The solution for SharedDataStore currently is to require all repositories to 
run a Mark phase then run the Sweep phase on one of them.
{quote}

Yes.  Sorry, I didn’t mention that.  I was trying to be brief and ended up 
being unclear.  In the situation I described above it is definitely running the 
mark phase first and then the sweep phase.  The problem is still as I described 
- no matter which one runs sweep first, it cannot delete all the binaries that 
may possibly have been deleted on both systems.

{quote} bq. The change I made to the garbage collector is that when a 
repository finishes the sweep phase, it doesn’t necessarily delete the 
references file. Instead it marks the data store with a “sweepComplete” file 
indicating that this repository finished the sweep phase. When there is a 
“sweepComplete” file for every repository (in other words, the last repository 
to sweep), then all the references files are deleted.

Well, currently the problem is that not all repositories are required to run 
the sweep phase. The solution above would be OK if GC were run manually at 
different times, as in your case.
{quote}
Exactly - in the case I’ve described both have to successfully run a sweep or 
not all binaries will be deleted.
{quote}But in real-world applications there's typically a cron job (e.g. an AEM 
maintenance task) which could be set up to execute weekly at a particular time 
on all repositories. In that case, almost always only the repository which 
finished the Mark phase last would be able to execute the Sweep phase, as it 
would be the only repository to see all the reference files for the other 
repos (others executing before it would fail). This is still OK for the 
{{SharedDataStore}} use cases we have. But with the above solution, since not 
all repositories would be able to run the sweep phase, the reference files 
won't be cleaned up.
{quote}
A very valid point.  I'll need to think that one through some more.
{quote}Besides, there's the problem of the Sweep phase on the primary 
encountering blobs it does not know about (from the secondary) and which it 
cannot delete, creating an unpleasant experience. As I understand it, the 
Primary could be a production system, and having these sorts of errors crop up 
would be problematic.
{quote}
If they are regarded as errors, yes.  Currently this logs a WARN level message 
(not an ERROR) which suggests that sometimes not all the binaries targeted for 
deletion will actually be deleted.

So this might be an issue of setting clear expectations.  But I do see the 
point.
{quote}So, generically the solution would be to use the shared {{DataStore}} GC 
paradigm we currently have, which requires the Mark phase to be run on all 
repositories before running a Sweep.
{quote}
Yes - as I said, this is being done; it still requires that both repos run a 
sweep.
{quote}For this specific use case some observations and a quick rough sketch of 
a possible solution:
 * The {{DataStore}}s for the 2 repositories - Primary & Secondary - can be 
thought of as Shared & Private
 ** Primary does not know about Secondary and could be an existing repository, 
and thus does not know about the {{DataStore}} of the Secondary either. In 
other words it could even function as a normal {{DataStore}} and need not be a 
{{CompositeDataStore}}.
 ** Secondary does need to know about the Primary and thus registers itself as 
sharing the Primary {{DataStore}}.
 * Encode the blob ids on the Secondary with the {{DataStore}} location/type, 
with which we can distinguish the blob ids belonging to the respective 
{{DataStore}}s.{quote}
That’s a solution that only works in this very specific use case of 
{{CompositeDataStore}}. If we ever want to support different scenarios in the 
future, we would have to reconsider how blob ids are encoded for each 
delegate. Would that mean that data written to a data store by the 
CompositeDataStore could not be read by another CompositeDataStore referencing 
the same delegate?
{quote} * Secondary's Mark phase only redirects the Primary 

[jira] [Commented] (OAK-7083) CompositeDataStore - ReadOnly/ReadWrite Delegate Support

2018-02-22 Thread Matt Ryan (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373229#comment-16373229
 ] 

Matt Ryan commented on OAK-7083:


(From [~amjain] via oak-dev):

 
{quote}Now the problem comes when secondary tries to run the sweep phase.  It 
will first try to verify that a references file exists for each repository file 
in DS_P - and fail.  This fails because primary deleted its references file 
already.  Thus secondary will cancel GC and thus blob C never ends up getting 
deleted.  Note that secondary must delete C because it is the only repository 
that knows about C.

This same situation exists also if secondary sweeps first.  If record D was 
created by primary after secondary was cloned, then D is deleted by primary, 
secondary never knows about blob D so it cannot delete it during the sweep 
phase - it can only be deleted by primary.
{quote}
 

The solution for {{SharedDataStore}} currently is to require all repositories 
to run a Mark phase then run the Sweep phase on one of them.
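The precondition behind this paradigm can be sketched as below: sweep may only proceed once every registered repository has published a references file from its mark phase. All names here are illustrative, not Oak's actual {{MarkSweepGarbageCollector}} API:

```java
import java.util.*;

// Sketch of the SharedDataStore GC precondition described above.
public class SweepPrecondition {
    // Sweep is allowed only when a references file exists for every repository.
    static boolean canSweep(Set<String> registeredRepos, Set<String> reposWithReferencesFile) {
        return reposWithReferencesFile.containsAll(registeredRepos);
    }

    public static void main(String[] args) {
        Set<String> repos = new HashSet<>(Arrays.asList("primary", "secondary"));
        // Primary already swept and deleted its references file...
        Set<String> refs = new HashSet<>(Collections.singletonList("secondary"));
        // ...so the secondary's sweep is cancelled - exactly the failure described.
        if (canSweep(repos, refs)) throw new AssertionError();
        refs.add("primary");
        if (!canSweep(repos, refs)) throw new AssertionError();
    }
}
```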

 
{quote}The change I made to the garbage collector is that when a repository 
finishes the sweep phase, it doesn’t necessarily delete the references file.  
Instead it marks the data store with a “sweepComplete” file indicating that 
this repository finished the sweep phase.  When there is a “sweepComplete” file 
for every repository (in other words, the last repository to sweep), then all 
the references files are deleted.
{quote}
 

Well, currently the problem is that not all repositories are required to run 
the sweep phase. The solution above would be OK if GC were run manually at 
different times, as in your case. But in real-world applications there's 
typically a cron job (e.g. an AEM maintenance task) which could be set up to 
execute weekly at a particular time on all repositories. In that case, almost 
always only the repository which finished the Mark phase last would be able to 
execute the Sweep phase, as it would be the only repository to see all the 
reference files for the other repos (others executing before it would fail). 
This is still OK for the {{SharedDataStore}} use cases we have. But with the 
above solution, since not all repositories would be able to run the sweep 
phase, the reference files won't be cleaned up.

Besides, there's the problem of the Sweep phase on the primary encountering 
blobs it does not know about (from the secondary) and which it cannot delete, 
creating an unpleasant experience. As I understand it, the Primary could be a 
production system, and having these sorts of errors crop up would be 
problematic.

So, generically the solution would be to use the shared {{DataStore}} GC 
paradigm we currently have, which requires the Mark phase to be run on all 
repositories before running a Sweep.

For this specific use case some observations and a quick rough sketch of a 
possible solution:
 * The {{DataStore}}s for the 2 repositories - Primary & Secondary - can be 
thought of as Shared & Private
 ** Primary does not know about Secondary and could be an existing repository, 
and thus does not know about the {{DataStore}} of the Secondary either. In 
other words it could even function as a normal {{DataStore}} and need not be a 
{{CompositeDataStore}}.
 ** Secondary does need to know about the Primary and thus registers itself as 
sharing the Primary {{DataStore}}.
 * Encode the blob ids on the Secondary with the {{DataStore}} location/type, 
with which we can distinguish the blob ids belonging to the respective 
{{DataStore}}s.
 * Secondary's Mark phase only redirects the Primary-owned blob ids to the 
references file in the Primary's {{DataStore}} (Primary's {{DataStore}} 
operating as Shared).
 * Secondary executes GC for its {{DataStore}} independently and does not worry 
about the Shared blob ids (already taken care of above). 


I presume some of the above steps are required to enable generic or even 
restricted {{CompositeDataStore}} solutions.
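The blob-id encoding idea in the sketch above could look roughly like the following. This is purely illustrative: Oak blob ids are content hashes, and the prefix scheme, names, and the "shared" default here are assumptions, not the actual proposal's wire format:

```java
// Hypothetical sketch of tagging Secondary-owned blob ids with a delegate
// role so the CompositeDataStore can route reads/deletes and GC can
// distinguish Shared (Primary) from Private (Secondary) blobs.
public class BlobIdEncoding {
    static String encode(String role, String rawBlobId) {
        return role + ":" + rawBlobId;
    }

    // An unprefixed id is assumed to belong to the shared (Primary) store.
    static String roleOf(String blobId) {
        int sep = blobId.indexOf(':');
        return sep < 0 ? "shared" : blobId.substring(0, sep);
    }

    static String rawId(String blobId) {
        int sep = blobId.indexOf(':');
        return sep < 0 ? blobId : blobId.substring(sep + 1);
    }

    public static void main(String[] args) {
        String privateId = encode("private", "f00dcafe");
        if (!roleOf(privateId).equals("private")) throw new AssertionError();
        if (!rawId(privateId).equals("f00dcafe")) throw new AssertionError();
        // Primary's ids stay unprefixed, so Primary need not know the scheme.
        if (!roleOf("deadbeef").equals("shared")) throw new AssertionError();
    }
}
```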

> CompositeDataStore - ReadOnly/ReadWrite Delegate Support
> 
>
> Key: OAK-7083
> URL: https://issues.apache.org/jira/browse/OAK-7083
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: blob, blob-cloud, blob-cloud-azure, blob-plugins
>Reporter: Matt Ryan
>Assignee: Matt Ryan
>Priority: Major
>
> Support a specific composite data store use case, which is the following:
> * One instance uses no composite data store, but instead is using a single 
> standard Oak data store (e.g. FileDataStore)
> * Another instance is created by snapshotting the first instance node store, 
> and then uses a composite data store to refer to the first instance's data 
> store read-only, and refers to a second data store as a writable data store
> One way this can be used is in creating a test 

[jira] [Commented] (OAK-7083) CompositeDataStore - ReadOnly/ReadWrite Delegate Support

2018-02-22 Thread Matt Ryan (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373222#comment-16373222
 ] 

Matt Ryan commented on OAK-7083:


Since [https://github.com/apache/jackrabbit-oak/pull/80] entails a change to 
the {{MarkSweepGarbageCollector}}, we should discuss the change here to see if 
there are concerns with it or if there is a better approach.
 
Let me try to briefly explain the change I've proposed to the 
{{MarkSweepGarbageCollector}}.
 
In the use case I tested there are two Oak repositories, one which we will call 
primary and one which we will call secondary.  Primary gets created first; 
secondary is created by cloning the node store of primary, then using a 
{{CompositeDataStore}} to have two delegate data stores.  The first delegate is 
the same as the data store for primary, in read-only mode.  The second delegate 
is only accessible by the secondary repo.
 
Let the data store shared by primary and secondary be called DS_P and the data 
store being used only by the secondary be called DS_S.  DS_P can be read by 
secondary but not modified, so all changes on secondary are saved in DS_S.  
Primary can still make changes to DS_P.
 
Suppose after creating both repositories, records A and B are deleted from the 
primary repo, and records B and C are deleted from the secondary repo.  Since 
DS_P is shared, only blob B should actually be deleted from DS_P via GC.  After 
both repositories run their “mark” phase, the primary repo created a 
“references” file in DS_P excluding A and B, meaning primary thinks A and B can 
both be deleted.  And the secondary repo created a “references” file in DS_P 
excluding B and C, meaning secondary thinks B and C can both be deleted.
 
Suppose then primary runs the sweep phase first.  It will first verify that it 
has a references file for each repository file in DS_P.  Since both primary and 
secondary put one there this test passes.  It will then merge all the data in 
all the references files in DS_P with its own local view of the existing blobs, 
and come up with a set of blobs to delete.  Primary will conclude that blobs B 
and C should be deleted - B because both primary and secondary said it is 
deleted, and C because secondary said it should be deleted and primary has no 
knowledge of C so it will assume it is okay to delete.  At this point primary 
will delete B and try to delete C and fail (which is ok).  Then primary will 
delete its “references” file from DS_P and call the sweep phase complete.
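The candidate computation described above can be sketched roughly as follows. All names are illustrative and the real {{MarkSweepGarbageCollector}} differs in detail; the point is only that a blob becomes a deletion candidate when no repository's references file mentions it:

```java
import java.util.*;

// Sketch of the sweep-phase deletion-candidate computation described above.
public class SweepSketch {
    // A blob is a candidate only if no repository's references file mentions it.
    static Set<String> deletionCandidates(Set<String> knownBlobs,
                                          Collection<Set<String>> referencesFiles) {
        Set<String> referenced = new HashSet<>();
        for (Set<String> refs : referencesFiles) {
            referenced.addAll(refs);
        }
        Set<String> candidates = new HashSet<>(knownBlobs);
        candidates.removeAll(referenced);
        return candidates;
    }

    public static void main(String[] args) {
        // Scenario from the comment: primary deleted A and B, secondary deleted
        // B and C; X is still referenced by both. C's id is known only from
        // secondary's metadata - the blob itself lives in DS_S, which is why
        // primary's attempt to delete C from DS_P fails harmlessly.
        Set<String> known = new HashSet<>(Arrays.asList("A", "B", "C", "X"));
        Set<String> primaryRefs = new HashSet<>(Arrays.asList("X"));        // excludes A, B
        Set<String> secondaryRefs = new HashSet<>(Arrays.asList("A", "X")); // excludes B, C
        Set<String> candidates =
                deletionCandidates(known, Arrays.asList(primaryRefs, secondaryRefs));
        if (!candidates.equals(new HashSet<>(Arrays.asList("B", "C")))) {
            throw new AssertionError(candidates);
        }
    }
}
```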
 
Now the problem comes when secondary tries to run the sweep phase.  It will 
first try to verify that a references file exists for each repository file in 
DS_P - and fail.  This fails because primary deleted its references file 
already.  Thus secondary will cancel GC and thus blob C never ends up getting 
deleted.  Note that secondary must delete C because it is the only repository 
that knows about C.
 
This same situation exists also if secondary sweeps first.  If record D was 
created by primary after secondary was cloned, then D is deleted by primary, 
secondary never knows about blob D so it cannot delete it during the sweep 
phase - it can only be deleted by primary.
 
 
The change I made to the garbage collector is that when a repository finishes 
the sweep phase, it doesn’t necessarily delete the references file.  Instead it 
marks the data store with a “sweepComplete” file indicating that this 
repository finished the sweep phase.  When there is a “sweepComplete” file for 
every repository (in other words, the last repository to sweep), then all the 
references files are deleted.
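The proposed protocol can be sketched as below. File names and helpers are illustrative; the actual patch works inside {{MarkSweepGarbageCollector}} against the DataStore API, not raw files:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of the proposed "sweepComplete" marker protocol.
public class SweepCompleteSketch {
    // Called by a repository when its sweep phase finishes.
    static void onSweepComplete(Path dataStoreRoot, String repoId,
                                List<String> allRepoIds) throws IOException {
        // Instead of deleting our references file immediately, drop a marker.
        Files.createFile(dataStoreRoot.resolve("sweepComplete-" + repoId));

        // Only the last repository to sweep cleans up all references files.
        boolean allSwept = allRepoIds.stream()
                .allMatch(id -> Files.exists(dataStoreRoot.resolve("sweepComplete-" + id)));
        if (allSwept) {
            for (String id : allRepoIds) {
                Files.deleteIfExists(dataStoreRoot.resolve("references-" + id));
                Files.deleteIfExists(dataStoreRoot.resolve("sweepComplete-" + id));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("ds");
        List<String> repos = Arrays.asList("primary", "secondary");
        for (String id : repos) {
            Files.createFile(root.resolve("references-" + id));
        }
        onSweepComplete(root, "primary", repos);
        // Secondary has not swept yet, so its references file must survive.
        if (!Files.exists(root.resolve("references-secondary"))) throw new AssertionError();
        onSweepComplete(root, "secondary", repos);
        // The last sweeper removed all references files.
        if (Files.exists(root.resolve("references-primary"))) throw new AssertionError();
    }
}
```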
 
I wrote an integration test for DSGC in this specific composite data store use 
case at 
[https://github.com/mattvryan/jackrabbit-oak/blob/39b33fe94a055ef588791f238eb85734c34062f3/oak-blob-composite/src/test/java/org/apache/jackrabbit/oak/blob/composite/CompositeDataStoreRORWIT.java].
 
 
All the Oak unit tests pass with this change.  I would like to hear any 
concerns about unforeseen consequences that others on-list may have with this 
change.  Also, there's the issue that sweeping must now be done by every 
repository sharing the data store, which introduces some inefficiency.  I'm 
open to changes or to a different approach, as long as it still solves the 
problem described above.


[jira] [Commented] (OAK-7083) CompositeDataStore - ReadOnly/ReadWrite Delegate Support

2018-02-22 Thread Matt Ryan (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373219#comment-16373219
 ] 

Matt Ryan commented on OAK-7083:


I'm going to be moving a conversation from the oak-dev list to the ticket over 
the next few comments.

In doing so, I want to make sure we consider the larger picture.  It is 
important to remember that right now we are talking about a very specific use 
case of {{CompositeDataStore}}, which is the purpose of this issue:  to support 
the ReadOnly/ReadWrite Delegate scenario.  However, we of course want to avoid 
making design decisions that limit the usability of the {{CompositeDataStore}} 
in the future.  Other uses may not always have exactly one read-only delegate - 
there may be multiple writable delegates, with zero or more read-only 
delegates, in any number of combinations.  I want to avoid designing ourselves 
into a corner where we can't support other use cases easily.

 




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-7083) CompositeDataStore - ReadOnly/ReadWrite Delegate Support

2018-02-22 Thread Matt Ryan (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373153#comment-16373153
 ] 

Matt Ryan commented on OAK-7083:


The following pull requests have been created for this issue:
 * [https://github.com/apache/jackrabbit-oak/pull/71] - Changes made to 
{{oak-blob-plugins}} and {{oak-blob}} to support {{CompositeDataStore}}.  This 
includes the following:

 ** Adding a {{DataStoreProvider}} interface to {{oak-blob}} so that delegate 
data stores can associate a role to themselves.
 ** Implementing {{AbstractDataStoreFactory}} in {{oak-blob-plugins}} so 
multiple data stores can be configured as factory classes.
 ** Implementing {{FileDataStoreFactory}} in {{oak-blob-plugins}} to provide 
this capability for {{FileDataStore}}.
 * 
[https://github.com/apache/jackrabbit-oak/pull/80|https://github.com/apache/jackrabbit-oak/pull/80/files]
 - Changes made in {{oak-blob-plugins}} to {{MarkSweepGarbageCollector}} so 
that garbage collection will work for {{CompositeDataStore}}.
 * [https://github.com/apache/jackrabbit-oak/pull/74] - Includes the following 
changes:
 ** Addition of {{S3DataStoreFactory}} to {{oak-blob-cloud}}
 ** Creation of {{oak-blob-composite}} which implements {{CompositeDataStore}} 
(service, data store, supporting code, unit tests, etc.)
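The {{DataStoreProvider}} idea from the first PR could look roughly like the sketch below. This is an assumption about its shape, not the actual interface in the PR; the {{DataStore}} type is stubbed here, whereas the real one would be the Jackrabbit {{DataStore}} API:

```java
// Hypothetical sketch of a role-carrying DataStoreProvider: a delegate data
// store advertises a "role" so the CompositeDataStore can select delegates
// by configuration.
public class DataStoreProviderSketch {
    interface DataStore { /* stand-in for the Jackrabbit DataStore API */ }

    interface DataStoreProvider {
        DataStore getDataStore();
        String getRole();
    }

    // Example: wrap any data store and register it under a role.
    static DataStoreProvider provider(DataStore ds, String role) {
        return new DataStoreProvider() {
            public DataStore getDataStore() { return ds; }
            public String getRole() { return role; }
        };
    }

    public static void main(String[] args) {
        DataStore ds = new DataStore() {};
        DataStoreProvider p = provider(ds, "readonly");
        if (!p.getRole().equals("readonly")) throw new AssertionError();
        if (p.getDataStore() != ds) throw new AssertionError();
    }
}
```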

 



[jira] [Updated] (OAK-7280) Remove superfluous methods from SegmentWriter

2018-02-22 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-7280:

Fix Version/s: 1.10

> Remove superfluous methods from SegmentWriter
> -
>
> Key: OAK-7280
> URL: https://issues.apache.org/jira/browse/OAK-7280
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.10
>
>
> Some methods in {{SegmentWriter}} are only used in test code. Production code 
> works only on a much smaller subset of the methods exposed by 
> {{SegmentWriter}}. As such, the superfluous methods should be removed and the 
> tests adapted.





[jira] [Created] (OAK-7280) Remove superfluous methods from SegmentWriter

2018-02-22 Thread Francesco Mari (JIRA)
Francesco Mari created OAK-7280:
---

 Summary: Remove superfluous methods from SegmentWriter
 Key: OAK-7280
 URL: https://issues.apache.org/jira/browse/OAK-7280
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: segment-tar
Reporter: Francesco Mari
Assignee: Francesco Mari


Some methods in {{SegmentWriter}} are only used in test code. Production code 
works only on a much smaller subset of the methods exposed by 
{{SegmentWriter}}. As such, the superfluous methods should be removed and the 
tests adapted.





[jira] [Updated] (OAK-6312) Unify NodeStore/DataStore configurations

2018-02-22 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dürig updated OAK-6312:
---
Fix Version/s: 1.10

> Unify NodeStore/DataStore configurations
> 
>
> Key: OAK-6312
> URL: https://issues.apache.org/jira/browse/OAK-6312
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: blob, blob-plugins, composite, documentmk, rdbmk, 
> segment-tar
>Reporter: Arek Kita
>Priority: Major
>  Labels: configuration, production
> Fix For: 1.10
>
>
> I've noticed recently that with many different NodeStore
> implementations (Segment, Document, Composite) but also DataStore
> implementations (File, S3, Azure) and some composite ones like
> (Hierarchical, Federated), it
> becomes more and more difficult to set up everything correctly and to
> know the current persistence state of a repository (especially
> with pretty aged repos). The factory code/required options are more complex 
> not only from the user's perspective but also from a maintenance point of view.
> We should have the same means of *describing* layouts of an Oak repository no 
> matter whether it is a simple or a more layered/composite instance.
> Some work has already been done in scope of OAK-6210 so I guess we have good 
> foundations to continue working in that direction.
> /cc [~mattvryan], [~chetanm]





[jira] [Updated] (OAK-7043) Collect SegmentStore stats as part of status zip

2018-02-22 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dürig updated OAK-7043:
---
Fix Version/s: 1.10

> Collect SegmentStore stats as part of status zip
> 
>
> Key: OAK-7043
> URL: https://issues.apache.org/jira/browse/OAK-7043
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: segment-tar
>Reporter: Chetan Mehrotra
>Priority: Major
>  Labels: monitoring, production
> Fix For: 1.10
>
>
> Many times while investigating an issue we ask the customer to provide the 
> size of the segmentstore and at times a listing of the segmentstore 
> directory. It would be useful to have an InventoryPrinter for SegmentStore 
> which can include
> * Size of segment store 
> * Listing of segment store directory
> * Possibly tail of journal.log
> * Possibly some stats/info from index files stored in tar files





[jira] [Resolved] (OAK-4994) Implement additional record types

2018-02-22 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dürig resolved OAK-4994.

   Resolution: Later
Fix Version/s: (was: 1.10)

Resolving as later. We'll implement additional record types as the need comes 
up. E.g. probably for OAK-5885 and also for additional tooling.

> Implement additional record types
> -
>
> Key: OAK-4994
> URL: https://issues.apache.org/jira/browse/OAK-4994
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Francesco Mari
>Priority: Minor
>  Labels: tooling
>
> The records written in the segment store should be augmented with additional 
> types. In OAK-2498 the following additional types were identified:
> - List of property names. A list of strings, where every string is a property 
> name, is referenced by the template record.
> - List of list of values. This list is pointed to by the node record and 
> contains the values for single\- and multi\- value properties of that node. 
> The double indirection is needed to support multi-value properties.
> - Map from string to node. This map is referenced by the template and 
> represents the child relationship between nodes.
> - Super root. This is a marker type identifying top-level records for the 
> repository super-roots.
> Just adding these types doesn't improve the situation for the segment store, 
> though. Bucket and block records are not easily parseable because they have a 
> variable length and their size is not specified in the record value itself. 
> For record types to be used effectively, the way we serialize certain kinds 
> of data has to be reviewed for further improvements.





[jira] [Commented] (OAK-7276) Build Jackrabbit Oak #1256 failed

2018-02-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372950#comment-16372950
 ] 

Hudson commented on OAK-7276:
-

Previously failing build now is OK.
 Passed run: [Jackrabbit Oak 
#1259|https://builds.apache.org/job/Jackrabbit%20Oak/1259/] [console 
log|https://builds.apache.org/job/Jackrabbit%20Oak/1259/console]

> Build Jackrabbit Oak #1256 failed
> -
>
> Key: OAK-7276
> URL: https://issues.apache.org/jira/browse/OAK-7276
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: continuous integration
>Reporter: Hudson
>Priority: Major
>
> No description is provided
> The build Jackrabbit Oak #1256 has failed.
> First failed run: [Jackrabbit Oak 
> #1256|https://builds.apache.org/job/Jackrabbit%20Oak/1256/] [console 
> log|https://builds.apache.org/job/Jackrabbit%20Oak/1256/console]





[jira] [Updated] (OAK-4994) Implement additional record types

2018-02-22 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dürig updated OAK-4994:
---
Priority: Minor  (was: Major)

> Implement additional record types
> -
>
> Key: OAK-4994
> URL: https://issues.apache.org/jira/browse/OAK-4994
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Francesco Mari
>Priority: Minor
>  Labels: tooling
> Fix For: 1.10
>
>
> The records written in the segment store should be augmented with additional 
> types. In OAK-2498 the following additional types were identified:
> - List of property names. A list of strings, where every string is a property 
> name, is referenced by the template record.
> - List of list of values. This list is pointed to by the node record and 
> contains the values for single\- and multi\- value properties of that node. 
> The double indirection is needed to support multi-value properties.
> - Map from string to node. This map is referenced by the template and 
> represents the child relationship between nodes.
> - Super root. This is a marker type identifying top-level records for the 
> repository super-roots.
> Just adding these types doesn't improve the situation for the segment store, 
> though. Bucket and block records are not easily parseable because they have a 
> variable length and their size is not specified in the record value itself. 
> For record types to be used effectively, the way we serialize certain kinds 
> of data has to be reviewed for further improvements.





[jira] [Created] (OAK-7279) segment-tar update from java 7 to java 8 may break persisted names using invalid characters

2018-02-22 Thread Julian Reschke (JIRA)
Julian Reschke created OAK-7279:
---

 Summary: segment-tar update from java 7 to java 8 may break 
persisted names using invalid characters
 Key: OAK-7279
 URL: https://issues.apache.org/jira/browse/OAK-7279
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: segment-tar
Reporter: Julian Reschke


segment-tar relies on {{String.getBytes()}} when persisting strings such as 
item names.

The problem is that the behavior for this has been changed in Java 8 with 
respect to invalid strings (here: null characters and unpaired surrogates).

In Java 7, these would roundtrip, as Java was using the so-called "modified 
UTF-8" encoding (see 
https://docs.oracle.com/javase/6/docs/api/java/io/DataInput.html#modified-utf-8).
 This will produce byte sequences that are *not* valid UTF-8.

Java 7 will read them back, but Java 8 will map the non-conforming byte 
sequences to the Unicode replacement character. Note that in particular, 
multiple child entries might get identical names as a consequence.

I'm not sure about the severity of this, and whether something needs to be done 
about it. AFAIC, this is another good reason to reject invalid strings as early 
as possible in the stack.

cc [~mduerig]
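The Java 7 modified-UTF-8 side cannot be reproduced on a modern JVM, but the Java 8 half of the problem can: with the standard UTF-8 charset, unpaired surrogates are replaced during encoding, so two distinct in-memory names collapse to the same persisted name after a round trip. This uses an explicit charset for reproducibility; {{String.getBytes()}} without arguments depends on the platform default:

```java
import java.nio.charset.StandardCharsets;

// Demonstrates how invalid strings (unpaired surrogates) fail to survive a
// getBytes()/new String() round trip on Java 8+, so multiple child entries
// can end up with identical names.
public class SurrogateRoundTrip {
    static String roundTrip(String s) {
        // Malformed chars are replaced by the encoder, not preserved.
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String a = "name-\uD800"; // unpaired high surrogate
        String b = "name-\uDC00"; // unpaired low surrogate
        // Distinct in memory...
        if (a.equals(b)) throw new AssertionError();
        // ...but identical after persisting and reading back.
        if (!roundTrip(a).equals(roundTrip(b))) throw new AssertionError();
    }
}
```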







[jira] [Updated] (OAK-5885) segment-tar should have a tarmkrecovery command

2018-02-22 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dürig updated OAK-5885:
---
Issue Type: New Feature  (was: Task)

> segment-tar should have a tarmkrecovery command
> ---
>
> Key: OAK-5885
> URL: https://issues.apache.org/jira/browse/OAK-5885
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: run, segment-tar
>Reporter: Andrei Dulceanu
>Assignee: Michael Dürig
>Priority: Minor
>  Labels: candidate_oak_1_8, tooling
> Fix For: 1.10
>
>
> {{oak-segment}} had a {{tarmkrecovery}} command responsible for listing 
> candidates for head journal entries. We should re-enable this for 
> {{oak-segment-tar}} as well.
> /cc [~mduerig] [~frm]





[jira] [Updated] (OAK-6941) Compatibility matrix for oak-run compact

2018-02-22 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-6941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dürig updated OAK-6941:
---
Fix Version/s: (was: 1.8.3)
   (was: 1.9.0)

> Compatibility matrix for oak-run compact
> 
>
> Key: OAK-6941
> URL: https://issues.apache.org/jira/browse/OAK-6941
> Project: Jackrabbit Oak
>  Issue Type: Documentation
>  Components: doc, run, segment-tar
>Reporter: Valentin Olteanu
>Priority: Major
>  Labels: documentation, tooling
> Fix For: 1.10
>
>
> h4. Problem statement
> For compacting the segmentstore using {{oak-run}}, the safest option is to 
> use the same version of {{oak-run}} as the Oak version used to generate the 
> repository. Yet, sometimes, a newer {{oak-run}} version is recommended to 
> benefit of bug fixes and improvements, but not every combination of source 
> repo and oak-run is safe to use and the user needs a way to check the 
> compatibility. Thus, the users need a tool that guides the decision of which 
> version to use.
> h4. Requirements
> * Easy to decide what {{oak-run}} version should be used for a certain Oak 
> version
> * Up to date with the latest releases
> * Machine readable for scripting
> * Include details on the benefits of using a certain version (release notes)
> * Blacklist of versions that should not be used (with alternatives)
> h4. Solution
> TBD





[jira] [Commented] (OAK-7109) rep:facet returns wrong results for complex queries

2018-02-22 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372914#comment-16372914
 ] 

Thomas Mueller commented on OAK-7109:
-

[~diru] OK I (think) I understand.

> Not sure if the index supports not()

For "contains", this is supported via "contains(..., '-exclude')". But for 
generic conditions, no it's not currently supported.

> rep:facet returns wrong results for complex queries
> ---
>
> Key: OAK-7109
> URL: https://issues.apache.org/jira/browse/OAK-7109
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Affects Versions: 1.6.7
>Reporter: Dirk Rudolph
>Priority: Major
>  Labels: facet
> Attachments: facetsInMultipleRoots.patch, 
> restrictionPropagationTest.patch
>
>
> Complex queries in this case are queries that are passed to Lucene without 
> all of the original constraints. For example, queries with multiple path 
> restrictions like:
> {code}
> select [rep:facet(simple/tags)] from [nt:base] as a where contains(a.[*], 
> 'ipsum') and (isdescendantnode(a,'/content1') or 
> isdescendantnode(a,'/content2'))
> {code}
> In that particular case the index planner gives ":fulltext:ipsum" to lucene 
> even though the index supports evaluating path constraints.
> As counting the facets happens on the raw lucene result, the returned 
> facets are incorrect. For example, given the following content
> {code}
> /content1/test/foo
>  + text = lorem ipsum
>  - simple/
>   + tags = tag1, tag2
> /content2/test/bar
>  + text = lorem ipsum
>  - simple/
>   + tags = tag1, tag2
> /content3/test/bar
>  + text = lorem ipsum
>  - simple/
>   + tags = tag1, tag2
> {code}
> the expected result for the dimensions of simple/tags and the query above is 
> - tag1: 2
> - tag2: 2
> as the result set contains 2 results and all documents are equal. The actual 
> result is
> - tag1: 3
> - tag2: 3
> as the path constraint is not handled by lucene.
> To work around that, the only solution that came to my mind is building the 
> [disjunctive normal 
> form|https://en.wikipedia.org/wiki/Disjunctive_normal_form] of my complex 
> query and executing a query for each of the disjunctive statements. As this 
> expands exponentially, it is only a theoretical solution, nothing for 
> production.
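As a hedged sketch of the workaround described above (class and method names are illustrative assumptions): one query is run per disjunct, and the per-disjunct facet counts are summed. Summing is only correct because distinct path subtrees cannot share documents.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the DNF workaround: one facet-count map per
// disjunct (e.g. per isdescendantnode() subtree), merged by summation.
public class FacetMerge {
    public static Map<String, Integer> merge(List<Map<String, Integer>> perDisjunct) {
        Map<String, Integer> merged = new HashMap<>();
        for (Map<String, Integer> counts : perDisjunct) {
            // Sum counts per facet label across all disjunct results.
            counts.forEach((label, n) -> merged.merge(label, n, Integer::sum));
        }
        return merged;
    }
}
```

With the sample content above, the /content1 and /content2 queries each report tag1=1, tag2=1, and merging yields the expected tag1=2, tag2=2.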





[jira] [Commented] (OAK-7225) Replace AtomicCounter Supplier

2018-02-22 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372886#comment-16372886
 ] 

Julian Reschke commented on OAK-7225:
-

AFAIU, this is not a public API, right?

In any case, the context was to be able to upgrade Guava, not necessarily to 
get rid of it. As far as I can tell, {{Supplier}} has not been removed from 
Guava, so it's not even clear we need to get rid of it...

> Replace AtomicCounter Supplier
> --
>
> Key: OAK-7225
> URL: https://issues.apache.org/jira/browse/OAK-7225
> Project: Jackrabbit Oak
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.4.0, 1.6.0
>Reporter: Davide Giannella
>Assignee: Davide Giannella
>Priority: Major
> Attachments: OAK-7225-0.diff
>
>
> In the 
> [AtomicCounter|https://github.com/apache/jackrabbit-oak/blob/7a7aa1e5d4f53f5bfb410f58264c237b288f5c74/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/atomic/AtomicCounterEditorProvider.java#L121]
>  we use guava's Supplier which should be trivially replaced by the JDK8 
> [java.util.function.Supplier|https://docs.oracle.com/javase/8/docs/api/java/util/function/Supplier.html].
> In case of backports to Oak 1.4, and therefore Java 7, it should be possible 
> to work around the FunctionalInterface with a utility class.
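The replacement is indeed close to a one-liner; the sketch below (class and field names are illustrative, not the actual Oak code) shows the JDK 8 equivalent of the Guava interface:

```java
import java.util.function.Supplier;

// Illustrative: java.util.function.Supplier is a drop-in functional
// replacement for Guava's com.google.common.base.Supplier; both expose
// a single get() method.
public class SupplierSwap {
    // Before: com.google.common.base.Supplier<Long>
    // After:  java.util.function.Supplier<Long>
    static final Supplier<Long> CLOCK = System::currentTimeMillis;

    public static long now() {
        return CLOCK.get();
    }
}
```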





[jira] [Created] (OAK-7278) Update release notes of old branches with pointer to roadmap

2018-02-22 Thread Julian Reschke (JIRA)
Julian Reschke created OAK-7278:
---

 Summary: Update release notes of old branches with pointer to 
roadmap
 Key: OAK-7278
 URL: https://issues.apache.org/jira/browse/OAK-7278
 Project: Jackrabbit Oak
  Issue Type: Task
Reporter: Julian Reschke
Assignee: Julian Reschke








[jira] [Commented] (OAK-7272) improve BackgroundLeaseUpdate warning messages

2018-02-22 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372834#comment-16372834
 ] 

Julian Reschke commented on OAK-7272:
-

trunk: [r1825065|http://svn.apache.org/r1825065]

> improve BackgroundLeaseUpdate warning messages
> --
>
> Key: OAK-7272
> URL: https://issues.apache.org/jira/browse/OAK-7272
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: documentmk
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
>  Labels: candidate_oak_1_8
> Fix For: 1.9.0, 1.10
>
> Attachments: OAK-7272.diff, OAK-7272.diff
>
>
> Example for current logging:
> {noformat}
> *WARN* [DocumentNodeStore lease update thread (1)] 
> org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore 
> BackgroundLeaseUpdate.execute: time since last renewClusterIdLease() call 
> longer than expected: 5338ms
> {noformat}
> Source:
> {noformat}
> @Override
> protected void execute(@Nonnull DocumentNodeStore nodeStore) {
> // OAK-4859 : keep track of invocation time of renewClusterIdLease
> // and warn if time since last call is longer than 5sec
> final long now = System.currentTimeMillis();
> if (lastRenewClusterIdLeaseCall <= 0) {
> lastRenewClusterIdLeaseCall = now;
> } else {
> final long diff = now - lastRenewClusterIdLeaseCall;
> if (diff > 5000) {
> LOG.warn("BackgroundLeaseUpdate.execute: time since last 
> renewClusterIdLease() call longer than expected: {}ms", diff);
> }
> lastRenewClusterIdLeaseCall = now;
> }
> // first renew the clusterId lease
> nodeStore.renewClusterIdLease();
> }
> {noformat}
> Observations:
> - the warning message doesn't actually say what the expected delay is
> - we only log when it's exceeded by a factor of 5
> - the threshold is hardwired; it should be computed based on the actual 
> config (I think)
> Also:
> - we don't measure the time of the actual update operation, so we don't know 
> whether it's a thread scheduling problem or a persistence problem (again, I 
> think)
> [~egli], [~mreutegg] - feedback appreciated.
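A hedged sketch of how the first two observations could be addressed (the method and its signature are hypothetical, not the actual fix): state the expected interval in the message, take it from configuration rather than hardwiring 5000, and report how long the update itself took.

```java
// Hypothetical sketch only: make the threshold configurable, state it in
// the warning text, and include the measured duration of the update call.
// Names do not reflect the actual Oak code.
public class LeaseUpdateCheck {
    // Returns the warning text, or null when within the expected interval.
    public static String warnIfLate(long sinceLastMs, long expectedMs, long updateTookMs) {
        if (sinceLastMs <= expectedMs) {
            return null; // nothing to log
        }
        return String.format(
            "time since last renewClusterIdLease() call longer than expected:"
            + " %dms (expected <= %dms, update itself took %dms)",
            sinceLastMs, expectedMs, updateTookMs);
    }
}
```

Including the update duration would also help distinguish thread-scheduling delays from slow persistence, per the last observation.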





[jira] [Resolved] (OAK-7272) improve BackgroundLeaseUpdate warning messages

2018-02-22 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke resolved OAK-7272.
-
   Resolution: Fixed
Fix Version/s: 1.9.0

> improve BackgroundLeaseUpdate warning messages
> --
>
> Key: OAK-7272
> URL: https://issues.apache.org/jira/browse/OAK-7272
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: documentmk
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
>  Labels: candidate_oak_1_8
> Fix For: 1.9.0, 1.10
>
> Attachments: OAK-7272.diff, OAK-7272.diff
>
>





[jira] [Updated] (OAK-7272) improve BackgroundLeaseUpdate warning messages

2018-02-22 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-7272:

Labels: candidate_oak_1_8  (was: )

> improve BackgroundLeaseUpdate warning messages
> --
>
> Key: OAK-7272
> URL: https://issues.apache.org/jira/browse/OAK-7272
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: documentmk
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
>  Labels: candidate_oak_1_8
> Fix For: 1.9.0, 1.10
>
> Attachments: OAK-7272.diff, OAK-7272.diff
>
>





[jira] [Commented] (OAK-7225) Replace AtomicCounter Supplier

2018-02-22 Thread Davide Giannella (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372825#comment-16372825
 ] 

Davide Giannella commented on OAK-7225:
---

[~reschke] I've replaced the exposed part of Guava with the OOTB Java bits. The 
changes are trivial and you can review them in [^OAK-7225-0.diff] or on the 
[git 
branch|https://github.com/davidegiannella/jackrabbit-oak/compare/trunk...davidegiannella:OAK-7225?expand=1].

I've looked a bit more into the usages of Guava and potential replacements. The 
trickiest part is replacing the 
[ThreadFactoryBuilder|https://github.com/davidegiannella/jackrabbit-oak/blob/OAK-7225/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/atomic/AtomicCounterEditorProvider.java#L175]
 which provides a handy way to name threads in the Executor. It should be 
possible, though, to replace it with a simple custom implementation extending 
Java's default ThreadFactory.
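Such a custom implementation might look like this (a sketch only; the class name and naming scheme are assumptions, not the proposed patch):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicLong;

// Plain-JDK stand-in for Guava's ThreadFactoryBuilder: delegate thread
// creation to the default factory and only rename the created threads.
public class NamedThreadFactory implements ThreadFactory {
    private final ThreadFactory delegate = Executors.defaultThreadFactory();
    private final AtomicLong counter = new AtomicLong();
    private final String prefix;

    public NamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        Thread t = delegate.newThread(r);
        t.setName(prefix + "-" + counter.incrementAndGet());
        return t;
    }
}
```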

Most of the Guava code I found in the atomic counter could be replaced: 
checkNotNull vs Objects.requireNonNull, ImmutableSet etc. vs 
Collections.unmodifiableSet()... But I don't actually know whether we could get 
rid of Guava entirely for this functionality, and that goes beyond the scope of 
this issue.
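For instance, the swaps mentioned above could look like this (a sketch; the helper name is hypothetical):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;

// Illustrative one-to-one swaps of the Guava calls mentioned above.
public class GuavaSwaps {
    // Guava: ImmutableSet.copyOf(checkNotNull(values))
    // JDK:   Objects.requireNonNull(...) + Collections.unmodifiableSet(...)
    public static Set<String> frozenSet(String... values) {
        Objects.requireNonNull(values, "values must not be null");
        return Collections.unmodifiableSet(new LinkedHashSet<>(Arrays.asList(values)));
    }
}
```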

Thoughts?

> Replace AtomicCounter Supplier
> --
>
> Key: OAK-7225
> URL: https://issues.apache.org/jira/browse/OAK-7225
> Project: Jackrabbit Oak
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.4.0, 1.6.0
>Reporter: Davide Giannella
>Assignee: Davide Giannella
>Priority: Major
> Attachments: OAK-7225-0.diff
>
>





[jira] [Updated] (OAK-7225) Replace AtomicCounter Supplier

2018-02-22 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-7225:
--
Attachment: OAK-7225-0.diff

> Replace AtomicCounter Supplier
> --
>
> Key: OAK-7225
> URL: https://issues.apache.org/jira/browse/OAK-7225
> Project: Jackrabbit Oak
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.4.0, 1.6.0
>Reporter: Davide Giannella
>Assignee: Davide Giannella
>Priority: Major
> Attachments: OAK-7225-0.diff
>
>





[jira] [Updated] (OAK-7277) Expose principal names, path and requested privilege in javax.jcr.AccessDeniedException

2018-02-22 Thread Konrad Windszus (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konrad Windszus updated OAK-7277:
-
Description: 
Currently the error message for {{javax.jcr.AccessDeniedException}} is pretty 
sparse; it could look like this:
OakAccess: Access denied

The full stacktrace might look like this:
{code}
javax.jcr.AccessDeniedException: OakAccess: Access denied
at 
org.apache.jackrabbit.oak.api.CommitFailedException.asRepositoryException(CommitFailedException.java:231)
at 
org.apache.jackrabbit.oak.api.CommitFailedException.asRepositoryException(CommitFailedException.java:212)
at 
org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.newRepositoryException(SessionDelegate.java:670)
at 
org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.save(SessionDelegate.java:496)
at 
org.apache.jackrabbit.oak.jcr.session.SessionImpl$8.performVoid(SessionImpl.java:419)
at 
org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.performVoid(SessionDelegate.java:274)
at 
org.apache.jackrabbit.oak.jcr.session.SessionImpl.save(SessionImpl.java:416)
at 
com.adobe.granite.repository.impl.CRX3SessionImpl.save(CRX3SessionImpl.java:208)
at 
com.day.cq.wcm.core.impl.PageManagerImpl.createRevision(PageManagerImpl.java:1395)
... 115 common frames omitted
Caused by: org.apache.jackrabbit.oak.api.CommitFailedException: OakAccess: 
Access denied
at 
org.apache.jackrabbit.oak.security.authorization.permission.PermissionValidator.checkPermissions(PermissionValidator.java:242)
at 
org.apache.jackrabbit.oak.security.authorization.permission.PermissionValidator.propertyAdded(PermissionValidator.java:112)
at 
org.apache.jackrabbit.oak.spi.commit.VisibleValidator.propertyAdded(VisibleValidator.java:83)
at 
org.apache.jackrabbit.oak.spi.commit.VisibleValidator.propertyAdded(VisibleValidator.java:83)
at 
org.apache.jackrabbit.oak.spi.commit.VisibleValidator.propertyAdded(VisibleValidator.java:83)
at 
org.apache.jackrabbit.oak.spi.commit.VisibleValidator.propertyAdded(VisibleValidator.java:83)
at 
org.apache.jackrabbit.oak.spi.commit.VisibleValidator.propertyAdded(VisibleValidator.java:83)
at 
org.apache.jackrabbit.oak.spi.commit.VisibleValidator.propertyAdded(VisibleValidator.java:83)
at 
org.apache.jackrabbit.oak.spi.commit.VisibleValidator.propertyAdded(VisibleValidator.java:83)
at 
org.apache.jackrabbit.oak.spi.commit.CompositeEditor.propertyAdded(CompositeEditor.java:83)
at 
org.apache.jackrabbit.oak.spi.commit.EditorDiff.propertyAdded(EditorDiff.java:82)
at 
org.apache.jackrabbit.oak.segment.SegmentNodeState.compareProperties(SegmentNodeState.java:617)
at 
org.apache.jackrabbit.oak.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:515)
at 
org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
at 
org.apache.jackrabbit.oak.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:555)
at 
org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
at 
org.apache.jackrabbit.oak.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:555)
at 
org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
at org.apache.jackrabbit.oak.segment.MapRecord.compare(MapRecord.java:415)
at 
org.apache.jackrabbit.oak.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:608)
at 
org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
at org.apache.jackrabbit.oak.segment.MapRecord.compare(MapRecord.java:415)
at 
org.apache.jackrabbit.oak.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:608)
at 
org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
at 
org.apache.jackrabbit.oak.segment.MapRecord$3.childNodeChanged(MapRecord.java:441)
at org.apache.jackrabbit.oak.segment.MapRecord.compare(MapRecord.java:489)
at 
org.apache.jackrabbit.oak.segment.MapRecord.compareBranch(MapRecord.java:568)
at org.apache.jackrabbit.oak.segment.MapRecord.compare(MapRecord.java:467)
at org.apache.jackrabbit.oak.segment.MapRecord.compare(MapRecord.java:433)
at 
org.apache.jackrabbit.oak.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:608)
at 
org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
at org.apache.jackrabbit.oak.segment.MapRecord.compare(MapRecord.java:415)
at 
org.apache.jackrabbit.oak.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:608)
at 
org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
at 
org.apache.jackrabbit.oak.segment.MapRecord$2.childNodeChanged(MapRecord.java:400)
at 

[jira] [Updated] (OAK-7277) Expose principal names, path and requested privilege in javax.jcr.AccessDeniedException

2018-02-22 Thread Konrad Windszus (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konrad Windszus updated OAK-7277:
-
Summary: Expose principal names, path and requested privilege in 
javax.jcr.AccessDeniedException  (was: Expose principal name, path and 
requested privilege in javax.jcr.AccessDeniedException)
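A hedged sketch of the kind of message this issue asks for (class, method, and parameter names are hypothetical; the actual change would live in or near Oak's PermissionValidator):

```java
import java.util.Set;

// Hypothetical sketch only: enrich the "OakAccess: Access denied" text
// with the denied path, the requested privilege, and the principal names
// of the failing session. Not actual Oak code.
public class DeniedMessage {
    public static String format(String path, String privilege, Set<String> principals) {
        return String.format(
            "OakAccess: Access denied (path=%s, privilege=%s, principals=%s)",
            path, privilege, principals);
    }
}
```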

> Expose principal names, path and requested privilege in 
> javax.jcr.AccessDeniedException
> ---
>
> Key: OAK-7277
> URL: https://issues.apache.org/jira/browse/OAK-7277
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Affects Versions: 1.6.1
>Reporter: Konrad Windszus
>Priority: Major
>

[jira] [Updated] (OAK-7277) Expose principal name, path and requested privilege in javax.jcr.AccessDeniedException

2018-02-22 Thread Konrad Windszus (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konrad Windszus updated OAK-7277:
-
Summary: Expose principal name, path and requested privilege in 
javax.jcr.AccessDeniedException  (was: Expose authorizable id, path and 
requested privilege in javax.jcr.AccessDeniedException)

> Expose principal name, path and requested privilege in 
> javax.jcr.AccessDeniedException
> --
>
> Key: OAK-7277
> URL: https://issues.apache.org/jira/browse/OAK-7277
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Affects Versions: 1.6.1
>Reporter: Konrad Windszus
>Priority: Major
>

[jira] [Created] (OAK-7277) Expose authorizable id, path and request privilege in javax.jcr.AccessDeniedException

2018-02-22 Thread Konrad Windszus (JIRA)
Konrad Windszus created OAK-7277:


 Summary: Expose authorizable id, path and request privilege in 
javax.jcr.AccessDeniedException
 Key: OAK-7277
 URL: https://issues.apache.org/jira/browse/OAK-7277
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: jcr
Affects Versions: 1.6.1
Reporter: Konrad Windszus



[jira] [Updated] (OAK-7277) Expose authorizable id, path and requested privilege in javax.jcr.AccessDeniedException

2018-02-22 Thread Konrad Windszus (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konrad Windszus updated OAK-7277:
-
Summary: Expose authorizable id, path and requested privilege in 
javax.jcr.AccessDeniedException  (was: Expose authorizable id, path and request 
privilege in javax.jcr.AccessDeniedException)

> Expose authorizable id, path and requested privilege in 
> javax.jcr.AccessDeniedException
> ---
>
> Key: OAK-7277
> URL: https://issues.apache.org/jira/browse/OAK-7277
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Affects Versions: 1.6.1
>Reporter: Konrad Windszus
>Priority: Major
>
> Currently the error message for {{javax.jcr.AccessDeniedException}} is pretty 
> sparse; it can look like this:
> OakAccess: Access denied
> The full stack trace might look like this:
> {code}
> javax.jcr.AccessDeniedException: OakAccess: Access denied
> at org.apache.jackrabbit.oak.api.CommitFailedException.asRepositoryException(CommitFailedException.java:231)
> at org.apache.jackrabbit.oak.api.CommitFailedException.asRepositoryException(CommitFailedException.java:212)
> at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.newRepositoryException(SessionDelegate.java:670)
> at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.save(SessionDelegate.java:496)
> at org.apache.jackrabbit.oak.jcr.session.SessionImpl$8.performVoid(SessionImpl.java:419)
> at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.performVoid(SessionDelegate.java:274)
> at org.apache.jackrabbit.oak.jcr.session.SessionImpl.save(SessionImpl.java:416)
> at com.adobe.granite.repository.impl.CRX3SessionImpl.save(CRX3SessionImpl.java:208)
> at com.day.cq.wcm.core.impl.PageManagerImpl.createRevision(PageManagerImpl.java:1395)
> ... 115 common frames omitted
> Caused by: org.apache.jackrabbit.oak.api.CommitFailedException: OakAccess: Access denied
> at org.apache.jackrabbit.oak.security.authorization.permission.PermissionValidator.checkPermissions(PermissionValidator.java:242)
> at org.apache.jackrabbit.oak.security.authorization.permission.PermissionValidator.propertyAdded(PermissionValidator.java:112)
> at org.apache.jackrabbit.oak.spi.commit.VisibleValidator.propertyAdded(VisibleValidator.java:83)
> at org.apache.jackrabbit.oak.spi.commit.VisibleValidator.propertyAdded(VisibleValidator.java:83)
> at org.apache.jackrabbit.oak.spi.commit.VisibleValidator.propertyAdded(VisibleValidator.java:83)
> at org.apache.jackrabbit.oak.spi.commit.VisibleValidator.propertyAdded(VisibleValidator.java:83)
> at org.apache.jackrabbit.oak.spi.commit.VisibleValidator.propertyAdded(VisibleValidator.java:83)
> at org.apache.jackrabbit.oak.spi.commit.VisibleValidator.propertyAdded(VisibleValidator.java:83)
> at org.apache.jackrabbit.oak.spi.commit.VisibleValidator.propertyAdded(VisibleValidator.java:83)
> at org.apache.jackrabbit.oak.spi.commit.CompositeEditor.propertyAdded(CompositeEditor.java:83)
> at org.apache.jackrabbit.oak.spi.commit.EditorDiff.propertyAdded(EditorDiff.java:82)
> at org.apache.jackrabbit.oak.segment.SegmentNodeState.compareProperties(SegmentNodeState.java:617)
> at org.apache.jackrabbit.oak.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:515)
> at org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
> at org.apache.jackrabbit.oak.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:555)
> at org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
> at org.apache.jackrabbit.oak.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:555)
> at org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
> at org.apache.jackrabbit.oak.segment.MapRecord.compare(MapRecord.java:415)
> at org.apache.jackrabbit.oak.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:608)
> at org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
> at org.apache.jackrabbit.oak.segment.MapRecord.compare(MapRecord.java:415)
> at org.apache.jackrabbit.oak.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:608)
> at org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
> at org.apache.jackrabbit.oak.segment.MapRecord$3.childNodeChanged(MapRecord.java:441)
> at org.apache.jackrabbit.oak.segment.MapRecord.compare(MapRecord.java:489)
> at 
> 
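The improvement requested above could carry the denied user, the item path and the checked privilege in the message. A minimal, self-contained sketch of the idea follows; none of these names (deniedMessage, its parameters) come from the Oak codebase, they are purely illustrative:

```java
// Hypothetical sketch: the kind of context an improved access-denied
// message could carry, instead of the bare "OakAccess: Access denied".
public class AccessDeniedMessageDemo {

    // Build a message naming the denied principal, the item path and the
    // privilege that was checked.
    static String deniedMessage(String authorizableId, String itemPath, String privilege) {
        return String.format(
                "OakAccess: Access denied (user '%s' lacks privilege '%s' on '%s')",
                authorizableId, privilege, itemPath);
    }

    public static void main(String[] args) {
        System.out.println(
                deniedMessage("anonymous", "/content/site/page", "jcr:addChildNodes"));
    }
}
```

Such a message would let operators identify the failing principal and path directly from the log, without reproducing the commit under a debugger.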

[jira] [Comment Edited] (OAK-7272) improve BackgroundLeaseUpdate warning messages

2018-02-22 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371754#comment-16371754
 ] 

Julian Reschke edited comment on OAK-7272 at 2/22/18 9:42 AM:
--

Updated patch in 
https://issues.apache.org/jira/secure/attachment/12911409/OAK-7272.diff - uses 
DocumentNodeStore's clock. (cc [~mreutegg])


was (Author: reschke):
Updated patch 
inhttps://issues.apache.org/jira/secure/attachment/12911409/OAK-7272.diff - 
uses DocumentNodeStore's clock. (cc [~mreutegg])

> improve BackgroundLeaseUpdate warning messages
> --
>
> Key: OAK-7272
> URL: https://issues.apache.org/jira/browse/OAK-7272
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: documentmk
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
> Fix For: 1.10
>
> Attachments: OAK-7272.diff, OAK-7272.diff
>
>
> Example for current logging:
> {noformat}
> *WARN* [DocumentNodeStore lease update thread (1)] 
> org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore 
> BackgroundLeaseUpdate.execute: time since last renewClusterIdLease() call 
> longer than expected: 5338ms
> {noformat}
> Source:
> {noformat}
> @Override
> protected void execute(@Nonnull DocumentNodeStore nodeStore) {
>     // OAK-4859 : keep track of invocation time of renewClusterIdLease
>     // and warn if time since last call is longer than 5sec
>     final long now = System.currentTimeMillis();
>     if (lastRenewClusterIdLeaseCall <= 0) {
>         lastRenewClusterIdLeaseCall = now;
>     } else {
>         final long diff = now - lastRenewClusterIdLeaseCall;
>         if (diff > 5000) {
>             LOG.warn("BackgroundLeaseUpdate.execute: time since last renewClusterIdLease() call longer than expected: {}ms", diff);
>         }
>         lastRenewClusterIdLeaseCall = now;
>     }
>     // first renew the clusterId lease
>     nodeStore.renewClusterIdLease();
> }
> {noformat}
> Observations:
> - the warning message doesn't actually say what the expected delay is
> - we only log when the expected delay is exceeded by a factor of 5
> - the threshold is hardwired; it should be computed based on the actual 
> config (I think)
> Also:
> - we don't measure the time of the actual update operation, so we don't know 
> whether it's a thread scheduling problem or a persistence problem (again, I 
> think)
> [~egli], [~mreutegg] - feedback appreciated.
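The observations above can be sketched as follows. This is an illustrative, self-contained model, not the actual DocumentNodeStore code: the configured delay (ASYNC_DELAY_MILLIS), the derived threshold, and the renewLease parameter are all hypothetical stand-ins.

```java
// Sketch: compute the warning threshold from the configured delay instead of
// hardwiring 5000ms, state the expectation in the message, and time the update
// itself so scheduling delays can be told apart from slow persistence.
public class LeaseUpdateDemo {

    static final long ASYNC_DELAY_MILLIS = 1000;          // hypothetical configured delay
    static final long THRESHOLD = ASYNC_DELAY_MILLIS * 5; // derived, not hardwired

    static long lastCall = -1;

    // Returns a warning string when the gap or the update duration is
    // suspicious, or null when everything is within bounds.
    static String execute(long now, Runnable renewLease) {
        String warning = null;
        if (lastCall > 0) {
            long gap = now - lastCall;
            if (gap > THRESHOLD) {
                warning = String.format(
                        "time since last renewClusterIdLease() call longer than expected "
                        + "(expected ~%dms, actual %dms)", ASYNC_DELAY_MILLIS, gap);
            }
        }
        lastCall = now;
        long start = System.nanoTime();
        renewLease.run();
        long tookMillis = (System.nanoTime() - start) / 1_000_000;
        if (warning == null && tookMillis > THRESHOLD) {
            warning = String.format("renewClusterIdLease() itself took %dms", tookMillis);
        }
        return warning;
    }

    public static void main(String[] args) {
        System.out.println(execute(1000, () -> {})); // first call: no baseline yet
        System.out.println(execute(7000, () -> {})); // 6000ms gap exceeds the 5000ms threshold
    }
}
```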



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (OAK-6392) Partial lastRev update with branches disabled

2018-02-22 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-6392:
--
Affects Version/s: 1.0
                   1.2
                   1.4.0
Fix Version/s: 1.4.21
               1.6.10
               1.0.42
               1.2.29

Merged fix into branches:

- 1.6: http://svn.apache.org/r1825035
- 1.4: http://svn.apache.org/r1825036
- 1.2: http://svn.apache.org/r1825037
- 1.0: http://svn.apache.org/r1825038

> Partial lastRev update with branches disabled
> -
>
> Key: OAK-6392
> URL: https://issues.apache.org/jira/browse/OAK-6392
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: documentmk
>Affects Versions: 1.0, 1.2, 1.4.0, 1.6.0
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
>Priority: Minor
> Fix For: 1.7.3, 1.8.0, 1.2.29, 1.0.42, 1.6.10, 1.4.21
>
>
> When {{DocumentMK.Builder.disableBranches(true)}} is used, the background 
> update may write back partial _lastRev batches. This happens because the 
> commit is not guarded by the background operation lock.
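The guarding idea described above can be sketched with a shared read-write lock: commits hold the read side (so they may run concurrently with each other), while the background _lastRev update takes the write side and therefore never observes a half-written batch. All names below are illustrative, not the actual DocumentNodeStore fields.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Minimal model of a "background operation lock" guarding commits.
public class BackgroundLockDemo {

    static final ReadWriteLock backgroundOperationLock = new ReentrantReadWriteLock();

    static void commit(Runnable work) {
        backgroundOperationLock.readLock().lock(); // many commits may run concurrently
        try {
            work.run();
        } finally {
            backgroundOperationLock.readLock().unlock();
        }
    }

    static void backgroundUpdate(Runnable work) {
        backgroundOperationLock.writeLock().lock(); // excludes all in-flight commits
        try {
            work.run();
        } finally {
            backgroundOperationLock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        commit(() -> log.append("commit;"));
        backgroundUpdate(() -> log.append("lastRevUpdate;"));
        System.out.println(log);
    }
}
```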



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-7259) Improve SegmentNodeStoreStats to include number of commits per thread and threads currently waiting on the semaphore

2018-02-22 Thread Francesco Mari (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372562#comment-16372562
 ] 

Francesco Mari commented on OAK-7259:
-

{quote}Just writing this, I realized that SegmentNodeStoreStats#onCommit is 
only called with the commit semaphore acquired so it won't need additional 
synchronization (in the current implementation of LockBasedScheduler). As for 
onCommitQueued and onCommitDequeued, going forward with the current approach 
means that getQueuedWriters might return an incomplete view of the currently 
queued threads, which should be ok, since this is only about monitoring.{quote}

If no other thread is reading those data structures, every thread passing 
through {{onCommit}} will find the lock free, so there will be no noticeable 
performance overhead for the threads in the commit queue. Thread-safety matters 
so that inconsistent data is not reported when both reader and writer threads 
access the data structure.
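The reader/writer concern above can be addressed with concurrent collections instead of a lock. The sketch below is illustrative only (it is not the actual SegmentNodeStoreStats API); it models per-thread commit counts and the set of currently queued writers so that a monitoring thread can read them at any time without seeing an inconsistent state.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Model of thread-safe commit statistics for monitoring.
public class CommitStatsDemo {

    private final Map<String, LongAdder> commitsPerThread = new ConcurrentHashMap<>();
    private final Set<String> queuedWriters = ConcurrentHashMap.newKeySet();

    void onCommitQueued(String threadName)   { queuedWriters.add(threadName); }
    void onCommitDequeued(String threadName) { queuedWriters.remove(threadName); }

    void onCommit(String threadName) {
        // computeIfAbsent + LongAdder keeps the hot path cheap and thread-safe
        commitsPerThread.computeIfAbsent(threadName, k -> new LongAdder()).increment();
    }

    long commitCount(String threadName) {
        LongAdder adder = commitsPerThread.get(threadName);
        return adder == null ? 0 : adder.sum();
    }

    Set<String> currentlyQueued() { return queuedWriters; }

    public static void main(String[] args) {
        CommitStatsDemo stats = new CommitStatsDemo();
        stats.onCommitQueued("writer-1");
        stats.onCommitDequeued("writer-1");
        stats.onCommit("writer-1");
        stats.onCommit("writer-1");
        System.out.println(stats.commitCount("writer-1")); // prints 2
    }
}
```

A monitoring reader may briefly miss a writer that is being queued, which is acceptable here since the data is only used for observation, as the comment notes.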



> Improve SegmentNodeStoreStats to include number of commits per thread and 
> threads currently waiting on the semaphore
> 
>
> Key: OAK-7259
> URL: https://issues.apache.org/jira/browse/OAK-7259
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: tooling
> Fix For: 1.9.0, 1.10
>
>
> When investigating the performance of {{segment-tar}}, knowing the source of 
> the writes (commits) is a very useful indicator of the cause.
> To better understand which threads are currently writing in the repository 
> and which are blocked on the semaphore, we need to improve 
> {{SegmentNodeStoreStats}} to:
>  * expose the number of commits executed per thread
>  * expose threads currently waiting on the semaphore



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)