[jira] [Updated] (OAK-5655) TarMK: Analyse locality of reference

2017-10-31 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dürig updated OAK-5655:
---
Attachment: compaction-time-vs-reposize.m

> TarMK: Analyse locality of reference 
> -
>
> Key: OAK-5655
> URL: https://issues.apache.org/jira/browse/OAK-5655
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar
>Reporter: Michael Dürig
>  Labels: scalability
> Fix For: 1.8
>
> Attachments: compaction-time-vs-reposize.m, 
> compaction-time-vs.reposize.png, data00053a.tar-reads.png, offrc.jfr, 
> segment-per-path-compacted-nocache.png, 
> segment-per-path-compacted-nostringcache.png, segment-per-path-compacted.png, 
> segment-per-path.png
>
>
> We need to better understand the locality aspects of content stored in TarMK: 
> * How is related content spread over segments?
> * What content do we consider related? 
> * How does locality of related content develop over time when changes are 
> applied?
> * What changes do we consider typical?
> * What is the impact of compaction on locality? 
> * What is the impact of the deduplication caches on locality (during normal 
> operation and during compaction)?
> * How well are checkpoints deduplicated? Can we monitor this online?
> * ...



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (OAK-5655) TarMK: Analyse locality of reference

2017-10-31 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226764#comment-16226764
 ] 

Michael Dürig edited comment on OAK-5655 at 10/31/17 1:32 PM:
--

In another analysis I ran offline compaction on a repository (17.5GB footprint 
compacting to 564MB, 4M nodes). The process took 20 min to complete. Running 
offline compaction again on the result took just 50 sec to complete. While this 
test is a bit artificial, as the repository consists of completely random 
content created by {{SegmentCompactionIT}}, it still indicates that the process 
is thrashing in reads caused by bad locality. 

To better understand the connection between repository size and compaction 
time I ran offline compaction with memory mapped files on and off, graphing 
compaction time against compacted repository size:

!compaction-time-vs.reposize.png|width=400!

[^compaction-time-vs-reposize.m]

Compaction times increase super-linearly, and {{mmap=on}} is clearly superior 
to {{mmap=off}}. 

To validate the hypothesis that the process is (read) IO bound, I took a JMC 
[flight recording|^offrc.jfr] of an offline compaction of the same repository 
with {{mmap=false}}. The flight recording shows that the process spends almost 
99% of its time in {{java.io.RandomAccessFile.read()}}, and all of these calls 
originate from segment reads. Furthermore, the segment reads are spread more 
or less evenly across time and across all 50 tar files.

The image below shows the reads in a 4 minute interval from 
{{data00053a.tar}}. Reads from other tar files look similar:

!data00053a.tar-reads.png|width=600!
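The following is a minimal, self-contained sketch of the access pattern discussed 
above, not Oak code: it performs the same random 256 KB "segment" reads from a 
single tar file once through {{RandomAccessFile}} (the {{mmap=off}} path) and once 
through a {{MappedByteBuffer}} (the {{mmap=on}} path). Class name, segment size 
and read count are made up for illustration, and it assumes the tar file fits into 
a single mapping (< 2 GB).

{code:java}
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Random;

public class SegmentReadSketch {

    static final int SEGMENT_SIZE = 256 * 1024; // illustrative "segment" size

    // mmap=off: every read pays a seek + read syscall against the file.
    static long readWithRandomAccessFile(Path tar, long[] offsets) throws Exception {
        byte[] buf = new byte[SEGMENT_SIZE];
        long checksum = 0;
        try (RandomAccessFile raf = new RandomAccessFile(tar.toFile(), "r")) {
            for (long offset : offsets) {
                raf.seek(offset);
                raf.readFully(buf);
                checksum += buf[0];
            }
        }
        return checksum;
    }

    // mmap=on: reads are served from the page cache once the mapping is warm.
    static long readWithMmap(Path tar, long[] offsets) throws Exception {
        byte[] buf = new byte[SEGMENT_SIZE];
        long checksum = 0;
        try (FileChannel channel = FileChannel.open(tar, StandardOpenOption.READ)) {
            MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            for (long offset : offsets) {
                ByteBuffer view = map.duplicate();
                view.position((int) offset);
                view.get(buf, 0, SEGMENT_SIZE);
                checksum += buf[0];
            }
        }
        return checksum;
    }

    public static void main(String[] args) throws Exception {
        Path tar = Paths.get(args[0]);   // e.g. a copy of data00053a.tar
        long size = tar.toFile().length();
        Random random = new Random(42);
        long[] offsets = new long[1000]; // random, i.e. badly localized, reads
        for (int i = 0; i < offsets.length; i++) {
            offsets[i] = (long) (random.nextDouble() * (size - SEGMENT_SIZE));
        }
        long t0 = System.nanoTime();
        readWithRandomAccessFile(tar, offsets);
        long t1 = System.nanoTime();
        readWithMmap(tar, offsets);
        long t2 = System.nanoTime();
        System.out.printf("raf: %d ms, mmap: %d ms%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
    }
}
{code}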





was (Author: mduerig):
In another analysis I ran offline compaction on a repository (17.5GB footprint 
compacting to 564MB, 4M nodes). The process took 20 min to complete. Running 
offline compaction again on the result took just 50 sec to complete. While this 
test is a bit artificial, as the repository consists of completely random 
content created by {{SegmentCompactionIT}}, it still indicates that the process 
is thrashing in reads caused by bad locality. 

To better understand the connection between repository size and compaction 
time I ran offline compaction with memory mapped files on and off, graphing 
compaction time against compacted repository size:

!compaction-time-vs.reposize.png|width=400!

Compaction times increase super-linearly, and {{mmap=on}} is clearly superior 
to {{mmap=off}}. 

To validate the hypothesis that the process is (read) IO bound, I took a JMC 
[flight recording|^offrc.jfr] of an offline compaction of the same repository 
with {{mmap=false}}. The flight recording shows that the process spends almost 
99% of its time in {{java.io.RandomAccessFile.read()}}, and all of these calls 
originate from segment reads. Furthermore, the segment reads are spread more 
or less evenly across time and across all 50 tar files.

The image below shows the reads in a 4 minute interval from 
{{data00053a.tar}}. Reads from other tar files look similar:

!data00053a.tar-reads.png|width=600!




> TarMK: Analyse locality of reference 
> -
>
> Key: OAK-5655
> URL: https://issues.apache.org/jira/browse/OAK-5655
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar
>Reporter: Michael Dürig
>  Labels: scalability
> Fix For: 1.8
>
> Attachments: compaction-time-vs-reposize.m, 
> compaction-time-vs.reposize.png, data00053a.tar-reads.png, offrc.jfr, 
> segment-per-path-compacted-nocache.png, 
> segment-per-path-compacted-nostringcache.png, segment-per-path-compacted.png, 
> segment-per-path.png
>
>
> We need to better understand the locality aspects of content stored in TarMK: 
> * How is related content spread over segments?
> * What content do we consider related? 
> * How does locality of related content develop over time when changes are 
> applied?
> * What changes do we consider typical?
> * What is the impact of compaction on locality? 
> * What is the impact of the deduplication caches on locality (during normal 
> operation and during compaction)?
> * How well are checkpoints deduplicated? Can we monitor this online?
> * ...



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (OAK-6887) Change default value for autoCompact

2017-10-31 Thread Marcel Reutegger (JIRA)
Marcel Reutegger created OAK-6887:
-

 Summary: Change default value for autoCompact
 Key: OAK-6887
 URL: https://issues.apache.org/jira/browse/OAK-6887
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: documentmk
Reporter: Marcel Reutegger
Assignee: Marcel Reutegger
Priority: Minor
 Fix For: 1.8


The persistent cache has an option {{autoCompact}} which is related to 
{{compact}}. The former compacts the cache in a background thread, while the 
latter controls whether it should be done on close. OAK-2815 set the default 
for {{compact}} to disabled. Similarly, the default for {{autoCompact}} should 
also be disabled, unless a user wants to use the feature.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (OAK-6889) Followup on OAK-6755: fix OSGi component descriptors

2017-10-31 Thread Julian Sedding (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Sedding updated OAK-6889:

Affects Version/s: (was: 1.7.10)
   1.7.9

> Followup on OAK-6755: fix OSGi component descriptors
> 
>
> Key: OAK-6889
> URL: https://issues.apache.org/jira/browse/OAK-6889
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: core, security
>Affects Versions: 1.7.9
>Reporter: Julian Sedding
>Assignee: Julian Sedding
> Fix For: 1.7.11
>
>
> The fix for OAK-6755 (see also OAK-6741) caused some OSGi metadata to be lost 
> or otherwise incorrect.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (OAK-5655) TarMK: Analyse locality of reference

2017-10-31 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dürig updated OAK-5655:
---
Attachment: compaction-time-vs.reposize.png

> TarMK: Analyse locality of reference 
> -
>
> Key: OAK-5655
> URL: https://issues.apache.org/jira/browse/OAK-5655
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar
>Reporter: Michael Dürig
>  Labels: scalability
> Fix For: 1.8
>
> Attachments: compaction-time-vs.reposize.png, 
> segment-per-path-compacted-nocache.png, 
> segment-per-path-compacted-nostringcache.png, segment-per-path-compacted.png, 
> segment-per-path.png
>
>
> We need to better understand the locality aspects of content stored in TarMK: 
> * How is related content spread over segments?
> * What content do we consider related? 
> * How does locality of related content develop over time when changes are 
> applied?
> * What changes do we consider typical?
> * What is the impact of compaction on locality? 
> * What is the impact of the deduplication caches on locality (during normal 
> operation and during compaction)?
> * How well are checkpoints deduplicated? Can we monitor this online?
> * ...



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (OAK-6859) Schedule Revision GC in DocumentNodeStoreService

2017-10-31 Thread Chetan Mehrotra (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226918#comment-16226918
 ] 

Chetan Mehrotra commented on OAK-6859:
--

[~mreutegg] Would this task run on each cluster node? So far such tasks were run 
as singletons in a cluster, and we relied on the Sling Scheduler to ensure that

> Schedule Revision GC in DocumentNodeStoreService
> 
>
> Key: OAK-6859
> URL: https://issues.apache.org/jira/browse/OAK-6859
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
> Fix For: 1.8
>
>
> Introduce scheduling of the Revision GC task in DocumentNodeStoreService. 
> There are already other tasks scheduled, like Journal GC and recovery when 
> another cluster node crashes.
> I'd like to enable Continuous Revision GC on MongoDB by default and schedule 
> Revision GC on RDB once a day at 2 AM. This avoids the need to trigger the GC 
> externally.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (OAK-6829) ExternalPrivateStoreIT/ExternalSharedStoreIT.testSyncBigBlob failures

2017-10-31 Thread Francesco Mari (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226725#comment-16226725
 ] 

Francesco Mari commented on OAK-6829:
-

I added debug messages to show when and why a flush operation is skipped 
(r1813878). I also enabled the appropriate loggers when running tests 
(r1813879).

> ExternalPrivateStoreIT/ExternalSharedStoreIT.testSyncBigBlob failures
> -
>
> Key: OAK-6829
> URL: https://issues.apache.org/jira/browse/OAK-6829
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: 1.7.9
>Reporter: Julian Reschke
>Assignee: Francesco Mari
> Fix For: 1.8, 1.7.11
>
> Attachments: 
> TEST-org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT.xml, 
> TEST-org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT.xml, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt
>
>
> {noformat}
> testSyncBigBlob(org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT)
>   Time elapsed: 27.921 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<{ root = { ... } }> but was:<{ root : { } 
> }>
> Running org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT
> Tests run: 11, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 93.353 sec 
> <<< FAILURE! - in 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT
> testSyncBigBlob(org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT)
>   Time elapsed: 30.772 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<{ root = { ... } }> but was:<{ root : { } 
> }>
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (OAK-5655) TarMK: Analyse locality of reference

2017-10-31 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226764#comment-16226764
 ] 

Michael Dürig edited comment on OAK-5655 at 10/31/17 1:17 PM:
--

In another analysis I ran offline compaction on a repository (17.5GB footprint 
compacting to 564MB, 4M nodes). The process took 20 min to complete. Running 
offline compaction again on the result took just 50 sec to complete. While this 
test is a bit artificial, as the repository consists of completely random 
content created by {{SegmentCompactionIT}}, it still indicates that the process 
is thrashing in reads caused by bad locality. 

To better understand the connection between repository size and compaction 
time I ran offline compaction with memory mapped files on and off, graphing 
compaction time against compacted repository size:

!compaction-time-vs.reposize.png|width=400!

Compaction times increase super-linearly, and {{mmap=on}} is clearly superior 
to {{mmap=off}}. 

To validate the hypothesis that the process is (read) IO bound, I took a JMC 
[flight recording|^offrc.jfr] of an offline compaction of the same repository 
with {{mmap=false}}. The flight recording shows that the process spends almost 
99% of its time in {{java.io.RandomAccessFile.read()}}, and all of these calls 
originate from segment reads. Furthermore, the segment reads are spread more 
or less evenly across time and across all 50 tar files.





was (Author: mduerig):
In another analysis I ran offline compaction on a repository (17.5GB footprint 
compacting to 564MB, 4M nodes). The process took 20 min to complete. Running 
offline compaction again on the result took just 50 sec to complete. While this 
test is a bit artificial, as the repository consists of completely random 
content created by {{SegmentCompactionIT}}, it still indicates that the process 
is thrashing in reads caused by bad locality. 

To better understand the connection between repository size and compaction 
time I ran offline compaction with memory mapped files on and off, graphing 
compaction time against compacted repository size:

!compaction-time-vs.reposize.png|width=400!

Compaction times increase super-linearly, and {{mmap=on}} is clearly superior 
to {{mmap=off}}. 

To validate the hypothesis that the process is (read) IO bound, I took a JMC 
flight recording of an offline compaction of the same repository with 
{{mmap=false}}. The flight recording shows that the process spends almost 99% 
of its time in {{java.io.RandomAccessFile.read()}}, and all of these calls 
originate from segment reads. Furthermore, the segment reads are spread more 
or less evenly across time and across all 50 tar files.




> TarMK: Analyse locality of reference 
> -
>
> Key: OAK-5655
> URL: https://issues.apache.org/jira/browse/OAK-5655
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar
>Reporter: Michael Dürig
>  Labels: scalability
> Fix For: 1.8
>
> Attachments: compaction-time-vs.reposize.png, offrc.jfr, 
> segment-per-path-compacted-nocache.png, 
> segment-per-path-compacted-nostringcache.png, segment-per-path-compacted.png, 
> segment-per-path.png
>
>
> We need to better understand the locality aspects of content stored in TarMK: 
> * How is related content spread over segments?
> * What content do we consider related? 
> * How does locality of related content develop over time when changes are 
> applied?
> * What changes do we consider typical?
> * What is the impact of compaction on locality? 
> * What is the impact of the deduplication caches on locality (during normal 
> operation and during compaction)?
> * How well are checkpoints deduplicated? Can we monitor this online?
> * ...



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (OAK-6888) Flushing the FileStore might return before data is persisted

2017-10-31 Thread Francesco Mari (JIRA)
Francesco Mari created OAK-6888:
---

 Summary: Flushing the FileStore might return before data is 
persisted
 Key: OAK-6888
 URL: https://issues.apache.org/jira/browse/OAK-6888
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: segment-tar
Reporter: Francesco Mari
Assignee: Francesco Mari
 Fix For: 1.8, 1.7.11


The implementation of {{FileStore#flush}} might return before all the expected 
data is persisted on disk. 

The root cause of this behaviour is the implementation of 
{{TarRevisions#flush}}, which is too lenient when acquiring the lock for the 
journal file. If a background flush operation is in progress and a user calls 
{{FileStore#flush}}, that method will immediately return because the lock of 
the journal file is already owned by the background flush operation. The caller 
doesn't have the guarantee that everything committed before {{FileStore#flush}} 
is persisted to disk when the method returns. 

A fix for this problem might be to create an additional implementation of 
flush. The current implementation, needed for the background flush thread, will 
not be exposed to the users of {{FileStore}}. The new implementation of 
{{TarRevisions#flush}} should have stricter semantics and always guarantee that 
the persisted head contains everything visible to the user of {{FileStore}} 
before the flush operation was started.
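A stripped-down sketch of the two flush semantics described above; the names 
({{lenientFlush}}, {{strictFlush}}, {{persistJournal}}) are made up and the real 
{{TarRevisions}}/{{FileStore}} code is considerably more involved.

{code:java}
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical simplification: `persistJournal` stands in for writing the
// current head state to the journal file.
class JournalFlusher {

    private final ReentrantLock journalLock = new ReentrantLock();

    // Lenient flush (current behaviour as described): if another thread, e.g.
    // the background flush, already holds the lock, return immediately. The
    // caller gets no guarantee that its own changes have reached the journal.
    void lenientFlush(Runnable persistJournal) {
        if (journalLock.tryLock()) {
            try {
                persistJournal.run();
            } finally {
                journalLock.unlock();
            }
        }
        // else: silently skip; the concurrent flush may have started before our
        // changes were committed and thus not cover them.
    }

    // Strict flush (proposed semantics): wait for the lock, then persist.
    // Anything committed before this call is in the journal when it returns.
    void strictFlush(Runnable persistJournal) {
        journalLock.lock();
        try {
            persistJournal.run();
        } finally {
            journalLock.unlock();
        }
    }
}
{code}

In terms of the proposal above, only the strict variant would be exposed to users 
of {{FileStore}}, while the lenient variant would remain internal to the 
background flush thread.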



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (OAK-6889) Followup on OAK-6755: fix OSGi component descriptors

2017-10-31 Thread Julian Sedding (JIRA)
Julian Sedding created OAK-6889:
---

 Summary: Followup on OAK-6755: fix OSGi component descriptors
 Key: OAK-6889
 URL: https://issues.apache.org/jira/browse/OAK-6889
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: core, security
Affects Versions: 1.7.10
Reporter: Julian Sedding
Assignee: Julian Sedding
 Fix For: 1.7.11


The fix for OAK-6755 (see also OAK-6741) caused some OSGi metadata to be lost 
or otherwise incorrect.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (OAK-5655) TarMK: Analyse locality of reference

2017-10-31 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226764#comment-16226764
 ] 

Michael Dürig edited comment on OAK-5655 at 10/31/17 1:21 PM:
--

In another analysis I ran offline compaction on a repository (17.5GB footprint 
compacting to 564MB, 4M nodes). The process took 20 min to complete. Running 
offline compaction again on the result took just 50 sec to complete. While this 
test is a bit artificial, as the repository consists of completely random 
content created by {{SegmentCompactionIT}}, it still indicates that the process 
is thrashing in reads caused by bad locality. 

To better understand the connection between repository size and compaction 
time I ran offline compaction with memory mapped files on and off, graphing 
compaction time against compacted repository size:

!compaction-time-vs.reposize.png|width=400!

Compaction times increase super-linearly, and {{mmap=on}} is clearly superior 
to {{mmap=off}}. 

To validate the hypothesis that the process is (read) IO bound, I took a JMC 
[flight recording|^offrc.jfr] of an offline compaction of the same repository 
with {{mmap=false}}. The flight recording shows that the process spends almost 
99% of its time in {{java.io.RandomAccessFile.read()}}, and all of these calls 
originate from segment reads. Furthermore, the segment reads are spread more 
or less evenly across time and across all 50 tar files.

The image below shows the reads in a 4 minute interval from 
{{data00053a.tar}}. Reads from other tar files look similar:

!data00053a.tar-reads.png|width=600!





was (Author: mduerig):
In another analysis I ran offline compaction on a repository (17.5GB footprint 
compacting to 564MB, 4M nodes). The process took 20 min to complete. Running 
offline compaction again on the result took just 50 sec to complete. While this 
test is a bit artificial, as the repository consists of completely random 
content created by {{SegmentCompactionIT}}, it still indicates that the process 
is thrashing in reads caused by bad locality. 

To better understand the connection between repository size and compaction 
time I ran offline compaction with memory mapped files on and off, graphing 
compaction time against compacted repository size:

!compaction-time-vs.reposize.png|width=400!

Compaction times increase super-linearly, and {{mmap=on}} is clearly superior 
to {{mmap=off}}. 

To validate the hypothesis that the process is (read) IO bound, I took a JMC 
[flight recording|^offrc.jfr] of an offline compaction of the same repository 
with {{mmap=false}}. The flight recording shows that the process spends almost 
99% of its time in {{java.io.RandomAccessFile.read()}}, and all of these calls 
originate from segment reads. Furthermore, the segment reads are spread more 
or less evenly across time and across all 50 tar files.




> TarMK: Analyse locality of reference 
> -
>
> Key: OAK-5655
> URL: https://issues.apache.org/jira/browse/OAK-5655
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar
>Reporter: Michael Dürig
>  Labels: scalability
> Fix For: 1.8
>
> Attachments: compaction-time-vs.reposize.png, 
> data00053a.tar-reads.png, offrc.jfr, segment-per-path-compacted-nocache.png, 
> segment-per-path-compacted-nostringcache.png, segment-per-path-compacted.png, 
> segment-per-path.png
>
>
> We need to better understand the locality aspects of content stored in TarMK: 
> * How is related content spread over segments?
> * What content do we consider related? 
> * How does locality of related content develop over time when changes are 
> applied?
> * What changes do we consider typical?
> * What is the impact of compaction on locality? 
> * What is the impact of the deduplication caches on locality (during normal 
> operation and during compaction)?
> * How well are checkpoints deduplicated? Can we monitor this online?
> * ...



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (OAK-5655) TarMK: Analyse locality of reference

2017-10-31 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dürig updated OAK-5655:
---
Attachment: offrc.jfr

> TarMK: Analyse locality of reference 
> -
>
> Key: OAK-5655
> URL: https://issues.apache.org/jira/browse/OAK-5655
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar
>Reporter: Michael Dürig
>  Labels: scalability
> Fix For: 1.8
>
> Attachments: compaction-time-vs.reposize.png, offrc.jfr, 
> segment-per-path-compacted-nocache.png, 
> segment-per-path-compacted-nostringcache.png, segment-per-path-compacted.png, 
> segment-per-path.png
>
>
> We need to better understand the locality aspects of content stored in TarMK: 
> * How is related content spread over segments?
> * What content do we consider related? 
> * How does locality of related content develop over time when changes are 
> applied?
> * What changes do we consider typical?
> * What is the impact of compaction on locality? 
> * What is the impact of the deduplication caches on locality (during normal 
> operation and during compaction)?
> * How well are checkpoints deduplicated? Can we monitor this online?
> * ...



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (OAK-5655) TarMK: Analyse locality of reference

2017-10-31 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226764#comment-16226764
 ] 

Michael Dürig commented on OAK-5655:


In another analysis I ran offline compaction on a repository (17.5GB footprint 
compacting to 564MB, 4M nodes). The process took 20 min to complete. Running 
offline compaction again on the result took just 50 sec to complete. While this 
test is a bit artificial, as the repository consists of completely random 
content created by {{SegmentCompactionIT}}, it still indicates that the process 
is thrashing in reads caused by bad locality. 

To better understand the connection between repository size and compaction 
time I ran offline compaction with memory mapped files on and off, graphing 
compaction time against compacted repository size:

!compaction-time-vs.reposize.png|width=400!

Compaction times increase super-linearly, and {{mmap=on}} is clearly superior 
to {{mmap=off}}. 

To validate the hypothesis that the process is (read) IO bound, I took a JMC 
flight recording of an offline compaction of the same repository with 
{{mmap=false}}. The flight recording shows that the process spends almost 99% 
of its time in {{java.io.RandomAccessFile.read()}}, and all of these calls 
originate from segment reads. Furthermore, the segment reads are spread more 
or less evenly across time and across all 50 tar files.




> TarMK: Analyse locality of reference 
> -
>
> Key: OAK-5655
> URL: https://issues.apache.org/jira/browse/OAK-5655
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar
>Reporter: Michael Dürig
>  Labels: scalability
> Fix For: 1.8
>
> Attachments: compaction-time-vs.reposize.png, offrc.jfr, 
> segment-per-path-compacted-nocache.png, 
> segment-per-path-compacted-nostringcache.png, segment-per-path-compacted.png, 
> segment-per-path.png
>
>
> We need to better understand the locality aspects of content stored in TarMK: 
> * How is related content spread over segments?
> * What content do we consider related? 
> * How does locality of related content develop over time when changes are 
> applied?
> * What changes do we consider typical?
> * What is the impact of compaction on locality? 
> * What is the impact of the deduplication caches on locality (during normal 
> operation and during compaction)?
> * How well are checkpoints deduplicated? Can we monitor this online?
> * ...



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (OAK-5655) TarMK: Analyse locality of reference

2017-10-31 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226764#comment-16226764
 ] 

Michael Dürig edited comment on OAK-5655 at 10/31/17 2:00 PM:
--

In another analysis I ran offline compaction on a repository (17.5GB footprint 
compacting to 564MB, 4M nodes). The process took 20 min to complete. Running 
offline compaction again on the result took just 50 sec to complete. While this 
test is a bit artificial, as the repository consists of completely random 
content created by {{SegmentCompactionIT}}, it still indicates that the process 
is thrashing in reads caused by bad locality. 

To better understand the connection between repository size and compaction 
time I ran offline compaction with memory mapped files on and off, graphing 
compaction time against compacted repository size:

!compaction-time-vs.reposize.png|width=400!

[^compaction-time-vs-reposize.m]

Compaction times increase super-linearly, and {{mmap=true}} is clearly 
superior to {{mmap=false}}. 

To validate the hypothesis that the process is (read) IO bound, I took a JMC 
[flight recording|^offrc.jfr] of an offline compaction of the same repository 
with {{mmap=false}}. The flight recording shows that the process spends almost 
99% of its time in {{java.io.RandomAccessFile.read()}}, and all of these calls 
originate from segment reads. Furthermore, the segment reads are spread more 
or less evenly across time and across all 50 tar files.

The image below shows the reads in a 4 minute interval from 
{{data00053a.tar}}. Reads from other tar files look similar:

!data00053a.tar-reads.png|width=600!





was (Author: mduerig):
In another analysis I ran offline compaction on a repository (17.5GB footprint 
compacting to 564MB, 4M nodes). The process took 20 min to complete. Running 
offline compaction again on the result took just 50 sec to complete. While this 
test is a bit artificial, as the repository consists of completely random 
content created by {{SegmentCompactionIT}}, it still indicates that the process 
is thrashing in reads caused by bad locality. 

To better understand the connection between repository size and compaction 
time I ran offline compaction with memory mapped files on and off, graphing 
compaction time against compacted repository size:

!compaction-time-vs.reposize.png|width=400!

[^compaction-time-vs-reposize.m]

Compaction times increase super-linearly, and {{mmap=on}} is clearly superior 
to {{mmap=off}}. 

To validate the hypothesis that the process is (read) IO bound, I took a JMC 
[flight recording|^offrc.jfr] of an offline compaction of the same repository 
with {{mmap=false}}. The flight recording shows that the process spends almost 
99% of its time in {{java.io.RandomAccessFile.read()}}, and all of these calls 
originate from segment reads. Furthermore, the segment reads are spread more 
or less evenly across time and across all 50 tar files.

The image below shows the reads in a 4 minute interval from 
{{data00053a.tar}}. Reads from other tar files look similar:

!data00053a.tar-reads.png|width=600!




> TarMK: Analyse locality of reference 
> -
>
> Key: OAK-5655
> URL: https://issues.apache.org/jira/browse/OAK-5655
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar
>Reporter: Michael Dürig
>  Labels: scalability
> Fix For: 1.8
>
> Attachments: compaction-time-vs-reposize.m, 
> compaction-time-vs.reposize.png, data00053a.tar-reads.png, offrc.jfr, 
> segment-per-path-compacted-nocache.png, 
> segment-per-path-compacted-nostringcache.png, segment-per-path-compacted.png, 
> segment-per-path.png
>
>
> We need to better understand the locality aspects of content stored in TarMK: 
> * How is related content spread over segments?
> * What content do we consider related? 
> * How does locality of related content develop over time when changes are 
> applied?
> * What changes do we consider typical?
> * What is the impact of compaction on locality? 
> * What is the impact of the deduplication caches on locality (during normal 
> operation and during compaction)?
> * How well are checkpoints deduplicated? Can we monitor this online?
> * ...



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (OAK-6859) Schedule Revision GC in DocumentNodeStoreService

2017-10-31 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226906#comment-16226906
 ] 

Marcel Reutegger commented on OAK-6859:
---

Implemented in trunk: http://svn.apache.org/r1813888

> Schedule Revision GC in DocumentNodeStoreService
> 
>
> Key: OAK-6859
> URL: https://issues.apache.org/jira/browse/OAK-6859
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
> Fix For: 1.8
>
>
> Introduce scheduling of the Revision GC task in DocumentNodeStoreService. 
> There are already other tasks scheduled, like Journal GC and recovery when 
> another cluster node crashes.
> I'd like to enable Continuous Revision GC on MongoDB by default and schedule 
> Revision GC on RDB once a day at 2 AM. This avoids the need to trigger the GC 
> externally.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (OAK-5655) TarMK: Analyse locality of reference

2017-10-31 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dürig updated OAK-5655:
---
Attachment: data00053a.tar-reads.png

> TarMK: Analyse locality of reference 
> -
>
> Key: OAK-5655
> URL: https://issues.apache.org/jira/browse/OAK-5655
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar
>Reporter: Michael Dürig
>  Labels: scalability
> Fix For: 1.8
>
> Attachments: compaction-time-vs.reposize.png, 
> data00053a.tar-reads.png, offrc.jfr, segment-per-path-compacted-nocache.png, 
> segment-per-path-compacted-nostringcache.png, segment-per-path-compacted.png, 
> segment-per-path.png
>
>
> We need to better understand the locality aspects of content stored in TarMK: 
> * How is related content spread over segments?
> * What content do we consider related? 
> * How does locality of related content develop over time when changes are 
> applied?
> * What changes do we consider typical?
> * What is the impact of compaction on locality? 
> * What is the impact of the deduplication caches on locality (during normal 
> operation and during compaction)?
> * How well are checkpoints deduplicated? Can we monitor this online?
> * ...



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (OAK-6829) ExternalPrivateStoreIT/ExternalSharedStoreIT.testSyncBigBlob failures

2017-10-31 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-6829:

Attachment: 
org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt

org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt

> ExternalPrivateStoreIT/ExternalSharedStoreIT.testSyncBigBlob failures
> -
>
> Key: OAK-6829
> URL: https://issues.apache.org/jira/browse/OAK-6829
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: 1.7.9
>Reporter: Julian Reschke
>Assignee: Francesco Mari
> Fix For: 1.8, 1.7.11
>
> Attachments: 
> TEST-org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT.xml, 
> TEST-org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT.xml, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt
>
>
> {noformat}
> testSyncBigBlob(org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT)
>   Time elapsed: 27.921 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<{ root = { ... } }> but was:<{ root : { } 
> }>
> Running org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT
> Tests run: 11, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 93.353 sec 
> <<< FAILURE! - in 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT
> testSyncBigBlob(org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT)
>   Time elapsed: 30.772 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<{ root = { ... } }> but was:<{ root : { } 
> }>
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (OAK-6829) ExternalPrivateStoreIT/ExternalSharedStoreIT.testSyncBigBlob failures

2017-10-31 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226795#comment-16226795
 ] 

Julian Reschke commented on OAK-6829:
-

Still failing. Attaching logs.

> ExternalPrivateStoreIT/ExternalSharedStoreIT.testSyncBigBlob failures
> -
>
> Key: OAK-6829
> URL: https://issues.apache.org/jira/browse/OAK-6829
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: 1.7.9
>Reporter: Julian Reschke
>Assignee: Francesco Mari
> Fix For: 1.8, 1.7.11
>
> Attachments: 
> TEST-org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT.xml, 
> TEST-org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT.xml, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt
>
>
> {noformat}
> testSyncBigBlob(org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT)
>   Time elapsed: 27.921 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<{ root = { ... } }> but was:<{ root : { } 
> }>
> Running org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT
> Tests run: 11, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 93.353 sec 
> <<< FAILURE! - in 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT
> testSyncBigBlob(org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT)
>   Time elapsed: 30.772 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<{ root = { ... } }> but was:<{ root : { } 
> }>
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (OAK-6886) OffRC always logs 0 for the number of compacted nodes in gc.log

2017-10-31 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dürig resolved OAK-6886.

   Resolution: Fixed
Fix Version/s: (was: 1.7.12)
   1.7.11

Fixed at http://svn.apache.org/viewvc?rev=1813883&view=rev

> OffRC always logs 0 for the number of compacted nodes in gc.log
> ---
>
> Key: OAK-6886
> URL: https://issues.apache.org/jira/browse/OAK-6886
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>  Labels: compaction, gc
> Fix For: 1.8, 1.7.11
>
>
> After an offline compaction the {{gc.log}} always contains 0 for the number 
> of compacted nodes. This is caused by 
> {{org.apache.jackrabbit.oak.segment.tool.Compact.compact()}} instantiating a 
> new {{FileStore}} to run cleanup. That file store has a new {{GCMonitor}} 
> instance, which did not see any of the nodes written by the compaction that 
> was run on the previous {{FileStore}} instance. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (OAK-6741) Switch to official OSGi component and metatype annotations

2017-10-31 Thread Julian Sedding (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226904#comment-16226904
 ] 

Julian Sedding commented on OAK-6741:
-

I created OAK-6889 to follow up on the changes done for {{oak-core}}.

> Switch to official OSGi component and metatype annotations
> --
>
> Key: OAK-6741
> URL: https://issues.apache.org/jira/browse/OAK-6741
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>Reporter: Robert Munteanu
> Fix For: 1.8, 1.7.11
>
> Attachments: OAK-6741-proposed-changes-chetans-feedback.patch, 
> osgi-metadata-1.7.8.json, osgi-metadata-trunk.json
>
>
> We should remove the 'old' Felix SCR annotations and move to the 'new' OSGi 
> R6 annotations.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (OAK-6829) ExternalPrivateStoreIT/ExternalSharedStoreIT.testSyncBigBlob failures

2017-10-31 Thread Francesco Mari (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226936#comment-16226936
 ] 

Francesco Mari commented on OAK-6829:
-

The additional debug messages show the reason for these random failures. In 
short, the flush operation exposed by the {{FileStore}} has weaker semantics 
than the one expected by the test. More details are provided in OAK-6888, where 
I also attached a failure of this test as a practical example.

> ExternalPrivateStoreIT/ExternalSharedStoreIT.testSyncBigBlob failures
> -
>
> Key: OAK-6829
> URL: https://issues.apache.org/jira/browse/OAK-6829
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: 1.7.9
>Reporter: Julian Reschke
>Assignee: Francesco Mari
> Fix For: 1.8, 1.7.11
>
> Attachments: 
> TEST-org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT.xml, 
> TEST-org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT.xml, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt
>
>
> {noformat}
> testSyncBigBlob(org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT)
>   Time elapsed: 27.921 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<{ root = { ... } }> but was:<{ root : { } 
> }>
> Running org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT
> Tests run: 11, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 93.353 sec 
> <<< FAILURE! - in 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT
> testSyncBigBlob(org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT)
>   Time elapsed: 30.772 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<{ root = { ... } }> but was:<{ root : { } 
> }>
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (OAK-6889) Followup on OAK-6755: fix OSGi component descriptors

2017-10-31 Thread Julian Sedding (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Sedding resolved OAK-6889.
-
Resolution: Fixed

> Followup on OAK-6755: fix OSGi component descriptors
> 
>
> Key: OAK-6889
> URL: https://issues.apache.org/jira/browse/OAK-6889
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: core, security
>Affects Versions: 1.7.9
>Reporter: Julian Sedding
>Assignee: Julian Sedding
> Fix For: 1.7.11
>
>
> The fix for OAK-6755 (see also OAK-6741) caused some OSGi metadata to be lost 
> or otherwise incorrect.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (OAK-6888) Flushing the FileStore might return before data is persisted

2017-10-31 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-6888:

Attachment: failure.txt

Attached is a real-world example of such a failure: an execution of 
{{testSyncBigBlob}} in {{ExternalPrivateStoreIT}}, slightly edited for clarity.

In the example a background flush is preventing the "main" thread from 
performing a flush. When the synchronization between the standby and the 
primary happens, the old head state is transferred 
({{473f3a4f-bf18-4d4f-a2aa-736ed4a64944.0005}}). Later on, when the content 
of the primary and the standby instance is compared, the new head state is used 
instead ({{8574c330-29ca-491a-a66e-b5b0d1b6b75e.000b}}). By this time, the 
background flush operation has completed, and the primary {{FileStore}} has a 
different persisted head state than the standby.

> Flushing the FileStore might return before data is persisted
> 
>
> Key: OAK-6888
> URL: https://issues.apache.org/jira/browse/OAK-6888
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Francesco Mari
> Fix For: 1.8, 1.7.11
>
> Attachments: failure.txt
>
>
> The implementation of {{FileStore#flush}} might return before all the 
> expected data is persisted on disk. 
> The root cause of this behaviour is the implementation of 
> {{TarRevisions#flush}}, which is too lenient when acquiring the lock for the 
> journal file. If a background flush operation is in progress and a user calls 
> {{FileStore#flush}}, that method will immediately return because the lock of 
> the journal file is already owned by the background flush operation. The 
> caller doesn't have the guarantee that everything committed before 
> {{FileStore#flush}} is persisted to disk when the method returns. 
> A fix for this problem might be to create an additional implementation of 
> flush. The current implementation, needed for the background flush thread, 
> will not be exposed to the users of {{FileStore}}. The new implementation of 
> {{TarRevisions#flush}} should have stricter semantics and always guarantee 
> that the persisted head contains everything visible to the user of 
> {{FileStore}} before the flush operation was started.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (OAK-6891) Executions of background threads might pile up

2017-10-31 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-6891:

Attachment: example.txt

As a real-world example of this problem, I attach an execution of 
{{testSyncBigBlob}} in {{ExternalPrivateStoreIT}}, slightly edited for clarity.

The relevant part of the example is at the end, where the primary and standby 
{{FileStore}} are flushed four times each. At this point in the test the two 
{{FileStore}} instances are being closed, and their respective 
{{ScheduledExecutorService}} instances are being gracefully shut down. Still, 
the scheduled executions of the background flush thread have to complete, 
delaying the cleanup phase of the test by about 20s.

> Executions of background threads might pile up
> --
>
> Key: OAK-6891
> URL: https://issues.apache.org/jira/browse/OAK-6891
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Francesco Mari
> Fix For: 1.8
>
> Attachments: example.txt
>
>
> The background threads used in {{FileStore}} are implemented by wrapping 
> {{Runnable}} instances in {{SafeRunnable}}, and by handing the 
> {{SafeRunnable}} instances over to a {{ScheduledExecutorService}}. 
> The documentation of {{ScheduledExecutorService#scheduleAtFixedRate}} states 
> that "if any execution of a task takes longer than its period, then 
> subsequent executions may start late, but will not concurrently execute". 
> This means that if an execution is delayed, the piled up executions might 
> fire in rapid succession.
> This way of running the periodic background threads might not be ideal. For 
> example, it doesn't make much sense to flush the File Store five times in a 
> row. On the other hand, if the background tasks are coded with this caveat in 
> mind, this issue might not be a problem at all. For example, flushing the 
> File Store five times in a row might not be a problem if many of those 
> executions don't do much and return quickly.
> Tasks piling up might be a problem when it comes to releasing the resources 
> associated with the {{FileStore}} in a responsive way. Since the 
> {{ScheduledExecutorService}} is gracefully shut down, it might take some time 
> before all the scheduled background tasks are processed and the 
> {{ScheduledExecutorService}} is ready to be terminated.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (OAK-6887) Change default value for autoCompact

2017-10-31 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger resolved OAK-6887.
---
   Resolution: Fixed
Fix Version/s: 1.7.11

Done in trunk: http://svn.apache.org/r1813895

> Change default value for autoCompact
> 
>
> Key: OAK-6887
> URL: https://issues.apache.org/jira/browse/OAK-6887
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
>Priority: Minor
> Fix For: 1.8, 1.7.11
>
>
> The persistent cache has an option {{autoCompact}} which is related to 
> {{compact}}. The former compacts the cache in a background thread, while the 
> latter controls whether it should be done on close. OAK-2815 set the default 
> for {{compact}} to disabled. Similarly, the default for {{autoCompact}} 
> should also be disabled, unless a user wants to use the feature.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (OAK-6859) Schedule Revision GC in DocumentNodeStoreService

2017-10-31 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger resolved OAK-6859.
---
   Resolution: Fixed
Fix Version/s: 1.7.11

> Schedule Revision GC in DocumentNodeStoreService
> 
>
> Key: OAK-6859
> URL: https://issues.apache.org/jira/browse/OAK-6859
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
> Fix For: 1.8, 1.7.11
>
>
> Introduce scheduling of the Revision GC task in DocumentNodeStoreService. 
> There are already other tasks scheduled, like Journal GC and recovery when 
> another cluster node crashes.
> I'd like to enable Continuous Revision GC on MongoDB by default and schedule 
> Revision GC on RDB once a day at 2 AM. This avoids the need to trigger the GC 
> externally.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (OAK-6889) Followup on OAK-6755: fix OSGi component descriptors

2017-10-31 Thread Julian Sedding (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226938#comment-16226938
 ] 

Julian Sedding commented on OAK-6889:
-

Fixed in [r1813889|https://svn.apache.org/r1813889].

Most issues were caused by configuration annotations *not* being referenced in 
the {{@Activate}} method signature, which is necessary for DS component 
descriptors to include properties described by the annotation.

From the OSGi Declarative Services Specification Version 1.3, §112.8.3 
(emphasis is mine):
{quote}
Properties defined through component property types _used in the signatures of 
the life cycle methods_.

If any of the referenced component property types have _methods with defaults_, 
then the generated component description must include a property element for 
each such method with the property name mapped from the method name, the 
property type mapped from the method type, and the property value set to the 
method's default value. 
{quote}

{{AuthorizationConfigurationImpl}} was missing the {{@Designate}} annotation 
and had a typo in a property name (importBehaviour vs importBehavior).

{{DefaultAuthorizableActionProvider}} (by now moved to {{oak-security-spi}}) 
needed its option labels and values swapped around.
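
For illustration, a minimal (hypothetical, non-Oak) component showing the pattern 
the spec passage above describes: the component property type is referenced in 
the {{@Activate}} signature, and {{@Designate}} points at its 
{{@ObjectClassDefinition}}, so SCR emits both the property defaults and the 
metatype. Names and the default value are made up.

{code:java}
import org.osgi.service.component.annotations.Activate;
import org.osgi.service.component.annotations.Component;
import org.osgi.service.metatype.annotations.AttributeDefinition;
import org.osgi.service.metatype.annotations.Designate;
import org.osgi.service.metatype.annotations.ObjectClassDefinition;

@Component(service = ExampleService.class)
@Designate(ocd = ExampleService.Config.class)
public class ExampleService {

    @ObjectClassDefinition(name = "Example Service")
    @interface Config {
        @AttributeDefinition(name = "Import behavior")
        String importBehavior() default "abort"; // default ends up in the DS descriptor
    }

    private String importBehavior;

    @Activate
    void activate(Config config) {
        // Referencing Config in this signature is what makes the generated
        // component description include the property and its default value.
        this.importBehavior = config.importBehavior();
    }
}
{code}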

> Followup on OAK-6755: fix OSGi component descriptors
> 
>
> Key: OAK-6889
> URL: https://issues.apache.org/jira/browse/OAK-6889
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: core, security
>Affects Versions: 1.7.9
>Reporter: Julian Sedding
>Assignee: Julian Sedding
> Fix For: 1.7.11
>
>
> The fix for OAK-6755 (see also OAK-6741) caused some OSGi metadata to be lost 
> or otherwise incorrect.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (OAK-6890) Background threads might not be automatically restarted

2017-10-31 Thread Francesco Mari (JIRA)
Francesco Mari created OAK-6890:
---

 Summary: Background threads might not be automatically restarted
 Key: OAK-6890
 URL: https://issues.apache.org/jira/browse/OAK-6890
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: segment-tar
Reporter: Francesco Mari
 Fix For: 1.8


The background threads used in {{FileStore}} are implemented by wrapping 
{{Runnable}} instances in {{SafeRunnable}}, and by handing the {{SafeRunnable}} 
instances over to a {{ScheduledExecutorService}}. 

The documentation of {{ScheduledExecutorService#scheduleAtFixedRate}} states 
that "if any execution of the task encounters an exception, subsequent 
executions are suppressed". But a {{SafeRunnable}} always re-throws any 
{{Throwable}} that it catches, effectively preventing itself from executing 
again in the future.

There is more than one solution to this problem. One of these is to never 
re-throw any exception. Even if it doesn't always make sense, e.g. in case of 
an {{OutOfMemoryError}}, never re-throwing an exception would better fulfil the 
assumption that background threads should always be up and running even in case 
of error.
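A sketch of the "never re-throw" variant (not the actual {{SafeRunnable}}): catch 
everything, log it, and return normally so that the {{ScheduledExecutorService}} 
keeps the periodic task scheduled. Class and logger names are illustrative.

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class LoggingRunnable implements Runnable {

    private static final Logger LOG = LoggerFactory.getLogger(LoggingRunnable.class);

    private final String name;
    private final Runnable task;

    LoggingRunnable(String name, Runnable task) {
        this.name = name;
        this.task = task;
    }

    @Override
    public void run() {
        try {
            task.run();
        } catch (Throwable t) {
            // Re-throwing here would suppress all subsequent scheduled executions;
            // swallowing keeps the background task alive (debatable for errors
            // like OutOfMemoryError).
            LOG.error("Background task {} failed", name, t);
        }
    }
}
{code}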



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (OAK-6859) Schedule Revision GC in DocumentNodeStoreService

2017-10-31 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226980#comment-16226980
 ] 

Marcel Reutegger commented on OAK-6859:
---

Updated documentation: http://svn.apache.org/r1813891

> Schedule Revision GC in DocumentNodeStoreService
> 
>
> Key: OAK-6859
> URL: https://issues.apache.org/jira/browse/OAK-6859
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
> Fix For: 1.8
>
>
> Introduce scheduling of the Revision GC task in DocumentNodeStoreService. 
> There are already other tasks scheduled, like Journal GC and recovery when 
> another cluster node crashes.
> I'd like to enable Continuous Revision GC on MongoDB by default and schedule 
> Revision GC on RDB once a day at 2 AM. This avoids the need to trigger the GC 
> externally.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (OAK-6859) Schedule Revision GC in DocumentNodeStoreService

2017-10-31 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226934#comment-16226934
 ] 

Marcel Reutegger commented on OAK-6859:
---

It relies on the Sling Scheduler to only run on a single cluster node.
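
For reference, a hedged sketch of how a job can be pinned to a single instance 
via the Sling Scheduler; the exact API calls ({{EXPR}}, {{onSingleInstanceOnly}}) 
are recalled from the Sling Commons Scheduler API and the actual 
{{DocumentNodeStoreService}} wiring may well differ. The cron expression 
{{0 0 2 * * ?}} corresponds to "once a day at 2 AM".

{code:java}
import org.apache.sling.commons.scheduler.ScheduleOptions;
import org.apache.sling.commons.scheduler.Scheduler;

class RevisionGcScheduling {

    void scheduleDailyGc(Scheduler scheduler, Runnable revisionGcTask) {
        ScheduleOptions options = scheduler.EXPR("0 0 2 * * ?")
                .name("revision-gc")
                .canRunConcurrently(false)
                .onSingleInstanceOnly(true); // only one cluster node runs the job
        scheduler.schedule(revisionGcTask, options);
    }
}
{code}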

> Schedule Revision GC in DocumentNodeStoreService
> 
>
> Key: OAK-6859
> URL: https://issues.apache.org/jira/browse/OAK-6859
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
> Fix For: 1.8
>
>
> Introduce scheduling of the Revision GC task in DocumentNodeStoreService. 
> There are already other tasks scheduled, like Journal GC and recovery when 
> another cluster node crashes.
> I'd like to enable Continuous Revision GC on MongoDB by default and schedule 
> Revision GC on RDB once a day at 2 AM. This avoids the need to trigger the GC 
> externally.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (OAK-6889) Followup on OAK-6755: fix OSGi component descriptors

2017-10-31 Thread Julian Sedding (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Sedding updated OAK-6889:

Description: 
The fix for OAK-6755 (see also OAK-6741) caused some OSGi metadata to be lost 
or otherwise incorrect.

Issues were found using 
[osgi-ds-metatype-diff|https://github.com/jsedding/osgi-ds-metatype-diff].

  was:The fix for OAK-6755 (see also OAK-6741) caused some OSGi metadata to be 
lost or otherwise incorrect.


> Followup on OAK-6755: fix OSGi component descriptors
> 
>
> Key: OAK-6889
> URL: https://issues.apache.org/jira/browse/OAK-6889
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: core, security
>Affects Versions: 1.7.9
>Reporter: Julian Sedding
>Assignee: Julian Sedding
> Fix For: 1.7.11
>
>
> The fix for OAK-6755 (see also OAK-6741) caused some OSGi metadata to be lost 
> or otherwise incorrect.
> Issues were found using 
> [osgi-ds-metatype-diff|https://github.com/jsedding/osgi-ds-metatype-diff].



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (OAK-6891) Executions of background threads might pile up

2017-10-31 Thread Francesco Mari (JIRA)
Francesco Mari created OAK-6891:
---

 Summary: Executions of background threads might pile up
 Key: OAK-6891
 URL: https://issues.apache.org/jira/browse/OAK-6891
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: segment-tar
Reporter: Francesco Mari
 Fix For: 1.8


The background threads used in {{FileStore}} are implemented by wrapping 
{{Runnable}} instances in {{SafeRunnable}}, and by handing the {{SafeRunnable}} 
instances over to a {{ScheduledExecutorService}}. 

The documentation of {{ScheduledExecutorService#scheduleAtFixedRate}} states 
that "if any execution of a task takes longer than its period, then subsequent 
executions may start late, but will not concurrently execute". This means that 
if an execution is delayed, the piled-up executions might fire in rapid 
succession.
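
A small self-contained sketch of this behaviour (period and sleep times made 
up): one execution overruns its period, and the missed executions then fire 
back-to-back; {{scheduleWithFixedDelay}} would avoid the catch-up at the cost 
of a drifting schedule.

{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class FixedRateCatchUp {

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        long start = System.currentTimeMillis();

        scheduler.scheduleAtFixedRate(() -> {
            long elapsed = System.currentTimeMillis() - start;
            System.out.println("run at " + elapsed + " ms");
            if (elapsed < 500) {
                try {
                    Thread.sleep(2000); // simulate one execution overrunning several periods
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, 0, 500, TimeUnit.MILLISECONDS);

        Thread.sleep(5000);
        scheduler.shutdownNow();
        // After the slow first run, the missed executions are printed in rapid
        // succession before the schedule settles back to one run per period.
    }
}
{code}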

This way of running the periodic background threads might not be ideal: for 
example, it doesn't make much sense to flush the File Store five times in a 
row. On the other hand, if the background tasks are coded with this caveat in 
mind, the issue might not be a problem at all: flushing the File Store five 
times in a row is harmless as long as most of those executions do little work 
and return quickly.

Tasks piling up might be a problem when it comes to releasing the resources 
associated with the {{FileStore}} in a responsive way. Since the 
{{ScheduledExecutorService}} is shut down gracefully, it might take some time 
before all the scheduled background tasks are processed and the 
{{ScheduledExecutorService}} is ready to be terminated.
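
As a generic illustration (not the actual {{FileStore}} close logic, and with a 
made-up timeout), a bounded shutdown that falls back to discarding queued work 
keeps the close path responsive:

{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class BoundedShutdown {

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> { /* background work */ }, 0, 1, TimeUnit.SECONDS);

        // Graceful shutdown: stop accepting work and let a running task finish.
        scheduler.shutdown();
        if (!scheduler.awaitTermination(5, TimeUnit.SECONDS)) {
            // Fallback: interrupt the running task and drop anything still queued.
            scheduler.shutdownNow();
        }
    }
}
{code}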



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (OAK-5940) Remove CachedNodeDocument

2017-10-31 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226443#comment-16226443
 ] 

Julian Reschke commented on OAK-5940:
-

trunk: [r1787151|http://svn.apache.org/r1787151]
1.6: [r1813855|http://svn.apache.org/r1813855]


> Remove CachedNodeDocument
> -
>
> Key: OAK-5940
> URL: https://issues.apache.org/jira/browse/OAK-5940
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, documentmk
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
>Priority: Minor
>  Labels: candidate_oak_1_0, candidate_oak_1_2, candidate_oak_1_4, 
> technical_debt
> Fix For: 1.7.0, 1.8, 1.6.7
>
>
> The CachedNodeDocument interface was introduced with OAK-891 but then the 
> feature was later removed with OAK-2937. The interface is not used anywhere 
> and should be removed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (OAK-6756) Convert oak-auth-external to OSGi R6 annotations

2017-10-31 Thread Christian Schneider (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226463#comment-16226463
 ] 

Christian Schneider commented on OAK-6756:
--

I would like to assign myself to this issue but do not have the rights to do 
so.

I just created a pull request: https://github.com/apache/jackrabbit-oak/pull/73

> Convert oak-auth-external to OSGi R6 annotations
> 
>
> Key: OAK-6756
> URL: https://issues.apache.org/jira/browse/OAK-6756
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: auth-external
>Reporter: Robert Munteanu
> Fix For: 1.8, 1.7.11
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (OAK-6756) Convert oak-auth-external to OSGi R6 annotations

2017-10-31 Thread Christian Schneider (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226463#comment-16226463
 ] 

Christian Schneider edited comment on OAK-6756 at 10/31/17 8:48 AM:


I would like to assign myself to this issue but do not have the rights to do so.

I just created a pull request: https://github.com/apache/jackrabbit-oak/pull/73


was (Author: ch...@die-schneider.net):
I would like to assign myself to this issue but do not haver the rights to do 
so.

I just created a pull request: https://github.com/apache/jackrabbit-oak/pull/73

> Convert oak-auth-external to OSGi R6 annotations
> 
>
> Key: OAK-6756
> URL: https://issues.apache.org/jira/browse/OAK-6756
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: auth-external
>Reporter: Robert Munteanu
> Fix For: 1.8, 1.7.11
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (OAK-5940) Remove CachedNodeDocument

2017-10-31 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-5940:

Labels: candidate_oak_1_0 candidate_oak_1_2 candidate_oak_1_4 
technical_debt  (was: candidate_oak_1_0 candidate_oak_1_2 candidate_oak_1_4 
candidate_oak_1_6 technical_debt)

> Remove CachedNodeDocument
> -
>
> Key: OAK-5940
> URL: https://issues.apache.org/jira/browse/OAK-5940
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, documentmk
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
>Priority: Minor
>  Labels: candidate_oak_1_0, candidate_oak_1_2, candidate_oak_1_4, 
> technical_debt
> Fix For: 1.7.0, 1.8, 1.6.7
>
>
> The CachedNodeDocument interface was introduced with OAK-891 but then the 
> feature was later removed with OAK-2937. The interface is not used anywhere 
> and should be removed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (OAK-5940) Remove CachedNodeDocument

2017-10-31 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-5940:

Fix Version/s: 1.6.7

> Remove CachedNodeDocument
> -
>
> Key: OAK-5940
> URL: https://issues.apache.org/jira/browse/OAK-5940
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, documentmk
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
>Priority: Minor
>  Labels: candidate_oak_1_0, candidate_oak_1_2, candidate_oak_1_4, 
> technical_debt
> Fix For: 1.7.0, 1.8, 1.6.7
>
>
> The CachedNodeDocument interface was introduced with OAK-891 but then the 
> feature was later removed with OAK-2937. The interface is not used anywhere 
> and should be removed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (OAK-6859) Schedule Revision GC in DocumentNodeStoreService

2017-10-31 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-6859:
--
Component/s: (was: mongomk)

> Schedule Revision GC in DocumentNodeStoreService
> 
>
> Key: OAK-6859
> URL: https://issues.apache.org/jira/browse/OAK-6859
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
> Fix For: 1.8
>
>
> Introduce scheduling of the Revision GC task in DocumentNodeStoreService. 
> There are already other tasks scheduled, like Journal GC and recovery when 
> another cluster node crashes.
> I'd like to enable Continuous Revision GC on MongoDB by default and schedule 
> Revision GC on RDB once a day at 2 AM. This avoids the need to trigger the GC 
> externally.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (OAK-6859) Schedule Revision GC in DocumentNodeStoreService

2017-10-31 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-6859:
--
Description: 
Introduce scheduling of the Revision GC task in DocumentNodeStoreService. There 
are already other tasks scheduled, like Journal GC and recovery when another 
cluster node crashes.

I'd like to enable Continuous Revision GC on MongoDB by default and schedule 
Revision GC on RDB once a day at 2 AM. This avoids the need to trigger the GC 
externally.

  was:Continuous Revision GC for DocumentNodeStore on MongoDB should be enabled 
by default. This avoids the need to trigger the GC externally and also avoids 
high load on the system when it runs e.g. once a day. 

Summary: Schedule Revision GC in DocumentNodeStoreService  (was: Enable 
continuous revision GC on MongoDB)

> Schedule Revision GC in DocumentNodeStoreService
> 
>
> Key: OAK-6859
> URL: https://issues.apache.org/jira/browse/OAK-6859
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk, mongomk
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
> Fix For: 1.8
>
>
> Introduce scheduling of the Revision GC task in DocumentNodeStoreService. 
> There are already other tasks scheduled, like Journal GC and recovery when 
> another cluster node crashes.
> I'd like to enable Continuous Revision GC on MongoDB by default and schedule 
> Revision GC on RDB once a day at 2 AM. This avoids the need to trigger the GC 
> externally.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (OAK-6886) OffRC alway logs 0 for the number of compacted nodes in gc.log

2017-10-31 Thread JIRA
Michael Dürig created OAK-6886:
--

 Summary: OffRC alway logs 0 for the number of compacted nodes in 
gc.log
 Key: OAK-6886
 URL: https://issues.apache.org/jira/browse/OAK-6886
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: segment-tar
Reporter: Michael Dürig
Assignee: Michael Dürig
 Fix For: 1.8, 1.7.12


After an offline compaction the {{gc.log}} always contains 0 for the number of 
compacted nodes. This is caused by 
{{org.apache.jackrabbit.oak.segment.tool.Compact.compact()}} instantiating a 
new {{FileStore}} to run cleanup. That file store has a new {{GCMonitor}} 
instance, which did not see any of the nodes written by the compaction that was 
run on the previous {{FileStore}} instance. 





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (OAK-6886) OffRC alway logs 0 for the number of compacted nodes in gc.log

2017-10-31 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226614#comment-16226614
 ] 

Michael Dürig commented on OAK-6886:


I suggest changing the {{Compact}} tool to use the same {{FileStore}} instance 
for compaction and cleanup. Previously it was necessary to use two instances in 
order for cleanup to be effective and not affected by in-memory references 
retaining segments from the old generation. Since we now run gc by retention 
time and set that to a single generation for OffRC, that problem should not 
exist any more. 

Proposed patch:
{code}
--- oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/tool/Compact.java	(date 1509379779000)
+++ oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/tool/Compact.java	(date 1509446878000)
@@ -134,10 +134,8 @@
     private void compact() throws IOException, InvalidFileStoreVersionException {
         try (FileStore store = newFileStore()) {
             store.compactFull();
-        }
 
-        System.out.println("-> cleaning up");
-        try (FileStore store = newFileStore()) {
+            System.out.println("-> cleaning up");
             store.cleanup();
             File journal = new File(path, "journal.log");
             String head;
{code}

[~frm], WDYT?

> OffRC alway logs 0 for the number of compacted nodes in gc.log
> --
>
> Key: OAK-6886
> URL: https://issues.apache.org/jira/browse/OAK-6886
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>  Labels: compaction, gc
> Fix For: 1.8, 1.7.12
>
>
> After an offline compaction the {{gc.log}} always contains 0 for the number 
> of compacted nodes. This is caused by 
> {{org.apache.jackrabbit.oak.segment.tool.Compact.compact()}} instantiating a 
> new {{FileStore}} to run cleanup. That file store has a new {{GCMonitor}} 
> instance, which did not see any of the nodes written by the compaction that 
> was run on the previous {{FileStore}} instance. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (OAK-6886) OffRC always logs 0 for the number of compacted nodes in gc.log

2017-10-31 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dürig updated OAK-6886:
---
Summary: OffRC always logs 0 for the number of compacted nodes in gc.log  
(was: OffRC alway logs 0 for the number of compacted nodes in gc.log)

> OffRC always logs 0 for the number of compacted nodes in gc.log
> ---
>
> Key: OAK-6886
> URL: https://issues.apache.org/jira/browse/OAK-6886
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>  Labels: compaction, gc
> Fix For: 1.8, 1.7.12
>
>
> After an offline compaction the {{gc.log}} always contains 0 for the number 
> of compacted nodes. This is caused by 
> {{org.apache.jackrabbit.oak.segment.tool.Compact.compact()}} instantiating a 
> new {{FileStore}} to run cleanup. That file store has a new {{GCMonitor}} 
> instance, which did not see any of the nodes written by the compaction that 
> was run on the previous {{FileStore}} instance. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (OAK-6756) Convert oak-auth-external to OSGi R6 annotations

2017-10-31 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger reassigned OAK-6756:
-

Assignee: Christian Schneider

> Convert oak-auth-external to OSGi R6 annotations
> 
>
> Key: OAK-6756
> URL: https://issues.apache.org/jira/browse/OAK-6756
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: auth-external
>Reporter: Robert Munteanu
>Assignee: Christian Schneider
> Fix For: 1.8, 1.7.11
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Issue Comment Deleted] (OAK-6852) RDBDocumentStore conditional remove: check condition properly

2017-10-31 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-6852:

Comment: was deleted

(was: trunk: [r1812753|http://svn.apache.org/r1812753] 
[r1812750|http://svn.apache.org/r1812750])

> RDBDocumentStore conditional remove: check condition properly
> -
>
> Key: OAK-6852
> URL: https://issues.apache.org/jira/browse/OAK-6852
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: rdbmk
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>  Labels: candidate_oak_1_0, candidate_oak_1_2, candidate_oak_1_4
> Fix For: 1.8, 1.7.10, 1.6.7
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (OAK-6852) RDBDocumentStore conditional remove: check condition properly

2017-10-31 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226538#comment-16226538
 ] 

Julian Reschke commented on OAK-6852:
-

trunk: [r1812753|http://svn.apache.org/r1812753] 
[r1812750|http://svn.apache.org/r1812750]
1.6: [r1813865|http://svn.apache.org/r1813865]


> RDBDocumentStore conditional remove: check condition properly
> -
>
> Key: OAK-6852
> URL: https://issues.apache.org/jira/browse/OAK-6852
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: rdbmk
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>  Labels: candidate_oak_1_0, candidate_oak_1_2, candidate_oak_1_4
> Fix For: 1.8, 1.7.10, 1.6.7
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (OAK-6852) RDBDocumentStore conditional remove: check condition properly

2017-10-31 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-6852:

Labels: candidate_oak_1_0 candidate_oak_1_2 candidate_oak_1_4  (was: 
candidate_oak_1_0 candidate_oak_1_2 candidate_oak_1_4 candidate_oak_1_6)

> RDBDocumentStore conditional remove: check condition properly
> -
>
> Key: OAK-6852
> URL: https://issues.apache.org/jira/browse/OAK-6852
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: rdbmk
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>  Labels: candidate_oak_1_0, candidate_oak_1_2, candidate_oak_1_4
> Fix For: 1.8, 1.7.10, 1.6.7
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (OAK-6852) RDBDocumentStore conditional remove: check condition properly

2017-10-31 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-6852:

Fix Version/s: 1.6.7

> RDBDocumentStore conditional remove: check condition properly
> -
>
> Key: OAK-6852
> URL: https://issues.apache.org/jira/browse/OAK-6852
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: rdbmk
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>  Labels: candidate_oak_1_0, candidate_oak_1_2, candidate_oak_1_4
> Fix For: 1.8, 1.7.10, 1.6.7
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (OAK-6886) OffRC always logs 0 for the number of compacted nodes in gc.log

2017-10-31 Thread Francesco Mari (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226659#comment-16226659
 ] 

Francesco Mari commented on OAK-6886:
-

I don't see any reason why this shouldn't work better. The patch looks good to 
me.

> OffRC always logs 0 for the number of compacted nodes in gc.log
> ---
>
> Key: OAK-6886
> URL: https://issues.apache.org/jira/browse/OAK-6886
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>  Labels: compaction, gc
> Fix For: 1.8, 1.7.12
>
>
> After an offline compaction the {{gc.log}} always contains 0 for the number 
> of compacted nodes. This is caused by 
> {{org.apache.jackrabbit.oak.segment.tool.Compact.compact()}} instantiating a 
> new {{FileStore}} to run cleanup. That file store has a new {{GCMonitor}} 
> instance, which did not see any of the nodes written by the compaction that 
> was run on the previous {{FileStore}} instance. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (OAK-6735) Lucene Index: improved cost estimation by using document count per field

2017-10-31 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233526#comment-16233526
 ] 

Vikas Saurabh commented on OAK-6735:


Pushed the change to 
https://github.com/catholicon/jackrabbit-oak/tree/OAK-6735-docCntPerFld - it 
has the earlier patch plus more tests. I will commit it today.

> Lucene Index: improved cost estimation by using document count per field
> 
>
> Key: OAK-6735
> URL: https://issues.apache.org/jira/browse/OAK-6735
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene, query
>Affects Versions: 1.7.4
>Reporter: Thomas Mueller
>Assignee: Vikas Saurabh
>Priority: Major
> Fix For: 1.8
>
> Attachments: IndexReadPattern.txt, LuceneIndexReadPattern.java, 
> OAK-6735.patch
>
>
> The cost estimation of the Lucene index is somewhat inaccurate because, as of 
> Oak 1.7.4 (due to OAK-6333), it just uses the number of documents in the 
> index by default.
> Instead, it should use the number of documents for the given fields (the 
> minimum, if there are multiple fields with restrictions), divided by the 
> number of restrictions (as we do now already).
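
A hypothetical sketch of the proposed estimate (the helper and parameter names 
are made up; this is not the actual Lucene index code): take the smallest 
per-field document count among the restricted fields and divide it by the 
number of restrictions.

{code}
import java.util.Map;
import java.util.Set;

public class CostSketch {

    // Hypothetical helper, not the actual LucenePropertyIndex implementation.
    static long estimateCost(Map<String, Long> docCountPerField, Set<String> restrictedFields) {
        long minDocs = restrictedFields.stream()
                .mapToLong(field -> docCountPerField.getOrDefault(field, 0L))
                .min()
                .orElse(0L);
        return Math.max(1L, minDocs / Math.max(1, restrictedFields.size()));
    }

    public static void main(String[] args) {
        System.out.println(estimateCost(
                Map.of("jcr:title", 1_000L, "tags", 50_000L),
                Set.of("jcr:title", "tags"))); // -> 500
    }
}
{code}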



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (OAK-6859) Schedule Revision GC in DocumentNodeStoreService

2017-10-31 Thread Chetan Mehrotra (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233615#comment-16233615
 ] 

Chetan Mehrotra commented on OAK-6859:
--

Okie. I got confused by the reference to Quartz classes. Looks like they are 
only used for validating the cron expressions and not for actual scheduling.
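
For example (assuming the standard Quartz {{CronExpression}} API; the 
expression below is just an illustration of "once a day at 2 AM" in Quartz 
syntax, not necessarily the value used by the service):

{code}
import org.quartz.CronExpression;

public class ValidateCron {

    public static void main(String[] args) {
        // Quartz cron fields: seconds minutes hours day-of-month month day-of-week
        System.out.println(CronExpression.isValidExpression("0 0 2 * * ?")); // true
    }
}
{code}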

> Schedule Revision GC in DocumentNodeStoreService
> 
>
> Key: OAK-6859
> URL: https://issues.apache.org/jira/browse/OAK-6859
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
>Priority: Major
> Fix For: 1.8, 1.7.11
>
>
> Introduce scheduling of the Revision GC task in DocumentNodeStoreService. 
> There are already other tasks scheduled, like Journal GC and recovery when 
> another cluster node crashes.
> I'd like to enable Continuous Revision GC on MongoDB by default and schedule 
> Revision GC on RDB once a day at 2 AM. This avoids the need to trigger the GC 
> externally.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)