[jira] [Updated] (OAK-3070) Use a lower bound in VersionGC query to avoid checking unmodified once deleted docs

2017-03-09 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-3070:
--
Attachment: OAK-3070-3.patch

Attached an updated patch [^OAK-3070-3.patch]. The MongoVersionGCSupport no 
longer sets a hint on the query and leaves it up to MongoDB to decide what 
the best plan is. With the _modified range now passed to 
getPossiblyDeletedDocs(), the database is IMO in a better position to decide 
which index is best.
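
For illustration only, here is a minimal sketch of the shape such a query could take. It is not taken from the patch: it uses plain Java maps instead of the MongoDB driver, the class and method names are hypothetical, and whether the bounds are inclusive or exclusive is an assumption. Only the field names _deletedOnce and _modified come from this issue; $gte and $lt are standard MongoDB comparison operators. No index hint is attached, so plan selection is left to the database.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Oak: models the query filter as nested maps.
final class PossiblyDeletedQuery {

    /**
     * Builds a filter for documents that were deleted at least once and whose
     * _modified value lies in [fromModified, toModified). The inclusive lower
     * and exclusive upper bound are assumptions made for this sketch.
     */
    static Map<String, Object> filter(long fromModified, long toModified) {
        Map<String, Object> range = new LinkedHashMap<>();
        range.put("$gte", fromModified); // lower bound of the _modified range
        range.put("$lt", toModified);    // upper bound of the _modified range

        Map<String, Object> filter = new LinkedHashMap<>();
        filter.put("_deletedOnce", Boolean.TRUE);
        filter.put("_modified", range);
        // Deliberately no index hint: the database picks the plan.
        return filter;
    }

    public static void main(String[] args) {
        // prints {_deletedOnce=true, _modified={$gte=1000, $lt=2000}}
        System.out.println(filter(1000L, 2000L));
    }
}
```

With both conditions in the filter, the database can weigh an index on either field against the other, which is the point made above about dropping the hint.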

Regarding the query timeout: by default, a query at the MongoDB driver level 
has no timeout, so we should be fine.

> Use a lower bound in VersionGC query to avoid checking unmodified once 
> deleted docs
> ---
>
> Key: OAK-3070
> URL: https://issues.apache.org/jira/browse/OAK-3070
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, rdbmk
>Reporter: Chetan Mehrotra
>Assignee: Vikas Saurabh
>  Labels: performance
> Attachments: OAK-3070-2.patch, OAK-3070-3.patch, OAK-3070.patch, 
> OAK-3070-updated.patch, OAK-3070-updated.patch
>
>
> As part of OAK-3062 [~mreutegg] suggested
> {quote}
> As a further optimization we could also limit the lower bound of the _modified
> range. The revision GC does not need to check documents with a _deletedOnce
> again if they were not modified after the last successful GC run. If they
> didn't change and were considered existing during the last run, then they
> must still exist in the current GC run. To make this work, we'd need to
> track the last successful revision GC run. 
> {quote}
> The lowest last-validated _modified could be saved in the settings collection 
> and reused for the next run.
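
A rough sketch of the bookkeeping described in the issue, under stated assumptions: the settings collection is stood in for by an in-memory map, the key name and all class and method names are hypothetical, and 0 is used as the lower bound when no successful run has been recorded yet. It is not the implementation from any of the attached patches.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: remembers the upper bound of the last successful GC run
// and offers it as the lower bound for the next one.
final class LowerBoundTracker {

    // Made-up key name; the map stands in for the settings collection.
    static final String KEY = "lastSuccessfulUpperBound";

    private final Map<String, Long> settings = new HashMap<>();

    /** Returns {lowerBound, upperBound} of the _modified range for the next run. */
    long[] nextRange(long upperBound) {
        // Without a recorded successful run, start from 0 (assumption).
        long lowerBound = settings.getOrDefault(KEY, 0L);
        return new long[] {lowerBound, upperBound};
    }

    /** Records the upper bound; call this only after a run has completed successfully. */
    void markSuccessful(long upperBound) {
        settings.put(KEY, upperBound);
    }

    public static void main(String[] args) {
        LowerBoundTracker tracker = new LowerBoundTracker();
        long[] first = tracker.nextRange(500L);
        tracker.markSuccessful(first[1]);
        long[] second = tracker.nextRange(900L);
        System.out.println(second[0] + " .. " + second[1]);
    }
}
```

Because the stored value only advances after a successful run, a failed or aborted run leaves the lower bound unchanged and the same documents are checked again next time.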



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (OAK-3070) Use a lower bound in VersionGC query to avoid checking unmodified once deleted docs

2017-03-08 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-3070:
--
Attachment: OAK-3070-2.patch

Attached an updated patch [^OAK-3070-2.patch] with additional tests, unified 
implementations for VersionGCSupport and no margin for the stored lower bound 
of the next VersionGC.

> Use a lower bound in VersionGC query to avoid checking unmodified once 
> deleted docs
> ---
>
> Key: OAK-3070
> URL: https://issues.apache.org/jira/browse/OAK-3070
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, rdbmk
>Reporter: Chetan Mehrotra
>Assignee: Vikas Saurabh
>  Labels: performance
> Attachments: OAK-3070-2.patch, OAK-3070.patch, 
> OAK-3070-updated.patch, OAK-3070-updated.patch


[jira] [Updated] (OAK-3070) Use a lower bound in VersionGC query to avoid checking unmodified once deleted docs

2017-02-16 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-3070:

Attachment: OAK-3070-updated.patch

patch update to apply to trunk (also logging slightly extended)

> Use a lower bound in VersionGC query to avoid checking unmodified once 
> deleted docs
> ---
>
> Key: OAK-3070
> URL: https://issues.apache.org/jira/browse/OAK-3070
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, rdbmk
>Reporter: Chetan Mehrotra
>Assignee: Vikas Saurabh
>  Labels: performance
> Attachments: OAK-3070.patch, OAK-3070-updated.patch, 
> OAK-3070-updated.patch


[jira] [Updated] (OAK-3070) Use a lower bound in VersionGC query to avoid checking unmodified once deleted docs

2017-02-16 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-3070:

Attachment: OAK-3070-updated.patch

patch update to apply to trunk

> Use a lower bound in VersionGC query to avoid checking unmodified once 
> deleted docs
> ---
>
> Key: OAK-3070
> URL: https://issues.apache.org/jira/browse/OAK-3070
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, rdbmk
>Reporter: Chetan Mehrotra
>Assignee: Vikas Saurabh
>  Labels: performance
> Attachments: OAK-3070.patch, OAK-3070-updated.patch


[jira] [Updated] (OAK-3070) Use a lower bound in VersionGC query to avoid checking unmodified once deleted docs

2016-01-20 Thread Vikas Saurabh (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Saurabh updated OAK-3070:
---
Fix Version/s: (was: 1.4)

> Use a lower bound in VersionGC query to avoid checking unmodified once 
> deleted docs
> ---
>
> Key: OAK-3070
> URL: https://issues.apache.org/jira/browse/OAK-3070
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, rdbmk
>Reporter: Chetan Mehrotra
>Assignee: Vikas Saurabh
>  Labels: performance
> Attachments: OAK-3070.patch



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3070) Use a lower bound in VersionGC query to avoid checking unmodified once deleted docs

2015-10-21 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-3070:
--
Fix Version/s: (was: 1.3.9)
   1.4

> Use a lower bound in VersionGC query to avoid checking unmodified once 
> deleted docs
> ---
>
> Key: OAK-3070
> URL: https://issues.apache.org/jira/browse/OAK-3070
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, rdbmk
>Reporter: Chetan Mehrotra
>Assignee: Vikas Saurabh
>  Labels: performance
> Fix For: 1.4
>
> Attachments: OAK-3070.patch


[jira] [Updated] (OAK-3070) Use a lower bound in VersionGC query to avoid checking unmodified once deleted docs

2015-09-08 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-3070:
--
Fix Version/s: (was: 1.3.6)
   1.3.7

> Use a lower bound in VersionGC query to avoid checking unmodified once 
> deleted docs
> ---
>
> Key: OAK-3070
> URL: https://issues.apache.org/jira/browse/OAK-3070
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, rdbmk
>Reporter: Chetan Mehrotra
>Assignee: Vikas Saurabh
>  Labels: performance
> Fix For: 1.3.7
>
> Attachments: OAK-3070.patch


[jira] [Updated] (OAK-3070) Use a lower bound in VersionGC query to avoid checking unmodified once deleted docs

2015-09-08 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-3070:
--
Labels: performance  (was: performance resilience)

> Use a lower bound in VersionGC query to avoid checking unmodified once 
> deleted docs
> ---
>
> Key: OAK-3070
> URL: https://issues.apache.org/jira/browse/OAK-3070
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, rdbmk
>Reporter: Chetan Mehrotra
>Assignee: Vikas Saurabh
>  Labels: performance
> Fix For: 1.3.6
>
> Attachments: OAK-3070.patch


[jira] [Updated] (OAK-3070) Use a lower bound in VersionGC query to avoid checking unmodified once deleted docs

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3070:
---
Fix Version/s: (was: 1.3.5)
   1.3.6

> Use a lower bound in VersionGC query to avoid checking unmodified once 
> deleted docs
> ---
>
> Key: OAK-3070
> URL: https://issues.apache.org/jira/browse/OAK-3070
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, rdbmk
>Reporter: Chetan Mehrotra
>Assignee: Vikas Saurabh
>  Labels: performance, resilience
> Fix For: 1.3.6
>
> Attachments: OAK-3070.patch


[jira] [Updated] (OAK-3070) Use a lower bound in VersionGC query to avoid checking unmodified once deleted docs

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3070:
---
Labels: performance resilience  (was: )

> Use a lower bound in VersionGC query to avoid checking unmodified once 
> deleted docs
> ---
>
> Key: OAK-3070
> URL: https://issues.apache.org/jira/browse/OAK-3070
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, rdbmk
>Reporter: Chetan Mehrotra
>Assignee: Vikas Saurabh
>  Labels: performance, resilience
> Fix For: 1.3.5
>
> Attachments: OAK-3070.patch


[jira] [Updated] (OAK-3070) Use a lower bound in VersionGC query to avoid checking unmodified once deleted docs

2015-07-22 Thread Chetan Mehrotra (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra updated OAK-3070:
-
Assignee: Vikas Saurabh

> Use a lower bound in VersionGC query to avoid checking unmodified once 
> deleted docs
> ---
>
> Key: OAK-3070
> URL: https://issues.apache.org/jira/browse/OAK-3070
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, rdbmk
>Reporter: Chetan Mehrotra
>Assignee: Vikas Saurabh
> Fix For: 1.3.5
>
> Attachments: OAK-3070.patch


[jira] [Updated] (OAK-3070) Use a lower bound in VersionGC query to avoid checking unmodified once deleted docs

2015-07-21 Thread Vikas Saurabh (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Saurabh updated OAK-3070:
---
Attachment: OAK-3070.patch

Attaching [^OAK-3070.patch]. 
{{VersionGarbageCollectorTest#testGCDeletedDocument}} already covers fairly 
well that version GC works correctly.
The test case that I've added just asserts that {{gc()}} forms the correct query 
to the underlying storage, so that already processed documents aren't picked up 
again.
I wanted to keep the lower bound tight, based on the timestamp used in the last 
run. But I couldn't quite control the virtual clock to generate a doc with 
_modified equal to the last timestamp used, so instead I've given the lower 
bound a margin of 1 minute (i.e. the lower bound is 1 minute less than the 
upper bound of the last GC run).

[~chetanm], [~mreutegg], can you please review?
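
As a small illustration of the one-minute margin described in this comment (not the code from the patch): the class, constant and method names are hypothetical, timestamps are assumed to be in milliseconds, and clamping at zero is an added assumption for the sketch.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the margin applied to the lower bound.
final class LowerBoundMargin {

    // One minute, expressed in milliseconds.
    static final long MARGIN_MILLIS = TimeUnit.MINUTES.toMillis(1);

    /**
     * Returns the lower bound for the next GC run: one minute before the upper
     * bound used by the previous run, never below zero (assumption).
     */
    static long lowerBound(long lastUpperBoundMillis) {
        return Math.max(0L, lastUpperBoundMillis - MARGIN_MILLIS);
    }

    public static void main(String[] args) {
        // 600000 ms minus the 60000 ms margin
        System.out.println(lowerBound(600_000L));
    }
}
```

The margin trades a small amount of repeated checking for tolerance of documents whose _modified value sits right at the previous upper bound.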

> Use a lower bound in VersionGC query to avoid checking unmodified once 
> deleted docs
> ---
>
> Key: OAK-3070
> URL: https://issues.apache.org/jira/browse/OAK-3070
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, rdbmk
>Reporter: Chetan Mehrotra
> Fix For: 1.3.5
>
> Attachments: OAK-3070.patch