[jira] [Commented] (OAK-9267) Build Jackrabbit/jackrabbit-oak-trunk #82 failed

2020-11-06 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-9267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17227486#comment-17227486
 ] 

Hudson commented on OAK-9267:
-

Previously failing build now is OK.
 Passed run: [Jackrabbit/jackrabbit-oak-trunk 
#89|https://ci-builds.apache.org/job/Jackrabbit/job/jackrabbit-oak-trunk/89/] 
[console 
log|https://ci-builds.apache.org/job/Jackrabbit/job/jackrabbit-oak-trunk/89/console]

> Build Jackrabbit/jackrabbit-oak-trunk #82 failed
> 
>
> Key: OAK-9267
> URL: https://issues.apache.org/jira/browse/OAK-9267
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: continuous integration
>Reporter: Hudson
>Priority: Major
>
> No description is provided
> The build Jackrabbit/jackrabbit-oak-trunk #82 has failed.
> First failed run: [Jackrabbit/jackrabbit-oak-trunk 
> #82|https://ci-builds.apache.org/job/Jackrabbit/job/jackrabbit-oak-trunk/82/] 
> [console 
> log|https://ci-builds.apache.org/job/Jackrabbit/job/jackrabbit-oak-trunk/82/console]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (OAK-9267) Build Jackrabbit/jackrabbit-oak-trunk #82 failed

2020-11-06 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-9267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17227426#comment-17227426
 ] 

Hudson commented on OAK-9267:
-

Previously failing build now is OK.
 Passed run: [Jackrabbit/jackrabbit-oak-trunk 
#88|https://ci-builds.apache.org/job/Jackrabbit/job/jackrabbit-oak-trunk/88/] 
[console 
log|https://ci-builds.apache.org/job/Jackrabbit/job/jackrabbit-oak-trunk/88/console]

> Build Jackrabbit/jackrabbit-oak-trunk #82 failed
> 
>
> Key: OAK-9267
> URL: https://issues.apache.org/jira/browse/OAK-9267
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: continuous integration
>Reporter: Hudson
>Priority: Major
>
> No description is provided
> The build Jackrabbit/jackrabbit-oak-trunk #82 has failed.
> First failed run: [Jackrabbit/jackrabbit-oak-trunk 
> #82|https://ci-builds.apache.org/job/Jackrabbit/job/jackrabbit-oak-trunk/82/] 
> [console 
> log|https://ci-builds.apache.org/job/Jackrabbit/job/jackrabbit-oak-trunk/82/console]





[jira] [Updated] (OAK-9213) Support feature vector similarity / image similarity in Oak ES

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-9213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-9213:
-
Fix Version/s: 1.36.0

> Support feature vector similarity / image similarity in Oak ES
> --
>
> Key: OAK-9213
> URL: https://issues.apache.org/jira/browse/OAK-9213
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>Reporter: Nitin Gupta
>Assignee: Amrit Verma
>Priority: Major
> Fix For: 1.36.0
>
>






[jira] [Updated] (OAK-7043) Collect SegmentStore stats as part of status zip

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7043:
-
Fix Version/s: (was: 1.36.0)

> Collect SegmentStore stats as part of status zip
> 
>
> Key: OAK-7043
> URL: https://issues.apache.org/jira/browse/OAK-7043
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: segment-tar
>Reporter: Chetan Mehrotra
>Priority: Major
>  Labels: monitoring, production
> Fix For: 1.38.0
>
>
> Often, while investigating an issue, we ask the customer to provide the size 
> of the segment store and, at times, a listing of the segment store directory. 
> It would be useful to have an InventoryPrinter for SegmentStore that includes:
> * Size of segment store 
> * Listing of segment store directory
> * Possibly tail of journal.log
> * Possibly some stats/info from index files stored in tar files





[jira] [Updated] (OAK-7098) Refactor common logic between IndexUpdate and DocumentStoreIndexer

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7098:
-
Fix Version/s: (was: 1.36.0)

> Refactor common logic between IndexUpdate and DocumentStoreIndexer
> --
>
> Key: OAK-7098
> URL: https://issues.apache.org/jira/browse/OAK-7098
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: indexing, run
>Reporter: Chetan Mehrotra
>Priority: Major
> Fix For: 1.38.0
>
>
> DocumentStoreIndexer implements an alternative way of indexing which differs 
> from the diff-based indexing done by IndexUpdate. However, some of the logic 
> is common.
> We should refactor and abstract it out so that both can share the same logic.





[jira] [Updated] (OAK-7504) Include dynamic commit information in the persisted repository data

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7504:
-
Fix Version/s: (was: 1.36.0)

> Include dynamic commit information in the persisted repository data
> ---
>
> Key: OAK-7504
> URL: https://issues.apache.org/jira/browse/OAK-7504
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Francesco Mari
>Priority: Minor
> Fix For: 1.38.0
>
>
> The data in the Segment Store doesn't provide any information about the 
> dynamic behaviour of the system. For example, who performed the commit? How 
> many commits were performed from the same committer?
> In order to simplify debugging the dynamic behaviour of a system, it should 
> be possible to store metadata about the commit in the super-root generated by 
> that commit. For example, the following information might be attached to the 
> super-root:
> * The name of the thread performing the commit. This solution might prove 
> expensive in terms of consumed disk space, but would be the most precise tool 
> to identify the author of a commit.
> * A hash of the thread name. If storing thread names proves expensive, a hash 
> of the thread name can be stored instead. This doesn't allow us to identify 
> the author of the commit exactly, but would allow us to correlate different 
> commits as performed by the same thread.
> * Both the thread name and its hash, with the thread name stored only every 
> Nth commit. This solution is not as precise as storing the thread name for 
> every commit but, if there is a frequent committer, its thread name will be 
> more likely to be sampled, thus providing a precise identity to a thread name 
> hash.
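
The third option above (a hash on every commit, the full thread name only on every Nth commit) could be sketched roughly as follows. This is an illustration only, not Oak code: the class name `CommitMetadataSampler`, the `sampleInterval` parameter, the metadata keys, and the use of `String.hashCode()` as the hash are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper; not part of the Oak API. */
final class CommitMetadataSampler {

    // Assumption: the full thread name is stored on every Nth commit.
    private final int sampleInterval;
    private final AtomicLong commits = new AtomicLong();

    CommitMetadataSampler(int sampleInterval) {
        if (sampleInterval < 1) {
            throw new IllegalArgumentException("sampleInterval must be >= 1");
        }
        this.sampleInterval = sampleInterval;
    }

    /** Returns the metadata that would be attached to the super-root of one commit. */
    Map<String, String> metadataFor(String threadName) {
        Map<String, String> metadata = new LinkedHashMap<>();
        // The hash is cheap and is stored for every commit.
        metadata.put("threadHash", Integer.toHexString(threadName.hashCode()));
        // The full name is stored only for commits 1, N+1, 2N+1, ...
        long n = commits.getAndIncrement();
        if (n % sampleInterval == 0) {
            metadata.put("threadName", threadName);
        }
        return metadata;
    }
}
```

A frequent committer produces many commits, so its name is likely to land on a sampled commit and can then be matched to its hash on the unsampled ones.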





[jira] [Updated] (OAK-3373) Observers dont survive store restart (was: LuceneIndexProvider: java.lang.IllegalStateException: open)

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-3373:
-
Fix Version/s: (was: 1.36.0)

> Observers dont survive store restart (was: LuceneIndexProvider: 
> java.lang.IllegalStateException: open)
> --
>
> Key: OAK-3373
> URL: https://issues.apache.org/jira/browse/OAK-3373
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.3.5
>Reporter: Stefan Egli
>Priority: Minor
> Fix For: 1.38.0
>
>
> The following exception occurs when stopping, then immediately re-starting 
> the oak-core bundle (which was done as part of testing for OAK-3250 - but can 
> be reproduced independently). It's not clear what the consequences are 
> though..
> {code}08.09.2015 14:20:26.960 *ERROR* [oak-lucene-0] 
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProvider Uncaught 
> exception in 
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProvider@3a4a6c5c
> org.apache.jackrabbit.oak.plugins.document.DocumentStoreException: Error 
> occurred while fetching children for path /oak:index/authorizables
> at 
> org.apache.jackrabbit.oak.plugins.document.DocumentStoreException.convert(DocumentStoreException.java:48)
> at 
> org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.getChildren(DocumentNodeStore.java:902)
> at 
> org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.getChildNodes(DocumentNodeStore.java:1082)
> at 
> org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.getChildNodeEntries(DocumentNodeState.java:508)
> at 
> org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.access$100(DocumentNodeState.java:65)
> at 
> org.apache.jackrabbit.oak.plugins.document.DocumentNodeState$ChildNodeEntryIterator.fetchMore(DocumentNodeState.java:716)
> at 
> org.apache.jackrabbit.oak.plugins.document.DocumentNodeState$ChildNodeEntryIterator.<init>(DocumentNodeState.java:681)
> at 
> org.apache.jackrabbit.oak.plugins.document.DocumentNodeState$1.iterator(DocumentNodeState.java:289)
> at 
> org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:129)
> at 
> org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:303)
> at 
> org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.compareAgainstBaseState(DocumentNodeState.java:359)
> at 
> org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
> at 
> org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:140)
> at 
> org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:303)
> at 
> org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.compareAgainstBaseState(DocumentNodeState.java:359)
> at 
> org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
> at 
> org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:140)
> at 
> org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:303)
> at 
> org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.compareAgainstBaseState(DocumentNodeState.java:359)
> at 
> org.apache.jackrabbit.oak.spi.commit.EditorDiff.process(EditorDiff.java:52)
> at 
> org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.update(IndexTracker.java:108)
> at 
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProvider.contentChanged(LuceneIndexProvider.java:73)
> at 
> org.apache.jackrabbit.oak.spi.commit.BackgroundObserver$1$1.call(BackgroundObserver.java:127)
> at 
> org.apache.jackrabbit.oak.spi.commit.BackgroundObserver$1$1.call(BackgroundObserver.java:121)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalStateException: open
> at org.bson.util.Assertions.isTrue(Assertions.java:36)
> at 
> com.mongodb.DBTCPConnector.isMongosConnection(DBTCPConnector.java:367)
> at com.mongodb.Mongo.isMongosConnection(Mongo.java:622)
> at com.mongodb.DBCursor._check(DBCursor.java:494)
> at com.mongodb.DBCursor._hasNext(DBCursor.java:621)
> at 

[jira] [Updated] (OAK-6904) NRT Indexes should be closed if async indexer progresses

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6904:
-
Fix Version/s: (was: 1.36.0)

> NRT Indexes should be closed if async indexer progresses
> 
>
> Key: OAK-6904
> URL: https://issues.apache.org/jira/browse/OAK-6904
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Minor
> Fix For: 1.38.0
>
>
> Currently, the NRTIndex instances associated with an IndexNodeManager are only 
> closed upon index update. However, each IndexNodeManager keeps references to 2 
> NRTIndex instances, so the following sequence can happen:
> # Index /oak:index/ntBaseLucene refers to 2 NRT indexes, NR1 and NR2, each 
> with 1 M entries.
> # AsyncIndexer updates and thus refreshes /oak:index/ntBaseLucene. This 
> causes a new NRT index NR3 to be created and NR1 to be closed, so NR3 and NR2 
> are active.
> # AsyncIndexer updates again, but no change in the setup causes an update 
> to /oak:index/ntBaseLucene. Thus this index does not get refreshed and 
> continues to refer to NR2.
> As a fix, we should refresh any index that refers to 2 NRT indexes where the 
> previous one is not empty.





[jira] [Updated] (OAK-6757) Convert oak-auth-ldap to OSGi R6 annotations

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6757:
-
Fix Version/s: (was: 1.36.0)

> Convert oak-auth-ldap to OSGi R6 annotations
> 
>
> Key: OAK-6757
> URL: https://issues.apache.org/jira/browse/OAK-6757
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: auth-ldap
>Reporter: Robert Munteanu
>Priority: Major
> Fix For: 1.38.0
>
>






[jira] [Updated] (OAK-4647) Multiplexing support in PropertyIndexStats MBean

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-4647:
-
Fix Version/s: (was: 1.36.0)

> Multiplexing support in PropertyIndexStats MBean
> 
>
> Key: OAK-4647
> URL: https://issues.apache.org/jira/browse/OAK-4647
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: property-index
>Reporter: Chetan Mehrotra
>Priority: Minor
> Fix For: 1.38.0
>
>
> {{PropertyIndexStats}} MBean added in OAK-4144 allows introspecting property 
> index content. This needs to be adapted to support updated storage format 
> when multiplexing is enabled





[jira] [Updated] (OAK-6914) Improve indexing progress estimates with multiple includes

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6914:
-
Fix Version/s: (was: 1.36.0)

> Improve indexing progress estimates with multiple includes
> --
>
> Key: OAK-6914
> URL: https://issues.apache.org/jira/browse/OAK-6914
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: indexing
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Minor
> Fix For: 1.38.0
>
>
> With OAK-5970, support was added for providing an ETA as indexing progresses. 
> However, as discussed in that issue, the estimate might not be good if indexes 
> have multiple includes and excludes.
> The purpose of this task is to look for ways to improve it.





[jira] [Updated] (OAK-5937) Disable query where path restriction is not absolute

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5937:
-
Fix Version/s: (was: 1.36.0)

> Disable query where path restriction is not absolute
> 
>
> Key: OAK-5937
> URL: https://issues.apache.org/jira/browse/OAK-5937
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: query
>Reporter: Chetan Mehrotra
>Assignee: Thomas Mueller
>Priority: Minor
> Fix For: 1.38.0
>
>
> A query like the one below cannot be executed in a performant way. We should 
> provide an option to reject such queries:
> //content/foo/bar





[jira] [Updated] (OAK-7425) Add discovery mechanism for tooling implementations

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7425:
-
Fix Version/s: (was: 1.36.0)

> Add discovery mechanism for tooling implementations
> ---
>
> Key: OAK-7425
> URL: https://issues.apache.org/jira/browse/OAK-7425
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Francesco Mari
>Priority: Major
>  Labels: technical_debt
> Fix For: 1.38.0
>
> Attachments: 001.patch
>
>
> This issue proposes an idea for discovering implementations of tooling for 
> the Segment Store. Developing a tool for the Segment Store should include the 
> following steps.
> * The tool compiles against the {{NodeStore}} API and the API exposed through 
> the oak-segment-tar-tool-api. In particular, the tool uses the 
> {{ToolingSupportFactory}} and related interfaces to instantiate a NodeStore 
> and, optionally, a {{NodeState}} for the proc tree.
> * The tool runs with an implementation-dependent uber-jar in the classpath. 
> The uber-jar includes the {{ToolingSupportFactory}} API, its implementation, 
> and every other class required for the implementation to work. No other JAR 
> is required to use the {{ToolingSupportFactory}} API. The tool uses 
> Java's {{ServiceLoader}} to instantiate an implementation of 
> {{ToolingSupportFactory}}. The uber-jar is the {{oak-segment-tar-tool}} 
> module.
> The patch falls short of fully implementing the use case because 
> {{oak-segment-tar-tool-api}} is not versioned independently from Oak. This 
> can't happen at the moment because {{oak-store-spi}} and its dependencies are 
> not independently versioned either. The workflow described above could still 
> work, but only because the {{NodeStore}} and {{NodeState}} API are quite 
> stable. A cleaner solution to dependency management is required in the long 
> run.
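
The discovery step described above relies on the standard `java.util.ServiceLoader` mechanism. A minimal sketch of what the tool side might look like is shown below. The `ToolingSupportFactory` interface here is a hypothetical stand-in with an invented `describe()` method (the real interface lives in the patch and its methods are not shown in this message), and `ToolingDiscovery` is a name made up for the example.

```java
import java.util.Iterator;
import java.util.Optional;
import java.util.ServiceLoader;

/** Hypothetical stand-in for the factory interface named in the issue. */
interface ToolingSupportFactory {
    String describe();
}

/** Hypothetical tool-side helper; not part of the Oak API. */
final class ToolingDiscovery {

    /** Returns the first implementation registered on the classpath, if any. */
    static Optional<ToolingSupportFactory> discover() {
        // ServiceLoader looks for provider declarations such as
        // META-INF/services/<fully qualified interface name> in the JARs on the classpath.
        Iterator<ToolingSupportFactory> providers =
                ServiceLoader.load(ToolingSupportFactory.class).iterator();
        return providers.hasNext() ? Optional.of(providers.next()) : Optional.empty();
    }
}
```

With this arrangement the tool compiles only against the interface, and the uber-jar on the classpath contributes the provider declaration and the implementation classes.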





[jira] [Updated] (OAK-5588) Improve Session stats

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5588:
-
Fix Version/s: (was: 1.36.0)

> Improve Session stats
> -
>
> Key: OAK-5588
> URL: https://issues.apache.org/jira/browse/OAK-5588
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core
>Reporter: Ian Boston
>Priority: Major
>  Labels: monitoring, production
> Fix For: 1.38.0
>
>
> Currently each session has a SessionsStats MBean. Amongst other things, it 
> records the total number of refresh operations. It also records the rate of 
> refresh operations, although this number in its current form is not useful, 
> as the rate is the number of refresh operations divided by the session 
> lifetime. It would be much better to have a set of stats related to classes 
> of users that record proper metrics in a consistent way, e.g. 1 metric set 
> per service-user, 1 for the admin user and perhaps 1 for all normal users. 
> Each would record m1, m5, m15 rates, total count, and p50, p75, p95, p99, p999 
> durations with mean and stdev. Two sets of metrics could then be compared and 
> monitored without having to look at the code to work out how the metric was 
> calculated. Oak has metrics support to do this, so minimal code would be 
> required.
> I don't think it would be viable to have 1 metric per unique session (too much 
> overhead, too much data, good for devs but bad for production), and in fact 
> having 1 JMX MBean per unique session is likely to cause problems with 
> everything connected to JMX, even if the ManagementServer can cope. The same 
> goes for the other proliferation of MBeans in Oak 1.6. Perhaps a review of 
> JMX in Oak is due.
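
The grouping proposed above (one metric set per service user, one for the admin user, one shared by all normal users) could be keyed roughly as in the sketch below. Everything here is an assumption for illustration: the class `RefreshStatsByUserClass`, the key names, the literal `"admin"` user ID, and the way service users are identified are invented, and a plain counter stands in for the rate and percentile metrics that a metrics library would provide.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch; not part of the Oak API. */
final class RefreshStatsByUserClass {

    // Assumption: the set of service user IDs is known up front.
    private final Set<String> serviceUsers;
    private final Map<String, LongAdder> refreshCounts = new ConcurrentHashMap<>();

    RefreshStatsByUserClass(Set<String> serviceUsers) {
        this.serviceUsers = serviceUsers;
    }

    /** Maps a user ID to the metric set it contributes to. */
    String metricKey(String userId) {
        if ("admin".equals(userId)) {
            return "admin";
        }
        if (serviceUsers.contains(userId)) {
            return "service:" + userId;   // one metric set per service user
        }
        return "users";                   // one shared set for all normal users
    }

    /** Records one refresh operation against the user's metric set. */
    void recordRefresh(String userId) {
        refreshCounts.computeIfAbsent(metricKey(userId), k -> new LongAdder()).increment();
    }

    /** Returns the total number of refreshes recorded for a metric set. */
    long refreshCount(String key) {
        LongAdder counter = refreshCounts.get(key);
        return counter == null ? 0 : counter.sum();
    }
}
```

The number of metric sets then grows with the number of service users rather than with the number of sessions.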





[jira] [Updated] (OAK-8413) Use the new Azure SDK in the Azure Segment Store

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-8413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-8413:
-
Fix Version/s: (was: 1.36.0)

> Use the new Azure SDK in the Azure Segment Store
> 
>
> Key: OAK-8413
> URL: https://issues.apache.org/jira/browse/OAK-8413
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-azure
>Reporter: Tomek Rękawek
>Assignee: Andrei Dulceanu
>Priority: Major
> Fix For: 1.38.0
>
>
> We should update the oak-segment-azure to use the most recent Azure SDK 
> version, to keep it compatible with the oak-blob-cloud-azure (see OAK-8105):
> {code}
> <dependency>
>   <groupId>com.microsoft.azure</groupId>
>   <artifactId>azure-storage-blob</artifactId>
>   <version>11.0.1</version>
> </dependency>
> {code}
> //cc: [~mattvryan]





[jira] [Updated] (OAK-5367) Strange path parsing

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5367:
-
Fix Version/s: (was: 1.36.0)

> Strange path parsing
> 
>
> Key: OAK-5367
> URL: https://issues.apache.org/jira/browse/OAK-5367
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
>Priority: Major
> Fix For: 1.38.0
>
> Attachments: JcrPathParserTest.java
>
>
> Incorrect handling of path with "\{" was fixed in OAK-5260, but the behavior 
> of the JcrPathParser is still strange. For example:
> * the root node, "/", is mapped to "/", and the current node, "." is mapped 
> to "". But "/." is mapped to the current node (should be root node).
> * "/parent/./childA2" is mapped to "/parent/childA2" (which is fine), but 
> "/parent/.}/childA2" is also mapped to "/parent/childA2".
> * "\}\{" and "}\[" and "}}[" are mapped to the current node. So are ".[" and 
> "/[" and ".}". And "}\{test" is mapped to "}\{test", which is 
> inconsistent-weird.
> * "x\[1\]}" is mapped to "x".
> All that weirdness should be resolved. Some seem to be just weird, but some 
> look like they could become a problem at some point ("}\{").





[jira] [Updated] (OAK-6408) Review package exports for o.a.j.oak.plugins.index.*

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6408:
-
Fix Version/s: (was: 1.36.0)

> Review package exports for o.a.j.oak.plugins.index.*
> 
>
> Key: OAK-6408
> URL: https://issues.apache.org/jira/browse/OAK-6408
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, indexing
>Reporter: Angela Schreiber
>Priority: Major
> Fix For: 1.38.0
>
>
> While working on OAK-6304 and OAK-6355, I noticed that 
> _o.a.j.oak.plugins.index.*_ contains both internal API/utilities and 
> implementation details which get equally exported (though without having any 
> package export version set).
> In light of the modularization effort, I would like to suggest that we 
> try to sort that out and separate the _public_ parts from implementation 
> details. 





[jira] [Updated] (OAK-5897) Optimize like constraint support in Property Indexes

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5897:
-
Fix Version/s: (was: 1.36.0)

> Optimize like constraint support in Property Indexes
> 
>
> Key: OAK-5897
> URL: https://issues.apache.org/jira/browse/OAK-5897
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: property-index
>Reporter: Chetan Mehrotra
>Assignee: Thomas Mueller
>Priority: Major
> Fix For: 1.38.0
>
>
> Consider a query
> {noformat}
>  /jcr:root/content//element(*, nt:unstructured)[jcr:like(@resource, 
> '/content/foo/bar%')]
> {noformat}
> This currently gets translated into a range property restriction 
> {noformat}
>  property=[resource=[[/content/foo/bar.., ../content/foo/bas]]]
> {noformat}
> For such a query, the property index currently returns all nodes having a 
> "resource" property, i.e. all index data. This can be optimized to return only 
> those nodes whose indexed value satisfies the range property restriction.
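
To make the proposed optimization concrete, here is a small sketch of a prefix lookup expressed as a range over sorted index values, mirroring the `[/content/foo/bar.., ../content/foo/bas]` restriction above. It is not how the Oak property index is implemented: the class `PrefixRangeLookup` and the in-memory `NavigableMap` from indexed value to node paths are assumptions chosen to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

/** Hypothetical sketch; the real property index stores its entries differently. */
final class PrefixRangeLookup {

    /**
     * Smallest string greater than every string starting with the prefix,
     * e.g. "/content/foo/bar" -> "/content/foo/bas".
     * Assumes a non-empty prefix whose last char is not Character.MAX_VALUE.
     */
    static String upperBound(String prefix) {
        if (prefix.isEmpty()) {
            throw new IllegalArgumentException("prefix must not be empty");
        }
        int last = prefix.length() - 1;
        return prefix.substring(0, last) + (char) (prefix.charAt(last) + 1);
    }

    /** Returns the paths whose indexed value lies in [prefix, upperBound(prefix)). */
    static List<String> pathsWithPrefix(NavigableMap<String, List<String>> index, String prefix) {
        List<String> result = new ArrayList<>();
        // Only the entries inside the range are visited, not the whole index.
        for (List<String> paths : index.subMap(prefix, true, upperBound(prefix), false).values()) {
            result.addAll(paths);
        }
        return result;
    }
}
```

The point of the sketch is that the range bounds alone are enough to skip every indexed value outside the prefix, instead of returning all nodes that have the property.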





[jira] [Updated] (OAK-3658) Test failures: JackrabbitNodeTest#testRename and testRenameEventHandling

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-3658:
-
Fix Version/s: (was: 1.36.0)

> Test failures: JackrabbitNodeTest#testRename and testRenameEventHandling
> 
>
> Key: OAK-3658
> URL: https://issues.apache.org/jira/browse/OAK-3658
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: jcr
>Reporter: Amit Jain
>Assignee: Amit Jain
>Priority: Minor
> Fix For: 1.38.0
>
>
> Tests fail regularly on trunk - {{JackrabbitNodeTest#testRename}} and 
> {{JackrabbitNodeTest#testRenameEventHandling}}.
> {noformat}
> Test set: org.apache.jackrabbit.oak.jcr.JackrabbitNodeTest
> ---
> Tests run: 8, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 0.106 sec <<< 
> FAILURE!
> testRenameEventHandling(org.apache.jackrabbit.oak.jcr.JackrabbitNodeTest)  
> Time elapsed: 0.01 sec  <<< ERROR!
> javax.jcr.nodetype.ConstraintViolationException: Item is protected.
>   at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl$ItemWriteOperation.checkPreconditions(ItemImpl.java:98)
>   at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.prePerform(SessionDelegate.java:614)
>   at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.performVoid(SessionDelegate.java:270)
>   at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl.rename(NodeImpl.java:1485)
>   at 
> org.apache.jackrabbit.oak.jcr.JackrabbitNodeTest.testRenameEventHandling(JackrabbitNodeTest.java:124)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at junit.framework.TestCase.runTest(TestCase.java:176)
>   at junit.framework.TestCase.runBare(TestCase.java:141)
>   at junit.framework.TestResult$1.protect(TestResult.java:122)
>   at junit.framework.TestResult.runProtected(TestResult.java:142)
>   at junit.framework.TestResult.run(TestResult.java:125)
>   at junit.framework.TestCase.run(TestCase.java:129)
>   at 
> org.apache.jackrabbit.test.AbstractJCRTest.run(AbstractJCRTest.java:464)
>   at junit.framework.TestSuite.runTest(TestSuite.java:252)
>   at junit.framework.TestSuite.run(TestSuite.java:247)
>   at 
> org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:86)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
>   at 
> org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
>   at 
> org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)
> testRename(org.apache.jackrabbit.oak.jcr.JackrabbitNodeTest)  Time elapsed: 
> 0.007 sec  <<< FAILURE!
> junit.framework.ComparisonFailure: expected:<[a]> but was:<[rep:policy]>
>   at junit.framework.Assert.assertEquals(Assert.java:100)
>   at junit.framework.Assert.assertEquals(Assert.java:107)
>   at junit.framework.TestCase.assertEquals(TestCase.java:269)
>   at 
> org.apache.jackrabbit.oak.jcr.JackrabbitNodeTest.testRename(JackrabbitNodeTest.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at junit.framework.TestCase.runTest(TestCase.java:176)
>   at junit.framework.TestCase.runBare(TestCase.java:141)
>   at junit.framework.TestResult$1.protect(TestResult.java:122)
>   at 

[jira] [Updated] (OAK-7151) Support indexed based excerpts on properties

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7151:
-
Fix Version/s: (was: 1.36.0)

> Support indexed based excerpts on properties
> 
>
> Key: OAK-7151
> URL: https://issues.apache.org/jira/browse/OAK-7151
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Reporter: Vikas Saurabh
>Assignee: Vikas Saurabh
>Priority: Major
> Fix For: 1.38.0
>
> Attachments: OAK-7151.patch, OAK-7151.xpath-new-syntax.patch, 
> OAK-7151.xpath.patch
>
>
> As discovered in OAK-4401, we fall back to {{SimpleExcerptProvider}} when 
> requesting excerpts for properties.
> The issue, as highlighted in [~teofili]'s comment \[0], is that at query time 
> we don't have information about which columns/fields would be 
> required for excerpts.
> A possible approach is for the query to specify explicitly which columns 
> would be required in facets (of course, node level excerpt would still be 
> supported). This issue is to track that improvement.
> Note: this is *not* a substitute for OAK-4401, which is about doing saner 
> highlighting when {{SimpleExcerptProvider}} comes into play, e.g. despite this 
> issue, for excerpts of non-stored fields (properties which aren't configured 
> with {{useInExcerpt}} in the index definition), we'd need to fall back to 
> {{SimpleExcerptProvider}}.
> /[~tmueller]
> \[0]: 
> https://issues.apache.org/jira/browse/OAK-4401?focusedCommentId=15299857=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15299857





[jira] [Updated] (OAK-7106) Index Tooling for Oak 1.10

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7106:
-
Fix Version/s: (was: 1.36.0)

> Index Tooling for Oak 1.10
> --
>
> Key: OAK-7106
> URL: https://issues.apache.org/jira/browse/OAK-7106
> Project: Jackrabbit Oak
>  Issue Type: Epic
>  Components: indexing, run
>Reporter: Chetan Mehrotra
>Priority: Major
> Fix For: 1.38.0
>
>
> Epic to track tooling work for Oak 1.10 release





[jira] [Updated] (OAK-6947) Add package export versions for oak-store-spi

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6947:
-
Fix Version/s: (was: 1.36.0)

> Add package export versions for oak-store-spi
> -
>
> Key: OAK-6947
> URL: https://issues.apache.org/jira/browse/OAK-6947
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: store-spi
>Reporter: Angela Schreiber
>Priority: Major
> Fix For: 1.38.0
>
> Attachments: OAK-6947.patch
>
>
> [~mduerig], [~mreutegg], [~frm], [~stillalex], do you have any strong 
> preferences with regard to the packages we placed in the _oak-store-spi_ module?
> Currently we explicitly export all packages and I think it would make sense 
> to enable the baseline plugin for these packages.
> Any objection from your side?





[jira] [Updated] (OAK-7261) DocumentStore: inconsistent behaviour for invalid Strings as document ID

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7261:
-
Fix Version/s: (was: 1.36.0)

> DocumentStore: inconsistent behaviour for invalid Strings as document ID
> 
>
> Key: OAK-7261
> URL: https://issues.apache.org/jira/browse/OAK-7261
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: documentmk, mongomk, rdbmk
>Reporter: Julian Reschke
>Priority: Major
> Fix For: 1.38.0
>
>
> - H2DB and Derby roundtrip any string
>  - PostgreSQL rejects the invalid string early
>  - DB2 and Oracle fail the same way as segment store (they persist the 
> replacement character) (see OAK-5506)
>  - MySQL and SQLServer fail the same way as DB2 and Oracle, but here it's the 
> RDBDocumentStore's fault, because the ID column is binary, and we transform 
> to byte sequences ourselves
>  - Mongo claims it saved the document, but upon lookup, returns something 
> with a different ID
> Note that due to how RDB reads work, the returned document has the ID that 
> was requested, not what the DB actually contains.
>  
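The "replacement character" behaviour listed above can be reproduced with plain Java string encoding. The sketch below is not Oak code: the class name SurrogateRoundTrip is made up for illustration, and it only shows what java.lang.String does when an ID containing an unpaired surrogate is converted to UTF-8 bytes and back. Whether a given store loses the ID at exactly this step is an assumption.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Oak: encodes an ID to UTF-8 and decodes it again.
final class SurrogateRoundTrip {

    static String roundTrip(String id) {
        // String.getBytes(Charset) silently substitutes the charset's
        // replacement bytes for characters it cannot encode.
        byte[] bytes = id.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String id = "foo\ud800"; // ends with an unpaired high surrogate
        String back = roundTrip(id);
        // The ID that comes back differs from the one that went in.
        System.out.println("round-trips unchanged: " + id.equals(back));
    }
}
```

A store that converts IDs to bytes this way would persist a different ID than the one requested, which is consistent with the MySQL and SQLServer note above.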





[jira] [Updated] (OAK-5272) Expose BlobStore API to provide information whether blob id is content hashed

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5272:
-
Fix Version/s: (was: 1.36.0)

> Expose BlobStore API to provide information whether blob id is content hashed
> -
>
> Key: OAK-5272
> URL: https://issues.apache.org/jira/browse/OAK-5272
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: blob
>Reporter: Amit Jain
>Priority: Major
> Fix For: 1.38.0
>
>
> As per discussion in OAK-5253 it's better to have some information from the 
> BlobStore(s) whether the blob id can be solely relied upon for comparison.





[jira] [Updated] (OAK-6760) Convert oak-blob-cloud to OSGi R6 annotations

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6760:
-
Fix Version/s: (was: 1.36.0)

> Convert oak-blob-cloud to OSGi R6 annotations
> -
>
> Key: OAK-6760
> URL: https://issues.apache.org/jira/browse/OAK-6760
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: blob-cloud
>Reporter: Robert Munteanu
>Priority: Major
> Fix For: 1.38.0
>
>






[jira] [Updated] (OAK-8859) RDB*Store: update Oracle JDBC dependency to 19.3.0.0

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-8859:
-
Fix Version/s: (was: 1.36.0)

> RDB*Store: update Oracle JDBC dependency to 19.3.0.0
> 
>
> Key: OAK-8859
> URL: https://issues.apache.org/jira/browse/OAK-8859
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: parent
>Reporter: Julian Reschke
>Priority: Minor
> Fix For: 1.38.0
>
> Attachments: OAK-8859.diff
>
>
> See 
> .





[jira] [Updated] (OAK-5553) Index async index in a new lane without blocking the main lane

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5553:
-
Fix Version/s: (was: 1.36.0)

> Index async index in a new lane without blocking the main lane
> --
>
> Key: OAK-5553
> URL: https://issues.apache.org/jira/browse/OAK-5553
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: indexing
>Reporter: Chetan Mehrotra
>Priority: Major
> Fix For: 1.38.0
>
>
> Currently, if an async index has to be reindexed for any reason, say an 
> update of the index definition, this process blocks the indexing of other 
> indexes on that lane. 
> For example, suppose the "async" lane has 2 indexes, /oak:index/fooIndex and 
> /oak:index/barIndex, and fooIndex needs to be reindexed. In such a case 
> AsyncIndexUpdate currently works on the reindexing, and until that 
> completes the other index does not receive any updates. If the reindexing 
> takes, say, 1 day, the other index would lag behind by that time. Note that 
> NRT indexing would help somewhat here.
> To improve this we can implement something similar to what was done for 
> property indexes in OAK-1456, i.e. provide a way where 
> # an admin can trigger a reindex of some async indexes
> # those indexes are moved to a different lane and then reindexed
> # post-reindexing logic then moves them back to their original lane
> Further, this task can then be performed on a non-leader node, as the 
> indexes would not be part of any active lane. Also, we may implement it as 
> part of oak-run.





[jira] [Updated] (OAK-7635) oak-run check should support Azure Segment Store

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7635:
-
Fix Version/s: (was: 1.36.0)

> oak-run check should support Azure Segment Store
> 
>
> Key: OAK-7635
> URL: https://issues.apache.org/jira/browse/OAK-7635
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: run, segment-tar
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: tooling
> Fix For: 1.38.0
>
>
> {{oak-run check}} should accept Azure URIs for the segment store in order to 
> be able to check for data integrity. This will come in handy in the light of 
> remote compacted segment stores and/or sidegraded remote segment stores (see 
> OAK-7623, OAK-7459).
> The Azure URI will be taken as an argument and will have the following format: 
> {{az:[https://myaccount.blob.core.windows.net/container/repo]}}, where _az_ 
> identifies the cloud provider. The last missing piece is the secret key, which 
> will be supplied as an environment variable, i.e. _AZURE_SECRET_KEY._
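The URI format described above could be taken apart roughly as follows. This is a minimal sketch, not the oak-run implementation: the "az:" prefix, the example URL and the AZURE_SECRET_KEY variable come from the issue, while the class AzureStoreLocation and the split into account, container and directory are assumptions made for illustration.

```java
import java.net.URI;

// Hypothetical value class: holds the pieces of an "az:https://..." argument.
final class AzureStoreLocation {

    static final String PREFIX = "az:";

    final String account;
    final String container;
    final String directory;

    private AzureStoreLocation(String account, String container, String directory) {
        this.account = account;
        this.container = container;
        this.directory = directory;
    }

    static AzureStoreLocation parse(String argument) {
        if (!argument.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Expected an az: URI, got " + argument);
        }
        URI uri = URI.create(argument.substring(PREFIX.length()));
        String host = uri.getHost();
        String path = uri.getPath();
        if (host == null || path == null) {
            throw new IllegalArgumentException("Missing host or path in " + argument);
        }
        // "/container/repo" splits into "", "container", "repo".
        String[] parts = path.split("/", 3);
        if (parts.length < 3 || parts[1].isEmpty() || parts[2].isEmpty()) {
            throw new IllegalArgumentException("Expected /container/directory in " + argument);
        }
        // Assumption: the account name is the first label of the host name.
        int dot = host.indexOf('.');
        String account = dot < 0 ? host : host.substring(0, dot);
        return new AzureStoreLocation(account, parts[1], parts[2]);
    }

    public static void main(String[] args) {
        AzureStoreLocation location =
                parse("az:https://myaccount.blob.core.windows.net/container/repo");
        // The secret is read from the environment and never taken from the URI.
        boolean hasKey = System.getenv("AZURE_SECRET_KEY") != null;
        System.out.println(location.account + " " + location.container + " "
                + location.directory + " secretKeySet=" + hasKey);
    }
}
```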





[jira] [Updated] (OAK-5990) Add properties filtering support to OakEventFilter

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5990:
-
Fix Version/s: (was: 1.36.0)

> Add properties filtering support to OakEventFilter
> --
>
> Key: OAK-5990
> URL: https://issues.apache.org/jira/browse/OAK-5990
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Affects Versions: 1.6.1
>Reporter: Stefan Egli
>Priority: Major
> Fix For: 1.38.0
>
>
> SLING-6164 introduced a _property name hint_ which, when set, limits the 
> observation events to only those that affect at least one of the listed 
> properties. The advantage is a further reduction in the events sent out. 
> This feature has not yet been implemented on the Oak side. 
> Thus we should add this to the OakEventFilter.





[jira] [Updated] (OAK-6911) Provide a way to tune inline size while storing binaries

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6911:
-
Fix Version/s: (was: 1.36.0)

> Provide a way to tune inline size while storing binaries
> 
>
> Key: OAK-6911
> URL: https://issues.apache.org/jira/browse/OAK-6911
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Chetan Mehrotra
>Priority: Major
>  Labels: performance, scalability
> Fix For: 1.38.0
>
>
> SegmentNodeStore currently inlines binaries of size less than 16KB 
> (Segment.MEDIUM_LIMIT) even if an external BlobStore is configured. 
> Due to this behaviour quite a bit of segment tar storage consists of blob 
> data. In one setup, out of 370 GB segmentstore size, 290 GB is due to inlined 
> binaries. If most of this binary content is moved to the BlobStore, it would 
> allow the same repository to work better with less RAM.
> So it would be useful if some way is provided to disable this default 
> behaviour and let the BlobStore take control of the inline size, i.e. in the 
> presence of a BlobStore no inlining is attempted by SegmentWriter.
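One way to picture the requested tuning knob is a small policy object that the writer consults before inlining. This is only a sketch under assumptions: InlinePolicy and its constructor arguments are hypothetical names, the 16 KB constant merely approximates the limit quoted in the issue, and the rule that everything stays in the segment store when no BlobStore is configured is a simplification.

```java
// Hypothetical policy, not SegmentWriter code.
final class InlinePolicy {

    // Approximation of the "less than 16KB" limit mentioned in the issue.
    static final int DEFAULT_INLINE_LIMIT = 16 * 1024;

    private final boolean blobStoreConfigured;
    private final int inlineLimit;

    InlinePolicy(boolean blobStoreConfigured, int inlineLimit) {
        this.blobStoreConfigured = blobStoreConfigured;
        this.inlineLimit = inlineLimit;
    }

    boolean shouldInline(long binarySize) {
        if (!blobStoreConfigured) {
            // Simplification: without a BlobStore the binary has nowhere else to go.
            return true;
        }
        // A limit of 0 disables inlining whenever a BlobStore is present.
        return binarySize < inlineLimit;
    }

    public static void main(String[] args) {
        InlinePolicy current = new InlinePolicy(true, DEFAULT_INLINE_LIMIT);
        InlinePolicy noInlining = new InlinePolicy(true, 0);
        System.out.println(current.shouldInline(4096));    // small binary, default limit
        System.out.println(noInlining.shouldInline(4096)); // same binary, inlining disabled
    }
}
```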





[jira] [Updated] (OAK-6919) SegmentCache might introduce unwanted memory references to SegmentId instances

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6919:
-
Fix Version/s: (was: 1.36.0)

> SegmentCache might introduce unwanted memory references to SegmentId instances
> --
>
> Key: OAK-6919
> URL: https://issues.apache.org/jira/browse/OAK-6919
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Francesco Mari
>Priority: Major
> Fix For: 1.38.0
>
>
> {{SegmentCache}} contains, through the underlying Guava cache, hard 
> references to both {{SegmentId}} and {{Segment}} instances. Thus, 
> {{SegmentCache}} contributes to the computation of in-memory references that, 
> in turn, constitute the root references of the garbage collection algorithm.
> Further investigations are needed to assess this statement but, if 
> {{SegmentCache}} is proved to be problematic, there are some possible 
> solutions.
> For example, {{SegmentCache}} might be reworked to store references to 
> MSB/LSB pairs as keys, instead of to {{SegmentId}} instances. Moreover, 
> instead of referencing {{Segment}} instances as values, {{SegmentCache}} 
> might hold references to their underlying {{ByteBuffer}}. With these changes 
> in place, {{SegmentCache}} would not interfere with {{SegmentTracker}} and 
> the garbage collection algorithm.
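The rework proposed above can be sketched as a cache keyed by the raw MSB/LSB pair with the byte buffer as the value. This is an illustration only: SegmentKey and BufferCacheSketch are hypothetical names, and a plain HashMap without eviction stands in for the Guava cache mentioned in the issue.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical key: carries only the two longs, no reference to a SegmentId.
final class SegmentKey {
    final long msb;
    final long lsb;

    SegmentKey(long msb, long lsb) {
        this.msb = msb;
        this.lsb = lsb;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof SegmentKey)) {
            return false;
        }
        SegmentKey that = (SegmentKey) other;
        return msb == that.msb && lsb == that.lsb;
    }

    @Override
    public int hashCode() {
        return 31 * Long.hashCode(msb) + Long.hashCode(lsb);
    }
}

// Hypothetical cache: values are buffers, so no Segment instance is retained.
final class BufferCacheSketch {
    private final Map<SegmentKey, ByteBuffer> cache = new HashMap<>();

    void put(long msb, long lsb, ByteBuffer data) {
        cache.put(new SegmentKey(msb, lsb), data.asReadOnlyBuffer());
    }

    ByteBuffer get(long msb, long lsb) {
        ByteBuffer cached = cache.get(new SegmentKey(msb, lsb));
        // Hand out a duplicate so callers cannot disturb the cached position.
        return cached == null ? null : cached.duplicate();
    }
}
```

Because neither the key nor the value holds a {{SegmentId}} or {{Segment}}, a cache shaped like this would not add roots for the garbage collection algorithm, which is the effect the description is after.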





[jira] [Updated] (OAK-6762) Convert oak-blob to OSGi R6 annotations

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6762:
-
Fix Version/s: (was: 1.36.0)

> Convert oak-blob to OSGi R6 annotations
> ---
>
> Key: OAK-6762
> URL: https://issues.apache.org/jira/browse/OAK-6762
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: blob
>Reporter: Robert Munteanu
>Priority: Major
> Fix For: 1.38.0
>
>






[jira] [Updated] (OAK-7660) Refactor AzureCompact and Compact

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7660:
-
Fix Version/s: (was: 1.36.0)

> Refactor AzureCompact and Compact
> -
>
> Key: OAK-7660
> URL: https://issues.apache.org/jira/browse/OAK-7660
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: tech-debt, technical_debt, tooling
> Fix For: 1.38.0
>
>
> {{AzureCompact}} in {{oak-segment-azure}} closely follows the structure and 
> logic of {{Compact}} in {{oak-segment-tar}}. Since the only thing which 
> differs is the underlying persistence used (remote in Azure vs. local in TAR 
> files), the common logic should be extracted into a superclass, extended by 
> both. 





[jira] [Updated] (OAK-7634) Repository migration docs should include info on TAR <-> Azure sidegrade

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7634:
-
Fix Version/s: (was: 1.36.0)

> Repository migration docs should include info on TAR <-> Azure sidegrade
> 
>
> Key: OAK-7634
> URL: https://issues.apache.org/jira/browse/OAK-7634
> Project: Jackrabbit Oak
>  Issue Type: Documentation
>  Components: doc, segment-tar
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Minor
> Fix For: 1.38.0
>
>






[jira] [Updated] (OAK-6767) Remove felix SCR annotation support from parent pom

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6767:
-
Fix Version/s: (was: 1.36.0)

> Remove felix SCR annotation support from parent pom
> ---
>
> Key: OAK-6767
> URL: https://issues.apache.org/jira/browse/OAK-6767
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: parent
>Reporter: Robert Munteanu
>Priority: Major
> Fix For: 1.38.0
>
>






[jira] [Updated] (OAK-6577) Determine the approach for reindexing in case of CompositeNodeStore setups

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6577:
-
Fix Version/s: (was: 1.36.0)

> Determine the approach for reindexing in case of CompositeNodeStore setups
> --
>
> Key: OAK-6577
> URL: https://issues.apache.org/jira/browse/OAK-6577
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: composite, indexing
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Major
> Fix For: 1.38.0
>
>
> Current index tooling is designed to work with single NodeStore setups. We 
> should determine how reindexing should be done for CompositeNodeStore setups, 
> especially where one of the mounts is private and read-only.





[jira] [Updated] (OAK-6765) Convert oak-jcr to OSGi R6 annotations

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6765:
-
Fix Version/s: (was: 1.36.0)

> Convert oak-jcr to OSGi R6 annotations
> --
>
> Key: OAK-6765
> URL: https://issues.apache.org/jira/browse/OAK-6765
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: jcr
>Reporter: Robert Munteanu
>Priority: Major
> Fix For: 1.38.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (OAK-6166) Support versioning in the composite node store

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6166:
-
Fix Version/s: (was: 1.36.0)

> Support versioning in the composite node store
> --
>
> Key: OAK-6166
> URL: https://issues.apache.org/jira/browse/OAK-6166
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: composite
>Reporter: Tomek Rękawek
>Priority: Minor
> Fix For: 1.38.0
>
>
> The mount info provider should affect the versioning code as well, so version 
> histories for the mounted paths are stored separately. Similarly to what we 
> have in the indexing, let's store the mounted version histories under:
> /jcr:system/jcr:versionStorage/:oak:mount-MOUNTNAME





[jira] [Updated] (OAK-6187) Index usage analysis (which index was used when and how)

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6187:
-
Fix Version/s: (was: 1.36.0)

> Index usage analysis (which index was used when and how)
> 
>
> Key: OAK-6187
> URL: https://issues.apache.org/jira/browse/OAK-6187
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: query
>Reporter: Thomas Mueller
>Priority: Minor
> Fix For: 1.38.0
>
>
> In order to reduce space usage, unused indexes should be removed or trimmed. 
> Trimmed, because an index definition might contain "too much" (especially 
> Lucene indexes, which index multiple properties and can be very large).
> One solution is to 
> * log each query (avoiding duplicates), 
> * include which indexes were used,
> * from that generate the minimum index definitions,
> * compared with existing indexes, get the list of unused indexes and index 
> features since day X.
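The bookkeeping behind the steps listed above might look like the following. It is a sketch under assumptions, not an Oak API: IndexUsageLog and its methods are hypothetical, queries are identified by their statement text, and only the "unused indexes" part of the analysis is shown (generating minimum index definitions is left out).

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical log: one entry per distinct query, with the indexes it used.
final class IndexUsageLog {

    private final Map<String, Set<String>> indexesByQuery = new LinkedHashMap<>();

    void record(String query, Collection<String> usedIndexes) {
        // The same statement is stored once; repeated runs only merge index names.
        indexesByQuery.computeIfAbsent(query, q -> new TreeSet<>()).addAll(usedIndexes);
    }

    int distinctQueries() {
        return indexesByQuery.size();
    }

    Set<String> unusedIndexes(Collection<String> existingIndexes) {
        Set<String> unused = new TreeSet<>(existingIndexes);
        for (Set<String> used : indexesByQuery.values()) {
            unused.removeAll(used);
        }
        return unused;
    }
}
```

Resetting the log on day X and calling unusedIndexes later would give the "unused since day X" list the description asks for.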





[jira] [Updated] (OAK-5506) reject item names with unpaired surrogates early

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5506:
-
Fix Version/s: (was: 1.36.0)

> reject item names with unpaired surrogates early
> 
>
> Key: OAK-5506
> URL: https://issues.apache.org/jira/browse/OAK-5506
> Project: Jackrabbit Oak
>  Issue Type: Wish
>  Components: core, jcr, segment-tar
>Affects Versions: 1.5.18
>Reporter: Julian Reschke
>Priority: Minor
>  Labels: resilience
> Fix For: 1.38.0
>
> Attachments: OAK-5506-01.patch, OAK-5506-02.patch, OAK-5506-4.diff, 
> OAK-5506-bench.diff, OAK-5506-jcr-level.diff, OAK-5506-name-conversion.diff, 
> OAK-5506-segment.diff, OAK-5506-segment2.diff, OAK-5506-segment3.diff, 
> OAK-5506.diff, ValidNamesTest.java
>
>
> Apparently, the following node name is accepted:
>{{"foo\ud800"}}
> but a subsequent {{getPath()}} call fails:
> {noformat}
> javax.jcr.InvalidItemStateException: This item [/test_node/foo?] does not 
> exist anymore
> at 
> org.apache.jackrabbit.oak.jcr.delegate.ItemDelegate.checkAlive(ItemDelegate.java:86)
> at 
> org.apache.jackrabbit.oak.jcr.session.operation.ItemOperation.checkPreconditions(ItemOperation.java:34)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.prePerform(SessionDelegate.java:615)
> at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:205)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:112)
> at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.getPath(ItemImpl.java:140)
> at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl.getPath(NodeImpl.java:106)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.nameTest(ValidNamesTest.java:271)
> at 
> org.apache.jackrabbit.oak.jcr.ValidNamesTest.testUnpairedSurrogate(ValidNamesTest.java:259)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source){noformat}
> (test case follows)
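Rejecting such names early comes down to scanning the name for a surrogate that is not part of a valid high/low pair. The sketch below uses only java.lang.Character; the class UnpairedSurrogateCheck and the choice of IllegalArgumentException are assumptions for illustration and are not taken from the attached patches.

```java
// Hypothetical validator, not the code from the OAK-5506 attachments.
final class UnpairedSurrogateCheck {

    static boolean hasUnpairedSurrogate(CharSequence name) {
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (Character.isHighSurrogate(c)) {
                boolean followedByLow = i + 1 < name.length()
                        && Character.isLowSurrogate(name.charAt(i + 1));
                if (!followedByLow) {
                    return true; // high surrogate with no partner
                }
                i++; // skip the low half of a valid pair
            } else if (Character.isLowSurrogate(c)) {
                return true; // low surrogate without a preceding high one
            }
        }
        return false;
    }

    static void checkName(String name) {
        if (hasUnpairedSurrogate(name)) {
            throw new IllegalArgumentException("Name contains an unpaired surrogate");
        }
    }

    public static void main(String[] args) {
        System.out.println(hasUnpairedSurrogate("foo"));       // plain ASCII name
        System.out.println(hasUnpairedSurrogate("foo\ud800")); // the name from the report
    }
}
```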





[jira] [Updated] (OAK-6774) Convert oak-upgrade to OSGi R6 annotations

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6774:
-
Fix Version/s: (was: 1.36.0)

> Convert oak-upgrade to OSGi R6 annotations
> --
>
> Key: OAK-6774
> URL: https://issues.apache.org/jira/browse/OAK-6774
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: upgrade
>Reporter: Robert Munteanu
>Priority: Major
> Fix For: 1.38.0
>
>






[jira] [Updated] (OAK-6515) Decouple indexing and upload to datastore

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6515:
-
Fix Version/s: (was: 1.36.0)

> Decouple indexing and upload to datastore
> -
>
> Key: OAK-6515
> URL: https://issues.apache.org/jira/browse/OAK-6515
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: indexing, lucene, query
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
>Priority: Minor
> Fix For: 1.38.0
>
>
> Currently the default async index delay is 5 seconds. Using a larger delay 
> (e.g. 15 seconds) reduces index-related growth; however, diffing is then 
> delayed by 15 seconds, which can reduce indexing performance. 
> One option (which might require bigger changes) is to index every 5 seconds, 
> and store the index every 5 seconds in the local directory, but only write to 
> the datastore / nodestore every 3rd time (that is, every 15 seconds), 
> so that other cluster nodes will only see the index update every 15 seconds. 
> The diffing is done every 5 seconds, and the local index could be used every 
> 5 or every 15 seconds.
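The "index every cycle, upload every 3rd cycle" idea can be sketched with a simple counter. The class DecoupledIndexCycle is hypothetical and the counters stand in for the real work (diffing, writing to the local directory, writing to the datastore / nodestore); scheduling every 5 seconds is left to the caller.

```java
// Hypothetical illustration of decoupling local index writes from uploads.
final class DecoupledIndexCycle {

    private final int uploadEvery;
    private int cycles;
    private int localWrites;
    private int uploads;

    DecoupledIndexCycle(int uploadEvery) {
        if (uploadEvery < 1) {
            throw new IllegalArgumentException("uploadEvery must be at least 1");
        }
        this.uploadEvery = uploadEvery;
    }

    // Called once per async delay, e.g. every 5 seconds.
    void runCycle() {
        cycles++;
        localWrites++; // diff and store in the local directory on every cycle
        if (cycles % uploadEvery == 0) {
            uploads++; // only every n-th cycle becomes visible to other cluster nodes
        }
    }

    int localWrites() {
        return localWrites;
    }

    int uploads() {
        return uploads;
    }
}
```

With uploadEvery set to 3 and a 5 second delay, other cluster nodes would see an update every 15 seconds while diffing still runs every 5 seconds.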





[jira] [Updated] (OAK-6628) More precise indexRules support via filtering criteria on property

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6628:
-
Fix Version/s: (was: 1.36.0)

> More precise indexRules support via filtering criteria on property
> --
>
> Key: OAK-6628
> URL: https://issues.apache.org/jira/browse/OAK-6628
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Reporter: Chetan Mehrotra
>Priority: Major
> Fix For: 1.38.0
>
>
> For Lucene indexes we currently support indexRules based on nodetype. Here the 
> recommendation is that users must use the most precise nodeType/mixinType to 
> target the indexing rule so that only relevant nodes are indexed. 
> For many Sling based applications it's being seen that lots of content is 
> nt:unstructured and it uses the {{sling:resourceType}} property to distinguish 
> various such nt:unstructured nodes. Currently it's not possible to target an 
> index definition to index only those nt:unstructured nodes which have a 
> specific {{sling:resourceType}}, which makes it harder to provide more 
> precise index definitions.
> To help such cases we can generalize the indexRule support via filtering 
> criteria
> {noformat}
> activityIndex
>   - type = "lucene"
>   + indexRules
>     + nt:unstructured
>       - filter-property = "sling:resourceType"
>       - filter-value = "app/activitystreams/components/activity"
>       + properties
>         - jcr:primaryType = "nt:unstructured"
>         + verb
>           - propertyIndex = true
>           - name = "verb"
> {noformat}
> So indexRule would have 2 more config properties
> * filter-property - Name of property to match
> * filter-value - The value to match
> *Indexing*
> At indexing time LuceneIndexEditor currently calls 
> {{indexDefinition.getApplicableIndexingRule}}, passing it the NodeState. 
> Currently this checks only jcr:primaryType and jcr:mixinTypes to find a 
> matching rule.
> This logic would need to be extended to also check whether a filter-property 
> is defined in the definition. If so, check whether the NodeState has that value.
> *Querying*
> On the query side we need to change the IndexPlanner, which currently uses the 
> query nodetype for finding a matching indexRule. In addition, it would need to 
> pass on the property restrictions, and the rule would only be matched if the 
> property restriction matches the filter.
> *Open Items*
> # How to handle a change in the filter-property value. I think we have a 
> similar problem currently if an indexed node's nodeType gets changed. In such 
> a case we do not remove it from the index. So we need to solve that for both.
> # Ensure that all places where rules are matched account for this filter 
> concept.
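The extended rule matching described under *Indexing* can be sketched as below. This is not the IndexDefinition code: FilteredIndexRule is a hypothetical class, a node is modelled as a plain map of property names to string values, and mixin types and nodetype inheritance are ignored to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical rule: node type match plus an optional property filter.
final class FilteredIndexRule {

    private final String nodeType;
    private final String filterProperty; // null when the rule has no filter
    private final String filterValue;

    FilteredIndexRule(String nodeType, String filterProperty, String filterValue) {
        this.nodeType = nodeType;
        this.filterProperty = filterProperty;
        this.filterValue = filterValue;
    }

    boolean appliesTo(Map<String, String> nodeProperties) {
        if (!nodeType.equals(nodeProperties.get("jcr:primaryType"))) {
            return false;
        }
        if (filterProperty == null) {
            return true; // existing behaviour: the node type alone decides
        }
        return filterValue != null && filterValue.equals(nodeProperties.get(filterProperty));
    }

    public static void main(String[] args) {
        FilteredIndexRule rule = new FilteredIndexRule("nt:unstructured",
                "sling:resourceType", "app/activitystreams/components/activity");
        Map<String, String> node = new HashMap<>();
        node.put("jcr:primaryType", "nt:unstructured");
        node.put("sling:resourceType", "app/activitystreams/components/activity");
        System.out.println(rule.appliesTo(node));
    }
}
```

The query side would need the mirror image of this check, comparing the query's property restriction against filter-property and filter-value before picking the rule.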





[jira] [Updated] (OAK-1150) NodeType index: don't index all primary and mixin types

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-1150:
-
Fix Version/s: (was: 1.36.0)

> NodeType index: don't index all primary and mixin types
> ---
>
> Key: OAK-1150
> URL: https://issues.apache.org/jira/browse/OAK-1150
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: property-index
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
>Priority: Major
> Fix For: 1.38.0
>
>
> Currently, the nodetype index indexes all primary types and mixin types 
> (including nt:base I think).
> This results in many nodes in this index, which unnecessarily increases the 
> repository size, but doesn't really help executing queries (running a query 
> to get all nt:base nodes doesn't benefit much from using the nodetype index).
> It should also help reduce writes in updating the index, for example for 
> OAK-1099





[jira] [Updated] (OAK-2787) Faster multi threaded indexing / text extraction for binary content

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-2787:
-
Fix Version/s: (was: 1.36.0)

> Faster multi threaded indexing / text extraction for binary content
> ---
>
> Key: OAK-2787
> URL: https://issues.apache.org/jira/browse/OAK-2787
> Project: Jackrabbit Oak
>  Issue Type: Wish
>  Components: lucene
>Reporter: Chetan Mehrotra
>Priority: Major
> Fix For: 1.38.0
>
>
> With Lucene based indexing the indexing process is single threaded. This 
> hampers the indexing of binary content, as on a multi-processor system only a 
> single thread can be used to perform the indexing.
> [~ianeboston] suggested a possible approach [1] involving 2-phase indexing:
> # In the first phase, detect the nodes to be indexed and start the full text 
> extraction of the binary content. Post extraction, save the binary token 
> stream back to the node as hidden data. In this phase the node properties 
> can still be indexed and a marker field would be added to indicate that the 
> fulltext index is still pending.
> # Later, in the 2nd phase, look for all such Lucene docs and then update them 
> with the saved token stream.
> This would allow the text extraction logic to be decoupled from the Lucene 
> indexing logic.
> [1] http://markmail.org/thread/2w5o4bwqsosb6esu





[jira] [Updated] (OAK-2777) Minimize the cost calculation for queries using reference restrictions.

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-2777:
-
Fix Version/s: (was: 1.36.0)

> Minimize the cost calculation for queries using reference restrictions.
> ---
>
> Key: OAK-2777
> URL: https://issues.apache.org/jira/browse/OAK-2777
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, query
>Affects Versions: 1.1.2, 1.2
>Reporter: Przemyslaw Pakulski
>Assignee: Thomas Mueller
>Priority: Major
>  Labels: performance
> Fix For: 1.38.0
>
> Attachments: oak-2777.patch
>
>
> According to the javadocs (QueryIndex), the minimum cost for an index is 1. 
> Currently ReferenceIndex returns this minimum value when it can be used for 
> the query. But even then the cost for the remaining indexes is still 
> calculated. We could skip the cost calculation of the remaining indexes if we 
> have already reached the minimum cost.
> This would speed up all queries which can leverage the reference index.
> Example query:
> SELECT * FROM [nt:base] WHERE PROPERTY([rep:members], 'WeakReference') = 
> '345bef9b-ffa1-3e09-85df-1e03cfa0fb37'
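The proposed short-circuit amounts to stopping the cost loop as soon as some index reports the minimum cost. The sketch below is not the query engine code or the attached patch: CheapestIndexSelector is a hypothetical class, indexes are arbitrary objects with a caller-supplied cost function, and the minimum cost of 1 is taken from the description above.

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical selector: picks the cheapest index and stops at the minimum cost.
final class CheapestIndexSelector {

    // Minimum cost an index may return, as stated in the issue.
    static final double MIN_COST = 1.0;

    static <I> I select(List<I> indexes, ToDoubleFunction<I> costOf) {
        I best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (I index : indexes) {
            double cost = costOf.applyAsDouble(index);
            if (cost < bestCost) {
                best = index;
                bestCost = cost;
            }
            if (bestCost <= MIN_COST) {
                break; // nothing can be cheaper, skip the remaining cost calculations
            }
        }
        return best;
    }
}
```

How much this saves depends on the order in which indexes are asked for their cost, so evaluating the reference index early would make the shortcut most effective.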





[jira] [Updated] (OAK-3767) Provide a way to extend shipped index definitions

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-3767:
-
Fix Version/s: (was: 1.36.0)

> Provide a way to extend shipped index definitions
> -
>
> Key: OAK-3767
> URL: https://issues.apache.org/jira/browse/OAK-3767
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: indexing, query
>Reporter: Davide Giannella
>Priority: Major
> Fix For: 1.38.0
>
>
> We need to provide explicit support for extending the index definitions 
> shipped out of the box by an application built on top of Oak. Consider a 
> Sling based app which ships with an index on assets like 
> /oak:index/assetIndex. 
> This application is now used in a project where some project specific 
> extensions are to be done i.e. some new custom asset properties are to be 
> indexed. Currently there are two options
> # Create new duplicate index - For project usage we can create a separate 
> index which includes the project specific properties. This has the following 
> downsides
> ## Increases index memory consumption - As both /oak:index/assetIndex and 
> /oak:index/myAssetIndex would index the same asset nodes they would be storing 
> the same asset path twice and hence cause an increase in memory consumption 
> by the index
> ## Increase in indexing time - With an increase in the number of indexes at the 
> same level the indexing time would increase
> ## Ambiguity in index selection - As both indexes index the same type of nodes 
> they would compete in answering queries related to assets, leading to 
> ambiguity in index selection by the query engine. 
> Given the above it would be better to avoid such cases and provide explicit 
> support for extending the index definitions. This can be done by enabling 
> support for adding index definition extensions under a subdirectory 
> under /oak:index
> {noformat}
> /oak:index
>   + assetIndex
>   + apps
>  + assetIndex
> {noformat}
> The indexing logic should then use the effective index definition for 
> indexing and querying. 
> *question*. Shall we allow this only under root or under any arbitrary path 
> as well? For example /content.
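One possible reading of "effective index definition" is the shipped definition overlaid with the entries from the extension node. The issue does not specify the merge semantics, so the sketch below is an assumption throughout: EffectiveIndexDefinition is a hypothetical class, definitions are flattened to maps from property path to value, and extension entries simply add to or override shipped ones.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical merge of /oak:index/assetIndex with /oak:index/apps/assetIndex.
final class EffectiveIndexDefinition {

    static Map<String, String> merge(Map<String, String> shipped,
                                     Map<String, String> extension) {
        // Start from the shipped definition so untouched entries are preserved.
        Map<String, String> effective = new LinkedHashMap<>(shipped);
        // Extension entries add new properties or override shipped ones.
        effective.putAll(extension);
        return effective;
    }
}
```

With a single effective definition there is still only one index per asset node, which avoids the duplicate storage, extra indexing time and selection ambiguity listed above.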





[jira] [Updated] (OAK-7846) Add a tool to export the tree pointed to by a node record

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7846:
-
Fix Version/s: (was: 1.36.0)

> Add a tool to export the tree pointed to by a node record
> -
>
> Key: OAK-7846
> URL: https://issues.apache.org/jira/browse/OAK-7846
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: segment-tar
>Affects Versions: 1.10.0
>Reporter: Francesco Mari
>Priority: Major
> Fix For: 1.38.0
>
> Attachments: OAK-7846-01.patch, OAK-7846-02.patch
>
>
> oak-segment-tar should have a tool that allows exporting a tree pointed to by 
> a node record. The tool must be written in a way that plays well with 
> existing Oak tools (see OAK-7834) and conventional UNIX ones.





[jira] [Updated] (OAK-6772) Convert oak-solr-core to OSGi R6 annotations

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6772:
-
Fix Version/s: (was: 1.36.0)

> Convert oak-solr-core to OSGi R6 annotations
> 
>
> Key: OAK-6772
> URL: https://issues.apache.org/jira/browse/OAK-6772
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: solr
>Reporter: Robert Munteanu
>Priority: Major
> Fix For: 1.38.0
>
>






[jira] [Updated] (OAK-7922) Improve the operations and the reporting of the check command

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7922:
-
Fix Version/s: (was: 1.36.0)

> Improve the operations and the reporting of the check command
> -
>
> Key: OAK-7922
> URL: https://issues.apache.org/jira/browse/OAK-7922
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Francesco Mari
>Priority: Major
> Fix For: 1.38.0
>
> Attachments: OAK-7922-01.patch
>
>
> The check command allows a user to check for both the head and the 
> checkpoints. At the end of the execution the command outputs the consistent 
> revisions for the head and the individual checkpoints, if any is found. 
> Moreover, it prints an overall good revision. The consistent revisions for 
> the head and the checkpoints could all be different. If both the head and all 
> the checkpoints are assigned to a consistent revision, the overall good 
> revision is the oldest of those revisions.
> I wonder how useful all of this information is to a user of the command:
>  - I might have a revision where a checkpoint is consistent, but the head is 
> not. In this case, I don't want to revert to that revision because my system 
> will probably be unstable due to the inconsistent head.
>  - The overall good revision might still be partially inconsistent due to the 
> way the command short-circuits the consistency check on the head and the 
> checkpoints. If I revert to the overall good revision, the head might still 
> be inconsistent or one of the checkpoints might be missing.
> I propose to remove the {{\--checkpoints}} and the {{\--head}} flags and 
> define the behaviour of the command as follows.
>  - The check command checks one super-root at a time in its entirety (both 
> head and referenced checkpoints).
>  - The command exits as soon as a super-root is found where both the head and 
> all the checkpoints are consistent.
>  - While searching, the command might find a super-root with a consistent 
> head but one or more inconsistent checkpoints. In this case, the first of such 
> revisions is printed, specifying which checkpoints are inconsistent.
>  - The user might specify a {{--no-checkpoints}} flag to skip checking the 
> checkpoints in the steps above.
> The optimisations currently implemented by the check command can be 
> maintained. We don't need to fully traverse the head or the checkpoints if a 
> well-known corrupted path is still corrupted in the current iteration. The 
> approach proposed above enables additional optimisations:
>  - Since checkpoints are immutable, the command doesn't need to traverse a 
> checkpoint that was inspected before. This is true regardless of the 
> consistency of the checkpoint.
>  - If a super-root includes a checkpoint that was previously determined 
> corrupted, the command can skip that super-root without further inspection.
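The proposed search, including the two checkpoint optimisations, can be sketched as follows. This is an illustration under assumptions, not the check command's implementation: `SuperRoot`, the two predicates and the newest-first ordering are hypothetical stand-ins, and reporting of the first revision with a consistent head but inconsistent checkpoints is left out for brevity.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical model, not Oak API: a super-root is a revision plus the checkpoints it references.
final class SuperRootCheck {

    static final class SuperRoot {
        final String revision;
        final List<String> checkpoints;

        SuperRoot(String revision, List<String> checkpoints) {
            this.revision = revision;
            this.checkpoints = checkpoints;
        }
    }

    // Returns the first revision whose head and checkpoints are all consistent, or null if none is.
    // Super-roots are expected newest first.
    static String firstConsistent(List<SuperRoot> superRoots,
                                  Predicate<String> headIsConsistent,
                                  Predicate<String> checkpointIsConsistent) {
        // Checkpoints are immutable, so each result is computed once and reused.
        Map<String, Boolean> checked = new HashMap<>();
        for (SuperRoot root : superRoots) {
            // Skip the super-root outright if it references a checkpoint known to be corrupted.
            boolean knownCorrupted = false;
            for (String cp : root.checkpoints) {
                if (Boolean.FALSE.equals(checked.get(cp))) {
                    knownCorrupted = true;
                    break;
                }
            }
            if (knownCorrupted || !headIsConsistent.test(root.revision)) {
                continue;
            }
            boolean allConsistent = true;
            for (String cp : root.checkpoints) {
                boolean ok = checked.computeIfAbsent(cp, checkpointIsConsistent::test);
                allConsistent &= ok;
            }
            if (allConsistent) {
                return root.revision;
            }
        }
        return null;
    }
}
```

Passing an empty checkpoint list for every super-root would correspond to the proposed {{--no-checkpoints}} behaviour in this model.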





[jira] [Updated] (OAK-7390) QueryResult.getSize() can be slow for many "or" or "union" conditions

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7390:
-
Fix Version/s: (was: 1.36.0)

> QueryResult.getSize() can be slow for many "or" or "union" conditions
> -
>
> Key: OAK-7390
> URL: https://issues.apache.org/jira/browse/OAK-7390
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: query
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
>Priority: Major
> Fix For: 1.38.0
>
>
> For queries with many union conditions, the "fast" getSize method can 
> actually be slower than iterating over the result. 
> The reason is that the number of index calls grows quadratically with the 
> number of subqueries: (3x + x^2) / 2, where x is the number of subqueries. 
> For this to have a measurable effect, the number of subqueries needs to be 
> large (more than 100), and the index needs to be slow.
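To get a feel for the numbers, the formula quoted above can be evaluated directly. The class and method names below are illustrative only; the arithmetic is just (3x + x^2) / 2.

```java
// Evaluates the index call count from the issue description, (3x + x^2) / 2.
final class IndexCallEstimate {

    // x is the number of subqueries; x * (x + 3) is always even, so the division is exact.
    static long indexCalls(long subqueries) {
        return (3 * subqueries + subqueries * subqueries) / 2;
    }
}
```

For 10 subqueries this gives 65 index calls and for 100 subqueries 5150, which is why the effect only becomes noticeable with many subqueries and a slow index.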





[jira] [Updated] (OAK-9240) oak-benchmarks jar should be deployed to maven-central

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-9240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-9240:
-
Fix Version/s: (was: 1.36.0)

> oak-benchmarks jar should be deployed to maven-central
> --
>
> Key: OAK-9240
> URL: https://issues.apache.org/jira/browse/OAK-9240
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: benchmarks
>Affects Versions: 1.34.0
>Reporter: Aravindo Wingeier
>Assignee: Andrei Dulceanu
>Priority: Trivial
> Fix For: 1.38.0
>
> Attachments: Do_not_skip_deployment_for_oak-benchmarks.patch
>
>
> While [oak-run jar is deployed to maven 
> central|https://repo1.maven.org/maven2/org/apache/jackrabbit/oak-run/1.34.0/],
>  oak-benchmarks is missing. 
>  This is probably due to the property in `oak-benchmarks/pom.xml`:
> {code}
> 
> true
> 
> {code}





[jira] [Updated] (OAK-5984) Property indexes can get out of sync

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5984:
-
Fix Version/s: (was: 1.36.0)

> Property indexes can get out of sync
> 
>
> Key: OAK-5984
> URL: https://issues.apache.org/jira/browse/OAK-5984
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: property-index
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
>Priority: Major
> Fix For: 1.38.0
>
>
> Property indexes can get out of sync for the following reasons:
> * the index was disabled for some time
> * the property index component was not started / configured
> * the index definition was changed without reindexing





[jira] [Updated] (OAK-7212) Document the document order traversal option

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7212:
-
Fix Version/s: (was: 1.36.0)

> Document the document order traversal option
> 
>
> Key: OAK-7212
> URL: https://issues.apache.org/jira/browse/OAK-7212
> Project: Jackrabbit Oak
>  Issue Type: Documentation
>  Components: doc, run
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Major
> Fix For: 1.38.0
>
>
> Document the doc-order-traversal option introduced with OAK-6353





[jira] [Updated] (OAK-5655) TarMK: Analyse locality of reference

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5655:
-
Fix Version/s: (was: 1.36.0)

> TarMK: Analyse locality of reference 
> -
>
> Key: OAK-5655
> URL: https://issues.apache.org/jira/browse/OAK-5655
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar
>Reporter: Michael Dürig
>Priority: Major
>  Labels: scalability
> Fix For: 1.38.0
>
> Attachments: compaction-time-vs-reposize.m, 
> compaction-time-vs.reposize.png, data00053a.tar-reads.png, offrc.jfr, 
> segment-per-path-compacted-nocache.png, 
> segment-per-path-compacted-nostringcache.png, segment-per-path-compacted.png, 
> segment-per-path.png, segment-reads.png
>
>
> We need to better understand the locality aspects of content stored in TarMK: 
> * How is related content spread over segments?
> * What content do we consider related? 
> * How does locality of related content develop over time when changes are 
> applied?
> * What changes do we consider typical?
> * What is the impact of compaction on locality? 
> * What is the impact of the deduplication caches on locality (during normal 
> operation and during compaction)?
> * How well are checkpoints deduplicated? Can we monitor this online?
> * ...





[jira] [Updated] (OAK-3236) integration test that simulates influence of clock drift

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-3236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-3236:
-
Fix Version/s: (was: 1.36.0)

> integration test that simulates influence of clock drift
> 
>
> Key: OAK-3236
> URL: https://issues.apache.org/jira/browse/OAK-3236
> Project: Jackrabbit Oak
>  Issue Type: Test
>  Components: core
>Affects Versions: 1.3.4
>Reporter: Stefan Egli
>Priority: Major
> Fix For: 1.38.0
>
>
> Spin-off of OAK-2739 [of this 
> comment|https://issues.apache.org/jira/browse/OAK-2739?focusedCommentId=14693398=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14693398]
>  - i.e. there should be an integration test that showcases the issues with 
> clock drift and why it is a good idea to have a lease-check (that refuses to 
> let the document store be used any further once the lease times out locally)





[jira] [Updated] (OAK-3919) Properly manage APIs / SPIs intended for public consumption

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-3919:
-
Fix Version/s: (was: 1.36.0)

> Properly manage APIs / SPIs intended for public consumption
> ---
>
> Key: OAK-3919
> URL: https://issues.apache.org/jira/browse/OAK-3919
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core
>Reporter: Michael Dürig
>Priority: Major
>  Labels: modularization, technical_debt
> Fix For: 1.38.0
>
>
> This is a follow-up to OAK-3842, which removed package export declarations 
> for all packages that we either do not want to be used outside of Oak or that 
> are not stable enough yet. 
> This issue is to identify those APIs and SPIs of Oak that we actually *want* 
> to export and to refactor them such that we *can* export them. 
> Candidates that are currently used from upstream projects I know of are:
> {code}
>   org.apache.jackrabbit.oak.plugins.observation
>   org.apache.jackrabbit.oak.spi.commit
>   org.apache.jackrabbit.oak.spi.state
>   org.apache.jackrabbit.oak.commons
>   org.apache.jackrabbit.oak.plugins.index.lucene
> {code}
> I suggest creating subtasks for those we want to go forward with.





[jira] [Updated] (OAK-6761) Convert oak-blob-plugins to OSGi R6 annotations

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6761:
-
Fix Version/s: (was: 1.36.0)

> Convert oak-blob-plugins to OSGi R6 annotations
> ---
>
> Key: OAK-6761
> URL: https://issues.apache.org/jira/browse/OAK-6761
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: blob-plugins
>Reporter: Robert Munteanu
>Priority: Major
> Fix For: 1.38.0
>
>






[jira] [Updated] (OAK-7207) Define porcelain and plumbing tools for the Segment Store

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7207:
-
Fix Version/s: (was: 1.36.0)

> Define porcelain and plumbing tools for the Segment Store
> -
>
> Key: OAK-7207
> URL: https://issues.apache.org/jira/browse/OAK-7207
> Project: Jackrabbit Oak
>  Issue Type: Wish
>  Components: segment-tar
>Reporter: Francesco Mari
>Priority: Major
>  Labels: production, tooling
> Fix For: 1.38.0
>
>
> In a spirit similar to 
> [Git|https://git-scm.com/book/en/v2/Git-Internals-Plumbing-and-Porcelain]'s, 
> it would be beneficial to create porcelain and plumbing tooling for the 
> Segment Store.
> Plumbing tools expose lower level operations on the Segment Store. Knowledge 
> about the internals of the Segment Store is necessary to understand how 
> plumbing tools work. Plumbing tools communicate via a command line interface. 
> It must be easy to invoke plumbing tools from other tools (possibly by 
> shelling out). The output of plumbing tools must be easy to consume 
> programmatically.
> Porcelain tools are written for human consumption. Their interface must be 
> user-friendly and should be backwards compatible as far as possible. 
> Porcelain tools use plumbing ones to implement their features. It should be 
> possible to use the same porcelain tools with different versions of the 
> plumbing tools, as long as the plumbing tools "speak" through an interface 
> that remains sufficiently compatible.





[jira] [Updated] (OAK-6741) Switch to official OSGi component and metatype annotations

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6741:
-
Fix Version/s: (was: 1.36.0)

> Switch to official OSGi component and metatype annotations
> --
>
> Key: OAK-6741
> URL: https://issues.apache.org/jira/browse/OAK-6741
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>Reporter: Robert Munteanu
>Priority: Major
> Fix For: 1.38.0
>
> Attachments: OAK-6741-proposed-changes-chetans-feedback.patch, 
> osgi-metadata-1.7.8.json, osgi-metadata-trunk.json
>
>
> We should remove the 'old' Felix SCR annotations and move to the 'new' OSGi 
> R6 annotations.





[jira] [Updated] (OAK-6303) Cache in CachingBlobStore might grow beyond configured limit

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6303:
-
Fix Version/s: (was: 1.36.0)

> Cache in CachingBlobStore might grow beyond configured limit
> 
>
> Key: OAK-6303
> URL: https://issues.apache.org/jira/browse/OAK-6303
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, core
>Reporter: Julian Reschke
>Priority: Major
> Fix For: 1.38.0
>
> Attachments: OAK-6303-test.diff, OAK-6303.diff
>
>
> It appears that depending on actual cache entry sizes, the {{CacheLIRS}} 
> might grow beyond the configured limit.
> For {{RDBBlobStore}}, the limit is currently configured to 16MB, yet storing 
> random 2MB entries appears to fill the cache with 64MB of data (according to 
> its own stats).
> The attached test case reproduces this.
> (it seems this is caused by the fact that each of the 16 segments of the 
> cache can hold 2 entries, no matter how big they are...)
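The 64MB figure matches a simple worst-case calculation, assuming (as the report suspects) that every one of the 16 segments keeps 2 entries regardless of their size. The helper below is illustrative arithmetic only and not part of {{CacheLIRS}}.

```java
// Worst-case cache footprint when each segment retains a fixed number of entries of any size.
final class CacheSegmentEstimate {

    static final long MB = 1024L * 1024L;

    // segments * entries per segment * entry size, in bytes.
    static long worstCaseBytes(int segments, int entriesPerSegment, long entryBytes) {
        return (long) segments * entriesPerSegment * entryBytes;
    }
}
```

With 16 segments, 2 entries per segment and 2MB per entry, this yields 64MB, four times the configured 16MB limit.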





[jira] [Updated] (OAK-6941) Compatibility matrix for oak-run compact

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6941:
-
Fix Version/s: (was: 1.36.0)

> Compatibility matrix for oak-run compact
> 
>
> Key: OAK-6941
> URL: https://issues.apache.org/jira/browse/OAK-6941
> Project: Jackrabbit Oak
>  Issue Type: Documentation
>  Components: doc, run, segment-tar
>Reporter: Valentin Olteanu
>Priority: Major
>  Labels: documentation, technical_debt, tooling
> Fix For: 1.38.0
>
>
> h4. Problem statement
> For compacting the segment store using {{oak-run}}, the safest option is to 
> use the same version of {{oak-run}} as the Oak version used to generate the 
> repository. Yet, sometimes, a newer {{oak-run}} version is recommended to 
> benefit from bug fixes and improvements, but not every combination of source 
> repo and {{oak-run}} is safe to use and the user needs a way to check the 
> compatibility. Thus, users need a tool that guides the decision of which 
> version to use.
> h4. Requirements
> * Easy to decide what {{oak-run}} version should be used for a certain Oak 
> version
> * Up to date with the latest releases
> * Machine readable for scripting
> * Include details on the benefits of using a certain version (release notes)
> * Blacklist of versions that should not be used (with alternatives)
> h4. Solution
> TBD





[jira] [Updated] (OAK-5121) review CommitInfo==null in BackgroundObserver with isExternal change

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5121:
-
Fix Version/s: (was: 1.36.0)

> review CommitInfo==null in BackgroundObserver with isExternal change
> 
>
> Key: OAK-5121
> URL: https://issues.apache.org/jira/browse/OAK-5121
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: core
>Affects Versions: 1.5.13
>Reporter: Stefan Egli
>Assignee: Chetan Mehrotra
>Priority: Major
> Fix For: 1.38.0
>
> Attachments: OAK-5121.patch
>
>
> OAK-4898 changes CommitInfo to be never null. This is the case outside of the 
> BackgroundObserver - but in the BackgroundObserver itself it is explicitly 
> set to null when compacting. 
> Once OAK-4898 is committed this task is about reviewing the implications in 
> BackgroundObserver wrt compaction and CommitInfo==null



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (OAK-8288) fix javadoc:javadoc for jdk >= 13

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-8288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-8288:
-
Fix Version/s: (was: 1.36.0)

> fix javadoc:javadoc for jdk >= 13
> -
>
> Key: OAK-8288
> URL: https://issues.apache.org/jira/browse/OAK-8288
> Project: Jackrabbit Oak
>  Issue Type: Bug
>Reporter: Julian Reschke
>Priority: Minor
> Fix For: 1.38.0
>
> Attachments: JavaDocHtmlHeaderTest.java
>
>
> Javadoc in JDK 13 makes additional HTML validity checks:
>  * nesting of headlines ( after  is an error)
>  * empty  tags





[jira] [Updated] (OAK-5860) Compressed segments

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5860:
-
Fix Version/s: (was: 1.36.0)

> Compressed segments
> ---
>
> Key: OAK-5860
> URL: https://issues.apache.org/jira/browse/OAK-5860
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: scalability
> Fix For: 1.38.0
>
>
> It would be interesting to see the effect of compressing the segments within 
> the tar files with a sufficiently effective and performant compression 
> algorithm:
> * Can we increase overall throughput by trading CPU for IO?
> * Can we scale to bigger repositories (in number of nodes) by squeezing in 
> more segments per MB and thus pushing out onset of thrashing?
> * What would be a good compression algorithm/library?
> * Can/should we make this optional? 
> * Migration and compatibility issues?





[jira] [Updated] (OAK-7192) Remove package export for org.apache.jackrabbit.oak.composite.checks

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7192:
-
Fix Version/s: (was: 1.36.0)

> Remove package export for org.apache.jackrabbit.oak.composite.checks
> 
>
> Key: OAK-7192
> URL: https://issues.apache.org/jira/browse/OAK-7192
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: composite
>Reporter: Marcel Reutegger
>Priority: Minor
> Fix For: 1.38.0
>
>
> It appears the package {{org.apache.jackrabbit.oak.composite.checks}} is only 
> used internally by the oak-store-composite module and should not be exported.





[jira] [Updated] (OAK-6773) Convert oak-store-composite to OSGi R6 annotations

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6773:
-
Fix Version/s: (was: 1.36.0)

> Convert oak-store-composite to OSGi R6 annotations
> --
>
> Key: OAK-6773
> URL: https://issues.apache.org/jira/browse/OAK-6773
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: composite
>Reporter: Robert Munteanu
>Priority: Major
> Fix For: 1.38.0
>
>






[jira] [Updated] (OAK-7193) DataStore: API to retrieve statistic (file headers, size estimation)

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7193:
-
Fix Version/s: (was: 1.36.0)

> DataStore: API to retrieve statistic (file headers, size estimation)
> 
>
> Key: OAK-7193
> URL: https://issues.apache.org/jira/browse/OAK-7193
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: blob
>Reporter: Thomas Mueller
>Priority: Major
> Fix For: 1.38.0
>
>
> Extension of OAK-6254: in addition to retrieving the size, it would be good 
> to retrieve the estimated number and total size per file type. A simple (and 
> in my view sufficient) solution is to use the first few bytes ("magic 
> numbers", 2 bytes should be enough) to get the file type. That would allow 
> estimating, for example, the number and total size of PDF files, JPEG files, 
> Lucene index files and so on. A histogram would be nice as well, but I think 
> it is not needed.
> To speed up calculation, the blob ID could be extended with the first 2 bytes 
> of the file content, that is: #@ where magic is the 
> first two bytes, in hex. That would allow getting the data quickly from the 
> blob ids (no need to actually read content).
> Sampling should be enough. The longer it takes, the more accurate the data. 
> We could store the data while doing datastore GC, in which case the returned 
> data would be somewhat stale; that's OK.
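As a rough illustration of the "magic" part of the proposal, the first two bytes of a blob can be rendered as four hex digits. The class and method names are hypothetical and not part of the DataStore API; how the value would be appended to the blob ID is deliberately left out.

```java
import java.util.Locale;

// Hypothetical helper, not Oak API: derives the two-byte "magic" value of some content.
final class MagicBytes {

    // Returns the first two bytes as four lowercase hex digits, or "" if the content is shorter.
    static String magicHex(byte[] content) {
        if (content.length < 2) {
            return "";
        }
        return String.format(Locale.ROOT, "%02x%02x", content[0] & 0xff, content[1] & 0xff);
    }
}
```

A PDF file starts with the characters "%P" and would map to 2550, while a JPEG file starts with the bytes FF D8 and would map to ffd8. Two bytes cannot separate every format, which fits the estimation-only goal described above.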





[jira] [Updated] (OAK-7423) Document the proc tree

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7423:
-
Fix Version/s: (was: 1.36.0)

> Document the proc tree
> --
>
> Key: OAK-7423
> URL: https://issues.apache.org/jira/browse/OAK-7423
> Project: Jackrabbit Oak
>  Issue Type: Documentation
>  Components: segment-tar
>Reporter: Francesco Mari
>Priority: Major
>  Labels: technical_debt
> Fix For: 1.38.0
>
>
> The proc tree, contributed in OAK-7416, lacks Javadoc and high-level 
> documentation. In particular, the exposed content structure should be 
> described in greater detail.





[jira] [Updated] (OAK-8815) Javadoc build fails if using Java 11

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-8815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-8815:
-
Fix Version/s: (was: 1.36.0)

> Javadoc build fails if using Java 11
> 
>
> Key: OAK-8815
> URL: https://issues.apache.org/jira/browse/OAK-8815
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: doc
>Affects Versions: 1.20.0
>Reporter: Matt Ryan
>Priority: Major
> Fix For: 1.38.0
>
>
> Trying to build the Javadocs when using Java 11 fails. If you specify Java 8 
> when building the Javadocs, the build succeeds.
> Command I'm using to build the Javadocs:  {{mvn site -Pjavadoc}} (as 
> described in the {{oak-doc}} readme).
> I will include more information on the errors in comments.





[jira] [Updated] (OAK-5917) Document enhancements in indexing in 1.6

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5917:
-
Fix Version/s: (was: 1.36.0)

> Document enhancements in indexing in 1.6
> 
>
> Key: OAK-5917
> URL: https://issues.apache.org/jira/browse/OAK-5917
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: doc
>Reporter: Chetan Mehrotra
>Assignee: Thomas Mueller
>Priority: Major
> Fix For: 1.38.0
>
>
> This task is meant to collect and reference work done in the 1.6 release that 
> needs to be documented in the Oak docs.
> Issues in lucene and query area 
> [jql|https://issues.apache.org/jira/issues/?jql=project%20%3D%20OAK%20AND%20fixVersion%20%3D%201.6.0%20and%20component%20in%20(lucene%2C%20query)%20ORDER%20BY%20updated%20DESC%2C%20priority%20DESC%2C%20created%20ASC]
> Topics to cover
> * OAK-4412 - Lucene Hybrid Index (/)
> * OAK-4939 - Isolation of corrupted index  (/)
> * OAK-4974 - Enable configuring QueryEngineSettings via OSGi config 
> * OAK-3574 - Function based indexes
> * OAK-4400 - Correlate index with the index definition used to build it  (/)





[jira] [Updated] (OAK-6766) Convert oak-lucene to OSGi R6 annotations

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6766:
-
Fix Version/s: (was: 1.36.0)

> Convert oak-lucene to OSGi R6 annotations
> -
>
> Key: OAK-6766
> URL: https://issues.apache.org/jira/browse/OAK-6766
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: lucene
>Reporter: Robert Munteanu
>Priority: Major
> Fix For: 1.38.0
>
>






[jira] [Updated] (OAK-7381) Reduce debug log output for queries

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7381:
-
Fix Version/s: (was: 1.36.0)

> Reduce debug log output for queries
> ---
>
> Key: OAK-7381
> URL: https://issues.apache.org/jira/browse/OAK-7381
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: query
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
>Priority: Major
> Fix For: 1.38.0
>
>
> When enabling the debug log level, running a query can log a lot. That can 
> slow down executing a large query quite a lot. The amount of logged data 
> should be reduced.





[jira] [Updated] (OAK-3380) Property index pruning should happen asynchronously

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-3380:
-
Fix Version/s: (was: 1.36.0)

> Property index pruning should happen asynchronously
> ---
>
> Key: OAK-3380
> URL: https://issues.apache.org/jira/browse/OAK-3380
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: property-index
>Affects Versions: 1.3.5
>Reporter: Vikas Saurabh
>Priority: Minor
>  Labels: resilience
> Fix For: 1.38.0
>
>
> Following up on this (relatively old) thread \[1], we should do pruning of 
> the property index structure asynchronously. The thread was never concluded. 
> Here are a couple of ideas picked from the thread:
> * Move pruning to an async thread
> * Throttle pruning i.e. prune only once in a while
> ** I'm not sure how that would work though -- an unpruned part would remain 
> as is until another index update happens on that path.
> Once we can move pruning to some async thread (reducing concurrent updates), 
> OAK-2673 + OAK-2929 can take care of add-add conflicts.
> 
> h6. Why is this an issue despite merge retries taking care of it?
> A couple of cases which have concurrent updates hitting merge conflicts in 
> our product (Adobe AEM):
> * Some indexes are very volatile (in the sense that the indexed property 
> switches its values very quickly) e.g. sling job status, AEM workflow status.
> * Multiple threads take care of jobs. Although sling maintains a bucketed 
> structure for job storage to reduce conflicts, inside the index tree the 
> bucket structure, at times, gets pruned and needs to be created again on the 
> next job status change
> Retries do take care of these conflicts a lot of the time, and even when 
> they don't, AEM workflows have their own retry to work around them. But retrying, 
> IMHO, is just a waste of time -- more importantly in paths where the application 
> doesn't really have control.
> h6. Would this add to the cost of traversing the index structure?
> Yes, there'd be some leftover paths in the index structure between asynchronous 
> prunes. But, I think the cost of such wasted traversals would be covered 
> by the time saved in avoiding the concurrent update conflicts.
> 
> (cc [~tmueller], [~mreutegg], [~alex.parvulescu], [~chetanm])
> \[1]: 
> http://mail-archives.apache.org/mod_mbox/jackrabbit-oak-dev/201506.mbox/%3ccadichf66u2vh-hlrjunansytxfidj2mt3vktr4ybkngpzy9...@mail.gmail.com%3E





[jira] [Updated] (OAK-5924) Prevent long running query from delaying refresh of index

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5924:
-
Fix Version/s: (was: 1.36.0)

> Prevent long running query from delaying refresh of index
> -
>
> Key: OAK-5924
> URL: https://issues.apache.org/jira/browse/OAK-5924
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Major
> Fix For: 1.38.0
>
>
> Whenever the index gets updated, {{IndexTracker}} detects the changes, opens 
> new {{IndexNode}} instances and closes the old index nodes. This flow blocks 
> until all old IndexNode instances are closed.
> IndexNode close itself relies on a writer lock. It can happen that a long 
> running query, i.e. a query which is about to read a large page, is currently 
> executing on the old IndexNode instance. If this query is trying to load 
> 100k docs and is very slow (due to loading of excerpts), then it would 
> prevent the IndexNode from getting closed. This in turn would prevent the 
> index from seeing the latest data, so it becomes stale.
> To make query and indexing more resilient, we should check whether the 
> current IndexNode being used for a query is closing. If it is closing, the 
> query should open a fresh searcher.
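
The proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names (IndexNodeHolderSketch, Node, nodeForNextPage) are hypothetical and are not Oak's actual IndexTracker/IndexNode API. The idea shown is that an index refresh flags the old node as closing, and a paging query checks that flag before loading its next page.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch, not Oak's IndexTracker: a query re-acquires a node that is closing. */
class IndexNodeHolderSketch {

    /** Stand-in for an index node; "generation" only exists to tell nodes apart. */
    static final class Node {
        final int generation;
        private volatile boolean closing;

        Node(int generation) {
            this.generation = generation;
        }

        void markClosing() {
            closing = true;
        }

        boolean isClosing() {
            return closing;
        }
    }

    private final AtomicReference<Node> current = new AtomicReference<>(new Node(0));

    /** Query side: get the node that is current right now. */
    Node acquire() {
        return current.get();
    }

    /** Index update: publish a new node and flag the old one as closing. */
    Node refresh() {
        Node fresh = new Node(current.get().generation + 1);
        Node old = current.getAndSet(fresh);
        old.markClosing();
        return fresh;
    }

    /** Query side, between pages: keep the node in use unless it is closing. */
    Node nodeForNextPage(Node inUse) {
        return inUse.isClosing() ? current.get() : inUse;
    }
}
```

A long running query would call nodeForNextPage before each batch, so the old node can be released after at most one more page instead of after the whole result has been read.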





[jira] [Updated] (OAK-6566) Generate markdown files from metatype description

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6566:
-
Fix Version/s: (was: 1.36.0)

> Generate markdown files from metatype description
> -
>
> Key: OAK-6566
> URL: https://issues.apache.org/jira/browse/OAK-6566
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: doc
>Reporter: Chetan Mehrotra
>Priority: Major
> Fix For: 1.38.0
>
>
> Currently we maintain some documentation on the supported configuration 
> options for a few components at [1]. The same information is also captured 
> in the metatype annotations.
> We should look into generating these md files from the metatype. This can 
> possibly be done via a new maven plugin [2]
> [1] http://jackrabbit.apache.org/oak/docs/osgi_config.html
> [2] https://github.com/TouK/metatype-exporter-maven-plugin





[jira] [Updated] (OAK-5291) Per-Query Limits (nodes read, nodes read in memory)

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5291:
-
Fix Version/s: (was: 1.36.0)

> Per-Query Limits (nodes read, nodes read in memory)
> ---
>
> Key: OAK-5291
> URL: https://issues.apache.org/jira/browse/OAK-5291
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: query
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
>Priority: Major
> Fix For: 1.38.0
>
>
> In OAK-1395 we added limits for long running queries. In OAK-1571 we added 
> OSGi configuration. In OAK-5237 we changed the default settings.
> It would be nice to be able to define the limits per query, similar to 
> OAK-4888. The query would look like (for example, to limit reading to 1 
> million nodes, even if the default query limit is lower):
> {noformat}
> select * from [nt:base] 
> where ischildnode('/oak:index') 
> order by name()
> option(traversal ok, limit 100)
> {noformat}





[jira] [Updated] (OAK-3141) Oak should warn when too many ordered child nodes

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-3141:
-
Fix Version/s: (was: 1.36.0)

> Oak should warn when too many ordered child nodes
> -
>
> Key: OAK-3141
> URL: https://issues.apache.org/jira/browse/OAK-3141
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.0.16
>Reporter: Jörg Hoh
>Assignee: Thomas Mueller
>Priority: Major
> Fix For: 1.38.0
>
>
> When working with the RDBMK we ran into situations where large documents did 
> not fit into the provided DB columns: there was an overflow, which caused Oak 
> not to persist the change. We fixed it by increasing the size of the column.
> But it would be nice if Oak could warn if a document exceeds a certain size 
> (for example 2 megabytes), because this warning indicates that on a JCR 
> level there might be a problematic situation, for example:
> * ordered node with a large list of childnodes
> * or longstanding sessions with lots of changes, which accumulate to large 
> documents.
> It's certainly nice to know if there's a node/document with such a problem 
> before the exception actually happens and an operation breaks.
> This message should be a warning, and should contain the JCR path of the node 
> plus the current size. To avoid this message being overlooked, it would be 
> good if it is written every once in a while (every 10 minutes?) for as long 
> as this condition persists.
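
A minimal sketch of the requested behaviour, assuming a size threshold and a repeat interval as parameters. The class LargeDocumentWarner and its check method are hypothetical names for illustration and are not part of Oak; a real implementation would log through the store's logger rather than return a string.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: decides when a "large document" warning is due. */
class LargeDocumentWarner {
    private final long sizeThresholdBytes;
    private final long repeatIntervalMillis;
    // Path of the oversized document -> time the last warning was issued
    private final Map<String, Long> lastWarned = new HashMap<>();

    LargeDocumentWarner(long sizeThresholdBytes, long repeatIntervalMillis) {
        this.sizeThresholdBytes = sizeThresholdBytes;
        this.repeatIntervalMillis = repeatIntervalMillis;
    }

    /** Returns the warning text, or null if no warning is due for this write. */
    String check(String jcrPath, long sizeBytes, long nowMillis) {
        if (sizeBytes <= sizeThresholdBytes) {
            // Condition no longer holds: forget the path so a later overflow warns at once
            lastWarned.remove(jcrPath);
            return null;
        }
        Long last = lastWarned.get(jcrPath);
        if (last != null && nowMillis - last < repeatIntervalMillis) {
            return null; // warned recently, stay quiet
        }
        lastWarned.put(jcrPath, nowMillis);
        return "Large document at " + jcrPath + ": " + sizeBytes + " bytes";
    }
}
```

With a threshold of 2 MB and an interval of 10 minutes, a node that stays oversized produces one warning per 10 minutes that includes its JCR path and current size.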





[jira] [Updated] (OAK-4177) Tests on Mongo should fail if mongo is not available

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-4177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-4177:
-
Fix Version/s: (was: 1.36.0)

> Tests on Mongo should fail if mongo is not available
> 
>
> Key: OAK-4177
> URL: https://issues.apache.org/jira/browse/OAK-4177
> Project: Jackrabbit Oak
>  Issue Type: Test
>Reporter: Davide Giannella
>Assignee: Davide Giannella
>Priority: Major
> Fix For: 1.38.0
>
>
> Most if not all of the IT/UT that run against mongodb have an
> assumption at class level that if mongodb is not available the tests
> are skipped.
> The tests should fail instead if mongodb is not available and we
> explicitly said that, via the {{nsfixtures}} flags, we want to run the
> tests against mongodb.
> We currently have 4 fixtures/flags: DOCUMENT_NS, SEGMENT_MK,
> DOCUMENT_RDB, MEMORY_NS.
> https://github.com/apache/jackrabbit-oak/blob/f957b6787eb7a70eba454ceb1cae90bd4d47f15c/oak-commons/src/test/java/org/apache/jackrabbit/oak/commons/FixturesHelper.java#L46
> We may need to introduce a new Fixture/Flag that indicates
> that we want to run the tests against Document using the in-memory
> implementation. For example: DOCUMENT_NS_IM.
> This will be useful on the Apache Jenkins as we don't have mongo there
> but we still want to run all the possible Document NS tests against
> the in-memory implementation when this is possible.
> /cc [~mreutegg]
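
The skip-versus-fail rule described above can be written down as a small decision function. This is a sketch only: FixturePolicySketch and its Outcome enum are hypothetical and do not exist in Oak's FixturesHelper. It assumes "explicitly requested" means the fixture name was passed via the {{nsfixtures}} flag.

```java
import java.util.Set;

/** Hypothetical policy: what a test should do when its backend may be missing. */
class FixturePolicySketch {
    enum Outcome { RUN, SKIP, FAIL }

    static Outcome decide(String fixture, Set<String> explicitlyRequested,
                          boolean backendAvailable) {
        if (backendAvailable) {
            return Outcome.RUN;
        }
        // Backend is missing: fail only if the user asked for this fixture by name
        return explicitlyRequested.contains(fixture) ? Outcome.FAIL : Outcome.SKIP;
    }
}
```

A JUnit based test class could map SKIP to an assumption failure and FAIL to an assertion failure.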





[jira] [Updated] (OAK-5159) Killing the process may stop async index update to to 30 minutes, for DocumentStore (MongoDB, RDB)

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5159:
-
Fix Version/s: (was: 1.36.0)

> Killing the process may stop async index update to to 30 minutes, for 
> DocumentStore (MongoDB, RDB)
> --
>
> Key: OAK-5159
> URL: https://issues.apache.org/jira/browse/OAK-5159
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: indexing
>Reporter: Thomas Mueller
>Priority: Major
>  Labels: resilience
> Fix For: 1.38.0
>
>
> Same as OAK-2108, when using a DocumentStore based repository (MongoDB, 
> RDBMK). This is also a problem in the single-cluster-node case, not just when 
> using multiple cluster nodes.
> When killing a node that is running the async index update, this async 
> index update will not run for up to 15 minutes, because the lease time is set 
> to 15 minutes.
> We could probably use Oak / Sling Discovery to improve the situation.





[jira] [Updated] (OAK-6759) Convert oak-blob-cloud-azure to OSGi R6 annotations

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6759:
-
Fix Version/s: (was: 1.36.0)

> Convert oak-blob-cloud-azure to OSGi R6 annotations
> ---
>
> Key: OAK-6759
> URL: https://issues.apache.org/jira/browse/OAK-6759
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: blob-cloud
>Reporter: Robert Munteanu
>Priority: Major
> Fix For: 1.38.0
>
>






[jira] [Updated] (OAK-6421) Phase out JCR Locking support

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6421:
-
Fix Version/s: (was: 1.36.0)

> Phase out JCR Locking support
> -
>
> Key: OAK-6421
> URL: https://issues.apache.org/jira/browse/OAK-6421
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: jcr
>Reporter: Marcel Reutegger
>Priority: Major
> Fix For: 1.38.0
>
>
> Oak currently has a lot of gaps in its JCR Locking implementation (see 
> OAK-1962), which basically makes it non-compliant with the JCR specification.
> I propose we phase out the support for JCR Locking because a proper 
> implementation would be rather complex with a runtime behaviour that is very 
> different in a standalone deployment compared to a cluster. In the standalone 
> case a lock could be acquired very quickly, while in the distributed case, 
> the operations would be multiple orders of magnitude slower, depending on how 
> cluster nodes are geographically distributed.
> Applications that rely on strict lock semantics should use other mechanisms, 
> built explicitly for this purpose. E.g. Apache Zookeeper.
> To ease upgrade and migration to a different lock mechanism, the proposal is 
> to introduce a flag or configuration that controls the level of support for 
> JCR Locking:
> - DISABLED: the implementation does not support JCR Locking at all. Methods 
> will throw UnsupportedRepositoryOperationException when defined by the JCR 
> specification. 
> - DEPRECATED: the implementation behaves as right now, but logs a warning or 
> error message that JCR Locking does not work as specified and will be removed 
> in a future version of Oak.
> In a later release (e.g. 1.10) the current JCR Locking implementation would 
> be removed entirely and unconditionally throw an exception.
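
The two support levels could be modelled as below. This is a hedged sketch: JcrLockingGate is a hypothetical class, the warning counter stands in for a real log statement, and UnsupportedOperationException stands in for the JCR UnsupportedRepositoryOperationException so that the example compiles without the JCR API on the classpath.

```java
/** Hypothetical gate that every JCR Locking entry point would call first. */
class JcrLockingGate {
    enum Support { DISABLED, DEPRECATED }

    private final Support support;
    private int warningsLogged; // stand-in for writing to a logger

    JcrLockingGate(Support support) {
        this.support = support;
    }

    void beforeLockOperation() {
        if (support == Support.DISABLED) {
            // A real implementation would throw UnsupportedRepositoryOperationException
            throw new UnsupportedOperationException("JCR Locking is not supported");
        }
        // DEPRECATED: behave as before, but tell the operator about it
        warningsLogged++;
    }

    int warningsLogged() {
        return warningsLogged;
    }
}
```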





[jira] [Updated] (OAK-5792) TarMK: Implement tooling to repair broken nodes

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5792:
-
Fix Version/s: (was: 1.36.0)

> TarMK: Implement tooling to repair broken nodes
> ---
>
> Key: OAK-5792
> URL: https://issues.apache.org/jira/browse/OAK-5792
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: run, segment-tar
>Reporter: Michael Dürig
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: production, technical_debt, tooling
> Fix For: 1.38.0
>
>
> With {{oak-run check}} we can determine the last good revision of a 
> repository and use it to manually roll back a corrupted segment store. 
> Complementary to this we should implement a tool to roll forward a broken 
> revision to a fixed new revision. Such a tool needs to detect which items are 
> affected by a corruption and replace these items with markers. With this the 
> repository could be brought back online and the markers could be used to 
> identify the locations in the tree where further manual action might be 
> needed. 





[jira] [Updated] (OAK-5958) Document Metrics related classes and interfaces

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5958:
-
Fix Version/s: (was: 1.36.0)

> Document Metrics related classes and interfaces
> ---
>
> Key: OAK-5958
> URL: https://issues.apache.org/jira/browse/OAK-5958
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core
>Reporter: Michael Dürig
>Assignee: Chetan Mehrotra
>Priority: Major
>  Labels: documentation, technical_debt
> Fix For: 1.38.0
>
>
> The Metrics related classes and interfaces in 
> {{org.apache.jackrabbit.oak.stats}} and 
> {{org.apache.jackrabbit.oak.plugins.metric}} are largely undocumented. 
> Specifically it is not immediately clear how they should be used, how a new 
> {{Stats}} instance should be added, what effect this would have and how 
> it would (or would not) be exposed (e.g. via JMX). 





[jira] [Updated] (OAK-7382) Cloud datastore without local disk

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7382:
-
Fix Version/s: (was: 1.36.0)

> Cloud datastore without local disk
> --
>
> Key: OAK-7382
> URL: https://issues.apache.org/jira/browse/OAK-7382
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: blob, blob-cloud
>Reporter: Thomas Mueller
>Assignee: Amit Jain
>Priority: Major
> Fix For: 1.38.0
>
>
> Currently, the S3 datastores need local disk to work (not sure about the 
> Azure one).
> This should not be needed (not for upload, caching,...).
> Also, temporary files for garbage collection should not be needed (instead, 
> use temporary binaries, possibly written to S3 / Azure).
> Really everything should fit in a few MB of memory.
> For S3, it might be needed to read a few MB of data into memory, and then 
> possibly do a multipart upload:
>  
> https://stackoverflow.com/questions/8653146/can-i-stream-a-file-upload-to-s3-without-a-content-length-header





[jira] [Updated] (OAK-7224) oak-run check should have an option to check the segments checksums

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-7224:
-
Fix Version/s: (was: 1.36.0)

> oak-run check should have an option to check the segments checksums
> ---
>
> Key: OAK-7224
> URL: https://issues.apache.org/jira/browse/OAK-7224
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: run, segment-tar
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Minor
>  Labels: tooling
> Fix For: 1.38.0
>
>
> {{oak-run check}} does currently *not* check the checksums of the segments. 
> As a consequence, there is no quick way of determining the state of the 
> repository (corrupt/valid), after corrupting some random node record, as we 
> currently do in {{CheckRepositoryTestBase#corruptRecord}}. To determine that, 
> there needs to be an attempt to read the corrupt record as part of a 
> traversal.
> An easier way would be to have a new dedicated option for this (i.e., 
> {{--segments}}) which checks by default the content of segments against the 
> checksums from all the tar files in the specified location. Additionally, it 
> could accept as an argument a list of tar files whose segments are to be 
> checked.
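
The core of such a check is comparing a stored checksum with one recomputed from the segment bytes. The sketch below assumes CRC32 as the checksum algorithm and uses a hypothetical class name; it does not read Oak's tar format and is not the {{oak-run check}} implementation.

```java
import java.util.zip.CRC32;

/** Hypothetical helper: verifies segment bytes against a stored CRC32 value. */
class SegmentChecksumSketch {

    static long checksum(byte[] segmentData) {
        CRC32 crc = new CRC32();
        crc.update(segmentData, 0, segmentData.length);
        return crc.getValue();
    }

    /** True if the data still matches the checksum recorded when it was written. */
    static boolean matches(byte[] segmentData, long storedChecksum) {
        return checksum(segmentData) == storedChecksum;
    }
}
```

Because the comparison needs only the raw bytes, corruption can be detected without traversing the node records stored in the segment.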





[jira] [Updated] (OAK-5316) Rewrite JcrPathParser and JcrNameParser with good test coverage

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5316:
-
Fix Version/s: (was: 1.36.0)

> Rewrite JcrPathParser and JcrNameParser with good test coverage
> ---
>
> Key: OAK-5316
> URL: https://issues.apache.org/jira/browse/OAK-5316
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.5.15
>Reporter: Julian Sedding
>Assignee: Thomas Mueller
>Priority: Major
> Fix For: 1.38.0
>
>
> As discussed in OAK-5260 the implementations of the {{JcrPathParser}} and 
> possibly also the {{JcrNameParser}} are not ideal, i.e. there are potentially 
> many bugs hiding in edge-case scenarios. The parsers' test coverage is also 
> lacking, which is problematic as these code paths get executed very 
> frequently.





[jira] [Updated] (OAK-3598) Export cache related classes for usage in other oak bundle

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-3598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-3598:
-
Fix Version/s: (was: 1.36.0)

> Export cache related classes for usage in other oak bundle
> --
>
> Key: OAK-3598
> URL: https://issues.apache.org/jira/browse/OAK-3598
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: cache
>Reporter: Chetan Mehrotra
>Priority: Major
>  Labels: tech-debt
> Fix For: 1.38.0
>
>
> For OAK-3092 oak-lucene would need to access classes from the 
> {{org.apache.jackrabbit.oak.cache}} package. For now it's limited to 
> {{CacheStats}} to expose the cache related statistics.
> This task is meant to determine steps needed to export the package 
> * Update the pom.xml to export the package
> * Review current set of classes to see if they need to be reviewed





[jira] [Updated] (OAK-6513) Journal based Async Indexer

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6513:
-
Fix Version/s: (was: 1.36.0)

> Journal based Async Indexer
> ---
>
> Key: OAK-6513
> URL: https://issues.apache.org/jira/browse/OAK-6513
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: indexing
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Major
> Fix For: 1.38.0
>
>
> The current async indexer design is based on NodeState diff. This has served 
> us fine so far; however, of late it has not been able to perform well if the 
> rate of repository writes is high. When changes happen faster than 
> index-update can process them, larger and larger diffs will happen. These 
> make index-updates slower, which again leads to the next diff being even 
> larger than the one before (assuming a constant ingestion rate). 
> In the current diff based flow the indexer performs a complete diff for all 
> changes happening between 2 cycles. It may happen that lots of writes happen 
> but not much indexable content is written. So doing the diff there is wasted 
> effort.
> In the 1.6 release, for NRT Indexing we implemented journal based indexing 
> for external changes (OAK-4808, OAK-5430). That approach can be generalized 
> and used for async indexing. 
> Before talking about the journal based approach let's see how IndexEditor 
> works currently
> h4. IndexEditor 
> Currently any IndexEditor performs 2 tasks
> # Identify which node is to be indexed based on some index definition. The 
> Editor gets invoked as part of content diff where it determines which 
> NodeState is to be indexed
> # Update the index based on node to be indexed
> E.g. in oak-lucene we have LuceneIndexEditor which identifies the 
> NodeStates to be indexed and LuceneDocumentMaker which constructs the Lucene 
> Document from the NodeState to be indexed. For the journal based approach we 
> can decouple these 2 parts and thus have 
> * IndexEditor - Identifies which paths need to be indexed for a given index 
> definition
> * IndexUpdater - Updates the index based on a given NodeState and its path
> h4. High Level Flow
> # Session Commit Flow
> ## Each index type would provide an IndexEditor which would be invoked as part 
> of commit (like sync indexes). These IndexEditors would just determine which 
> paths need to be indexed. 
> ## As part of commit the paths to be indexed would be written to the journal. 
> # AsyncIndexUpdate flow
> ## AsyncIndexUpdate would query this journal to fetch all such indexed paths 
> between the 2 checkpoints
> ## Based on the index path data it would invoke the {{IndexUpdater}} to 
> update the index for that path
> ## Merge the index updates
> h4. Benefits
> Such a design would have the following impact
> # More work done as part of write
> # Marking of indexable content is distributed, hence less work remains to be 
> done at indexing time
> # Indexing can progress in batches 
> # The indexers can be called in parallel
> h4. Journal Implementation
> DocumentNodeStore currently has a built-in journal which is being used for 
> NRT Indexing. That feature can be exposed as an API. 
> For scaling indexing this design is mostly required for the cluster case. So 
> we can possibly have both indexing approaches implemented and use the journal 
> based support for DocumentNodeStore setups. Or we can look into implementing 
> such a journal for SegmentNodeStore setups as well
> h4. Open Points
> * Journal support in SegmentNodeStore
> * Handling deletes. 
> Detailed proposal - 
> https://wiki.apache.org/jackrabbit/Journal%20based%20Async%20Indexer
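
The high level flow above can be illustrated with an in-memory toy. Everything here is an assumption made for illustration: the class JournalIndexingSketch, the PathSelector and IndexUpdater interfaces, and the use of a plain revision counter in place of checkpoints are hypothetical and much simpler than Oak's real IndexEditor and journal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy model: commits record indexable paths, the async cycle replays them. */
class JournalIndexingSketch {

    /** Commit-time role: decide whether a changed path is indexable. */
    interface PathSelector {
        boolean isIndexable(String path);
    }

    /** Async role: update the index for a single path. */
    interface IndexUpdater {
        void update(String path);
    }

    // Journal: revision -> indexable paths written by that commit
    private final TreeMap<Long, List<String>> journal = new TreeMap<>();
    private long nextRevision = 1;
    private long lastIndexedRevision = 0;

    /** Session commit flow: select indexable paths and write them to the journal. */
    long commit(List<String> changedPaths, PathSelector selector) {
        List<String> indexable = new ArrayList<>();
        for (String path : changedPaths) {
            if (selector.isIndexable(path)) {
                indexable.add(path);
            }
        }
        long revision = nextRevision++;
        if (!indexable.isEmpty()) {
            journal.put(revision, indexable);
        }
        return revision;
    }

    /** Async flow: process journal entries in (lastIndexedRevision, upToRevision]. */
    int runAsyncCycle(long upToRevision, IndexUpdater updater) {
        if (upToRevision <= lastIndexedRevision) {
            return 0;
        }
        int updated = 0;
        for (List<String> paths
                : journal.subMap(lastIndexedRevision, false, upToRevision, true).values()) {
            for (String path : paths) {
                updater.update(path);
                updated++;
            }
        }
        lastIndexedRevision = upToRevision;
        return updated;
    }
}
```

Commits that touch no indexable content add nothing to the journal, so the async cycle does no work for them. That is the saving the proposal is after compared with a full diff.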





[jira] [Updated] (OAK-8908) RDBBlobStore on SQL Server: bad performance when default collation is of type SQL*

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-8908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-8908:
-
Fix Version/s: (was: 1.36.0)

> RDBBlobStore on SQL Server: bad performance when default collation is of type 
> SQL*
> --
>
> Key: OAK-8908
> URL: https://issues.apache.org/jira/browse/OAK-8908
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Reporter: Julian Reschke
>Priority: Major
> Fix For: 1.38.0
>
> Attachments: OAK-8908-1.6.diff, OAK-8908.diff
>
>
> RDBBlobStore uses a 64-char primary key (digest in hex).
> Unfortunately, this causes performance issues on MS SQL Server, when the 
> collation for that column is of type "SQL*" (see links). These types of 
> collations are deprecated, but still the default for installations on the 
> "EN_US" locale.
> The performance loss can be observed by changing the collation on an existing 
> install, and then enabling performance logging on RDBBlobStore.





[jira] [Updated] (OAK-5488) BackgroundObserver MBean report Listener class again

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5488:
-
Fix Version/s: (was: 1.36.0)

> BackgroundObserver MBean report Listener class again
> 
>
> Key: OAK-5488
> URL: https://issues.apache.org/jira/browse/OAK-5488
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, jcr
>Affects Versions: 1.5.18
>Reporter: Stefan Eissing
>Priority: Minor
> Fix For: 1.38.0
>
>
> The MBean stats for {{BackgroundObserverStats}} used to give the className of 
> the listening class.
> With the introduction of {{FilteringDispatcher}} all MBeans only list that 
> class name, making it difficult to find out which observer really is shown.
> Proposal: show the effective className as before again.





[jira] [Updated] (OAK-5544) Improve indexing resilience

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5544:
-
Fix Version/s: (was: 1.36.0)

> Improve indexing resilience
> ---
>
> Key: OAK-5544
> URL: https://issues.apache.org/jira/browse/OAK-5544
> Project: Jackrabbit Oak
>  Issue Type: Epic
>  Components: lucene
>Reporter: Alexander Saar
>Assignee: Chetan Mehrotra
>Priority: Critical
>  Labels: resilience
> Fix For: 1.38.0
>
>
> grouping the improvements for indexer resilience in this issue for easier 
> tracking





[jira] [Updated] (OAK-5661) Make NRT indexing resilient against unbounded growth

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5661:
-
Fix Version/s: (was: 1.36.0)

> Make NRT indexing resilient against unbounded growth
> 
>
> Key: OAK-5661
> URL: https://issues.apache.org/jira/browse/OAK-5661
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Major
> Fix For: 1.38.0
>
>
> NRT Indexes for volatile indexes [1] can grow large if the async index update 
> faces issues. For example, if it gets stuck for days, or if due to some bug 
> the index does not get updates (as in OAK-5649), then the sizes can grow very 
> large.
> For such cases we should add some checks in the logic so that the system can 
> ensure that some cleanup is performed or writes to indexes are stopped. Also 
> such a situation should be flagged.
> [1] Indexes which see lots of additions and deletions. The effective index 
> size is smaller; however, if deletions are not applied (as is the case with 
> NRT) such indexes can grow large





[jira] [Updated] (OAK-9212) AzureArchiveManage.listArchives() should not delete segments

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-9212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-9212:
-
Fix Version/s: (was: 1.36.0)

> AzureArchiveManage.listArchives() should not delete segments
> 
>
> Key: OAK-9212
> URL: https://issues.apache.org/jira/browse/OAK-9212
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-azure
>Affects Versions: 1.32.0
>Reporter: Aravindo Wingeier
>Priority: Major
> Fix For: 1.38.0
>
>
> When the segments are listed on an azure blob store and the ".*" blob is 
> missing, it will delete all other segments. This behaviour was introduced 
> with OAK-8566. 
> *Change:* This destructive operation should not happen in the method 
> _AzureArchiveManage.listArchives()_, whose name indicates a read-only operation. 
> One option is to pull out this functionality and call it somewhere else. 
> *Why is this an issue?*  There is a recovery option in 
> org.apache.jackrabbit.oak.segment.file.tar.TarReader#collectFileEntries which 
> calls org.apache.jackrabbit.oak.segment.file.tar.TarReader#backupSafely. If 
> the recovery is run concurrently with _AzureArchiveManage.listArchives()_ the 
> result can be unexpected. 
>  





[jira] [Updated] (OAK-5889) Change internal queries to use "option(traversal fail)"

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5889:
-
Fix Version/s: (was: 1.36.0)

> Change internal queries to use "option(traversal fail)"
> ---
>
> Key: OAK-5889
> URL: https://issues.apache.org/jira/browse/OAK-5889
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: query
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
>Priority: Major
> Fix For: 1.38.0
>
>
> The Oak internal queries that use an index (that is, hopefully all of them) 
> should fail if no index is available. For example, it's better to fail 
> queries that search a node by UUID, if the UUID index is disabled, otherwise 
> for each such query, the complete repository is traversed.
> To do that, "option(traversal fail)" can be appended to the query.





[jira] [Updated] (OAK-6892) Query: ability to "nicely" traverse

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6892:
-
Fix Version/s: (was: 1.36.0)

> Query: ability to "nicely" traverse
> ---
>
> Key: OAK-6892
> URL: https://issues.apache.org/jira/browse/OAK-6892
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: query
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
>Priority: Major
> Fix For: 1.38.0
>
>
> Currently, queries that traverse many nodes log a warning, or can even fail 
> (if configured). This is to ensure system resources are not blocked (CPU, 
> I/O, memory).
> But there are cases where it doesn't make sense to create an index, but 
> traverse (a certain path structure, or sometimes even the whole repository). 
> For example, finding a text with "like '%xxx%'". The problem isn't that it's 
> slow; the problem is that it's blocking / slowing down other users. Another 
> example is during migration, where the alternative is to create an index 
> (which also traverses the repository).
> One option is to allow such queries to run, but throttle them. We could add 
> the hint {{option(traversal throttle)}} to do that. Throttle means: don't use 
> up all I/O, but yield to other tasks depending on config settings (during 
> migration, yield is not needed). As a rule of thumb, the longer the query 
> runs, the more it should yield (up to some value).
> It would be good to allow stopping such queries, and get progress 
> information. The easiest solution might be over JMX, and a more advanced 
> solution is using a new API (like, using an interface QueryTraversalObserver, 
> and have our QueryResult implement that interface).
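
One way to read the rule of thumb is as a yield time that grows with the query's runtime and is capped by a configured maximum. The sketch below is an assumption, not an Oak API: the class name and the choice of yielding 1% of the elapsed time are purely illustrative.

```java
/** Hypothetical throttle: how long a traversing query should pause before continuing. */
class TraversalThrottleSketch {

    /**
     * @param runtimeMillis  how long the query has been running
     * @param maxYieldMillis upper bound taken from configuration (0 disables yielding)
     */
    static long yieldMillis(long runtimeMillis, long maxYieldMillis) {
        long proposed = runtimeMillis / 100; // illustrative: 1% of elapsed time
        return Math.min(Math.max(proposed, 0L), maxYieldMillis);
    }
}
```

During migration the maximum could be configured as 0, which matches the note that yielding is not needed there.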





[jira] [Updated] (OAK-9024) oak-solr-osgi imports org.slf4j.impl

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-9024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-9024:
-
Fix Version/s: (was: 1.36.0)

> oak-solr-osgi imports org.slf4j.impl
> 
>
> Key: OAK-9024
> URL: https://issues.apache.org/jira/browse/OAK-9024
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: solr
>Reporter: Julian Reschke
>Assignee: Manfred Baedke
>Priority: Minor
> Fix For: 1.38.0
>
> Attachments: OAK-9024.patch
>
>
> From the manifest:
> {{org.slf4j.impl;version="[1.6,2)"}}





[jira] [Updated] (OAK-5463) Implement optimized MultiBinaryPropertyState.size(int)

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5463:
-
Fix Version/s: (was: 1.36.0)

> Implement optimized MultiBinaryPropertyState.size(int)
> --
>
> Key: OAK-5463
> URL: https://issues.apache.org/jira/browse/OAK-5463
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core
>Reporter: Marcel Reutegger
>Priority: Minor
> Fix For: 1.38.0
>
>
> {{MultiBinaryPropertyState}} currently does not have a {{size(int)}} 
> implementation, which means the base class will convert the {{Blob}} into a 
> String to get the size. This is inefficient and should have an optimized 
> implementation in {{MultiBinaryPropertyState}}.





[jira] [Updated] (OAK-6769) Convert oak-search-mt to OSGi R6 annotations

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-6769:
-
Fix Version/s: (was: 1.36.0)

> Convert oak-search-mt to OSGi R6 annotations
> 
>
> Key: OAK-6769
> URL: https://issues.apache.org/jira/browse/OAK-6769
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>Reporter: Robert Munteanu
>Priority: Major
> Fix For: 1.38.0
>
>






[jira] [Updated] (OAK-5927) Load excerpt lazily

2020-11-06 Thread Andrei Dulceanu (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Dulceanu updated OAK-5927:
-
Fix Version/s: (was: 1.36.0)

> Load excerpt lazily
> ---
>
> Key: OAK-5927
> URL: https://issues.apache.org/jira/browse/OAK-5927
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Reporter: Chetan Mehrotra
>Priority: Major
>  Labels: performance
> Fix For: 1.38.0
>
>
> Currently LucenePropertyIndex loads the excerpt eagerly in batch as part of 
> the loadDocs call. The load docs batch size doubles starting from 50 (max 
> 100k) as more data is read. 
> We should look into ways to load the excerpt lazily, as and when the caller 
> asks for it.
> Note that currently excerpts are only loaded when the query requests an 
> excerpt, i.e. there is a not-null property restriction for {{rep:excerpt}}. 
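
Lazy loading of this kind is commonly done by wrapping the expensive computation in a memoizing holder. The sketch below assumes a hypothetical LazyExcerpt class; it is not how LucenePropertyIndex is structured today, and the loader passed in stands for whatever code would compute the excerpt for one result row.

```java
import java.util.function.Supplier;

/** Hypothetical holder: computes the excerpt on first access and caches it. */
class LazyExcerpt {
    private final Supplier<String> loader;
    private String value;
    private boolean loaded;

    LazyExcerpt(Supplier<String> loader) {
        this.loader = loader;
    }

    /** Runs the loader at most once, and only if a caller asks for the excerpt. */
    String get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }
}
```

Rows whose excerpt is never read would then cost nothing during loadDocs, regardless of the batch size.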




