[jira] [Created] (OAK-7980) Build Jackrabbit Oak #1881 failed
Hudson created OAK-7980: --- Summary: Build Jackrabbit Oak #1881 failed Key: OAK-7980 URL: https://issues.apache.org/jira/browse/OAK-7980 Project: Jackrabbit Oak Issue Type: Bug Components: continuous integration Reporter: Hudson No description is provided The build Jackrabbit Oak #1881 has failed. First failed run: [Jackrabbit Oak #1881|https://builds.apache.org/job/Jackrabbit%20Oak/1881/] [console log|https://builds.apache.org/job/Jackrabbit%20Oak/1881/console] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (OAK-7965) Build Jackrabbit Oak #1859 failed
[ https://issues.apache.org/jira/browse/OAK-7965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738454#comment-16738454 ] Hudson commented on OAK-7965: - Previously failing build now is OK. Passed run: [Jackrabbit Oak #1880|https://builds.apache.org/job/Jackrabbit%20Oak/1880/] [console log|https://builds.apache.org/job/Jackrabbit%20Oak/1880/console] > Build Jackrabbit Oak #1859 failed > - > > Key: OAK-7965 > URL: https://issues.apache.org/jira/browse/OAK-7965 > Project: Jackrabbit Oak > Issue Type: Bug > Components: continuous integration >Reporter: Hudson >Priority: Major > > No description is provided > The build Jackrabbit Oak #1859 has failed. > First failed run: [Jackrabbit Oak > #1859|https://builds.apache.org/job/Jackrabbit%20Oak/1859/] [console > log|https://builds.apache.org/job/Jackrabbit%20Oak/1859/console] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (OAK-7979) DeclaredMembershipPredicate does not compile with Guava 20
[ https://issues.apache.org/jira/browse/OAK-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738452#comment-16738452 ] Julian Reschke commented on OAK-7979: - trunk: [r1850882|http://svn.apache.org/r1850882] > DeclaredMembershipPredicate does not compile with Guava 20 > -- > > Key: OAK-7979 > URL: https://issues.apache.org/jira/browse/OAK-7979 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: core >Reporter: Julian Reschke >Assignee: Julian Reschke >Priority: Minor > Labels: candidate_oak_1_10 > Fix For: 1.12, 1.11.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (OAK-7979) DeclaredMembershipPredicate does not compile with Guava 20
[ https://issues.apache.org/jira/browse/OAK-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julian Reschke resolved OAK-7979. - Resolution: Fixed Fix Version/s: 1.11.0 > DeclaredMembershipPredicate does not compile with Guava 20 > -- > > Key: OAK-7979 > URL: https://issues.apache.org/jira/browse/OAK-7979 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: core >Reporter: Julian Reschke >Assignee: Julian Reschke >Priority: Minor > Fix For: 1.12, 1.11.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7979) DeclaredMembershipPredicate does not compile with Guava 20
[ https://issues.apache.org/jira/browse/OAK-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julian Reschke updated OAK-7979: Labels: candidate_oak_1_10 (was: ) > DeclaredMembershipPredicate does not compile with Guava 20 > -- > > Key: OAK-7979 > URL: https://issues.apache.org/jira/browse/OAK-7979 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: core >Reporter: Julian Reschke >Assignee: Julian Reschke >Priority: Minor > Labels: candidate_oak_1_10 > Fix For: 1.12, 1.11.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7234) Check for outdated journal at startup
[ https://issues.apache.org/jira/browse/OAK-7234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7234: -- Fix Version/s: (was: 1.10.0) > Check for outdated journal at startup > - > > Key: OAK-7234 > URL: https://issues.apache.org/jira/browse/OAK-7234 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Michael Dürig >Assignee: Michael Dürig >Priority: Minor > Labels: resilience, tooling > Fix For: 1.12 > > > To prevent accidentally branching the repository when the {{journal.log}} > has become outdated (e.g. OAK-3702) we could add an additional safety feature > which would prevent the repository from starting in such cases. There are a > couple of concerns to address: > * What kind of tooling / guidance do we need to provide to recover should > such a situation be detected? > * How do we detect the {{journal.log}} being outdated? > * How do we prevent false positives? > * How do we deal with situations where the {{journal.log}} modifications are > intended (e.g. by tools, or manual interventions)? > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-6254) DataStore: API to retrieve approximate storage size
[ https://issues.apache.org/jira/browse/OAK-6254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6254: -- Fix Version/s: 1.12 > DataStore: API to retrieve approximate storage size > --- > > Key: OAK-6254 > URL: https://issues.apache.org/jira/browse/OAK-6254 > Project: Jackrabbit Oak > Issue Type: Bug > Components: blob >Reporter: Thomas Mueller >Priority: Major > Fix For: 1.10.0, 1.12 > > > The estimated size of the datastore (on disk) is needed to: > * monitor growth over time, or growth of certain operations > * monitor if garbage collection is effective > * avoid out of disk space > * estimate backup size > * statistical purposes (for example, if there are many repositories, to group > them by size) > Datastore size: we could use the following heuristic: We could read the file > sizes in ./datastore/00/00 (if it exists) and multiply by 65536; or > ./datastore/00 and multiply by 256. That would give a rough estimation > (within about 20% for repositories with datastore size > 50 GB). > I think this is mainly important for the FileDataStore. The S3 datastore, if > there is a simple and fast S3 API to read the size, then that would be good > as well, but if there is none, then returning "unknown" is fine for me. > As for the API, I would use something like this: {{long > getEstimatedStorageSize(int accuracyLevel)}} with accuracyLevel 1 for > inaccurate (fastest), 2 more accurate (slower),..., 9 precise (possibly very > slow). Similar to > [java.util.zip.Deflater.setLevel|https://docs.oracle.com/javase/7/docs/api/java/util/zip/Deflater.html#setLevel(int)]. > I would expect it takes up to 1 second for accuracyLevel 0, up to 5 seconds > for accuracyLevel 1, and possibly hours for level 9. With level 1, I would > read files in 00/00, with level 2 - 8 I would read files in 00, and with > level 9 I would read all the files. 
For level 1, I wouldn't stop; for level 2, if it takes more than 5 seconds, I would stop and return the current best estimate. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
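The sampling heuristic described in OAK-6254 can be sketched as follows. This is a minimal illustration only, not the Oak API: the class `DataStoreSizeEstimator`, the `UNKNOWN` constant and the explicit `root` parameter are assumptions made for the example, and the time limit mentioned in the issue is left out for brevity.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

/** Hypothetical helper for illustration; not an Oak class. */
final class DataStoreSizeEstimator {

    /** Returned when no estimate can be made ("unknown"). */
    static final long UNKNOWN = -1L;

    /** Sums the sizes of all regular files below dir, or returns UNKNOWN if dir is missing. */
    static long sizeOf(Path dir) {
        if (!Files.isDirectory(dir)) {
            return UNKNOWN;
        }
        try (Stream<Path> files = Files.walk(dir)) {
            return files.filter(Files::isRegularFile)
                        .mapToLong(DataStoreSizeEstimator::fileSize)
                        .sum();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static long fileSize(Path file) {
        try {
            return Files.size(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Level 1 samples root/00/00 and multiplies by 65536 (falling back to
     * root/00 if that directory does not exist), levels 2 to 8 sample
     * root/00 and multiply by 256, level 9 reads everything.
     */
    static long getEstimatedStorageSize(Path root, int accuracyLevel) {
        if (accuracyLevel >= 9) {
            return sizeOf(root);
        }
        if (accuracyLevel <= 1) {
            long sample = sizeOf(root.resolve("00").resolve("00"));
            if (sample != UNKNOWN) {
                return sample * 65536L;
            }
        }
        long sample = sizeOf(root.resolve("00"));
        return sample == UNKNOWN ? UNKNOWN : sample * 256L;
    }
}
```

The multipliers follow from the directory fan-out: one of 256 first-level directories, or one of 256 * 256 = 65536 second-level directories, is taken as representative of the whole store.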
[jira] [Updated] (OAK-6619) Async indexer thread may get stuck in CopyOnWriteDirectory close method
[ https://issues.apache.org/jira/browse/OAK-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6619: -- Fix Version/s: (was: 1.10.0) > Async indexer thread may get stuck in CopyOnWriteDirectory close method > --- > > Key: OAK-6619 > URL: https://issues.apache.org/jira/browse/OAK-6619 > Project: Jackrabbit Oak > Issue Type: Bug > Components: lucene >Reporter: Chetan Mehrotra >Assignee: Chetan Mehrotra >Priority: Critical > Fix For: 1.12 > > Attachments: status-threaddump-Sep-5.txt > > > With copy-on-write mode enabled at times its seen that async index thread > remain stuck in CopyOnWriteDirectory#close method > {noformat} > "async-index-update-async" prio=5 tid=0xb9e63 nid=0x timed_waiting >java.lang.Thread.State: TIMED_WAITING > at sun.misc.Unsafe.park(Native Method) > - waiting to lock <0x2504cd51> (a > java.util.concurrent.CountDownLatch$Sync) owned by "null" tid=0x-1 > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) > at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory.close(CopyOnWriteDirectory.java:221) > at > org.apache.jackrabbit.oak.plugins.index.lucene.writer.DefaultIndexWriter.updateSuggester(DefaultIndexWriter.java:177) > at > org.apache.jackrabbit.oak.plugins.index.lucene.writer.DefaultIndexWriter.close(DefaultIndexWriter.java:121) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorContext.closeWriter(LuceneIndexEditorContext.java:136) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.leave(LuceneIndexEditor.java:154) > at > 
org.apache.jackrabbit.oak.plugins.index.IndexUpdate.leave(IndexUpdate.java:357) > at > org.apache.jackrabbit.oak.spi.commit.VisibleEditor.leave(VisibleEditor.java:60) > at > org.apache.jackrabbit.oak.spi.commit.EditorDiff.process(EditorDiff.java:56) > at > org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.updateIndex(AsyncIndexUpdate.java:727) > at > org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.runWhenPermitted(AsyncIndexUpdate.java:572) > at > org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.run(AsyncIndexUpdate.java:431) > - locked <0x3d542de5> (a > org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate) > at > org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:245) > at org.quartz.core.JobRunShell.run(JobRunShell.java:202) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > The thread is waiting on a latch and no other thread is going to release the > latch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-5917) Document enhancements in indexing in 1.6
[ https://issues.apache.org/jira/browse/OAK-5917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5917: -- Fix Version/s: (was: 1.10.0) > Document enhancements in indexing in 1.6 > > > Key: OAK-5917 > URL: https://issues.apache.org/jira/browse/OAK-5917 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: doc >Reporter: Chetan Mehrotra >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.12 > > > This task is meant to collect and refer work done in 1.6 release which needs > to be documented in Oak docs. > Issues in lucene and query area > [jql|https://issues.apache.org/jira/issues/?jql=project%20%3D%20OAK%20AND%20fixVersion%20%3D%201.6.0%20and%20component%20in%20(lucene%2C%20query)%20ORDER%20BY%20updated%20DESC%2C%20priority%20DESC%2C%20created%20ASC] > Topics to cover > * OAK-4412 - Lucene Hybrid Index (/) > * OAK-4939 - Isolation of corrupted index (/) > * OAK-4974 - Enable configuring QueryEngineSettings via OSGi config > * OAK-3574 - Function based indexes > * OAK-4400 - Correlate index with the index definition used to build it (/) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-1838) NodeStore API: divergence in contract and implementations
[ https://issues.apache.org/jira/browse/OAK-1838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-1838: -- Fix Version/s: (was: 1.10.0) > NodeStore API: divergence in contract and implementations > - > > Key: OAK-1838 > URL: https://issues.apache.org/jira/browse/OAK-1838 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core >Reporter: Michael Dürig >Assignee: Marcel Reutegger >Priority: Major > Fix For: 1.12 > > > Currently there is a gap between what the API mandates and what the document > and kernel node stores implement. This hinders further evolution of the Oak > stack as implementers must always be aware of non explicit design > requirements. We should look into ways we could close this gap by bringing > implementation and specification closer towards each other. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-6762) Convert oak-blob to OSGi R6 annotations
[ https://issues.apache.org/jira/browse/OAK-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6762: -- Fix Version/s: (was: 1.10.0) > Convert oak-blob to OSGi R6 annotations > --- > > Key: OAK-6762 > URL: https://issues.apache.org/jira/browse/OAK-6762 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: blob >Reporter: Robert Munteanu >Priority: Major > Fix For: 1.12 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-6973) Define public/internal packages
[ https://issues.apache.org/jira/browse/OAK-6973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6973: -- Fix Version/s: (was: 1.10.0) > Define public/internal packages > --- > > Key: OAK-6973 > URL: https://issues.apache.org/jira/browse/OAK-6973 > Project: Jackrabbit Oak > Issue Type: Task >Reporter: Marcel Reutegger >Priority: Major > Fix For: 1.12 > > > As part of the Oak modularization packages previously exported without a > version will at some point have to adhere to proper semantic versioning. See > also OAK-3919 and its sub-tasks. > Since some of those packages are not meant to be used outside of Oak, there > should be a mechanism to define which exported packages are public and which > are considered internal. While semantic versioning rules apply to both > categories, we may want to provide different guarantees/guidance to consumers > of those packages. E.g. increasing the major version of a package used only > by Oak has less impact compared to a major version increase of a 'public' > package used by many applications. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7635) oak-run check should support Azure Segment Store
[ https://issues.apache.org/jira/browse/OAK-7635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7635: -- Fix Version/s: (was: 1.10.0) > oak-run check should support Azure Segment Store > > > Key: OAK-7635 > URL: https://issues.apache.org/jira/browse/OAK-7635 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: run, segment-tar >Reporter: Andrei Dulceanu >Assignee: Andrei Dulceanu >Priority: Major > Labels: tooling > Fix For: 1.12 > > > {{oak-run check}} should accept Azure URIs for the segment store in order to > be able to check for data integrity. This will come handy in the light of > remote compacted segment stores and/or sidegraded remote segment stores (see > OAK-7623, OAK-7459). > The Azure URI will be taken as argument and will have the following format: > {{az:[https://myaccount.blob.core.windows.net/container/repo]}}, where _az_ > identifies the cloud provider. The last missing piece is the secret key which > will be supplied as an environment variable, i.e. _AZURE_SECRET_KEY._ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
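To illustrate the argument format proposed in OAK-7635, here is a minimal sketch of how such a value could be split up. The class `AzureSegmentStoreUri` is hypothetical, and the reading of the URL as account / container / directory is an assumption based on the example URI in the issue; only the `az:` prefix and the `AZURE_SECRET_KEY` variable name come from the description.

```java
import java.net.URI;

/** Hypothetical parser for illustration; not part of oak-run. */
final class AzureSegmentStoreUri {
    final String account;
    final String container;
    final String directory;

    private AzureSegmentStoreUri(String account, String container, String directory) {
        this.account = account;
        this.container = container;
        this.directory = directory;
    }

    /** Parses a value of the form az:https://ACCOUNT.blob.core.windows.net/CONTAINER/DIRECTORY. */
    static AzureSegmentStoreUri parse(String arg) {
        if (arg == null || !arg.startsWith("az:")) {
            throw new IllegalArgumentException("Expected an az: URI, got: " + arg);
        }
        URI uri = URI.create(arg.substring("az:".length()));
        String host = uri.getHost();
        String path = uri.getPath();
        if (host == null || host.indexOf('.') < 1 || path == null) {
            throw new IllegalArgumentException("Cannot determine the account from: " + arg);
        }
        // Drop the leading slash, then split into container and the remaining directory.
        String[] parts = path.replaceFirst("^/", "").split("/", 2);
        if (parts[0].isEmpty()) {
            throw new IllegalArgumentException("Missing container in: " + arg);
        }
        String account = host.substring(0, host.indexOf('.'));
        return new AzureSegmentStoreUri(account, parts[0], parts.length > 1 ? parts[1] : "");
    }

    /** The secret key is read from the environment; null if the variable is not set. */
    static String secretKeyFromEnvironment() {
        return System.getenv("AZURE_SECRET_KEY");
    }
}
```

Keeping the secret out of the URI means it does not end up in shell history or process listings along with the other arguments.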
[jira] [Updated] (OAK-6768) Convert oak-remote to OSGi R6 annotations
[ https://issues.apache.org/jira/browse/OAK-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6768: -- Fix Version/s: (was: 1.10.0) > Convert oak-remote to OSGi R6 annotations > - > > Key: OAK-6768 > URL: https://issues.apache.org/jira/browse/OAK-6768 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: remoting >Reporter: Robert Munteanu >Priority: Major > Fix For: 1.12 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-6628) More precise indexRules support via filtering criteria on property
[ https://issues.apache.org/jira/browse/OAK-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6628: -- Fix Version/s: 1.12 > More precise indexRules support via filtering criteria on property > -- > > Key: OAK-6628 > URL: https://issues.apache.org/jira/browse/OAK-6628 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene >Reporter: Chetan Mehrotra >Priority: Major > Fix For: 1.10.0, 1.12 > > > For the Lucene index we currently support indexRules based on nodetype. Here the > recommendation is that users must use the most precise nodeType/mixinType to > target the indexing rule so that only relevant nodes are indexed. > For many Sling based applications it is seen that lots of content is > nt:unstructured and uses the {{sling:resourceType}} property to distinguish > various such nt:unstructured nodes. Currently it is not possible to target an > index definition to index only those nt:unstructured nodes which have a specific > {{sling:resourceType}}, which makes it harder to provide more precise index > definitions. > To help in such cases we can generalize the indexRule support via a filtering > criterion > {noformat} > activityIndex > - type = "lucene" > + indexRules > + nt:unstructured > - filter-property = "sling:resourceType" > - filter-value = "app/activitystreams/components/activity" > + properties > - jcr:primaryType = "nt:unstructured" > + verb > - propertyIndex = true > - name = "verb" > {noformat} > So an indexRule would have 2 more config properties > * filter-property - Name of property to match > * filter-value - The value to match > *Indexing* > At time of indexing LuceneIndexEditor currently calls > {{indexDefinition.getApplicableIndexingRule}}, passing it the NodeState. > Currently this checks only jcr:primaryType and jcr:mixinTypes to find the > matching rule. > This logic would need to be extended to also check if any filter-property is > defined in the definition. 
If yes, then check if the NodeState has that value > *Querying* > On the query side we need to change the IndexPlanner, where it currently uses the query > nodetype for finding the matching indexRule. In addition it would need to pass on > the property restrictions, and the rule would only be matched if the property > restriction matches the filter > *Open Item* > # How to handle a change in the filter-property value. I think we have a similar > problem currently if an indexed node's nodeType gets changed. In such a case we > do not remove it from the index. So we need to solve that for both > # Ensure that all places where rules are matched account for this filter > concept -- This message was sent by Atlassian JIRA (v7.6.3#76005)
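The extended rule matching proposed in OAK-6628 could look roughly like the sketch below. It is illustrative only: `FilteredIndexRule` is a hypothetical class, a plain `Map<String, String>` stands in for Oak's NodeState, and mixin types are ignored to keep the example short.

```java
import java.util.Map;

/** Hypothetical stand-in for an index rule with the proposed filter settings. */
final class FilteredIndexRule {
    private final String nodeType;
    private final String filterProperty; // null means: no filter configured
    private final String filterValue;

    FilteredIndexRule(String nodeType, String filterProperty, String filterValue) {
        this.nodeType = nodeType;
        this.filterProperty = filterProperty;
        this.filterValue = filterValue;
    }

    /**
     * A rule applies if the primary type matches and, when a filter is
     * configured, the node carries the filter property with the given value.
     */
    boolean appliesTo(Map<String, String> nodeProperties) {
        if (!nodeType.equals(nodeProperties.get("jcr:primaryType"))) {
            return false;
        }
        if (filterProperty == null) {
            return true; // today's behaviour: node type alone decides
        }
        return filterValue != null && filterValue.equals(nodeProperties.get(filterProperty));
    }
}
```

With the rule from the example definition (`nt:unstructured`, `sling:resourceType`, `app/activitystreams/components/activity`), an nt:unstructured node with a different resource type would no longer be indexed.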
[jira] [Updated] (OAK-7635) oak-run check should support Azure Segment Store
[ https://issues.apache.org/jira/browse/OAK-7635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7635: -- Fix Version/s: 1.12 > oak-run check should support Azure Segment Store > > > Key: OAK-7635 > URL: https://issues.apache.org/jira/browse/OAK-7635 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: run, segment-tar >Reporter: Andrei Dulceanu >Assignee: Andrei Dulceanu >Priority: Major > Labels: tooling > Fix For: 1.10.0, 1.12 > > > {{oak-run check}} should accept Azure URIs for the segment store in order to > be able to check for data integrity. This will come handy in the light of > remote compacted segment stores and/or sidegraded remote segment stores (see > OAK-7623, OAK-7459). > The Azure URI will be taken as argument and will have the following format: > {{az:[https://myaccount.blob.core.windows.net/container/repo]}}, where _az_ > identifies the cloud provider. The last missing piece is the secret key which > will be supplied as an environment variable, i.e. _AZURE_SECRET_KEY._ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7328) Update DocumentNodeStore based OakFixture
[ https://issues.apache.org/jira/browse/OAK-7328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7328: -- Fix Version/s: 1.12 > Update DocumentNodeStore based OakFixture > - > > Key: OAK-7328 > URL: https://issues.apache.org/jira/browse/OAK-7328 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: run >Reporter: Marcel Reutegger >Assignee: Marcel Reutegger >Priority: Minor > Fix For: 1.10.0, 1.12 > > > The current OakFixtures using a DocumentNodeStore use a configuration / setup > which is different from what a default DocumentNodeStoreService would use. It > would be better if benchmarks run with a configuration close to a default > setup. The main differences identified are: > - Does not have a proper executor, which means some tasks are executed with > the same thread. > - Does not use a separate persistent cache for the journal (diff cache). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7754) Option to disable BlobTracker
[ https://issues.apache.org/jira/browse/OAK-7754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7754: -- Fix Version/s: 1.12 > Option to disable BlobTracker > - > > Key: OAK-7754 > URL: https://issues.apache.org/jira/browse/OAK-7754 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: blob-plugins >Reporter: Amit Jain >Assignee: Amit Jain >Priority: Major > Fix For: 1.10.0, 1.12 > > > Enable an option to disable blob tracker completely for deployments where > DSGC is explicitly disabled. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-6597) rep:excerpt not working for content indexed by aggregation in lucene
[ https://issues.apache.org/jira/browse/OAK-6597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6597: -- Fix Version/s: 1.12 > rep:excerpt not working for content indexed by aggregation in lucene > > > Key: OAK-6597 > URL: https://issues.apache.org/jira/browse/OAK-6597 > Project: Jackrabbit Oak > Issue Type: Bug > Components: lucene >Affects Versions: 1.6.1, 1.7.6, 1.8.0 >Reporter: Dirk Rudolph >Assignee: Chetan Mehrotra >Priority: Major > Labels: excerpt > Fix For: 1.10.0, 1.12 > > Attachments: excerpt-with-aggregation-test.patch > > > I mentioned that properties that got indexed due to an aggregation are not > considered for excerpts (highlighting) as they are not indexed as stored > fields. > See the attached patch that implements a test for excerpts in > {{LuceneIndexAggregationTest2}}. > It creates the following structure: > {code} > /content/foo [test:Page] > + bar (String) > - jcr:content [test:PageContent] > + bar (String) > {code} > where both strings (the _bar_ property at _foo_ and the _bar_ property at > _jcr:content_) contain different text. > Afterwards it queries for 2 terms ("tinc*" and "aliq*") that either exist in > _/content/foo/bar_ or _/content/foo/jcr:content/bar_ but not in both. For the > former one the excerpt is properly provided; for the latter one it isn't. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-5927) Load excerpt lazily
[ https://issues.apache.org/jira/browse/OAK-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5927: -- Fix Version/s: 1.12 > Load excerpt lazily > --- > > Key: OAK-5927 > URL: https://issues.apache.org/jira/browse/OAK-5927 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene >Reporter: Chetan Mehrotra >Priority: Major > Labels: performance > Fix For: 1.10.0, 1.12 > > > Currently LucenePropertyIndex loads the excerpt eagerly in batch as part of > the loadDocs call. The load docs batch size doubles starting from 50 (max 100k) > as more data is read. > We should look into ways to load the excerpt lazily, as and when the caller > asks for it. > Note that currently excerpts are only loaded when the query requests an > excerpt, i.e. there is a not-null property restriction for {{rep:excerpt}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
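One generic way to defer the work described in OAK-5927 is a holder that runs the expensive lookup only on first access. The sketch below shows that pattern in isolation; `LazyExcerpt` is a hypothetical class and says nothing about how LucenePropertyIndex is, or would be, structured.

```java
import java.util.function.Supplier;

/** Hypothetical lazy holder: the loader runs at most once, on first access. */
final class LazyExcerpt {
    private final Supplier<String> loader;
    private String value;
    private boolean loaded;

    LazyExcerpt(Supplier<String> loader) {
        this.loader = loader;
    }

    /** Returns the excerpt, computing it on the first call only. */
    synchronized String get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }

    /** True once the loader has been invoked. */
    synchronized boolean isLoaded() {
        return loaded;
    }
}
```

Rows whose excerpt is never read by the caller would then never trigger the highlighting work, whatever the batch size used for loading documents.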
[jira] [Updated] (OAK-6836) OnRC report
[ https://issues.apache.org/jira/browse/OAK-6836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6836: -- Fix Version/s: 1.12 > OnRC report > --- > > Key: OAK-6836 > URL: https://issues.apache.org/jira/browse/OAK-6836 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Valentin Olteanu >Priority: Minor > Labels: production > Fix For: 1.10.0, 1.12 > > Attachments: gcreport.png > > > Currently, the information regarding an online revision cleanup execution is > scattered across multiple log lines and partially available in the attributes > of {{SegmentRevisionGarbageCollection}} MBean. > While useful for debugging, this is hard to grasp for users that need to > understand the full process to be able to read it. > The idea would be to create a "report" with all the details of an execution > and output it at the end - write to log, but also store it in the MBean, from > where it can be consumed by monitoring and health checks. > In the MBean, this would replace the _Last*_ attributes. > In the logs, this could replace all the intermediary logs (switch them to > DEBUG). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-2787) Faster multi threaded indexing / text extraction for binary content
[ https://issues.apache.org/jira/browse/OAK-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-2787: -- Fix Version/s: 1.12 > Faster multi threaded indexing / text extraction for binary content > --- > > Key: OAK-2787 > URL: https://issues.apache.org/jira/browse/OAK-2787 > Project: Jackrabbit Oak > Issue Type: Wish > Components: lucene >Reporter: Chetan Mehrotra >Priority: Major > Fix For: 1.10.0, 1.12 > > > With Lucene based indexing the indexing process is single threaded. This > hampers the indexing of binary content, as on a multi processor system only a > single thread can be used to perform the indexing > [~ianeboston] suggested a possible approach [1] involving a 2 phase indexing > # In the first phase detect the nodes to be indexed and start the full text > extraction of the binary content. Post extraction save the binary token > stream back to the node as hidden data. In this phase the node properties > can still be indexed and a marker field would be added to indicate the > fulltext index is still pending > # Later in the 2nd phase look for all such Lucene docs and then update them with > the saved token stream > This would allow the text extraction logic to be decoupled from the Lucene > indexing logic > [1] http://markmail.org/thread/2w5o4bwqsosb6esu -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-6421) Phase out JCR Locking support
[ https://issues.apache.org/jira/browse/OAK-6421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6421: -- Fix Version/s: (was: 1.10.0) > Phase out JCR Locking support > - > > Key: OAK-6421 > URL: https://issues.apache.org/jira/browse/OAK-6421 > Project: Jackrabbit Oak > Issue Type: Task > Components: jcr >Reporter: Marcel Reutegger >Priority: Major > Fix For: 1.12 > > > Oak currently has a lot of gaps in its JCR Locking implementation (see > OAK-1962), which basically makes it non-compliant with the JCR specification. > I propose we phase out the support for JCR Locking because a proper > implementation would be rather complex with a runtime behaviour that is very > different in a standalone deployment compared to a cluster. In the standalone > case a lock could be acquired very quickly, while in the distributed case, > the operations would be multiple orders of magnitude slower, depending on how > cluster nodes are geographically distributed. > Applications that rely on strict lock semantics should use other mechanisms, > built explicitly for this purpose. E.g. Apache Zookeeper. > To ease upgrade and migration to a different lock mechanism, the proposal is > to introduce a flag or configuration that controls the level of support for > JCR Locking: > - DISABLED: the implementation does not support JCR Locking at all. Methods > will throw UnsupportedRepositoryOperationException when defined by the JCR > specification. > - DEPRECATED: the implementation behaves as right now, but logs a warn or > error message that JCR Locking does not work as specified and will be removed > in a future version of Oak. > In a later release (e.g. 1.10) the current JCR Locking implementation would > be removed entirely and unconditionally throw an exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
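The two support levels proposed in OAK-6421 can be pictured with a small guard that every lock-related method would call first. This is a sketch under stated assumptions: `LockGuard` and its nested `Mode` enum are hypothetical names, and `java.lang.UnsupportedOperationException` stands in for `javax.jcr.UnsupportedRepositoryOperationException` so that the example has no JCR dependency.

```java
import java.util.logging.Logger;

/** Hypothetical guard illustrating the DISABLED / DEPRECATED levels. */
final class LockGuard {

    enum Mode { DISABLED, DEPRECATED }

    private static final Logger LOG = Logger.getLogger(LockGuard.class.getName());

    private final Mode mode;

    LockGuard(Mode mode) {
        this.mode = mode;
    }

    /**
     * DISABLED: reject the call. DEPRECATED: log a warning and let the
     * existing implementation carry on.
     */
    void checkLockingAllowed() {
        if (mode == Mode.DISABLED) {
            // Stand-in for javax.jcr.UnsupportedRepositoryOperationException
            throw new UnsupportedOperationException("JCR Locking is disabled");
        }
        LOG.warning("JCR Locking does not work as specified and will be removed in a future version");
    }
}
```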
[jira] [Updated] (OAK-6577) Determine the approach for reindexing in case of CompositeNodeStore setups
[ https://issues.apache.org/jira/browse/OAK-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6577: -- Fix Version/s: (was: 1.10.0) > Determine the approach for reindexing in case of CompositeNodeStore setups > -- > > Key: OAK-6577 > URL: https://issues.apache.org/jira/browse/OAK-6577 > Project: Jackrabbit Oak > Issue Type: Task > Components: composite, indexing >Reporter: Chetan Mehrotra >Assignee: Chetan Mehrotra >Priority: Major > Fix For: 1.12 > > > Current index tooling is designed to work with single NodeStore setups. We > should determine how reindexing should be done for a CompositeNodeStore setup, > especially where one of the mounts is private and read only -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-6515) Decouple indexing and upload to datastore
[ https://issues.apache.org/jira/browse/OAK-6515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6515: -- Fix Version/s: (was: 1.10.0) > Decouple indexing and upload to datastore > - > > Key: OAK-6515 > URL: https://issues.apache.org/jira/browse/OAK-6515 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: indexing, lucene, query >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Minor > Fix For: 1.12 > > > Currently the default async index delay is 5 seconds. Using a larger delay > (e.g. 15 seconds) reduces index related growth, however diffing is delayed 15 > seconds, which can reduce indexing performance. > One option (which might require bigger changes) is to index every 5 seconds, > and store the index every 5 seconds in the local directory, but only write to > the datastore / nodestore every 3rd time (that is, every 15 seconds). > So that other cluster nodes will only see the index update every 15 seconds. > The diffing is done every 5 seconds, and the local index could be used every > 5 or every 15 seconds. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
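The schedule suggested in OAK-6515 (index locally on every run, write to the shared store only on every third run) comes down to a simple counter. The sketch below shows just that bookkeeping; `IndexCycle` and its method names are hypothetical and no real indexing or storage calls are involved.

```java
/** Hypothetical scheduler bookkeeping: persist on every n-th indexing cycle. */
final class IndexCycle {
    private final int persistEvery;
    private int cycles;

    IndexCycle(int persistEvery) {
        if (persistEvery < 1) {
            throw new IllegalArgumentException("persistEvery must be >= 1");
        }
        this.persistEvery = persistEvery;
    }

    /**
     * Called once per async indexing run (e.g. every 5 seconds). The local
     * index is assumed to be updated on every call; the return value tells
     * the caller whether this run should also write to the datastore / nodestore.
     */
    boolean runCycle() {
        cycles++;
        return cycles % persistEvery == 0;
    }
}
```

With `new IndexCycle(3)` and a 5 second delay, diffing still happens every 5 seconds while other cluster nodes see an index update every 15 seconds.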
[jira] [Updated] (OAK-7390) QueryResult.getSize() can be slow for many "or" or "union" conditions
[ https://issues.apache.org/jira/browse/OAK-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7390: -- Fix Version/s: (was: 1.10.0) > QueryResult.getSize() can be slow for many "or" or "union" conditions > - > > Key: OAK-7390 > URL: https://issues.apache.org/jira/browse/OAK-7390 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: query >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.12 > > > For queries with many union conditions, the "fast" getSize method can > actually be slower than iterating over the result. > The reason is, the number of index calls grows quadratically with the > number of subqueries: (3x + x^2) / 2, where x is the number of subqueries. > For this to have a measurable effect, the number of subqueries needs to be > large (more than 100), and the index needs to be slow. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
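The formula quoted in OAK-7390 is a quadratic in x and can be evaluated directly to see why only large unions matter. The class name `UnionSizeCost` is made up for this example; the formula itself is taken from the issue.

```java
/** Evaluates the index-call count from the issue: (3x + x^2) / 2. */
final class UnionSizeCost {

    /** x is the number of subqueries; 3x + x^2 = x(x + 3) is always even, so the division is exact. */
    static long indexCalls(long x) {
        return (3 * x + x * x) / 2;
    }

    public static void main(String[] args) {
        for (long x : new long[] {2, 10, 100}) {
            System.out.println(x + " subqueries -> " + indexCalls(x) + " index calls");
        }
    }
}
```

For 2, 10 and 100 subqueries this gives 5, 65 and 5150 index calls respectively, so a slow index combined with a union of more than 100 subqueries is where the "fast" getSize starts to hurt.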
[jira] [Updated] (OAK-7457) "Covariant return type change detected" warnings with java10
[ https://issues.apache.org/jira/browse/OAK-7457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7457: -- Fix Version/s: (was: 1.10.0) > "Covariant return type change detected" warnings with java10 > > > Key: OAK-7457 > URL: https://issues.apache.org/jira/browse/OAK-7457 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: documentmk, segment-tar >Reporter: Julian Reschke >Assignee: Julian Reschke >Priority: Major > Fix For: 1.12 > > > We have quite a few warnings of type "Covariant return type change detected": > {noformat} > [INFO] > C:\projects\apache\oak\trunk\oak-store-document\src\main\java\org\apache\jackrabbit\oak\plugins\document\persistentCache\broadcast\TCPBroadcaster.java:327: > Covariant return type change detected: java.nio.Buffer > java.nio.ByteBuffer.flip() has been changed to java.nio.ByteBuffer > java.nio.ByteBuffer.flip() > [INFO] > C:\projects\apache\oak\trunk\oak-store-document\src\main\java\org\apache\jackrabbit\oak\plugins\document\persistentCache\broadcast\UDPBroadcaster.java:135: > Covariant return type change detected: java.nio.Buffer > java.nio.ByteBuffer.limit(int) has been changed to java.nio.ByteBuffer > java.nio.ByteBuffer.limit(int) > [INFO] > C:\projects\apache\oak\trunk\oak-store-document\src\main\java\org\apache\jackrabbit\oak\plugins\document\persistentCache\broadcast\UDPBroadcaster.java:138: > Covariant return type change detected: java.nio.Buffer > java.nio.ByteBuffer.position(int) has been changed to java.nio.ByteBuffer > java.nio.ByteBuffer.position(int) > [INFO] > C:\projects\apache\oak\trunk\oak-store-document\src\main\java\org\apache\jackrabbit\oak\plugins\document\persistentCache\broadcast\TCPBroadcaster.java:226: > Covariant return type change detected: java.nio.Buffer > java.nio.ByteBuffer.position(int) has been changed to java.nio.ByteBuffer > java.nio.ByteBuffer.position(int) > [INFO] > 
C:\projects\apache\oak\trunk\oak-store-document\src\main\java\org\apache\jackrabbit\oak\plugins\document\persistentCache\broadcast\InMemoryBroadcaster.java:35: > Covariant return type change detected: java.nio.Buffer > java.nio.ByteBuffer.position(int) has been changed to java.nio.ByteBuffer > java.nio.ByteBuffer.position(int) > [INFO] > C:\projects\apache\oak\trunk\oak-store-document\src\main\java\org\apache\jackrabbit\oak\plugins\document\persistentCache\PersistentCache.java:519: > Covariant return type change detected: java.nio.Buffer > java.nio.ByteBuffer.limit(int) has been changed to java.nio.ByteBuffer > java.nio.ByteBuffer.limit(int) > [INFO] > C:\projects\apache\oak\trunk\oak-store-document\src\main\java\org\apache\jackrabbit\oak\plugins\document\persistentCache\PersistentCache.java:522: > Covariant return type change detected: java.nio.Buffer > java.nio.ByteBuffer.position(int) has been changed to java.nio.ByteBuffer > java.nio.ByteBuffer.position(int) > [INFO] > C:\projects\apache\oak\trunk\oak-store-document\src\main\java\org\apache\jackrabbit\oak\plugins\document\persistentCache\PersistentCache.java:535: > Covariant return type change detected: java.nio.Buffer > java.nio.ByteBuffer.position(int) has been changed to java.nio.ByteBuffer > java.nio.ByteBuffer.position(int) > [INFO] > C:\projects\apache\oak\trunk\oak-segment-tar\src\main\java\org\apache\jackrabbit\oak\segment\data\SegmentDataV12.java:196: > Covariant return type change detected: java.nio.Buffer > java.nio.ByteBuffer.position(int) has been changed to java.nio.ByteBuffer > java.nio.ByteBuffer.position(int) > [INFO] > C:\projects\apache\oak\trunk\oak-segment-tar\src\main\java\org\apache\jackrabbit\oak\segment\data\SegmentDataV12.java:197: > Covariant return type change detected: java.nio.Buffer > java.nio.ByteBuffer.limit(int) has been changed to java.nio.ByteBuffer > java.nio.ByteBuffer.limit(int) > [INFO] > 
C:\projects\apache\oak\trunk\oak-segment-tar\src\main\java\org\apache\jackrabbit\oak\segment\data\SegmentDataUtils.java:57: > Covariant return type change detected: java.nio.Buffer > java.nio.ByteBuffer.position(int) has been changed to java.nio.ByteBuffer > java.nio.ByteBuffer.position(int) > [INFO] > C:\projects\apache\oak\trunk\oak-segment-tar\src\main\java\org\apache\jackrabbit\oak\segment\data\SegmentDataUtils.java:58: > Covariant return type change detected: java.nio.Buffer > java.nio.ByteBuffer.limit(int) has been changed to java.nio.ByteBuffer > java.nio.ByteBuffer.limit(int) > [INFO] > C:\projects\apache\oak\trunk\oak-segment-tar\src\main\java\org\apache\jackrabbit\oak\segment\file\tar\index\IndexWriter.java:110: > Covariant return type change detected: java.nio.Buffer > java.nio.ByteBuffer.position(int) has been changed to java.nio.ByteBuffer >
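One common way to avoid the covariant return type problem reported in OAK-7457 is to call flip(), position(int) and limit(int) through the java.nio.Buffer type, so that the call site binds to the Buffer methods that exist on all Java versions. The helper below is a hedged sketch: the class BufferCompat is hypothetical, and the issue text above does not say which fix was chosen for Oak.

```java
import java.nio.Buffer;
import java.nio.ByteBuffer;

/** Hypothetical helper: invoke Buffer methods via the Buffer type, not ByteBuffer. */
final class BufferCompat {
    /** Flips the buffer; the cast makes the compiler bind to Buffer.flip(). */
    static ByteBuffer flip(ByteBuffer b) {
        ((Buffer) b).flip();
        return b;
    }

    /** Sets the position via Buffer.position(int). */
    static ByteBuffer position(ByteBuffer b, int newPosition) {
        ((Buffer) b).position(newPosition);
        return b;
    }

    /** Sets the limit via Buffer.limit(int). */
    static ByteBuffer limit(ByteBuffer b, int newLimit) {
        ((Buffer) b).limit(newLimit);
        return b;
    }
}
```

The cast keeps the compiled method reference at java.nio.Buffer.flip(), which is present on Java 8 as well as on later versions.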
[jira] [Updated] (OAK-3866) Sorting on relative properties doesn't work in Solr
[ https://issues.apache.org/jira/browse/OAK-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-3866: -- Fix Version/s: (was: 1.10.0) > Sorting on relative properties doesn't work in Solr > --- > > Key: OAK-3866 > URL: https://issues.apache.org/jira/browse/OAK-3866 > Project: Jackrabbit Oak > Issue Type: Bug > Components: solr >Affects Versions: 1.0.22, 1.2.9, 1.3.13 >Reporter: Tommaso Teofili >Assignee: Tommaso Teofili >Priority: Major > Fix For: 1.12 > > > Executing a query like > {noformat} > /jcr:root/content/foo//*[(@sling:resourceType = 'x' or @sling:resourceType = > 'y') and jcr:contains(., 'bar*~')] order by jcr:content/@jcr:primaryType > descending > {noformat} > would assume sorting on the _jcr:primaryType_ property of resulting nodes' > _jcr:content_ children. > That is currently not supported in Solr, while it is in Lucene as the latter > supports index time aggregation. > We should inspect if it's possible to extend support for Solr too, most > probably via index time aggregation. > The query should not fail but at least log a warning about that limitation > for the time being. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-6836) OnRC report
[ https://issues.apache.org/jira/browse/OAK-6836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6836: -- Fix Version/s: (was: 1.10.0) > OnRC report > --- > > Key: OAK-6836 > URL: https://issues.apache.org/jira/browse/OAK-6836 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Valentin Olteanu >Priority: Minor > Labels: production > Fix For: 1.12 > > Attachments: gcreport.png > > > Currently, the information regarding an online revision cleanup execution is > scattered across multiple log lines and partially available in the attributes > of {{SegmentRevisionGarbageCollection}} MBean. > While useful for debugging, this is hard to grasp for users that need to > understand the full process to be able to read it. > The idea would be to create a "report" with all the details of an execution > and output it at the end - write to log, but also store it in the MBean, from > where it can be consumed by monitoring and health checks. > In the MBean, this would replace the _Last*_ attributes. > In the logs, this could replace all the intermediary logs (switch them to > DEBUG). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7256) Query: option to wait for indexes to be updated
[ https://issues.apache.org/jira/browse/OAK-7256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7256: -- Fix Version/s: (was: 1.10.0) > Query: option to wait for indexes to be updated > --- > > Key: OAK-7256 > URL: https://issues.apache.org/jira/browse/OAK-7256 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: query >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Minor > Fix For: 1.12 > > > Sometimes (rarely, but still) queries should include the very latest changes, > even if the index is updated asynchronously. For example when running a unit > test: data is added, and then a query is run to check if the data is there. > The problem with asynchronous indexes is, you don't know exactly how long you > have to wait. Often the index is updated quickly, and sometimes it takes a > few seconds. > What about extending the query syntax as follows: > Wait for all indexes (no matter which index is used for this query) - this > would be used rarely, just for testing: > {noformat} > /jcr:root/* > option(wait for all indexes timeout 60) > {noformat} > Wait for just those indexes (well, usually it's just one, but sometimes it's > multiple) that are needed for the given query. This query could also be used > in an application that strictly needs the very latest results, even for > fulltext queries. The "timeout 10" would mean "wait at most 10 seconds, and if > not up-to-date then throw an exception", while "max 10" would mean "wait at > most 10 seconds, but still run the query in any case". > {noformat} > /jcr:root/content//*[jcr:contains(., 'hello')] > option(wait for indexes max 10) > {noformat} > The query would wait, and once the indexes are up-to-date, return the > requested result. > So the syntax is (both SQL-2 and XPath): > {noformat} > option(wait for [all] indexes > { max | timeout } > [, ] ) > {noformat} > So other options can also be used (option traversal fail,...). 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
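The difference between the "max" and "timeout" variants proposed in OAK-7256 can be illustrated with a small polling sketch. Everything here is an assumption made for illustration: the IndexWait class, the Mode enum and the BooleanSupplier standing in for "the needed indexes are up-to-date" are hypothetical, and the exception type is chosen arbitrarily.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of "wait for indexes { max | timeout } n". */
final class IndexWait {
    enum Mode { MAX, TIMEOUT }

    /**
     * Polls upToDate until it returns true or waitMillis have elapsed.
     * Returns true if the indexes caught up in time. On expiry, MAX returns
     * false (the query still runs), while TIMEOUT throws.
     */
    static boolean await(BooleanSupplier upToDate, long waitMillis, Mode mode) {
        long deadline = System.nanoTime() + waitMillis * 1_000_000L;
        while (!upToDate.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                if (mode == Mode.TIMEOUT) {
                    throw new IllegalStateException("indexes not up-to-date in time");
                }
                return false;
            }
            try {
                Thread.sleep(5);          // short pause between checks
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return true;
    }
}
```

A caller would run the query after `await` returns, regardless of the returned value in MAX mode, and would let the exception propagate in TIMEOUT mode.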
[jira] [Updated] (OAK-7691) Remove deprecated ValueFactoryImpl methods
[ https://issues.apache.org/jira/browse/OAK-7691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7691: -- Fix Version/s: (was: 1.10.0) > Remove deprecated ValueFactoryImpl methods > -- > > Key: OAK-7691 > URL: https://issues.apache.org/jira/browse/OAK-7691 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: store-spi >Reporter: Marcel Reutegger >Assignee: Marcel Reutegger >Priority: Minor > Fix For: 1.12 > > > The deprecated static methods on ValueFactoryImpl are not used anymore and > can be removed. See also OAK-7688. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7684) fix maven warnings on oak-solr-osgi
[ https://issues.apache.org/jira/browse/OAK-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7684: -- Fix Version/s: (was: 1.10.0) > fix maven warnings on oak-solr-osgi > --- > > Key: OAK-7684 > URL: https://issues.apache.org/jira/browse/OAK-7684 > Project: Jackrabbit Oak > Issue Type: Task > Components: solr >Reporter: Julian Reschke >Assignee: Tommaso Teofili >Priority: Minor > Fix For: 1.12 > > > {noformat} > [WARNING] Bundle org.apache.jackrabbit:oak-solr-osgi:bundle:1.10-SNAPSHOT : > NoClassDefFoundError: com/vividsolutions/jts/io/InStream > [WARNING] Bundle org.apache.jackrabbit:oak-solr-osgi:bundle:1.10-SNAPSHOT : > NoClassDefFoundError: com/vividsolutions/jts/io/OutStream > [WARNING] Bundle org.apache.jackrabbit:oak-solr-osgi:bundle:1.10-SNAPSHOT : > NoClassDefFoundError: com/vividsolutions/jts/geom/CoordinateSequenceFilter > [WARNING] Bundle org.apache.jackrabbit:oak-solr-osgi:bundle:1.10-SNAPSHOT : > NoClassDefFoundError: com/vividsolutions/jts/geom/GeometryFilter > [WARNING] Bundle org.apache.jackrabbit:oak-solr-osgi:bundle:1.10-SNAPSHOT : > Unused Import-Package instructions: [com.sun.*] > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7300) Lucene Index: per-column selectivity to improve cost estimation
[ https://issues.apache.org/jira/browse/OAK-7300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7300: -- Fix Version/s: (was: 1.10.0) > Lucene Index: per-column selectivity to improve cost estimation > --- > > Key: OAK-7300 > URL: https://issues.apache.org/jira/browse/OAK-7300 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene, query >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.12 > > > In OAK-6735 we have improved cost estimation for Lucene indexes, however the > following case is still not working as expected: a very common property is > indexed (many nodes have that property), and each value of that property is > more or less unique. In this case, currently the cost estimation is the total > number of documents that contain that property. Assuming the condition > "property is not null" this is correct, however for the common case "property > = x" the estimated cost is far too high. > A known workaround is to set the "costPerEntry" for the given index to a low > value, for example 0.2. However this isn't a good solution, as it affects all > properties and queries. > It would be good to be able to set the selectivity per property, for example > by specifying the number of distinct values, or (better yet) the average > number of entries for a given key (1 for unique values, 2 meaning for each > distinct value there are two documents on average). > That value can be set manually (cost override), and it can be set > automatically, e.g. when building the index, or updated from time to time > during the index update, using a cardinality > estimation algorithm. That doesn't have to be accurate; we could use a rough > approximation such as HyperBitBit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
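The per-property selectivity idea from OAK-7300 can be sketched as follows: "property is not null" keeps the current estimate (all documents containing the property), while "property = x" uses the average number of entries per key. The class, the method names and the clamping rules are assumptions for illustration and are not taken from the Oak cost model.

```java
/** Hypothetical sketch of a per-property cost estimate. */
final class PropertyCostEstimate {
    /** "property is not null": every document that contains the property. */
    static long isNotNull(long docsWithProperty) {
        return docsWithProperty;
    }

    /**
     * "property = x": the average number of entries per key, clamped to the
     * range [1, docsWithProperty]; 0 if no document has the property.
     */
    static long equalsValue(long docsWithProperty, double avgEntriesPerKey) {
        if (docsWithProperty <= 0) {
            return 0;
        }
        long estimate = Math.round(avgEntriesPerKey);
        return Math.max(1, Math.min(docsWithProperty, estimate));
    }
}
```

For a property present on 1,000,000 documents with more or less unique values (average 1 entry per key), the "= x" estimate drops from 1,000,000 to 1 without touching "costPerEntry" for the whole index.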
[jira] [Updated] (OAK-5706) Function based indexes with "like" conditions
[ https://issues.apache.org/jira/browse/OAK-5706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5706: -- Fix Version/s: (was: 1.10.0) > Function based indexes with "like" conditions > - > > Key: OAK-5706 > URL: https://issues.apache.org/jira/browse/OAK-5706 > Project: Jackrabbit Oak > Issue Type: Bug > Components: indexing >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.12 > > > Currently, a function-based index is not used when using "like" conditions, > as follows: > {noformat} > /jcr:root//*[jcr:like(fn:lower-case(fn:name()), 'abc%')] > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-6408) Review package exports for o.a.j.oak.plugins.index.*
[ https://issues.apache.org/jira/browse/OAK-6408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6408: -- Fix Version/s: (was: 1.10.0) > Review package exports for o.a.j.oak.plugins.index.* > > > Key: OAK-6408 > URL: https://issues.apache.org/jira/browse/OAK-6408 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core, indexing >Reporter: angela >Priority: Major > Fix For: 1.12 > > > while working on OAK-6304 and OAK-6355, i noticed that the > _o.a.j.oak.plugins.index.*_ contains both internal api/utilities and > implementation details which get equally exported (though without having any > package export version set). > in the light of the modularization effort, i would like to suggest that we > try to sort that out and separate the _public_ parts from implementation > details. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-3141) Oak should warn when too many ordered child nodes
[ https://issues.apache.org/jira/browse/OAK-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-3141: -- Fix Version/s: (was: 1.10.0) > Oak should warn when too many ordered child nodes > - > > Key: OAK-3141 > URL: https://issues.apache.org/jira/browse/OAK-3141 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core >Affects Versions: 1.0.16 >Reporter: Jörg Hoh >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.12 > > > When working with the RDBMK we ran into situations where large documents did > not fit into the provided db columns; there was an overflow, which caused Oak > not to persist the change. We fixed it by increasing the size of the column. > But it would be nice if Oak could warn if a document exceeds a certain size > (for example 2 megabytes), because this warning indicates that on a JCR > level there might be a problematic situation, for example: > * ordered node with a large list of childnodes > * or longstanding sessions with lots of changes, which accumulate to large > documents. > It's certainly nice to know if there's a node/document with such a problem, > before the exception actually happens and an operation breaks. > This message should be a warning, and should contain the JCR path of the node > plus the current size. To avoid this message being overlooked, it would be > good if it were written every once in a while (every 10 minutes?) if this > condition persists. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
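The warning behaviour requested in OAK-3141 (warn above a size threshold, include the path and the current size, repeat only every 10 minutes while the condition persists) could look roughly like this. The class and its methods are hypothetical; the caller is assumed to pass in the current time and to do the actual logging, and the sketch is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: rate-limited warning for oversized documents. */
final class LargeDocumentWarner {
    private final long thresholdBytes;
    private final long repeatMillis;
    private final Map<String, Long> lastWarned = new HashMap<>();

    LargeDocumentWarner(long thresholdBytes, long repeatMillis) {
        this.thresholdBytes = thresholdBytes;
        this.repeatMillis = repeatMillis;
    }

    /** Returns the warning text to log, or null if nothing should be logged now. */
    String check(String path, long sizeBytes, long nowMillis) {
        if (sizeBytes <= thresholdBytes) {
            lastWarned.remove(path);      // condition cleared, forget the path
            return null;
        }
        Long last = lastWarned.get(path);
        if (last != null && nowMillis - last < repeatMillis) {
            return null;                  // warned recently, stay quiet
        }
        lastWarned.put(path, nowMillis);
        return "Document for " + path + " is " + sizeBytes
                + " bytes (threshold " + thresholdBytes + " bytes)";
    }
}
```

With a threshold of 2 megabytes and `repeatMillis` set to 600000, a persistently large document produces one warning every 10 minutes instead of one per write.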
[jira] [Updated] (OAK-5990) Add properties filtering support to OakEventFilter
[ https://issues.apache.org/jira/browse/OAK-5990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5990: -- Fix Version/s: (was: 1.10.0) > Add properties filtering support to OakEventFilter > -- > > Key: OAK-5990 > URL: https://issues.apache.org/jira/browse/OAK-5990 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: jcr >Affects Versions: 1.6.1 >Reporter: Stefan Egli >Priority: Major > Fix For: 1.12 > > > SLING-6164 introduced a _property name hint_ which, when set, limits > the observation events to only those that affect at least one of the > listed properties. The advantage is a further reduction of the > events sent out. This feature has not yet been implemented on the Oak side. > Thus we should add this to the OakEventFilter. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-4614) Collection of observation resilience issues
[ https://issues.apache.org/jira/browse/OAK-4614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-4614: -- Fix Version/s: (was: 1.10.0) > Collection of observation resilience issues > --- > > Key: OAK-4614 > URL: https://issues.apache.org/jira/browse/OAK-4614 > Project: Jackrabbit Oak > Issue Type: Epic > Components: core, documentmk, jcr >Reporter: Marcel Reutegger >Priority: Major > Fix For: 1.12 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-6225) Analyse changing the persistence format of GroupImpl
[ https://issues.apache.org/jira/browse/OAK-6225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6225: -- Fix Version/s: (was: 1.10.0) > Analyse changing the persistence format of GroupImpl > > > Key: OAK-6225 > URL: https://issues.apache.org/jira/browse/OAK-6225 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core >Reporter: Alex Deparvu >Assignee: Alex Deparvu >Priority: Major > Fix For: 1.12 > > Attachments: groupimpl-v0.patch, groupimpl-v1.patch, > groupimpl-v2.patch > > > As suggested on OAK-3933, I'd like to look into using a different persistence > format for the GroupImpl members. > Currently this is saved as a list of child nodes, and I'd like to bench this > against a tree based approach where each sub child node represents a part of > the key so it can be used for lookup. > WIP branch can be found at [0], I merged all the commits so far into a single > one to reduce the noise. > fyi [~anchela] > [0] > https://github.com/apache/jackrabbit-oak/compare/trunk...stillalex:oak-6225 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-5316) Rewrite JcrPathParser and JcrNameParser with good test coverage
[ https://issues.apache.org/jira/browse/OAK-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5316: -- Fix Version/s: (was: 1.10.0) > Rewrite JcrPathParser and JcrNameParser with good test coverage > --- > > Key: OAK-5316 > URL: https://issues.apache.org/jira/browse/OAK-5316 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core >Affects Versions: 1.5.15 >Reporter: Julian Sedding >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.12 > > > As discussed in OAK-5260 the implementation of the {{JcrPathParser}} and > possibly also the {{JcrNameParser}} are not ideal, i.e. there are potentially > many bugs hiding in edge-case scenarios. The parsers' test coverage is also > lacking, which is problematic as these code paths get executed very > frequently. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-5152) Improve overflow handling in ChangeSetFilterImpl
[ https://issues.apache.org/jira/browse/OAK-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5152: -- Fix Version/s: (was: 1.10.0) > Improve overflow handling in ChangeSetFilterImpl > > > Key: OAK-5152 > URL: https://issues.apache.org/jira/browse/OAK-5152 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core >Affects Versions: 1.5.14 >Reporter: Stefan Egli >Priority: Major > Fix For: 1.12 > > > As described in OAK-5151 when a ChangeSet overflows, the ChangeSetFilterImpl > treats the changes as included and doesn't go further into the remaining, > perhaps not-overflown other sets. Besides more testing it wouldn't be much > effort to change this though. Putting this as outside of 1.6 scope for now > though. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7192) Remove package export for org.apache.jackrabbit.oak.composite.checks
[ https://issues.apache.org/jira/browse/OAK-7192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7192: -- Fix Version/s: (was: 1.10.0) > Remove package export for org.apache.jackrabbit.oak.composite.checks > > > Key: OAK-7192 > URL: https://issues.apache.org/jira/browse/OAK-7192 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: composite >Reporter: Marcel Reutegger >Priority: Minor > Fix For: 1.12 > > > It appears the package {{org.apache.jackrabbit.oak.composite.checks}} is only > used internally by the oak-store-composite module and should not be exported. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-3373) Observers dont survive store restart (was: LuceneIndexProvider: java.lang.IllegalStateException: open)
[ https://issues.apache.org/jira/browse/OAK-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-3373: -- Fix Version/s: (was: 1.10.0) > Observers dont survive store restart (was: LuceneIndexProvider: > java.lang.IllegalStateException: open) > -- > > Key: OAK-3373 > URL: https://issues.apache.org/jira/browse/OAK-3373 > Project: Jackrabbit Oak > Issue Type: Bug > Components: core >Affects Versions: 1.3.5 >Reporter: Stefan Egli >Priority: Major > Fix For: 1.12 > > > The following exception occurs when stopping, then immediately re-starting > the oak-core bundle (which was done as part of testing for OAK-3250 - but can > be reproduced independently). It's not clear what the consequences are > though.. > {code}08.09.2015 14:20:26.960 *ERROR* [oak-lucene-0] > org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProvider Uncaught > exception in > org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProvider@3a4a6c5c > org.apache.jackrabbit.oak.plugins.document.DocumentStoreException: Error > occurred while fetching children for path /oak:index/authorizables > at > org.apache.jackrabbit.oak.plugins.document.DocumentStoreException.convert(DocumentStoreException.java:48) > at > org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.getChildren(DocumentNodeStore.java:902) > at > org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.getChildNodes(DocumentNodeStore.java:1082) > at > org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.getChildNodeEntries(DocumentNodeState.java:508) > at > org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.access$100(DocumentNodeState.java:65) > at > org.apache.jackrabbit.oak.plugins.document.DocumentNodeState$ChildNodeEntryIterator.fetchMore(DocumentNodeState.java:716) > at > org.apache.jackrabbit.oak.plugins.document.DocumentNodeState$ChildNodeEntryIterator.(DocumentNodeState.java:681) > at > 
org.apache.jackrabbit.oak.plugins.document.DocumentNodeState$1.iterator(DocumentNodeState.java:289) > at > org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:129) > at > org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:303) > at > org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.compareAgainstBaseState(DocumentNodeState.java:359) > at > org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148) > at > org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:140) > at > org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:303) > at > org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.compareAgainstBaseState(DocumentNodeState.java:359) > at > org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148) > at > org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:140) > at > org.apache.jackrabbit.oak.spi.state.AbstractNodeState.compareAgainstBaseState(AbstractNodeState.java:303) > at > org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.compareAgainstBaseState(DocumentNodeState.java:359) > at > org.apache.jackrabbit.oak.spi.commit.EditorDiff.process(EditorDiff.java:52) > at > org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.update(IndexTracker.java:108) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProvider.contentChanged(LuceneIndexProvider.java:73) > at > org.apache.jackrabbit.oak.spi.commit.BackgroundObserver$1$1.call(BackgroundObserver.java:127) > at > org.apache.jackrabbit.oak.spi.commit.BackgroundObserver$1$1.call(BackgroundObserver.java:121) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.IllegalStateException: open > at org.bson.util.Assertions.isTrue(Assertions.java:36) > at > com.mongodb.DBTCPConnector.isMongosConnection(DBTCPConnector.java:367) > at com.mongodb.Mongo.isMongosConnection(Mongo.java:622) > at com.mongodb.DBCursor._check(DBCursor.java:494) > at com.mongodb.DBCursor._hasNext(DBCursor.java:621) > at
[jira] [Updated] (OAK-3380) Property index pruning should happen asynchronously
[ https://issues.apache.org/jira/browse/OAK-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-3380: -- Fix Version/s: (was: 1.10.0) > Property index pruning should happen asynchronously > --- > > Key: OAK-3380 > URL: https://issues.apache.org/jira/browse/OAK-3380 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: property-index >Affects Versions: 1.3.5 >Reporter: Vikas Saurabh >Priority: Minor > Labels: resilience > Fix For: 1.12 > > > Following up on this (relatively old) thread \[1], we should do pruning of > property index structure asynchronously. The thread was never concluded.. > here are a couple of ideas picked from the thread: > * Move pruning to an async thread > * Throttle pruning i.e. prune only once in a while > ** I'm not sure how that would work though -- an unpruned part would remain > as is until another index happens on that path. > Once we can move pruning to some async thread (reducing concurrent updates), > OAK-2673 + OAK-2929 can take care of add-add conflicts. > > h6. Why is this an issue despite merge retries taking care of it? > A couple of cases which have concurrent updates hitting merge conflicts in > our product (Adobe AEM): > * Some indexes are very volatile (in the sense that the indexed property switches > its values very quickly) e.g. sling job status, AEM workflow status. > * Multiple threads take care of jobs. Although sling maintains a bucketed > structure for job storage to reduce conflicts... but inside index tree the > bucket structure, at times, gets pruned and needs to be created in the next > job status change > While retries do take care of these conflicts a lot of the time and even when > they don't, AEM workflows have their own retry to work around them. But, retrying, > IMHO, is just a waste of time -- more importantly in paths where the application > doesn't really have control. > h6. Would this add to cost of traversing index structure? 
> Yes, there'd be some left over paths in index structure between asynchronous > prunes. But, I think the cost of such wasted traversals would be covered up > with time saved in avoiding the concurrent update conflict. > > (cc [~tmueller], [~mreutegg], [~alex.parvulescu], [~chetanm]) > \[1]: > http://mail-archives.apache.org/mod_mbox/jackrabbit-oak-dev/201506.mbox/%3ccadichf66u2vh-hlrjunansytxfidj2mt3vktr4ybkngpzy9...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-5506) reject item names with unpaired surrogates early
[ https://issues.apache.org/jira/browse/OAK-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5506: -- Fix Version/s: (was: 1.10.0) > reject item names with unpaired surrogates early > > > Key: OAK-5506 > URL: https://issues.apache.org/jira/browse/OAK-5506 > Project: Jackrabbit Oak > Issue Type: Wish > Components: core, jcr, segment-tar >Affects Versions: 1.5.18 >Reporter: Julian Reschke >Priority: Minor > Labels: resilience > Fix For: 1.12 > > Attachments: OAK-5506-01.patch, OAK-5506-02.patch, OAK-5506-4.diff, > OAK-5506-bench.diff, OAK-5506-jcr-level.diff, OAK-5506-name-conversion.diff, > OAK-5506-segment.diff, OAK-5506-segment2.diff, OAK-5506-segment3.diff, > OAK-5506.diff, ValidNamesTest.java > > > Apparently, the following node name is accepted: >{{"foo\ud800"}} > but a subsequent {{getPath()}} call fails: > {noformat} > javax.jcr.InvalidItemStateException: This item [/test_node/foo?] does not > exist anymore > at > org.apache.jackrabbit.oak.jcr.delegate.ItemDelegate.checkAlive(ItemDelegate.java:86) > at > org.apache.jackrabbit.oak.jcr.session.operation.ItemOperation.checkPreconditions(ItemOperation.java:34) > at > org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.prePerform(SessionDelegate.java:615) > at > org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:205) > at > org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:112) > at > org.apache.jackrabbit.oak.jcr.session.ItemImpl.getPath(ItemImpl.java:140) > at > org.apache.jackrabbit.oak.jcr.session.NodeImpl.getPath(NodeImpl.java:106) > at > org.apache.jackrabbit.oak.jcr.ValidNamesTest.nameTest(ValidNamesTest.java:271) > at > org.apache.jackrabbit.oak.jcr.ValidNamesTest.testUnpairedSurrogate(ValidNamesTest.java:259) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source){noformat} > (test case follows) -- This message was sent by 
Atlassian JIRA (v7.6.3#76005)
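The early rejection asked for in OAK-5506 needs a check for unpaired surrogates in a name. A minimal sketch of such a check, using only java.lang.Character, is shown below. The class and method names are hypothetical, and where in Oak such a check would be called is not covered here.

```java
/** Hypothetical sketch: detect unpaired UTF-16 surrogates in an item name. */
final class SurrogateCheck {
    static boolean hasUnpairedSurrogate(String name) {
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (Character.isHighSurrogate(c)) {
                if (i + 1 < name.length()
                        && Character.isLowSurrogate(name.charAt(i + 1))) {
                    i++;                  // valid pair, skip the low surrogate
                } else {
                    return true;          // high surrogate without a partner
                }
            } else if (Character.isLowSurrogate(c)) {
                return true;              // low surrogate without a preceding high one
            }
        }
        return false;
    }
}
```

A name such as "foo\ud800" from the report would be flagged by this check, so it could be rejected when the node is added instead of failing later in getPath().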
[jira] [Updated] (OAK-7374) Investigate changing the UUID generation algorithm / format to reduce index size, improve speed
[ https://issues.apache.org/jira/browse/OAK-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7374: -- Fix Version/s: (was: 1.10.0) > Investigate changing the UUID generation algorithm / format to reduce index > size, improve speed > --- > > Key: OAK-7374 > URL: https://issues.apache.org/jira/browse/OAK-7374 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.12 > > > UUIDs are currently randomly generated, which is bad for indexing, especially > for read and write access, due to low locality. > If we could add a time component, I think the index churn (amount of writes) > would shrink, and lookup would be faster. > It should be fairly easy to verify if that's really true (create a > proof-of-concept, and measure). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
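One possible shape of a proof-of-concept for OAK-7374 is a UUID whose leading bits carry a timestamp, so that values created close together in time sort close together. The layout below (48 bits of epoch milliseconds followed by 80 random bits) is purely an assumption for experimentation. It is not the format proposed in the issue, it does not set the RFC 4122 version or variant bits, and the class name is hypothetical.

```java
import java.security.SecureRandom;
import java.util.UUID;

/** Hypothetical sketch: UUID with a time prefix for better index locality. */
final class TimePrefixedUuid {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Top 48 bits: epoch millis; remaining 80 bits: random. */
    static UUID create(long epochMillis) {
        long msb = (epochMillis << 16) | (RANDOM.nextLong() & 0xFFFFL);
        long lsb = RANDOM.nextLong();
        return new UUID(msb, lsb);
    }
}
```

Calling `TimePrefixedUuid.create(System.currentTimeMillis())` yields identifiers whose string form starts with the timestamp in hexadecimal, which is what a measurement of index churn against purely random UUIDs would compare.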
[jira] [Updated] (OAK-6774) Convert oak-upgrade to OSGi R6 annotations
[ https://issues.apache.org/jira/browse/OAK-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6774: -- Fix Version/s: (was: 1.10.0) > Convert oak-upgrade to OSGi R6 annotations > -- > > Key: OAK-6774 > URL: https://issues.apache.org/jira/browse/OAK-6774 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: upgrade >Reporter: Robert Munteanu >Priority: Major > Fix For: 1.12 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7151) Support indexed based excerpts on properties
[ https://issues.apache.org/jira/browse/OAK-7151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7151: -- Fix Version/s: (was: 1.10.0) > Support indexed based excerpts on properties > > > Key: OAK-7151 > URL: https://issues.apache.org/jira/browse/OAK-7151 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene >Reporter: Vikas Saurabh >Assignee: Vikas Saurabh >Priority: Major > Fix For: 1.12 > > Attachments: OAK-7151.patch, OAK-7151.xpath-new-syntax.patch, > OAK-7151.xpath.patch > > > As discovered in OAK-4401 we fall back to {{SimpleExcerptProvider}} when > requesting excerpts for properties. > The issue, as highlighted in [~teofili]'s comment \[0], is that at the time of > the query we don't have information about which columns/fields would be > required for excerpts. > A possible approach is that the query specifies explicitly which columns > would be required in facets (of course, node level excerpt would still be > supported). This issue is to track that improvement. > Note: this is *not* a substitute for OAK-4401, which is about doing saner > highlighting when {{SimpleExcerptProvider}} comes into play, e.g. even with this > issue resolved, for excerpts on non-stored fields (properties which aren't configured with > {{useInExcerpt}} in the index definition), we'd need to fall back to > {{SimpleExcerptProvider}}. > /[~tmueller] > \[0]: > https://issues.apache.org/jira/browse/OAK-4401?focusedCommentId=15299857&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15299857 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-4643) Support multiple readers in suggester, spellcheck and faceted search
[ https://issues.apache.org/jira/browse/OAK-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-4643: -- Fix Version/s: (was: 1.10.0) > Support multiple readers in suggester, spellcheck and faceted search > > > Key: OAK-4643 > URL: https://issues.apache.org/jira/browse/OAK-4643 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene >Reporter: Chetan Mehrotra >Priority: Major > Fix For: 1.12 > > > As part of OAK-4566 normal search has been modified to support multiple > readers. However for suggester, spellcheck and faceted search the logic is > still working with the assumption of single reader. > Those parts need to be adapted to support multiple readers -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-5367) Strange path parsing
[ https://issues.apache.org/jira/browse/OAK-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5367: -- Fix Version/s: (was: 1.10.0) > Strange path parsing > > > Key: OAK-5367 > URL: https://issues.apache.org/jira/browse/OAK-5367 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.12 > > Attachments: JcrPathParserTest.java > > > Incorrect handling of path with "\{" was fixed in OAK-5260, but the behavior > of the JcrPathParser is still strange. For example: > * the root node, "/", is mapped to "/", and the current node, "." is mapped > to "". But "/." is mapped to the current node (should be root node). > * "/parent/./childA2" is mapped to "/parent/childA2" (which is fine), but > "/parent/.}/childA2" is also mapped to "/parent/childA2". > * "\}\{" and "}\[" and "}}[" are mapped to the current node. So are ".[" and > "/[" and ".}". And "}\{test" is mapped to "}\{test", which is > inconsistent-weird. > * "x\[1\]}" is mapped to "x". > All that weirdness should be resolved. Some seem to be just weird, but some > look like they could become a problem at some point ("}\{"). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7505) Investigate merging VersionablePathHook into existing version editors
[ https://issues.apache.org/jira/browse/OAK-7505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7505: -- Fix Version/s: (was: 1.10.0) > Investigate merging VersionablePathHook into existing version editors > - > > Key: OAK-7505 > URL: https://issues.apache.org/jira/browse/OAK-7505 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core, security >Reporter: Alex Deparvu >Priority: Major > Fix For: 1.12 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7941) Test failure: IndexCopierTest.directoryContentMismatch_COR
[ https://issues.apache.org/jira/browse/OAK-7941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7941: -- Fix Version/s: (was: 1.10.0) > Test failure: IndexCopierTest.directoryContentMismatch_COR > -- > > Key: OAK-7941 > URL: https://issues.apache.org/jira/browse/OAK-7941 > Project: Jackrabbit Oak > Issue Type: Bug > Components: continuous integration, lucene >Reporter: Hudson >Priority: Major > Fix For: 1.12 > > > No description is provided > The build Jackrabbit Oak #1830 has failed. > First failed run: [Jackrabbit Oak > #1830|https://builds.apache.org/job/Jackrabbit%20Oak/1830/] [console > log|https://builds.apache.org/job/Jackrabbit%20Oak/1830/console] > {noformat} > [ERROR] Tests run: 24, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: > 0.832 s <<< FAILURE! - in > org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopierTest > [ERROR] > directoryContentMismatch_COR(org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopierTest) > Time elapsed: 0.08 s <<< FAILURE! > java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopierTest.readAndAssert(IndexCopierTest.java:1119) > at > org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopierTest.directoryContentMismatch_COR(IndexCopierTest.java:1081) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-2976) Oak percolator
[ https://issues.apache.org/jira/browse/OAK-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-2976: -- Fix Version/s: (was: 1.10.0) > Oak percolator > -- > > Key: OAK-2976 > URL: https://issues.apache.org/jira/browse/OAK-2976 > Project: Jackrabbit Oak > Issue Type: Task > Components: query >Reporter: Tommaso Teofili >Assignee: Tommaso Teofili >Priority: Major > Fix For: 1.12 > > > Inspired by [Elasticsearch > percolator|https://www.elastic.co/guide/en/elasticsearch/reference/current/search-percolate.html] > we may implement an Oak percolator that would basically store queries and > perform specific tasks upon indexing of documents matching those queries. > The reasons for possibly having that are that such a mechanism could be used > to run common but slow queries automatically whenever batches of matching > documents get indexed, to eventually warm up the underlying indexes caches. > Also such a percolator could be used as a notification mechanism (alerting, > monitoring). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-5588) Improve Session stats.
[ https://issues.apache.org/jira/browse/OAK-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5588: -- Fix Version/s: (was: 1.10.0) > Improve Session stats. > -- > > Key: OAK-5588 > URL: https://issues.apache.org/jira/browse/OAK-5588 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core >Reporter: Ian Boston >Priority: Major > Labels: monitoring, production > Fix For: 1.12 > > > Currently each session has a SessionsStats MBean. Amongst other things it > records the total number of refresh operations. It also records the rate of > refresh operations, although this number in its current form is not useful > as the rate is the number of refresh operations/session lifetime. It would > be much better to have a set of stats related to classes of users that > recorded proper metrics in a consistent way, e.g. 1 metric set per > service-user, 1 for the admin user and perhaps 1 for all normal users. Each > would record m1,m5,m15 rates, total count, p50,p75,p95,p99,p999 durations > with mean and stdev; then 2 sets of metrics could be compared and monitored > without having to look at the code to work out how the metric was calculated. > Oak has metrics support to do this, so minimal code would be required. > I don't think it would be viable to have 1 metric per unique session (too much > overhead, too much data, good for devs but bad for production), and in fact > having 1 JMX MBean per unique session is likely to cause problems with > everything connected to JMX even if the ManagementServer can cope. Same goes for > the other proliferation of MBeans in Oak 1.6. Perhaps a review of JMX in > Oak is due. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-2727) NodeStateSolrServersObserver should be filtering path selectively
[ https://issues.apache.org/jira/browse/OAK-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-2727: -- Fix Version/s: (was: 1.10.0) > NodeStateSolrServersObserver should be filtering path selectively > - > > Key: OAK-2727 > URL: https://issues.apache.org/jira/browse/OAK-2727 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: solr >Affects Versions: 1.1.8 >Reporter: Tommaso Teofili >Assignee: Tommaso Teofili >Priority: Major > Labels: performance > Fix For: 1.12 > > > As discussed in OAK-2718 it'd be good to be able to selectively find Solr > indexes by path, as done in Lucene index, see also OAK-2570. > This would avoid having to do full diffs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-4581) Persistent local journal for more reliable event generation
[ https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-4581: -- Fix Version/s: (was: 1.10.0) > Persistent local journal for more reliable event generation > --- > > Key: OAK-4581 > URL: https://issues.apache.org/jira/browse/OAK-4581 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core >Reporter: Chetan Mehrotra >Priority: Major > Labels: observation > Fix For: 1.12 > > Attachments: OAK-4581.v0.patch > > > As discussed in OAK-2683 "hitting the observation queue limit" has multiple > drawbacks. Quite a bit of work is done to make diff generation faster. > However there are still chances of the event queue getting filled up. > This issue is meant to implement a persistent event journal. The idea here being > # NodeStore would push the diff into a persistent store via a synchronous > observer > # Observers which are meant to handle such events in an async way (by virtue of > being wrapped in BackgroundObserver) would instead pull the events from this > persisted journal > h3. A - What is persisted > h4. 1 - Serialized Root States and CommitInfo > In this approach we just persist the root states in serialized form. > * DocumentNodeStore - This means storing the root revision vector > * SegmentNodeStore - {color:red}Q1 - What does the serialized form of > SegmentNodeStore root state look like{color} - Possibly the RecordId of > "root" state > Note that with OAK-4528 DocumentNodeStore can rely on the persisted remote > journal to determine the affected paths, which reduces the need for > persisting the complete diff locally. > Event generation logic would then "deserialize" the persisted root states and > then generate the diff as currently done via NodeState comparison > h4. 2 - Serialized commit diff and CommitInfo > In this approach we can save the diff in JSOP form. The diff only contains > information about affected paths. Similar to what is currently being stored in > DocumentNodeStore journal > h4. CommitInfo > The commit info would also need to be serialized. So it needs to be ensured that > whatever is stored there can be serialized or recalculated > h3. B - How it is persisted > h4. 1 - Use a secondary segment NodeStore > OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. > [~mreutegg] suggested that for the persisted local journal we can also utilize a > SegmentNodeStore instance. Care needs to be taken for compaction. Either via > generation approach or relying on online compaction > h4. 2- Make use of write ahead log implementations > [~ianeboston] suggested that we can make use of some write ahead log > implementation like [1], [2] or [3] > h3. C - How changes get pulled > Some points to consider for event generation logic > # Would need a way to keep pointers to journal entry on per listener basis. > This would allow each Listener to "pull" content changes and generate diff as > per its speed and keeping in memory overhead low > # The journal should survive restarts > [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html > [2] > https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal > [3] > https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-5950) XPath: stack overflow for large combination of "or" and "and"
[ https://issues.apache.org/jira/browse/OAK-5950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5950: -- Fix Version/s: (was: 1.10.0) > XPath: stack overflow for large combination of "or" and "and" > - > > Key: OAK-5950 > URL: https://issues.apache.org/jira/browse/OAK-5950 > Project: Jackrabbit Oak > Issue Type: Bug > Components: query >Reporter: Thomas Mueller >Priority: Critical > Fix For: 1.12 > > > The following query returns in a stack overflow: > {noformat} > xpath2sql /jcr:root/home//element(*,rep:Authorizable)[(@a1=1 or @a2=1 or > @a3=1 or @a4=1 or @a5=1 or @a6=1 or @a7=1 or @a8=1) > and (@b1=1 or @b2=1 or @b3=1 or @b4=1 or @b5=1 or @b6=1 or @b7=1 or @b8=1) > and (@c1=1 or @c2=1 or @c3=1 or @c4=1 or @c5=1 or @c6=1 or @c7=1 or @c8=1) > and (@d1=1 or @d2=1 or @d3=1 or @d4=1 or @d5=1 or @d6=1 or @d7=1 or @d8=1)] > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7193) DataStore: API to retrieve statistic (file headers, size estimation)
[ https://issues.apache.org/jira/browse/OAK-7193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7193: -- Fix Version/s: (was: 1.10.0) > DataStore: API to retrieve statistic (file headers, size estimation) > > > Key: OAK-7193 > URL: https://issues.apache.org/jira/browse/OAK-7193 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: blob >Reporter: Thomas Mueller >Priority: Major > Fix For: 1.12 > > > Extension of OAK-6254: in addition to retrieving the size, it would be good > to retrieve the estimated number and total size per file type. A simple (and > in my view sufficient) solution is to use the first few bytes ("magic > numbers", 2 bytes should be enough) to get the file type. That would allow to > estimate, for example, the number of, and total size, of PDF files, JPEG, > Lucene index and so on. A histogram would be nice as well, but I think is not > needed. > To speed up calculation, the blob ID could be extended with the first 2 bytes > of the file content, that is: #@ where magic is the > first two bytes, in hex. That would allow to quickly get the data from the > blob ids (no need to actually read content). > Sampling should be enough. The longer it takes, the more accurate the data. > We could store the data while doing datastore GC, in which case the returned > data would be somewhat stale; that's OK. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
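To make the "magic numbers" suggestion in OAK-7193 concrete, here is a small sketch that derives a hex key from the first two bytes of a blob and accumulates a count and total size per key. The class and method names are hypothetical and do not exist in Oak's DataStore API; the sketch assumes the caller already has the leading bytes and the length of each sampled blob.

```java
import java.util.Map;
import java.util.TreeMap;

final class BlobTypeStats {

    // magic (first two bytes, in hex) -> { number of blobs, total size in bytes }
    private final Map<String, long[]> stats = new TreeMap<>();

    /** Returns the first (up to) two bytes of the content as lower-case hex. */
    static String magicHex(byte[] head) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < Math.min(2, head.length); i++) {
            int b = head[i] & 0xFF;
            sb.append(Character.forDigit(b >> 4, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }

    /** Records one sampled blob, given its leading bytes and its length. */
    void record(byte[] head, long length) {
        long[] entry = stats.computeIfAbsent(magicHex(head), k -> new long[2]);
        entry[0]++;
        entry[1] += length;
    }

    long count(String magic) {
        long[] entry = stats.get(magic);
        return entry == null ? 0 : entry[0];
    }

    long totalSize(String magic) {
        long[] entry = stats.get(magic);
        return entry == null ? 0 : entry[1];
    }
}
```

For example, content starting with the ASCII characters "%P" (as PDF files do) yields the key 2550, and content starting with the bytes FF D8 (as JPEG files do) yields ffd8. Two bytes cannot distinguish every format, which matches the issue's framing of the result as an estimate.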
[jira] [Updated] (OAK-5159) Killing the process may stop async index update to to 30 minutes, for DocumentStore (MongoDB, RDB)
[ https://issues.apache.org/jira/browse/OAK-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5159: -- Fix Version/s: (was: 1.10.0) > Killing the process may stop async index update to to 30 minutes, for > DocumentStore (MongoDB, RDB) > -- > > Key: OAK-5159 > URL: https://issues.apache.org/jira/browse/OAK-5159 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: indexing >Reporter: Thomas Mueller >Priority: Major > Labels: resilience > Fix For: 1.12 > > > Same as OAK-2108, when using a DocumentStore based repository (MongoDB, > RDBMK). This is also a problem in the single-cluster-node case, not just when > using multiple cluster nodes. > When killing a node that is running the async index update, this async > index update will not run for up to 15 minutes, because the lease time is set > to 15 minutes. > We could probably use Oak / Sling Discovery to improve the situation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-6412) Consider upgrading to newer Lucene versions
[ https://issues.apache.org/jira/browse/OAK-6412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6412: -- Fix Version/s: (was: 1.10.0) > Consider upgrading to newer Lucene versions > --- > > Key: OAK-6412 > URL: https://issues.apache.org/jira/browse/OAK-6412 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene >Reporter: Tommaso Teofili >Assignee: Tommaso Teofili >Priority: Major > Fix For: 1.12 > > > A year ago I started prototyping the upgrade to Lucene 5 [1]; in the > meantime version 6 (and 7 soon) has come out. > I think it'd be very nice to upgrade the Lucene version to the latest, as this would > give us improvements in space consumption and runtime performance. > In case we want to upgrade to 6.0 or later we need to consider upgrade > scenarios because Lucene Codecs are backward compatible with the previous > major release, so Lucene 6 can read Lucene 5 but not Lucene 4.x (4.7 in our > case); therefore we would need to detect that when reading an index and > trigger reindexing using the new format. > Related to that there's also a patch to upgrade Solr index to version 5 (see > OAK-4318). > [1] : https://github.com/tteofili/jackrabbit-oak/tree/lucene5 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-5860) Compressed segments
[ https://issues.apache.org/jira/browse/OAK-5860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5860: -- Fix Version/s: (was: 1.10.0) > Compressed segments > --- > > Key: OAK-5860 > URL: https://issues.apache.org/jira/browse/OAK-5860 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: segment-tar >Reporter: Michael Dürig >Assignee: Andrei Dulceanu >Priority: Major > Labels: scalability > Fix For: 1.12 > > > It would be interesting to see the effect of compressing the segments within > the tar files with a sufficiently effective and performant compression > algorithm: > * Can we increase overall throughput by trading CPU for IO? > * Can we scale to bigger repositories (in number of nodes) by squeezing in > more segments per MB and thus pushing out onset of thrashing? > * What would be a good compression algorithm/library? > * Can/should we make this optional? > * Migration and compatibility issues? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-6769) Convert oak-search-mt to OSGi R6 annotations
[ https://issues.apache.org/jira/browse/OAK-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6769: -- Fix Version/s: (was: 1.10.0) > Convert oak-search-mt to OSGi R6 annotations > > > Key: OAK-6769 > URL: https://issues.apache.org/jira/browse/OAK-6769 > Project: Jackrabbit Oak > Issue Type: Technical task >Reporter: Robert Munteanu >Priority: Major > Fix For: 1.12 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7504) Include dynamic commit information in the persisted repository data
[ https://issues.apache.org/jira/browse/OAK-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7504: -- Fix Version/s: (was: 1.10.0) > Include dynamic commit information in the persisted repository data > --- > > Key: OAK-7504 > URL: https://issues.apache.org/jira/browse/OAK-7504 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Francesco Mari >Priority: Minor > Fix For: 1.12 > > > The data in the Segment Store doesn't provide any information about the > dynamic behaviour of the system. For example, who performed the commit? How > many commits were performed from the same committer? > In order to simplify debugging the dynamic behaviour of a system, it should > be possible to store metadata about the commit in the super-root generated by > that commit. For example, the following information might be attached to the > super-root: > * The name of the thread performing the commit. This solution might prove > expensive in terms of consumed disk space, but would be the most precise tool > to identify the author of a commit. > * A hash of the thread name. If storing thread names proves expensive, a hash > of the thread name can be stored instead. This doesn't allow us to exactly > identify the author of the commit, but would allow us to correlate different > commits as performed by the same thread. > * Both the thread name and its hash, with the thread name stored only every > Nth commit. This solution is not as precise as storing the thread name for > every commit but, if there is a frequent committer, its thread name will be > more likely to be sampled, thus providing a precise identity to a thread name > hash. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
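The third option listed in OAK-7504 (always store a hash, store the full thread name only on every Nth commit) can be sketched as below. The class, method and key names are hypothetical, and `String.hashCode()` is used only as a stand-in hash for the sake of a short example; how such metadata would actually be attached to the super-root is not covered here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CommitMetadata {

    private CommitMetadata() {
    }

    /**
     * Builds the metadata for one commit: the thread-name hash is always
     * present, the thread name itself only when commitNumber is a multiple
     * of sampleEvery. Hypothetical helper for illustration only.
     */
    static Map<String, String> forCommit(String threadName, long commitNumber, int sampleEvery) {
        Map<String, String> meta = new LinkedHashMap<>();
        meta.put("threadHash", Integer.toHexString(threadName.hashCode()));
        if (sampleEvery > 0 && commitNumber % sampleEvery == 0) {
            meta.put("threadName", threadName);
        }
        return meta;
    }
}
```

Commits from the same thread share the same "threadHash" value, so they can be correlated even when the sampled "threadName" entry is absent.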
[jira] [Updated] (OAK-7671) [oak-run] Deprecate the datastorecheck command in favor of datastore
[ https://issues.apache.org/jira/browse/OAK-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7671: -- Fix Version/s: (was: 1.10.0) > [oak-run] Deprecate the datastorecheck command in favor of datastore > > > Key: OAK-7671 > URL: https://issues.apache.org/jira/browse/OAK-7671 > Project: Jackrabbit Oak > Issue Type: Task > Components: run >Reporter: Amit Jain >Assignee: Amit Jain >Priority: Major > Fix For: 1.12 > > > With the introduction of \{{datastore}} command which supports both garbage > collection as well as consistency check the \{{datastorecheck}} command > should be deprecated and delegated internally to use that implementation. > Besides some options which are currently not supported by the new command > should also be implemented e.g. --ids, --refs -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-1905) SegmentMK: Arch segment(s)
[ https://issues.apache.org/jira/browse/OAK-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-1905: -- Fix Version/s: (was: 1.10.0) > SegmentMK: Arch segment(s) > -- > > Key: OAK-1905 > URL: https://issues.apache.org/jira/browse/OAK-1905 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Jukka Zitting >Priority: Minor > Labels: perfomance, scalability > Fix For: 1.12 > > > There are a lot of constants and other commonly occurring name, values and > other data in a typical repository. To optimize storage space and access > speed, it would be useful to place such data in one or more constant "arch > segments" that are always cached in memory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7660) Refactor AzureCompact and Compact
[ https://issues.apache.org/jira/browse/OAK-7660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7660: -- Fix Version/s: (was: 1.10.0) > Refactor AzureCompact and Compact > - > > Key: OAK-7660 > URL: https://issues.apache.org/jira/browse/OAK-7660 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Andrei Dulceanu >Assignee: Andrei Dulceanu >Priority: Major > Labels: tech-debt, technical_debt, tooling > Fix For: 1.12 > > > {{AzureCompact}} in {{oak-segment-azure}} follows closely the structure and > logic of {{Compact}} in {{oak-segment-tar}}. Since the only thing which > differs is the underlying persistence used (remote in Azure vs. local in TAR > files), the common logic should be extracted in a super-class, extended by > both. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-4934) Query shapes for JCR Query
[ https://issues.apache.org/jira/browse/OAK-4934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-4934: -- Fix Version/s: (was: 1.10.0) > Query shapes for JCR Query > -- > > Key: OAK-4934 > URL: https://issues.apache.org/jira/browse/OAK-4934 > Project: Jackrabbit Oak > Issue Type: Wish > Components: query >Reporter: Chetan Mehrotra >Priority: Major > Fix For: 1.12 > > > For certain requirements it would be good to have a notion/support to deduce > query shape [1] > {quote} > A combination of query predicate, sort, and projection specifications. > For the query predicate, only the structure of the predicate, including the > field names, are significant; the values in the query predicate are > insignificant. As such, a query predicate \{ type: 'food' \} is equivalent to > the query predicate \{ type: 'utensil' \} for a query shape. > {quote} > So transforming that to Oak, the shape should represent a JCR-SQL2 query > string (xpath query gets transformed to SQL2) which is a *canonical* > representation of the actual query, ignoring the property restriction values. > For example, we have 2 queries > * SELECT * FROM [app:Asset] AS a WHERE a.[jcr:content/metadata/status] = > 'published' > * SELECT * FROM [app:Asset] AS a WHERE a.[jcr:content/metadata/status] = > 'disabled' > The query shape would be > SELECT * FROM [app:Asset] AS a WHERE a.[jcr:content/metadata/status] = 'A'. > The plan for a query having a given shape would remain the same irrespective of the value > of property restrictions. Path restriction can cause some difference though > The shape can then be used for > * Stats Collection - Currently stats collection overflows if the same query > with different values gets invoked > * Allow configuring hints - See support in Mongo [2] for an example. One can > specify via config that for a query of such and such shape this index should > be used > * Less noisy diagnostics - If a query gets invoked with a bad plan the QE can > log the warning once instead of logging it for each query invocation > involving different values. > [1] https://docs.mongodb.com/manual/reference/glossary/#term-query-shape > [2] https://docs.mongodb.com/manual/reference/command/planCacheSetFilter/ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
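A very rough sketch of how a query shape as described in OAK-4934 might be derived from a JCR-SQL2 string: every single-quoted string literal is replaced by the placeholder 'A'. The class name is hypothetical, and the regex approach is an assumption made for brevity; a real implementation would more likely work on the parsed query, and this sketch does not normalise numeric literals, bind variables or path restrictions.

```java
final class QueryShape {

    private QueryShape() {
    }

    /**
     * Replaces each single-quoted string literal (with '' as the escaped
     * quote) by the placeholder 'A'. Hypothetical helper, not Oak code.
     */
    static String of(String sql2) {
        return sql2.replaceAll("'(?:[^']|'')*'", "'A'");
    }
}
```

Applied to the two example queries above, both produce the same string ending in = 'A', which could then serve as the key for stats collection or for logging a warning once per shape.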
[jira] [Updated] (OAK-7370) order by jcr:score desc doesn't work across union query created by optimizing OR clauses
[ https://issues.apache.org/jira/browse/OAK-7370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7370: -- Fix Version/s: (was: 1.10.0) > order by jcr:score desc doesn't work across union query created by optimizing > OR clauses > > > Key: OAK-7370 > URL: https://issues.apache.org/jira/browse/OAK-7370 > Project: Jackrabbit Oak > Issue Type: Bug > Components: core >Reporter: Vikas Saurabh >Assignee: Vikas Saurabh >Priority: Major > Fix For: 1.12 > > > Merging of sub-queries created due to optimizing OR clauses doesn't work for > sorting on {{jcr:score}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-5927) Load excerpt lazily
[ https://issues.apache.org/jira/browse/OAK-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5927: -- Fix Version/s: (was: 1.10.0) > Load excerpt lazily > --- > > Key: OAK-5927 > URL: https://issues.apache.org/jira/browse/OAK-5927 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene >Reporter: Chetan Mehrotra >Priority: Major > Labels: performance > Fix For: 1.12 > > > Currently LucenePropertyIndex loads the excerpt eagerly in batch as part of > the loadDocs call. The load docs batch size doubles starting from 50 (max 100k) > as more data is read. > We should look into ways to load the excerpt lazily, as and when the caller > asks for it. > Note that currently excerpts are only loaded when the query requests an > excerpt, i.e. there is a not-null property restriction for {{rep:excerpt}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-5791) Reduce number of calls while adding a new node
[ https://issues.apache.org/jira/browse/OAK-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5791: -- Fix Version/s: (was: 1.10.0) > Reduce number of calls while adding a new node > --- > > Key: OAK-5791 > URL: https://issues.apache.org/jira/browse/OAK-5791 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: documentmk >Reporter: Chetan Mehrotra >Priority: Major > Fix For: 1.12 > > > Adding a new child node currently takes 2 remote calls. We should look into > reducing this to 1 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7358) Remove all usage of java.security.acl.Group for Java 13
[ https://issues.apache.org/jira/browse/OAK-7358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7358: -- Fix Version/s: (was: 1.10.0) > Remove all usage of java.security.acl.Group for Java 13 > --- > > Key: OAK-7358 > URL: https://issues.apache.org/jira/browse/OAK-7358 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: security >Reporter: Alex Deparvu >Assignee: Alex Deparvu >Priority: Major > Fix For: 1.12 > > > Followup of OAK-7024 for the actual removal of the Group class from the > codebase to be java 11 compliant. > Not sure what to use for 'fix version', I went with 1.9.0 so this remains on > the radar, but we can push it out as needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7904) Exporting query duration per index metrics with Sling Metrics / DropWizard
[ https://issues.apache.org/jira/browse/OAK-7904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7904: -- Fix Version/s: (was: 1.10.0) > Exporting query duration per index metrics with Sling Metrics / DropWizard > -- > > Key: OAK-7904 > URL: https://issues.apache.org/jira/browse/OAK-7904 > Project: Jackrabbit Oak > Issue Type: Task > Components: indexing, query >Reporter: Paul Chibulcuteanu >Assignee: Paul Chibulcuteanu >Priority: Major > Fix For: 1.9.10, 1.12 > > Attachments: OAK-7904.0.patch > > > Purpose of this task is to evaluate & create metric which calculates the > average duration of query for each index. > This metric can be later used to evaluate which index(s) need to be optimised. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-5360) Cancellation of gc should be reflected by RevisionGC.getRevisionGCStatus()
[ https://issues.apache.org/jira/browse/OAK-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5360: -- Fix Version/s: (was: 1.10.0) > Cancellation of gc should be reflected by RevisionGC.getRevisionGCStatus() > -- > > Key: OAK-5360 > URL: https://issues.apache.org/jira/browse/OAK-5360 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core, segment-tar >Reporter: Michael Dürig >Assignee: Michael Dürig >Priority: Major > Labels: gc, management, monitoring, production > Fix For: 1.12 > > > Currently, when a garbage collection cycle is cancelled from "within" (i.e. > through {{CancelCompactionSupplier}}), this is not reflected through > {{RevisionGC.getRevisionGCStatus()}} but rather reported as a successful run. > We should change this and return a failure result indicating the cancellation > so downstream consumers get a proper indication of whether and which gc runs > actually succeeded. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-5369) Lucene Property Index: Syntax Error, cannot parse
[ https://issues.apache.org/jira/browse/OAK-5369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5369: -- Fix Version/s: (was: 1.10.0) > Lucene Property Index: Syntax Error, cannot parse > - > > Key: OAK-5369 > URL: https://issues.apache.org/jira/browse/OAK-5369 > Project: Jackrabbit Oak > Issue Type: Bug > Components: lucene >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.12 > > > The following query throws an exception in Apache Lucene: > {noformat} > /jcr:root//*[jcr:contains(., 'hello -- world')] > 22.12.2016 16:42:54.511 *WARN* [qtp1944702753-3846] > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex query via > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex@1c0006db > failed. > java.lang.RuntimeException: INVALID_SYNTAX_CANNOT_PARSE: Syntax Error, cannot > parse hello -- world: > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex.tokenToQuery(LucenePropertyIndex.java:1450) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex.tokenToQuery(LucenePropertyIndex.java:1418) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex.access$900(LucenePropertyIndex.java:180) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$3.visitTerm(LucenePropertyIndex.java:1353) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$3.visit(LucenePropertyIndex.java:1307) > at > org.apache.jackrabbit.oak.query.fulltext.FullTextContains.accept(FullTextContains.java:63) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex.getFullTextQuery(LucenePropertyIndex.java:1303) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex.getLuceneRequest(LucenePropertyIndex.java:791) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex.access$300(LucenePropertyIndex.java:180) > at > 
org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.loadDocs(LucenePropertyIndex.java:375) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.computeNext(LucenePropertyIndex.java:317) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.computeNext(LucenePropertyIndex.java:306) > at > com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) > at > com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$LucenePathCursor$1.hasNext(LucenePropertyIndex.java:1571) > at com.google.common.collect.Iterators$7.computeNext(Iterators.java:645) > at > com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) > at > com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) > at > org.apache.jackrabbit.oak.spi.query.Cursors$PathCursor.hasNext(Cursors.java:205) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$LucenePathCursor.hasNext(LucenePropertyIndex.java:1595) > at > org.apache.jackrabbit.oak.query.ast.SelectorImpl.next(SelectorImpl.java:420) > at > org.apache.jackrabbit.oak.query.QueryImpl$RowIterator.fetchNext(QueryImpl.java:828) > at > org.apache.jackrabbit.oak.query.QueryImpl$RowIterator.hasNext(QueryImpl.java:853) > at > org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.fetch(QueryResultImpl.java:98) > at > org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.(QueryResultImpl.java:94) > at > org.apache.jackrabbit.oak.jcr.query.QueryResultImpl.getRows(QueryResultImpl.java:78) > Caused by: > org.apache.lucene.queryparser.flexible.standard.parser.ParseException: Syntax > Error, cannot parse hello -- world: > at > org.apache.lucene.queryparser.flexible.standard.parser.StandardSyntaxParser.generateParseException(StandardSyntaxParser.java:1054) > at > 
org.apache.lucene.queryparser.flexible.standard.parser.StandardSyntaxParser.jj_consume_token(StandardSyntaxParser.java:936) > at > org.apache.lucene.queryparser.flexible.standard.parser.StandardSyntaxParser.Clause(StandardSyntaxParser.java:486) > at > org.apache.lucene.queryparser.flexible.standard.parser.StandardSyntaxParser.ModClause(StandardSyntaxParser.java:303) > at > org.apache.lucene.queryparser.flexible.standard.parser.StandardSyntaxParser.ConjQuery(StandardSyntaxParser.java:234) > at > org.apache.lucene.queryparser.flexible.standard.parser.StandardSyntaxParser.DisjQuery(StandardSyntaxParser.java:204) > at >
[jira] [Updated] (OAK-5144) use 'allNodeTypes' of ChangeSet for nodeType-aggregate-filter
[ https://issues.apache.org/jira/browse/OAK-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5144: -- Fix Version/s: (was: 1.10.0) > use 'allNodeTypes' of ChangeSet for nodeType-aggregate-filter > - > > Key: OAK-5144 > URL: https://issues.apache.org/jira/browse/OAK-5144 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: jcr >Affects Versions: 1.5.14 >Reporter: Stefan Egli >Priority: Major > Fix For: 1.12 > > > With OAK-4940 the ChangeSet now contains all node types up to root that are > related to a change. This fact could be used by the nodeType-aggregate-filter > (OAK-5021), which would likely speed up this type of filter. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-6765) Convert oak-jcr to OSGi R6 annotations
[ https://issues.apache.org/jira/browse/OAK-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6765: -- Fix Version/s: (was: 1.10.0) > Convert oak-jcr to OSGi R6 annotations > -- > > Key: OAK-6765 > URL: https://issues.apache.org/jira/browse/OAK-6765 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: jcr >Reporter: Robert Munteanu >Priority: Major > Fix For: 1.12 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-5932) Use static optional greedy policy for BlobStore in DocumentNodeStoreService
[ https://issues.apache.org/jira/browse/OAK-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5932: -- Fix Version/s: (was: 1.10.0) > Use static optional greedy policy for BlobStore in DocumentNodeStoreService > --- > > Key: OAK-5932 > URL: https://issues.apache.org/jira/browse/OAK-5932 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: documentmk >Reporter: Chetan Mehrotra >Priority: Major > Fix For: 1.12 > > > Currently {{DocumentNodeStoreService}} uses DYNAMIC policy for BlobStore and > DataSource. This leads to complexity in activation due to dynamic nature of > OSGi. > To simplify that we should switch to static, greedy policy. This approach is > used by SegmentNodeStoreService as part of OAK-5223 and reduces the setup > complexity quite a bit -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7043) Collect SegmentStore stats as part of status zip
[ https://issues.apache.org/jira/browse/OAK-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7043: -- Fix Version/s: (was: 1.10.0) > Collect SegmentStore stats as part of status zip > > > Key: OAK-7043 > URL: https://issues.apache.org/jira/browse/OAK-7043 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: segment-tar >Reporter: Chetan Mehrotra >Priority: Major > Labels: monitoring, production > Fix For: 1.12 > > > Often, while investigating an issue, we ask the customer to provide the size > of the segmentstore and at times a listing of the segmentstore directory. It would be > useful to have an InventoryPrinter for SegmentStore which can include > * Size of segment store > * Listing of segment store directory > * Possibly tail of journal.log > * Possibly some stats/info from index files stored in tar files -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-5937) Disable query where path restriction is not absolute
[ https://issues.apache.org/jira/browse/OAK-5937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5937: -- Fix Version/s: (was: 1.10.0) > Disable query where path restriction is not absolute > > > Key: OAK-5937 > URL: https://issues.apache.org/jira/browse/OAK-5937 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: query >Reporter: Chetan Mehrotra >Assignee: Thomas Mueller >Priority: Minor > Fix For: 1.12 > > > A query like the one below cannot be executed in a performant way. We should provide an > option to reject such queries: > //content/foo/bar -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-2777) Minimize the cost calculation for queries using reference restrictions.
[ https://issues.apache.org/jira/browse/OAK-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-2777: -- Fix Version/s: (was: 1.10.0) > Minimize the cost calculation for queries using reference restrictions. > --- > > Key: OAK-2777 > URL: https://issues.apache.org/jira/browse/OAK-2777 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core, query >Affects Versions: 1.1.2, 1.2 >Reporter: Przemo Pakulski >Assignee: Thomas Mueller >Priority: Major > Labels: performance > Fix For: 1.12 > > Attachments: oak-2777.patch > > > According to the javadocs (QueryIndex) minimum cost for index is 1. Currently > ReferenceIndex returns this minimum value, when it can be used for the query. > But even then cost for remaining indexes is still calculated. We could skip > cost calculation of remaining indexes if we achieved the minimum cost already. > It will speed up all queries which can leverage the reference Index. > Example query: > SELECT * FROM [nt:base] WHERE PROPERTY([rep:members], 'WeakReference') = > '345bef9b-ffa1-3e09-85df-1e03cfa0fb37' -- This message was sent by Atlassian JIRA (v7.6.3#76005)
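The early exit proposed in OAK-2777 can be sketched as follows. This is a minimal illustration in Python, not Oak's Java code: `select_index`, `MIN_COST`, and the index names with their cost functions are hypothetical, and only the minimum cost of 1 is taken from the issue text.

```python
from typing import Callable, Dict, List, Optional, Tuple

MIN_COST = 1.0  # minimum index cost according to the issue text


def select_index(
    indexes: Dict[str, Callable[[], float]],
) -> Tuple[Optional[str], float, List[str]]:
    """Return (best index name, its cost, names whose cost was evaluated)."""
    best_name: Optional[str] = None
    best_cost = float("inf")
    evaluated: List[str] = []
    for name, cost_fn in indexes.items():
        cost = cost_fn()  # potentially expensive call
        evaluated.append(name)
        if cost < best_cost:
            best_name, best_cost = name, cost
        if best_cost <= MIN_COST:
            # No other index can be cheaper, so skip the remaining ones.
            break
    return best_name, best_cost, evaluated


# Hypothetical indexes and costs, for illustration only.
indexes = {
    "reference": lambda: 1.0,
    "lucene": lambda: 50.0,
    "traversal": lambda: 1e6,
}
best, cost, evaluated = select_index(indexes)
```

In this sketch the benefit depends on evaluation order: the remaining cost functions are skipped only once an index reporting the minimum cost has been reached.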
[jira] [Updated] (OAK-6261) Log queries that sort by un-indexed properties
[ https://issues.apache.org/jira/browse/OAK-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6261: -- Fix Version/s: (was: 1.10.0) > Log queries that sort by un-indexed properties > -- > > Key: OAK-6261 > URL: https://issues.apache.org/jira/browse/OAK-6261 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: query >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Minor > Fix For: 1.12 > > > Queries that can read many nodes, and sort by properties that are not > indexed, can be very slow. This includes for example fulltext queries. > As a start, it might make sense to log an "info" level message (but avoid > logging the same message each time a query is run). Per configuration, this > could be turned to "warning". -- This message was sent by Atlassian JIRA (v7.6.3#76005)
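The "log once, with a configurable level" idea from OAK-6261 could look roughly like this. It is a sketch under assumptions, written in Python rather than Oak's Java: the class `UnindexedSortLogger`, the logger name, and the message text are all hypothetical.

```python
import logging
from typing import Set

logger = logging.getLogger("query.sort")  # hypothetical logger name


class UnindexedSortLogger:
    """Logs a message the first time a given query statement is reported."""

    def __init__(self, level: int = logging.INFO) -> None:
        # INFO by default; configuration could raise this to WARNING.
        self.level = level
        self._seen: Set[str] = set()

    def report(self, statement: str, prop: str) -> bool:
        """Return True if a message was logged, False if it was suppressed."""
        if statement in self._seen:
            return False
        self._seen.add(statement)
        logger.log(
            self.level,
            "Query sorts by un-indexed property %s: %s",
            prop,
            statement,
        )
        return True
```

The set of seen statements grows without bound here; a real implementation would probably need to cap or expire it.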
[jira] [Updated] (OAK-3767) Provide a way to extend shipped index definitions
[ https://issues.apache.org/jira/browse/OAK-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-3767: -- Fix Version/s: (was: 1.10.0) > Provide a way to extend shipped index definitions > - > > Key: OAK-3767 > URL: https://issues.apache.org/jira/browse/OAK-3767 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: indexing, query >Reporter: Davide Giannella >Priority: Major > Fix For: 1.12 > > > We need to provide an explicit support for extending out of the box shipped > index definition by an application built on top of Oak. Consider a Sling > based app which ships with an index on assets like /oak:index/assetIndex. > This application is now used in a project where some project specific > extensions are to be done i.e. some new custom asset properties are to be > indexed. Currently there are two options > # Create new duplicate index - For project usage we can create a separate > index which includes the project specific properties. This has following > downsides > ## Increases index memory consumption - As both /oak:index/assetIndex and > /oak:index/myAssetIndex would index same asset nodes they would be storing > the same asset path twice and hence cause an increase in memory consumption > by the index > # Increase in indexing time - With increase in number of indexes at same > level the indexing time would increase > # Ambiguity in index selection - As both indexes index same type of nodes > they would compete in answering queries related to assets leading to > ambiguity in index selection by query engine. > Given above it would be better to avoid such cases and provide an explicit > support for extending the index definitions. 
This can be done by enabling > support for adding index definition extensions under a sub directory in a sub > directory under /oak:index > {noformat} > /oak:index > + assetIndex > + apps > + assetIndex > {noformat} > The indexing logic should then use the effective index definition for > indexing and querying. > *question*. Shall we allow this only under root or under any arbitrary path > as well? For example /content. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-6309) Not always convert XPath "primaryType in a, b" to union
[ https://issues.apache.org/jira/browse/OAK-6309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6309: -- Fix Version/s: (was: 1.10.0) > Not always convert XPath "primaryType in a, b" to union > --- > > Key: OAK-6309 > URL: https://issues.apache.org/jira/browse/OAK-6309 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: query >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Critical > Fix For: 1.12 > > > Currently, queries with multiple primary types are always converted to a > "union", but this is not always the best solution. The main problem is that > results are not sorted by score as expected. Example: > {noformat} > /jcr:root/content//element(*, nt:hierarchyNode)[jcr:contains(., 'abc') > and (@jcr:primaryType = 'acme:Page' or @jcr:primaryType = 'acme:Asset')] > {noformat} > This is currently converted to a union, even if the same index is used for > both subqueries (assuming there is an index on nt:hierarchyNode). > A workaround is to use: > {noformat} > /jcr:root/content//element(*, nt:hierarchyNode)[jcr:contains(., 'abc') > and (./@jcr:primaryType = 'acme:Page' or ./@jcr:primaryType = 'acme:Asset')] > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-6741) Switch to official OSGi component and metatype annotations
[ https://issues.apache.org/jira/browse/OAK-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6741: -- Fix Version/s: (was: 1.10.0) > Switch to official OSGi component and metatype annotations > -- > > Key: OAK-6741 > URL: https://issues.apache.org/jira/browse/OAK-6741 > Project: Jackrabbit Oak > Issue Type: Improvement >Reporter: Robert Munteanu >Priority: Major > Fix For: 1.12 > > Attachments: OAK-6741-proposed-changes-chetans-feedback.patch, > osgi-metadata-1.7.8.json, osgi-metadata-trunk.json > > > We should remove the 'old' Felix SCR annotations and move to the 'new' OSGi > R6 annotations. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7425) Add discovery mechanism for tooling implementations
[ https://issues.apache.org/jira/browse/OAK-7425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7425: -- Fix Version/s: (was: 1.10.0) > Add discovery mechanism for tooling implementations > --- > > Key: OAK-7425 > URL: https://issues.apache.org/jira/browse/OAK-7425 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Francesco Mari >Assignee: Francesco Mari >Priority: Major > Labels: technical_debt > Fix For: 1.12 > > Attachments: 001.patch > > > This issue proposes an idea for discovering implementations of tooling for > the Segment Store. Developing a tool for the Segment Store should include the > following steps. > * The tool compiles against the {{NodeStore}} API and the API exposed through > the oak-segment-tar-tool-api. In particular, the tool uses the > {{ToolingSupportFactory}} and related interfaces to instantiate a NodeStore > and, optionally, a {{NodeState}} for the proc tree. > * The tool runs with an implementation-dependent uber-jar in the classpath. > The uber-jar includes the {{ToolingSupportFactory}} API, its implementation, > and every other class required for the implementation to work. No other JAR > is required to use the {{ToolingSupportFactory}} API. The tool uses > Java's {{ServiceLoader}} to instantiate an implementation of > {{ToolingSupportFactory}}. The uber-jar is the {{oak-segment-tar-tool}} > module. > The patch falls short of fully implementing the use case because > {{oak-segment-tar-tool-api}} is not versioned independently from Oak. This > can't happen at the moment because {{oak-store-spi}} and its dependencies are > not independently versioned either. The workflow described above could still > work, but only because the {{NodeStore}} and {{NodeState}} APIs are quite > stable. A cleaner solution to dependency management is required in the long > run. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-4760) Adjust default timeout values for RDBDocumentStore
[ https://issues.apache.org/jira/browse/OAK-4760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-4760: -- Fix Version/s: (was: 1.10.0) > Adjust default timeout values for RDBDocumentStore > -- > > Key: OAK-4760 > URL: https://issues.apache.org/jira/browse/OAK-4760 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core, rdbmk >Reporter: Marcel Reutegger >Assignee: Julian Reschke >Priority: Minor > Labels: resilience > Fix For: 1.12 > > > Some default timeout values of the RDBDocumentStore driver do not work well > with the lease time we use in Oak. > See also OAK-4739. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-6911) Provide a way to tune inline size while storing binaries
[ https://issues.apache.org/jira/browse/OAK-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6911: -- Fix Version/s: (was: 1.10.0) > Provide a way to tune inline size while storing binaries > > > Key: OAK-6911 > URL: https://issues.apache.org/jira/browse/OAK-6911 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Chetan Mehrotra >Priority: Major > Labels: performance, scalability > Fix For: 1.12 > > > SegmentNodeStore currently inlines binaries of size less than 16KB > (Segment.MEDIUM_LIMIT) even if an external BlobStore is configured. > Due to this behaviour quite a bit of segment tar storage consists of blob > data. In one setup, out of 370 GB segmentstore size, 290GB is due to inlined > binaries. If most of this binary content is moved to the BlobStore then it would > allow the same repository to work better with less RAM. > So it would be useful if some way is provided to disable this default > behaviour and let the BlobStore take control of inline size, i.e. in the presence of a > BlobStore no inlining is attempted by SegmentWriter. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
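The switch requested in OAK-6911 amounts to a small decision function. The following Python sketch is not SegmentWriter code: `should_inline` and the flag `inline_with_blob_store` are hypothetical names, and `MEDIUM_LIMIT` is set to a round 16KB as cited in the issue (the exact constant in Oak may differ).

```python
MEDIUM_LIMIT = 16 * 1024  # approximate inline threshold cited in the issue


def should_inline(
    size: int,
    has_blob_store: bool,
    inline_with_blob_store: bool = True,
) -> bool:
    """Decide whether a binary of `size` bytes is stored inline in a segment.

    With inline_with_blob_store=True this mirrors the behaviour described in
    the issue (small binaries are always inlined). Setting it to False leaves
    every binary to the external BlobStore when one is configured.
    """
    if has_blob_store and not inline_with_blob_store:
        return False
    return size < MEDIUM_LIMIT
```

For example, a 4KB binary is inlined by default, but is handed to the BlobStore once the hypothetical flag is switched off.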
[jira] [Updated] (OAK-7381) Reduce debug log output for queries
[ https://issues.apache.org/jira/browse/OAK-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7381: -- Fix Version/s: (was: 1.10.0) > Reduce debug log output for queries > --- > > Key: OAK-7381 > URL: https://issues.apache.org/jira/browse/OAK-7381 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: query >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.12 > > > When enabling the debug log level, running a query can log a lot. That can > slow down executing a large query quite a lot. The amount of logged data > should be reduced. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-2787) Faster multi threaded indexing / text extraction for binary content
[ https://issues.apache.org/jira/browse/OAK-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-2787: -- Fix Version/s: (was: 1.10.0) > Faster multi threaded indexing / text extraction for binary content > --- > > Key: OAK-2787 > URL: https://issues.apache.org/jira/browse/OAK-2787 > Project: Jackrabbit Oak > Issue Type: Wish > Components: lucene >Reporter: Chetan Mehrotra >Priority: Major > Fix For: 1.12 > > > With Lucene based indexing the indexing process is single threaded. This > hampers the indexing of binary content, as on a multi processor system only > a single thread can be used to perform the indexing. > [~ianeboston] suggested a possible approach [1] involving a 2 phase indexing > # In the first phase detect the nodes to be indexed and start the full text > extraction of the binary content. Post extraction save the binary token > stream back to the node as hidden data. In this phase the node properties > can still be indexed and a marker field would be added to indicate the > fulltext index is still pending > # Later in the 2nd phase look for all such Lucene docs and then update them with > the saved token stream > This would allow the text extraction logic to be decoupled from the Lucene > indexing logic > [1] http://markmail.org/thread/2w5o4bwqsosb6esu -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-6303) Cache in CachingBlobStore might grow beyond configured limit
[ https://issues.apache.org/jira/browse/OAK-6303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6303: -- Fix Version/s: (was: 1.10.0) > Cache in CachingBlobStore might grow beyond configured limit > > > Key: OAK-6303 > URL: https://issues.apache.org/jira/browse/OAK-6303 > Project: Jackrabbit Oak > Issue Type: Bug > Components: blob, core >Reporter: Julian Reschke >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.12 > > Attachments: OAK-6303-test.diff, OAK-6303.diff > > > It appears that depending on actual cache entry sizes, the {{CacheLIRS}} > might grow beyond the configured limit. > For {{RDBBlobStore}}, the limit is currently configured to 16MB, yet storing > random 2M entries appears to fill the cache with 64MB of data (according to > its own stats). > The attached test case reproduces this. > (it seems this is caused by the fact that each of the 16 segments of the > cache can hold 2 entries, no matter how big they are...) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
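The numbers in OAK-6303 are consistent with the suspected cause, as a quick calculation shows. The Python below is only arithmetic on the figures quoted in the issue (16 segments, 2 entries per segment, 2MB entries, 16MB limit); the function name is hypothetical and nothing here models CacheLIRS itself.

```python
MB = 1024 * 1024


def worst_case_cache_bytes(
    segments: int, min_entries_per_segment: int, entry_bytes: int
) -> int:
    """Size reached if every segment keeps its guaranteed entries."""
    return segments * min_entries_per_segment * entry_bytes


configured_limit = 16 * MB
worst_case = worst_case_cache_bytes(
    segments=16, min_entries_per_segment=2, entry_bytes=2 * MB
)
overshoot_factor = worst_case // configured_limit
```

With these inputs the worst case is 64MB, four times the configured 16MB, which matches the size reported in the issue.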
[jira] [Updated] (OAK-5553) Index async index in a new lane without blocking the main lane
[ https://issues.apache.org/jira/browse/OAK-5553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5553: -- Fix Version/s: (was: 1.10.0) > Index async index in a new lane without blocking the main lane > -- > > Key: OAK-5553 > URL: https://issues.apache.org/jira/browse/OAK-5553 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: indexing >Reporter: Chetan Mehrotra >Priority: Major > Fix For: 1.12 > > > Currently if an async index has to be reindexed for any reason, say an update of the > index definition, then this process blocks the indexing of other indexes on > that lane. > For e.g. if on the "async" lane we have 2 indexes /oak:index/fooIndex and > /oak:index/barIndex and fooIndex needs to be reindexed. In such a case > currently AsyncIndexUpdate would work on reindexing and until that > completes the other indexes do not receive any update. If the reindexing takes say 1 > day then the other indexes would start lagging behind by that time. Note that NRT > indexing would help somewhat here. > To improve this we can implement something similar to what was done for > property index in OAK-1456 i.e. provide a way where > # an admin can trigger reindex of some async indexes > # those indexes are moved to a different lane and then reindexed > # post reindexing logic should then move them back to their original lane > Further this task can then be performed on a non leader node as the indexes > would not be part of any active lane. Also we may implement it as part of > oak-run -- This message was sent by Atlassian JIRA (v7.6.3#76005)
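The three steps listed in OAK-5553 can be sketched as a lane switch around the reindex call. This Python sketch is illustrative only: the dictionary of lanes, the function `reindex_in_separate_lane`, and the lane name `temp-async` are hypothetical and do not correspond to Oak's index definition properties or to AsyncIndexUpdate.

```python
from typing import Callable, Dict, List


def reindex_in_separate_lane(
    lanes: Dict[str, str],
    index_path: str,
    reindex: Callable[[str], None],
    temp_lane: str = "temp-async",
) -> None:
    """Move an index to a temporary lane, reindex it, then move it back."""
    original_lane = lanes[index_path]
    lanes[index_path] = temp_lane   # step 2: take it out of the main lane
    reindex(index_path)             # may run for a long time
    lanes[index_path] = original_lane  # step 3: return to the original lane
    # If reindex() raises, the index stays in the temporary lane in this sketch.


lanes = {"/oak:index/fooIndex": "async", "/oak:index/barIndex": "async"}
seen_during_reindex: List[Dict[str, str]] = []
reindex_in_separate_lane(
    lanes,
    "/oak:index/fooIndex",
    lambda path: seen_during_reindex.append(dict(lanes)),
)
```

While fooIndex is being reindexed, barIndex remains on the "async" lane and keeps receiving updates, which is the point of the proposal.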
[jira] [Updated] (OAK-7098) Refactor common logic between IndexUpdate and DocumentStoreIndexer
[ https://issues.apache.org/jira/browse/OAK-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7098: -- Fix Version/s: (was: 1.10.0) > Refactor common logic between IndexUpdate and DocumentStoreIndexer > -- > > Key: OAK-7098 > URL: https://issues.apache.org/jira/browse/OAK-7098 > Project: Jackrabbit Oak > Issue Type: Task > Components: indexing, run >Reporter: Chetan Mehrotra >Priority: Major > Fix For: 1.12 > > > DocumentStoreIndexer implements an alternative way of indexing which differs > from the diff based indexing done by IndexUpdate. However, some parts of the logic are > common. > We should refactor them and abstract them out so both can share the same logic -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-2538) Support index time aggregation in Solr index
[ https://issues.apache.org/jira/browse/OAK-2538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-2538: -- Fix Version/s: (was: 1.10.0) > Support index time aggregation in Solr index > > > Key: OAK-2538 > URL: https://issues.apache.org/jira/browse/OAK-2538 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: solr >Reporter: Tommaso Teofili >Assignee: Tommaso Teofili >Priority: Major > Labels: performance > Fix For: 1.12 > > > Solr index is only able to do query time aggregation while that "would not > perform well for multi term searches as each term involves a separate call > and with intersection cursor being used the operation might result in reading > up all match terms even when user accesses only first page", therefore it'd > be good to implement index time aggregation like in Lucene index. (/cc > [~chetanm]) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7328) Update DocumentNodeStore based OakFixture
[ https://issues.apache.org/jira/browse/OAK-7328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7328: -- Fix Version/s: (was: 1.10.0) > Update DocumentNodeStore based OakFixture > - > > Key: OAK-7328 > URL: https://issues.apache.org/jira/browse/OAK-7328 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: run >Reporter: Marcel Reutegger >Assignee: Marcel Reutegger >Priority: Minor > Fix For: 1.12 > > > The current OakFixtures using a DocumentNodeStore use a configuration / setup > which is different from what a default DocumentNodeStoreService would use. It > would be better if benchmarks run with a configuration close to a default > setup. The main differences identified are: > - Does not have a proper executor, which means some tasks are executed with > the same thread. > - Does not use a separate persistent cache for the journal (diff cache). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-7938) Test failure: MBeanIT.testClientAndServerEmptyConfig
[ https://issues.apache.org/jira/browse/OAK-7938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-7938: -- Fix Version/s: (was: 1.10.0) > Test failure: MBeanIT.testClientAndServerEmptyConfig > > > Key: OAK-7938 > URL: https://issues.apache.org/jira/browse/OAK-7938 > Project: Jackrabbit Oak > Issue Type: Bug > Components: continuous integration, segment-tar >Reporter: Hudson >Priority: Major > Fix For: 1.12 > > > No description is provided > The build Jackrabbit Oak #1826 has failed. > First failed run: [Jackrabbit Oak > #1826|https://builds.apache.org/job/Jackrabbit%20Oak/1826/] [console > log|https://builds.apache.org/job/Jackrabbit%20Oak/1826/console] > {noformat} > [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.782 > s <<< FAILURE! - in org.apache.jackrabbit.oak.segment.standby.MBeanIT > [ERROR] > testClientAndServerEmptyConfig(org.apache.jackrabbit.oak.segment.standby.MBeanIT) > Time elapsed: 2.499 s <<< FAILURE! > org.junit.ComparisonFailure: expected:<[0]> but was:<[1]> > at org.junit.Assert.assertEquals(Assert.java:115) > at org.junit.Assert.assertEquals(Assert.java:144) > at > org.apache.jackrabbit.oak.segment.standby.MBeanIT.testClientAndServerEmptyConfig(MBeanIT.java:194) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-5291) Per-Query Limits (nodes read, nodes read in memory)
[ https://issues.apache.org/jira/browse/OAK-5291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-5291: -- Fix Version/s: (was: 1.10.0) > Per-Query Limits (nodes read, nodes read in memory) > --- > > Key: OAK-5291 > URL: https://issues.apache.org/jira/browse/OAK-5291 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: query >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.12 > > > In OAK-1395 we added limits for long running queries. In OAK-1571 we added > OSGi configuration. In OAK-5237 we changed the default settings. > It would be nice to be able to define the limits per query, similar to > OAK-4888. The query would look like (for example, to limit reading to 1 > million nodes, even if the default query limit is lower): > {noformat} > select * from [nt:base] > where ischildnode('/oak:index') > order by name() > option(traversal ok, limit 100) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-6597) rep:excerpt not working for content indexed by aggregation in lucene
[ https://issues.apache.org/jira/browse/OAK-6597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6597: -- Fix Version/s: (was: 1.10.0) > rep:excerpt not working for content indexed by aggregation in lucene > > > Key: OAK-6597 > URL: https://issues.apache.org/jira/browse/OAK-6597 > Project: Jackrabbit Oak > Issue Type: Bug > Components: lucene >Affects Versions: 1.6.1, 1.7.6, 1.8.0 >Reporter: Dirk Rudolph >Assignee: Chetan Mehrotra >Priority: Major > Labels: excerpt > Fix For: 1.12 > > Attachments: excerpt-with-aggregation-test.patch > > > I mentioned that properties that got indexed due to an aggregation are not > considered for excerpts (highlighting) as they are not indexed as stored > fields. > See the attached patch that implements a test for excerpts in > {{LuceneIndexAggregationTest2}}. > It creates the following structure: > {code} > /content/foo [test:Page] > + bar (String) > - jcr:content [test:PageContent] > + bar (String) > {code} > where both strings (the _bar_ property at _foo_ and the _bar_ property at > _jcr:content_) contain different text. > Afterwards it queries for 2 terms ("tinc*" and "aliq*") that either exist in > _/content/foo/bar_ or _/content/foo/jcr:content/bar_ but not in both. For the > former one the excerpt is properly provided; for the latter one it isn't. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-4814) Add orderby support for nodename index
[ https://issues.apache.org/jira/browse/OAK-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-4814: -- Fix Version/s: (was: 1.10.0) > Add orderby support for nodename index > -- > > Key: OAK-4814 > URL: https://issues.apache.org/jira/browse/OAK-4814 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: query > Affects Versions: 1.5.10 > Reporter: Ankush Malhotra > Assignee: Vikas Saurabh > Priority: Major > Fix For: 1.12 > > > OAK-1752 implemented index support for :nodeName. The JCR > Query explain tool shows that the index is used for conditions such as equals, > but it is not used for ORDER BY name(). > Is name() supported in the ORDER BY clause? If so, we would need to add > support for it in oak-lucene.
[jira] [Updated] (OAK-6758) Convert oak-authorization-cug to OSGi R6 annotations
[ https://issues.apache.org/jira/browse/OAK-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-6758: -- Fix Version/s: (was: 1.10.0) > Convert oak-authorization-cug to OSGi R6 annotations > > > Key: OAK-6758 > URL: https://issues.apache.org/jira/browse/OAK-6758 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: authorization-cug >Reporter: Robert Munteanu >Priority: Major > Fix For: 1.12 > >
[jira] [Updated] (OAK-3598) Export cache related classes for usage in other oak bundle
[ https://issues.apache.org/jira/browse/OAK-3598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-3598: -- Fix Version/s: (was: 1.10.0) > Export cache related classes for usage in other oak bundle > -- > > Key: OAK-3598 > URL: https://issues.apache.org/jira/browse/OAK-3598 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: cache > Reporter: Chetan Mehrotra > Priority: Major > Labels: tech-debt > Fix For: 1.12 > > > For OAK-3092, oak-lucene would need to access classes from the > {{org.apache.jackrabbit.oak.cache}} package. For now it's limited to > {{CacheStats}}, to expose the cache-related statistics. > This task is meant to determine the steps needed to export the package: > * Update the pom.xml to export the package > * Review the current set of classes to see if they need to be reviewed