[jira] [Created] (OAK-10662) improve Reproducible Builds
Herve Boutemy created OAK-10662:
-----------------------------------

Summary: improve Reproducible Builds
Key: OAK-10662
URL: https://issues.apache.org/jira/browse/OAK-10662
Project: Jackrabbit Oak
Issue Type: Improvement
Affects Versions: 1.60.0
Reporter: Herve Boutemy
Fix For: 1.62.0

Release 1.60.0 is quite good: 143 ok, 11 ko.
There are some easy fixes, and probably harder ones to tackle later.
See https://github.com/jvm-repo-rebuild/reproducible-central/blob/master/content/org/apache/jackrabbit/oak/README.md

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (OAK-10654) Build Jackrabbit/jackrabbit-oak-trunk #1363 failed
[ https://issues.apache.org/jira/browse/OAK-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820157#comment-17820157 ]

Hudson commented on OAK-10654:
------------------------------

Previously failing build now is OK.
Passed run: [Jackrabbit/jackrabbit-oak-trunk #1373|https://ci-builds.apache.org/job/Jackrabbit/job/jackrabbit-oak-trunk/1373/] [console log|https://ci-builds.apache.org/job/Jackrabbit/job/jackrabbit-oak-trunk/1373/console]

> Build Jackrabbit/jackrabbit-oak-trunk #1363 failed
> --------------------------------------------------
>
> Key: OAK-10654
> URL: https://issues.apache.org/jira/browse/OAK-10654
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: continuous integration
> Reporter: Hudson
> Priority: Major
>
> No description is provided
> The build Jackrabbit/jackrabbit-oak-trunk #1363 has failed.
> First failed run: [Jackrabbit/jackrabbit-oak-trunk #1363|https://ci-builds.apache.org/job/Jackrabbit/job/jackrabbit-oak-trunk/1363/]
> [console log|https://ci-builds.apache.org/job/Jackrabbit/job/jackrabbit-oak-trunk/1363/console]
[jira] [Commented] (OAK-10660) DocumentNodeStore: avoid repeated commits of :childOrder in branch commits
[ https://issues.apache.org/jira/browse/OAK-10660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820143#comment-17820143 ]

Julian Reschke commented on OAK-10660:
--------------------------------------

OK, this might be easier than expected. PR soonish.

> DocumentNodeStore: avoid repeated commits of :childOrder in branch commits
> --------------------------------------------------------------------------
>
> Key: OAK-10660
> URL: https://issues.apache.org/jira/browse/OAK-10660
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: documentmk
> Reporter: Julian Reschke
> Assignee: Julian Reschke
> Priority: Major
>
> - While persisting the branch commits, we are persisting large :childOrder
> properties repeatedly. In practice, only the last value is needed, so the
> previous ones could be cleaned up.
> - We currently do not keep information about when (revision) and where (_id)
> we have set :childOrder.
> - The "clean" approach would be to maintain a map of _id/revision that tells
> us in which revision we last set :childOrder. That could be used to pair the
> setting of the new value with a removal of the previous one.
> - But we may be able to simplify that: just maintain a list of _all_
> revisions that changed :childOrder, and any time we need to set a new value
> for :childOrder, nuke the entries for all of these revisions. This would be
> harmless because an extra REMOVE_MAP_ENTRY operation is essentially free,
> except for a small overhead in processing.
[jira] [Created] (OAK-10661) oak-search-elastic: remove workaround for elastic/elasticsearch-java/issues/404
Fabrizio Fortino created OAK-10661:
--------------------------------------

Summary: oak-search-elastic: remove workaround for elastic/elasticsearch-java/issues/404
Key: OAK-10661
URL: https://issues.apache.org/jira/browse/OAK-10661
Project: Jackrabbit Oak
Issue Type: Improvement
Components: search, search-elastic
Reporter: Fabrizio Fortino
Assignee: Fabrizio Fortino

https://github.com/elastic/elasticsearch-java/issues/404
[jira] [Commented] (OAK-10657) MongoDocumentStore: shrink in-DB documents after updates fail due to 16MB limit
[ https://issues.apache.org/jira/browse/OAK-10657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820072#comment-17820072 ]

Manfred Baedke commented on OAK-10657:
--------------------------------------

Feature toggle now implemented, see PR.

> MongoDocumentStore: shrink in-DB documents after updates fail due to 16MB limit
> --------------------------------------------------------------------------------
>
> Key: OAK-10657
> URL: https://issues.apache.org/jira/browse/OAK-10657
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: documentmk, mongomk
> Reporter: Julian Reschke
> Assignee: Julian Reschke
> Priority: Major
>
> To address the 16MB/childorder issue, there are many potential approaches:
> - make GC more aggressive
> - try to change updates to remove "in-between" changes of ":childOrder"
> property
> - change the data model of ":childOrder"
> - try to shrink document in DB once size related exception happens
> This ticket is about the last of these options.
> Proposal:
> - improve exception thrown by document store so that it can be acted upon
> - in document store utils add a method that inspects a document and produces
> UpdateOps suitable to shrink the document
> - DocumentNodeStore commit could catch exception, obtain update ops, apply
> them, and retry once (this should be dependent on a feature toggle)
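The last step of the proposal quoted above (catch the exception, obtain update ops, apply them, retry once, all behind a feature toggle) could look roughly like the following. This is only a sketch under stated assumptions, not the code from the PR: `DocumentTooLargeException`, `SketchStore`, `ShrinkAndRetry`, the `shrinkOpsFor` function and the use of plain strings as update operations are all hypothetical stand-ins and do not correspond to Oak's actual `DocumentStore` or `UpdateOp` API.

```java
import java.util.List;
import java.util.function.Function;

/** Hypothetical exception that names the document which hit the size limit. */
final class DocumentTooLargeException extends RuntimeException {
    final String documentId;

    DocumentTooLargeException(String documentId) {
        super("document exceeds size limit: " + documentId);
        this.documentId = documentId;
    }
}

/** Hypothetical minimal store; a String stands in for an update operation. */
interface SketchStore {
    /** Applies one update; may throw DocumentTooLargeException. */
    void apply(String op);
}

final class ShrinkAndRetry {
    private ShrinkAndRetry() {
    }

    /**
     * Applies the update. If the store reports an oversized document and the
     * feature toggle is on, applies the shrinking ops for that document and
     * retries the update exactly once. A second failure propagates.
     */
    static void commit(SketchStore store,
                       String update,
                       Function<String, List<String>> shrinkOpsFor,
                       boolean shrinkEnabled) {
        try {
            store.apply(update);
        } catch (DocumentTooLargeException e) {
            if (!shrinkEnabled) {
                throw e; // toggle off: behave as before
            }
            for (String shrinkOp : shrinkOpsFor.apply(e.documentId)) {
                store.apply(shrinkOp);
            }
            store.apply(update); // single retry
        }
    }
}
```

The exception carries the document id because the proposal asks for an exception "that can be acted upon"; without it, the caller would not know which document to inspect. Retrying only once keeps the failure mode unchanged when shrinking does not free enough space.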
[jira] [Comment Edited] (OAK-10660) DocumentNodeStore: avoid repeated commits of :childOrder in branch commits
[ https://issues.apache.org/jira/browse/OAK-10660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820045#comment-17820045 ]

Julian Reschke edited comment on OAK-10660 at 2/23/24 12:42 PM:
----------------------------------------------------------------

Here:

{noformat}
diff --git a/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreBranch.java b/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreBranch.java
index 1b04f62fa5..5cdf3901da 100644
--- a/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreBranch.java
+++ b/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreBranch.java
@@ -78,6 +78,9 @@ class DocumentNodeStoreBranch implements NodeStoreBranch {
     /** The maximum number of updates to keep in memory */
     private final int updateLimit;
 
+    /** Revisions written by us */
+    private final Set revisions = new HashSet<>();
+
     /**
      * State of the this branch. Either {@link Unmodified}, {@link InMemory}, {@link Persisted},
      * {@link ResetFailed} or {@link Merged}.
@@ -321,6 +324,7 @@ class DocumentNodeStoreBranch implements NodeStoreBranch {
                 c.apply();
                 rev = store.done(c, base.getRootRevision().isBranch(), info);
                 success = true;
+                revisions.add(c.getRevision());
             } finally {
                 if (!success) {
                     store.canceled(c);
{noformat}

would be a good place to track what revisions are relevant. Now we need to figure out how to pass this down to the place where we create the UpdateOps.
[jira] [Commented] (OAK-10660) DocumentNodeStore: avoid repeated commits of :childOrder in branch commits
[ https://issues.apache.org/jira/browse/OAK-10660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820045#comment-17820045 ]

Julian Reschke commented on OAK-10660:
--------------------------------------

Here:

{noformat}
diff --git a/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreBranch.java b/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreBranch.java
index 1b04f62fa5..5cdf3901da 100644
--- a/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreBranch.java
+++ b/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreBranch.java
@@ -78,6 +78,9 @@ class DocumentNodeStoreBranch implements NodeStoreBranch {
     /** The maximum number of updates to keep in memory */
     private final int updateLimit;
 
+    /** Revisions written by us */
+    private final Set revisions = new HashSet<>();
+
     /**
      * State of the this branch. Either {@link Unmodified}, {@link InMemory}, {@link Persisted},
      * {@link ResetFailed} or {@link Merged}.
@@ -321,6 +324,7 @@ class DocumentNodeStoreBranch implements NodeStoreBranch {
                 c.apply();
                 rev = store.done(c, base.getRootRevision().isBranch(), info);
                 success = true;
+                revisions.add(c.getRevision());
             } finally {
                 if (!success) {
                     store.canceled(c);
{noformat}

would be a good place to track what revisions are relevant. Now we need to figure out how to pass this down to the place where we create the UpdateOps.

> DocumentNodeStore: avoid repeated commits of :childOrder in branch commits
> --------------------------------------------------------------------------
>
> Key: OAK-10660
> URL: https://issues.apache.org/jira/browse/OAK-10660
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: documentmk
> Reporter: Julian Reschke
> Assignee: Julian Reschke
> Priority: Major
>
> - While persisting the branch commits, we are persisting large :childOrder
> properties repeatedly. In practice, only the last value is needed, so the
> previous ones could be cleaned up.
> - We currently do not keep information about when (revision) and where (_id)
> we have set :childOrder.
> - The "clean" approach would be to maintain a map of _id/revision that tells
> us in which revision we last set :childOrder. That could be used to pair the
> setting of the new value with a removal of the previous one.
> - But we may be able to simplify that: just maintain a list of _all_
> revisions that changed :childOrder, and any time we need to set a new value
> for :childOrder, nuke the entries for all of these revisions. This would be
> harmless because an extra REMOVE_MAP_ENTRY operation is essentially free,
> except for a small overhead in processing.
[jira] [Comment Edited] (OAK-10657) MongoDocumentStore: shrink in-DB documents after updates fail due to 16MB limit
[ https://issues.apache.org/jira/browse/OAK-10657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819969#comment-17819969 ]

Julian Reschke edited comment on OAK-10657 at 2/23/24 12:39 PM:
----------------------------------------------------------------

Trying to write down an alternate approach...:

- While persisting the branch commits, we are persisting large :childOrder properties repeatedly. In practice, only the last value is needed, so the previous ones could be cleaned up.
- We currently do not keep information about when (revision) and where (_id) we have set :childOrder.
- The "clean" approach would be to maintain a map of _id/revision that tells us in which revision we last set :childOrder. That could be used to pair the setting of the new value with a removal of the previous one.
- But we may be able to simplify that: just maintain a list of _all_ revisions that changed :childOrder, and any time we need to set a new value for :childOrder, nuke the entries for all of these revisions. This would be harmless because an extra REMOVE_MAP_ENTRY operation is essentially free, except for a small overhead in processing.

EDIT: opened OAK-10660 to track this

> MongoDocumentStore: shrink in-DB documents after updates fail due to 16MB limit
> --------------------------------------------------------------------------------
>
> Key: OAK-10657
> URL: https://issues.apache.org/jira/browse/OAK-10657
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: documentmk, mongomk
> Reporter: Julian Reschke
> Assignee: Julian Reschke
> Priority: Major
>
> To address the 16MB/childorder issue, there are many potential approaches:
> - make GC more aggressive
> - try to change updates to remove "in-between" changes of ":childOrder"
> property
> - change the data model of ":childOrder"
> - try to shrink document in DB once size related exception happens
> This ticket is about the last of these options.
> Proposal:
> - improve exception thrown by document store so that it can be acted upon
> - in document store utils add a method that inspects a document and produces
> UpdateOps suitable to shrink the document
> - DocumentNodeStore commit could catch exception, obtain update ops, apply
> them, and retry once (this should be dependent on a feature toggle)
[jira] [Created] (OAK-10660) DocumentNodeStore: avoid repeated commits of :childOrder in branch commits
Julian Reschke created OAK-10660:
------------------------------------

Summary: DocumentNodeStore: avoid repeated commits of :childOrder in branch commits
Key: OAK-10660
URL: https://issues.apache.org/jira/browse/OAK-10660
Project: Jackrabbit Oak
Issue Type: Improvement
Components: documentmk
Reporter: Julian Reschke
Assignee: Julian Reschke

- While persisting the branch commits, we are persisting large :childOrder properties repeatedly. In practice, only the last value is needed, so the previous ones could be cleaned up.
- We currently do not keep information about when (revision) and where (_id) we have set :childOrder.
- The "clean" approach would be to maintain a map of _id/revision that tells us in which revision we last set :childOrder. That could be used to pair the setting of the new value with a removal of the previous one.
- But we may be able to simplify that: just maintain a list of _all_ revisions that changed :childOrder, and any time we need to set a new value for :childOrder, nuke the entries for all of these revisions. This would be harmless because an extra REMOVE_MAP_ENTRY operation is essentially free, except for a small overhead in processing.
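The simplified approach in the last bullet above can be sketched as follows. This is an illustration under assumptions, not Oak code: `MapEntryOp` and `ChildOrderTracker` are hypothetical types, revisions are plain strings, and the real `UpdateOp` API is not used. The tracker does not record which document (_id) a revision touched, so it never forgets a revision; it simply emits a removal for every revision it has seen, relying on the point made above that a superfluous REMOVE_MAP_ENTRY is harmless.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for one change to a revisioned map property. */
final class MapEntryOp {
    enum Type { SET_MAP_ENTRY, REMOVE_MAP_ENTRY }

    final Type type;
    final String property;
    final String revision;
    final String value; // null for removals

    MapEntryOp(Type type, String property, String revision, String value) {
        this.type = type;
        this.property = property;
        this.revision = revision;
        this.value = value;
    }
}

/** Remembers every revision in which this branch has written :childOrder. */
final class ChildOrderTracker {
    private static final String CHILD_ORDER = ":childOrder";

    // Insertion-ordered so that removals are emitted oldest first.
    private final Set<String> revisions = new LinkedHashSet<>();

    /**
     * Builds the operations for writing a new :childOrder value: one removal
     * per previously recorded revision, followed by the new entry.
     */
    List<MapEntryOp> setChildOrder(String newRevision, String value) {
        List<MapEntryOp> ops = new ArrayList<>();
        for (String old : revisions) {
            if (!old.equals(newRevision)) { // never remove the entry we are about to set
                ops.add(new MapEntryOp(MapEntryOp.Type.REMOVE_MAP_ENTRY, CHILD_ORDER, old, null));
            }
        }
        ops.add(new MapEntryOp(MapEntryOp.Type.SET_MAP_ENTRY, CHILD_ORDER, newRevision, value));
        revisions.add(newRevision);
        return ops;
    }
}
```

The trade-off against the "clean" _id/revision map is that the number of removals grows with the number of branch commits that touched :childOrder anywhere, in exchange for much less bookkeeping.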
[jira] [Commented] (OAK-10657) MongoDocumentStore: shrink in-DB documents after updates fail due to 16MB limit
[ https://issues.apache.org/jira/browse/OAK-10657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819969#comment-17819969 ]

Julian Reschke commented on OAK-10657:
--------------------------------------

Trying to write down an alternate approach...:

- While persisting the branch commits, we are persisting large :childOrder properties repeatedly. In practice, only the last value is needed, so the previous ones could be cleaned up.
- We currently do not keep information about when (revision) and where (_id) we have set :childOrder.
- The "clean" approach would be to maintain a map of _id/revision that tells us in which revision we last set :childOrder. That could be used to pair the setting of the new value with a removal of the previous one.
- But we may be able to simplify that: just maintain a list of _all_ revisions that changed :childOrder, and any time we need to set a new value for :childOrder, nuke the entries for all of these revisions. This would be harmless because an extra REMOVE_MAP_ENTRY operation is essentially free, except for a small overhead in processing.

> MongoDocumentStore: shrink in-DB documents after updates fail due to 16MB limit
> --------------------------------------------------------------------------------
>
> Key: OAK-10657
> URL: https://issues.apache.org/jira/browse/OAK-10657
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: documentmk, mongomk
> Reporter: Julian Reschke
> Assignee: Julian Reschke
> Priority: Major
>
> To address the 16MB/childorder issue, there are many potential approaches:
> - make GC more aggressive
> - try to change updates to remove "in-between" changes of ":childOrder"
> property
> - change the data model of ":childOrder"
> - try to shrink document in DB once size related exception happens
> This ticket is about the last of these options.
> Proposal:
> - improve exception thrown by document store so that it can be acted upon
> - in document store utils add a method that inspects a document and produces
> UpdateOps suitable to shrink the document
> - DocumentNodeStore commit could catch exception, obtain update ops, apply
> them, and retry once (this should be dependent on a feature toggle)