[jira] [Assigned] (OAK-6829) ExternalPrivateStoreIT/ExternalSharedStoreIT.testSyncBigBlob failures

2022-03-04 Thread Francesco Mari (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari reassigned OAK-6829:
---

Assignee: (was: Francesco Mari)

> ExternalPrivateStoreIT/ExternalSharedStoreIT.testSyncBigBlob failures
> -
>
> Key: OAK-6829
> URL: https://issues.apache.org/jira/browse/OAK-6829
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: 1.7.9
>Reporter: Julian Reschke
>Priority: Major
> Fix For: 1.7.11, 1.8.0
>
> Attachments: 
> TEST-org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT.xml, 
> TEST-org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT.xml, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt, 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT-output.txt
>
>
> {noformat}
> testSyncBigBlob(org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT)
>   Time elapsed: 27.921 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<{ root = { ... } }> but was:<{ root : { } 
> }>
> Running org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT
> Tests run: 11, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 93.353 sec 
> <<< FAILURE! - in 
> org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT
> testSyncBigBlob(org.apache.jackrabbit.oak.segment.standby.ExternalSharedStoreIT)
>   Time elapsed: 30.772 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<{ root = { ... } }> but was:<{ root : { } 
> }>
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (OAK-8555) Move ByteBuffer wrapper to oak-commons project

2019-08-21 Thread Francesco Mari (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-8555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari resolved OAK-8555.
-
Fix Version/s: 1.18.0
   Resolution: Fixed

Fixed at r1865623.

> Move ByteBuffer wrapper to oak-commons project
> --
>
> Key: OAK-8555
> URL: https://issues.apache.org/jira/browse/OAK-8555
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: commons, segment-tar
>Reporter: José Andrés Cordero Benítez
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.18.0
>
> Attachments: move-Buffer-wrapper-svn.patch, 
> move-Buffer-wrapper-with-tests.patch
>
>
> As discussed with [~frm] and [~mreutegg], the wrapper 
> org.apache.jackrabbit.oak.segment.spi.persistence.Buffer should be moved to 
> the commons project to make it more accessible in other projects that need to 
> use it to fix OAK-7457.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (OAK-8555) Move ByteBuffer wrapper to oak-commons project

2019-08-21 Thread Francesco Mari (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-8555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16912270#comment-16912270
 ] 

Francesco Mari commented on OAK-8555:
-

[~corderob], thanks for your contribution!

> Move ByteBuffer wrapper to oak-commons project
> --
>
> Key: OAK-8555
> URL: https://issues.apache.org/jira/browse/OAK-8555
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: commons, segment-tar
>Reporter: José Andrés Cordero Benítez
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.18.0
>
> Attachments: move-Buffer-wrapper-svn.patch, 
> move-Buffer-wrapper-with-tests.patch
>
>
> As discussed with [~frm] and [~mreutegg], the wrapper 
> org.apache.jackrabbit.oak.segment.spi.persistence.Buffer should be moved to 
> the commons project to make it more accessible in other projects that need to 
> use it to fix OAK-7457.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Assigned] (OAK-8555) Move ByteBuffer wrapper to oak-commons project

2019-08-21 Thread Francesco Mari (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-8555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari reassigned OAK-8555:
---

Assignee: Francesco Mari

> Move ByteBuffer wrapper to oak-commons project
> --
>
> Key: OAK-8555
> URL: https://issues.apache.org/jira/browse/OAK-8555
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: commons, segment-tar
>Reporter: José Andrés Cordero Benítez
>Assignee: Francesco Mari
>Priority: Major
> Attachments: move-Buffer-wrapper-svn.patch, 
> move-Buffer-wrapper-with-tests.patch
>
>
> As discussed with [~frm] and [~mreutegg], the wrapper 
> org.apache.jackrabbit.oak.segment.spi.persistence.Buffer should be moved to 
> the commons project to make it more accessible in other projects that need to 
> use it to fix OAK-7457.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (OAK-8555) Move ByteBuffer wrapper to oak-commons project

2019-08-21 Thread Francesco Mari (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-8555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16912264#comment-16912264
 ] 

Francesco Mari commented on OAK-8555:
-

[~corderob], the patch looks good to me. I will commit it in trunk.

> Move ByteBuffer wrapper to oak-commons project
> --
>
> Key: OAK-8555
> URL: https://issues.apache.org/jira/browse/OAK-8555
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: commons, segment-tar
>Reporter: José Andrés Cordero Benítez
>Priority: Major
> Attachments: move-Buffer-wrapper-svn.patch, 
> move-Buffer-wrapper-with-tests.patch
>
>
> As discussed with [~frm] and [~mreutegg], the wrapper 
> org.apache.jackrabbit.oak.segment.spi.persistence.Buffer should be moved to 
> the commons project to make it more accessible in other projects that need to 
> use it to fix OAK-7457.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (OAK-8559) Backport OAK-8066 to 1.10 and 1.8

2019-08-20 Thread Francesco Mari (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-8559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16911257#comment-16911257
 ] 

Francesco Mari commented on OAK-8559:
-

[~bhardwajrahul20], thanks for your contribution!

> Backport OAK-8066 to 1.10 and 1.8
> -
>
> Key: OAK-8559
> URL: https://issues.apache.org/jira/browse/OAK-8559
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Francesco Mari
>Priority: Blocker
>  Labels: TarMK
> Fix For: 1.8.16, 1.10.5
>
>
> Backport OAK-8066 to 1.10 and 1.8.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Resolved] (OAK-8559) Backport OAK-8066 to 1.10 and 1.8

2019-08-20 Thread Francesco Mari (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-8559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari resolved OAK-8559.
-
Resolution: Fixed

> Backport OAK-8066 to 1.10 and 1.8
> -
>
> Key: OAK-8559
> URL: https://issues.apache.org/jira/browse/OAK-8559
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Francesco Mari
>Priority: Blocker
>  Labels: TarMK
> Fix For: 1.8.16, 1.10.5
>
>
> Backport OAK-8066 to 1.10 and 1.8.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (OAK-8066) Nodes with many direct children can lead to OOME when saving

2019-08-20 Thread Francesco Mari (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-8066:

Fix Version/s: 1.8.16

> Nodes with many direct children can lead to OOME when saving
> 
>
> Key: OAK-8066
> URL: https://issues.apache.org/jira/browse/OAK-8066
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>Priority: Major
>  Labels: TarMK
> Fix For: 1.12.0, 1.8.16, 1.10.5
>
>
> {{DefaultSegmentWriter}} keeps a map of [child 
> nodes|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/DefaultSegmentWriter.java#L805]
>  of a node being written. This can lead to high memory consumption in the 
> case where many child nodes are added at the same time. The latter could 
> happen in the case where a node needs to be rewritten because of an increase 
> in the GC generation from a concurrently completed revision garbage 
> collection.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (OAK-8559) Backport OAK-8066 to 1.10 and 1.8

2019-08-20 Thread Francesco Mari (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-8559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16911255#comment-16911255
 ] 

Francesco Mari commented on OAK-8559:
-

Backported to 1.8 at 1865533.

> Backport OAK-8066 to 1.10 and 1.8
> -
>
> Key: OAK-8559
> URL: https://issues.apache.org/jira/browse/OAK-8559
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Francesco Mari
>Priority: Blocker
>  Labels: TarMK
> Fix For: 1.8.16, 1.10.5
>
>
> Backport OAK-8066 to 1.10 and 1.8.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (OAK-8066) Nodes with many direct children can lead to OOME when saving

2019-08-20 Thread Francesco Mari (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16911256#comment-16911256
 ] 

Francesco Mari commented on OAK-8066:
-

Backported to 1.8 at 1865533.

> Nodes with many direct children can lead to OOME when saving
> 
>
> Key: OAK-8066
> URL: https://issues.apache.org/jira/browse/OAK-8066
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>Priority: Major
>  Labels: TarMK
> Fix For: 1.12.0, 1.10.5
>
>
> {{DefaultSegmentWriter}} keeps a map of [child 
> nodes|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/DefaultSegmentWriter.java#L805]
>  of a node being written. This can lead to high memory consumption in the 
> case where many child nodes are added at the same time. The latter could 
> happen in the case where a node needs to be rewritten because of an increase 
> in the GC generation from a concurrently completed revision garbage 
> collection.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (OAK-8066) Nodes with many direct children can lead to OOME when saving

2019-08-20 Thread Francesco Mari (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-8066:

Fix Version/s: 1.10.5

> Nodes with many direct children can lead to OOME when saving
> 
>
> Key: OAK-8066
> URL: https://issues.apache.org/jira/browse/OAK-8066
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>Priority: Major
>  Labels: TarMK
> Fix For: 1.12.0, 1.10.5
>
>
> {{DefaultSegmentWriter}} keeps a map of [child 
> nodes|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/DefaultSegmentWriter.java#L805]
>  of a node being written. This can lead to high memory consumption in the 
> case where many child nodes are added at the same time. The latter could 
> happen in the case where a node needs to be rewritten because of an increase 
> in the GC generation from a concurrently completed revision garbage 
> collection.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (OAK-8559) Backport OAK-8066 to 1.10 and 1.8

2019-08-20 Thread Francesco Mari (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-8559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16911233#comment-16911233
 ] 

Francesco Mari commented on OAK-8559:
-

Backported to 1.10 at 1865531.

> Backport OAK-8066 to 1.10 and 1.8
> -
>
> Key: OAK-8559
> URL: https://issues.apache.org/jira/browse/OAK-8559
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Francesco Mari
>Priority: Blocker
>  Labels: TarMK
> Fix For: 1.8.16, 1.10.5
>
>
> Backport OAK-8066 to 1.10 and 1.8.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (OAK-8066) Nodes with many direct children can lead to OOME when saving

2019-08-20 Thread Francesco Mari (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16911234#comment-16911234
 ] 

Francesco Mari commented on OAK-8066:
-

Backported to 1.10 at 1865531.

> Nodes with many direct children can lead to OOME when saving
> 
>
> Key: OAK-8066
> URL: https://issues.apache.org/jira/browse/OAK-8066
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>Priority: Major
>  Labels: TarMK
> Fix For: 1.12.0
>
>
> {{DefaultSegmentWriter}} keeps a map of [child 
> nodes|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/DefaultSegmentWriter.java#L805]
>  of a node being written. This can lead to high memory consumption in the 
> case where many child nodes are added at the same time. The latter could 
> happen in the case where a node needs to be rewritten because of an increase 
> in the GC generation from a concurrently completed revision garbage 
> collection.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (OAK-8559) Backport OAK-8066 to 1.10 and 1.8

2019-08-20 Thread Francesco Mari (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-8559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-8559:

Labels: TarMK  (was: )

> Backport OAK-8066 to 1.10 and 1.8
> -
>
> Key: OAK-8559
> URL: https://issues.apache.org/jira/browse/OAK-8559
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Francesco Mari
>Priority: Blocker
>  Labels: TarMK
> Fix For: 1.8.16, 1.10.5
>
>
> Backport OAK-8066 to 1.10 and 1.8.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (OAK-8559) Backport OAK-8066 to 1.10 and 1.8

2019-08-20 Thread Francesco Mari (Jira)
Francesco Mari created OAK-8559:
---

 Summary: Backport OAK-8066 to 1.10 and 1.8
 Key: OAK-8559
 URL: https://issues.apache.org/jira/browse/OAK-8559
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: segment-tar
Reporter: Francesco Mari
Assignee: Francesco Mari
 Fix For: 1.8.16, 1.10.5


Backport OAK-8066 to 1.10 and 1.8.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (OAK-8555) Move ByteBuffer wrapper to oak-commons project

2019-08-19 Thread Francesco Mari (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-8555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16910476#comment-16910476
 ] 

Francesco Mari commented on OAK-8555:
-

[~corderob], moreover, when you adjust the package statement and try to 
compile the project, you will see that the baseline plugin starts 
complaining. Adding a class to an existing package forces you to increase the 
minor component of the package version. You will find the package version in 
`package-info.java`.
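
For illustration, a minimal sketch of what the baseline plugin expects; the package name and version numbers below are invented, only the minor-version bump pattern matters:

{code:java}
// package-info.java -- hypothetical package and versions for illustration.
// After adding a class to an already exported package, the baseline plugin
// requires the minor component of the package version to be bumped,
// e.g. from 1.2.0 to 1.3.0.
@Version("1.3.0")
package org.apache.jackrabbit.oak.commons.buffer;

import org.osgi.annotation.versioning.Version;
{code}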

> Move ByteBuffer wrapper to oak-commons project
> --
>
> Key: OAK-8555
> URL: https://issues.apache.org/jira/browse/OAK-8555
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: commons, segment-tar
>Reporter: José Andrés Cordero Benítez
>Priority: Major
> Attachments: move-Buffer-wrapper-svn.patch
>
>
> As discussed with [~frm] and [~mreutegg], the wrapper 
> org.apache.jackrabbit.oak.segment.spi.persistence.Buffer should be moved to 
> the commons project to make it more accessible in other projects that need to 
> use it to fix OAK-7457.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Comment Edited] (OAK-8555) Move ByteBuffer wrapper to oak-commons project

2019-08-19 Thread Francesco Mari (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-8555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16910476#comment-16910476
 ] 

Francesco Mari edited comment on OAK-8555 at 8/19/19 3:13 PM:
--

[~corderob], moreover, when you adjust the package statement and try to 
compile the project, you will see that the baseline plugin starts 
complaining. Adding a class to an existing package forces you to increase the 
minor component of the package version. You will find the package version in 
{{package-info.java}}.


was (Author: frm):
[~corderob], moreover, when you adjust the package statement and try to 
compile the project, you will see that the baseline plugin starts 
complaining. Adding a class to an existing package forces you to increase the 
minor component of the package version. You will find the package version in 
`package-info.java`.

> Move ByteBuffer wrapper to oak-commons project
> --
>
> Key: OAK-8555
> URL: https://issues.apache.org/jira/browse/OAK-8555
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: commons, segment-tar
>Reporter: José Andrés Cordero Benítez
>Priority: Major
> Attachments: move-Buffer-wrapper-svn.patch
>
>
> As discussed with [~frm] and [~mreutegg], the wrapper 
> org.apache.jackrabbit.oak.segment.spi.persistence.Buffer should be moved to 
> the commons project to make it more accessible in other projects that need to 
> use it to fix OAK-7457.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (OAK-8482) Remove false positives of SNFE on azure execution time out

2019-07-17 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16887151#comment-16887151
 ] 

Francesco Mari commented on OAK-8482:
-

[~ierandra], [~dulceanu], the patch looks good to me. [~dulceanu], will you 
take care of the commit?

> Remove false positives of SNFE on azure execution time out
> --
>
> Key: OAK-8482
> URL: https://issues.apache.org/jira/browse/OAK-8482
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-azure, segment-tar
>Reporter: Ieran Draghiciu
>Assignee: Andrei Dulceanu
>Priority: Major
> Attachments: OAK-8482-03.patch, OAK-8482-04.patch, 
> OAK-SNFE-false-positives-02.diff
>
>
> When a read from a Tar file times out, an SNFE is thrown and the SNFE metric 
> is incorrectly increased.
> We need to not increase the metric in that case and to add extra logging to 
> the SNFE.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (OAK-8482) Remove false positives of SNFE on azure execution time out

2019-07-16 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885942#comment-16885942
 ] 

Francesco Mari commented on OAK-8482:
-

[~ierandra], I simplified the patch by avoiding unnecessary wrapping and 
unwrapping of the {{RepositoryNotReachableException}}. Please have a look 
together with [~dulceanu]. Is there a way to add at least a test to assert that 
this patch doesn't create any regression?

> Remove false positives of SNFE on azure execution time out
> --
>
> Key: OAK-8482
> URL: https://issues.apache.org/jira/browse/OAK-8482
> Project: Jackrabbit Oak
>  Issue Type: Bug
>Reporter: Ieran Draghiciu
>Priority: Major
> Attachments: OAK-8482-03.patch, OAK-SNFE-false-positives-02.diff
>
>
> When a read from a Tar file times out, an SNFE is thrown and the SNFE metric 
> is incorrectly increased.
> We need to not increase the metric in that case and to add extra logging to 
> the SNFE.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (OAK-8482) Remove false positives of SNFE on azure execution time out

2019-07-16 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-8482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-8482:

Attachment: OAK-8482-03.patch

> Remove false positives of SNFE on azure execution time out
> --
>
> Key: OAK-8482
> URL: https://issues.apache.org/jira/browse/OAK-8482
> Project: Jackrabbit Oak
>  Issue Type: Bug
>Reporter: Ieran Draghiciu
>Priority: Major
> Attachments: OAK-8482-03.patch, OAK-SNFE-false-positives-02.diff
>
>
> When a read from a Tar file times out, an SNFE is thrown and the SNFE metric 
> is incorrectly increased.
> We need to not increase the metric in that case and to add extra logging to 
> the SNFE.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (OAK-8429) oak-run check should expose repository statistics for the last good revision

2019-06-28 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874760#comment-16874760
 ] 

Francesco Mari commented on OAK-8429:
-

[~dulceanu], I don't think that there is a better way to do that. The patch 
looks good to me.

> oak-run check should expose repository statistics for the last good revision
> 
>
> Key: OAK-8429
> URL: https://issues.apache.org/jira/browse/OAK-8429
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: oak-run, segment-tar
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Minor
>  Labels: tooling
> Fix For: 1.16.0
>
> Attachments: OAK-8429.patch
>
>
> {{oak-run check}} should expose the head node and property counts for the 
> last good revision. Currently these are only logged at the end of the check 
> operation as
> {noformat}
> Checked X nodes and Y properties.{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Deleted] (OAK-8417) SNFE after adding AzurePersistence timeouts

2019-06-20 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-8417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari deleted OAK-8417:



> SNFE after adding AzurePersistence timeouts
> ---
>
> Key: OAK-8417
> URL: https://issues.apache.org/jira/browse/OAK-8417
> Project: Jackrabbit Oak
>  Issue Type: Bug
>Reporter: Ieran Draghiciu
>Priority: Major
>
> After adding Azure persistence timeouts in 
> https://issues.apache.org/jira/browse/OAK-8406, SNFEs are generated by Golden 
> Master Publish during replication:
> {code:java}
> 19.06.2019 12:45:13.337 *ERROR* [pool-13-thread-5] 
> org.apache.jackrabbit.oak.segment.azure.queue.SegmentWriteQueue Can't persist 
> the segment aece1873-b677-4f07-ad48-bf624d91c666
> java.io.IOException: com.microsoft.azure.storage.StorageException: The client 
> could not finish the operation within specified maximum execution timeout.
>   at 
> org.apache.jackrabbit.oak.segment.azure.AzureSegmentArchiveWriter.doWriteEntry(AzureSegmentArchiveWriter.java:96)
>  [org.apache.jackrabbit.oak-segment-azure:1.14.0.20190617]
>   at 
> org.apache.jackrabbit.oak.segment.azure.queue.SegmentWriteAction.passTo(SegmentWriteAction.java:55)
>  [org.apache.jackrabbit.oak-segment-azure:1.14.0.20190617]
>   at 
> org.apache.jackrabbit.oak.segment.azure.queue.SegmentWriteQueue.consume(SegmentWriteQueue.java:111)
>  [org.apache.jackrabbit.oak-segment-azure:1.14.0.20190617]
>   at 
> org.apache.jackrabbit.oak.segment.azure.queue.SegmentWriteQueue.consume(SegmentWriteQueue.java:105)
>  [org.apache.jackrabbit.oak-segment-azure:1.14.0.20190617]
>   at 
> org.apache.jackrabbit.oak.segment.azure.queue.SegmentWriteQueue.mainLoop(SegmentWriteQueue.java:84)
>  [org.apache.jackrabbit.oak-segment-azure:1.14.0.20190617]
> {code}
> See full logs for golden publish attached.
> SNFEs are also present on publishers. See full logs attached.
> During the consistency check (done by publish pharmer) no SNFEs are found and 
> the job passes. See full logs attached.
> cc [~dulceanu] [~frm] [~tomek.rekawek]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (OAK-8412) AzurePersistence should specify a retry strategy compatible with the configured timeouts

2019-06-18 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-8412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari resolved OAK-8412.
-
Resolution: Fixed

Fixed at r1861581.

> AzurePersistence should specify a retry strategy compatible with the 
> configured timeouts
> 
>
> Key: OAK-8412
> URL: https://issues.apache.org/jira/browse/OAK-8412
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>Reporter: Francesco Mari
>Assignee: Francesco Mari
>Priority: Major
> Attachments: OAK-8412-01.patch
>
>
> {{AzurePersistence}} specifies timeouts both for the whole request and for 
> the time Azure Storage may take to come back with a response. Since no retry 
> policy is specified, the client uses a default exponential retry policy with 
> a backoff of 30s and 3 retries, which doesn't fit our request timeout of 30s. 
> A retry policy compatible with these timeouts should be defined instead.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (OAK-8412) AzurePersistence should specify a retry strategy compatible with the configured timeouts

2019-06-18 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-8412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-8412:

Attachment: OAK-8412-01.patch

> AzurePersistence should specify a retry strategy compatible with the 
> configured timeouts
> 
>
> Key: OAK-8412
> URL: https://issues.apache.org/jira/browse/OAK-8412
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>Reporter: Francesco Mari
>Assignee: Francesco Mari
>Priority: Major
> Attachments: OAK-8412-01.patch
>
>
> {{AzurePersistence}} specifies timeouts both for the whole request and for 
> the time Azure Storage may take to come back with a response. Since no retry 
> policy is specified, the client uses a default exponential retry policy with 
> a backoff of 30s and 3 retries, which doesn't fit our request timeout of 30s. 
> A retry policy compatible with these timeouts should be defined instead.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (OAK-8412) AzurePersistence should specify a retry strategy compatible with the configured timeouts

2019-06-18 Thread Francesco Mari (JIRA)
Francesco Mari created OAK-8412:
---

 Summary: AzurePersistence should specify a retry strategy 
compatible with the configured timeouts
 Key: OAK-8412
 URL: https://issues.apache.org/jira/browse/OAK-8412
 Project: Jackrabbit Oak
  Issue Type: Improvement
Reporter: Francesco Mari
Assignee: Francesco Mari


{{AzurePersistence}} specifies timeouts both for the whole request and for the 
time Azure Storage may take to come back with a response. Since no retry 
policy is specified, the client uses a default exponential retry policy with a 
backoff of 30s and 3 retries, which doesn't fit our request timeout of 30s. A 
retry policy compatible with these timeouts should be defined instead.
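
As a rough sketch of the direction (not the actual OAK-8412 patch), the legacy Azure Storage SDK allows installing a retry policy whose backoff fits inside the maximum execution time; the concrete numbers below are illustrative:

{code:java}
import com.microsoft.azure.storage.RetryLinearRetry;
import com.microsoft.azure.storage.blob.BlobRequestOptions;
import com.microsoft.azure.storage.blob.CloudBlobClient;

public class RetryPolicySketch {

    // Hypothetical helper: align retries with a 30s client-side deadline so
    // that retries can actually run before the deadline expires.
    public static void configure(CloudBlobClient client) {
        BlobRequestOptions options = client.getDefaultRequestOptions();
        // Linear retry: 500ms between attempts, at most 5 attempts.
        options.setRetryPolicyFactory(new RetryLinearRetry(500, 5));
        // Cap the whole operation, retries included, at 30s.
        options.setMaximumExecutionTimeInMs(30 * 1000);
    }
}
{code}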



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (OAK-8410) AzurePersistence throws an NPE when reacting to RequestCompletedEvent

2019-06-18 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-8410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari resolved OAK-8410.
-
   Resolution: Fixed
Fix Version/s: 1.16.0

Fixed at r1861576.

> AzurePersistence throws an NPE when reacting to RequestCompletedEvent
> -
>
> Key: OAK-8410
> URL: https://issues.apache.org/jira/browse/OAK-8410
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-azure
>Reporter: Francesco Mari
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.16.0
>
> Attachments: OAK-8410-01.patch
>
>
> The {{RequestCompletedEvent}} might have a {{null}} start date or end date. 
> The handler registered by {{AzurePersistence}} should guard against this when 
> notifying the {{RemoteStoreMonitor}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (OAK-8410) AzurePersistence throws an NPE when reacting to RequestCompletedEvent

2019-06-18 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-8410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-8410:

Attachment: OAK-8410-01.patch

> AzurePersistence throws an NPE when reacting to RequestCompletedEvent
> -
>
> Key: OAK-8410
> URL: https://issues.apache.org/jira/browse/OAK-8410
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-azure
>Reporter: Francesco Mari
>Assignee: Francesco Mari
>Priority: Major
> Attachments: OAK-8410-01.patch
>
>
> The {{RequestCompletedEvent}} might have a {{null}} start date or end date. 
> The handler registered by {{AzurePersistence}} should guard against this when 
> notifying the {{RemoteStoreMonitor}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (OAK-8410) AzurePersistence throws an NPE when reacting to RequestCompletedEvent

2019-06-18 Thread Francesco Mari (JIRA)
Francesco Mari created OAK-8410:
---

 Summary: AzurePersistence throws an NPE when reacting to 
RequestCompletedEvent
 Key: OAK-8410
 URL: https://issues.apache.org/jira/browse/OAK-8410
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: segment-azure
Reporter: Francesco Mari
Assignee: Francesco Mari


The {{RequestCompletedEvent}} might have a {{null}} start date or end date. The 
handler registered by {{AzurePersistence}} should guard against this when 
notifying the {{RemoteStoreMonitor}}.
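
A minimal sketch of such a guard, assuming a monitor callback shaped like {{requestDuration(long, TimeUnit)}} (the exact {{RemoteStoreMonitor}} signature is an assumption here, not quoted from Oak):

{code:java}
import java.util.Date;
import java.util.concurrent.TimeUnit;

import com.microsoft.azure.storage.RequestCompletedEvent;
import com.microsoft.azure.storage.RequestResult;

public class DurationGuardSketch {

    // Assumed shape of the monitor callback; not the verbatim Oak interface.
    public interface DurationMonitor {
        void requestDuration(long duration, TimeUnit unit);
    }

    // Only report a request duration when both dates are actually present.
    public static void onCompleted(RequestCompletedEvent event, DurationMonitor monitor) {
        RequestResult result = event.getRequestResult();
        Date start = result.getStartDate();
        Date stop = result.getStopDate();
        if (start != null && stop != null) {
            monitor.requestDuration(stop.getTime() - start.getTime(), TimeUnit.MILLISECONDS);
        }
    }
}
{code}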



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (OAK-8406) AzurePersistence issues requests without timeout

2019-06-17 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari resolved OAK-8406.
-
Resolution: Fixed

Fixed at r1861517.

> AzurePersistence issues requests without timeout
> 
>
> Key: OAK-8406
> URL: https://issues.apache.org/jira/browse/OAK-8406
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-azure
>Reporter: Francesco Mari
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.16.0
>
> Attachments: OAK-8406-001.patch
>
>
> {{AzurePersistence}} doesn't set any timeout for the requests issued to Azure 
> Storage. The implementation should set default timeouts unless they are 
> specified by the user of {{AzurePersistence}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (OAK-8406) AzurePersistence issues requests without timeout

2019-06-17 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-8406:

Attachment: OAK-8406-001.patch

> AzurePersistence issues requests without timeout
> 
>
> Key: OAK-8406
> URL: https://issues.apache.org/jira/browse/OAK-8406
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-azure
>Reporter: Francesco Mari
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.16.0
>
> Attachments: OAK-8406-001.patch
>
>
> {{AzurePersistence}} doesn't set any timeout for the requests issued to Azure 
> Storage. The implementation should set default timeouts unless they are 
> specified by the user of {{AzurePersistence}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (OAK-8406) AzurePersistence issues requests without timeout

2019-06-17 Thread Francesco Mari (JIRA)
Francesco Mari created OAK-8406:
---

 Summary: AzurePersistence issues requests without timeout
 Key: OAK-8406
 URL: https://issues.apache.org/jira/browse/OAK-8406
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: segment-azure
Reporter: Francesco Mari
Assignee: Francesco Mari
 Fix For: 1.16.0


{{AzurePersistence}} doesn't set any timeout for the requests issued to Azure 
Storage. The implementation should set default timeouts unless they are 
specified by the user of {{AzurePersistence}}.
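
For context, a sketch of what such defaults could look like with the legacy Azure Storage SDK; the 30s values are illustrative, not necessarily those chosen in the fix:

{code:java}
import com.microsoft.azure.storage.blob.BlobRequestOptions;
import com.microsoft.azure.storage.blob.CloudBlobClient;

public class DefaultTimeoutsSketch {

    // Hypothetical defaults applied to every request made through the client.
    public static void applyDefaults(CloudBlobClient client) {
        BlobRequestOptions options = client.getDefaultRequestOptions();
        // Server-side timeout for a single attempt of a request.
        options.setTimeoutIntervalInMs(30 * 1000);
        // Client-side cap on the whole operation, across all retries.
        options.setMaximumExecutionTimeInMs(30 * 1000);
    }
}
{code}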



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-8366) Add remote store monitoring for Azure

2019-06-07 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858731#comment-16858731
 ] 

Francesco Mari commented on OAK-8366:
-

[~ierandra], the patch looks good to me, but there seems to be a bug somewhere. 
When I run the tests for the oak-segment-tar module I get a bunch of NPEs. I 
used this command:

{noformat}
mvn clean verify -P integration-testing -pl oak-segment-tar,oak-segment-azure
{noformat}

> Add remote store monitoring  for Azure
> --
>
> Key: OAK-8366
> URL: https://issues.apache.org/jira/browse/OAK-8366
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: segment-azure
>Reporter: Ieran Draghiciu
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: TarMK
> Attachments: azure_remote_store_monitor_1.patch, 
> azure_remote_store_monitorv2.patch, azure_remote_store_monitorv3.patch
>
>
> Add remote store monitoring
> Implement remote store monitoring for the Azure store. This should include:
> - request_count : number of requests to the Azure store
> - error_count : number of failed requests to the Azure store
> - duration : duration of a request to the Azure store in nanoseconds



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-8366) Add remote store monitoring for Azure

2019-06-07 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858656#comment-16858656
 ] 

Francesco Mari commented on OAK-8366:
-

[~ierandra], I think you don't need 
{{getGlobalErrorReceivingResponseEventHandler}}. You should be able to figure 
out whether a request failed by registering an event handler at 
{{OperationContext#getGlobalRequestCompletedEventHandler}} and checking whether 
the {{RequestResult}} has a non-null {{RequestResult#getException}}. I suggest 
trying this approach first instead of upgrading the SDK, in order to avoid 
potential problems that might occur after the upgrade. We can track the upgrade 
of the SDK in a separate issue, if needed.
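
A sketch of the suggested approach; the counters below are placeholders standing in for whatever monitor ends up consuming these events:

{code:java}
import java.util.concurrent.atomic.AtomicLong;

import com.microsoft.azure.storage.OperationContext;
import com.microsoft.azure.storage.RequestCompletedEvent;
import com.microsoft.azure.storage.StorageEvent;

public class RequestOutcomeSketch {

    private static final AtomicLong REQUESTS = new AtomicLong();
    private static final AtomicLong ERRORS = new AtomicLong();

    // Register a global listener and classify each completed request by
    // looking at the exception attached to its RequestResult.
    public static void register() {
        OperationContext.getGlobalRequestCompletedEventHandler().addListener(
                new StorageEvent<RequestCompletedEvent>() {
                    @Override
                    public void eventOccurred(RequestCompletedEvent event) {
                        REQUESTS.incrementAndGet();
                        if (event.getRequestResult().getException() != null) {
                            ERRORS.incrementAndGet();
                        }
                    }
                });
    }
}
{code}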

> Add remote store monitoring  for Azure
> --
>
> Key: OAK-8366
> URL: https://issues.apache.org/jira/browse/OAK-8366
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: segment-azure
>Reporter: Ieran Draghiciu
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: TarMK
> Attachments: azure_remote_store_monitor_1.patch, 
> azure_remote_store_monitorv2.patch
>
>
> Add remote store monitoring
> Implement remote store monitoring for the Azure store. This should include:
> - request_count : number of requests to the Azure store
> - error_count : number of failed requests to the Azure store
> - duration : duration of a request to the Azure store in nanoseconds



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-8366) Add remote store monitoring for Azure

2019-06-03 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16854467#comment-16854467
 ] 

Francesco Mari commented on OAK-8366:
-

[~ierandra], the work done so far looks good to me. I would slightly simplify 
the implementation by removing {{CompositeRemoteStoreMonitor}}. It's not used 
anywhere except in {{FileStoreBuilder}}. I would change the semantics of 
{{FileStoreBuilder#withRemoteStoreMonitor}} to only accept a single instance of 
{{RemoteStoreMonitor}} and let the caller decide whether they want to pass a 
special implementation that is the composition of many instances. Until we 
have that problem, I would keep the code simple and remove 
{{CompositeRemoteStoreMonitor}} altogether.
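
Purely as an illustration of the caller-side composition suggested above; the {{Monitor}} interface below is a stand-in whose method names are guessed from the metrics listed in this issue, not copied from the actual {{RemoteStoreMonitor}} API:

{code:java}
import java.util.concurrent.TimeUnit;

public class CompositionSketch {

    // Stand-in for the monitor interface; names are assumptions.
    public interface Monitor {
        void requestCount();
        void requestError();
        void requestDuration(long duration, TimeUnit unit);
    }

    // A caller can compose two monitors without any support from Oak.
    public static Monitor compose(Monitor a, Monitor b) {
        return new Monitor() {
            @Override public void requestCount() { a.requestCount(); b.requestCount(); }
            @Override public void requestError() { a.requestError(); b.requestError(); }
            @Override public void requestDuration(long d, TimeUnit u) {
                a.requestDuration(d, u);
                b.requestDuration(d, u);
            }
        };
    }
}
{code}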

> Add remote store monitoring  for Azure
> --
>
> Key: OAK-8366
> URL: https://issues.apache.org/jira/browse/OAK-8366
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: segment-azure
>Reporter: Ieran Draghiciu
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: TarMK
> Attachments: azure_remote_store_monitor_1.patch
>
>
> Add remote store monitoring
> Implement remote store monitoring for the Azure store. This should include:
> - request_count : number of requests to the Azure store
> - error_count : number of failed requests to the Azure store
> - duration : duration of a request to the Azure store in nanoseconds



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-8243) Expose the number of SNFEs as metric

2019-04-18 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820938#comment-16820938
 ] 

Francesco Mari commented on OAK-8243:
-

[~dulceanu], LGTM. I think it's the simplest solution to this problem. We can 
always abstract and generalize later, if needed.

> Expose the number of SNFEs as metric
> 
>
> Key: OAK-8243
> URL: https://issues.apache.org/jira/browse/OAK-8243
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: segment-tar
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: OAK-8243.patch
>
>
> I want to expose the number of {{SegmentNotFoundException}}s as metric.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-8186) Create API in OAK for file access to binaries in the repository.

2019-04-10 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814195#comment-16814195
 ] 

Francesco Mari commented on OAK-8186:
-

[~mattvryan], my rule of thumb is that if something can be implemented below 
JCR without introducing new, custom APIs, it should be implemented there. If 
not, it should probably be implemented on top of JCR.

It might be that this need for speed when accessing binaries is an actual 
requirement of the application [~hsagi...@gmail.com] is implementing. In this 
case, [~hsagi...@gmail.com] should probably choose different storage mediums 
for the different kinds of data he is manipulating. Maybe he should store some 
data directly on the file system, and just store a reference to it (e.g. file 
paths) in JCR. If a user needs certain data access patterns that are not 
provided by JCR, that user is better off implementing those data access 
patterns in his own code.

> Create API in OAK for file access to binaries in the repository.
> 
>
> Key: OAK-8186
> URL: https://issues.apache.org/jira/browse/OAK-8186
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>Reporter: Henry Saginor
>Priority: Major
> Attachments: FileCopyTest3.java, OAK File Access.jpg, 
> fileCopyTest-0.0.1-SNAPSHOT.jar
>
>
> To get file access applications normally write binaries to temp files. It 
> would be nice if an API existed to get file access directly from OAK. This 
> might also meet some use cases documented at 
> [https://wiki.apache.org/jackrabbit/JCR%20Binary%20Usecase]
> Suggested API and implementation can be found here [1]. Also, see attached 
> diagram [2].
> I can create a patch if I can get some feedback. Note that suggested API 
> makes it explicit that a temp file is created. I am not sure if direct access 
> to files in datasore would be safe. But I am open to suggestions.
> [1]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-api/src/main/java/org/apache/jackrabbit/oak/api/blob/FileReferencable.java]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-api/src/main/java/org/apache/jackrabbit/oak/api/blob/TempFileReference.java]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-api/src/main/java/org/apache/jackrabbit/oak/api/blob/TempFileReferenceProvider.java]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/datastore/FileDSBlobTempFileReference.java]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/datastore/DataStoreBlobStore.java]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentBlob.java]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-store-spi/src/main/java/org/apache/jackrabbit/oak/plugins/value/jcr/BinaryImpl.java]
> [2]
> !OAK File Access.jpg!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-6947) Add package export versions for oak-store-spi

2019-04-09 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813535#comment-16813535
 ] 

Francesco Mari commented on OAK-6947:
-

I have no strong preference about the oak-store-spi module, and the patch, 
being more than a year old, will probably need to be reworked. Anyone is free 
to pick this up.

> Add package export versions for oak-store-spi
> -
>
> Key: OAK-6947
> URL: https://issues.apache.org/jira/browse/OAK-6947
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: store-spi
>Reporter: angela
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: OAK-6947.patch
>
>
> [~mduerig], [~mreutegg], [~frm], [~stillalex], do you have any strong 
> preferences wrt the packages we placed in the _oak-store-spi_ module?
> Currently we explicitly export all packages and I think it would make sense 
> to enable the baseline plugin for these packages.
> Any objection from your side?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (OAK-6947) Add package export versions for oak-store-spi

2019-04-09 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari reassigned OAK-6947:
---

Assignee: (was: Francesco Mari)

> Add package export versions for oak-store-spi
> -
>
> Key: OAK-6947
> URL: https://issues.apache.org/jira/browse/OAK-6947
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: store-spi
>Reporter: angela
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: OAK-6947.patch
>
>
> [~mduerig], [~mreutegg], [~frm], [~stillalex], do you have any strong 
> preferences wrt the packages we placed in the _oak-store-spi_ module?
> Currently we explicitly export all packages and I think it would make sense 
> to enable the baseline plugin for these packages.
> Any objection from your side?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-8186) Create API in OAK for file access to binaries in the repository.

2019-04-09 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813403#comment-16813403
 ] 

Francesco Mari commented on OAK-8186:
-

Surely the lower you implement the copy operation, the better the performance 
will be. The point is not which kind of copy operation is going to be the 
fastest, but how tight the coupling between your code and Oak's internals is 
going to be.

If you build your system on top of the JCR API, you are going to be safe no 
matter what implementation of the JCR API you use. If you use a custom API to 
access the internals of Oak, you are bound to a specific configuration of Oak, 
and of Oak only. Moreover, this is going to create a backwards bind. 
Assuming that we implement this feature in Oak, we will not be able to drop 
support for this feature if we decide to change the internals of our 
implementation. What if, at some point, we decide to save binaries on disk in 
chunks instead of in one big file? We will not be able to do so without 
breaking the contract for fast copies.

Oak lives below JCR; your system lives on top of it. I'm not comfortable breaking 
this boundary. I wasn't when the direct binary access was introduced, and I'm 
definitely not comfortable now.

> Create API in OAK for file access to binaries in the repository.
> 
>
> Key: OAK-8186
> URL: https://issues.apache.org/jira/browse/OAK-8186
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>Reporter: Henry Saginor
>Priority: Major
> Attachments: FileCopyTest3.java, OAK File Access.jpg, 
> fileCopyTest-0.0.1-SNAPSHOT.jar
>
>
> To get file access applications normally write binaries to temp files. It 
> would be nice if an API existed to get file access directly from OAK. This 
> might also meet some use cases documented at 
> [https://wiki.apache.org/jackrabbit/JCR%20Binary%20Usecase]
> Suggested API and implementation can be found here [1]. Also, see attached 
> diagram [2].
> I can create a patch if I can get some feedback. Note that suggested API 
> makes it explicit that a temp file is created. I am not sure if direct access 
> to files in datasore would be safe. But I am open to suggestions.
> [1]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-api/src/main/java/org/apache/jackrabbit/oak/api/blob/FileReferencable.java]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-api/src/main/java/org/apache/jackrabbit/oak/api/blob/TempFileReference.java]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-api/src/main/java/org/apache/jackrabbit/oak/api/blob/TempFileReferenceProvider.java]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/datastore/FileDSBlobTempFileReference.java]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/datastore/DataStoreBlobStore.java]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentBlob.java]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-store-spi/src/main/java/org/apache/jackrabbit/oak/plugins/value/jcr/BinaryImpl.java]
> [2]
> !OAK File Access.jpg!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (OAK-8202) RemoteBlobProcessor should print a stack trace of the exceptions it swallows

2019-04-08 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-8202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari resolved OAK-8202.
-
   Resolution: Fixed
Fix Version/s: 1.10.3

Fixed at r1857109.

> RemoteBlobProcessor should print a stack trace of the exceptions it swallows
> 
>
> Key: OAK-8202
> URL: https://issues.apache.org/jira/browse/OAK-8202
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.10.3
>
>
> In order to cope with the dryness of the {{BlobStore}} and {{DataStore}} API, 
> {{RemoteBlobProcessor#shouldFetchBinary}} relies on exceptions to implement 
> part of its logic. While this is a bad development practice, it was the only 
> way to cope with in-memory binary IDs without spending excessive amounts of 
> network and CPU. To improve the transparency of the system, 
> {{RemoteBlobProcessor}} should print a message at WARN level every time it 
> swallows an exception.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (OAK-8202) RemoteBlobProcessor should print a stack trace of the exceptions it swallows

2019-04-08 Thread Francesco Mari (JIRA)
Francesco Mari created OAK-8202:
---

 Summary: RemoteBlobProcessor should print a stack trace of the 
exceptions it swallows
 Key: OAK-8202
 URL: https://issues.apache.org/jira/browse/OAK-8202
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: segment-tar
Reporter: Francesco Mari
Assignee: Francesco Mari


In order to cope with the dryness of the {{BlobStore}} and {{DataStore}} API, 
{{RemoteBlobProcessor#shouldFetchBinary}} relies on exceptions to implement 
part of its logic. While this is a bad development practice, it was the only 
way to cope with in-memory binary IDs without spending excessive amounts of 
network and CPU. To improve the transparency of the system, 
{{RemoteBlobProcessor}} should print a message at WARN level every time it 
swallows an exception.
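
A minimal sketch of the intended logging; {{BlobLookup}} is a hypothetical stand-in for the underlying store call, and only the WARN-with-stack-trace pattern is the point:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class SwallowedExceptionSketch {

    private static final Logger LOG = LoggerFactory.getLogger(SwallowedExceptionSketch.class);

    // Hypothetical lookup abstraction, standing in for the BlobStore call
    // whose failure is part of the control flow.
    public interface BlobLookup {
        String resolve(String blobId) throws Exception;
    }

    // Instead of silently swallowing the exception, log it at WARN level
    // together with its stack trace before falling back.
    public static boolean shouldFetchBinary(BlobLookup lookup, String blobId) {
        try {
            return lookup.resolve(blobId) == null;
        } catch (Exception e) {
            LOG.warn("Unable to resolve blob {} locally, assuming it is remote", blobId, e);
            return true;
        }
    }
}
{code}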



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-8186) Create API in OAK for file access to binaries in the repository.

2019-04-04 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809661#comment-16809661
 ] 

Francesco Mari commented on OAK-8186:
-

This use case can be implemented on top of the JCR API by spooling the binary 
to a temporary location, with the added benefit that it will be portable across 
different Blob Store and Data Store implementations.
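
For instance, a portable sketch using nothing but the JCR API, assuming a standard nt:file node with a jcr:content child:

{code:java}
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

import javax.jcr.Binary;
import javax.jcr.Node;

public class SpoolSketch {

    // Spool the binary of an nt:file node to a temporary file.
    public static Path spoolToTempFile(Node fileNode) throws Exception {
        Binary binary = fileNode.getNode("jcr:content")
                .getProperty("jcr:data")
                .getBinary();
        Path tmp = Files.createTempFile("oak-binary-", ".tmp");
        try (InputStream in = binary.getStream()) {
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            binary.dispose();
        }
        return tmp;
    }
}
{code}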

> Create API in OAK for file access to binaries in the repository.
> 
>
> Key: OAK-8186
> URL: https://issues.apache.org/jira/browse/OAK-8186
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>Reporter: Henry Saginor
>Priority: Major
> Attachments: OAK File Access.jpg
>
>
> To get file access applications normally write binaries to temp files. It 
> would be nice if an API existed to get file access directly from OAK. This 
> might also meet some use cases documented at 
> [https://wiki.apache.org/jackrabbit/JCR%20Binary%20Usecase]
> Suggested API and implementation can be found here [1]. Also, see attached 
> diagram [2].
> I can create a patch if I can get some feedback. Note that suggested API 
> makes it explicit that a temp file is created. I am not sure if direct access 
> to files in datasore would be safe. But I am open to suggestions.
> [1]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-api/src/main/java/org/apache/jackrabbit/oak/api/blob/FileReferencable.java]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-api/src/main/java/org/apache/jackrabbit/oak/api/blob/TempFileReference.java]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-api/src/main/java/org/apache/jackrabbit/oak/api/blob/TempFileReferenceProvider.java]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/datastore/FileDSBlobTempFileReference.java]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/datastore/DataStoreBlobStore.java]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentBlob.java]
>  
> [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-store-spi/src/main/java/org/apache/jackrabbit/oak/plugins/value/jcr/BinaryImpl.java]
> [2]
> !OAK File Access.jpg!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (OAK-7938) Test failure: MBeanIT.testClientAndServerEmptyConfig

2019-03-08 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-7938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari resolved OAK-7938.
-
Resolution: Cannot Reproduce

I can't reproduce this issue and there are no recent failures on Jenkins.

> Test failure: MBeanIT.testClientAndServerEmptyConfig
> 
>
> Key: OAK-7938
> URL: https://issues.apache.org/jira/browse/OAK-7938
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: continuous integration, segment-tar
>Reporter: Hudson
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.12
>
>
> No description is provided
> The build Jackrabbit Oak #1826 has failed.
> First failed run: [Jackrabbit Oak 
> #1826|https://builds.apache.org/job/Jackrabbit%20Oak/1826/] [console 
> log|https://builds.apache.org/job/Jackrabbit%20Oak/1826/console]
> {noformat}
> [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.782 
> s <<< FAILURE! - in org.apache.jackrabbit.oak.segment.standby.MBeanIT
> [ERROR] 
> testClientAndServerEmptyConfig(org.apache.jackrabbit.oak.segment.standby.MBeanIT)
>   Time elapsed: 2.499 s  <<< FAILURE!
> org.junit.ComparisonFailure: expected:<[0]> but was:<[1]>
>   at org.junit.Assert.assertEquals(Assert.java:115)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.jackrabbit.oak.segment.standby.MBeanIT.testClientAndServerEmptyConfig(MBeanIT.java:194)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (OAK-7938) Test failure: MBeanIT.testClientAndServerEmptyConfig

2019-03-08 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-7938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari reassigned OAK-7938:
---

Assignee: Francesco Mari

> Test failure: MBeanIT.testClientAndServerEmptyConfig
> 
>
> Key: OAK-7938
> URL: https://issues.apache.org/jira/browse/OAK-7938
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: continuous integration, segment-tar
>Reporter: Hudson
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.12
>
>
> No description is provided
> The build Jackrabbit Oak #1826 has failed.
> First failed run: [Jackrabbit Oak 
> #1826|https://builds.apache.org/job/Jackrabbit%20Oak/1826/] [console 
> log|https://builds.apache.org/job/Jackrabbit%20Oak/1826/console]
> {noformat}
> [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.782 
> s <<< FAILURE! - in org.apache.jackrabbit.oak.segment.standby.MBeanIT
> [ERROR] 
> testClientAndServerEmptyConfig(org.apache.jackrabbit.oak.segment.standby.MBeanIT)
>   Time elapsed: 2.499 s  <<< FAILURE!
> org.junit.ComparisonFailure: expected:<[0]> but was:<[1]>
>   at org.junit.Assert.assertEquals(Assert.java:115)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.jackrabbit.oak.segment.standby.MBeanIT.testClientAndServerEmptyConfig(MBeanIT.java:194)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (OAK-7027) Test failure: ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout

2019-03-08 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-7027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari resolved OAK-7027.
-
   Resolution: Fixed
Fix Version/s: 1.12

Fixed at r1855048.

> Test failure: ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout
> 
>
> Key: OAK-7027
> URL: https://issues.apache.org/jira/browse/OAK-7027
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, tarmk-standby
>Reporter: Michael Dürig
>Assignee: Francesco Mari
>Priority: Major
>  Labels: test-failure
> Fix For: 1.12
>
> Attachments: OAK-7027-01.patch, OAK-7027-02.patch
>
>
> Seen on an internal Windows Jenkins node:
> h3. Regression
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout
> h3. Error Message
> {noformat}
> Values should be different. Actual: { root = { ... } }
> {noformat}
> h3. Stacktrace
> {noformat}
> java.lang.AssertionError: Values should be different. Actual: { root = { ... 
> } }
>   at 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout(ExternalPrivateStoreIT.java:87)
> {noformat}
> h3. Standard Output
> {noformat}
> 22:41:13.646 INFO  [main] FileStoreBuilder.java:340 Creating file 
> store FileStoreBuilder{version=1.8-SNAPSHOT, 
> directory=target\junit2834122541179880349\junit3041268421527563090, 
> blobStore=DataStore backed BlobStore 
> [org.apache.jackrabbit.core.data.FileDataStore], maxFileSize=1, 
> segmentCacheSize=0, stringCacheSize=0, templateCacheSize=0, 
> stringDeduplicationCacheSize=15000, templateDeduplicationCacheSize=3000, 
> nodeDeduplicationCacheSize=1, memoryMapping=false, 
> gcOptions=SegmentGCOptions{paused=false, estimationDisabled=false, 
> gcSizeDeltaEstimation=1073741824, retryCount=5, forceTimeout=60, 
> retainedGenerations=2, gcType=FULL}}
> 22:41:13.646 INFO  [main] FileStore.java:241TarMK opened at 
> target\junit2834122541179880349\junit3041268421527563090, mmap=false, size=0 
> B (0 bytes)
> 22:41:13.646 DEBUG [main] FileStore.java:247TAR files: 
> TarFiles{readers=[],writer=target\junit2834122541179880349\junit3041268421527563090\data0a.tar}
> 22:41:13.646 DEBUG [main] TarWriter.java:185Writing segment 
> 4cea1684-ef05-44f5-a869-3ef2df6e0c9a to 
> target\junit2834122541179880349\junit3041268421527563090\data0a.tar
> 22:41:13.646 INFO  [main] FileStoreBuilder.java:340 Creating file 
> store FileStoreBuilder{version=1.8-SNAPSHOT, 
> directory=target\junit2834122541179880349\junit4470899745425503556, 
> blobStore=DataStore backed BlobStore 
> [org.apache.jackrabbit.core.data.FileDataStore], maxFileSize=1, 
> segmentCacheSize=0, stringCacheSize=0, templateCacheSize=0, 
> stringDeduplicationCacheSize=15000, templateDeduplicationCacheSize=3000, 
> nodeDeduplicationCacheSize=1, memoryMapping=false, 
> gcOptions=SegmentGCOptions{paused=false, estimationDisabled=false, 
> gcSizeDeltaEstimation=1073741824, retryCount=5, forceTimeout=60, 
> retainedGenerations=2, gcType=FULL}}
> 22:41:13.646 INFO  [main] FileStore.java:241TarMK opened at 
> target\junit2834122541179880349\junit4470899745425503556, mmap=false, size=0 
> B (0 bytes)
> 22:41:13.646 DEBUG [main] FileStore.java:247TAR files: 
> TarFiles{readers=[],writer=target\junit2834122541179880349\junit4470899745425503556\data0a.tar}
> 22:41:13.646 DEBUG [main] TarWriter.java:185Writing segment 
> 8d19c7dc-8b48-4e10-a58d-31c15c93f2fe to 
> target\junit2834122541179880349\junit4470899745425503556\data0a.tar
> 22:41:13.646 INFO  [main] DataStoreTestBase.java:127Test begin: 
> testSyncFailingDueToTooShortTimeout
> 22:41:13.646 INFO  [main] SegmentNodeStore.java:120 Creating segment 
> node store SegmentNodeStoreBuilder{blobStore=DataStore backed BlobStore 
> [org.apache.jackrabbit.core.data.FileDataStore]}
> 22:41:13.646 INFO  [main] LockBasedScheduler.java:155   Initializing 
> SegmentNodeStore with the commitFairLock option enabled.
> 22:41:13.708 DEBUG [main] StandbyServer.java:248Binding was 
> successful
> 22:41:13.708 DEBUG [main] TarWriter.java:185Writing segment 
> 4a5183bd-bcdf-41ab-a557-6f19143bbc91 to 
> target\junit2834122541179880349\junit3041268421527563090\data0a.tar
> 22:41:13.739 DEBUG [main] TarRevisions.java:240 TarMK journal 
> update null -> 4a5183bd-bcdf-41ab-a557-6f19143bbc91.000c
> 22:41:13.755 DEBUG [standby-1] GetHeadRequestEncoder.java:33 Sending request 
> from client 9aa63ed8-347b-4f00-ae7c-f984e0623e90 for current head
> 22:41:13.755 DEBUG [primary-1] ClientFilterHandler.java:53  Client 
> /127.0.0.1:65480 

[jira] [Updated] (OAK-7027) Test failure: ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout

2019-03-08 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-7027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-7027:

Attachment: OAK-7027-02.patch

> Test failure: ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout
> 
>
> Key: OAK-7027
> URL: https://issues.apache.org/jira/browse/OAK-7027
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, tarmk-standby
>Reporter: Michael Dürig
>Assignee: Francesco Mari
>Priority: Major
>  Labels: test-failure
> Attachments: OAK-7027-01.patch, OAK-7027-02.patch
>
>
> Seen on an internal Windows Jenkins node:
> h3. Regression
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout
> h3. Error Message
> {noformat}
> Values should be different. Actual: { root = { ... } }
> {noformat}
> h3. Stacktrace
> {noformat}
> java.lang.AssertionError: Values should be different. Actual: { root = { ... 
> } }
>   at 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout(ExternalPrivateStoreIT.java:87)
> {noformat}
> h3. Standard Output
> {noformat}
> 22:41:13.646 INFO  [main] FileStoreBuilder.java:340 Creating file 
> store FileStoreBuilder{version=1.8-SNAPSHOT, 
> directory=target\junit2834122541179880349\junit3041268421527563090, 
> blobStore=DataStore backed BlobStore 
> [org.apache.jackrabbit.core.data.FileDataStore], maxFileSize=1, 
> segmentCacheSize=0, stringCacheSize=0, templateCacheSize=0, 
> stringDeduplicationCacheSize=15000, templateDeduplicationCacheSize=3000, 
> nodeDeduplicationCacheSize=1, memoryMapping=false, 
> gcOptions=SegmentGCOptions{paused=false, estimationDisabled=false, 
> gcSizeDeltaEstimation=1073741824, retryCount=5, forceTimeout=60, 
> retainedGenerations=2, gcType=FULL}}
> 22:41:13.646 INFO  [main] FileStore.java:241TarMK opened at 
> target\junit2834122541179880349\junit3041268421527563090, mmap=false, size=0 
> B (0 bytes)
> 22:41:13.646 DEBUG [main] FileStore.java:247TAR files: 
> TarFiles{readers=[],writer=target\junit2834122541179880349\junit3041268421527563090\data0a.tar}
> 22:41:13.646 DEBUG [main] TarWriter.java:185Writing segment 
> 4cea1684-ef05-44f5-a869-3ef2df6e0c9a to 
> target\junit2834122541179880349\junit3041268421527563090\data0a.tar
> 22:41:13.646 INFO  [main] FileStoreBuilder.java:340 Creating file 
> store FileStoreBuilder{version=1.8-SNAPSHOT, 
> directory=target\junit2834122541179880349\junit4470899745425503556, 
> blobStore=DataStore backed BlobStore 
> [org.apache.jackrabbit.core.data.FileDataStore], maxFileSize=1, 
> segmentCacheSize=0, stringCacheSize=0, templateCacheSize=0, 
> stringDeduplicationCacheSize=15000, templateDeduplicationCacheSize=3000, 
> nodeDeduplicationCacheSize=1, memoryMapping=false, 
> gcOptions=SegmentGCOptions{paused=false, estimationDisabled=false, 
> gcSizeDeltaEstimation=1073741824, retryCount=5, forceTimeout=60, 
> retainedGenerations=2, gcType=FULL}}
> 22:41:13.646 INFO  [main] FileStore.java:241TarMK opened at 
> target\junit2834122541179880349\junit4470899745425503556, mmap=false, size=0 
> B (0 bytes)
> 22:41:13.646 DEBUG [main] FileStore.java:247TAR files: 
> TarFiles{readers=[],writer=target\junit2834122541179880349\junit4470899745425503556\data0a.tar}
> 22:41:13.646 DEBUG [main] TarWriter.java:185Writing segment 
> 8d19c7dc-8b48-4e10-a58d-31c15c93f2fe to 
> target\junit2834122541179880349\junit4470899745425503556\data0a.tar
> 22:41:13.646 INFO  [main] DataStoreTestBase.java:127Test begin: 
> testSyncFailingDueToTooShortTimeout
> 22:41:13.646 INFO  [main] SegmentNodeStore.java:120 Creating segment 
> node store SegmentNodeStoreBuilder{blobStore=DataStore backed BlobStore 
> [org.apache.jackrabbit.core.data.FileDataStore]}
> 22:41:13.646 INFO  [main] LockBasedScheduler.java:155   Initializing 
> SegmentNodeStore with the commitFairLock option enabled.
> 22:41:13.708 DEBUG [main] StandbyServer.java:248Binding was 
> successful
> 22:41:13.708 DEBUG [main] TarWriter.java:185Writing segment 
> 4a5183bd-bcdf-41ab-a557-6f19143bbc91 to 
> target\junit2834122541179880349\junit3041268421527563090\data0a.tar
> 22:41:13.739 DEBUG [main] TarRevisions.java:240 TarMK journal 
> update null -> 4a5183bd-bcdf-41ab-a557-6f19143bbc91.000c
> 22:41:13.755 DEBUG [standby-1] GetHeadRequestEncoder.java:33 Sending request 
> from client 9aa63ed8-347b-4f00-ae7c-f984e0623e90 for current head
> 22:41:13.755 DEBUG [primary-1] ClientFilterHandler.java:53  Client 
> /127.0.0.1:65480 is allowed
> 22:41:13.755 DEBUG [primary-1] RequestDecoder.java:42   

[jira] [Commented] (OAK-7027) Test failure: ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout

2019-03-08 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-7027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16787893#comment-16787893
 ] 

Francesco Mari commented on OAK-7027:
-

[~dulceanu], the second patch incorporates your suggestions. It definitely 
makes the code clearer. If you are alright with that, I will commit the changes.

> Test failure: ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout
> 
>
> Key: OAK-7027
> URL: https://issues.apache.org/jira/browse/OAK-7027
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, tarmk-standby
>Reporter: Michael Dürig
>Assignee: Francesco Mari
>Priority: Major
>  Labels: test-failure
> Attachments: OAK-7027-01.patch, OAK-7027-02.patch
>
>
> Seen on an internal Windows Jenkins node:
> h3. Regression
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout
> h3. Error Message
> {noformat}
> Values should be different. Actual: { root = { ... } }
> {noformat}
> h3. Stacktrace
> {noformat}
> java.lang.AssertionError: Values should be different. Actual: { root = { ... 
> } }
>   at 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout(ExternalPrivateStoreIT.java:87)
> {noformat}
> h3. Standard Output
> {noformat}
> 22:41:13.646 INFO  [main] FileStoreBuilder.java:340 Creating file 
> store FileStoreBuilder{version=1.8-SNAPSHOT, 
> directory=target\junit2834122541179880349\junit3041268421527563090, 
> blobStore=DataStore backed BlobStore 
> [org.apache.jackrabbit.core.data.FileDataStore], maxFileSize=1, 
> segmentCacheSize=0, stringCacheSize=0, templateCacheSize=0, 
> stringDeduplicationCacheSize=15000, templateDeduplicationCacheSize=3000, 
> nodeDeduplicationCacheSize=1, memoryMapping=false, 
> gcOptions=SegmentGCOptions{paused=false, estimationDisabled=false, 
> gcSizeDeltaEstimation=1073741824, retryCount=5, forceTimeout=60, 
> retainedGenerations=2, gcType=FULL}}
> 22:41:13.646 INFO  [main] FileStore.java:241TarMK opened at 
> target\junit2834122541179880349\junit3041268421527563090, mmap=false, size=0 
> B (0 bytes)
> 22:41:13.646 DEBUG [main] FileStore.java:247TAR files: 
> TarFiles{readers=[],writer=target\junit2834122541179880349\junit3041268421527563090\data0a.tar}
> 22:41:13.646 DEBUG [main] TarWriter.java:185Writing segment 
> 4cea1684-ef05-44f5-a869-3ef2df6e0c9a to 
> target\junit2834122541179880349\junit3041268421527563090\data0a.tar
> 22:41:13.646 INFO  [main] FileStoreBuilder.java:340 Creating file 
> store FileStoreBuilder{version=1.8-SNAPSHOT, 
> directory=target\junit2834122541179880349\junit4470899745425503556, 
> blobStore=DataStore backed BlobStore 
> [org.apache.jackrabbit.core.data.FileDataStore], maxFileSize=1, 
> segmentCacheSize=0, stringCacheSize=0, templateCacheSize=0, 
> stringDeduplicationCacheSize=15000, templateDeduplicationCacheSize=3000, 
> nodeDeduplicationCacheSize=1, memoryMapping=false, 
> gcOptions=SegmentGCOptions{paused=false, estimationDisabled=false, 
> gcSizeDeltaEstimation=1073741824, retryCount=5, forceTimeout=60, 
> retainedGenerations=2, gcType=FULL}}
> 22:41:13.646 INFO  [main] FileStore.java:241TarMK opened at 
> target\junit2834122541179880349\junit4470899745425503556, mmap=false, size=0 
> B (0 bytes)
> 22:41:13.646 DEBUG [main] FileStore.java:247TAR files: 
> TarFiles{readers=[],writer=target\junit2834122541179880349\junit4470899745425503556\data0a.tar}
> 22:41:13.646 DEBUG [main] TarWriter.java:185Writing segment 
> 8d19c7dc-8b48-4e10-a58d-31c15c93f2fe to 
> target\junit2834122541179880349\junit4470899745425503556\data0a.tar
> 22:41:13.646 INFO  [main] DataStoreTestBase.java:127Test begin: 
> testSyncFailingDueToTooShortTimeout
> 22:41:13.646 INFO  [main] SegmentNodeStore.java:120 Creating segment 
> node store SegmentNodeStoreBuilder{blobStore=DataStore backed BlobStore 
> [org.apache.jackrabbit.core.data.FileDataStore]}
> 22:41:13.646 INFO  [main] LockBasedScheduler.java:155   Initializing 
> SegmentNodeStore with the commitFairLock option enabled.
> 22:41:13.708 DEBUG [main] StandbyServer.java:248Binding was 
> successful
> 22:41:13.708 DEBUG [main] TarWriter.java:185Writing segment 
> 4a5183bd-bcdf-41ab-a557-6f19143bbc91 to 
> target\junit2834122541179880349\junit3041268421527563090\data0a.tar
> 22:41:13.739 DEBUG [main] TarRevisions.java:240 TarMK journal 
> update null -> 4a5183bd-bcdf-41ab-a557-6f19143bbc91.000c
> 22:41:13.755 DEBUG [standby-1] GetHeadRequestEncoder.java:33 Sending request 
> from client 9aa63ed8-347b-4f00-ae7c-f984e0623e90 for 

[jira] [Commented] (OAK-7027) Test failure: ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout

2019-03-07 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-7027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16786849#comment-16786849
 ] 

Francesco Mari commented on OAK-7027:
-

The first version of the patch contains a lot of changes, but most of them were 
performed mechanically. At its core, the patch changes the following:
* Add the possibility to configure a different instance of StandbyHeadReader, 
StandbySegmentReader, StandbyReferencesReader, and StandbyBlobReader (let's 
call these objects the standby server backend) in StandbyServer.
* Add a Builder for StandbyServerSync. This builder also allows configuring the 
standby server backend. The methods on the Builder related to the standby 
server backend are package-private, as I intend them to be used for testing 
purposes only.
* Move the failing test to a new test class, SlowServerIT. The test class lives 
in the o.a.j.o.segment.standby.server package, so it is allowed to access the 
methods that configure the standby server backend.
* SlowServerIT leverages a custom standby server backend that adds a delay when 
a blob is read. This makes the timeout on the client expire, so the test 
reliably exercises the failure path (see the sketch below).
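
As a rough illustration of that last point, here is a minimal sketch of a 
delaying blob reader. The BlobReader interface below is a simplified stand-in 
for the blob-reading part of the standby server backend, not Oak's actual API; 
the point is only the decorator that injects the delay.

{code:java}
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.concurrent.TimeUnit;

// Simplified stand-in for the blob-reading part of the standby server
// backend; the real Oak interface may differ in name and signature.
interface BlobReader {
    InputStream readBlob(String blobId);
}

// Decorator that delays every blob read, so a client configured with a
// short read timeout reliably hits that timeout during the test.
class SlowBlobReader implements BlobReader {

    private final BlobReader delegate;
    private final long delayMillis;

    SlowBlobReader(BlobReader delegate, long delayMillis) {
        this.delegate = delegate;
        this.delayMillis = delayMillis;
    }

    @Override
    public InputStream readBlob(String blobId) {
        try {
            // Sleep longer than the client's read timeout.
            TimeUnit.MILLISECONDS.sleep(delayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return delegate.readBlob(blobId);
    }
}

public class SlowBlobReaderDemo {

    public static void main(String[] args) {
        BlobReader fast = blobId -> new ByteArrayInputStream(new byte[0]);
        BlobReader slow = new SlowBlobReader(fast, 5000);
        slow.readBlob("any-blob-id"); // returns after ~5s, past a short client timeout
    }
}
{code}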

[~dulceanu], can you have a look at the patch, please?

> Test failure: ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout
> 
>
> Key: OAK-7027
> URL: https://issues.apache.org/jira/browse/OAK-7027
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, tarmk-standby
>Reporter: Michael Dürig
>Assignee: Francesco Mari
>Priority: Major
>  Labels: test-failure
> Attachments: OAK-7027-01.patch
>
>
> Seen on an internal Windows Jenkins node:
> h3. Regression
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout
> h3. Error Message
> {noformat}
> Values should be different. Actual: { root = { ... } }
> {noformat}
> h3. Stacktrace
> {noformat}
> java.lang.AssertionError: Values should be different. Actual: { root = { ... 
> } }
>   at 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout(ExternalPrivateStoreIT.java:87)
> {noformat}
> h3. Standard Output
> {noformat}
> 22:41:13.646 INFO  [main] FileStoreBuilder.java:340 Creating file 
> store FileStoreBuilder{version=1.8-SNAPSHOT, 
> directory=target\junit2834122541179880349\junit3041268421527563090, 
> blobStore=DataStore backed BlobStore 
> [org.apache.jackrabbit.core.data.FileDataStore], maxFileSize=1, 
> segmentCacheSize=0, stringCacheSize=0, templateCacheSize=0, 
> stringDeduplicationCacheSize=15000, templateDeduplicationCacheSize=3000, 
> nodeDeduplicationCacheSize=1, memoryMapping=false, 
> gcOptions=SegmentGCOptions{paused=false, estimationDisabled=false, 
> gcSizeDeltaEstimation=1073741824, retryCount=5, forceTimeout=60, 
> retainedGenerations=2, gcType=FULL}}
> 22:41:13.646 INFO  [main] FileStore.java:241TarMK opened at 
> target\junit2834122541179880349\junit3041268421527563090, mmap=false, size=0 
> B (0 bytes)
> 22:41:13.646 DEBUG [main] FileStore.java:247TAR files: 
> TarFiles{readers=[],writer=target\junit2834122541179880349\junit3041268421527563090\data0a.tar}
> 22:41:13.646 DEBUG [main] TarWriter.java:185Writing segment 
> 4cea1684-ef05-44f5-a869-3ef2df6e0c9a to 
> target\junit2834122541179880349\junit3041268421527563090\data0a.tar
> 22:41:13.646 INFO  [main] FileStoreBuilder.java:340 Creating file 
> store FileStoreBuilder{version=1.8-SNAPSHOT, 
> directory=target\junit2834122541179880349\junit4470899745425503556, 
> blobStore=DataStore backed BlobStore 
> [org.apache.jackrabbit.core.data.FileDataStore], maxFileSize=1, 
> segmentCacheSize=0, stringCacheSize=0, templateCacheSize=0, 
> stringDeduplicationCacheSize=15000, templateDeduplicationCacheSize=3000, 
> nodeDeduplicationCacheSize=1, memoryMapping=false, 
> gcOptions=SegmentGCOptions{paused=false, estimationDisabled=false, 
> gcSizeDeltaEstimation=1073741824, retryCount=5, forceTimeout=60, 
> retainedGenerations=2, gcType=FULL}}
> 22:41:13.646 INFO  [main] FileStore.java:241TarMK opened at 
> target\junit2834122541179880349\junit4470899745425503556, mmap=false, size=0 
> B (0 bytes)
> 22:41:13.646 DEBUG [main] FileStore.java:247TAR files: 
> TarFiles{readers=[],writer=target\junit2834122541179880349\junit4470899745425503556\data0a.tar}
> 22:41:13.646 DEBUG [main] TarWriter.java:185Writing segment 
> 8d19c7dc-8b48-4e10-a58d-31c15c93f2fe to 
> target\junit2834122541179880349\junit4470899745425503556\data0a.tar
> 22:41:13.646 INFO  [main] DataStoreTestBase.java:127Test begin: 
> testSyncFailingDueToTooShortTimeout
> 

[jira] [Updated] (OAK-7027) Test failure: ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout

2019-03-07 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-7027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-7027:

Attachment: OAK-7027-01.patch

> Test failure: ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout
> 
>
> Key: OAK-7027
> URL: https://issues.apache.org/jira/browse/OAK-7027
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, tarmk-standby
>Reporter: Michael Dürig
>Assignee: Francesco Mari
>Priority: Major
>  Labels: test-failure
> Attachments: OAK-7027-01.patch
>
>
> Seen on an internal Windows Jenkins node:
> h3. Regression
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout
> h3. Error Message
> {noformat}
> Values should be different. Actual: { root = { ... } }
> {noformat}
> h3. Stacktrace
> {noformat}
> java.lang.AssertionError: Values should be different. Actual: { root = { ... 
> } }
>   at 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout(ExternalPrivateStoreIT.java:87)
> {noformat}
> h3. Standard Output
> {noformat}
> 22:41:13.646 INFO  [main] FileStoreBuilder.java:340 Creating file 
> store FileStoreBuilder{version=1.8-SNAPSHOT, 
> directory=target\junit2834122541179880349\junit3041268421527563090, 
> blobStore=DataStore backed BlobStore 
> [org.apache.jackrabbit.core.data.FileDataStore], maxFileSize=1, 
> segmentCacheSize=0, stringCacheSize=0, templateCacheSize=0, 
> stringDeduplicationCacheSize=15000, templateDeduplicationCacheSize=3000, 
> nodeDeduplicationCacheSize=1, memoryMapping=false, 
> gcOptions=SegmentGCOptions{paused=false, estimationDisabled=false, 
> gcSizeDeltaEstimation=1073741824, retryCount=5, forceTimeout=60, 
> retainedGenerations=2, gcType=FULL}}
> 22:41:13.646 INFO  [main] FileStore.java:241TarMK opened at 
> target\junit2834122541179880349\junit3041268421527563090, mmap=false, size=0 
> B (0 bytes)
> 22:41:13.646 DEBUG [main] FileStore.java:247TAR files: 
> TarFiles{readers=[],writer=target\junit2834122541179880349\junit3041268421527563090\data0a.tar}
> 22:41:13.646 DEBUG [main] TarWriter.java:185Writing segment 
> 4cea1684-ef05-44f5-a869-3ef2df6e0c9a to 
> target\junit2834122541179880349\junit3041268421527563090\data0a.tar
> 22:41:13.646 INFO  [main] FileStoreBuilder.java:340 Creating file 
> store FileStoreBuilder{version=1.8-SNAPSHOT, 
> directory=target\junit2834122541179880349\junit4470899745425503556, 
> blobStore=DataStore backed BlobStore 
> [org.apache.jackrabbit.core.data.FileDataStore], maxFileSize=1, 
> segmentCacheSize=0, stringCacheSize=0, templateCacheSize=0, 
> stringDeduplicationCacheSize=15000, templateDeduplicationCacheSize=3000, 
> nodeDeduplicationCacheSize=1, memoryMapping=false, 
> gcOptions=SegmentGCOptions{paused=false, estimationDisabled=false, 
> gcSizeDeltaEstimation=1073741824, retryCount=5, forceTimeout=60, 
> retainedGenerations=2, gcType=FULL}}
> 22:41:13.646 INFO  [main] FileStore.java:241TarMK opened at 
> target\junit2834122541179880349\junit4470899745425503556, mmap=false, size=0 
> B (0 bytes)
> 22:41:13.646 DEBUG [main] FileStore.java:247TAR files: 
> TarFiles{readers=[],writer=target\junit2834122541179880349\junit4470899745425503556\data0a.tar}
> 22:41:13.646 DEBUG [main] TarWriter.java:185Writing segment 
> 8d19c7dc-8b48-4e10-a58d-31c15c93f2fe to 
> target\junit2834122541179880349\junit4470899745425503556\data0a.tar
> 22:41:13.646 INFO  [main] DataStoreTestBase.java:127Test begin: 
> testSyncFailingDueToTooShortTimeout
> 22:41:13.646 INFO  [main] SegmentNodeStore.java:120 Creating segment 
> node store SegmentNodeStoreBuilder{blobStore=DataStore backed BlobStore 
> [org.apache.jackrabbit.core.data.FileDataStore]}
> 22:41:13.646 INFO  [main] LockBasedScheduler.java:155   Initializing 
> SegmentNodeStore with the commitFairLock option enabled.
> 22:41:13.708 DEBUG [main] StandbyServer.java:248Binding was 
> successful
> 22:41:13.708 DEBUG [main] TarWriter.java:185Writing segment 
> 4a5183bd-bcdf-41ab-a557-6f19143bbc91 to 
> target\junit2834122541179880349\junit3041268421527563090\data0a.tar
> 22:41:13.739 DEBUG [main] TarRevisions.java:240 TarMK journal 
> update null -> 4a5183bd-bcdf-41ab-a557-6f19143bbc91.000c
> 22:41:13.755 DEBUG [standby-1] GetHeadRequestEncoder.java:33 Sending request 
> from client 9aa63ed8-347b-4f00-ae7c-f984e0623e90 for current head
> 22:41:13.755 DEBUG [primary-1] ClientFilterHandler.java:53  Client 
> /127.0.0.1:65480 is allowed
> 22:41:13.755 DEBUG [primary-1] RequestDecoder.java:42   Parsed 'get 

[jira] [Assigned] (OAK-7027) Test failure: ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout

2019-03-07 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-7027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari reassigned OAK-7027:
---

Assignee: Francesco Mari

> Test failure: ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout
> 
>
> Key: OAK-7027
> URL: https://issues.apache.org/jira/browse/OAK-7027
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, tarmk-standby
>Reporter: Michael Dürig
>Assignee: Francesco Mari
>Priority: Major
>  Labels: test-failure
>
> Seen on an internal Windows Jenkins node:
> h3. Regression
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout
> h3. Error Message
> {noformat}
> Values should be different. Actual: { root = { ... } }
> {noformat}
> h3. Stacktrace
> {noformat}
> java.lang.AssertionError: Values should be different. Actual: { root = { ... 
> } }
>   at 
> org.apache.jackrabbit.oak.segment.standby.ExternalPrivateStoreIT.testSyncFailingDueToTooShortTimeout(ExternalPrivateStoreIT.java:87)
> {noformat}
> h3. Standard Output
> {noformat}
> 22:41:13.646 INFO  [main] FileStoreBuilder.java:340 Creating file 
> store FileStoreBuilder{version=1.8-SNAPSHOT, 
> directory=target\junit2834122541179880349\junit3041268421527563090, 
> blobStore=DataStore backed BlobStore 
> [org.apache.jackrabbit.core.data.FileDataStore], maxFileSize=1, 
> segmentCacheSize=0, stringCacheSize=0, templateCacheSize=0, 
> stringDeduplicationCacheSize=15000, templateDeduplicationCacheSize=3000, 
> nodeDeduplicationCacheSize=1, memoryMapping=false, 
> gcOptions=SegmentGCOptions{paused=false, estimationDisabled=false, 
> gcSizeDeltaEstimation=1073741824, retryCount=5, forceTimeout=60, 
> retainedGenerations=2, gcType=FULL}}
> 22:41:13.646 INFO  [main] FileStore.java:241TarMK opened at 
> target\junit2834122541179880349\junit3041268421527563090, mmap=false, size=0 
> B (0 bytes)
> 22:41:13.646 DEBUG [main] FileStore.java:247TAR files: 
> TarFiles{readers=[],writer=target\junit2834122541179880349\junit3041268421527563090\data0a.tar}
> 22:41:13.646 DEBUG [main] TarWriter.java:185Writing segment 
> 4cea1684-ef05-44f5-a869-3ef2df6e0c9a to 
> target\junit2834122541179880349\junit3041268421527563090\data0a.tar
> 22:41:13.646 INFO  [main] FileStoreBuilder.java:340 Creating file 
> store FileStoreBuilder{version=1.8-SNAPSHOT, 
> directory=target\junit2834122541179880349\junit4470899745425503556, 
> blobStore=DataStore backed BlobStore 
> [org.apache.jackrabbit.core.data.FileDataStore], maxFileSize=1, 
> segmentCacheSize=0, stringCacheSize=0, templateCacheSize=0, 
> stringDeduplicationCacheSize=15000, templateDeduplicationCacheSize=3000, 
> nodeDeduplicationCacheSize=1, memoryMapping=false, 
> gcOptions=SegmentGCOptions{paused=false, estimationDisabled=false, 
> gcSizeDeltaEstimation=1073741824, retryCount=5, forceTimeout=60, 
> retainedGenerations=2, gcType=FULL}}
> 22:41:13.646 INFO  [main] FileStore.java:241TarMK opened at 
> target\junit2834122541179880349\junit4470899745425503556, mmap=false, size=0 
> B (0 bytes)
> 22:41:13.646 DEBUG [main] FileStore.java:247TAR files: 
> TarFiles{readers=[],writer=target\junit2834122541179880349\junit4470899745425503556\data0a.tar}
> 22:41:13.646 DEBUG [main] TarWriter.java:185Writing segment 
> 8d19c7dc-8b48-4e10-a58d-31c15c93f2fe to 
> target\junit2834122541179880349\junit4470899745425503556\data0a.tar
> 22:41:13.646 INFO  [main] DataStoreTestBase.java:127Test begin: 
> testSyncFailingDueToTooShortTimeout
> 22:41:13.646 INFO  [main] SegmentNodeStore.java:120 Creating segment 
> node store SegmentNodeStoreBuilder{blobStore=DataStore backed BlobStore 
> [org.apache.jackrabbit.core.data.FileDataStore]}
> 22:41:13.646 INFO  [main] LockBasedScheduler.java:155   Initializing 
> SegmentNodeStore with the commitFairLock option enabled.
> 22:41:13.708 DEBUG [main] StandbyServer.java:248Binding was 
> successful
> 22:41:13.708 DEBUG [main] TarWriter.java:185Writing segment 
> 4a5183bd-bcdf-41ab-a557-6f19143bbc91 to 
> target\junit2834122541179880349\junit3041268421527563090\data0a.tar
> 22:41:13.739 DEBUG [main] TarRevisions.java:240 TarMK journal 
> update null -> 4a5183bd-bcdf-41ab-a557-6f19143bbc91.000c
> 22:41:13.755 DEBUG [standby-1] GetHeadRequestEncoder.java:33 Sending request 
> from client 9aa63ed8-347b-4f00-ae7c-f984e0623e90 for current head
> 22:41:13.755 DEBUG [primary-1] ClientFilterHandler.java:53  Client 
> /127.0.0.1:65480 is allowed
> 22:41:13.755 DEBUG [primary-1] RequestDecoder.java:42   Parsed 'get head' 
> message
> 22:41:13.755 DEBUG 

[jira] [Updated] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-03-05 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-6749:

Fix Version/s: 1.10.2

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.12, 1.8.12, 1.10.2
>
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch, 
> repack_binaries.groovy
>
>
> We have run into an issue when trying to transition from an active/active 
> Mongo NodeStore cluster to a single Segment-Tar server with cold standby. The 
> issue itself manifests when the standby server tries to pull changes from the 
> primary after the first round of online revision GC.
> Let me summarize the way we ended up with the current state, and my 
> hypothesis about what happened, based on my debugging so far:
> # We started with a Mongo NodeStore and an external FileDataStore as the blob 
> store. The FileDataStore was set up with minRecordLength=4096. The Mongo 
> store stores blobs below minRecordLength as special "in-memory" blobIDs where 
> the data itself is baked into the ID string in hex.
> # We have executed a sidegrade of the Mongo store into a Segment-Tar store. 
> Our datastore is over 1TB in size, so copying the binaries wasn't an option. 
> The new repository is simply reusing the existing datastore. The "in-memory" 
> blobIDs still look like external blobIDs to the sidegrade process, so they 
> were copied into the Segment-Tar repository as-is, instead of being converted 
> into the efficient in-line format.
> # The server started up without issues on the new Segment-Tar store. The 
> migrated "in-memory" blob IDs seem to work fine, if a bit sub-optimal.
> # At this point, we have created a cold standby instance by copying the files 
> of the stopped primary instance and making the necessary config changes on 
> both servers.
> # Everything worked fine until the primary server started its first round of 
> online revision GC. After that process completed, the standby node started 
> throwing exceptions about missing segments, and eventually stopped 
> altogether. In the meantime, the following warning showed up in the primary 
> log:
> {code:java}
> 29.09.2017 06:12:08.088 *WARN* [nioEventLoopGroup-3-10] 
> org.apache.jackrabbit.oak.segment.standby.server.ExceptionHandler Exception 
> caught on the server
> io.netty.handler.codec.TooLongFrameException: frame length (8208) exceeds the 
> allowed maximum (8192)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:146)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:142)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:99)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:75)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
> at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
> at 
> 
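
To make the failure in the warning above concrete: Netty's 
LineBasedFrameDecoder rejects any line longer than its configured maximum, 
which is what happens when an oversized "in-memory" blob ID (roughly 4 KB of 
data inlined as hex, so more than 8192 characters on the wire) is sent as a 
single line. Below is a minimal, self-contained reproduction, assuming only 
Netty 4 on the classpath; this is an illustration, not Oak code.

{code:java}
import io.netty.buffer.Unpooled;
import io.netty.channel.embedded.EmbeddedChannel;
import io.netty.handler.codec.LineBasedFrameDecoder;
import io.netty.handler.codec.TooLongFrameException;

import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class FrameLimitDemo {

    public static void main(String[] args) {
        // Decoder configured with the same 8192-byte line limit as in the log above.
        EmbeddedChannel channel = new EmbeddedChannel(new LineBasedFrameDecoder(8192));

        // Build a single "line" longer than the limit, like an oversized blob ID.
        char[] filler = new char[8300];
        Arrays.fill(filler, '0');
        String longLine = new String(filler) + "\n";

        try {
            channel.writeInbound(Unpooled.copiedBuffer(longLine, StandardCharsets.UTF_8));
        } catch (TooLongFrameException e) {
            // e.g. "frame length (...) exceeds the allowed maximum (8192)"
            System.out.println(e.getMessage());
        } finally {
            channel.finishAndReleaseAll();
        }
    }
}
{code}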

[jira] [Commented] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-03-05 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784502#comment-16784502
 ] 

Francesco Mari commented on OAK-6749:
-

Backported to 1.10 at r1854861.

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.12, 1.8.12
>
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch, 
> repack_binaries.groovy
>
>
> We have run into an issue when trying to transition from an active/active 
> Mongo NodeStore cluster to a single Segment-Tar server with cold standby. The 
> issue itself manifests when the standby server tries to pull changes from the 
> primary after the first round of online revision GC.
> Let me summarize the way we ended up with the current state, and my 
> hypothesis about what happened, based on my debugging so far:
> # We started with a Mongo NodeStore and an external FileDataStore as the blob 
> store. The FileDataStore was set up with minRecordLength=4096. The Mongo 
> store stores blobs below minRecordLength as special "in-memory" blobIDs where 
> the data itself is baked into the ID string in hex.
> # We have executed a sidegrade of the Mongo store into a Segment-Tar store. 
> Our datastore is over 1TB in size, so copying the binaries wasn't an option. 
> The new repository is simply reusing the existing datastore. The "in-memory" 
> blobIDs still look like external blobIDs to the sidegrade process, so they 
> were copied into the Segment-Tar repository as-is, instead of being converted 
> into the efficient in-line format.
> # The server started up without issues on the new Segment-Tar store. The 
> migrated "in-memory" blob IDs seem to work fine, if a bit sub-optimal.
> # At this point, we have created a cold standby instance by copying the files 
> of the stopped primary instance and making the necessary config changes on 
> both servers.
> # Everything worked fine until the primary server started its first round of 
> online revision GC. After that process completed, the standby node started 
> throwing exceptions about missing segments, and eventually stopped 
> altogether. In the meantime, the following warning showed up in the primary 
> log:
> {code:java}
> 29.09.2017 06:12:08.088 *WARN* [nioEventLoopGroup-3-10] 
> org.apache.jackrabbit.oak.segment.standby.server.ExceptionHandler Exception 
> caught on the server
> io.netty.handler.codec.TooLongFrameException: frame length (8208) exceeds the 
> allowed maximum (8192)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:146)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:142)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:99)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:75)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
> at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>

[jira] [Updated] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-03-04 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-6749:

Fix Version/s: (was: 1.10)
   1.12

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.12, 1.8.12
>
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch, 
> repack_binaries.groovy
>
>
> We have run into an issue when trying to transition from an active/active 
> Mongo NodeStore cluster to a single Segment-Tar server with cold standby. The 
> issue itself manifests when the standby server tries to pull changes from the 
> primary after the first round of online revision GC.
> Let me summarize the way we ended up with the current state, and my 
> hypothesis about what happened, based on my debugging so far:
> # We started with a Mongo NodeStore and an external FileDataStore as the blob 
> store. The FileDataStore was set up with minRecordLength=4096. The Mongo 
> store stores blobs below minRecordLength as special "in-memory" blobIDs where 
> the data itself is baked into the ID string in hex.
> # We have executed a sidegrade of the Mongo store into a Segment-Tar store. 
> Our datastore is over 1TB in size, so copying the binaries wasn't an option. 
> The new repository is simply reusing the existing datastore. The "in-memory" 
> blobIDs still look like external blobIDs to the sidegrade process, so they 
> were copied into the Segment-Tar repository as-is, instead of being converted 
> into the efficient in-line format.
> # The server started up without issues on the new Segment-Tar store. The 
> migrated "in-memory" blob IDs seem to work fine, if a bit sub-optimal.
> # At this point, we have created a cold standby instance by copying the files 
> of the stopped primary instance and making the necessary config changes on 
> both servers.
> # Everything worked fine until the primary server started its first round of 
> online revision GC. After that process completed, the standby node started 
> throwing exceptions about missing segments, and eventually stopped 
> altogether. In the meantime, the following warning showed up in the primary 
> log:
> {code:java}
> 29.09.2017 06:12:08.088 *WARN* [nioEventLoopGroup-3-10] 
> org.apache.jackrabbit.oak.segment.standby.server.ExceptionHandler Exception 
> caught on the server
> io.netty.handler.codec.TooLongFrameException: frame length (8208) exceeds the 
> allowed maximum (8192)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:146)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:142)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:99)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:75)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
> at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
> at 
> 

[jira] [Updated] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-03-04 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-6749:

Fix Version/s: (was: 1.10.1)

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.8.12, 1.10
>
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch, 
> repack_binaries.groovy
>
>
> We have run into an issue when trying to transition from an active/active 
> Mongo NodeStore cluster to a single Segment-Tar server with cold standby. The 
> issue itself manifests when the standby server tries to pull changes from the 
> primary after the first round of online revision GC.
> Let me summarize the way we ended up with the current state, and my 
> hypothesis about what happened, based on my debugging so far:
> # We started with a Mongo NodeStore and an external FileDataStore as the blob 
> store. The FileDataStore was set up with minRecordLength=4096. The Mongo 
> store stores blobs below minRecordLength as special "in-memory" blobIDs where 
> the data itself is baked into the ID string in hex.
> # We have executed a sidegrade of the Mongo store into a Segment-Tar store. 
> Our datastore is over 1TB in size, so copying the binaries wasn't an option. 
> The new repository is simply reusing the existing datastore. The "in-memory" 
> blobIDs still look like external blobIDs to the sidegrade process, so they 
> were copied into the Segment-Tar repository as-is, instead of being converted 
> into the efficient in-line format.
> # The server started up without issues on the new Segment-Tar store. The 
> migrated "in-memory" blob IDs seem to work fine, if a bit sub-optimal.
> # At this point, we have created a cold standby instance by copying the files 
> of the stopped primary instance and making the necessary config changes on 
> both servers.
> # Everything worked fine until the primary server started its first round of 
> online revision GC. After that process completed, the standby node started 
> throwing exceptions about missing segments, and eventually stopped 
> altogether. In the meantime, the following warning showed up in the primary 
> log:
> {code:java}
> 29.09.2017 06:12:08.088 *WARN* [nioEventLoopGroup-3-10] 
> org.apache.jackrabbit.oak.segment.standby.server.ExceptionHandler Exception 
> caught on the server
> io.netty.handler.codec.TooLongFrameException: frame length (8208) exceeds the 
> allowed maximum (8192)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:146)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:142)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:99)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:75)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
> at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
> at 
> 

[jira] [Commented] (OAK-8063) The cold standby client doesn't correctly handle backward references

2019-03-04 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16783356#comment-16783356
 ] 

Francesco Mari commented on OAK-8063:
-

[~dulceanu], I like your patch very much since it greatly simplifies the 
transfer logic. I propose a third version of the patch with a minor change just 
to avoid repeating the same check in two different places and to consistently 
log the root segment ID where the traversal starts.
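
For context, the traversal problem described in the quoted issue below boils 
down to emitting every segment after the segments it references. Here is a toy 
model of such a dependencies-first (post-order) traversal, using plain Java 
maps to stand in for the segment graph; it is an illustration of the idea only, 
not the actual patch.

{code:java}
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TransferOrderDemo {

    // Post-order DFS: every referenced segment is emitted before the
    // segment that references it.
    static void visit(String segment, Map<String, List<String>> refs,
                      Set<String> visited, List<String> order) {
        if (!visited.add(segment)) {
            return; // already scheduled for transfer
        }
        for (String referenced : refs.getOrDefault(segment, List.of())) {
            visit(referenced, refs, visited, order);
        }
        order.add(segment);
    }

    public static void main(String[] args) {
        // The graph from the issue: S1 references {S2, S3}, S3 references S2.
        Map<String, List<String>> refs = Map.of(
                "S1", List.of("S2", "S3"),
                "S3", List.of("S2"));
        List<String> order = new ArrayList<>();
        visit("S1", refs, new HashSet<>(), order);
        System.out.println(order); // [S2, S3, S1]
    }
}
{code}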

> The cold standby client doesn't correctly handle backward references
> 
>
> Key: OAK-8063
> URL: https://issues.apache.org/jira/browse/OAK-8063
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, tarmk-standby
>Affects Versions: 1.6.0
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: cold-standby
> Fix For: 1.12, 1.11.0, 1.8.12, 1.10.2
>
> Attachments: OAK-8063-02.patch, OAK-8063-03.patch, OAK-8063.patch
>
>
> The logic from {{StandbyClientSyncExecution#copySegmentHierarchyFromPrimary}} 
> has a flaw when it comes to "backward references". Suppose we have the 
> following data segment graph to be transferred from primary: S1, which 
> references \{S2, S3} and S3 which references S2. Then, the correct transfer 
> order should be S2, S3 and S1.
> Going through the current logic employed by the method, here's what happens:
> {noformat}
> Step 0: batch={S1}
> Step 1: visited={S1}, data={S1}, batch={S2, S3}, queued={S2, S3}
> Step 2: visited={S1, S2}, data={S2, S1}, batch={S3}, queued={S2, S3}
> Step 3: visited={S1, S2, S3}, data={S3, S2, S1}, batch={}, queued={S2, 
> S3}.{noformat}
> Therefore, at the end of the loop, the order of the segments to be 
> transferred will be S3, S2, S1, which might trigger a 
> {{SegmentNotFoundException}} when S3 is further processed, because S2 is 
> missing on standby (see OAK-8006).
> /cc [~frm]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-03-04 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16783364#comment-16783364
 ] 

Francesco Mari commented on OAK-6749:
-

[~dulceanu], this issue hasn't been backported to 1.10. My bad. I think I was 
fooled by the Jira versions and assumed that the fix managed to land on 1.10 
before 1.10.1 was cut. I will take care of the backport, thanks for pointing it 
out.

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.10.1, 1.8.12, 1.10
>
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch, 
> repack_binaries.groovy
>
>
> We have run into an issue when trying to transition from an active/active 
> Mongo NodeStore cluster to a single Segment-Tar server with cold standby. The 
> issue itself manifests when the standby server tries to pull changes from the 
> primary after the first round of online revision GC.
> Let me summarize the way we ended up with the current state, and my 
> hypothesis about what happened, based on my debugging so far:
> # We started with a Mongo NodeStore and an external FileDataStore as the blob 
> store. The FileDataStore was set up with minRecordLength=4096. The Mongo 
> store stores blobs below minRecordLength as special "in-memory" blobIDs where 
> the data itself is baked into the ID string in hex.
> # We have executed a sidegrade of the Mongo store into a Segment-Tar store. 
> Our datastore is over 1TB in size, so copying the binaries wasn't an option. 
> The new repository is simply reusing the existing datastore. The "in-memory" 
> blobIDs still look like external blobIDs to the sidegrade process, so they 
> were copied into the Segment-Tar repository as-is, instead of being converted 
> into the efficient in-line format.
> # The server started up without issues on the new Segment-Tar store. The 
> migrated "in-memory" blob IDs seem to work fine, if a bit sub-optimal.
> # At this point, we have created a cold standby instance by copying the files 
> of the stopped primary instance and making the necessary config changes on 
> both servers.
> # Everything worked fine until the primary server started its first round of 
> online revision GC. After that process completed, the standby node started 
> throwing exceptions about missing segments, and eventually stopped 
> altogether. In the meantime, the following warning showed up in the primary 
> log:
> {code:java}
> 29.09.2017 06:12:08.088 *WARN* [nioEventLoopGroup-3-10] 
> org.apache.jackrabbit.oak.segment.standby.server.ExceptionHandler Exception 
> caught on the server
> io.netty.handler.codec.TooLongFrameException: frame length (8208) exceeds the 
> allowed maximum (8192)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:146)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:142)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:99)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:75)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> 

[jira] [Commented] (OAK-8006) SegmentBlob#readLongBlobId might cause SegmentNotFoundException on standby

2019-03-04 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16783369#comment-16783369
 ] 

Francesco Mari commented on OAK-8006:
-

That's correct, I'm getting rusty. In this case, I think it's safe to proceed 
with the backport.

> SegmentBlob#readLongBlobId might cause SegmentNotFoundException on standby
> --
>
> Key: OAK-8006
> URL: https://issues.apache.org/jira/browse/OAK-8006
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, tarmk-standby
>Affects Versions: 1.6.0
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: cold-standby
> Fix For: 1.12, 1.11.0, 1.10.1
>
> Attachments: OAK-8006-02.patch, OAK-8006-test.patch, OAK-8006.patch
>
>
> When persisting a segment transferred from master, among others, the cold 
> standby needs to read the binary references from the segment. While this 
> usually doesn't involve any additional reads from any other segments, there 
> is a special case concerning binary IDs larger than 4092 bytes. These can 
> live in other segments (which got transferred prior to the current segment 
> and are already on the standby), but it might also be the case that the 
> binary ID is stored in the same segment. If this happens, the call to 
> {{blobId.getSegment()}}[0] triggers a new read of the current, un-persisted 
> segment. Thus, a {{SegmentNotFoundException}} is thrown:
> {noformat}
> 22.01.2019 09:35:59.345 *ERROR* [standby-run-1] 
> org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSync Failed 
> synchronizing state.
> org.apache.jackrabbit.oak.segment.SegmentNotFoundException: Segment 
> d40a9da6-06a2-4dc0-ab91-5554a33c02b0 not found
> at 
> org.apache.jackrabbit.oak.segment.file.AbstractFileStore.readSegmentUncached(AbstractFileStore.java:284)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.file.FileStore.lambda$readSegment$10(FileStore.java:498)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.SegmentCache$NonEmptyCache.lambda$getSegment$0(SegmentCache.java:163)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4724)
>  [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3522)
>  [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2315) 
> [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2278)
>  [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2193) 
> [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at com.google.common.cache.LocalCache.get(LocalCache.java:3932) 
> [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4721) 
> [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> org.apache.jackrabbit.oak.segment.SegmentCache$NonEmptyCache.getSegment(SegmentCache.java:160)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.file.FileStore.readSegment(FileStore.java:498)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.SegmentId.getSegment(SegmentId.java:153) 
> [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.RecordId.getSegment(RecordId.java:98) 
> [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.SegmentBlob.readLongBlobId(SegmentBlob.java:206)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.SegmentBlob.readBlobId(SegmentBlob.java:163)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.file.AbstractFileStore$3.consume(AbstractFileStore.java:262)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.Segment.forEachRecord(Segment.java:601) 
> [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.file.AbstractFileStore.readBinaryReferences(AbstractFileStore.java:257)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.file.FileStore.writeSegment(FileStore.java:533)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 

[jira] [Commented] (OAK-8006) SegmentBlob#readLongBlobId might cause SegmentNotFoundException on standby

2019-03-04 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783358#comment-16783358
 ] 

Francesco Mari commented on OAK-8006:
-

[~dulceanu], I think we should backport this to 1.8. While looking at this 
issue, I wondered if this shouldn't already be solved by OAK-8063. Shouldn't a 
proper ordering of the segments to be persisted prevent this problem from 
occurring at all?

> SegmentBlob#readLongBlobId might cause SegmentNotFoundException on standby
> --
>
> Key: OAK-8006
> URL: https://issues.apache.org/jira/browse/OAK-8006
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, tarmk-standby
>Affects Versions: 1.6.0
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: cold-standby
> Fix For: 1.12, 1.11.0, 1.10.1
>
> Attachments: OAK-8006-02.patch, OAK-8006-test.patch, OAK-8006.patch
>
>
> When persisting a segment transferred from the master, the cold standby 
> needs, among other things, to read the binary references from the segment. 
> While this usually doesn't involve additional reads from other segments, 
> there is a special case concerning binary IDs larger than 4092 bytes. These 
> can live in other segments (which were transferred prior to the current 
> segment and are already on the standby), but the binary ID might also be 
> stored in the same segment. If this happens, the call to 
> {{blobId.getSegment()}}[0] triggers a new read of the current, un-persisted 
> segment, and a {{SegmentNotFoundException}} is thrown:
> {noformat}
> 22.01.2019 09:35:59.345 *ERROR* [standby-run-1] 
> org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSync Failed 
> synchronizing state.
> org.apache.jackrabbit.oak.segment.SegmentNotFoundException: Segment 
> d40a9da6-06a2-4dc0-ab91-5554a33c02b0 not found
> at 
> org.apache.jackrabbit.oak.segment.file.AbstractFileStore.readSegmentUncached(AbstractFileStore.java:284)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.file.FileStore.lambda$readSegment$10(FileStore.java:498)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.SegmentCache$NonEmptyCache.lambda$getSegment$0(SegmentCache.java:163)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4724)
>  [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3522)
>  [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2315) 
> [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2278)
>  [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2193) 
> [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at com.google.common.cache.LocalCache.get(LocalCache.java:3932) 
> [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4721) 
> [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> org.apache.jackrabbit.oak.segment.SegmentCache$NonEmptyCache.getSegment(SegmentCache.java:160)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.file.FileStore.readSegment(FileStore.java:498)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.SegmentId.getSegment(SegmentId.java:153) 
> [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.RecordId.getSegment(RecordId.java:98) 
> [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.SegmentBlob.readLongBlobId(SegmentBlob.java:206)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.SegmentBlob.readBlobId(SegmentBlob.java:163)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.file.AbstractFileStore$3.consume(AbstractFileStore.java:262)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.Segment.forEachRecord(Segment.java:601) 
> [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.file.AbstractFileStore.readBinaryReferences(AbstractFileStore.java:257)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> 

[jira] [Updated] (OAK-8063) The cold standby client doesn't correctly handle backward references

2019-03-04 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-8063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-8063:

Attachment: OAK-8063-03.patch

> The cold standby client doesn't correctly handle backward references
> 
>
> Key: OAK-8063
> URL: https://issues.apache.org/jira/browse/OAK-8063
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, tarmk-standby
>Affects Versions: 1.6.0
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: cold-standby
> Fix For: 1.12, 1.11.0, 1.8.12, 1.10.2
>
> Attachments: OAK-8063-02.patch, OAK-8063-03.patch, OAK-8063.patch
>
>
> The logic from {{StandbyClientSyncExecution#copySegmentHierarchyFromPrimary}} 
> has a flaw when it comes to "backward references". Suppose we have the 
> following data segment graph to be transferred from the primary: S1, which 
> references \{S2, S3}, and S3, which references S2. Then the correct transfer 
> order is S2, S3, S1.
> Going through the current logic employed by the method, here's what happens:
> {noformat}
> Step 0: batch={S1}
> Step 1: visited={S1}, data={S1}, batch={S2, S3}, queued={S2, S3}
> Step 2: visited={S1, S2}, data={S2, S1}, batch={S3}, queued={S2, S3}
> Step 3: visited={S1, S2, S3}, data={S3, S2, S1}, batch={}, queued={S2, 
> S3}.{noformat}
> Therefore, at the end of the loop, the order of the segments to be 
> transferred will be S3, S2, S1, which might trigger a 
> {{SegmentNotFoundException}} when S3 is further processed, because S2 is 
> missing on standby (see OAK-8006).
> /cc [~frm]
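
To make the flaw above concrete, here is a minimal, self-contained sketch 
(plain Java with stand-in segment IDs, not the actual 
{{StandbyClientSyncExecution}} code): the breadth-first loop that prepends 
each visited segment reverses the order of siblings, so S3 is scheduled 
before the S2 it references, while a depth-first post-order always emits 
referenced segments first.

{code:java}
import java.util.*;

public class TransferOrder {

    // Hypothetical reference graph: S1 -> {S2, S3}, S3 -> {S2}.
    static final Map<String, List<String>> REFS = Map.of(
            "S1", List.of("S2", "S3"),
            "S2", List.of(),
            "S3", List.of("S2"));

    // Mimics the flawed loop: breadth-first, prepending each visited segment.
    static List<String> flawedOrder(String root) {
        LinkedList<String> data = new LinkedList<>();
        Set<String> visited = new HashSet<>();
        Deque<String> batch = new ArrayDeque<>(List.of(root));
        while (!batch.isEmpty()) {
            String s = batch.removeFirst();
            if (!visited.add(s)) {
                continue;
            }
            data.addFirst(s);                    // prepend reverses siblings
            REFS.get(s).forEach(batch::addLast);
        }
        return data;                             // [S3, S2, S1] -- S3 before S2!
    }

    // Depth-first post-order: referenced segments are always emitted first.
    static List<String> correctOrder(String root) {
        List<String> out = new ArrayList<>();
        visit(root, new HashSet<>(), out);
        return out;                              // [S2, S3, S1]
    }

    static void visit(String s, Set<String> visited, List<String> out) {
        if (!visited.add(s)) {
            return;
        }
        for (String ref : REFS.get(s)) {
            visit(ref, visited, out);
        }
        out.add(s);
    }

    public static void main(String[] args) {
        System.out.println(flawedOrder("S1"));   // [S3, S2, S1]
        System.out.println(correctOrder("S1"));  // [S2, S3, S1]
    }
}
{code}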



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-8006) SegmentBlob#readLongBlobId might cause SegmentNotFoundException on standby

2019-01-30 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755942#comment-16755942
 ] 

Francesco Mari commented on OAK-8006:
-

[~dulceanu], this complicates everything. Every solution I can think of is 
either a lot of effort, a big hack, or both.

 * Remove {{SegmentId#getSegment}} and face the consequences. This will have a 
huge ripple effect across the code, but it removes the root cause of this 
issue: there is a backdoor into loading a segment that can cause unforeseen 
behaviours. We probably don't want to do that.
 * Adjust everything across the stack trace with the same kind of logic that 
was originally implemented by the patch: if this ID points to the current 
segment, use it, otherwise load the segment (a minimal sketch of this guard 
follows the list). This is challenging too, because there are too many places 
where this logic must be implemented, and where the "current segment" must be 
passed. Moreover, we will never be sure that we didn't miss a spot.
 * Use a different strategy when persisting segments on the standby instance. 
Instead of computing the references and binary references right when the 
segment is persisted, defer that computation to a later stage. For example, 
before a {{TarWriter}} is closed and turned into a {{TarReader}}, it might go 
through all the segments it contains, update the relevant indexes, and persist 
those indexes (see the second sketch further below). This is the solution with 
the best complexity/benefit ratio I can think of.
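
A minimal sketch of the guard described in the second option, under the 
assumption of heavily simplified stand-in types (the real change would have 
to be threaded through {{SegmentBlob}} and every similar read path):

{code:java}
import java.util.HashMap;
import java.util.Map;

// Stand-ins: segments are plain strings keyed by ID; `persisted` plays
// the role of the standby's store, which only knows segments that are
// already on disk.
public class CurrentSegmentGuard {

    private final Map<String, String> persisted = new HashMap<>();

    // Resolve the segment a record ID points to while `currentId` is
    // still being written. A self-reference must be answered from the
    // in-flight segment itself; only backward references may go through
    // the store.
    String resolve(String segmentId, String currentId, String currentSegment) {
        if (segmentId.equals(currentId)) {
            // Avoids the read that ends in SegmentNotFoundException.
            return currentSegment;
        }
        String segment = persisted.get(segmentId);
        if (segment == null) {
            // In Oak, this is where SegmentNotFoundException surfaces.
            throw new IllegalStateException("Segment " + segmentId + " not found");
        }
        return segment;
    }
}
{code}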

Maybe there is some other option as well. Maybe we can use a custom 
{{FileStore}} implementation that wraps the real {{FileStore}} but that is also 
aware of the segments not yet persisted on disk (the old standby implementation 
did that and we stopped doing it for a good reason). Maybe it is just enough to 
introduce a new implementation of {{SegmentStore}} or {{SegmentIdProvider}}, 
but I'm not sure if it helps.
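
For illustration, a rough sketch of the deferred-index idea from the third 
option; the types below are made up for this comment and do not mirror the 
actual {{TarWriter}}/{{TarReader}} API:

{code:java}
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: buffers segments as opaque byte arrays and defers
// binary-reference extraction until close(), when every segment of the
// archive is available and self-references can be resolved locally.
public class DeferredBinaryReferenceIndex {

    interface ReferenceExtractor {
        List<String> binaryReferences(byte[] segment, Map<String, byte[]> all);
    }

    private final Map<String, byte[]> segments = new HashMap<>();
    private final ReferenceExtractor extractor;

    DeferredBinaryReferenceIndex(ReferenceExtractor extractor) {
        this.extractor = extractor;
    }

    void writeSegment(String id, byte[] data) {
        segments.put(id, data);   // no reference extraction here
    }

    // Called before the writer turns into a reader: by now, every
    // segment referenced from within the archive is present.
    Map<String, List<String>> close() {
        Map<String, List<String>> index = new HashMap<>();
        segments.forEach((id, data) ->
                index.put(id, extractor.binaryReferences(data, segments)));
        return index;
    }
}
{code}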

> SegmentBlob#readLongBlobId might cause SegmentNotFoundException on standby
> --
>
> Key: OAK-8006
> URL: https://issues.apache.org/jira/browse/OAK-8006
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, tarmk-standby
>Affects Versions: 1.6.0
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: cold-standby
> Fix For: 1.12, 1.10.1, 1.8.12
>
> Attachments: OAK-8006-test.patch, OAK-8006.patch
>
>
> When persisting a segment transferred from the master, the cold standby 
> needs, among other things, to read the binary references from the segment. 
> While this usually doesn't involve additional reads from other segments, 
> there is a special case concerning binary IDs larger than 4092 bytes. These 
> can live in other segments (which were transferred prior to the current 
> segment and are already on the standby), but the binary ID might also be 
> stored in the same segment. If this happens, the call to 
> {{blobId.getSegment()}}[0] triggers a new read of the current, un-persisted 
> segment, and a {{SegmentNotFoundException}} is thrown:
> {noformat}
> 22.01.2019 09:35:59.345 *ERROR* [standby-run-1] 
> org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSync Failed 
> synchronizing state.
> org.apache.jackrabbit.oak.segment.SegmentNotFoundException: Segment 
> d40a9da6-06a2-4dc0-ab91-5554a33c02b0 not found
> at 
> org.apache.jackrabbit.oak.segment.file.AbstractFileStore.readSegmentUncached(AbstractFileStore.java:284)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.file.FileStore.lambda$readSegment$10(FileStore.java:498)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.SegmentCache$NonEmptyCache.lambda$getSegment$0(SegmentCache.java:163)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4724)
>  [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3522)
>  [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2315) 
> [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2278)
>  [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2193) 
> [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at com.google.common.cache.LocalCache.get(LocalCache.java:3932) 
> [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> 

[jira] [Commented] (OAK-8014) Commits carrying over from previous GC generation can block other threads from committing

2019-01-30 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755882#comment-16755882
 ] 

Francesco Mari commented on OAK-8014:
-

[~mduerig], the patch proves the point and should fix the observed behaviour. I 
think it's perfectly safe to move the call to {{changes.getNodeState()}} 
outside the lock.
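
Schematically, the change amounts to the following (placeholder names, not 
the actual {{LockBasedScheduler}} code): evaluate the potentially slow 
{{getNodeState()}} deep copy before taking the commit lock, so that only the 
cheap merge happens while other committers wait.

{code:java}
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;
import java.util.function.Supplier;

public class CommitSketch {

    private final ReentrantLock commitLock = new ReentrantLock();

    // `deepCopy` stands in for changes.getNodeState(), which may deep-copy
    // state carried over from an older GC generation and, on a cold
    // deduplication cache, take minutes.
    <T> void commit(Supplier<T> deepCopy, Consumer<T> merge) {
        T state = deepCopy.get();   // expensive part, done outside the lock
        commitLock.lock();
        try {
            merge.accept(state);    // critical section stays short
        } finally {
            commitLock.unlock();
        }
    }
}
{code}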

> Commits carrying over from previous GC generation can block other threads 
> from committing
> -
>
> Key: OAK-8014
> URL: https://issues.apache.org/jira/browse/OAK-8014
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Affects Versions: 1.10.0, 1.8.11
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>Priority: Blocker
>  Labels: TarMK
> Fix For: 1.8.12
>
> Attachments: OAK-8014.patch
>
>
> A commit that is based on a previous (full) generation can block other 
> commits from progressing for a long time. This happens because such a commit 
> will do a deep copy of its state to avoid linking to old segments (see 
> OAK-3348). Most of the deep copying is usually avoided by the deduplication 
> caches. However, in cases where the cache hit rate is not good enough, we 
> have seen deep copy operations take up to several minutes. Sometimes this deep copy 
> operation happens inside the commit lock of 
> {{LockBasedScheduler.schedule()}}, which then causes all other commits to 
> become blocked.
> cc [~rma61...@adobe.com], [~edivad]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-8006) SegmentBlob#readLongBlobId might cause SegmentNotFoundException on standby

2019-01-29 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754821#comment-16754821
 ] 

Francesco Mari commented on OAK-8006:
-

[~dulceanu], I was able to sketch out a test case that activates the desired 
code path in {{SegmentBlob}}. The test case doesn't assert anything yet, but 
if you step into the call to {{blob.length()}} with your debugger, you can see 
that the new code is called. Maybe we can build on top of that?

{noformat}
public class SegmentBlobTest {

// `BLOB_SIZE` has to be chosen in such a way that it is all of the following:
//
// - Less than or equal to `minRecordLength` in the Blob Store. This is
//   necessary in order to create a sufficiently long blob ID.
// - Greater than or equal to `Segment.BLOB_ID_SMALL_LIMIT` in order to save
//   the blob ID as a large, external blob ID.
// - Greater than or equal to `Segment.MEDIUM_LIMIT`, otherwise the content
//   of the binary will be written as a mere value record instead of a
//   binary ID referencing an external binary.
//
// Since `Segment.MEDIUM_LIMIT` is the largest of the constants above, it is
// sufficient to choose `BLOB_SIZE` at least that large; twice that value is
// used below.

private static final int BLOB_SIZE = 2 * SegmentTestConstants.MEDIUM_LIMIT;

private TemporaryFolder folder = new TemporaryFolder(new File("target"));

private TemporaryBlobStore blobStore = new TemporaryBlobStore(folder) {

@Override
protected void configureDataStore(FileDataStore dataStore) {
dataStore.setMinRecordLength(BLOB_SIZE);
}

};

private TemporaryFileStore fileStore = new TemporaryFileStore(folder, 
blobStore, false);

@Rule
public RuleChain ruleChain = RuleChain.outerRule(folder)
.around(blobStore)
.around(fileStore);

@Test
public void testReadLongBlobId() throws Exception {
SegmentNodeStore nodeStore = 
SegmentNodeStoreBuilders.builder(fileStore.fileStore()).build();
Blob blob = nodeStore.createBlob(new NullInputStream(BLOB_SIZE));
blob.length();
}
}
{noformat}

> SegmentBlob#readLongBlobId might cause SegmentNotFoundException on standby
> --
>
> Key: OAK-8006
> URL: https://issues.apache.org/jira/browse/OAK-8006
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, tarmk-standby
>Affects Versions: 1.6.0
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: cold-standby
> Fix For: 1.12, 1.10.1, 1.8.12
>
> Attachments: OAK-8006.patch
>
>
> When persisting a segment transferred from the master, the cold standby 
> needs, among other things, to read the binary references from the segment. 
> While this usually doesn't involve additional reads from other segments, 
> there is a special case concerning binary IDs larger than 4092 bytes. These 
> can live in other segments (which were transferred prior to the current 
> segment and are already on the standby), but the binary ID might also be 
> stored in the same segment. If this happens, the call to 
> {{blobId.getSegment()}}[0] triggers a new read of the current, un-persisted 
> segment, and a {{SegmentNotFoundException}} is thrown:
> {noformat}
> 22.01.2019 09:35:59.345 *ERROR* [standby-run-1] 
> org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSync Failed 
> synchronizing state.
> org.apache.jackrabbit.oak.segment.SegmentNotFoundException: Segment 
> d40a9da6-06a2-4dc0-ab91-5554a33c02b0 not found
> at 
> org.apache.jackrabbit.oak.segment.file.AbstractFileStore.readSegmentUncached(AbstractFileStore.java:284)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.file.FileStore.lambda$readSegment$10(FileStore.java:498)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.SegmentCache$NonEmptyCache.lambda$getSegment$0(SegmentCache.java:163)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4724)
>  [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3522)
>  [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2315) 
> [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2278)
>  [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2193) 
> 

[jira] [Commented] (OAK-8006) SegmentBlob#readLongBlobId might cause SegmentNotFoundException on standby

2019-01-29 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754785#comment-16754785
 ] 

Francesco Mari commented on OAK-8006:
-

[~dulceanu], LGTM. Given the urgency of this issue, I would fix it immediately 
on trunk and backport it to 1.10, but I would hold off on further backports 
until we come up with a way to reproduce this with a test case.

> SegmentBlob#readLongBlobId might cause SegmentNotFoundException on standby
> --
>
> Key: OAK-8006
> URL: https://issues.apache.org/jira/browse/OAK-8006
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, tarmk-standby
>Affects Versions: 1.6.0
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: cold-standby
> Fix For: 1.12, 1.10.1, 1.8.12
>
> Attachments: OAK-8006.patch
>
>
> When persisting a segment transferred from the master, the cold standby 
> needs, among other things, to read the binary references from the segment. 
> While this usually doesn't involve additional reads from other segments, 
> there is a special case concerning binary IDs larger than 4092 bytes. These 
> can live in other segments (which were transferred prior to the current 
> segment and are already on the standby), but the binary ID might also be 
> stored in the same segment. If this happens, the call to 
> {{blobId.getSegment()}}[0] triggers a new read of the current, un-persisted 
> segment, and a {{SegmentNotFoundException}} is thrown:
> {noformat}
> 22.01.2019 09:35:59.345 *ERROR* [standby-run-1] 
> org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSync Failed 
> synchronizing state.
> org.apache.jackrabbit.oak.segment.SegmentNotFoundException: Segment 
> d40a9da6-06a2-4dc0-ab91-5554a33c02b0 not found
> at 
> org.apache.jackrabbit.oak.segment.file.AbstractFileStore.readSegmentUncached(AbstractFileStore.java:284)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.file.FileStore.lambda$readSegment$10(FileStore.java:498)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.SegmentCache$NonEmptyCache.lambda$getSegment$0(SegmentCache.java:163)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4724)
>  [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3522)
>  [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2315) 
> [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2278)
>  [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2193) 
> [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at com.google.common.cache.LocalCache.get(LocalCache.java:3932) 
> [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4721) 
> [com.adobe.granite.osgi.wrapper.guava:15.0.0.0002]
> at 
> org.apache.jackrabbit.oak.segment.SegmentCache$NonEmptyCache.getSegment(SegmentCache.java:160)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.file.FileStore.readSegment(FileStore.java:498)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.SegmentId.getSegment(SegmentId.java:153) 
> [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.RecordId.getSegment(RecordId.java:98) 
> [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.SegmentBlob.readLongBlobId(SegmentBlob.java:206)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.SegmentBlob.readBlobId(SegmentBlob.java:163)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.file.AbstractFileStore$3.consume(AbstractFileStore.java:262)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.Segment.forEachRecord(Segment.java:601) 
> [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> org.apache.jackrabbit.oak.segment.file.AbstractFileStore.readBinaryReferences(AbstractFileStore.java:257)
>  [org.apache.jackrabbit.oak-segment-tar:1.10.0]
> at 
> 

[jira] [Commented] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-24 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750980#comment-16750980
 ] 

Francesco Mari commented on OAK-6749:
-

[~Csaba Varga], thanks for your contribution.

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.10.1, 1.8.12, 1.10
>
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch, 
> repack_binaries.groovy
>
>
> We have run into an issue when trying to transition from an active/active 
> Mongo NodeStore cluster to a single Segment-Tar server with cold standby. The 
> issue itself manifests when the standby server tries to pull changes from the 
> primary after the first round of online revision GC.
> Let me summarize the way we ended up with the current state, and my 
> hypothesis about what happened, based on my debugging so far:
> # We started with a Mongo NodeStore and an external FileDataStore as the blob 
> store. The FileDataStore was set up with minRecordLength=4096. The Mongo 
> store stores blobs below minRecordLength as special "in-memory" blobIDs where 
> the data itself is baked into the ID string in hex.
> # We have executed a sidegrade of the Mongo store into a Segment-Tar store. 
> Our datastore is over 1TB in size, so copying the binaries wasn't an option. 
> The new repository is simply reusing the existing datastore. The "in-memory" 
> blobIDs still look like external blobIDs to the sidegrade process, so they 
> were copied into the Segment-Tar repository as-is, instead of being converted 
> into the efficient in-line format.
> # The server started up without issues on the new Segment-Tar store. The 
> migrated "in-memory" blob IDs seem to work fine, if a bit sub-optimal.
> # At this point, we have created a cold standby instance by copying the files 
> of the stopped primary instance and making the necessary config changes on 
> both servers.
> # Everything worked fine until the primary server started its first round of 
> online revision GC. After that process completed, the standby node started 
> throwing exceptions about missing segments, and eventually stopped 
> altogether. In the meantime, the following warning showed up in the primary 
> log:
> {code:java}
> 29.09.2017 06:12:08.088 *WARN* [nioEventLoopGroup-3-10] 
> org.apache.jackrabbit.oak.segment.standby.server.ExceptionHandler Exception 
> caught on the server
> io.netty.handler.codec.TooLongFrameException: frame length (8208) exceeds the 
> allowed maximum (8192)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:146)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:142)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:99)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:75)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
> at 
> 
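
The 8208-byte frame in the warning is consistent with such an "in-memory" 
blob ID: a record just under minRecordLength=4096 bytes is hex-encoded into 
roughly 2 x 4096 = 8192 characters, and together with the ID's short 
prefix/length header it exceeds Netty's default 8192-byte line limit. A 
back-of-the-envelope check (the 16-byte header overhead is an assumption for 
illustration, not Oak's exact ID layout):

{code:java}
public class FrameLengthCheck {
    public static void main(String[] args) {
        int minRecordLength = 4096;          // FileDataStore setting
        int hexChars = 2 * minRecordLength;  // two hex digits per byte
        int header = 16;                     // assumed prefix/length overhead
        int frame = hexChars + header;       // 8208, as in the warning
        System.out.println(frame + " > 8192 = " + (frame > 8192));
    }
}
{code}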

[jira] [Commented] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-24 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750930#comment-16750930
 ] 

Francesco Mari commented on OAK-6749:
-

[~Csaba Varga], I think there is some value in sharing your script on this 
issue. If other users get stuck with this problem on 1.6, this issue will 
probably pop up. If that happens, you would surely do a great service to 
someone else!

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.10.1, 1.8.12, 1.10
>
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch
>
>
> We have run into an issue when trying to transition from an active/active 
> Mongo NodeStore cluster to a single Segment-Tar server with cold standby. The 
> issue itself manifests when the standby server tries to pull changes from the 
> primary after the first round of online revision GC.
> Let me summarize the way we ended up with the current state, and my 
> hypothesis about what happened, based on my debugging so far:
> # We started with a Mongo NodeStore and an external FileDataStore as the blob 
> store. The FileDataStore was set up with minRecordLength=4096. The Mongo 
> store stores blobs below minRecordLength as special "in-memory" blobIDs where 
> the data itself is baked into the ID string in hex.
> # We have executed a sidegrade of the Mongo store into a Segment-Tar store. 
> Our datastore is over 1TB in size, so copying the binaries wasn't an option. 
> The new repository is simply reusing the existing datastore. The "in-memory" 
> blobIDs still look like external blobIDs to the sidegrade process, so they 
> were copied into the Segment-Tar repository as-is, instead of being converted 
> into the efficient in-line format.
> # The server started up without issues on the new Segment-Tar store. The 
> migrated "in-memory" blob IDs seem to work fine, if a bit sub-optimal.
> # At this point, we have created a cold standby instance by copying the files 
> of the stopped primary instance and making the necessary config changes on 
> both servers.
> # Everything worked fine until the primary server started its first round of 
> online revision GC. After that process completed, the standby node started 
> throwing exceptions about missing segments, and eventually stopped 
> altogether. In the meantime, the following warning showed up in the primary 
> log:
> {code:java}
> 29.09.2017 06:12:08.088 *WARN* [nioEventLoopGroup-3-10] 
> org.apache.jackrabbit.oak.segment.standby.server.ExceptionHandler Exception 
> caught on the server
> io.netty.handler.codec.TooLongFrameException: frame length (8208) exceeds the 
> allowed maximum (8192)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:146)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:142)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:99)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:75)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> 

[jira] [Resolved] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-23 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari resolved OAK-6749.
-
Resolution: Fixed

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.10.1, 1.8.12, 1.10
>
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch
>
>
> We have run into an issue when trying to transition from an active/active 
> Mongo NodeStore cluster to a single Segment-Tar server with cold standby. The 
> issue itself manifests when the standby server tries to pull changes from the 
> primary after the first round of online revision GC.
> Let me summarize the way we ended up with the current state, and my 
> hypothesis about what happened, based on my debugging so far:
> # We started with a Mongo NodeStore and an external FileDataStore as the blob 
> store. The FileDataStore was set up with minRecordLength=4096. The Mongo 
> store stores blobs below minRecordLength as special "in-memory" blobIDs where 
> the data itself is baked into the ID string in hex.
> # We have executed a sidegrade of the Mongo store into a Segment-Tar store. 
> Our datastore is over 1TB in size, so copying the binaries wasn't an option. 
> The new repository is simply reusing the existing datastore. The "in-memory" 
> blobIDs still look like external blobIDs to the sidegrade process, so they 
> were copied into the Segment-Tar repository as-is, instead of being converted 
> into the efficient in-line format.
> # The server started up without issues on the new Segment-Tar store. The 
> migrated "in-memory" blob IDs seem to work fine, if a bit sub-optimal.
> # At this point, we have created a cold standby instance by copying the files 
> of the stopped primary instance and making the necessary config changes on 
> both servers.
> # Everything worked fine until the primary server started its first round of 
> online revision GC. After that process completed, the standby node started 
> throwing exceptions about missing segments, and eventually stopped 
> altogether. In the meantime, the following warning showed up in the primary 
> log:
> {code:java}
> 29.09.2017 06:12:08.088 *WARN* [nioEventLoopGroup-3-10] 
> org.apache.jackrabbit.oak.segment.standby.server.ExceptionHandler Exception 
> caught on the server
> io.netty.handler.codec.TooLongFrameException: frame length (8208) exceeds the 
> allowed maximum (8192)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:146)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:142)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:99)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:75)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
> at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
> at 
> 

[jira] [Commented] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-23 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16749811#comment-16749811
 ] 

Francesco Mari commented on OAK-6749:
-

The version of the Cold Standby in 1.6 is way too primitive compared to 1.8. 
Too many previous changes would need to be backported to make this backport 
feasible, and that's just too risky - especially since [~Csaba Varga] managed 
to find a workaround for the issue. I'm going to resolve this issue.

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.10.1, 1.8.12, 1.10
>
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch
>
>
> We have run into an issue when trying to transition from an active/active 
> Mongo NodeStore cluster to a single Segment-Tar server with cold standby. The 
> issue itself manifests when the standby server tries to pull changes from the 
> primary after the first round of online revision GC.
> Let me summarize the way we ended up with the current state, and my 
> hypothesis about what happened, based on my debugging so far:
> # We started with a Mongo NodeStore and an external FileDataStore as the blob 
> store. The FileDataStore was set up with minRecordLength=4096. The Mongo 
> store stores blobs below minRecordLength as special "in-memory" blobIDs where 
> the data itself is baked into the ID string in hex.
> # We have executed a sidegrade of the Mongo store into a Segment-Tar store. 
> Our datastore is over 1TB in size, so copying the binaries wasn't an option. 
> The new repository is simply reusing the existing datastore. The "in-memory" 
> blobIDs still look like external blobIDs to the sidegrade process, so they 
> were copied into the Segment-Tar repository as-is, instead of being converted 
> into the efficient in-line format.
> # The server started up without issues on the new Segment-Tar store. The 
> migrated "in-memory" blob IDs seem to work fine, if a bit sub-optimal.
> # At this point, we have created a cold standby instance by copying the files 
> of the stopped primary instance and making the necessary config changes on 
> both servers.
> # Everything worked fine until the primary server started its first round of 
> online revision GC. After that process completed, the standby node started 
> throwing exceptions about missing segments, and eventually stopped 
> altogether. In the meantime, the following warning showed up in the primary 
> log:
> {code:java}
> 29.09.2017 06:12:08.088 *WARN* [nioEventLoopGroup-3-10] 
> org.apache.jackrabbit.oak.segment.standby.server.ExceptionHandler Exception 
> caught on the server
> io.netty.handler.codec.TooLongFrameException: frame length (8208) exceeds the 
> allowed maximum (8192)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:146)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:142)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:99)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:75)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> 

[jira] [Commented] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-23 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16749782#comment-16749782
 ] 

Francesco Mari commented on OAK-6749:
-

Backported to 1.8 at r1851902.

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.10.1, 1.10
>
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch
>
>
> We have run into an issue when trying to transition from an active/active 
> Mongo NodeStore cluster to a single Segment-Tar server with cold standby. The 
> issue itself manifests when the standby server tries to pull changes from the 
> primary after the first round of online revision GC.
> Let me summarize the way we ended up with the current state, and my 
> hypothesis about what happened, based on my debugging so far:
> # We started with a Mongo NodeStore and an external FileDataStore as the blob 
> store. The FileDataStore was set up with minRecordLength=4096. The Mongo 
> store stores blobs below minRecordLength as special "in-memory" blobIDs where 
> the data itself is baked into the ID string in hex.
> # We have executed a sidegrade of the Mongo store into a Segment-Tar store. 
> Our datastore is over 1TB in size, so copying the binaries wasn't an option. 
> The new repository is simply reusing the existing datastore. The "in-memory" 
> blobIDs still look like external blobIDs to the sidegrade process, so they 
> were copied into the Segment-Tar repository as-is, instead of being converted 
> into the efficient in-line format.
> # The server started up without issues on the new Segment-Tar store. The 
> migrated "in-memory" blob IDs seem to work fine, if a bit sub-optimal.
> # At this point, we have created a cold standby instance by copying the files 
> of the stopped primary instance and making the necessary config changes on 
> both servers.
> # Everything worked fine until the primary server started its first round of 
> online revision GC. After that process completed, the standby node started 
> throwing exceptions about missing segments, and eventually stopped 
> altogether. In the meantime, the following warning showed up in the primary 
> log:
> {code:java}
> 29.09.2017 06:12:08.088 *WARN* [nioEventLoopGroup-3-10] 
> org.apache.jackrabbit.oak.segment.standby.server.ExceptionHandler Exception 
> caught on the server
> io.netty.handler.codec.TooLongFrameException: frame length (8208) exceeds the 
> allowed maximum (8192)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:146)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:142)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:99)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:75)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
> at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
> at 
> 

[jira] [Updated] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-23 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-6749:

Fix Version/s: 1.8.12

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.10.1, 1.8.12, 1.10
>
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch
>
>
> We have run into an issue when trying to transition from an active/active 
> Mongo NodeStore cluster to a single Segment-Tar server with cold standby. The 
> issue itself manifests when the standby server tries to pull changes from the 
> primary after the first round of online revision GC.
> Let me summarize the way we ended up with the current state, and my 
> hypothesis about what happened, based on my debugging so far:
> # We started with a Mongo NodeStore and an external FileDataStore as the blob 
> store. The FileDataStore was set up with minRecordLength=4096. The Mongo 
> store stores blobs below minRecordLength as special "in-memory" blobIDs where 
> the data itself is baked into the ID string in hex.
> # We have executed a sidegrade of the Mongo store into a Segment-Tar store. 
> Our datastore is over 1TB in size, so copying the binaries wasn't an option. 
> The new repository is simply reusing the existing datastore. The "in-memory" 
> blobIDs still look like external blobIDs to the sidegrade process, so they 
> were copied into the Segment-Tar repository as-is, instead of being converted 
> into the efficient in-line format.
> # The server started up without issues on the new Segment-Tar store. The 
> migrated "in-memory" blob IDs seem to work fine, if a bit sub-optimal.
> # At this point, we have created a cold standby instance by copying the files 
> of the stopped primary instance and making the necessary config changes on 
> both servers.
> # Everything worked fine until the primary server started its first round of 
> online revision GC. After that process completed, the standby node started 
> throwing exceptions about missing segments, and eventually stopped 
> altogether. In the meantime, the following warning showed up in the primary 
> log:
> {code:java}
> 29.09.2017 06:12:08.088 *WARN* [nioEventLoopGroup-3-10] 
> org.apache.jackrabbit.oak.segment.standby.server.ExceptionHandler Exception 
> caught on the server
> io.netty.handler.codec.TooLongFrameException: frame length (8208) exceeds the 
> allowed maximum (8192)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:146)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:142)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:99)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:75)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
> at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
> at 
> 

[jira] [Commented] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-23 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16749764#comment-16749764
 ] 

Francesco Mari commented on OAK-6749:
-

[~Csaba Varga], for the sake of completeness, can you describe how you worked 
around the problem?

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.10.1, 1.10
>
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch
>
>
> We have run into an issue when trying to transition from an active/active 
> Mongo NodeStore cluster to a single Segment-Tar server with cold standby. The 
> issue itself manifests when the standby server tries to pull changes from the 
> primary after the first round of online revision GC.
> Let me summarize the way we ended up with the current state, and my 
> hypothesis about what happened, based on my debugging so far:
> # We started with a Mongo NodeStore and an external FileDataStore as the blob 
> store. The FileDataStore was set up with minRecordLength=4096. The Mongo 
> store stores blobs below minRecordLength as special "in-memory" blobIDs where 
> the data itself is baked into the ID string in hex.
> # We have executed a sidegrade of the Mongo store into a Segment-Tar store. 
> Our datastore is over 1TB in size, so copying the binaries wasn't an option. 
> The new repository is simply reusing the existing datastore. The "in-memory" 
> blobIDs still look like external blobIDs to the sidegrade process, so they 
> were copied into the Segment-Tar repository as-is, instead of being converted 
> into the efficient in-line format.
> # The server started up without issues on the new Segment-Tar store. The 
> migrated "in-memory" blob IDs seem to work fine, if a bit sub-optimal.
> # At this point, we have created a cold standby instance by copying the files 
> of the stopped primary instance and making the necessary config changes on 
> both servers.
> # Everything worked fine until the primary server started its first round of 
> online revision GC. After that process completed, the standby node started 
> throwing exceptions about missing segments, and eventually stopped 
> altogether. In the meantime, the following warning showed up in the primary 
> log:
> {code:java}
> 29.09.2017 06:12:08.088 *WARN* [nioEventLoopGroup-3-10] 
> org.apache.jackrabbit.oak.segment.standby.server.ExceptionHandler Exception 
> caught on the server
> io.netty.handler.codec.TooLongFrameException: frame length (8208) exceeds the 
> allowed maximum (8192)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:146)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:142)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:99)
> at 
> io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:75)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
> at 
> 

[jira] [Updated] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-18 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-6749:

Fix Version/s: 1.10

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.10.1, 1.10
>
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch
>
>
> We have run into an issue when trying to transition from an active/active 
> Mongo NodeStore cluster to a single Segment-Tar server with cold standby. The 
> issue itself manifests when the standby server tries to pull changes from the 
> primary after the first round of online revision GC.
> Let me summarize the way we ended up with the current state, and my 
> hypothesis about what happened, based on my debugging so far:
> # We started with a Mongo NodeStore and an external FileDataStore as the blob 
> store. The FileDataStore was set up with minRecordLength=4096. The Mongo 
> store stores blobs below minRecordLength as special "in-memory" blobIDs where 
> the data itself is baked into the ID string in hex.
> # We have executed a sidegrade of the Mongo store into a Segment-Tar store. 
> Our datastore is over 1TB in size, so copying the binaries wasn't an option. 
> The new repository is simply reusing the existing datastore. The "in-memory" 
> blobIDs still look like external blobIDs to the sidegrade process, so they 
> were copied into the Segment-Tar repository as-is, instead of being converted 
> into the efficient in-line format.
> # The server started up without issues on the new Segment-Tar store. The 
> migrated "in-memory" blob IDs seem to work fine, if a bit sub-optimal.
> # At this point, we have created a cold standby instance by copying the files 
> of the stopped primary instance and making the necessary config changes on 
> both servers.
> # Everything worked fine until the primary server started its first round of 
> online revision GC. After that process completed, the standby node started 
> throwing exceptions about missing segments, and eventually stopped 
> altogether. In the meantime, the following warning showed up in the primary 
> log:
> {code:java}
> 29.09.2017 06:12:08.088 *WARN* [nioEventLoopGroup-3-10] org.apache.jackrabbit.oak.segment.standby.server.ExceptionHandler Exception caught on the server
> io.netty.handler.codec.TooLongFrameException: frame length (8208) exceeds the allowed maximum (8192)
> at io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:146)
> at io.netty.handler.codec.LineBasedFrameDecoder.fail(LineBasedFrameDecoder.java:142)
> at io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:99)
> at io.netty.handler.codec.LineBasedFrameDecoder.decode(LineBasedFrameDecoder.java:75)
> at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
> at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:345)
> at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:352)
> at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
> at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
> at 
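
The frame-length failure above can be reproduced in isolation. Below is a minimal, self-contained sketch (the request line format is an assumption for illustration, not the actual standby wire protocol) showing why an "in-memory" blob ID that inlines a 4096-byte binary as hex produces a line longer than the 8192-byte maximum accepted by Netty's LineBasedFrameDecoder:

{code:java}
import static java.nio.charset.StandardCharsets.UTF_8;

import io.netty.buffer.Unpooled;
import io.netty.channel.embedded.EmbeddedChannel;
import io.netty.handler.codec.LineBasedFrameDecoder;
import io.netty.handler.codec.TooLongFrameException;

public class FrameLimitSketch {

    public static void main(String[] args) {
        // Same limit as in the warning above: lines longer than 8192 bytes fail.
        EmbeddedChannel channel = new EmbeddedChannel(new LineBasedFrameDecoder(8192));

        // An "in-memory" blob ID bakes the binary content into the ID string
        // in hex, so a 4096-byte binary alone yields 8192 characters of ID.
        StringBuilder blobId = new StringBuilder();
        for (int i = 0; i < 4096; i++) {
            blobId.append("ab"); // two hex digits per content byte
        }

        // Hypothetical request line; any command prefix pushes the line past
        // the decoder's 8192-byte maximum.
        String request = "get blob " + blobId + "\r\n";

        try {
            channel.writeInbound(Unpooled.copiedBuffer(request, UTF_8));
        } catch (TooLongFrameException e) {
            // e.g. "frame length (...) exceeds the allowed maximum (8192)"
            System.out.println(e.getMessage());
        }
    }
}
{code}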

[jira] [Updated] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-18 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-6749:

Fix Version/s: 1.10.1

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.10.1
>
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch
>
>

[jira] [Commented] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-18 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16746313#comment-16746313
 ] 

Francesco Mari commented on OAK-6749:
-

I fixed the issue at r1851619. I'm going to backport the relevant commits 
back to the 1.6 branch.

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch
>
>

[jira] [Commented] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-18 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16746129#comment-16746129
 ] 

Francesco Mari commented on OAK-6749:
-

I incorporated [~tmueller]'s suggestions - both the one given here and other 
feedback provided offline - into the second version of the patch. The method 
{{shouldFetchBinary}} is definitely longer than the previous implementation, 
but it's well commented and I think it better documents the shortcuts taken 
and the worst case involved in such a check.

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch
>
>

[jira] [Updated] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-18 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-6749:

Attachment: OAK-6749-02.patch

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Attachments: OAK-6749-01.patch, OAK-6749-02.patch
>
>

[jira] [Commented] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-17 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16745305#comment-16745305
 ] 

Francesco Mari commented on OAK-6749:
-

The first version of the patch changes the logic that determines whether a 
blob should be downloaded from the primary. Following a suggestion from 
[~tmueller], I use the {{BlobStore}} to determine whether the content of the 
blob can be retrieved locally, instead of relying on the absence of a binary 
reference. [~tmueller], I would appreciate a review of this patch. 
[~dulceanu], could you please skim through my previous commits and the patch 
and check that everything looks alright?
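
A rough sketch of the resulting check (not the actual patch, whose 
{{shouldFetchBinary}} adds shortcuts and documents the worst case, as 
mentioned in a comment above): a blob only needs to be fetched from the 
primary if the local {{BlobStore}} cannot resolve its content.

{code:java}
import java.io.IOException;
import java.io.InputStream;

import org.apache.jackrabbit.oak.spi.blob.BlobStore;

final class BinaryFetchCheck {

    private BinaryFetchCheck() {}

    // Sketch of the idea only: ask the BlobStore for the content instead of
    // relying on the absence of a binary reference.
    static boolean shouldFetchBinary(BlobStore blobStore, String blobId) {
        try (InputStream in = blobStore.getInputStream(blobId)) {
            // Resolvable locally, either because the ID inlines the content
            // ("in-memory" blob IDs) or because the record exists in the
            // shared data store: no download needed.
            return false;
        } catch (IOException e) {
            // Not resolvable locally: download the blob from the primary.
            return true;
        }
    }
}
{code}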

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Attachments: OAK-6749-01.patch
>
>

[jira] [Updated] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-17 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-6749:

Attachment: OAK-6749-01.patch

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
> Attachments: OAK-6749-01.patch
>
>

[jira] [Commented] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-17 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16745302#comment-16745302
 ] 

Francesco Mari commented on OAK-6749:
-

I added another test at r1851551 to prove that inline binaries are never 
downloaded. This will prove valuable when the logic that determines whether 
to download a blob changes.

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
>

[jira] [Commented] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-17 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16745156#comment-16745156
 ] 

Francesco Mari commented on OAK-6749:
-

In r1851533 and r1851535 I refactored the blob processing logic out of 
{{StandbyDiff}} into a separate class. In r1851534 I added a failing test 
case demonstrating the problem.

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
>

[jira] [Assigned] (OAK-6749) Segment-Tar standby sync fails with "in-memory" blobs present in the source repo

2019-01-16 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari reassigned OAK-6749:
---

Assignee: Francesco Mari

> Segment-Tar standby sync fails with "in-memory" blobs present in the source 
> repo
> 
>
> Key: OAK-6749
> URL: https://issues.apache.org/jira/browse/OAK-6749
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, tarmk-standby
>Affects Versions: 1.6.2
>Reporter: Csaba Varga
>Assignee: Francesco Mari
>Priority: Major
>

[jira] [Assigned] (OAK-5613) Test failure: segment.standby.ExternalSharedStoreIT.testProxyFlippedIntermediateByteChange2

2019-01-15 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-5613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari reassigned OAK-5613:
---

Assignee: Francesco Mari

> Test failure: 
> segment.standby.ExternalSharedStoreIT.testProxyFlippedIntermediateByteChange2
> ---
>
> Key: OAK-5613
> URL: https://issues.apache.org/jira/browse/OAK-5613
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: continuous integration, segmentmk
>Affects Versions: 1.0.36, 1.6.11
>Reporter: Hudson
>Assignee: Francesco Mari
>Priority: Major
>  Labels: test-failure, windows
> Attachments: unit-tests.log
>
>
> Jenkins Windows CI failure: https://builds.apache.org/job/Oak-Win/
> The build Oak-Win/Windows slaves=Windows,jdk=JDK 1.7 (unlimited security) 
> 64-bit Windows only,nsfixtures=SEGMENT_MK,profile=integrationTesting #443 has 
> failed.
> First failed run: [Oak-Win/Windows slaves=Windows,jdk=JDK 1.7 (unlimited 
> security) 64-bit Windows 
> only,nsfixtures=SEGMENT_MK,profile=integrationTesting 
> #443|https://builds.apache.org/job/Oak-Win/Windows%20slaves=Windows,jdk=JDK%201.7%20(unlimited%20security)%2064-bit%20Windows%20only,nsfixtures=SEGMENT_MK,profile=integrationTesting/443/]
>  [console 
> log|https://builds.apache.org/job/Oak-Win/Windows%20slaves=Windows,jdk=JDK%201.7%20(unlimited%20security)%2064-bit%20Windows%20only,nsfixtures=SEGMENT_MK,profile=integrationTesting/443/console]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (OAK-5532) Test failure: segment.standby.ExternalSharedStoreIT/BrokenNetworkTest.test...

2019-01-15 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-5532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari resolved OAK-5532.
-
Resolution: Won't Fix

This issue relates to a module that has been deprecated and retired. I will 
resolve this issue as Won't Fix.

> Test failure: segment.standby.ExternalSharedStoreIT/BrokenNetworkTest.test...
> -
>
> Key: OAK-5532
> URL: https://issues.apache.org/jira/browse/OAK-5532
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: continuous integration, segmentmk
>Affects Versions: 1.0.42, 1.2.30
>Reporter: Hudson
>Assignee: Francesco Mari
>Priority: Major
>  Labels: test-failure, ubuntu
> Attachments: build1401-unit-tests.log, unit-tests.log
>
>
> Jenkins CI failure: 
> https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/
> The build Apache Jackrabbit Oak matrix/Ubuntu Slaves=ubuntu,jdk=JDK 1.8 
> (latest),nsfixtures=DOCUMENT_RDB,profile=integrationTesting #1384 has failed.
> First failed run: [Apache Jackrabbit Oak matrix/Ubuntu Slaves=ubuntu,jdk=JDK 
> 1.8 (latest),nsfixtures=DOCUMENT_RDB,profile=integrationTesting 
> #1384|https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/Ubuntu%20Slaves=ubuntu,jdk=JDK%201.8%20(latest),nsfixtures=DOCUMENT_RDB,profile=integrationTesting/1384/]
>  [console 
> log|https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/Ubuntu%20Slaves=ubuntu,jdk=JDK%201.8%20(latest),nsfixtures=DOCUMENT_RDB,profile=integrationTesting/1384/console]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (OAK-4709) Test failure: ExternalPrivateStoreIT.testProxyFlippedIntermediateByte2

2019-01-15 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari resolved OAK-4709.
-
Resolution: Won't Fix

This issue relates to a module that has been deprecated and retired. I will 
resolve this issue as Won't Fix.

> Test failure: ExternalPrivateStoreIT.testProxyFlippedIntermediateByte2
> --
>
> Key: OAK-4709
> URL: https://issues.apache.org/jira/browse/OAK-4709
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: tarmk-standby
>Affects Versions: 1.2.18, 1.4.6, 1.6.1
>Reporter: Julian Reschke
>Assignee: Francesco Mari
>Priority: Major
>
> This test reliably fails for me in 1.4 with:
> {noformat}
> testProxyFlippedIntermediateByte2(org.apache.jackrabbit.oak.plugins.segment.standby.ExternalPrivateStoreIT)
>   Time elapsed: 1.3 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<{ root = { ... } }> but was:<{ root : { } 
> }>
> {noformat}
> It does not fail in trunk, but only because all ITs for tarmk-standby are 
> currently skipped: they are executed with the SEGMENT_TAR fixture while the 
> code checks for SEGMENT_MK.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (OAK-5613) Test failure: segment.standby.ExternalSharedStoreIT.testProxyFlippedIntermediateByteChange2

2019-01-15 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-5613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari resolved OAK-5613.
-
Resolution: Duplicate

> Test failure: 
> segment.standby.ExternalSharedStoreIT.testProxyFlippedIntermediateByteChange2
> ---
>
> Key: OAK-5613
> URL: https://issues.apache.org/jira/browse/OAK-5613
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: continuous integration, segmentmk
>Affects Versions: 1.0.36, 1.6.11
>Reporter: Hudson
>Assignee: Francesco Mari
>Priority: Major
>  Labels: test-failure, windows
> Attachments: unit-tests.log
>
>
> Jenkins Windows CI failure: https://builds.apache.org/job/Oak-Win/
> The build Oak-Win/Windows slaves=Windows,jdk=JDK 1.7 (unlimited security) 
> 64-bit Windows only,nsfixtures=SEGMENT_MK,profile=integrationTesting #443 has 
> failed.
> First failed run: [Oak-Win/Windows slaves=Windows,jdk=JDK 1.7 (unlimited 
> security) 64-bit Windows 
> only,nsfixtures=SEGMENT_MK,profile=integrationTesting 
> #443|https://builds.apache.org/job/Oak-Win/Windows%20slaves=Windows,jdk=JDK%201.7%20(unlimited%20security)%2064-bit%20Windows%20only,nsfixtures=SEGMENT_MK,profile=integrationTesting/443/]
>  [console 
> log|https://builds.apache.org/job/Oak-Win/Windows%20slaves=Windows,jdk=JDK%201.7%20(unlimited%20security)%2064-bit%20Windows%20only,nsfixtures=SEGMENT_MK,profile=integrationTesting/443/console]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (OAK-5532) Test failure: segment.standby.ExternalSharedStoreIT/BrokenNetworkTest.test...

2019-01-15 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-5532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari reassigned OAK-5532:
---

Assignee: Francesco Mari

> Test failure: segment.standby.ExternalSharedStoreIT/BrokenNetworkTest.test...
> -
>
> Key: OAK-5532
> URL: https://issues.apache.org/jira/browse/OAK-5532
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: continuous integration, segmentmk
>Affects Versions: 1.0.42, 1.2.30
>Reporter: Hudson
>Assignee: Francesco Mari
>Priority: Major
>  Labels: test-failure, ubuntu
> Attachments: build1401-unit-tests.log, unit-tests.log
>
>
> Jenkins CI failure: 
> https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/
> The build Apache Jackrabbit Oak matrix/Ubuntu Slaves=ubuntu,jdk=JDK 1.8 
> (latest),nsfixtures=DOCUMENT_RDB,profile=integrationTesting #1384 has failed.
> First failed run: [Apache Jackrabbit Oak matrix/Ubuntu Slaves=ubuntu,jdk=JDK 
> 1.8 (latest),nsfixtures=DOCUMENT_RDB,profile=integrationTesting 
> #1384|https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/Ubuntu%20Slaves=ubuntu,jdk=JDK%201.8%20(latest),nsfixtures=DOCUMENT_RDB,profile=integrationTesting/1384/]
>  [console 
> log|https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/Ubuntu%20Slaves=ubuntu,jdk=JDK%201.8%20(latest),nsfixtures=DOCUMENT_RDB,profile=integrationTesting/1384/console]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (OAK-4709) Test failure: ExternalPrivateStoreIT.testProxyFlippedIntermediateByte2

2019-01-15 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari reassigned OAK-4709:
---

Assignee: Francesco Mari

> Test failure: ExternalPrivateStoreIT.testProxyFlippedIntermediateByte2
> --
>
> Key: OAK-4709
> URL: https://issues.apache.org/jira/browse/OAK-4709
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: tarmk-standby
>Affects Versions: 1.2.18, 1.4.6, 1.6.1
>Reporter: Julian Reschke
>Assignee: Francesco Mari
>Priority: Major
>
> This test reliably fails for me in 1.4 with:
> {noformat}
> testProxyFlippedIntermediateByte2(org.apache.jackrabbit.oak.plugins.segment.standby.ExternalPrivateStoreIT)
>   Time elapsed: 1.3 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<{ root = { ... } }> but was:<{ root : { } 
> }>
> {noformat}
> It does not fail in trunk, but only because all ITs for tarmk-standby are 
> currently skipped: they are executed with the SEGMENT_TAR fixture while the 
> code checks for SEGMENT_MK.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (OAK-7719) CheckCommand should consistently use an alternative journal if specified

2019-01-03 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-7719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari resolved OAK-7719.
-
   Resolution: Fixed
Fix Version/s: 1.9.14

Fixed at r1850238.

> CheckCommand should consistently use an alternative journal if specified
> 
>
> Key: OAK-7719
> URL: https://issues.apache.org/jira/browse/OAK-7719
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: run, segment-tar
>Reporter: Francesco Mari
>Assignee: Francesco Mari
>Priority: Major
>  Labels: technical_debt
> Fix For: 1.10, 1.9.14
>
>
> Callers of the {{check}} command can specify an alternative journal with the 
> {{\-\-journal}} option. This option instructs the {{ConsistencyChecker}} to 
> check the revisions stored in that file instead of the ones stored in the 
> default {{journal.log}}.
> I spotted at least two problems while using {{\-\-journal}} on a repository 
> whose corrupted {{journal.log}} didn't contain any valid revision.
> First, the path to the {{FileStore}} is validated by 
> {{FileStoreHelper#isValidFileStoreOrFail}}, which checks for the existence of 
> a {{journal.log}} in the specified folder. But if a {{journal.log}} doesn't 
> exist and the user specified a different journal on the command line, this 
> check should be skipped.
> Second, when opening the {{FileStore}}, the default {{journal.log}} is 
> scanned to determine the initial revision of the head state. If a user 
> specifies an alternative journal on the command line, that journal should be 
> used instead of the default {{journal.log}}. The default journal might not 
> contain any valid revision at all, which would make the system crash when 
> opening a new instance of {{FileStore}}.
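> A minimal sketch of the intended journal selection (names are illustrative, 
> not the actual {{CheckCommand}} code):
> {noformat}
> import java.io.File;
>
> public class JournalSelection {
>
>     static File resolveJournal(File storeDir, File alternativeJournal) {
>         if (alternativeJournal != null) {
>             // --journal was given: use it and skip the existence check
>             // for the default journal.log.
>             return alternativeJournal;
>         }
>         File defaultJournal = new File(storeDir, "journal.log");
>         if (!defaultJournal.exists()) {
>             // Only fail validation when no alternative journal was given.
>             throw new IllegalArgumentException(storeDir + " is not a valid FileStore");
>         }
>         return defaultJournal;
>     }
>
>     public static void main(String[] args) {
>         File storeDir = new File("segmentstore");
>         System.out.println(resolveJournal(storeDir, new File("journal.log.backup")));
>     }
> }
> {noformat}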



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (OAK-7914) Cleanup updates the gc.log after a failed compaction

2018-12-11 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari resolved OAK-7914.
-
   Resolution: Not A Problem
Fix Version/s: (was: 1.6.16)

> Cleanup updates the gc.log after a failed compaction
> 
>
> Key: OAK-7914
> URL: https://issues.apache.org/jira/browse/OAK-7914
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: 1.6.15
>Reporter: Francesco Mari
>Assignee: Francesco Mari
>Priority: Critical
> Attachments: compaction.log
>
>
> The {{gc.log}} is always updated during the cleanup phase, regardless of the 
> result of the compaction phase. This might cause a scenario similar to the 
> following.
> - A repository of 100GB, of which 40GB is garbage, is compacted.
> - The estimation phase decides it's OK to compact.
> - Compaction produces a new head state, adding another 60GB.
> - Compaction fails, maybe because of too many concurrent commits.
> - Cleanup removes the 60GB generated during compaction.
> - Cleanup adds an entry to the {{gc.log}} recording the current size of the 
> repository, 100GB.
> Now, let's imagine that compaction is run shortly after that. The amount of 
> content added to the repository is negligible. For the sake of simplicity, 
> let's say that the size of the repository hasn't changed. The following 
> happens.
> - The repository is 100GB, of which 40GB is the same garbage that wasn't 
> removed above.
> - The estimation phase decides it's not OK to compact, because the {{gc.log}} 
> reports that the latest known size of the repository is 100GB, and there is 
> not enough content to remove.
> This is in fact a bug, because there are 40GB worth of garbage in the 
> repository, but estimation is not able to see that anymore. The solution 
> seems to be not to update the {{gc.log}} if compaction fails. In other words, 
> {{gc.log}} should contain the size of the *compacted* repository over time, 
> and no more.
> Thanks to [~rma61...@adobe.com] for reporting it.
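> A hedged sketch of the proposed policy (names are illustrative, not the 
> actual {{FileStore}} code):
> {noformat}
> public class GcJournalPolicy {
>
>     interface GcJournal {
>         void persist(long repositorySize);
>     }
>
>     static void afterCleanup(GcJournal gcJournal, boolean compactionSucceeded,
>             long repositorySize) {
>         if (compactionSucceeded) {
>             // gc.log tracks the size of the *compacted* repository over time.
>             gcJournal.persist(repositorySize);
>         }
>         // After a failed compaction nothing is written, so estimation still
>         // sees the old size and the garbage that cleanup couldn't remove.
>     }
>
>     public static void main(String[] args) {
>         afterCleanup(size -> System.out.println("gc.log <- " + size), true, 100L);
>     }
> }
> {noformat}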



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-7914) Cleanup updates the gc.log after a failed compaction

2018-12-11 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717618#comment-16717618
 ] 

Francesco Mari commented on OAK-7914:
-

[~rma61...@adobe.com], good to know. I'm going to resolve this issue. Thanks 
for the information.

> Cleanup updates the gc.log after a failed compaction
> 
>
> Key: OAK-7914
> URL: https://issues.apache.org/jira/browse/OAK-7914
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: 1.6.15
>Reporter: Francesco Mari
>Assignee: Francesco Mari
>Priority: Critical
> Fix For: 1.6.16
>
> Attachments: compaction.log
>
>
> The {{gc.log}} is always updated during the cleanup phase, regardless of the 
> result of the compaction phase. This might cause a scenario similar to the 
> following.
> - A repository of 100GB, of which 40GB is garbage, is compacted.
> - The estimation phase decides it's OK to compact.
> - Compaction produces a new head state, adding another 60GB.
> - Compaction fails, maybe because of too many concurrent commits.
> - Cleanup removes the 60GB generated during compaction.
> - Cleanup adds an entry to the {{gc.log}} recording the current size of the 
> repository, 100GB.
> Now, let's imagine that compaction is run shortly after that. The amount of 
> content added to the repository is negligible. For the sake of simplicity, 
> let's say that the size of the repository hasn't changed. The following 
> happens.
> - The repository is 100GB, of which 40GB is the same garbage that wasn't 
> removed above.
> - The estimation phase decides it's not OK to compact, because the {{gc.log}} 
> reports that the latest known size of the repository is 100GB, and there is 
> not enough content to remove.
> This is in fact a bug, because there are 40GB worth of garbage in the 
> repository, but estimation is not able to see that anymore. The solution 
> seems to be not to update the {{gc.log}} if compaction fails. In other words, 
> {{gc.log}} should contain the size of the *compacted* repository over time, 
> and no more.
> Thanks to [~rma61...@adobe.com] for reporting it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (OAK-7914) Cleanup updates the gc.log after a failed compaction

2018-12-11 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari reassigned OAK-7914:
---

Assignee: Francesco Mari

> Cleanup updates the gc.log after a failed compaction
> 
>
> Key: OAK-7914
> URL: https://issues.apache.org/jira/browse/OAK-7914
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: 1.6.15
>Reporter: Francesco Mari
>Assignee: Francesco Mari
>Priority: Critical
> Fix For: 1.6.16
>
> Attachments: compaction.log
>
>
> The {{gc.log}} is always updated during the cleanup phase, regardless of the 
> result of the compaction phase. This might cause a scenario similar to the 
> following.
> - A repository of 100GB, of which 40GB is garbage, is compacted.
> - The estimation phase decides it's OK to compact.
> - Compaction produces a new head state, adding another 60GB.
> - Compaction fails, maybe because of too many concurrent commits.
> - Cleanup removes the 60GB generated during compaction.
> - Cleanup adds an entry to the {{gc.log}} recording the current size of the 
> repository, 100GB.
> Now, let's imagine that compaction is run shortly after that. The amount of 
> content added to the repository is negligible. For the sake of simplicity, 
> let's say that the size of the repository hasn't changed. The following 
> happens.
> - The repository is 100GB, of which 40GB is the same garbage that wasn't 
> removed above.
> - The estimation phase decides it's not OK to compact, because the {{gc.log}} 
> reports that the latest known size of the repository is 100GB, and there is 
> not enough content to remove.
> This is in fact a bug, because there are 40GB worth of garbage in the 
> repository, but estimation is not able to see that anymore. The solution 
> seems to be not to update the {{gc.log}} if compaction fails. In other words, 
> {{gc.log}} should contain the size of the *compacted* repository over time, 
> and no more.
> Thanks to [~rma61...@adobe.com] for reporting it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (OAK-7914) Cleanup updates the gc.log after a failed compaction

2018-12-10 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-7914:

Fix Version/s: (was: 1.10)
   1.6.16

> Cleanup updates the gc.log after a failed compaction
> 
>
> Key: OAK-7914
> URL: https://issues.apache.org/jira/browse/OAK-7914
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: 1.6.15
>Reporter: Francesco Mari
>Priority: Critical
> Fix For: 1.6.16
>
> Attachments: compaction.log
>
>
> The {{gc.log}} is always updated during the cleanup phase, regardless of the 
> result of the compaction phase. This might cause a scenario similar to the 
> following.
> - A repository of 100GB, of which 40GB is garbage, is compacted.
> - The estimation phase decides it's OK to compact.
> - Compaction produces a new head state, adding another 60GB.
> - Compaction fails, maybe because of too many concurrent commits.
> - Cleanup removes the 60GB generated during compaction.
> - Cleanup adds an entry to the {{gc.log}} recording the current size of the 
> repository, 100GB.
> Now, let's imagine that compaction is run shortly after that. The amount of 
> content added to the repository is negligible. For the sake of simplicity, 
> let's say that the size of the repository hasn't changed. The following 
> happens.
> - The repository is 100GB, of which 40GB is the same garbage that wasn't 
> removed above.
> - The estimation phase decides it's not OK to compact, because the {{gc.log}} 
> reports that the latest known size of the repository is 100GB, and there is 
> not enough content to remove.
> This is in fact a bug, because there are 40GB worth of garbage in the 
> repository, but estimation is not able to see that anymore. The solution 
> seems to be not to update the {{gc.log}} if compaction fails. In other words, 
> {{gc.log}} should contain the size of the *compacted* repository over time, 
> and no more.
> Thanks to [~rma61...@adobe.com] for reporting it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (OAK-7914) Cleanup updates the gc.log after a failed compaction

2018-12-10 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-7914:

Affects Version/s: 1.6.15

> Cleanup updates the gc.log after a failed compaction
> 
>
> Key: OAK-7914
> URL: https://issues.apache.org/jira/browse/OAK-7914
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: 1.6.15
>Reporter: Francesco Mari
>Priority: Critical
> Fix For: 1.6.16
>
> Attachments: compaction.log
>
>
> The {{gc.log}} is always updated during the cleanup phase, regardless of the 
> result of the compaction phase. This might cause a scenario similar to the 
> following.
> - A repository of 100GB, of which 40GB is garbage, is compacted.
> - The estimation phase decides it's OK to compact.
> - Compaction produces a new head state, adding another 60GB.
> - Compaction fails, maybe because of too many concurrent commits.
> - Cleanup removes the 60GB generated during compaction.
> - Cleanup adds an entry to the {{gc.log}} recording the current size of the 
> repository, 100GB.
> Now, let's imagine that compaction is run shortly after that. The amount of 
> content added to the repository is negligible. For the sake of simplicity, 
> let's say that the size of the repository hasn't changed. The following 
> happens.
> - The repository is 100GB, of which 40GB is the same garbage that wasn't 
> removed above.
> - The estimation phase decides it's not OK to compact, because the {{gc.log}} 
> reports that the latest known size of the repository is 100GB, and there is 
> not enough content to remove.
> This is in fact a bug, because there are 40GB worth of garbage in the 
> repository, but estimation is not able to see that anymore. The solution 
> seems to be not to update the {{gc.log}} if compaction fails. In other words, 
> {{gc.log}} should contain the size of the *compacted* repository over time, 
> and no more.
> Thanks to [~rma61...@adobe.com] for reporting it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-7914) Cleanup updates the gc.log after a failed compaction

2018-12-10 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714994#comment-16714994
 ] 

Francesco Mari commented on OAK-7914:
-

[~rma61...@adobe.com], I'm not able to reproduce this issue. There is a guard 
in the code handling the {{gc.log}} that prevents it from being updated if the 
compaction phase fails and doesn't install a new head revision. See [this 
line|https://github.com/apache/jackrabbit-oak/blob/1.6/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/GCJournal.java#L77]
 and the related comment for reference. I think that your attachment only 
includes log statements from {{FileStore}}. Have you observed log statements 
from {{org.apache.jackrabbit.oak.segment.file.GCJournal}} in your system?

> Cleanup updates the gc.log after a failed compaction
> 
>
> Key: OAK-7914
> URL: https://issues.apache.org/jira/browse/OAK-7914
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Francesco Mari
>Priority: Critical
> Fix For: 1.10
>
> Attachments: compaction.log
>
>
> The {{gc.log}} is always updated during the cleanup phase, regardless of the 
> result of the compaction phase. This might cause a scenario similar to the 
> following.
> - A repository of 100GB, of which 40GB is garbage, is compacted.
> - The estimation phase decides it's OK to compact.
> - Compaction produces a new head state, adding another 60GB.
> - Compaction fails, maybe because of too many concurrent commits.
> - Cleanup removes the 60GB generated during compaction.
> - Cleanup adds an entry to the {{gc.log}} recording the current size of the 
> repository, 100GB.
> Now, let's imagine that compaction is run shortly after that. The amount of 
> content added to the repository is negligible. For the sake of simplicity, 
> let's say that the size of the repository hasn't changed. The following 
> happens.
> - The repository is 100GB, of which 40GB is the same garbage that wasn't 
> removed above.
> - The estimation phase decides it's not OK to compact, because the {{gc.log}} 
> reports that the latest known size of the repository is 100GB, and there is 
> not enough content to remove.
> This is in fact a bug, because there are 40GB worth of garbage in the 
> repository, but estimation is not able to see that anymore. The solution 
> seems to be not to update the {{gc.log}} if compaction fails. In other words, 
> {{gc.log}} should contain the size of the *compacted* repository over time, 
> and no more.
> Thanks to [~rma61...@adobe.com] for reporting it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-7914) Cleanup updates the gc.log after a failed compaction

2018-12-10 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714735#comment-16714735
 ] 

Francesco Mari commented on OAK-7914:
-

[~rma61...@adobe.com], I don't think that the issue is caused by a problem with 
the permissions on {{gc.log}}. First, because the latest journal entry is 
maintained in memory even if {{gc.log}} is not writeable. Second, because an 
error in reading or writing {{gc.log}} would generate an ERROR message in the 
logs, and I couldn't find any.

> Cleanup updates the gc.log after a failed compaction
> 
>
> Key: OAK-7914
> URL: https://issues.apache.org/jira/browse/OAK-7914
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Francesco Mari
>Priority: Critical
> Fix For: 1.10
>
> Attachments: compaction.log
>
>
> The {{gc.log}} is always updated during the cleanup phase, regardless of the 
> result of the compaction phase. This might cause a scenario similar to the 
> following.
> - A repository of 100GB, of which 40GB is garbage, is compacted.
> - The estimation phase decides it's OK to compact.
> - Compaction produces a new head state, adding another 60GB.
> - Compaction fails, maybe because of too many concurrent commits.
> - Cleanup removes the 60GB generated during compaction.
> - Cleanup adds an entry to the {{gc.log}} recording the current size of the 
> repository, 100GB.
> Now, let's imagine that compaction is run shortly after that. The amount of 
> content added to the repository is negligible. For the sake of simplicity, 
> let's say that the size of the repository hasn't changed. The following 
> happens.
> - The repository is 100GB, of which 40GB is the same garbage that wasn't 
> removed above.
> - The estimation phase decides it's not OK to compact, because the {{gc.log}} 
> reports that the latest known size of the repository is 100GB, and there is 
> not enough content to remove.
> This is in fact a bug, because there are 40GB worth of garbage in the 
> repository, but estimation is not able to see that anymore. The solution 
> seems to be not to update the {{gc.log}} if compaction fails. In other words, 
> {{gc.log}} should contain the size of the *compacted* repository over time, 
> and no more.
> Thanks to [~rma61...@adobe.com] for reporting it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-7914) Cleanup updates the gc.log after a failed compaction

2018-12-10 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714532#comment-16714532
 ] 

Francesco Mari commented on OAK-7914:
-

[~mduerig], I think that this issue is related to OAK-5360. Cleanup can't 
distinguish between a completion and a cancellation of the compaction phase, 
and thus always updates the {{gc.log}}. If my assumption is correct, I will 
take care of OAK-5360 as well.

> Cleanup updates the gc.log after a failed compaction
> 
>
> Key: OAK-7914
> URL: https://issues.apache.org/jira/browse/OAK-7914
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Francesco Mari
>Priority: Critical
> Fix For: 1.10
>
> Attachments: compaction.log
>
>
> The {{gc.log}} is always updated during the cleanup phase, regardless of the 
> result of the compaction phase. This might cause a scenario similar to the 
> following.
> - A repository of 100GB, of which 40GB is garbage, is compacted.
> - The estimation phase decides it's OK to compact.
> - Compaction produces a new head state, adding another 60GB.
> - Compaction fails, maybe because of too many concurrent commits.
> - Cleanup removes the 60GB generated during compaction.
> - Cleanup adds an entry to the {{gc.log}} recording the current size of the 
> repository, 100GB.
> Now, let's imagine that compaction is run shortly after that. The amount of 
> content added to the repository is negligible. For the sake of simplicity, 
> let's say that the size of the repository hasn't changed. The following 
> happens.
> - The repository is 100GB, of which 40GB is the same garbage that wasn't 
> removed above.
> - The estimation phase decides it's not OK to compact, because the {{gc.log}} 
> reports that the latest known size of the repository is 100GB, and there is 
> not enough content to remove.
> This is in fact a bug, because there are 40GB worth of garbage in the 
> repository, but estimation is not able to see that anymore. The solution 
> seems to be not to update the {{gc.log}} if compaction fails. In other words, 
> {{gc.log}} should contain the size of the *compacted* repository over time, 
> and no more.
> Thanks to [~rma61...@adobe.com] for reporting it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (OAK-7945) Document the recover-journal command

2018-12-06 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-7945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari resolved OAK-7945.
-
   Resolution: Fixed
Fix Version/s: 1.9.13

Fixed at r1848302.

> Document the recover-journal command
> 
>
> Key: OAK-7945
> URL: https://issues.apache.org/jira/browse/OAK-7945
> Project: Jackrabbit Oak
>  Issue Type: Documentation
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.10, 1.9.13
>
>
> Add documentation for the recover-journal command.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (OAK-7945) Document the recover-journal command

2018-12-06 Thread Francesco Mari (JIRA)
Francesco Mari created OAK-7945:
---

 Summary: Document the recover-journal command
 Key: OAK-7945
 URL: https://issues.apache.org/jira/browse/OAK-7945
 Project: Jackrabbit Oak
  Issue Type: Documentation
  Components: segment-tar
Reporter: Francesco Mari
Assignee: Francesco Mari
 Fix For: 1.10


Add documentation for the recover-journal command.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-7457) "Covariant return type change detected" warnings with java10

2018-12-05 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-7457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710356#comment-16710356
 ] 

Francesco Mari commented on OAK-7457:
-

[~reschke], I fixed the Segment Store in OAK-7942. My solution was to create a 
wrapper for {{ByteBuffer}} that encapsulates the correct access pattern to the 
wrapped {{ByteBuffer}}. You can find the wrapper class at 
{{org.apache.jackrabbit.oak.segment.spi.persistence.Buffer}}. I don't know if 
this solution makes sense for the Document Node Store too, but I wanted to 
give you a heads-up and save you from implementing similar code from scratch 
when you tackle this issue.
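
The core idea, as an illustrative sketch (a trimmed-down example, not the full 
API of the actual wrapper class):

{noformat}
import java.nio.Buffer;
import java.nio.ByteBuffer;

public final class SafeBuffer {

    private final ByteBuffer buffer;

    private SafeBuffer(ByteBuffer buffer) {
        this.buffer = buffer;
    }

    public static SafeBuffer allocate(int capacity) {
        return new SafeBuffer(ByteBuffer.allocate(capacity));
    }

    public SafeBuffer position(int newPosition) {
        // Dispatch through the Buffer supertype, so the compiled call site
        // references Buffer.position(int) and links on Java 8 as well as 9+.
        ((Buffer) buffer).position(newPosition);
        return this;
    }

    public SafeBuffer flip() {
        ((Buffer) buffer).flip();
        return this;
    }
}
{noformat}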

> "Covariant return type change detected" warnings with java10
> 
>
> Key: OAK-7457
> URL: https://issues.apache.org/jira/browse/OAK-7457
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk, segment-tar
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Major
> Fix For: 1.10
>
>
> We have quite a few warnings of type "Covariant return type change detected":
> {noformat}
> [INFO] 
> C:\projects\apache\oak\trunk\oak-store-document\src\main\java\org\apache\jackrabbit\oak\plugins\document\persistentCache\broadcast\TCPBroadcaster.java:327:
>  Covariant return type change detected: java.nio.Buffer 
> java.nio.ByteBuffer.flip() has been changed to java.nio.ByteBuffer 
> java.nio.ByteBuffer.flip()
> [INFO] 
> C:\projects\apache\oak\trunk\oak-store-document\src\main\java\org\apache\jackrabbit\oak\plugins\document\persistentCache\broadcast\UDPBroadcaster.java:135:
>  Covariant return type change detected: java.nio.Buffer 
> java.nio.ByteBuffer.limit(int) has been changed to java.nio.ByteBuffer 
> java.nio.ByteBuffer.limit(int)
> [INFO] 
> C:\projects\apache\oak\trunk\oak-store-document\src\main\java\org\apache\jackrabbit\oak\plugins\document\persistentCache\broadcast\UDPBroadcaster.java:138:
>  Covariant return type change detected: java.nio.Buffer 
> java.nio.ByteBuffer.position(int) has been changed to java.nio.ByteBuffer 
> java.nio.ByteBuffer.position(int)
> [INFO] 
> C:\projects\apache\oak\trunk\oak-store-document\src\main\java\org\apache\jackrabbit\oak\plugins\document\persistentCache\broadcast\TCPBroadcaster.java:226:
>  Covariant return type change detected: java.nio.Buffer 
> java.nio.ByteBuffer.position(int) has been changed to java.nio.ByteBuffer 
> java.nio.ByteBuffer.position(int)
> [INFO] 
> C:\projects\apache\oak\trunk\oak-store-document\src\main\java\org\apache\jackrabbit\oak\plugins\document\persistentCache\broadcast\InMemoryBroadcaster.java:35:
>  Covariant return type change detected: java.nio.Buffer 
> java.nio.ByteBuffer.position(int) has been changed to java.nio.ByteBuffer 
> java.nio.ByteBuffer.position(int)
> [INFO] 
> C:\projects\apache\oak\trunk\oak-store-document\src\main\java\org\apache\jackrabbit\oak\plugins\document\persistentCache\PersistentCache.java:519:
>  Covariant return type change detected: java.nio.Buffer 
> java.nio.ByteBuffer.limit(int) has been changed to java.nio.ByteBuffer 
> java.nio.ByteBuffer.limit(int)
> [INFO] 
> C:\projects\apache\oak\trunk\oak-store-document\src\main\java\org\apache\jackrabbit\oak\plugins\document\persistentCache\PersistentCache.java:522:
>  Covariant return type change detected: java.nio.Buffer 
> java.nio.ByteBuffer.position(int) has been changed to java.nio.ByteBuffer 
> java.nio.ByteBuffer.position(int)
> [INFO] 
> C:\projects\apache\oak\trunk\oak-store-document\src\main\java\org\apache\jackrabbit\oak\plugins\document\persistentCache\PersistentCache.java:535:
>  Covariant return type change detected: java.nio.Buffer 
> java.nio.ByteBuffer.position(int) has been changed to java.nio.ByteBuffer 
> java.nio.ByteBuffer.position(int)
> [INFO] 
> C:\projects\apache\oak\trunk\oak-segment-tar\src\main\java\org\apache\jackrabbit\oak\segment\data\SegmentDataV12.java:196:
>  Covariant return type change detected: java.nio.Buffer 
> java.nio.ByteBuffer.position(int) has been changed to java.nio.ByteBuffer 
> java.nio.ByteBuffer.position(int)
> [INFO] 
> C:\projects\apache\oak\trunk\oak-segment-tar\src\main\java\org\apache\jackrabbit\oak\segment\data\SegmentDataV12.java:197:
>  Covariant return type change detected: java.nio.Buffer 
> java.nio.ByteBuffer.limit(int) has been changed to java.nio.ByteBuffer 
> java.nio.ByteBuffer.limit(int)
> [INFO] 
> C:\projects\apache\oak\trunk\oak-segment-tar\src\main\java\org\apache\jackrabbit\oak\segment\data\SegmentDataUtils.java:57:
>  Covariant return type change detected: java.nio.Buffer 
> java.nio.ByteBuffer.position(int) has been changed to java.nio.ByteBuffer 
> java.nio.ByteBuffer.position(int)
> [INFO] 
> 

[jira] [Resolved] (OAK-7942) Fix covariant return type changes in ByteBuffer

2018-12-05 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-7942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari resolved OAK-7942.
-
   Resolution: Fixed
Fix Version/s: 1.9.13

Fixed at r1848226.

> Fix covariant return type changes in ByteBuffer
> ---
>
> Key: OAK-7942
> URL: https://issues.apache.org/jira/browse/OAK-7942
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.10, 1.9.13
>
> Attachments: OAK-7942-01.patch, OAK-7942-02.patch
>
>
> As of Java 9, many methods in {{ByteBuffer}} return an instance of 
> {{ByteBuffer}} instead of {{Buffer}}. This results in a {{NoSuchMethodError}} 
> when code compiled with Java 9 or later is executed on an older JVM. See 
> [this|https://jira.mongodb.org/browse/JAVA-2559] and 
> [this|https://github.com/plasma-umass/doppio/issues/497#issuecomment-334740243]
>  for reference and proposed fixes.
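> A minimal demonstration of the failure mode and the usual fix (an example, 
> not code from Oak):
> {noformat}
> import java.nio.Buffer;
> import java.nio.ByteBuffer;
>
> public class CovariantFlip {
>
>     public static void main(String[] args) {
>         ByteBuffer bb = ByteBuffer.allocate(16);
>         bb.putInt(42);
>         // Compiled with Java 9+, the direct call bb.flip() records the
>         // descriptor ByteBuffer.flip()Ljava/nio/ByteBuffer;, which does not
>         // exist on Java 8 and fails there with NoSuchMethodError.
>         // Casting to Buffer makes javac emit Buffer.flip()Ljava/nio/Buffer;,
>         // which resolves on both old and new JVMs.
>         ((Buffer) bb).flip();
>         System.out.println(bb.getInt());
>     }
> }
> {noformat}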



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (OAK-7942) Fix covariant return type changes in ByteBuffer

2018-12-05 Thread Francesco Mari (JIRA)


 [ 
https://issues.apache.org/jira/browse/OAK-7942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-7942:

Attachment: OAK-7942-02.patch

> Fix covariant return type changes in ByteBuffer
> ---
>
> Key: OAK-7942
> URL: https://issues.apache.org/jira/browse/OAK-7942
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Francesco Mari
>Priority: Major
> Fix For: 1.10
>
> Attachments: OAK-7942-01.patch, OAK-7942-02.patch
>
>
> As of Java 9, many methods in {{ByteBuffer}} return an instance of 
> {{ByteBuffer}} instead of {{Buffer}}. This results in a {{NoSuchMethodError}} 
> when code compiled with Java 9 or later is executed on an older JVM. See 
> [this|https://jira.mongodb.org/browse/JAVA-2559] and 
> [this|https://github.com/plasma-umass/doppio/issues/497#issuecomment-334740243]
>  for reference and proposed fixes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


  1   2   3   4   5   6   7   8   9   10   >