[
https://issues.apache.org/jira/browse/OAK-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17598909#comment-17598909
]
Julian Sedding commented on OAK-9914:
-------------------------------------
Hi [~miroslav]
bq. Returning null wouldn't allow having the second Oak process in read-only
mode.
The TAR-file-based persistence returns null for files without an index (which
is equivalent to an azure folder without a "close" file). And it does allow for
a second read-only process. The test-case I shared yesterday shows that the
TAR-file-based RO-store also "recovers" and creates a {{*.ro.bak}} file.
Unfortunately, the javadocs for {{SegmentArchiveReader#open()}} do not specify
when an {{IOException}} should be thrown. However it _does_ specify that
{{null}} should be returned "if the archive doesn't exist or doesn't have a
valid index". I would interpret "doesn't have a valid index" to "has not been
closed" based on the precedent of the {{SegmentTarManager}} implementation.
Based on this rationale, I would align the behaviour of the
{{AzureArchiveManager}} with the one of the {{SegmentTarManager}}. That
certainly does not violate the javadocs, and it passes all existing unit-tests
in {{oak-segment-azure}}. Returning {{null}} for an "unclosed" archive gets rid
of the ugly stack-trace in the log file.
On the other hand, if I apply your patch [^AzureArchiveManager.patch], the test
{{AzureArchiveManagerTest#testUncleanStopSegmentMissing}} starts failing. I.e.
it breaks "real" recovery.
I agree that the read-only case could be implemented more nicely (that's in
[{{TarReader#openRO()}}|https://github.com/apache/jackrabbit-oak/blob/c6ddcc55bee3de915459af01e91edad32d538f3d/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/tar/TarReader.java#L115-L135]).
I see the following opportunities for improvements:
1. It does not attempt to read an archive using
{{SegmentArchiveManager#forceOpen()}} once {{SegmentArchiveManager#open()}}
returns {{null}}. {{#forceOpen}} is documented to "[open] an archive that
wasn't closed correctly". Implementing it is optional, and the
{{SegmentTarManager}} does not currently implement it (it delegates to #open()).
2. The archive is "recovered" persistently. I think for the read-only case it
would be nicer to reconstruct a missing index in-memory.
Number (1), i.e. using {{forceOpen}} would already work with the
{{AzureArchivemanager}}.
Using a system property sounds strange to me. There are no two correct
behaviours, so let's just fix it. And BTW, the system property patch would also
fail {{AzureArchiveManagerTest#testUncleanStopSegmentMissing}} when the system
property is set.
> Starting Oak with Azure persistence in read-only mode while another Oak
> process is running will initiate repo recovery
> ----------------------------------------------------------------------------------------------------------------------
>
> Key: OAK-9914
> URL: https://issues.apache.org/jira/browse/OAK-9914
> Project: Jackrabbit Oak
> Issue Type: New Feature
> Components: segment-azure
> Affects Versions: 1.44.0
> Reporter: Miroslav Smiljanic
> Assignee: Miroslav Smiljanic
> Priority: Major
> Attachments: AzureArchiveManager.patch,
> AzureArchiveManager_sys_prop.patch, OAK-9914_test.patch
>
>
> The sequence of events:
> # Oak process with read/write file store utilizing Azure persistence is
> already running
> # New Oak process is starting up, with read-only file store using Azure
> persistence and the same storage account and container like the previous Oak
> process
> ## New Oak process starts recovery procedure for the last tar directory that
> is not closed
> Scenario presented in the attached test case:
> [^OAK-9914_test.patch]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)