[
https://issues.apache.org/jira/browse/OAK-11991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18032498#comment-18032498
]
Ieran Draghiciu commented on OAK-11991:
---------------------------------------
{code:java}
(recovery process starts)
17.10.2025 11:34:34.714 *WARN* [segmentstore-init-29]
org.apache.jackrabbit.oak.segment.file.tar.TarReader Could not find a valid tar
index in [data194238a.tar], recovering...
(lists all blobs)
17.10.2025 11:34:36.711 *INFO* [reactor-http-nio-1]
org.apache.jackrabbit.oak.segment.azure.AzureHttpRequestLoggingPolicy HTTP
Request: GET
https://sa0296720963470457782a1f.blob.core.windows.net/aem-sgmt-6d96d44a1a9a5930c37cf9a16735e9441c59f039-0006ef/aem%2Fdata194238a.tar%2F0000.8c1917fe-ef75-462b-aace-64f1c9a88e8c
200 10ms
....
8836 requests
....
17.10.2025 11:36:06.224 *INFO* [reactor-http-nio-4]
org.apache.jackrabbit.oak.segment.azure.AzureHttpRequestLoggingPolicy HTTP
Request: GET
https://sa0296720963470457782a1f.blob.core.windows.net/aem-sgmt-6d96d44a1a9a5930c37cf9a16735e9441c59f039-0006ef/aem%2Fdata194238a.tar%2F2284.c8c03cff-c439-46c8-aaa4-6b998f82153e
200 4ms
(completes in ~1:30)
(recover blobs)
17.10.2025 11:36:06.226 *INFO* [segmentstore-init-29]
org.apache.jackrabbit.oak.segment.azure.AzureArchiveManager Recovering segment
data194238a.tar/0000.8c1917fe-ef75-462b-aace-64f1c9a88e8c
....
8836 requests
....
17.10.2025 11:36:06.394 *INFO* [segmentstore-init-29]
org.apache.jackrabbit.oak.segment.azure.AzureArchiveManager Recovering segment
data194238a.tar/2284.c8c03cff-c439-46c8-aaa4-6b998f82153e
(completes in <1s)
(copy blobs to bak)
17.10.2025 11:36:19.381 *INFO* [reactor-http-nio-5]
org.apache.jackrabbit.oak.segment.azure.AzureHttpRequestLoggingPolicy HTTP
Request: PUT
https://sa0296720963470457782a1f.blob.core.windows.net/aem-sgmt-6d96d44a1a9a5930c37cf9a16735e9441c59f039-0006ef/aem%2Fdata194238a.tar.10.bak%2F0000.8c1917fe-ef75-462b-aace-64f1c9a88e8c
202 193ms
17.10.2025 11:36:19.505 *INFO* [reactor-http-nio-6]
org.apache.jackrabbit.oak.segment.azure.AzureHttpRequestLoggingPolicy HTTP
Request: HEAD
https://sa0296720963470457782a1f.blob.core.windows.net/aem-sgmt-6d96d44a1a9a5930c37cf9a16735e9441c59f039-0006ef/aem%2Fdata194238a.tar.10.bak%2F0000.8c1917fe-ef75-462b-aace-64f1c9a88e8c
200 17ms
....
5700 requests
....
17.10.2025 11:52:01.639 *INFO* [reactor-http-nio-2]
org.apache.jackrabbit.oak.segment.azure.AzureHttpRequestLoggingPolicy HTTP
Request: PUT
https://sa0296720963470457782a1f.blob.core.windows.net/aem-sgmt-6d96d44a1a9a5930c37cf9a16735e9441c59f039-0006ef/aem%2Fdata194238a.tar.10.bak%2F1645.5a496e63-35d0-4f1b-aede-5d202b935dee
202 32ms
17.10.2025 11:52:01.743 *INFO* [reactor-http-nio-1]
org.apache.jackrabbit.oak.segment.azure.AzureHttpRequestLoggingPolicy HTTP
Request: HEAD
https://sa0296720963470457782a1f.blob.core.windows.net/aem-sgmt-6d96d44a1a9a5930c37cf9a16735e9441c59f039-0006ef/aem%2Fdata194238a.tar.10.bak%2F1645.5a496e63-35d0-4f1b-aede-5d202b935dee
200 4ms
.....
(runs for ~16 min, then the pod gets restarted because of the readiness probe;
~3000 more segments still had to be copied to the bak archive)
{code}
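
The timings above can be turned into a rough per-segment cost for the backup copy. This is only a sketch: it assumes that the four-character prefix of each blob name (for example {{1645}}) is the segment position in hexadecimal, which is consistent with the 8836 requests noted for the listing step. The class and method names are made up for illustration and are not part of Oak.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class BackupCopyEstimate {
    // Timestamp layout of the log lines, e.g. "17.10.2025 11:36:19.381".
    private static final DateTimeFormatter LOG_TS =
            DateTimeFormatter.ofPattern("dd.MM.yyyy HH:mm:ss.SSS");

    /** Assumption: the first four characters of a blob name are the hex segment position. */
    static int segmentIndex(String blobName) {
        return Integer.parseInt(blobName.substring(0, 4), 16);
    }

    /** Milliseconds elapsed between two log timestamps. */
    static long elapsedMillis(String start, String end) {
        return Duration.between(LocalDateTime.parse(start, LOG_TS),
                                LocalDateTime.parse(end, LOG_TS)).toMillis();
    }

    /** Average wall-clock cost of copying one segment (PUT plus HEAD). */
    static double msPerSegment(long elapsedMillis, int segmentsCopied) {
        return (double) elapsedMillis / segmentsCopied;
    }

    public static void main(String[] args) {
        // Last segment of the archive and last segment seen in the bak copy.
        int last = segmentIndex("2284.c8c03cff-c439-46c8-aaa4-6b998f82153e");   // 8836
        int copied = segmentIndex("1645.5a496e63-35d0-4f1b-aede-5d202b935dee"); // 5701

        // First PUT to last HEAD of the bak copy, taken from the log above.
        long elapsed = elapsedMillis("17.10.2025 11:36:19.381", "17.10.2025 11:52:01.743");

        // Positions are zero-based, so copied + 1 segments were processed.
        double perSegment = msPerSegment(elapsed, copied + 1);
        int remaining = last - copied;

        System.out.printf("copied so far: %d segments in %d ms%n", copied + 1, elapsed);
        System.out.printf("average per segment: %.1f ms%n", perSegment);
        System.out.printf("remaining: %d segments, roughly %.0f s more%n",
                remaining, remaining * perSegment / 1000.0);
    }
}
```

The averages it prints cover only the bak copy step, which the log shows running one PUT and one HEAD per segment in sequence.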
> Optimize the oak-segment recovery process
> -----------------------------------------
>
> Key: OAK-11991
> URL: https://issues.apache.org/jira/browse/OAK-11991
> Project: Jackrabbit Oak
> Issue Type: Task
> Components: segment-azure, segment-tar
> Reporter: Ieran Draghiciu
> Priority: Major
>
> Tar archives with many segment files (more than 10,000) take too long to
> recover. Investigate and implement a solution to optimize this process.