[ 
https://issues.apache.org/jira/browse/OAK-11991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18032498#comment-18032498
 ] 

Ieran Draghiciu commented on OAK-11991:
---------------------------------------


{code:java}
(recovery process starts)
17.10.2025 11:34:34.714 *WARN* [segmentstore-init-29] 
org.apache.jackrabbit.oak.segment.file.tar.TarReader Could not find a valid tar 
index in [data194238a.tar], recovering...

(lists all blobs)
17.10.2025 11:34:36.711 *INFO* [reactor-http-nio-1] 
org.apache.jackrabbit.oak.segment.azure.AzureHttpRequestLoggingPolicy HTTP 
Request: GET 
https://sa0296720963470457782a1f.blob.core.windows.net/aem-sgmt-6d96d44a1a9a5930c37cf9a16735e9441c59f039-0006ef/aem%2Fdata194238a.tar%2F0000.8c1917fe-ef75-462b-aace-64f1c9a88e8c
 200 10ms
....
8836 requests
....
17.10.2025 11:36:06.224 *INFO* [reactor-http-nio-4] 
org.apache.jackrabbit.oak.segment.azure.AzureHttpRequestLoggingPolicy HTTP 
Request: GET 
https://sa0296720963470457782a1f.blob.core.windows.net/aem-sgmt-6d96d44a1a9a5930c37cf9a16735e9441c59f039-0006ef/aem%2Fdata194238a.tar%2F2284.c8c03cff-c439-46c8-aaa4-6b998f82153e
 200 4ms
(completes in ~1:30)

(recover blobs)
17.10.2025 11:36:06.226 *INFO* [segmentstore-init-29] 
org.apache.jackrabbit.oak.segment.azure.AzureArchiveManager Recovering segment 
data194238a.tar/0000.8c1917fe-ef75-462b-aace-64f1c9a88e8c
....
8836 requests
....
17.10.2025 11:36:06.394 *INFO* [segmentstore-init-29] 
org.apache.jackrabbit.oak.segment.azure.AzureArchiveManager Recovering segment 
data194238a.tar/2284.c8c03cff-c439-46c8-aaa4-6b998f82153e
(completes in <1s)


(copy blobs to bak)
17.10.2025 11:36:19.381 *INFO* [reactor-http-nio-5] 
org.apache.jackrabbit.oak.segment.azure.AzureHttpRequestLoggingPolicy HTTP 
Request: PUT 
https://sa0296720963470457782a1f.blob.core.windows.net/aem-sgmt-6d96d44a1a9a5930c37cf9a16735e9441c59f039-0006ef/aem%2Fdata194238a.tar.10.bak%2F0000.8c1917fe-ef75-462b-aace-64f1c9a88e8c
 202 193ms
17.10.2025 11:36:19.505 *INFO* [reactor-http-nio-6] 
org.apache.jackrabbit.oak.segment.azure.AzureHttpRequestLoggingPolicy HTTP 
Request: HEAD 
https://sa0296720963470457782a1f.blob.core.windows.net/aem-sgmt-6d96d44a1a9a5930c37cf9a16735e9441c59f039-0006ef/aem%2Fdata194238a.tar.10.bak%2F0000.8c1917fe-ef75-462b-aace-64f1c9a88e8c
 200 17ms
....
5700 requests
....
17.10.2025 11:52:01.639 *INFO* [reactor-http-nio-2] 
org.apache.jackrabbit.oak.segment.azure.AzureHttpRequestLoggingPolicy HTTP 
Request: PUT 
https://sa0296720963470457782a1f.blob.core.windows.net/aem-sgmt-6d96d44a1a9a5930c37cf9a16735e9441c59f039-0006ef/aem%2Fdata194238a.tar.10.bak%2F1645.5a496e63-35d0-4f1b-aede-5d202b935dee
 202 32ms
17.10.2025 11:52:01.743 *INFO* [reactor-http-nio-1] 
org.apache.jackrabbit.oak.segment.azure.AzureHttpRequestLoggingPolicy HTTP 
Request: HEAD 
https://sa0296720963470457782a1f.blob.core.windows.net/aem-sgmt-6d96d44a1a9a5930c37cf9a16735e9441c59f039-0006ef/aem%2Fdata194238a.tar.10.bak%2F1645.5a496e63-35d0-4f1b-aede-5d202b935dee
 200 4ms

.....
(runs for ~16 min, then the pod is restarted because the readiness probe fails;
~3000 segments still remain to be copied to the bak archive)
{code}
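
As a rough cross-check of the numbers in the log above, the copy rate can be estimated from the timestamps and segment names alone. This is only a sketch: it assumes the four hex digits at the start of each segment blob name are the zero-based position of the segment in the archive (which matches the "8836 requests" note, since 0x2284 = 8836), and the helper names are made up for illustration.

```python
from datetime import datetime

# Timestamp format used in the log excerpt above.
LOG_TIME_FORMAT = "%d.%m.%Y %H:%M:%S.%f"


def segment_position(blob_name: str) -> int:
    """Parse the leading hex field of a segment blob name.

    Assumption: the prefix is the zero-based position in the archive.
    """
    return int(blob_name.split(".", 1)[0], 16)


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two log timestamps."""
    delta = datetime.strptime(end, LOG_TIME_FORMAT) - datetime.strptime(start, LOG_TIME_FORMAT)
    return delta.total_seconds()


# Values copied verbatim from the log excerpt.
last_listed = segment_position("2284.c8c03cff-c439-46c8-aaa4-6b998f82153e")
last_copied = segment_position("1645.5a496e63-35d0-4f1b-aede-5d202b935dee")

total_segments = last_listed + 1    # positions are zero-based
copied_segments = last_copied + 1
remaining_segments = total_segments - copied_segments

# First PUT to the .bak archive until the last HEAD shown before the restart.
copy_elapsed = elapsed_seconds("17.10.2025 11:36:19.381", "17.10.2025 11:52:01.743")
seconds_per_segment = copy_elapsed / copied_segments
remaining_minutes = remaining_segments * seconds_per_segment / 60

print(f"copied {copied_segments} of {total_segments} segments in {copy_elapsed:.0f}s")
print(f"~{seconds_per_segment:.3f}s per segment (one PUT plus one HEAD, one after another)")
print(f"~{remaining_minutes:.1f} more minutes needed for {remaining_segments} segments")
```

Under these assumptions the backup copy runs at roughly 0.17 s per segment, so the remaining ~3000 segments would need another 8 to 9 minutes. That puts the backup copy, not the listing or the in-memory recovery, as the dominant cost in this run.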


> Optimize the oak-segment recovery process
> -----------------------------------------
>
>                 Key: OAK-11991
>                 URL: https://issues.apache.org/jira/browse/OAK-11991
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: segment-azure, segment-tar
>            Reporter: Ieran Draghiciu
>            Priority: Major
>
> Tar archives with many segment files (more than 10,000) take too long to 
> recover. Investigate and implement a solution to optimize this process.



