[
https://issues.apache.org/jira/browse/OAK-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Davide Giannella closed OAK-5034.
---------------------------------
Bulk close for 1.5.13
> FileStoreUtil#readSegmentWithRetry max retry delay is too short to be
> functional
> --------------------------------------------------------------------------------
>
> Key: OAK-5034
> URL: https://issues.apache.org/jira/browse/OAK-5034
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: segment-tar
> Affects Versions: Segment Tar 0.0.16
> Reporter: Timothee Maret
> Assignee: Francesco Mari
> Fix For: 1.6, 1.5.13
>
> Attachments: OAK-5034.patch
>
>
> Commit {{1765838}} introduced the {{FileStoreUtil#readSegmentWithRetry}}
> utility and reduced the delay between two tries from 2 s to 0.125 s, while
> the total number of tries did not change.
> This does not give enough time for the server to find references and
> segments, thus causing exceptions such as
> {code}
> 29.10.2016 05:07:37.242 *ERROR* [sling-default-2-Registered Service.605] org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSync Failed synchronizing state.
> java.lang.IllegalStateException: Unable to read references of segment 5168c878-3a3f-49d0-aea9-b8b57d5d867f from primary
> at org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSyncExecution.readReferences(StandbyClientSyncExecution.java:196)
> at org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSyncExecution.copySegmentHierarchyFromPrimary(StandbyClientSyncExecution.java:130)
> at org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSyncExecution.compareAgainstBaseState(StandbyClientSyncExecution.java:94)
> at org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSyncExecution.execute(StandbyClientSyncExecution.java:74)
> at org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSync.run(StandbyClientSync.java:143)
> at org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:118)
> at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> These exceptions are thrown on the standby client and ultimately cause the
> integration tests to fail.
> IIUC, the minimum delay between retries should be longer than a TarMK flush
> cycle (5 s).
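
To illustrate the timing argument above, here is a minimal sketch of a fixed-delay retry loop. It is not the actual {{FileStoreUtil#readSegmentWithRetry}} code: the class name `RetrySketch`, both method names, and the try count used in the note below are hypothetical, since the issue does not state how many tries are made.

```java
import java.util.function.Supplier;

// Hypothetical sketch of a fixed-delay retry loop; not the Oak implementation.
final class RetrySketch {

    // Total time spent sleeping when every try fails:
    // one pause between each pair of consecutive tries.
    static long totalWaitMillis(int tries, long delayMillis) {
        return (long) Math.max(0, tries - 1) * delayMillis;
    }

    // Calls `read` up to `tries` times, sleeping `delayMillis` between
    // attempts. Returns the first non-null result, or null if all tries fail
    // or the thread is interrupted while waiting.
    static <T> T readWithRetry(Supplier<T> read, int tries, long delayMillis) {
        for (int i = 0; i < tries; i++) {
            T result = read.get();
            if (result != null) {
                return result;
            }
            if (i < tries - 1) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and give up early.
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
        }
        return null;
    }
}
```

With an assumed 4 tries, `totalWaitMillis(4, 125)` is 375 ms, while `totalWaitMillis(4, 2000)` is 6000 ms. Under that assumption only the 2 s delay keeps the client waiting longer than one 5 s flush cycle.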