[
https://issues.apache.org/jira/browse/HDDS-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17148034#comment-17148034
]
Siyao Meng edited comment on HDDS-3874 at 6/29/20, 6:09 PM:
------------------------------------------------------------
[~elek] I doubt this has anything to do with the FS interface. It looks like it
is stuck on a lock in SCM.
The OFS contract cluster config is exactly the same as the o3fs one
({{RootedOzoneContract#createCluster}}), so that should not be a variable.
I recall seeing a mini cluster setup/teardown bug locally: if I set up and tear
down a mini cluster more than once in the same test class, access to the second
cluster would get stuck and the test would time out (try
{{TestOzoneManagerListVolumes}}). I suspected a cleanup issue back then, but
the problem did not show up in GH workflow runs. Could be related.
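To make the suspected pattern concrete, here is a minimal, purely illustrative sketch. It does not use the real MiniOzoneCluster API: {{FakeCluster}}, the shared {{Semaphore}} lease, and {{flushCompletes}} are all hypothetical names, and the actual cause inside SCM is not established in this thread. It only models how a resource leaked by the first teardown could leave the second cluster's flush future incomplete, and how a bounded {{get}} surfaces that quickly instead of blocking until the 180000 ms test timeout.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ClusterReuseSketch {

    /** Hypothetical stand-in for a mini cluster; NOT the real MiniOzoneCluster API. */
    static final class FakeCluster {
        private final Semaphore lease;
        private final boolean holdsLease;

        FakeCluster(Semaphore sharedLease) {
            this.lease = sharedLease;
            // A cluster that cannot obtain the lease still starts,
            // but it can never finish a flush (assumption of this model).
            this.holdsLease = sharedLease.tryAcquire();
        }

        CompletableFuture<Void> flush() {
            if (holdsLease) {
                return CompletableFuture.completedFuture(null);
            }
            // Never completed: models the write that stays parked.
            return new CompletableFuture<>();
        }

        void shutdown(boolean releaseLease) {
            if (holdsLease && releaseLease) {
                lease.release();
            }
        }
    }

    /** Bounded wait: reports a stuck flush instead of blocking indefinitely. */
    static boolean flushCompletes(FakeCluster cluster, long timeoutMillis)
            throws InterruptedException, ExecutionException {
        try {
            cluster.flush().get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Resource that outlives a single cluster within one test class.
        Semaphore sharedLease = new Semaphore(1);

        FakeCluster first = new FakeCluster(sharedLease);
        System.out.println("first cluster flush completes: "
            + flushCompletes(first, 200));   // prints true
        first.shutdown(false);               // incomplete teardown leaks the lease

        FakeCluster second = new FakeCluster(sharedLease);
        System.out.println("second cluster flush completes: "
            + flushCompletes(second, 200));  // prints false
        second.shutdown(true);
    }
}
```

If the real hang follows this shape, calling {{shutdown(true)}} on the first cluster in the sketch corresponds to a complete teardown, after which the second cluster's flush completes as well.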
> ITestRootedOzoneContract tests are flaky
> ----------------------------------------
>
> Key: HDDS-3874
> URL: https://issues.apache.org/jira/browse/HDDS-3874
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Reporter: Marton Elek
> Assignee: Siyao Meng
> Priority: Blocker
>
> Different tests fail for similar reasons:
> {code}
> java.lang.Exception: test timed out after 180000 milliseconds
> at sun.misc.Unsafe.park(Native Method)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)
> at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
> at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
> at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.waitOnFlushFutures(BlockOutputStream.java:537)
> at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.handleFlush(BlockOutputStream.java:499)
> at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.close(BlockOutputStream.java:514)
> at org.apache.hadoop.ozone.client.io.BlockOutputStreamEntry.close(BlockOutputStreamEntry.java:149)
> at org.apache.hadoop.ozone.client.io.KeyOutputStream.handleStreamAction(KeyOutputStream.java:483)
> at org.apache.hadoop.ozone.client.io.KeyOutputStream.handleFlushOrClose(KeyOutputStream.java:457)
> at org.apache.hadoop.ozone.client.io.KeyOutputStream.close(KeyOutputStream.java:510)
> at org.apache.hadoop.fs.ozone.OzoneFSOutputStream.close(OzoneFSOutputStream.java:56)
> at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
> at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
> at org.apache.hadoop.fs.contract.ContractTestUtils.createFile(ContractTestUtils.java:638)
> at org.apache.hadoop.fs.contract.AbstractContractOpenTest.testOpenFileTwice(AbstractContractOpenTest.java:135)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}
> Example:
> https://github.com/elek/ozone-build-results/blob/master/2020/06/16/1051/it-filesystem-contract/hadoop-ozone/integration-test/org.apache.hadoop.fs.ozone.contract.rooted.ITestRootedOzoneContractOpen.txt
> The same problem also occurs here:
> https://github.com/elek/hadoop-ozone/runs/810175295?check_suite_focus=true
> (contract)