Gabor Bota created HADOOP-15819:
-----------------------------------
Summary: S3A integration test failures: FileSystem is closed! -
without parallel test run
Key: HADOOP-15819
URL: https://issues.apache.org/jira/browse/HADOOP-15819
Project: Hadoop Common
Issue Type: Bug
Components: fs/s3
Affects Versions: 3.1.1
Reporter: Gabor Bota
Running the integration tests for hadoop-aws {{mvn -Dscale verify}} against
Amazon AWS S3 (eu-west-1, us-west-1, with no s3guard) we see a lot of these
failures:
{noformat}
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.408 s
<<< FAILURE! - in
org.apache.hadoop.fs.s3a.commit.staging.integration.ITDirectoryCommitMRJob
[ERROR]
testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITDirectoryCommitMRJob)
Time elapsed: 0.027 s <<< ERROR!
java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed!
[ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 4.345 s
<<< FAILURE! - in
org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob
[ERROR]
testStagingDirectory(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob)
Time elapsed: 0.021 s <<< ERROR!
java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed!
[ERROR]
testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob)
Time elapsed: 0.022 s <<< ERROR!
java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed!
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.489 s
<<< FAILURE! - in
org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJobBadDest
[ERROR]
testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJobBadDest)
Time elapsed: 0.023 s <<< ERROR!
java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed!
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.695 s
<<< FAILURE! - in org.apache.hadoop.fs.s3a.commit.magic.ITMagicCommitMRJob
[ERROR] testMRJob(org.apache.hadoop.fs.s3a.commit.magic.ITMagicCommitMRJob)
Time elapsed: 0.039 s <<< ERROR!
java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed!
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.015 s
<<< FAILURE! - in org.apache.hadoop.fs.s3a.commit.ITestS3ACommitterFactory
[ERROR]
testEverything(org.apache.hadoop.fs.s3a.commit.ITestS3ACommitterFactory) Time
elapsed: 0.014 s <<< ERROR!
java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed!
{noformat}
The big issue is that the tests are running in a serial manner - no test is
running on top of the other - so we should not see that the tests are failing
like this. The issue could be in how we handle
org.apache.hadoop.fs.FileSystem#CACHE - the tests should use the same
S3AFileSystem so if A test uses a FileSystem and closes it in teardown then B
test will get the same FileSystem object from the cache and try to use it, but
it is closed.
We see this a lot in our downstream testing too. It's not possible to tell that
the failed regression test result is an implementation issue in the runtime
code or a test implementation problem.
I've checked when and what closes the S3AFileSystem with a sightly modified
version of S3AFileSystem which logs the closers of the fs if an error should
occur. I'll attach this modified java file for reference. See the next example
of the result when it's running:
{noformat}
2018-10-04 00:52:25,596 [Thread-4201] ERROR s3a.S3ACloseEnforcedFileSystem
(S3ACloseEnforcedFileSystem.java:checkIfClosed(74)) - Use after close():
java.lang.RuntimeException: Using closed FS!.
at
org.apache.hadoop.fs.s3a.S3ACloseEnforcedFileSystem.checkIfClosed(S3ACloseEnforcedFileSystem.java:73)
at
org.apache.hadoop.fs.s3a.S3ACloseEnforcedFileSystem.mkdirs(S3ACloseEnforcedFileSystem.java:474)
at
org.apache.hadoop.fs.contract.AbstractFSContractTestBase.mkdirs(AbstractFSContractTestBase.java:338)
at
org.apache.hadoop.fs.contract.AbstractFSContractTestBase.setup(AbstractFSContractTestBase.java:193)
at
org.apache.hadoop.fs.s3a.ITestS3AClosedFS.setup(ITestS3AClosedFS.java:40)
at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
2018-10-04 00:52:25,596 [Thread-4201] WARN s3a.S3ACloseEnforcedFileSystem
(S3ACloseEnforcedFileSystem.java:logClosers(61)) - FS was closed 6 times by:
2018-10-04 00:52:25,596 [Thread-4201] WARN s3a.S3ACloseEnforcedFileSystem
(S3ACloseEnforcedFileSystem.java:logClosers(64)) - ----------------- close() #1
---------------
2018-10-04 00:52:25,596 [Thread-4201] WARN s3a.S3ACloseEnforcedFileSystem
(S3ACloseEnforcedFileSystem.java:logClosers(65)) - java.lang.RuntimeException:
close() called here
at
org.apache.hadoop.fs.s3a.S3ACloseEnforcedFileSystem.close(S3ACloseEnforcedFileSystem.java:83)
at org.apache.hadoop.io.IOUtils.cleanupWithLogger(IOUtils.java:280)
at org.apache.hadoop.io.IOUtils.closeStream(IOUtils.java:298)
at
org.apache.hadoop.fs.s3a.AbstractS3ATestBase.teardown(AbstractS3ATestBase.java:55)
at
org.apache.hadoop.fs.s3a.commit.AbstractCommitITest.teardown(AbstractCommitITest.java:206)
at
org.apache.hadoop.fs.s3a.auth.ITestAssumedRoleCommitOperations.teardown(ITestAssumedRoleCommitOperations.java:90)
at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
2018-10-04 00:52:25,596 [Thread-4201] WARN s3a.S3ACloseEnforcedFileSystem
(S3ACloseEnforcedFileSystem.java:logClosers(64)) - ----------------- close() #2
---------------
2018-10-04 00:52:25,596 [Thread-4201] WARN s3a.S3ACloseEnforcedFileSystem
(S3ACloseEnforcedFileSystem.java:logClosers(65)) - java.lang.RuntimeException:
close() called here
at
org.apache.hadoop.fs.s3a.S3ACloseEnforcedFileSystem.close(S3ACloseEnforcedFileSystem.java:83)
at org.apache.hadoop.io.IOUtils.cleanupWithLogger(IOUtils.java:280)
at org.apache.hadoop.io.IOUtils.closeStream(IOUtils.java:298)
at
org.apache.hadoop.fs.s3a.AbstractS3ATestBase.teardown(AbstractS3ATestBase.java:55)
at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
2018-10-04 00:52:25,597 [Thread-4201] WARN s3a.S3ACloseEnforcedFileSystem
(S3ACloseEnforcedFileSystem.java:logClosers(64)) - ----------------- close() #3
---------------
2018-10-04 00:52:25,598 [Thread-4201] WARN s3a.S3ACloseEnforcedFileSystem
(S3ACloseEnforcedFileSystem.java:logClosers(65)) - java.lang.RuntimeException:
close() called here
at
org.apache.hadoop.fs.s3a.S3ACloseEnforcedFileSystem.close(S3ACloseEnforcedFileSystem.java:83)
at org.apache.hadoop.io.IOUtils.cleanupWithLogger(IOUtils.java:280)
at org.apache.hadoop.io.IOUtils.closeStream(IOUtils.java:298)
at
org.apache.hadoop.fs.s3a.AbstractS3ATestBase.teardown(AbstractS3ATestBase.java:55)
at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
(...)
{noformat}
>From the logs it seems that the closers are the teardown methods of the
>abstract test classes, so the same FileSystem object gets reused between tests.
To run a test with the modified fs the following should be in the config:
{noformat}
<property>
<name>fs.s3a.impl</name>
<value>org.apache.hadoop.fs.s3a.S3ACloseEnforcedFileSystem</value>
<description>The implementation class of the S3A Filesystem</description>
</property>
{noformat}
I file this jira to solve this issue, and start a conversation if needed.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]