[
https://issues.apache.org/jira/browse/HADOOP-11262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pieter Reuse updated HADOOP-11262:
----------------------------------
Attachment: HADOOP-11262-6.patch
Patch 6:
As requested, extended patch 5 with tests. The following test classes are
subclassed and overridden with S3A specifics:
* _TestFileContext.java_
* _FileContextCreateMkdirBaseTest.java_
* _FileContextMainOperationsBaseTest.java_
* _FCStatisticsBaseTest.java_
* _FileContextURIBase.java_
* _FileContextUtilBase.java_
In doing so, fixed the following bugs in _FileContextMainOperationsBaseTest.java_:
* _line 1169_: creating a symlink on an FS that doesn't support this throws an
_UnsupportedOperationException_, not an _IOException_ (see
_FileContext.java:1441_).
* _lines 1252 and 1313_: the contract of _read()_ does not guarantee that the
whole file is read; that is the contract of _readFully()_. Tests that assume the
whole file has been read should therefore use _readFully()_ instead of _read()_.
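To illustrate the _read()_ versus _readFully()_ point, here is a minimal sketch using only java.io. The class _ReadContractSketch_ and its _ShortReadStream_ helper are hypothetical names made up for this example and are not part of the patch; the helper simply hands out a few bytes per call, which the _InputStream_ contract allows.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical demo class; not part of Hadoop or of the patch. */
class ReadContractSketch {

    /** Hands out at most `chunk` bytes per bulk read, which InputStream permits. */
    static final class ShortReadStream extends InputStream {
        private final InputStream in;
        private final int chunk;

        ShortReadStream(InputStream in, int chunk) {
            this.in = in;
            this.chunk = chunk;
        }

        @Override
        public int read() throws IOException {
            return in.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return in.read(b, off, Math.min(len, chunk));
        }
    }

    /** A single read() call: returns the number of bytes actually delivered. */
    static int singleRead(byte[] data, int chunk) {
        try (InputStream in = new ShortReadStream(new ByteArrayInputStream(data), chunk)) {
            return in.read(new byte[data.length], 0, data.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** readFully() keeps reading until the buffer is full, or throws EOFException. */
    static byte[] fullRead(byte[] data, int chunk) {
        byte[] buf = new byte[data.length];
        try (DataInputStream in =
                 new DataInputStream(new ShortReadStream(new ByteArrayInputStream(data), chunk))) {
            in.readFully(buf, 0, buf.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf;
    }

    public static void main(String[] args) {
        byte[] data = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        System.out.println("single read() delivered " + singleRead(data, 4) + " bytes");
        System.out.println("readFully() delivered " + fullRead(data, 4).length + " bytes");
    }
}
```

A test that compares the buffer against the written data after a single _read()_ only passes when the stream happens to deliver everything in one call, which is why _readFully()_ is the safer choice in such tests.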
And added an enhancement for object storage systems in the same file:
* _line 1238_: on an object storage system a file does not exist *before* it is
closed (nor does it have a checksum at that moment), so an _IOException_ is
thrown. This object-storage issue is resolved by changing the order of
_fc.setVerifyChecksum(true, path)_ and _out.write(data, 0, data.length)_; the
change does not affect the behaviour on HDFS or other file systems.
Discovered and patched the following related bugs in S3A:
* Bugfix in _S3AFileSystem.java_: ports on S3 should be ignored, which
corresponds to a port value of -1 (instead of the default 0 in _FileSystem_).
* Another bugfix is in _S3AFileStatus.java_: _getModificationTime()_ is
overridden for directories. It returns _System.currentTimeMillis()_ because an
object store does not keep track of the modification times of directories.
Because some parts of the Hadoop ecosystem use the modification time to ignore
or delete "old" directories (e.g. the YarnHistoryServer), returning 0 for
directories is not the best option here.
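The two S3A fixes above can be sketched in plain Java. Everything here is hypothetical and only mimics the idea: _S3AFixesSketch_, _Status_, _portOf_ and _looksStale_ are names made up for this example, not the real _S3AFileSystem_ or _S3AFileStatus_ code. The port part relies on the fact that _java.net.URI.getPort()_ reports -1 when a URI carries no port; that this is the reason -1 is the right value for S3A is an assumption.

```java
import java.net.URI;

/** Hypothetical demo class; not the real S3AFileSystem / S3AFileStatus. */
class S3AFixesSketch {

    /** java.net.URI reports -1 when the URI has no explicit port. */
    static int portOf(String uri) {
        return URI.create(uri).getPort();
    }

    /** Hypothetical stand-in for a file status object. */
    static final class Status {
        private final boolean isDir;
        private final long storedModTime;

        Status(boolean isDir, long storedModTime) {
            this.isDir = isDir;
            this.storedModTime = storedModTime;
        }

        /** Directories have no stored timestamp, so report "now" instead of 0. */
        long getModificationTime() {
            return isDir ? System.currentTimeMillis() : storedModTime;
        }
    }

    /** Hypothetical cleaner rule: anything older than maxAgeMillis counts as stale. */
    static boolean looksStale(Status s, long maxAgeMillis) {
        return System.currentTimeMillis() - s.getModificationTime() > maxAgeMillis;
    }

    public static void main(String[] args) {
        System.out.println("port of s3a://bucket/key = " + portOf("s3a://bucket/key"));

        Status dir = new Status(true, 0L);
        Status oldFile = new Status(false, 0L);
        System.out.println("directory stale? " + looksStale(dir, 60_000L));
        System.out.println("file with mtime 0 stale? " + looksStale(oldFile, 60_000L));
    }
}
```

With a modification time of 0, an age-based cleaner like the hypothetical _looksStale_ would treat every directory as decades old; reporting the current time avoids that, at the cost of directories never appearing old.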
Added _TestS3AMiniYarnCluster.java_, which runs a simple _WordCount_ MapReduce
job on a _YarnMiniCluster_ using S3A as the filesystem.
> Enable YARN to use S3A
> -----------------------
>
> Key: HADOOP-11262
> URL: https://issues.apache.org/jira/browse/HADOOP-11262
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Thomas Demoor
> Assignee: Pieter Reuse
> Labels: amazon, s3
> Attachments: HADOOP-11262-2.patch, HADOOP-11262-3.patch,
> HADOOP-11262-4.patch, HADOOP-11262-5.patch, HADOOP-11262-6.patch,
> HADOOP-11262.patch
>
>
> Uses DelegateToFileSystem to expose S3A as an AbstractFileSystem.