Hi,

You may want to check HADOOP-10400 <https://issues.apache.org/jira/browse/HADOOP-10400> for the overhaul of the S3 filesystem, fixed in 2.6.
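For context, each URI scheme in Hadoop has two independent configuration bindings, one per API. A rough sketch of the relevant core-site.xml entries (the s3 value matches the old FileSystem-side default; the hdfs entry shows what an AbstractFileSystem binding looks like — the absence of such an entry for s3 is the gap discussed below):

```xml
<!-- Sketch only: property names follow Hadoop's core-default conventions. -->
<!-- FileSystem API binding (old interface) - this one exists for s3: -->
<property>
  <name>fs.s3.impl</name>
  <value>org.apache.hadoop.fs.s3.S3FileSystem</value>
</property>
<!-- AbstractFileSystem/FileContext binding - hdfs has one, but before
     HADOOP-10643 there is no equivalent "fs.AbstractFileSystem.s3.impl": -->
<property>
  <name>fs.AbstractFileSystem.hdfs.impl</name>
  <value>org.apache.hadoop.fs.Hdfs</value>
</property>
```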
The subclass of AbstractFileSystem was filed as HADOOP-10643
<https://issues.apache.org/jira/browse/HADOOP-10643>, but it was not
included in HADOOP-10400, though I made a comment
<https://issues.apache.org/jira/browse/HADOOP-10400?focusedCommentId=14104967&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14104967>
there. I suggest not using S3 as the defaultFS, as explained in "Why you
cannot use S3 as a replacement for HDFS"
<https://wiki.apache.org/hadoop/AmazonS3>, to avoid all sorts of these
issues. The best practice is to use S3 as a supplement to Hadoop, to get
lifecycle management (expiration and tiering) and a source/destination
over the internet.

Thanks,
Takenori

On Sun, Sep 28, 2014 at 5:23 PM, Naganarasimha G R (Naga)
<[email protected]> wrote:

> Hi Jay,
>
> Thanks a lot for replying; it clarifies most of it, but some parts are
> still not clear. Some clarifications from my side:
>
> | When you say "HDFS does not support fs.AbstractFileSystem.s3.impl"....
> | That is true. If your file system is configured using HDFS, then s3
> | urls will not be used, ever.
>
> :) I think I am not making that basic mistake. What we have done is set
> "fs.defaultFS" to "viewfs://nsX", and one of the mounts is S3, i.e.
> "fs.viewfs.mounttable.nsX.link./uds" points to "s3://hadoop/test1/".
> When run via "./yarn jar", it fails to even create a YARNRunner
> instance, because there is no mapping for
> "fs.AbstractFileSystem.s3.impl". And as per the code, even setting
> "fs.defaultFS" to S3 directly will not work, because there is no S3
> implementation of the AbstractFileSystem interface.
>
> These are my further queries:
>
> 1. What is the purpose of the AbstractFileSystem and FileSystem
>    interfaces?
> 2. Does the default HDFS package (code) support configuring S3? I see
>    an S3 implementation of the FileSystem interface
>    (org.apache.hadoop.fs.s3.S3FileSystem) but not of AbstractFileSystem,
>    so I presume it doesn't support S3 completely. What is the reason
>    for not supporting both?
> 3. Suppose I need to support Amazon S3: do I need to extend and
>    implement AbstractFileSystem and configure
>    "fs.AbstractFileSystem.s3.impl", or is there something more I need
>    to take care of?
>
> Regards,
> Naga
>
> Huawei Technologies Co., Ltd.
> Phone:
> Fax:
> Mobile: +91 9980040283
> Email: [email protected]
> Huawei Technologies Co., Ltd.
> Bantian, Longgang District, Shenzhen 518129, P.R.China
> http://www.huawei.com
>
> ------------------------------
> From: jay vyas [[email protected]]
> Sent: Saturday, September 27, 2014 02:41
> To: [email protected]
> Subject: Re:
>
> See https://wiki.apache.org/hadoop/HCFS/
>
> YES, YARN is written to the FileSystem interface. It works on
> S3FileSystem and GlusterFileSystem and any other HCFS.
>
> We have run, and continue to run, the many tests in Apache Bigtop's
> test suite against our Hadoop clusters running on alternative file
> system implementations, and it works.
>
> When you say "HDFS does not support fs.AbstractFileSystem.s3.impl"....
> That is true. If your file system is configured using HDFS, then s3
> urls will not be used, ever.
>
> When you create a FileSystem object in Hadoop, it reads the URI (i.e.
> "glusterfs:///") and then finds the file system binding in your
> core-site.xml (i.e. fs.AbstractFileSystem.glusterfs.impl).
>
> So the URI must have a corresponding entry in core-site.xml.
>
> As a reference implementation, you can see
> https://github.com/gluster/glusterfs-hadoop/blob/master/conf/core-site.xml
>
> On Fri, Sep 26, 2014 at 10:10 AM, Naganarasimha G R (Naga)
> <[email protected]> wrote:
>
>> Hi All,
>>
>> I have the following doubts on pluggable FileSystems and YARN:
>>
>> 1. If all the implementations should extend FileSystem, then why is
>>    there a parallel class AbstractFileSystem, which ViewFs extends?
>> 2. Is YARN supposed to run on any of the pluggable
>>    org.apache.hadoop.fs.FileSystem implementations, like S3? If so,
>>    then when submitting a job, on the client side YARNRunner calls
>>    FileContext.getFileContext(this.conf), which in turn calls
>>    FileContext.getAbstractFileSystem(), which throws an exception for
>>    S3. So I am not able to run a YARN job with ViewFS with S3 as a
>>    mount. And based on the code, even if I configure only S3 it is
>>    going to fail.
>> 3. Does HDFS not support "fs.AbstractFileSystem.s3.impl" with some
>>    default class, similar to org.apache.hadoop.fs.s3.S3FileSystem?
>>
>> Regards,
>> Naga
>
> --
> jay vyas
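The failure mode described in this thread can be modeled without a Hadoop cluster. The sketch below is a toy reimplementation of the scheme lookup that FileContext performs, not Hadoop's actual code; the property keys and the hdfs/viewfs class names mirror the real default bindings, and the deliberately missing "s3" entry reproduces why the FileContext.getFileContext() call in YARNRunner throws for an S3 mount.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (NOT Hadoop's real code) of how FileContext resolves a URI
// scheme: it looks up "fs.AbstractFileSystem.<scheme>.impl" in the
// configuration and fails if no binding exists - which is why an s3://
// mount breaks YARNRunner while hdfs:// and viewfs:// work.
public class SchemeLookupDemo {
    static final Map<String, String> conf = new HashMap<>();
    static {
        // Bindings that ship in Hadoop's default configuration (illustrative).
        conf.put("fs.AbstractFileSystem.hdfs.impl", "org.apache.hadoop.fs.Hdfs");
        conf.put("fs.AbstractFileSystem.viewfs.impl",
                 "org.apache.hadoop.fs.viewfs.ViewFs");
        // Note: no entry for "s3" - pre-2.6 Hadoop ships no AbstractFileSystem
        // implementation for it (the gap filed as HADOOP-10643).
    }

    static String resolve(String scheme) {
        String impl = conf.get("fs.AbstractFileSystem." + scheme + ".impl");
        if (impl == null) {
            // Stand-in for Hadoop's UnsupportedFileSystemException.
            throw new UnsupportedOperationException(
                "No AbstractFileSystem configured for scheme: " + scheme);
        }
        return impl;
    }

    public static void main(String[] args) {
        System.out.println(resolve("hdfs"));   // prints org.apache.hadoop.fs.Hdfs
        try {
            resolve("s3");                     // mirrors the reported failure
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Adding the missing key (pointing at an AbstractFileSystem subclass for s3) is exactly what makes the lookup succeed, which matches the advice in the thread to supply "fs.AbstractFileSystem.s3.impl".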
