This is interesting. Would really appreciate it if you could share what exactly you changed in *core-site.xml* and *yarn-site.xml*.
On Wed, May 22, 2019 at 9:14 AM Gourav Sengupta wrote:
just wondering what is the advantage of doing this?
Regards
Gourav Sengupta
On Wed, May 22, 2019 at 3:01 AM Huizhe Wang wrote:
Hi Hari,
Thanks :) I tried doing it as you said. It works ;)
On Mon, May 20, 2019 at 3:54 PM Hariharan wrote:
> You can set the "fs.defaultFS" field in core-site.xml to some path on s3. […]
There is a kind of check in *yarn-site.xml*: the *yarn.nodemanager.remote-app-log-dir* property, set here to */var/yarn/logs*.
Using *hdfs://<namenode>:9000* as *fs.defaultFS* in *core-site.xml*, you have to run *hdfs dfs -mkdir /var/yarn/logs* first.
Using *S3://* as *fs.defaultFS*…
Take care of the *.dir* properties in *hdfs-site.xml* too.
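For concreteness, a sketch of that property as it would appear in *yarn-site.xml*, using the same /var/yarn/logs example value as above:

    <!-- where the NodeManager aggregates application logs -->
    <property>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>/var/yarn/logs</value>
    </property>

With *fs.defaultFS* pointing at HDFS, this directory has to exist on HDFS first, which is what the *hdfs dfs -mkdir /var/yarn/logs* step above takes care of.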
Hi Huizhe,
You can set the "fs.defaultFS" field in core-site.xml to some path on s3.
That way your spark job will use S3 for all operations that need HDFS.
Intermediate data will still be stored on local disk though.
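For reference, a minimal *core-site.xml* sketch along these lines (the bucket name is a placeholder, and it assumes the hadoop-aws / s3a connector jars are already on the classpath):

    <configuration>
        <!-- make S3 the default filesystem instead of HDFS -->
        <property>
            <name>fs.defaultFS</name>
            <value>s3a://your-bucket</value>
        </property>
    </configuration>

Credentials can then be supplied via *fs.s3a.access.key* / *fs.s3a.secret.key* or, on EC2, an instance profile.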
Thanks,
Hari
On Mon, May 20, 2019 at 10:14 AM Abdeali Kothari wrote:
While Spark can read from S3 directly in EMR, I believe it still needs HDFS to perform shuffles and to write intermediate data to disk when running jobs (i.e. when the in-memory data needs to spill over to disk).
For these operations, Spark does need a distributed file system - you could use something …
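A hedged aside on where that intermediate data lands: under YARN, executor scratch space comes from the NodeManager's local directories rather than from HDFS, configured in *yarn-site.xml* (the path below is a placeholder):

    <!-- local disk(s) used for container scratch space -->
    <property>
        <name>yarn.nodemanager.local-dirs</name>
        <value>/mnt/yarn/local</value>
    </property>

Spark containers launched by YARN write their shuffle and spill files under these directories.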
I am afraid not, because YARN needs a DFS.
On Mon, May 20, 2019 at 9:50 AM Huizhe Wang wrote:
Hi,
I want to use Spark on YARN without HDFS. I store my resources in AWS and use s3a to get them. However, when I used stop-dfs.sh to stop the NameNode and DataNode, I got an error when using yarn cluster mode. Could I use YARN without starting DFS, and how would I use this mode?
Yours,
Jane