Jey,

On Mon, Jan 20, 2014 at 10:59 PM, Jey Kottalam <[email protected]> wrote:

> >> This sounds like either a bug or somehow the S3 library requiring lots of
> >> memory to read a block. There isn’t a separate way to run HDFS over S3.
> >> Hadoop just has different implementations of “file systems”, one of which
> >> is S3. There’s a pointer to these versions at the bottom of
> >> http://spark.incubator.apache.org/docs/latest/ec2-scripts.html#accessing-data-in-s3
> >> but it is indeed pretty hidden in the docs.
> >
> >
> > Hmmm. Maybe a bug then. If I read a small 600 byte file via the s3n:// uri -
> > it works on a spark cluster. If I try a 20GB file it just sits and sits and
> > sits frozen. Is there anything I can do to instrument this and figure out
> > what is going on?
> >
>
> Try taking a look at the stderr log of the executor that failed. You
> should hopefully see a more detailed error message there. The stderr
> logs can be found by browsing to http://mymaster:8080, where
> `mymaster` is the hostname of your Spark master.
>

Thanks. I will try that, but your assumption is that something is failing in
an obvious way and leaving a message behind. Judging by the spark-shell, which
just sits there frozen, I would say something is "stuck". I will report back.
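
For reference, this is roughly what I am running in the spark-shell (a rough
sketch; the bucket name, file paths and credential values below are
placeholders, not the real ones):

  // s3n:// needs the AWS credentials set on the SparkContext's Hadoop config
  sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", "MY_ACCESS_KEY")
  sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", "MY_SECRET_KEY")

  // the ~600 byte file comes back immediately
  val small = sc.textFile("s3n://my-bucket/small-file.txt")
  small.count()

  // the ~20GB file just sits there, with no error printed in the shell
  val big = sc.textFile("s3n://my-bucket/big-file.txt")
  big.count()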

Thanks,
Ognen
