Can anyone confirm whether Spark on YARN is the problem here, or whether
it's how AWS has put it together?  I'm wondering if Spark on YARN has
problems distributing configuration files to the driver and workers.


Peter Halliday

On Thu, Apr 14, 2016 at 1:09 PM, Peter Halliday <pjh...@cornell.edu> wrote:

> An update to this is that I can see the log4j.properties and
> metrics.properties files correctly on the master.  When I submit a Spark
> Step that runs Spark in cluster deploy mode, I see the cluster files
> being zipped up and pushed via HDFS to the driver and workers.  However, I
> don't see evidence that the configuration files are read from or used after
> they are pushed....
>
> On Wed, Apr 13, 2016 at 11:22 AM, Peter Halliday <pjh...@cornell.edu>
> wrote:
>
>> I have an existing cluster that I stand up via Docker images and
>> CloudFormation Templates on AWS.  We are moving to EMR and the AWS Data
>> Pipeline process, and having problems with metrics and log4j.  We’ve sent a
>> JSON configuration for spark-log4j and spark-metrics.  The log4j file seems
>> to be basically working for the master; however, it isn’t working for the
>> driver and executors, and I’m not sure why.  Also, the metrics aren’t
>> working anywhere.  We’re using CloudWatch to log the metrics, but there
>> doesn’t seem to be a CloudWatch sink for Spark on EMR, so we created one
>> and added it to a jar that’s sent via --jars to spark-submit.
>>
>> Peter Halliday
>
>
>
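For reference, a minimal sketch of the kind of EMR configuration JSON being described above. The `spark-log4j` and `spark-metrics` classification names are the standard EMR configuration classifications; the property values and the sink class name (`com.example.metrics.CloudWatchSink`) are hypothetical placeholders standing in for the custom CloudWatch sink mentioned in the thread:

```json
[
  {
    "Classification": "spark-log4j",
    "Properties": {
      "log4j.rootCategory": "INFO, console"
    }
  },
  {
    "Classification": "spark-metrics",
    "Properties": {
      "*.sink.cloudwatch.class": "com.example.metrics.CloudWatchSink",
      "*.sink.cloudwatch.period": "60"
    }
  }
]
```

On EMR these classifications are written into the node-local /etc/spark/conf files, which is consistent with the observation that the master picks them up while cluster-mode drivers and executors may not.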
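One commonly suggested workaround for log4j.properties not being honored by cluster-mode drivers and executors is to point the JVMs at the file explicitly via extra Java options. A sketch of the spark-defaults entries, assuming the file lives at the usual EMR path /etc/spark/conf/log4j.properties on every node (path is an assumption here, not confirmed by the thread):

```
# Hypothetical spark-defaults.conf fragment; the file path is an assumption.
spark.driver.extraJavaOptions    -Dlog4j.configuration=file:/etc/spark/conf/log4j.properties
spark.executor.extraJavaOptions  -Dlog4j.configuration=file:/etc/spark/conf/log4j.properties
```

This only works if the file actually exists at that path on the nodes where the driver and executors launch, which is exactly what the thread is trying to verify.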
