Hello Mans,
On 1 Jan 2018, at 17:12, M Singh wrote:
> I am not sure if I missed it, but can you let us know what your input
> source and output sink are?
Reading from S3 and writing to S3.
However, the never-ending task 0.0 happens in a stage well before the output step.
Hello Gourav,
On 30 Dec 2017, at 20:20, Gourav Sengupta wrote:
> Please try to use the SPARK UI from the way that AWS EMR recommends, it
> should be available from the resource manager. I never ever had any problem
> working with it. THAT HAS ALWAYS BEEN MY PRIMARY
Thanks for the update, Kwon.
Regards,
On Mon, Jan 1, 2018 at 7:54 PM Hyukjin Kwon wrote:
> Hi,
>
>
> There's a PR - https://github.com/apache/spark/pull/18581 and JIRA
> - SPARK-21289
>
> Alternatively, you could check out multiLine option for CSV and see if
> applicable.
Hi,
There's a PR - https://github.com/apache/spark/pull/18581 and JIRA
- SPARK-21289
Alternatively, you could check out multiLine option for CSV and see if
applicable.
Thanks.
2017-12-30 2:19 GMT+09:00 sk skk:
> Hi,
>
> Do we have an option to write a csv or text
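For what it's worth, the problem that multiLine addresses is CSV records whose quoted fields contain embedded newlines: splitting the file on newlines breaks such records, while a real CSV parser (or Spark with `.option("multiLine", True).csv(path)`) keeps them intact. A plain-Python illustration of the difference:

```python
import csv
import io

# A CSV with one header row and one record whose quoted field spans two lines.
data = 'id,comment\n1,"first line\nsecond line"\n'

# Naive line-splitting sees three "records" because of the embedded newline.
naive_records = data.strip().split("\n")
print(len(naive_records))  # 3 physical lines, but only 2 logical CSV rows

# A proper CSV parser handles the quoted newline and yields 2 rows.
rows = list(csv.reader(io.StringIO(data)))
print(len(rows))       # 2: header + one record
print(rows[1][1])      # the multi-line field survives as a single value
```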
Hi,
I would like an opinion on using the *mesos cluster dispatcher*.
It worked for me on a two-machine Vagrant setup (i.e. a Mesos master and a slave).
Is it better to start the Spark driver using Marathon instead of the dispatcher?
The --supervise option can become a pain, as you cannot stop the driver.
please
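For context, a dispatcher-based submission typically looks roughly like the sketch below. Host names, port, and the application jar are placeholders; `start-mesos-dispatcher.sh`, cluster deploy mode, and the `--supervise` flag are part of the standard Spark distribution. This is a config sketch, not a runnable test.

```shell
# Start the dispatcher on the Mesos master node (placeholder master URL):
./sbin/start-mesos-dispatcher.sh --master mesos://mesos-master:5050

# Submit the driver in cluster mode through the dispatcher;
# --supervise restarts the driver on failure (which also makes it
# awkward to stop by hand, as noted above).
./bin/spark-submit \
  --master mesos://mesos-dispatcher:7077 \
  --deploy-mode cluster \
  --supervise \
  my-app.jar
```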
Hi Jeroen:
I am not sure if I missed it, but can you let us know what your input
source and output sink are?
In some cases, I found that saving to S3 was a problem. In our case, I started
saving the output to the EMR HDFS and later copied it to S3 using s3-dist-cp,
which solved the issue.
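The HDFS-then-copy workaround can be sketched as follows. The bucket and paths are placeholders, and it assumes an EMR cluster where s3-dist-cp is installed; this is a config sketch rather than a runnable test.

```shell
# 1. Have the Spark job write to the cluster's HDFS instead of S3, e.g.:
#      df.write.parquet("hdfs:///tmp/job-output")
# 2. Bulk-copy the finished output to S3 in one step:
s3-dist-cp --src hdfs:///tmp/job-output --dest s3://my-bucket/job-output
```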
Mans
Here is the list of things I will probably try:
1. Check GC on the offending executor while the task is running. Maybe
it needs even more memory.
2. Go back to a previous successful run of the job and check the
Spark UI for the offending stage and check max task time/max
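Regarding step 1, one way to surface executor GC behavior is to enable JVM GC logging through Spark's executor Java options. `spark.executor.extraJavaOptions` is a standard Spark configuration key and the flags are the usual JDK 8 GC-logging options, but the job script name is a placeholder; this is a config sketch, not a runnable test.

```shell
# Enable verbose GC logging on every executor JVM:
spark-submit \
  --conf "spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
  my_job.py   # placeholder job script
```

The Executors tab of the Spark UI also shows a per-executor "GC Time" column, which is a quicker first check.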