I'm also facing the same problem. I have implemented a Java-based custom receiver that consumes from a messaging system (JMS, say). Once a message is received, I call store(object); I'm storing a Spark Row object.
It runs for around 8 hrs and then goes OOM, and the OOM happens on the receiver nodes. I also tried running multiple receivers to distribute the load, but I hit the same issue. We are doing something fundamentally wrong that should tell the custom receiver/Spark to release the memory, but I have not been able to crack it, at least not yet. Any help is appreciated, Spark group!

Regards,
Manish

On Sun, Mar 5, 2017 at 6:37 PM, Charles O. Bajomo <charles.baj...@pretechconsulting.co.uk> wrote:
> Hello all,
>
> I have some strange behaviour I can't understand. I have a streaming job
> using a custom Java receiver that pulls data from a JMS queue, which I
> process and then write to HDFS as Parquet and Avro files. For some reason
> my job keeps failing after 1 hour and 30 minutes. When it fails I get an
> error saying "container is running beyond physical memory limits. Current
> Usage 4.5GB of 4.5GB physical memory used. 6.4GB of 9.4GB virtual memory
> used.". To be honest I don't understand the error. What are the memory
> limits shown in the error referring to? I allocated 10 executors with 6
> cores each and 4G of executor and driver memory. I set the overhead memory
> to 2.8G, so the values don't add up.
>
> Anyone have any idea what the error is referring to? I have increased the
> memory and it didn't help; it appears it just bought me more time.
>
> Thanks.
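For what it's worth, the "4.5GB of 4.5GB" limit in the quoted error looks suspiciously like executor memory plus the *default* memory overhead rather than the configured 2.8G. A minimal sketch of YARN's container-sizing arithmetic, assuming the usual default for spark.yarn.executor.memoryOverhead of max(384 MB, 10% of executor memory):

```python
# Sketch: reproduce the YARN container size from the error message.
# Assumption: the configured 2.8G overhead was NOT picked up, so the
# default overhead formula max(384 MB, 10% of executor memory) applied.

executor_memory_mb = 4 * 1024  # --executor-memory 4G

def default_overhead_mb(executor_mb, factor=0.10, floor_mb=384):
    # Default spark.yarn.executor.memoryOverhead when none is set.
    return max(floor_mb, int(executor_mb * factor))

container_mb = executor_memory_mb + default_overhead_mb(executor_memory_mb)
print(container_mb)            # 4505 MB, i.e. roughly the 4.5GB in the error
```

If the 2.8G overhead had taken effect, the limit should have read about 6.8GB, which is why the reported values "don't add up".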
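A hedged sketch of how the overhead is commonly passed on YARN with Spark 1.x/2.x (the property name and the jar name here are assumptions; spark.yarn.executor.memoryOverhead takes a value in MB). Worth double-checking that the setting actually reaches the cluster, since the container limit in the error should reflect it:

```shell
# Sketch only: submit with the overhead set explicitly in MB.
# If this is applied, the YARN limit should read ~4G + 2.8G = 6.8G.
spark-submit \
  --master yarn \
  --num-executors 10 \
  --executor-cores 6 \
  --executor-memory 4G \
  --driver-memory 4G \
  --conf spark.yarn.executor.memoryOverhead=2867 \
  your-streaming-job.jar
```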