Re: Container is is running beyond physical memory limits. Current usage: 2.0 GB of 2 GB physical memory used; 2.9 GB of 4.2 GB virtual memory used. Killing container.

2017-04-14 Thread sohimankotia
Hi Shannon, Thanks for your response. First, yes, I am running Flink on Yarn and my job is running with parallelism 1. There are a few points that may help you narrow down a solution for me: 1. I have other jobs also running in the same cluster, but with parallelism greater than 1,

Re: Container is is running beyond physical memory limits. Current usage: 2.0 GB of 2 GB physical memory used; 2.9 GB of 4.2 GB virtual memory used. Killing container.

2017-04-14 Thread Shannon Carey
I've had similar problems when running Flink in Yarn. The Flink task manager fails and it can't restart jobs because there aren't enough slots; eventually Yarn decides to terminate Flink, and you lose all your jobs & state because Flink regards it as a graceful shutdown. My latest attempt

Re: Flink job on secure Yarn fails after many hours

2017-04-14 Thread Niels Basjes
Hi, No, this issue is now gone for us. The fix in 1.2.0 ensured that we are now able to run jobs on our cluster beyond the 7-day limit. Niels On Wed, Apr 12, 2017 at 5:35 PM, Robert Metzger wrote: > Niels, are you still facing this issue? > > As far as I understood it, the security changes

Re: Flink errors out and job fails--IOException from CollectSink.open()

2017-04-14 Thread Sathi Chowdhury
I am consistently seeing the same behavior. I tried with elevated memory for the job manager and task manager:
taskmanager.rpc.port: 6123
taskmanager.data.port: 4964
taskmanager.heap.mb: 39000
taskmanager.numberOfTaskSlots: 1
taskmanager.network.numberOfBuffers: 16368
taskmanager.memory.preallocate: false

Container is is running beyond physical memory limits. Current usage: 2.0 GB of 2 GB physical memory used; 2.9 GB of 4.2 GB virtual memory used. Killing container.

2017-04-14 Thread sohimankotia
I am running a Flink streaming job with parallelism 1. Suddenly, after 4 hours, the job failed. It showed: Container container_e39_1492083788459_0676_01_02 is completed with diagnostics: Container [pid=79546,containerID=container_e39_1492083788459_0676_01_02] is running beyond physical memor