Hi Lasse,
I've run into this before. I think the non-heap memory trend in the graph you
attached may be the "expected" result. By default, RocksDB keeps a filter
(bloom filter) in memory for every opened SST file, and the number of SST
files grows over time, so the non-heap usage keeps climbing.
Please see the last comment on this issue:
https://github.com/facebook/rocksdb/issues/3216
FYI
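If that is what you're hitting, one possible mitigation (a sketch only,
assuming the RocksDB state backend and Flink's OptionsFactory hook; the cache
size is illustrative) is to move index/filter blocks into a bounded block
cache instead of keeping one filter per open SST file:

import org.apache.flink.contrib.streaming.state.OptionsFactory;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.ColumnFamilyOptions;
import org.rocksdb.DBOptions;

// Sketch: bound filter/index memory by caching those blocks in the
// block cache (size illustrative) instead of unbounded per-file memory.
public class BoundedFilterMemory implements OptionsFactory {
    @Override
    public DBOptions createDBOptions(DBOptions current) {
        return current;
    }
    @Override
    public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions current) {
        BlockBasedTableConfig table = new BlockBasedTableConfig()
            .setBlockCacheSize(64 * 1024 * 1024)   // 64 MB, illustrative
            .setCacheIndexAndFilterBlocks(true);   // filters live in the bounded cache
        return current.setTableFormatConfig(table);
    }
}

// usage (sketch): rocksDbBackend.setOptions(new BoundedFilterMemory());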
On Tue, Apr 10, 2018 at 12:25 AM, Lasse Nedergaard <lassenederga...@gmail.com> wrote:
>
> This graph shows Non-Heap. If the same pattern exists it makes sense that
> it will try to allocate more memory and then exceed the limit. I can see
> the trend for all other containers that have been killed. So my question
> is now: what is using non-heap memory?
This graph shows Non-Heap. If the same pattern exists it makes sense that it
will try to allocate more memory and then exceed the limit. I can see the
trend for all other containers that have been killed. So my question is now:
what is using non-heap memory?
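One way to see what's behind the non-heap number is the JVM's own Native
Memory Tracking (a generic sketch, not from this thread; note NMT only covers
JVM-internal allocations, so RocksDB's native memory won't show up there,
which is itself a useful hint when the numbers don't add up):

# flink-conf.yaml (illustrative): start the TaskManager JVM with NMT enabled
env.java.opts: -XX:NativeMemoryTracking=summary

# then query the running TaskManager by pid
jcmd <pid> VM.native_memory summary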
Hi.
I found the exception attached below, for our simple job. It states that
our task manager was killed due to exceeding the memory limit of 2.7 GB.
But when I look at the Flink metrics just 30 seconds before, it used 1.3 GB
heap and 712 MB non-heap, around 2 GB in total.
So something else is also using memory inside the container.
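For what it's worth, on YARN Flink reserves part of the container for exactly
this kind of off-heap use; the 1.4 keys look like this (a sketch, values
illustrative; a larger cutoff leaves more headroom for native allocations
such as RocksDB):

# flink-conf.yaml
containerized.heap-cutoff-ratio: 0.3   # fraction of container memory kept for non-heap
containerized.heap-cutoff-min: 600     # minimum cutoff in MB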
Hi Chesnay,
Don’t know if this helps, but I’d run into this as well, though I haven’t
hooked up YourKit to analyze exactly what’s causing the memory problem.
E.g. after about 3.5 hours running locally, it failed with memory issues.
In the TaskManager logs, I start seeing exceptions in my
Same story here, 1.3.2 on K8s. Very hard to find the reason why a TM is
killed. Not likely caused by a memory leak. If there is a logger I should
turn on, please let me know.
On Mon, Apr 9, 2018, 13:41 Lasse Nedergaard wrote:
> We see the same running 1.4.2 on Yarn hosted
We see the same running 1.4.2 on YARN hosted on an AWS EMR cluster. The only
thing I can find in the logs are SIGTERMs with code 15 or -100.
Today our simple job reading from Kinesis and writing to Cassandra was killed.
The other day, in another job, I identified a map state.remove command
We will need more information to offer any solution. The exception
simply means that a TaskManager shut down, for which there are a myriad
of possible explanations.
Please have a look at the TaskManager logs; they may contain a hint as to
why it shut down.
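Something like this is often enough to find the last words (log path and
patterns are illustrative):

grep -iE "sigterm|outofmemory|metaspace|kill" log/flink-*-taskmanager-*.log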
On 09.04.2018 16:01, Javier Lopez wrote:
Hi,
"are you moving the job jar to the ~/flink-1.4.2/lib path ? " -> Yes, to
every node in the cluster.
On 9 April 2018 at 15:37, miki haiat wrote:
> Javier
> "adding the jar file to the /lib path of every task manager"
> are you moving the job jar to the ~/flink-1.4.2/lib path?
Javier
"adding the jar file to the /lib path of every task manager"
are you moving the job jar to the ~/flink-1.4.2/lib path?
On Mon, Apr 9, 2018 at 12:23 PM, Javier Lopez wrote:
> Hi,
>
> We had the same metaspace problem, it was solved by adding the jar file to
>
Hi,
We had the same metaspace problem; it was solved by adding the jar file to
the /lib path of every task manager, as explained here:
https://ci.apache.org/projects/flink/flink-docs-release-1.4/monitoring/debugging_classloading.html#avoiding-dynamic-classloading.
As well we added these java
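(The exact options are cut off above, but metaspace-related flags typically
look like this; values illustrative. A hard cap makes a classloading leak
fail fast with a clear OutOfMemoryError instead of silently growing the
container:)

# flink-conf.yaml
env.java.opts: -XX:MaxMetaspaceSize=256m -XX:+HeapDumpOnOutOfMemoryError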
I've seen a similar problem, but it was not the heap size, but Metaspace.
It was caused by a job restarting in a loop. It looks like for each restart,
Flink loads new instances of the classes and very soon it runs out of metaspace.
I've created a JIRA issue for this problem, but got no response from the
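To confirm that pattern, metaspace growth can be watched across restarts with
plain jstat (sketch; <pid> is the TaskManager JVM, MC/MU are metaspace
capacity/used in KB):

jstat -gc <pid> 5s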
I have seen this when my task manager ran out of RAM. Increase the heap size.
flink-conf.yaml:
taskmanager.heap.mb
jobmanager.heap.mb
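For example (values illustrative, size them to your machines):

taskmanager.heap.mb: 4096
jobmanager.heap.mb: 1024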
Michael
> On Apr 8, 2018, at 2:36 AM, 王凯 wrote:
>
> Hi all, recently I found a problem. It runs well when