Hi Xintong,
Thanks very much for the response. Let me check out the new UI on Flink
1.12.
I asked this question because our Flink cluster on k8s shows
container_working_set_bytes (used by the OOM killer) to be > 3 GB. I assume that
the used (heap, non-heap) values on the UI are correct.
Thanks for the discussion, JING ZHANG. I like the first proposal since it
is simple and consistent with the DataStream API. It would be helpful to add more
docs about the special late case in WindowAggregate. I also look forward to
more flexible emit strategies later.
On Fri, Jul 2, 2021 at 10:33 AM, Jark Wu wrote:
> Sorry,
Hi Xiuming,
+1 on your idea.
BTW, Flink also provides a debug tool to track the latency of records
travelling through the system[1]. But you should note the following issues
if you enable latency tracking.
(1) It's a tool for debugging purposes because enabling latency metrics can
significantly
Sorry, I made a typo above. I mean I prefer proposal (1) that
only needs to set `table.exec.emit.allow-lateness` to handle late events.
`table.exec.emit.late-fire.delay` is optional and defaults to 0s.
`table.exec.state.ttl` will not affect window state anymore, so window state
is still
Hi Sudharsan,
The non-heap max is decided by JVM automatically and is not controlled by
Flink. Moreover, it doesn't mean Flink will use up to that size of non-heap
memory.
These metrics are fetched directly from JVM and do not correspond well with
Flink's memory configurations, which very often
Experts, could you please take a look at why a type cast exception occurs there?
-- Original Message --
From: kcz <573693...@qq.com
Sent: July 1, 2021 22:49
To: user-zh
Another side question: shall we add a metric to cover the complete restarting
time (phase 1 + phase 2)? The current metric jm.restartingTime only covers
phase 1. Thanks!
Best
Lu
On Thu, Jul 1, 2021 at 12:09 PM Lu Niu wrote:
> Thanks Till and Yang for the help! Also thanks Till for a quick fix!
>
> I did
Thanks Till and Yang for the help! Also thanks Till for a quick fix!
I did another test yesterday. In this test, I intentionally threw an exception
from the source operator:
```
if (runtimeContext.getIndexOfThisSubtask() == 1
&& errorFrenquecyInMin > 0
&& System.currentTimeMillis() -
```
Hi,
On my flink setup, I have taskmanager.memory.process.size set to 2536M. I
expect all the memory components shown on the UI to add up to this number.
However, I don't see this.
I have Flink managed memory: 811 MB
JVM heap max: 886 MB
JVM non-heap max: 744 MB
Direct memory: 204 MB
This adds
Hello everyone,
I'm testing a custom Kubernetes operator that should fulfill some specific
requirements I have for Flink. I know of this WIP project:
https://github.com/wangyang0918/flink-native-k8s-operator
I can see that it uses some classes that aren't publicly documented, and I
believe it
Hi Shilpa,
I've confirmed that "recovered" jobs are not compatible between minor
versions of Flink (e.g., between 1.12 and 1.13). I believe the issue is
that the session cluster was upgraded to 1.13 without first stopping the
jobs running on it.
If this is the case, the workaround is to stop
Unsubscribe
高耀军
18221112...@163.com
Signature customized by NetEase Mail Master
:1.13.1 :
Caused by: java.lang.ClassCastException:
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.node.TextNode
cannot be cast to
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.node.ObjectNode
at
Unsubscribe
Hi Stephan,
Thanks for the detailed explanation! This really helps understand all this
better. Appreciate your help!
Regards,
Sonam
From: Stephan Ewen
Sent: Wednesday, June 30, 2021 3:56:22 AM
To: Sonam Mandal ; user@flink.apache.org
Cc:
A quick addition, I think with FLINK-23202 it should now also be possible
to improve the heartbeat mechanism in the general case. We can leverage the
unreachability exception thrown if a remote target is no longer reachable
to mark a heartbeat target as no longer reachable [1]. This can then be
Thanks Jing for bringing up this topic,
The emit strategy configs are annotated as Experimental and are not public in the
documentation.
However, I see this is a very useful feature which many users are looking
for.
I have posted these configs for many questions like "how to handle late
events in SQL".
Hi Zhu,
Does it mean our upgrades are going to fail and the jobs are not backward
compatible?
I did verify the job itself is built using 1.13.0.
Is there a workaround for this?
Thanks,
Shilpa
On Wed, Jun 30, 2021 at 11:14 PM Zhu Zhu wrote:
> Hi Shilpa,
>
> JobType was introduced in 1.13. So
Since you are deploying Flink workloads on Yarn, the Flink ResourceManager
should get the container completion event after the heartbeat of
Yarn NM->Yarn RM->Flink RM, which is 8 seconds by default.
The Flink ResourceManager will then release the dead TaskManager container once
it has received the completion
Thanks for the pointer. Let me try upgrading Flink.
On Thu, Jul 1, 2021 at 5:29 PM Yun Tang wrote:
> Hi Tao,
>
> I ran your program with Flink-1.12.1 and found that the problem you described
> really exists. Things work normally after switching to Flink-1.12.2.
>
> After digging into
Hi Tao,
I ran your program with Flink-1.12.1 and found that the problem you described really
exists. Things work normally after switching to Flink-1.12.2.
After digging into the root cause, I found this is caused by a fixed bug [1]: If a
legacy source task fails outside of the legacy
The analysis of Gen is correct. Flink currently uses its heartbeat as the
primary means to detect dead TaskManagers. This means that Flink will take
at least `heartbeat.timeout` time before the system recovers. Even if the
cancellation happens fast (e.g. by having configured a low
Hi Tao,
It looks like it should work, considering that you have a sleep of 1 second
before each emission. I'm going to add Roman to this thread. Maybe he
sees something obvious which I'm missing.
Could you run the job with the log level set to debug and provide the logs
once more?
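A minimal sketch of raising the log level, assuming the default Log4j 2 properties file shipped in Flink's `conf/` directory (`conf/log4j.properties`); a setup with a custom logging configuration or a different logging backend will differ:

```properties
# conf/log4j.properties (assumed default file): change the root level from INFO to DEBUG
rootLogger.level = DEBUG
```

Depending on the setup, the JobManager and TaskManager processes may need a restart to pick up the change.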
> 1. How to specify the number of TaskManagers?
> In batch mode, I tried to use (Max Parallelism / (cores per tm)), but it
> does not work. The number of TaskManagers is much larger than (Max Parallelism /
> cores per tm).
It is not the cores per TM, but the number of slots per TM. Please refer
to
谢治平
Email: xiezhiping...@163.com
On June 3, 2021 at 19:06, 李朋辉 wrote:
Unsubscribe
李朋辉
Email: lipengh...@126.com
Signature customized by NetEase Mail Master
Hi Ingo,
Thank you for your advice. We have not tried it yet; we just thought it might
work that way, but it now seems it does not. We'll see how we could match our
use case with the AggregateFunction interface.
On Thu, Jul 1, 2021 at 1:57 PM Ingo Bürk wrote:
> Hi Kai,
>
> CheckpointedFunction is not
Hello community,
I would like to know how long it takes for an event to flow through the
whole Flink pipeline, which consumes from Kafka and sinks to Redis.
My current idea is, for each event:
1. calculate a start_time in source (timestamp field of [metadata](