Re: Flink task node shut it self off.

2019-12-20 Thread jingjing bai
Hi John, in our experience we set the checkpoint interval to 1-10 minutes and the timeout usually to 5 * interval. Mostly we set the interval to 2 or 5 minutes and the timeout to 10 or 20 minutes. It depends on your data volume per second and which window you use. John Smith wrote on Sat, Dec 21, 2019 at 5:26 AM: > Hi, using Flink 1.8.0 > > 1st
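The rule of thumb in this reply (timeout of roughly 5 * interval) can be written as a small helper. This is only a sketch of the arithmetic: the function name and the default factor of 5 are illustrative assumptions, and the result would still have to be passed to Flink separately, for example through `StreamExecutionEnvironment.enableCheckpointing(...)` and `CheckpointConfig.setCheckpointTimeout(...)`.

```python
from datetime import timedelta


def suggested_checkpoint_timeout(interval: timedelta, factor: int = 5) -> timedelta:
    """Return factor * interval, the rule of thumb quoted in the reply.

    The name and the default factor are illustrative, not a Flink API.
    """
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return interval * factor


def to_millis(duration: timedelta) -> int:
    """Convert a duration to whole milliseconds, the unit Flink's Java setters take."""
    return int(duration.total_seconds() * 1000)
```

The right values still depend on throughput and window size, as the reply says, so the helper gives a starting point rather than a recommendation.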

Taskmanagers in Docker Fail to Resolve Own Hostnames and Won't Accept Tasks

2019-12-20 Thread Martin, Nick J [US] (IS)
I'm running Flink 1.7.2 in a Docker swarm. Intermittently, new task managers will fail to resolve their own host names when starting up. In the log I see "no hostname could be resolved" messages coming from TaskManagerLocation. The webUI on the jobmanager shows the taskmanagers as are
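One way to narrow down an intermittent failure like this is to probe, from inside the container, whether the host can resolve its own name at startup. The sketch below is a diagnostic aid only, not a Flink feature or a confirmed fix for this thread. The function name, retry count, and delay are assumptions chosen for illustration.

```python
import socket
import time
from typing import Callable, Optional


def resolve_own_hostname(
    retries: int = 5,
    delay_seconds: float = 2.0,
    hostname: Optional[str] = None,
    resolver: Callable[[str], str] = socket.gethostbyname,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Try to resolve this host's name, retrying on failure.

    Returns the resolved address, or None if every attempt failed.
    """
    name = hostname or socket.gethostname()
    for attempt in range(retries):
        try:
            return resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            if attempt < retries - 1:
                sleep(delay_seconds)
    return None
```

If the probe succeeds only after several attempts, that would suggest the container's DNS entry is registered late. Whether that is the cause here is not established by the message.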

Flink task node shut it self off.

2019-12-20 Thread John Smith
Hi, using Flink 1.8.0 1st off I must say Flink resiliency is very impressive, we lost a node and never lost one message by using checkpoints and Kafka. Thanks! The cluster is a self hosted cluster and we use our own zookeeper cluster. We have... 3 zookeepers: 4 cpu, 8GB (each) 3 job nodes: 4

Re: Deprecated SplitStream class - what should be use instead.

2019-12-20 Thread KristoffSC
Hi Kostas, Thank you for the answer and clarification. If Side-outputs are treated in the same way and there is no significant performance penalty then it seems that they are ok for my use case. I can accept the name mismatch ;) Regards, Krzysztof -- Sent from:

Using Flink for dimension-table joins

2019-12-20 Thread lucas.wu
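Hi everyone: I have recently been looking into using Flink to build a real-time data warehouse, but there is one question I have not figured out: when joining a detail table with a dimension table, what approach should be taken? What I have in mind so far is to consume the detail table as a stream and keep the dimension table in a cache. But this approach has a drawback: after the dimension table is updated, data that was already joined cannot be updated again. Does anyone have other approaches? PS: I have seen that Flink supports joins, which require both tables to enter Flink as streams and then keep the historical data in state. Would this be a problem for large tables?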
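The cached-dimension approach described in this message can be sketched outside Flink as a lookup with a time-to-live. All names here (`TtlDimensionCache`, `enrich`, the `dim` field) are hypothetical, and the TTL is an added assumption rather than something the poster proposed. It limits how stale future lookups can be, but it does not fix the stated drawback: rows already emitted are never revised.

```python
import time
from typing import Callable, Dict, Iterable, Iterator, Optional, Tuple


class TtlDimensionCache:
    """Caches dimension rows by key and reloads them after ttl_seconds."""

    def __init__(
        self,
        loader: Callable[[str], Optional[dict]],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader  # e.g. a database lookup; supplied by the caller
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Optional[dict]]] = {}

    def get(self, key: str) -> Optional[dict]:
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # still fresh, skip the loader
        row = self._loader(key)
        self._entries[key] = (now, row)
        return row


def enrich(
    events: Iterable[dict], cache: TtlDimensionCache, key_field: str
) -> Iterator[dict]:
    """Attach the current dimension row to each detail event as it arrives."""
    for event in events:
        yield {**event, "dim": cache.get(event[key_field])}
```

Each event is joined against whatever the cache holds at that moment, which is why a later dimension update cannot reach records that were joined earlier.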

Re: [DISCUSS] Drop vendor specific repositories from pom.xml

2019-12-20 Thread Robert Metzger
Okay, I understand. I'm okay with removing the profile. On Thu, Dec 19, 2019 at 11:34 AM Till Rohrmann wrote: > The profiles make bumping ZooKeeper's version a bit more cumbersome. I > would be interested for this reason to get rid of them, too. > > Cheers, > Till > > On Wed, Dec 18, 2019 at

Re: Deprecated SplitStream class - what should be use instead.

2019-12-20 Thread Kostas Kloudas
Hi Krzysztof, If I get it correctly, your main reason behind not using side-outputs is that it seems that "side-output", by the name, seems to be a "second class citizen" compared to the main output. I see your point but in terms of functionality, there is no difference between the different

Re: What is the cause of this error in YARN per-job mode? It occurs randomly

2019-12-20 Thread Yun Tang
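Hi, this exception occurs because a random port could not be bound. On the JM machine where the problem occurs, check netstat to see whether a large number of connections are occupying many ports. This kind of problem is usually caused by a large number of outbound connections that were never closed; find out what type of process is occupying so many ports. Best regards, Yun Tang From: rockey...@163.com Sent: Friday, December 20, 2019 15:04 To: user-zh Subject: What is the cause of this error in YARN per-job mode? It occurs randomly Hi everyone, flink per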
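The netstat check suggested in this reply can be summarized with a short script that counts TCP connections per state. This is a sketch that assumes the Linux `netstat -ant` column layout (protocol, receive queue, send queue, local address, foreign address, state). The function names are illustrative, and other platforms may format the output differently.

```python
import subprocess
from collections import Counter
from typing import Iterable, List


def count_tcp_states(netstat_lines: Iterable[str]) -> Counter:
    """Count TCP connections by state, reading the sixth column of each tcp line."""
    states: Counter = Counter()
    for line in netstat_lines:
        fields = line.split()
        # Header lines and udp rows are skipped by the protocol and length checks.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states


def read_netstat() -> List[str]:
    """Run `netstat -ant` and return its output lines (requires netstat on PATH)."""
    result = subprocess.run(
        ["netstat", "-ant"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()
```

Calling `count_tcp_states(read_netstat())` on the affected JM machine gives a quick view of whether one state dominates. Identifying the owning process would need additional flags such as `-p`, which usually require elevated privileges.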