Re: [Third-party Tool] Flink memory calculator

2020-03-29 Thread Xintong Song
Hi Jeff, I think the purpose of this tool it to allow users play with the memory configurations without needing to actually deploy the Flink cluster or even have a job. For sanity checks, we currently have them in the start-up scripts (for standalone clusters) and resource managers (on

Re: [Third-party Tool] Flink memory calculator

2020-03-29 Thread Xintong Song
Hi Jeff, I think the purpose of this tool it to allow users play with the memory configurations without needing to actually deploy the Flink cluster or even have a job. For sanity checks, we currently have them in the start-up scripts (for standalone clusters) and resource managers (on

Flink 1.10 Local Aggregate问题

2020-03-29 Thread chanamper
Dear All, 请教一下,Flink 1.10版本中Java Api如何实现Local Aggregate的功能呢?数据存在较大的倾斜,想在keyby前进行一次local aggregate, 看了下在https://cwiki.apache.org/confluence/display/FLINK/FLIP-44%3A+Support+Local+Aggregation+in+Flink有计划实现localKeyBy()方法, flink 1.10 java api中还没发现对应的localKeyBy方法。请问下,目前Java

Re: [Third-party Tool] Flink memory calculator

2020-03-29 Thread Yangze Guo
Thanks for your feedbacks, @Xintong and @Jeff. @Jeff I think it would always be good to leverage exist logic in Flink, such as JobListener. However, this calculator does not only target to check the conflict, it also targets to provide the calculating result to user before the job is actually

Re: [Third-party Tool] Flink memory calculator

2020-03-29 Thread Yangze Guo
Thanks for your feedbacks, @Xintong and @Jeff. @Jeff I think it would always be good to leverage exist logic in Flink, such as JobListener. However, this calculator does not only target to check the conflict, it also targets to provide the calculating result to user before the job is actually

Re: End to End Latency Tracking in flink

2020-03-29 Thread Lu Niu
Thanks for reply, @Zhijiang, @Congxian! @Congxian $current_processing - $event_time works for event time. How about processing time? Is there a good way to measure the latency? Best Lu On Sun, Mar 29, 2020 at 6:21 AM Zhijiang wrote: > Hi Lu, > > Besides Congxian's replies, you can also get

Re: State & Generics

2020-03-29 Thread Mike Mintz
I was able to get generic types to work when I used GenericTypeInfo and made sure to wrap the generic in some concrete type. In my case I used scala.Some as the wrapper. It looks something like this (in Scala): import org.apache.flink.api.java.typeutils.GenericTypeInfo val descriptor = new

Re:Re: flink savepoint问题

2020-03-29 Thread xyq
Hi,您好: 我这边有个小流 left join大流的需求,小流的数据夜间基本没有 可能会4-5个小时没数据,目前的情况是一到晚上container老是被kill掉,报的是内存溢出。我想问下,我想把托管内存这设置成false,会有什么弊端吗?或者该问题怎么解决?困扰了好久了,请您指点一谢谢。 state.backend.rocksdb.memory.managed : false 在 2020-03-28 11:04:09,"Congxian Qiu" 写道: >Hi > >对于问题 1 在反压的情况下,可能导致 Savepoint

Re:Re: flink savepoint问题

2020-03-29 Thread xyq
非常感谢 在 2020-03-28 11:04:09,"Congxian Qiu" 写道: >Hi > >对于问题 1 在反压的情况下,可能导致 Savepoint 做不成功从而超时,这个暂时没法解决,现在有一个 issue[1] 在做 Unalign >Checkpoint 可以解决反压情况下的 checkpoint >对于问题 3,checkpoint 超时了,超时的定义:在设置的时间内(比如你这里 5 分钟),有 task 没有完成 >snapshot。调长超时时间能够一定的缓解这个问题,不过你最好找到超时的原因,然后针对性的优化。 >[1]

Re: [Third-party Tool] Flink memory calculator

2020-03-29 Thread Jeff Zhang
Hi Yangze, Does this tool just parse the configuration in flink-conf.yaml ? Maybe it could be done in JobListener [1] (we should enhance it via adding hook before job submission), so that it could all the cases (e.g. parameters coming from command line) [1]

Re: [Third-party Tool] Flink memory calculator

2020-03-29 Thread Jeff Zhang
Hi Yangze, Does this tool just parse the configuration in flink-conf.yaml ? Maybe it could be done in JobListener [1] (we should enhance it via adding hook before job submission), so that it could all the cases (e.g. parameters coming from command line) [1]

Re: [Third-party Tool] Flink memory calculator

2020-03-29 Thread Xintong Song
Thanks Yangze, I've tried the tool and I think its very helpful. Thank you~ Xintong Song On Mon, Mar 30, 2020 at 9:40 AM Yangze Guo wrote: > Hi, Yun, > > I'm sorry that it currently could not handle it. But I think it is a > really good idea and that feature would be added to the next

Re: [Third-party Tool] Flink memory calculator

2020-03-29 Thread Xintong Song
Thanks Yangze, I've tried the tool and I think its very helpful. Thank you~ Xintong Song On Mon, Mar 30, 2020 at 9:40 AM Yangze Guo wrote: > Hi, Yun, > > I'm sorry that it currently could not handle it. But I think it is a > really good idea and that feature would be added to the next

Re: Flink1.10执行sql超出内存限制被yarn杀掉

2020-03-29 Thread Xintong Song
> > file模式用的是direct中哪一部分memory > 这部分内存开销按理说应该是归在 Network Memory,但是目前并没有,只能通过 Framework / Task Off-Heap 来配置。你可以关注一下 FLINK-15981 [1] 。 Thank you~ Xintong Song [1] https://issues.apache.org/jira/browse/FLINK-15981 On Sat, Mar 28, 2020 at 9:25 AM faaron zheng wrote: > >

Re: Flink 1.10: 关于 RocksDBStateBackend In background 状态清理机制的问题

2020-03-29 Thread LakeShen
嗯嗯,非常感谢你的回答,Congxian Qiu 。 Congxian Qiu 于2020年3月28日周六 上午11:39写道: > Hi > > 这个地方我理解是,每次处理一定数量的 StateEntry 之后,会获取当前的 timestamp 然后在 RocksDB 的 compaction > 时对所有的 StateEntry 进行 filter。 > > Calling of TTL filter during compaction slows it down. > > Best, > Congxian > > > LakeShen 于2020年3月26日周四

Re: Re: flinksql如何控制结果输出的频率

2020-03-29 Thread LakeShen
哈哈,学习了,Benchao, Benchao Li 于2020年3月28日周六 下午11:26写道: > Hi, > > 这个输出是retract的是by design的,你可以自己改造下sink,来输出你想要的结果。 > fast > emit是按照处理时间来提前输出的。比如某个key下面来了第一条数据之后,开始设置一个固定周期的定时,如果下个周期聚合结果有发生变化,则输出。 > > flink小猪 <18579099...@163.com> 于2020年3月28日周六 下午8:25写道: > > > > > > > > >

Re: [Third-party Tool] Flink memory calculator

2020-03-29 Thread Yangze Guo
Hi, Yun, I'm sorry that it currently could not handle it. But I think it is a really good idea and that feature would be added to the next version. Best, Yangze Guo On Mon, Mar 30, 2020 at 12:21 AM Yun Tang wrote: > > Very interesting and convenient tool, just a quick question: could this tool

Re: [Third-party Tool] Flink memory calculator

2020-03-29 Thread Yangze Guo
Hi, Yun, I'm sorry that it currently could not handle it. But I think it is a really good idea and that feature would be added to the next version. Best, Yangze Guo On Mon, Mar 30, 2020 at 12:21 AM Yun Tang wrote: > > Very interesting and convenient tool, just a quick question: could this tool

Re: Log file environment variable 'log.file' is not set.

2020-03-29 Thread Vitaliy Semochkin
Hello Yun, I see this error reported by: *org.apache.flink.runtime.webmonitor.WebMonitorUtils* - *JobManager log files are unavailable in the web dashboard. Log file location not found in environment variable 'log.file' or configuration key 'Key: 'web.log.path' , default: null (fallback keys:

Re: Log file environment variable 'log.file' is not set.

2020-03-29 Thread Yun Tang
Hi Vitaliy Property of 'log.file' would be configured if you have uploaded 'logback.xml' or 'log4j.properties' [1]. The file would contain logs of job manager or task manager which is decided by the component itself. And as you can see, this is only a local file path, I am afraid this cannot

Re: [Third-party Tool] Flink memory calculator

2020-03-29 Thread Yun Tang
Very interesting and convenient tool, just a quick question: could this tool also handle deployment cluster commands like "-tm" mixed with configuration in `flink-conf.yaml` ? Best Yun Tang From: Yangze Guo Sent: Friday, March 27, 2020 18:00 To: user ;

Re: [Third-party Tool] Flink memory calculator

2020-03-29 Thread Yun Tang
Very interesting and convenient tool, just a quick question: could this tool also handle deployment cluster commands like "-tm" mixed with configuration in `flink-conf.yaml` ? Best Yun Tang From: Yangze Guo Sent: Friday, March 27, 2020 18:00 To: user ;

Re: Testing RichAsyncFunction with TestHarness

2020-03-29 Thread KristoffSC
HI :) I have finally figured it out :) On top of changes from last email, in my test method, I had to wrap "testHarness.processElement" in synchronized block, like this: @Test public void foo() throws Exception { synchronized (this.testHarness.getCheckpointLock()) {

Flink 1.10 Java API local Aggregate问题

2020-03-29 Thread chanamper
Dear All,

Re: End to End Latency Tracking in flink

2020-03-29 Thread Zhijiang
Hi Lu, Besides Congxian's replies, you can also get some further explanations from "https://flink.apache.org/2019/07/23/flink-network-stack-2.html#latency-tracking;. Best, Zhijiang -- From:Congxian Qiu Send Time:2020 Mar. 28

[ANNOUNCE] Weekly Community Update 2020/13

2020-03-29 Thread Konstantin Knauf
Dear community, happy to share this week's Apache Flink community digest with a couple of threads around the upcoming release of Apache Flink Stateful Functions 2.0, an update on Flink 1.10.1, two FLIPs to improve Apache Flink's distributed runtime and the schedule for Flink Forward Virtual

Re: Testing RichAsyncFunction with TestHarness

2020-03-29 Thread KristoffSC
Hi, another update on this one. I managed to make the workaround a little bit cleaner. The test setup I have now is like this: ByteArrayOutputStream streamEdgesBytes = new ByteArrayOutputStream(); ObjectOutputStream oosStreamEdges = new ObjectOutputStream(streamEdgesBytes);

Re: Kafka - FLink - MongoDB using Scala

2020-03-29 Thread Konstantin Knauf
cc user@f.a.o Hi Siva, I am not aware of a Flink MongoDB Connector in either Apache Flink, Apache Bahir or flink-packages.org. I assume that you are doing idempotent upserts, and hence do not require a transactional sink to achieve end-to-end exactly-once results. To build one yourself, you

Re: Windows on SinkFunctions

2020-03-29 Thread Sidney Feiner
Thanks! What am I supposed to put in the apply/process function for the sink to be invoked on a List of items? Sidney Feiner / Data Platform Developer M: +972.528197720 / Skype: sidney.feiner.startapp [emailsignature] From: tison Sent: Sunday, March 22, 2020