Re: How to debug checkpoints failing to complete

2020-03-27 Thread Congxian Qiu
Hi, From my experience, you can first check the jobmanager.log to find out whether the checkpoint expired or was declined by some task. If it expired, you can follow the advice seeksst gave above (enabling debug logging may help you here); if it was declined, then go to the taskmanager.log to
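As a concrete illustration of the "enable debug log" advice, a minimal log4j.properties sketch; the logger names are assumptions based on Flink's runtime package layout, not something stated in the thread:

```properties
# Assumed log4j.properties additions: DEBUG logging for the checkpoint
# coordinator (jobmanager side) and for task-side state snapshots, to see
# why a checkpoint expired or which task declined it.
log4j.logger.org.apache.flink.runtime.checkpoint=DEBUG
log4j.logger.org.apache.flink.runtime.state=DEBUG
```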

Re: End to End Latency Tracking in flink

2020-03-27 Thread Congxian Qiu
Hi, As far as I know, the latency-tracking feature is meant for debugging purposes; you can use it while debugging and disable it when running the job in production. From my side, using $current_processing - $event_time is okay, but keep this in mind: the event time may not be the time ingested in
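The $current_processing - $event_time estimate mentioned above can be sketched as plain Java. Names and values here are illustrative only; in a real job the event timestamp comes from the record and the processing time from the operator's clock:

```java
// Sketch of the processing-time-minus-event-time latency estimate discussed
// above. Inside a real Flink operator you would read the event timestamp
// from the record and the processing time from the runtime context.
public class LatencyEstimate {
    /** Returns the estimated end-to-end latency in milliseconds. */
    public static long estimateMillis(long eventTimeMillis, long processingTimeMillis) {
        return processingTimeMillis - eventTimeMillis;
    }

    public static void main(String[] args) {
        long eventTime = 1_585_000_000_000L;      // when the event happened
        long processingTime = eventTime + 1_250L; // when the operator saw it
        System.out.println(estimateMillis(eventTime, processingTime)); // prints 1250
    }
}
```

As the thread notes, this is only an approximation: if the event timestamp is not the ingestion time, clock skew and upstream delays are folded into the number.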

Re: Flink 1.9 with FsStateBackend: a warning appears when modifying the checkpoint

2020-03-27 Thread Congxian Qiu
Hi, From the error it looks like you are using the State Processor API; some functions (here getMetricGroup) are not supported by it, hence this message. If you did not call this function explicitly, it may be a bug. Best, Congxian guanyq wrote on Wed, Mar 25, 2020 at 3:24 PM: > > > > > > >

Re: Flink 1.10: a question about the RocksDBStateBackend in-background state cleanup mechanism

2020-03-27 Thread Congxian Qiu
Hi, My understanding here is: after every certain number of StateEntry records processed, the current timestamp is fetched, and then during RocksDB compaction all StateEntry records are run through the filter. > Calling of TTL filter during compaction slows it down. Best, Congxian LakeShen wrote on Thu, Mar 26, 2020 at 8:55 PM: > Hi community folks, > > I have a question about the RocksDBStateBackend state cleanup mechanism in Flink 1.10. In 1.10, RocksDB

Re: How to control the output frequency of results in Flink SQL

2020-03-27 Thread Benchao Li
Jark, we actually use this feature quite a lot~ The remaining pain point is that the window operator does not support retract input, so with emit enabled there is no way to cascade windows. Jark Wu wrote on Fri, Mar 27, 2020 at 8:01 PM: > Benchao, nice. You even discovered such a well-hidden experimental feature :D > > On Fri, 27 Mar 2020 at 15:24, Benchao Li wrote: > > > Hi, > > > > For the second scenario, you can try fast emit: > > table.exec.emit.early-fire.enabled = true > >

Re: flink savepoint问题

2020-03-27 Thread Congxian Qiu
Hi, For question 1: under backpressure a savepoint may fail to complete and time out; there is no fix for this at the moment. There is an issue[1] working on unaligned checkpoints, which will solve checkpointing under backpressure. For question 3: the checkpoint timed out, where a timeout means that within the configured time (e.g. your 5 minutes) some task did not finish its snapshot. Increasing the timeout can alleviate this to some extent, but you had better find the cause of the timeout and optimize for it. [1] https://issues.apache.org/jira/browse/FLINK-14551 Best, Congxian
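For question 3, a hedged sketch of raising that timeout via flink-conf.yaml; the key name assumes a Flink version where checkpointing is configurable there, and on Flink 1.10 the equivalent is CheckpointConfig#setCheckpointTimeout in the job code:

```yaml
# Raise the checkpoint timeout (the thread's example used 5 minutes).
# Key name assumes a Flink version where checkpointing can be configured
# in flink-conf.yaml; on 1.10 set it on CheckpointConfig instead.
execution.checkpointing.timeout: 15 min
```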

Re: How to control the output frequency of results in Flink SQL

2020-03-27 Thread Tianwang Li
For the first scenario: from the SQL perspective, add a time field truncated to minute granularity to the key, in a format like yyyy-MM-dd HH:mm. Wouldn't that achieve the effect you want? flink小猪 <18579099...@163.com> wrote on Fri, Mar 27, 2020 at 11:29 AM: > I have two requirements: > 1. If I open a one-minute window but want to emit results every 5 minutes (emitting the per-minute aggregates, i.e. 5 rows of aggregated results), what should I do? > 2. If I open a one-hour window but want to emit results every 5 minutes (emitting the partial aggregate before all the window data has arrived), what should I do? --
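The minute-truncated key suggested above might look roughly like this in Flink SQL; the table and column names are made up for illustration:

```sql
-- Group by a minute-granularity string derived from the event time, so
-- each output row corresponds to one minute (names are illustrative).
SELECT
  DATE_FORMAT(event_time, 'yyyy-MM-dd HH:mm') AS minute_key,
  COUNT(*) AS cnt
FROM events
GROUP BY DATE_FORMAT(event_time, 'yyyy-MM-dd HH:mm');
```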

Re: Flink 1.10 SQL job exceeding memory limits and killed by YARN

2020-03-27 Thread faaron zheng
Hi, thanks everyone for the replies. After some analysis and testing, my guess is that this is related to taskmanager.network.blocking-shuffle.type=mmap. The monitoring shows the memory used by Mapped gradually climbing to 20+ GB or even higher, which exceeds the YARN limit and gets the container killed. On the other hand, if I configure taskmanager.network.blocking-shuffle.type=file, the Mapped metric stays at 0 and the job eventually fails with OutOfMemoryError: direct buffer memory, which suggests mmap and file use different kinds of storage.

End to End Latency Tracking in flink

2020-03-27 Thread Lu Niu
Hi, I am looking for end-to-end latency monitoring of a Flink job. Based on my study, I have two options: 1. Flink provides a latency-tracking feature. However, the documentation says it cannot show the actual latency of the business logic, as it bypasses all operators.

Re: Testing RichAsyncFunction with TestHarness

2020-03-27 Thread KristoffSC
I've debugged it a little and found that it fails in the InstantiationUtil.readObjectFromConfig method when we execute byte[] bytes = config.getBytes(key, (byte[])null); This returns null. The key that it is looking for is "edgesInOrder". In the config map, there are only two entries though. For

Testing RichAsyncFunction with TestHarness

2020-03-27 Thread KristoffSC
Hi, I'm trying to test my RichAsyncFunction implementation with OneInputStreamOperatorTestHarness based on [1]. I'm using Flink 1.9.2. My test setup is: this.processFunction = new MyRichAsyncFunction(); this.testHarness = new OneInputStreamOperatorTestHarness<>( new

Re: Support of custom JDBC dialects in flink-jdbc

2020-03-27 Thread Dongwon Kim
Hi Jark, Many thanks for creating the issue on Jira and nice summarization :-) Best, Dongwon On Sat, Mar 28, 2020 at 12:37 AM Jark Wu wrote: > Hi Dongwon, > > I saw many requirements on this and I'm big +1 for this. > I created https://issues.apache.org/jira/browse/FLINK-16833 to track this >

Re: Support of custom JDBC dialects in flink-jdbc

2020-03-27 Thread Jark Wu
Hi Dongwon, I saw many requirements on this and I'm big +1 for this. I created https://issues.apache.org/jira/browse/FLINK-16833 to track this effort. Hope this can be done before 1.11 release. Best, Jark On Fri, 27 Mar 2020 at 22:22, Dongwon Kim wrote: > Hi, I tried flink-jdbc [1] to read

Re: Issue with Could not resolve ResourceManager address akka.tcp://flink

2020-03-27 Thread Yang Wang
Could you also check the jobmanager logs to verify whether Flink's akka is bound to and listening at the hostname "prod-bigd-dn11"? Otherwise, all the packets from the taskmanager will be discarded. Best, Yang Vitaliy Semochkin wrote on Fri, Mar 27, 2020 at 3:35 PM: > Hello Zhu, > > The host can be resolved and

Support of custom JDBC dialects in flink-jdbc

2020-03-27 Thread Dongwon Kim
Hi, I tried flink-jdbc [1] to read data from Druid because Druid implements Calcite Avatica [2], but the connection string, jdbc:avatica:remote:url= http://BROKER:8082/druid/v2/sql/avatica/, is not supported by any of the JDBCDialects [3]. I implemented a custom JDBCDialect [4], a custom
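For readers curious what such a dialect plug-in amounts to, here is a self-contained sketch that only mirrors the shape of flink-jdbc's dialect interface; the mini-interface below is defined locally for illustration and is not the real Flink type:

```java
import java.util.Optional;

// Local stand-in mirroring the shape of flink-jdbc's dialect interface
// (canHandle / defaultDriverName); the real one lives inside flink-jdbc.
interface MiniJdbcDialect {
    boolean canHandle(String url);
    Optional<String> defaultDriverName();
}

// Hypothetical dialect for Druid's Avatica endpoint, as discussed above.
class DruidDialect implements MiniJdbcDialect {
    @Override
    public boolean canHandle(String url) {
        return url.startsWith("jdbc:avatica:remote:");
    }

    @Override
    public Optional<String> defaultDriverName() {
        return Optional.of("org.apache.calcite.avatica.remote.Driver");
    }
}

public class DialectDemo {
    public static void main(String[] args) {
        MiniJdbcDialect d = new DruidDialect();
        System.out.println(d.canHandle(
            "jdbc:avatica:remote:url=http://broker:8082/druid/v2/sql/avatica/"));
        // prints true
    }
}
```

The real plug-in additionally has to map types and quote identifiers; this sketch only shows the URL-dispatch part that decides which dialect handles a connection string.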

Re: "Legacy Source Thread" line in logs

2020-03-27 Thread Arvid Heise
Hi KristoffSC, the short answer is: you probably have differently configured loggers, which log in a different format or at a different level. The longer answer: all source connectors currently use the legacy source thread. That will only change once FLIP-27 [1] is widely adopted. It was originally planned to

Re: How to control the output frequency of results in Flink SQL

2020-03-27 Thread Jark Wu
Benchao, nice. You even discovered such a well-hidden experimental feature :D On Fri, 27 Mar 2020 at 15:24, Benchao Li wrote: > Hi, > > For the second scenario, you can try fast emit: > table.exec.emit.early-fire.enabled = true > table.exec.emit.early-fire.delay = 5min > > PS: > 1. This feature is not written up in the official documentation; it should currently be considered experimental > 2.

Re: TIMESTAMP type conversion exception with Oracle in the Flink SQL JDBC Connector

2020-03-27 Thread sunfulin
Hi, As far as I know, the Oracle driver is not directly supported yet, is it? How are you using Flink SQL to read from and write to Oracle? On 2020-03-27 17:21:21, "111" wrote: >Hi, >When reading and writing an Oracle JDBC table with Flink SQL, I ran into a timestamp conversion exception: >Caused by: java.lang.ClassCastException: oracle.sql.TIMESTAMP cannot be cast >to java.sql.Timestamp at

"Legacy Source Thread" line in logs

2020-03-27 Thread KristoffSC
Hi all, when I run Flink from the IDE I can see the prefix "Legacy Source Thread" in the logs. Running the same job as a JobCluster on Docker, this prefix is not present. What does this prefix mean? Btw, I'm using [1] as the ActiveMQ connector. Thanks. [1]

[Third-party Tool] Flink memory calculator

2020-03-27 Thread Yangze Guo
Hi, there. In release-1.10, the memory setup of task managers has changed a lot. I would like to provide here a third-party tool to simulate and get the calculation result of Flink's memory configuration. Although there is already a detailed setup guide[1] and migration guide[2] officially, the

TIMESTAMP type conversion exception with Oracle in the Flink SQL JDBC Connector

2020-03-27 Thread 111
Hi, when reading and writing an Oracle JDBC table with Flink SQL, I ran into a timestamp conversion exception: Caused by: java.lang.ClassCastException: oracle.sql.TIMESTAMP cannot be cast to java.sql.Timestamp at org.apache.flink.table.dataformat.DataFormatConverters$TimestampConverter.toInternalImpl(DataFormatConverters.java:860) at

Re: Dynamic Flink SQL

2020-03-27 Thread Krzysztof Zarzycki
I want to do a slightly different, hacky PoC: * I will write a sink that caches the results in "JVM global" memory. Then I will write a source that reads this cache. * I will launch one job that reads from the Kafka source, shuffles the data to the desired partitioning and then sinks to that cache. *
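The "JVM global" cache can be sketched with a static map. This is only an illustration of the PoC described, not working Flink code; a real implementation would have to deal with classloaders and task lifecycle:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of a JVM-global cache shared between a "sink" that writes
// and a "source" that reads, as in the PoC above. In Flink this only works
// when both tasks run in the same taskmanager JVM, and the class must be
// loaded by a shared (parent) classloader to survive job restarts.
public class JvmGlobalCache {
    private static final Map<String, String> CACHE = new ConcurrentHashMap<>();

    public static void put(String key, String value) { CACHE.put(key, value); }

    public static String get(String key) { return CACHE.get(key); }

    public static void main(String[] args) {
        put("user-42", "enriched-payload");  // what the caching sink would do
        System.out.println(get("user-42"));  // what the reading source would do
    }
}
```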

Re: Applying to join the Chinese community

2020-03-27 Thread Yangze Guo
Hello, if you would like to subscribe to user-zh, you can send an email to user-zh-subscr...@flink.apache.org Best, Yangze Guo On Fri, Mar 27, 2020 at 4:45 PM 大数据开发面试_夏永权 wrote: > > Applying to join the Chinese community; I am not sure whether it went through. I would very much like to join the community, many thanks.

Applying to join the Chinese community

2020-03-27 Thread 大数据开发面试_夏永权
Applying to join the Chinese community; I am not sure whether it went through. I would very much like to join the community, many thanks.

Flink savepoint questions

2020-03-27 Thread 大数据开发面试_夏永权
Hi, I ran into the following problems while using Flink and could not solve them myself, so I am asking for your guidance, thanks! 1. flink cancel -s $SAVEPOINT_DIR $job_id -yid $application_id cannot stop the job when there is backpressure. The program finished with the following exception: org.apache.flink.util.FlinkException: Could not cancel job 1f768e4ca9ad5792a4844a5d12163b73. at

Re: state schema evolution for case classes

2020-03-27 Thread Tzu-Li (Gordon) Tai
Hi Apoorv, Sorry for the late reply, have been quite busy with backlog items the past days. On Fri, Mar 20, 2020 at 4:37 PM Apoorv Upadhyay < apoorv.upadh...@razorpay.com> wrote: > Thanks Gordon for the suggestion, > > I am going by this repo : >

Re: Issue with Could not resolve ResourceManager address akka.tcp://flink

2020-03-27 Thread Vitaliy Semochkin
Hello Zhu, The host can be resolved and there are no firewalls in the cluster, so all ports are open. Regards, Vitaliy On Fri, Mar 27, 2020 at 8:32 AM Zhu Zhu wrote: > Hi Vitaliy, > > >> *Cannot serve slot request, no ResourceManager connected* > This is not a problem, just that the JM

Re: How to control the output frequency of results in Flink SQL

2020-03-27 Thread Benchao Li
Hi, for the second scenario, you can try fast emit: table.exec.emit.early-fire.enabled = true table.exec.emit.early-fire.delay = 5min PS: 1. This feature is not written up in the official documentation; it should currently be considered experimental 2. Once emit is added to a window, its output changes from append results to retract results. Jingsong Li wrote on Fri, Mar 27, 2020 at 2:51 PM: > Hi, > > For #1: > Create two cascaded windows: > - a 1-minute window > -

Re: Flink Keyed Watermarks

2020-03-27 Thread Jiacheng Jiang
---- From: "Utopia" https://bigdata.cs.ut.ee/keyed-watermarks-partition-aware-watermark-generation-apache-flink Best regards Utopia

Re: Ask for reason for choice of S3 plugins

2020-03-27 Thread David Anderson
If you are using both the Hadoop S3 and Presto S3 filesystems, you should use s3p:// and s3a:// to distinguish between the two. Presto is recommended for checkpointing because the Hadoop implementation has very high latency when creating files, and because it hits request rate limits very
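A hedged flink-conf.yaml sketch of that split; the bucket names are placeholders:

```yaml
# Presto S3 filesystem (s3p://) for checkpointing: file creation has much
# lower latency than with the Hadoop implementation.
state.checkpoints.dir: s3p://my-bucket/checkpoints
state.savepoints.dir: s3p://my-bucket/savepoints
# The Hadoop S3 filesystem (s3a://) can then be reserved for sinks such as
# the StreamingFileSink, which requires the Hadoop implementation.
```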

Re: How to control the output frequency of results in Flink SQL

2020-03-27 Thread Jingsong Li
Hi, For #1: create two cascaded windows: - a 1-minute window - a 5-minute window whose computation just holds the data and emits the detailed per-minute results Best, Jingsong Lee
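Jingsong's cascaded-window idea might look roughly like the following; the schema is illustrative and the exact query depends on your tables:

```sql
-- Level 1: 1-minute tumbling window producing per-minute aggregates
-- (names are illustrative).
CREATE VIEW minute_agg AS
SELECT
  TUMBLE_START(ts, INTERVAL '1' MINUTE) AS minute_start,
  TUMBLE_ROWTIME(ts, INTERVAL '1' MINUTE) AS rt,
  COUNT(*) AS cnt
FROM events
GROUP BY TUMBLE(ts, INTERVAL '1' MINUTE);

-- Level 2: a 5-minute window that only collects the per-minute rows and
-- re-emits them in one batch every 5 minutes.
SELECT minute_start, cnt
FROM minute_agg
GROUP BY TUMBLE(rt, INTERVAL '5' MINUTE), minute_start, cnt;
```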

Ask for reason for choice of S3 plugins

2020-03-27 Thread B.Zhou
Hi, in the document https://ci.apache.org/projects/flink/flink-docs-stable/ops/filesystems/s3.html#hadooppresto-s3-file-systems-plugins, it is mentioned that * Presto is the recommended file system for checkpointing to S3. Is there a reason for that? Is there some bottleneck for the S3 Hadoop