use of Scala versions >= 2.13 in Flink 1.15

2021-12-06 Thread guenterh.lists
Dear list, there have been some discussions and activities in the last months about a Scala free runtime which should make it possible to use newer Scala version (>= 2.13 / 3.x) on the application side. Stephan Ewen announced the implementation is on the way [1] and Martijn Vissr mentioned

Re: [DISCUSS] Change some default config values of blocking shuffle

2021-12-06 Thread Yingjie Cao
Hi Till, Thanks for your feedback. >>> How will our tests be affected by these changes? Will Flink require more resources and, thus, will it risk destabilizing our testing infrastructure? There are some tests that need to be adjusted, for example, BlockingShuffleITCase. For other tests,

Re: Re: Re: how to run streaming process after batch process is completed?

2021-12-06 Thread Yun Gao
Hi Joern, Very thanks for sharing the cases! Could you also share a bit more on the detailed scenarios~? Best, Yun --Original Mail -- Sender:Joern Kottmann Send Date:Fri Dec 3 16:43:38 2021 Recipients:Yun Gao CC:vtygoss , Alexander Preuß ,

Re: use of Scala versions >= 2.13 in Flink 1.15

2021-12-06 Thread Chesnay Schepler
With regards to the Java APIs, you will definitely be able to use the Java DataSet/DataStream APIs from Scala without any restrictions imposed by Flink. This is already working with the current SNAPSHOT version. As we speak we are also working to achieve the same for the Table API; we expect

Re: enable.auto.commit=true and checkpointing turned on

2021-12-06 Thread Vishal Santoshi
perfect. Thanks. That is what I imagined. On Mon, Dec 6, 2021 at 2:04 AM Hang Ruan wrote: > Hi, > > 1. Yes, the kafka source will use the Kafka committed offset for the group > id to start the job. > > 2. No, the auto.offset.reset >

Re: Order of events in Broadcast State

2021-12-06 Thread David Anderson
Event ordering in Flink is only maintained between pairs of events that take exactly the same path through the execution graph. So if you have multiple instances of A (let's call them A1 and A2), each broadcasting a partition of the total rule space, then one instance of B (B1) might receive rule1

[DISCUSS] Strong read-after-write consistency of Flink FileSystems

2021-12-06 Thread David Morávek
Hi Everyone, as outlined in FLIP-194 discussion [1], for the future directions of Flink HA services, I'd like to verify my thoughts around guarantees of the distributed filesystems used with Flink. Currently some of the services (*JobGraphStore*, *CompletedCheckpointStore*) are implemented using

Re: [DISCUSS] Drop Zookeeper 3.4

2021-12-06 Thread Chesnay Schepler
ping @users; any input on how this would affect you is highly appreciated. On 25/11/2021 22:39, Chesnay Schepler wrote: I included the user ML in the thread. @users Are you still using Zookeeper 3.4? If so, were you planning to upgrade Zookeeper in the near future? I'm not sure about ZK

Converting DataStream of Avro SpecificRecord to Table

2021-12-06 Thread Dongwon Kim
Hi community, I'm currently converting a DataStream of Avro SpecificRecord type into Table using the following method: public static Table toTable(StreamTableEnvironment tEnv, DataStream dataStream,

Re: [DISCUSS] Drop Zookeeper 3.4

2021-12-06 Thread Arvid Heise
Could someone please help me understand the implications of the upgrade? As far as I understood this upgrade would only affect users that have a zookeeper shared across multiple services, some of which require ZK 3.4-? A workaround for those users would be to run two ZKs with different versions,

Re: Table DataStream Conversion Lost Watermark

2021-12-06 Thread Timo Walther
Hi Yunfeng, it seems this is a deeper issue with the fromValues implementation. Under the hood, it still uses the deprecated InputFormat stack. And as far as I can see, there we don't emit a final MAX_WATERMARK. I will definitely forward this. But toDataStream forwards watermarks correctly.

Re: GenericWriteAheadSink, declined checkpoint for a finished source

2021-12-06 Thread James Sandys-Lumsdaine
Hello again, We recently upgraded from Flink 1.12.3 to 1.14.0 and we were hoping it would solve our issue with checkpointing with finished data sources. We need the checkpointing to work to trigger Flink's GenericWriteAheadSink class. Firstly, the constant mentioned on FLIP-147 that enables

Re: GenericWriteAheadSink, declined checkpoint for a finished source

2021-12-06 Thread Dawid Wysakowicz
Hi, Sorry to hear it's hard to find the option. It is part of the 1.14 release[1]. It is also documented how to enable it[2]. Happy to hear how we can improve the situation here. As for the exception. Are you seeing this exception occur repeatedly for the same task? I can imagine a situation

Re: [DISCUSS] Drop Zookeeper 3.4

2021-12-06 Thread Chesnay Schepler
Current users of ZK 3.4 and below would need to upgrade their Zookeeper installation that is used by Flink to 3.5+. Whether K8s users are affected depends on whether they use ZK or not. If they do, see above, otherwise they are not affected at all. On 06/12/2021 18:49, Arvid Heise wrote:

Re: [DISCUSS] Deprecate Java 8 support

2021-12-06 Thread Nicolás Ferrario
Oh my bad, it must be Statefun then. I remember I needed to play around with that for _some_ build. On Wed, Dec 1, 2021 at 7:48 PM Chesnay Schepler wrote: > Flink can be built with Java 11 since 1.10. If I recall correctly we > solved the tools.jar issue, which Hadoop depends on, by excluding

Issue with incremental checkpointing size

2021-12-06 Thread Vidya Sagar Mula
Hi, In my project, we are trying to configure the "Incremental checkpointing" with RocksDB in the backend. We are using Flink 1.11 version and RockDB with AWS : S3 backend Issue: -- In my pipeline, my window size is 5 mins and the incremental checkpointing is happening for every 2 mins. I

Re: [DISCUSS] Drop Zookeeper 3.4

2021-12-06 Thread Yang Wang
FYI: We(Alibaba) are widely using ZooKeeper 3.5.5 for all the YARN and some K8s Flink high-available applications. Best, Yang Chesnay Schepler 于2021年12月7日周二 上午2:22写道: > Current users of ZK 3.4 and below would need to upgrade their Zookeeper > installation that is used by Flink to 3.5+. > >

Re: Issue with incremental checkpointing size

2021-12-06 Thread Caizhi Weng
Hi! the checkpointing size is not going beyond 300 MB Is 300MB the total size of checkpoint or the incremental size of checkpoint? If it is the latter one, Flink will only store necessary information (for example the keys and the fields that are selected) in checkpoint and it is compressed, so

Re: [DISCUSS] Drop Zookeeper 3.4

2021-12-06 Thread Dongwon Kim
When should I prepare for upgrading ZK to 3.5 or newer? We're operating a Hadoop cluster w/ ZK 3.4.6 for running only Flink jobs. Just hope that the rolling update is not that painful - any advice on this? Best, Dongwon On Tue, Dec 7, 2021 at 3:22 AM Chesnay Schepler wrote: > Current users of

Re: Table DataStream Conversion Lost Watermark

2021-12-06 Thread Yunfeng Zhou
Hi Timo, Thanks for this information. Since it is confirmed that toDataStream is functioning correctly and that I can avoid this problem by not using fromValues in my implementation, I think I have got enough information for my current work and don't need to rediscuss fromDatastream's behavior.

Re: Order of events in Broadcast State

2021-12-06 Thread Alexey Trenikhun
Thank you David From: David Anderson Sent: Monday, December 6, 2021 1:36:20 AM To: Alexey Trenikhun Cc: Flink User Mail List Subject: Re: Order of events in Broadcast State Event ordering in Flink is only maintained between pairs of events that take exactly

Re: Issue with Flink jobs after upgrading to Flink 1.13.1/Ververica 2.5 - java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration

2021-12-06 Thread Natu Lauchande
Hey Timo and Flink community, I wonder if there is a fix for this issue. The last time I rollbacked to version 12 of Flink and downgraded Ververica. I am really keen to leverage the new features on the latest versions of Ververica 2.5+ , i have tried a myriad of tricks suggested ( example :

Re: Unable to create new native thread error

2021-12-06 Thread David Morávek
Hi Ilan, I think so, using CLI instead of REST API should solve this, as the user code execution would be pulled out to a separate JVM. If you're going to try that, it would be great to hear back whether it has solved your issue. As for 1.13.4, there is currently no on-going effort / concrete

Re: GCS/Object Storage Rate Limiting

2021-12-06 Thread David Morávek
Hi Kevin, Flink comes with two schedulers for streaming: - Default - Adaptive (opt-in) Adaptive is still in experimental phase and doesn't support local recover. You're most likely using the first one, so you should be OK. Can you elaborate on this a bit? We aren't changing the parallelism when

Re: [DISCUSS] Change some default config values of blocking shuffle

2021-12-06 Thread Yingjie Cao
Hi Till, Thanks for your feedback. >>> How will our tests be affected by these changes? Will Flink require more resources and, thus, will it risk destabilizing our testing infrastructure? There are some tests that need to be adjusted, for example, BlockingShuffleITCase. For other tests,

Re: 关于flink on yarn 跨多hdfs集群访问的问题

2021-12-06 Thread Yang Wang
我觉得你可以尝试一下ship本地的hadoop conf,然后设置HADOOP_CONF_DIR环境变量的方式 -yt /path/of/my-hadoop-conf -yD containerized.master.env.HADOOP_CONF_DIR='$PWD/my-hadoop-conf' -yD containerized.taskmanager.env.HADOOP_CONF_DIR='$PWD/my-hadoop-conf' Best, Yang chenqizhu 于2021年11月30日周二 上午10:00写道: > all,您好: > >

flink cdc支持mysql整库同步进hudi湖吗?

2021-12-06 Thread casel.chen
flink cdc支持mysql整库同步进hudi湖吗?如果支持的话,希望能给一个例子,还要求能够支持schema变更。谢谢!

Re:Re: 关于flink on yarn 跨多hdfs集群访问的问题

2021-12-06 Thread casel.chen
如果是两套oss或s3 bucket(每个bucket对应一组accessKey/secret)要怎么配置呢?例如写数据到bucketA,但checkpoint在bucketB 在 2021-12-06 18:59:46,"Yang Wang" 写道: >我觉得你可以尝试一下ship本地的hadoop conf,然后设置HADOOP_CONF_DIR环境变量的方式 > >-yt /path/of/my-hadoop-conf >-yD

关于异步 AsyncTableFunction CompletableFuture 的疑问?

2021-12-06 Thread Michael Ran
deal all: 目前在看table api 中,自定义的异步 join 方法 AsyncTableFunction#eval 方法时,发现接口提供的是: public void eval(CompletableFuture> future,Object... keys) {...} 目前遇到两个问题: 1. 直接用 futrue.complate(rowdata) 传递数据,只是实现了发送是异步,异步join 得自己实现,这个理解对吗? 2. 像join hbase

回复: flink cdc支持mysql整库同步进hudi湖吗?

2021-12-06 Thread chengyanan1...@foxmail.com
支持,例子参考hudi官网 chengyanan1...@foxmail.com 发件人: casel.chen 发送时间: 2021-12-06 23:55 收件人: user-zh@flink.apache.org 主题: flink cdc支持mysql整库同步进hudi湖吗? flink cdc支持mysql整库同步进hudi湖吗?如果支持的话,希望能给一个例子,还要求能够支持schema变更。谢谢!

???????? AsyncTableFunction CompletableFuture ??????

2021-12-06 Thread ?????w??
deal all?? table api join AsyncTableFunction#eval ?? public void eval(CompletableFuture

Re: 关于异步 AsyncTableFunction CompletableFuture 的疑问

2021-12-06 Thread Caizhi Weng
Hi! 1. 直接用 futrue.complate(rowdata) 传递数据,只是实现了发送是异步,异步join 得自己实现,这个理解对吗? 正确。例如 HBaseRowDataAsyncLookupFunction 里就调用了 hbase 提供的 table#get 异步方法实现异步查询。 2. 像join hbase 里面通过线程池实现了join异步,是无法保证顺序,并没有看到任何保证顺序的操作,还是有其他逻辑保证顺序吗? Async operator 就是为了提高无需保序的操作(例如很多 etl 查维表就不关心顺序)的效率才引入的,如果对顺序有强需求就不能用

Re: 关于异步 AsyncTableFunction CompletableFuture 的疑问?

2021-12-06 Thread Caizhi Weng
Hi! 1. 直接用 futrue.complate(rowdata) 传递数据,只是实现了发送是异步,异步join 得自己实现,这个理解对吗? 正确。例如 HBaseRowDataAsyncLookupFunction 里就调用了 hbase 提供的 table#get 异步方法实现异步查询。 2. 像join hbase 里面通过线程池实现了join异步,是无法保证顺序,并没有看到任何保证顺序的操作,还是有其他逻辑保证顺序吗? Async operator 就是为了提高无需保序的操作(例如很多 etl 查维表就不关心顺序)的效率才引入的,如果对顺序有强需求就不能用

Re:Re: 关于异步 AsyncTableFunction CompletableFuture 的疑问?

2021-12-06 Thread Michael Ran
好的,谢谢,我这边尝试下异步保证顺序,我们这边有些场景 在 2021-12-07 14:17:51,"Caizhi Weng" 写道: >Hi! > >1. 直接用 futrue.complate(rowdata) 传递数据,只是实现了发送是异步,异步join 得自己实现,这个理解对吗? > > >正确。例如 HBaseRowDataAsyncLookupFunction 里就调用了 hbase 提供的 table#get 异步方法实现异步查询。 > >2. 像join hbase 里面通过线程池实现了join异步,是无法保证顺序,并没有看到任何保证顺序的操作,还是有其他逻辑保证顺序吗?

Re: flink结合历史数据怎么处理

2021-12-06 Thread Leonard Xu
MySQL CDC connector 支持并发读取的,读取过程也不会用锁,600万的数据量很小了,百亿级的分库分表我们和社区用户测试下都是ok的,你可以自己试试。 祝好, Leonard > 2021年12月6日 下午3:54,张阳 <705503...@qq.com.INVALID> 写道: > > 因为数据量有600w 所以担心初始化时间太长 或者性能问题 > > > > > --原始邮件-- > 发件人:

????

2021-12-06 Thread ?6?4??????????

????

2021-12-06 Thread ?6?4??????????

Re: 退订

2021-12-06 Thread liber xue
退订 ™薇维苿尉℃ 于2021年12月6日 周一16:20写道: > 退订

????

2021-12-06 Thread lorthevan