Re: how to propagate watermarks across multiple jobs

2021-03-03 Thread Piotr Nowojski
Great :) Just one more note. Currently FlinkKafkaShuffle has a critical bug [1] that probably will prevent you from using it directly. I hope it will be fixed in some next release. In the meantime you can just inspire your solution with the source code. Best, Piotrek [1]

Re: Scaling Higher than 10k Nodes

2021-03-03 Thread Piotr Nowojski
Hi Joey, Sorry for not responding to your question sooner. As you can imagine there are not many users running Flink at such scale. As far as I know, Alibaba is running the largest/one of the largest clusters, I'm asking for someone who is familiar with those deployments to take a look at this

Re: [Flink-SQL] FLOORing OR CEILing a DATE or TIMESTAMP to WEEK uses Thursdays as week start

2021-03-03 Thread Sebastián Magrí
Thanks a lot for the added context and pointers Julian and Leonard, I've fixed it by going down to the arithmetics as suggested in one of the Calcite discussions. The changes proposed by FLIP-126 definitely look good. I'll check its details further. Best Regards, On Thu, 4 Mar 2021 at 04:18,

Flink SQL upsert-kafka connector 生成的 Stage ChangelogNormalize 算子的疑问

2021-03-03 Thread Qishang
Hi 社区。 Flink 1.12.1 现在的业务是通过 canal-json 从kafka 接入数据写入DB ,但是由于 Topic 是1个 Partition ,设置大的并发,对于只有 forword 的ETL没有作用。 insert into table_a select id,udf(a),b,c from table_b; 发现 upsert-kafka connector expain 生成的 Stage 有 ChangelogNormalize 可以分区 1. ChangelogNormalize 是会默认加上的吗,还是在哪里可以设置? 2. 这个可以改变默认

Re: flink sql中如何使用异步io关联维表?

2021-03-03 Thread HunterXHunter
定义一个 sourcetable -- Sent from: http://apache-flink.147419.n8.nabble.com/

Re: flink-savepoint问题

2021-03-03 Thread Congxian Qiu
对于 keyed state,需要保证同一个 key 在 同一个 keygroup 中,如果是某个 key 有热点,可以在 keyby 之前进行一次 map(在 key 后面拼接一些 后缀),然后 keyby,最后处理完成之后,将这些进行聚合 Best, Congxian guomuhua <663021...@qq.com> 于2021年3月4日周四 下午12:49写道: > 我也遇到类似情况,为了打散数据,keyby加了随机数。请问怎么正确打散数据呢? > nobleyd wrote > > 是不是使用了随机key。 > > > guaishushu1103@ > > > >

Re: flink-savepoint问题

2021-03-03 Thread guomuhua
我也遇到类似情况,为了打散数据,keyby加了随机数。请问怎么正确打散数据呢? nobleyd wrote > 是不是使用了随机key。 > guaishushu1103@ > > guaishushu1103@ > 于2021年3月3日周三 下午6:53写道:> checkpoint 可以成功保存,但是savepoint出现错误:> > java.lang.Exception: Could not materialize checkpoint 2404 for operator> > KeyedProcess (21/48).> at> >

Re: flink-savepoint问题

2021-03-03 Thread guomuhua
我也遇到同样问题,为了打散数据,在keyby时加了随机数作为后缀,去掉随机数,可以正常savepoint,加上随机数就savepoint失败。所以如果确有要打散数据的需求,应该怎么处理呢? -- Sent from: http://apache-flink.147419.n8.nabble.com/

Re: Processing-time temporal join is not supported yet.

2021-03-03 Thread Leonard Xu
Hi, Eric > what will be the best workaround to enrich stream of data from a kafka topics > with statical data based on id? Currently you can put your statical data in Hive/JDBC/HBase which supports lookup the data in full table env as a workaround,. You can also write a UDF which caches the s3

Re: [Flink-SQL] FLOORing OR CEILing a DATE or TIMESTAMP to WEEK uses Thursdays as week start

2021-03-03 Thread Leonard Xu
Hi, Sebastián Ramírez Magrí (Sorry for wrong name in above mail) Flink follows old version calcite’s behaviour which lead to the wrong behavior. snd Julian is right that calcite has corrected FLOOR and CEIL functions in CALCITE-3412, Flink has upgraded calcite to 1.26 version which contains

Re: [Flink-SQL] FLOORing OR CEILing a DATE or TIMESTAMP to WEEK uses Thursdays as week start

2021-03-03 Thread Leonard Xu
Hi, Jaffe Flink follows old version calcite’s behaviour which lead to the wrong behavior. snd Julian is right that calcite has corrected FLOOR and CEIL functions in CALCITE-3412, Flink has upgraded calcite to 1.26 version which contains the patch, what we need is only to adapt it in Flink

Defining GlobalJobParameters in Flink Unit Testing Harnesses

2021-03-03 Thread Rion Williams
Hi all, Early today I had asked a few questions regarding the use of the many testing constructs available within Flink and believe that I have things in a good direction at present. I did run into a specific case that either may not be supported, or just isn't documented well enough for me to

pipeline.auto-watermark-interval vs setAutoWatermarkInterval

2021-03-03 Thread Aeden Jameson
I'm hoping to have my confusion clarified regarding the settings, 1. pipeline.auto-watermark-interval https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/common/ExecutionConfig.html#setAutoWatermarkInterval-long- 2. setAutoWatermarkInterval

Re: how to propagate watermarks across multiple jobs

2021-03-03 Thread yidan zhao
Yes, you are right and thank you. I take a brief look at what FlinkKafkaShuffle is doing, it seems what I need and I will have a try. >

Re: flink Application Native k8s使用oss作为backend日志偶尔报错

2021-03-03 Thread 王 羽凡
2021-03-04 02:33:25,292 DEBUG org.apache.flink.runtime.rpc.akka.SupervisorActor [] - Starting FencedAkkaRpcActor with name jobmanager_2. 2021/3/4 上午10:33:25 2021-03-04 02:33:25,304 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService [] - Starting RPC endpoint for

Re: [External] : Re: Need help with JDBC Broken Pipeline Issue after some idle time

2021-03-03 Thread Fuyao Li
Hi Qinghui, I agree. I am trying to found internal and resources on the internet to fix the issue. Idle Time Limits might be a reason. But after configuring those

Re: 1.12.2 docker image

2021-03-03 Thread Chesnay Schepler
they should be released in a day or two. On 3/3/2021 11:18 PM, Bohinski, Kevin wrote: Hi, Are there plans to provide a docker image for 1.12.2? Best kevin

1.12.2 docker image

2021-03-03 Thread Bohinski, Kevin
Hi, Are there plans to provide a docker image for 1.12.2? Best kevin

Re: Flink Zookeeper leader change v 1.9.X

2021-03-03 Thread Chesnay Schepler
1) This could occur due to a number of reasons, like processes crashing, network issues between ZK and Flink, or the JobManager being stuck in some blocking operation for a long time. You will need to take a look at the ZK/Flink logs to narrow things down. 2) For FLINK-14091 the issue was not

Re: Stop vs Cancel with savepoint

2021-03-03 Thread Chesnay Schepler
Your understanding of cancel vs stop(-with-savepoint) is correct. I agree that we should update the REST API documentation and have a section outlining the problems with cancel-with-savepoint. Would you like to open a ticket yourself? On 3/3/2021 11:16 AM, Thomas Eckestad wrote: Hi! Cancel

Re: Unit Testing State Stores in KeyedProcessFunctions

2021-03-03 Thread Rion Williams
Thanks Chesnay, I agree that output testing is more practical and far less brittle, I was just curious if support was there for it. I have a specific use case where I’m managing my own windows and may schedule something to be emitted but after some processing time delay so it could potentially

Re: Compile Error

2021-03-03 Thread Chesnay Schepler
The flink-clients dependency is correct. We will need additional information to debug the Job execution failures, because these can happen due to all kind of reasons. Things like the full stacktrace, or exceptions from the logging output. Additionally, I would recommend to base your project

Antw: [EXT] Re: Running Apache Flink on Android

2021-03-03 Thread Alexander Borgschulze
Hey, Thanks for your answer :) For my Master's thesis,I want to test and evaluate the use of CEP technologies for detecting Complex Patterns in Android sensor data (Floating Phone Data). Apache Flink offers a CEP library, so I thought it would be an interesting option. The data sources would be

Re: Unit Testing State Stores in KeyedProcessFunctions

2021-03-03 Thread Chesnay Schepler
I do not believe this to be possible. Given that the state will likely in some form affect the behavior of the function (usually in regards to what it outputs), it may be a better idea to test for that. (I suppose you'd want tests like that anyway) On 3/3/2021 8:10 PM, Rion Williams wrote:

Re: Flink upgrade causes operator to lose state

2021-03-03 Thread Chesnay Schepler
It is currently not possible to upgrade table API / SQL applications via savepoints. This thread may provide some more insights: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-SQL-and-checkpoints-and-savepoints-td40749.html On 3/3/2021 6:53 PM, soumoks wrote: Hi,

Re: Running Apache Flink on Android

2021-03-03 Thread Piotr Nowojski
Hi, The question would be, why do you want to do it? I think it might be possible, but probably nobody has ever tested it. Flink is a distributed system, so running it on an Android phone doesn't make much sense. I would suggest you first make your app/example work outside of Android. To make

Re: Flink KafkaProducer flushing on savepoints

2021-03-03 Thread Piotr Nowojski
Hi, What Flink version and which FlinkKafkaProducer version are you using? `FlinkKafkaProducerBase` is no longer used in the latest version. I would guess some older versions, and FlinkKafkaProducer010 or later (no longer supported). I would suggest either to use the universal FlinkKafkaProducer

Re: Job downgrade

2021-03-03 Thread Alexey Trenikhun
If I copy class A into version 1+ it works. But it is the problem from CD perspective - I want to introduce feature which required new state: 1st I need make version 1+ with class B, but no other changes, then version 2 with class B and logic changes, upgrade job and if job doesn’t do what

Re: Job downgrade

2021-03-03 Thread Piotr Nowojski
Hi, I'm not sure what's the reason behind this. Probably classes are somehow attached to the state and this would explain why you are experiencing this issue. I've asked someone else from the community to chip in, but in the meantime, can not you just prepare a new "version 1" of the job, with

Re: how to propagate watermarks across multiple jobs

2021-03-03 Thread Piotr Nowojski
Hi, Can not you write the watermark as a special event to the "mid-topic"? In the "new job2" you would parse this event and use it to assign watermark before `xxxWindow2`? I believe this is what FlinkKafkaShuffle is doing [1], you could look at its code for inspiration. Piotrek [1]

Re: Flink Metrics

2021-03-03 Thread Piotr Nowojski
Hi, 1) Do you want to output those metrics as Flink metrics? Or output those "metrics"/counters as values to some external system (like Kafka)? The problem discussed in [1], was that the metrics (Counters) were not fitting in memory, so David suggested to hold them on Flink's state and treat the

Unit Testing State Stores in KeyedProcessFunctions

2021-03-03 Thread Rion Williams
Hi all! Is it possible to apply assertions against the underlying state stores within a KeyedProcessFunction using the existing KeyedOneInputStreamOperatorTestHarness class within unit tests? Basically I wanted to ensure that if I passed in two elements each with unique keys that I would be able

Re: Independence of task parallelism

2021-03-03 Thread Piotr Nowojski
Hi Jan, As far as I remember, Flink doesn't handle very well cases like (1-2-1-1-1) and two Task Managers. There are no guarantees how the operators/subtasks are going to be scheduled, but most likely it will be as you mentioned/observed. First task manager will be handling all of the operators,

Re: Allocating tasks to specific TaskManagers

2021-03-03 Thread Piotr Nowojski
Hi Hyejo, I don't think it's possible. May I ask why do you want to do this? Best, Piotrek pon., 1 mar 2021 o 21:02 황혜조 napisał(a): > Hi, > > I am looking for a way to allocate each created subTask to a specific > TaskManager. > Is there any way to force assigning tasks to specific

Flink upgrade causes operator to lose state

2021-03-03 Thread soumoks
Hi, We are upgrading several applications from Flink 1.9.1 to 1.11.2. Some of the applications written with Table API are not able start from savepoint after the upgrade and fail with the following error. Caused by: java.lang.IllegalStateException: Failed to rollback to checkpoint/savepoint

Compile Error

2021-03-03 Thread Abdullah bin Omar
Hi, I am running a code (Example Program) from [1], and followed the [2] for the dependencies. I used this in the pom.xml: http://maven.apache.org/POM/4.0.0 "* xmlns:xsi=*"http://www.w3.org/2001/XMLSchema-instance "*

flink sql中如何使用异步io关联维表?

2021-03-03 Thread casel.chen
flink sql中如何使用异步io关联维表?官网文档有介绍么?

Re: Savepoint documentation

2021-03-03 Thread Farouk
Thanks a million :) Le mer. 3 mars 2021 à 11:15, David Anderson a écrit : > > Out of curiosity, does it mean that savepoint created by flink 1.11 > cannot be recovered by a job running with flink 1.10 or older versions (so > downgrade is impossible)? > > That's correct. See the mailing list

Re: Flink 1.12 Compatibility with hbase-shaded-client 2.1 in application jar

2021-03-03 Thread Debraj Manna
The issue is resolved. org.apache.hbase exclusion was missing on my application pom while creating the uber jar. diff --git a/map/engine/pom.xml b/map/engine/pom.xml index 8337be031d1..8eceb721fa7 100644 --- a/map/engine/pom.xml +++ b/map/engine/pom.xml @@ -203,6 +203,7 @@

Re:flink TableEnvironment.sqlUpdate不支持update 多表关联更新吗

2021-03-03 Thread Michael Ran
SQL 也不能这样吧- - At 2021-03-03 16:43:49, "JackJia" wrote: >Hi 诸位同仁: >诸同仁好,flink TableEnvironment.sqlUpdate是不是不支持update 多表关联更新? > > >如下代码: >bbTableEnv.sqlUpdate("update order_tb a, min_max_usertime_tb b set a.mark=-2 " >+ >" where a.mac=b.mac and extract(epoch from a.usertime)/7200

Re: flink-savepoint问题

2021-03-03 Thread yidan zhao
是不是使用了随机key。 guaishushu1...@163.com 于2021年3月3日周三 下午6:53写道: > checkpoint 可以成功保存,但是savepoint出现错误: > java.lang.Exception: Could not materialize checkpoint 2404 for operator > KeyedProcess (21/48). > at >

Re: Handling Data Separation / Watermarking from Kafka in Flink

2021-03-03 Thread David Anderson
When bounded Flink sources reach the end of their input, a special watermark with the value Watermark.MAX_WATERMARK is emitted that will take care of flushing all windows. One approach is to use a DeserializationSchema or KafkaDeserializationSchema with an implementation of isEndOfStream that

flink-savepoint问题

2021-03-03 Thread guaishushu1...@163.com
checkpoint 可以成功保存,但是savepoint出现错误: java.lang.Exception: Could not materialize checkpoint 2404 for operator KeyedProcess (21/48). at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.handleExecutionException(StreamTask.java:1100) at

Stop vs Cancel with savepoint

2021-03-03 Thread Thomas Eckestad
Hi! Cancel with savepoint is marked as deprecated in the cli-documentation. It is not marked as deprecated in the REST-API documentation though? Is that a mistake? At least some recommendation regarding stop vs cancel would be appropriate to include in the API doc, or? As I understand, stop

Re: Savepoint documentation

2021-03-03 Thread David Anderson
> Out of curiosity, does it mean that savepoint created by flink 1.11 cannot be recovered by a job running with flink 1.10 or older versions (so downgrade is impossible)? That's correct. See the mailing list thread on Backwards Compatibility of Savepoints [1]. [1]

Re: BroadcastState dropped when data deleted in Kafka

2021-03-03 Thread bat man
I created a new descriptor and rulestream used it in the second process function and this works fine. public static final MapStateDescriptor rulesDescriptor = new MapStateDescriptor<>( "rules", BasicTypeInfo.INT_TYPE_INFO, TypeInformation.of(Rule.class)); public static

Re: Processing-time temporal join is not supported yet.

2021-03-03 Thread eric hoffmann
Hi Leonard, Thx for your reply, Not problem to help on the JIRA topic, In my situation, in a full sql env, what will be the best workaround to enrich stream of data from a kafka topics with statical data based on id? i know how to do t in stream. eric Le sam. 27 févr. 2021 à 05:15, Leonard Xu a

Re: Python DataStream API Questions -- Java/Scala Interoperability?

2021-03-03 Thread Shuiqiang Chen
Hi Kevin, Thank you for your questions. Currently, users are not able to defined custom source/sinks in Python. This is a greate feature that can unify the end to end PyFlink application development in Python and is a large topic that we have no plan to support at present. As you have noticed

flink TableEnvironment.sqlUpdate??????update ??????????????

2021-03-03 Thread JackJia
Hi ?? ??flink TableEnvironment.sqlUpdateupdate ?? ?? bbTableEnv.sqlUpdate("update order_tb a, min_max_usertime_tb b set a.mark=-2 " + " where a.mac=b.mac and extract(epoch from a.usertime)/7200 = b.user_2hour " + " and a.usertime > b.min_usertime

Re: Flink application kept restarting

2021-03-03 Thread Rainie Li
I see. Thank you for the explanation. Best regards Rainie On Wed, Mar 3, 2021 at 12:24 AM Matthias Pohl wrote: > Hi Rainie, > in general buffer pools being destroyed usually mean that some other > exception occurred that caused the task to fail and in the process of > failure handling the

??????????Flink1.11??flink-runtime-web????

2021-03-03 Thread Natasha
hi Michael, ?? ---- ??: "user-zh"

flink TableEnvironment.sqlUpdate??????update ??????????????

2021-03-03 Thread JackJia
Hi ?? ??flink TableEnvironment.sqlUpdateupdate ?? ?? bbTableEnv.sqlUpdate("update order_tb a, min_max_usertime_tb b set a.mark=-2 " + " where a.mac=b.mac and extract(epoch from a.usertime)/7200 = b.user_2hour " + " and a.usertime > b.min_usertime

Re: Flink application kept restarting

2021-03-03 Thread Matthias Pohl
Hi Rainie, in general buffer pools being destroyed usually mean that some other exception occurred that caused the task to fail and in the process of failure handling the operator-related network buffer is destroyed. That causes the "java.lang.RuntimeException: Buffer pool is destroyed." in your