Re: get state from window

2022-08-17 Thread yuxia
Sorry for misleading. After some investigation, it seems a UDTAGG can only be used via the Flink Table API. Best regards, Yuxia - Original Message - From: "yuxia" To: "user-zh" Cc: "User" Sent: Thursday, August 18, 2022 10:21:12 AM Subject: Re: get state from window > does flink sql support UDTAGG? Yes, Flink SQL supports…
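For context, a user-defined table aggregate function (UDTAGG) is invoked through the Table API's flatAggregate call rather than through SQL. Below is a minimal sketch along the lines of the Top2 example from the Flink documentation; the table and column names are illustrative:

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.table.functions.TableAggregateFunction;
import org.apache.flink.util.Collector;

// Emits the two largest Integer values per group as (value, rank) pairs.
public class Top2 extends TableAggregateFunction<Tuple2<Integer, Integer>, Top2.Acc> {

    public static class Acc {
        public Integer first = Integer.MIN_VALUE;   // largest value seen so far
        public Integer second = Integer.MIN_VALUE;  // second largest value seen so far
    }

    @Override
    public Acc createAccumulator() {
        return new Acc();
    }

    // Called once per input row.
    public void accumulate(Acc acc, Integer value) {
        if (value > acc.first) {
            acc.second = acc.first;
            acc.first = value;
        } else if (value > acc.second) {
            acc.second = value;
        }
    }

    // Emits the accumulator contents as result rows.
    public void emitValue(Acc acc, Collector<Tuple2<Integer, Integer>> out) {
        if (acc.first != Integer.MIN_VALUE) out.collect(Tuple2.of(acc.first, 1));
        if (acc.second != Integer.MIN_VALUE) out.collect(Tuple2.of(acc.second, 2));
    }
}
```

Usage goes through the Table API, which is the part with no SQL equivalent:

```java
// with: import static org.apache.flink.table.api.Expressions.*;
tEnv.from("T")
    .groupBy($("key"))
    .flatAggregate(call(Top2.class, $("value")).as("v", "rank"))
    .select($("key"), $("v"), $("rank"));
```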

Re: get state from window

2022-08-17 Thread yanfei lei
Hi, there are two methods on the Context object that a process() invocation receives that allow access to the two types of state: - globalState(), which allows access to keyed state that is not scoped to a window - windowState(), which allows access to keyed state that is also scoped to the window
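A minimal sketch of both accessors inside a ProcessWindowFunction (the type parameters and state names here are illustrative):

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

public class CountingWindowFunction
        extends ProcessWindowFunction<Long, String, String, TimeWindow> {

    @Override
    public void process(String key, Context ctx, Iterable<Long> elements,
                        Collector<String> out) throws Exception {
        // Keyed state scoped to this window; cleared when the window is purged.
        ValueState<Long> perWindow = ctx.windowState().getState(
                new ValueStateDescriptor<>("per-window-count", Long.class));
        // Keyed state NOT scoped to a window; survives across windows for this key.
        ValueState<Long> global = ctx.globalState().getState(
                new ValueStateDescriptor<>("global-count", Long.class));

        long count = 0;
        for (Long ignored : elements) {
            count++;
        }
        perWindow.update(count);
        global.update((global.value() == null ? 0L : global.value()) + count);

        out.collect(key + ": this window=" + count + ", all windows=" + global.value());
    }
}
```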

Re: get state from window

2022-08-17 Thread yuxia
> does flink sql support UDTAGG? Yes, Flink SQL supports UDTAGG. Best regards, Yuxia - Original Message - From: "曲洋" To: "user-zh", "User" Sent: Thursday, August 18, 2022 10:03:24 AM Subject: get state from window Hi dear engineers, I have one question: does flink streaming support getting the state? I overr…

Re: without DISTINCT unique lines show up many times in FLINK SQL

2022-08-17 Thread yuxia
Seems it's the same problem as the one discussed in [1]. [1]: https://lists.apache.org/thread/3lvkd8hryb1zdxs3o8z65mrjyoqzs88l Best regards, Yuxia - Original Message - From: "Marco Villalobos" To: "User" Sent: Wednesday, August 17, 2022 12:56:44 PM Subject: without DISTINCT unique lines show up many times in FLINK SQL

Re: Is this a Batch SQL Bug?

2022-08-17 Thread yuxia
Thanks for raising it. Yes, you're right, it's indeed a bug. The problem is that the RowData produced by LineBytesInputFormat is reused, but DeserializationSchemaAdapter#Reader only does a shallow copy of the produced data, so the final result will always be the last row's value. Could you please help…
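The underlying pitfall is generic: when a source reuses a single mutable record object, any reader that buffers references instead of copies ends up with N pointers to the last value. A self-contained sketch of the failure mode (plain Java, not Flink's actual classes):

```java
import java.util.ArrayList;
import java.util.List;

public class ObjectReusePitfall {

    static class Row { String value; }

    public static void main(String[] args) {
        Row reused = new Row();                 // the source reuses one instance
        List<Row> shallow = new ArrayList<>();
        List<Row> deep = new ArrayList<>();

        for (String v : new String[] {"a", "b", "c"}) {
            reused.value = v;
            shallow.add(reused);                // bug: buffers a reference only

            Row copy = new Row();               // fix: deep-copy before buffering
            copy.value = reused.value;
            deep.add(copy);
        }

        shallow.forEach(r -> System.out.print(r.value + " ")); // prints: c c c
        System.out.println();
        deep.forEach(r -> System.out.print(r.value + " "));    // prints: a b c
    }
}
```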

Re: Pyflink :: Conversion from DataStream to TableAPI

2022-08-17 Thread yu'an huang
Thanks to Dian for the explanation. I thought PyFlink supported non-keyed streams because I saw "If key_by(...) is not called, your stream is not keyed." in the documentation, lol. Sorry for the confusion, Ramana. On Thu, 18 Aug 2022 at 9:36 AM, Dian Fu wrote: > Hey Ramana, > > Non-keyed window will be supported…

Re: Pyflink :: Conversion from DataStream to TableAPI

2022-08-17 Thread Dian Fu
Hey Ramana, Non-keyed window will be supported in the coming Flink 1.16. See https://issues.apache.org/jira/browse/FLINK-26480 for more details. In releases prior to 1.16, you could work around it as follows: ``` data_stream = xxx data_stream.key_by(lambda x: 'key').xxx().force_non_parallel() ```

Is this a Batch SQL Bug?

2022-08-17 Thread Marco Villalobos
Given this program: ```java package mvillalobos.bug; import org.apache.flink.api.common.RuntimeExecutionMode; import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; import org.apache.flink.table.api.TableResult; import org.apache.flink.table.api.bridge.java.StreamTableEnvironment; …

Failing to compile Flink 1.9 with Scala 2.12

2022-08-17 Thread Milind Vaidya
Hi, I am trying to compile and build the Flink jars against Scala 2.12. Settings: Java 8, Maven 3.6.3 / 3.8.6. Many online posts suggest using Java 8, which is already in place. Building with Jenkins. Any clues as to how to get rid of the failure? The plugin configuration (flattened in the archive): net.alchim31.maven scala-maven-plugin 3.3.2 -nobootcp -Xs…
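The coordinates above look like a flattened pom fragment for the scala-maven-plugin. A possible reconstruction, with the second compiler argument left truncated exactly as it appears in the archive:

```xml
<plugin>
  <groupId>net.alchim31.maven</groupId>
  <artifactId>scala-maven-plugin</artifactId>
  <version>3.3.2</version>
  <configuration>
    <args>
      <arg>-nobootcp</arg>
      <!-- second argument truncated in the archive: -Xs… -->
    </args>
  </configuration>
</plugin>
```

Note also that Flink 1.9 builds against Scala 2.11 by default; the documented way to build that release line against Scala 2.12 is to activate the corresponding profile, e.g. `mvn clean install -DskipTests -Dscala-2.12`.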

Re: flink sink kafka exactly once plz help me

2022-08-17 Thread David Anderson
You can keep the same transaction ID if you are restarting the job as a continuation of what was running before. You need distinct IDs for different jobs that will be running against the same Kafka brokers. I think of the transaction ID as an application identifier. See [1] for a complete list of…
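A minimal sketch of an exactly-once KafkaSink in Flink 1.14, treating the transactional-id prefix as a stable application identifier (the broker address, topic, and prefix are placeholders):

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;

KafkaSink<String> sink = KafkaSink.<String>builder()
        .setBootstrapServers("broker:9092")
        .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                .setTopic("output-topic")
                .setValueSerializationSchema(new SimpleStringSchema())
                .build())
        .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
        // Keep this prefix stable when restarting the same job;
        // use a distinct prefix for each separate job on the same brokers.
        .setTransactionalIdPrefix("my-app-id")
        // Must not exceed the broker's transaction.max.timeout.ms.
        .setProperty("transaction.timeout.ms", "900000")
        .build();
```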

Event times and watermarks not in sync after sorting stream by event time

2022-08-17 Thread Peter Schrott
Background: Using Flink v1.13.2 on AWS, with a job parallelism of 4. The ingress data from AWS Kinesis is not partitioned by the correct key according to the business logic. For that reason, events are repartitioned by using a KeyedStream produced by calling the keyBy(…) function, providing the correct logi…
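A sketch of that repartitioning step (the Event type and its accessors are hypothetical; note that keyBy only redistributes records across parallel instances, it neither reorders them nor adjusts watermarks):

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.KeyedStream;

import java.time.Duration;

DataStream<Event> events = kinesisSource
        .assignTimestampsAndWatermarks(
                WatermarkStrategy.<Event>forBoundedOutOfOrderness(Duration.ofSeconds(30))
                        .withTimestampAssigner((event, ts) -> event.getEventTime()));

// Repartition by the business key: each parallel instance now sees all
// events for its keys, but watermarks still advance per input channel.
KeyedStream<Event, String> keyed = events.keyBy(event -> event.getBusinessKey());
```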

flink sink kafka exactly once plz help me

2022-08-17 Thread kcz
flink-1.14.4, kafka-2.4.0. I have a small question about setTransactionalIdPrefix: when I start the next job, this parameter cannot reuse the last transaction ID and a new one needs to be generated automatically. I just tested restarting from a checkpoint, and a new transaction ID was also generated. Won't this lead to data l…