Re: Support minibatch for TopNFunction

2024-03-27 Thread Roman Boyko
Hi Xushuai! Thank you for your reply! 1. Yes, you are absolutely right - we can't fold the records inside output buffer if the current record, which is provided to output, has accumulate type (+I or +U). Only revoke type of records (-U or -D which produced by current TopN function or received by

Re: [DISCUSS] Planning Flink 1.20

2024-03-27 Thread weijie guo
Hi everyone, The discussion has been going on for a long time and there are currently 4 RM candidates. According to the voting results, the final release manager of 1.20 are: Weijie Guo, Rui Fan, Ufuk Celebi and Robert Metzger. We will do the first release sync on April 9, 2024, at 10am (UTC+2) a

[jira] [Created] (FLINK-34958) Add support Flink 1.20-SNAPSHOT and bump flink-connector-parent to 1.1.0 for mongodb connector

2024-03-27 Thread Zhongqiang Gong (Jira)
Zhongqiang Gong created FLINK-34958: --- Summary: Add support Flink 1.20-SNAPSHOT and bump flink-connector-parent to 1.1.0 for mongodb connector Key: FLINK-34958 URL: https://issues.apache.org/jira/browse/FLINK-349

Re: [DISCUSS] FLIP-428: Fault Tolerance/Rescale Integration for Disaggregated State

2024-03-27 Thread Jinzhong Li
Hi Feifan, Sorry for the misunderstanding. As Hangxiang explained, the basic cleanup mechanism for remote working directory is the same as rocksdb-statebackend, that is, when TM exits, forst-statebackend will delete the entire working dir. Regarding orphaned files cleanup in the case of TM crash,

Re: [DISCUSS] FLIP-427: Disaggregated State Store

2024-03-27 Thread Feifan Wang
Thanks for your reply, Hangxiang. I totally agree with you about the jni part. Hi Yun Tang, I just noticed that FLIP-427 mentions “The life cycle of working dir is managed as before local strategy.” IIUC, the working dir will be deleted after TaskManager exit. And I think that's enough for curre

Re: [DISCUSS] FLIP-428: Fault Tolerance/Rescale Integration for Disaggregated State

2024-03-27 Thread Hangxiang Yu
Hi, Yun and Feifan. Thanks for your reply. About the cleanup of working dir, as mentioned in FLIP-427, "The life cycle of working dir is managed as before local strategy.". Since the current working dir and checkpoint dir are separate, The life cycle including creating and cleanup of working dir

Re: Re: [DISCUSS] FLIP-427: Disaggregated State Store

2024-03-27 Thread Hangxiang Yu
Hi, Feifan. Thanks for your reply. What if we only use jni to access DFS that needs to reuse Flink FileSystem? > And all local disk access through native api. This idea is based on the > understanding that jni overhead is not worth mentioning compared to DFS > access latency. It might make more s

Re: [DISCUSS] FLIP-428: Fault Tolerance/Rescale Integration for Disaggregated State

2024-03-27 Thread Feifan Wang
And I think the cleanup of working dir should be discussion in FLIP-427[1] ( this mail list [2]) ? [1] https://cwiki.apache.org/confluence/x/T4p3EQ [2] https://lists.apache.org/thread/vktfzqvb7t4rltg7fdlsyd9sfdmrc4ft —— Best regards, Feifan Wang At 2024-03-28 11:56:22, "Feifa

[jira] [Created] (FLINK-34957) JDBC Autoscaler event handler throws Column 'message' cannot be null

2024-03-27 Thread Rui Fan (Jira)
Rui Fan created FLINK-34957: --- Summary: JDBC Autoscaler event handler throws Column 'message' cannot be null Key: FLINK-34957 URL: https://issues.apache.org/jira/browse/FLINK-34957 Project: Flink

Re:Re: [DISCUSS] FLIP-428: Fault Tolerance/Rescale Integration for Disaggregated State

2024-03-27 Thread Feifan Wang
Hi Jinzhong : > I suggest that we could postpone this topic for now and consider it > comprehensively combined with the TM ownership file management in the future > FLIP. Sorry I still think we should consider the cleanup of the working dir in this FLIP, although we may come up with a better

Re:Re: [DISCUSS] FLIP-427: Disaggregated State Store

2024-03-27 Thread Feifan Wang
Thanks for this valuable proposal Hangxiang ! > If we need to introduce a JNI call during each filesystem call, that would be > N times JNI cost compared with the current RocksDB state-backend's JNI cost. What if we only use jni to access DFS that needs to reuse Flink FileSystem? And all local

[jira] [Created] (FLINK-34956) The config type is wrong for Duration

2024-03-27 Thread Rui Fan (Jira)
Rui Fan created FLINK-34956: --- Summary: The config type is wrong for Duration Key: FLINK-34956 URL: https://issues.apache.org/jira/browse/FLINK-34956 Project: Flink Issue Type: Bug Compone

Re: [DISCUSS] FLIP-428: Fault Tolerance/Rescale Integration for Disaggregated State

2024-03-27 Thread Jinzhong Li
Hi Yun, Thanks for your reply. > 1. Why must we have another 'subTask-checkpoint-sub-dir' > under the shared directory? if we don't consider making > TM ownership in this FLIP, this design seems unnecessary. Good catch! We will not change the directory layout of shared directory in this FLIP. I

Re: [DISCUSS] FLIP-XXX: Introduce Flink SQL variables

2024-03-27 Thread Yanfei Lei
Hi Ferenc, Thanks for the proposal, using SQLvariables to exclude environment-specific configuration from code sounds like a good idea. I'm new to Flink SQL and I'm curious if these variables can be calculated from statements or expression [1]? In FLIP, it seems that the values are in the form of

Re: Support minibatch for TopNFunction

2024-03-27 Thread shuai xu
Hi, Roman Thanks for your proposal. I think this is an interesting idea and it might be useful when there are operators downstream of the TopN. And I have some questions about your proposal after reading your doc. 1. From the input-output perspective, only the accumulated data seems to be sent

Re: [DISCUSS] FLIP-427: Disaggregated State Store

2024-03-27 Thread Hangxiang Yu
Hi, Yun. Thanks for the reply. The JNI cost you considered is right. As replied to Yue, I agreed to leave space and consider proposal 1 as an optimization in the future, which is also updated in the FLIP. The other question is that the configuration of > `state.backend.forSt.working-dir` looks to

[jira] [Created] (FLINK-34955) Upgrade commons-compress to 1.26.0

2024-03-27 Thread Shilun Fan (Jira)
Shilun Fan created FLINK-34955: -- Summary: Upgrade commons-compress to 1.26.0 Key: FLINK-34955 URL: https://issues.apache.org/jira/browse/FLINK-34955 Project: Flink Issue Type: Improvement

Re: [DISCUSS] FLIP-XXX: Introduce Flink SQL variables

2024-03-27 Thread Jim Hughes
Hi Ferenc, Looks like a good idea. I'd prefer sticking to the SQL standard if possible. Would it be possible / sensible to allow for each syntax, perhaps managed by a config setting? Cheers, Jim On Tue, Mar 26, 2024 at 6:59 AM Ferenc Csaky wrote: > Hello devs, > > I would like to start a di

[jira] [Created] (FLINK-34954) Kryo input implementation NoFetchingInput fails to handle zero length bytes

2024-03-27 Thread Qinghui Xu (Jira)
Qinghui Xu created FLINK-34954: -- Summary: Kryo input implementation NoFetchingInput fails to handle zero length bytes Key: FLINK-34954 URL: https://issues.apache.org/jira/browse/FLINK-34954 Project: Flin

Re: [DISCUSS] FLIP-441: Show the JobType and remove Execution Mode on Flink WebUI

2024-03-27 Thread Venkatakrishnan Sowrirajan
Rui, I assume the current proposal would also handle the case of mixed mode (BATCH + STREAMING within the same app) in the future, right? Regards Venkat On Wed, Mar 27, 2024 at 10:15 AM Venkatakrishnan Sowrirajan < vsowr...@asu.edu> wrote: > This will be a very useful addition to Flink UI. Than

Re: [DISCUSS] FLIP-441: Show the JobType and remove Execution Mode on Flink WebUI

2024-03-27 Thread Venkatakrishnan Sowrirajan
This will be a very useful addition to Flink UI. Thanks Rui for starting a FLIP for this improvement. Regards Venkata krishnan On Wed, Mar 27, 2024 at 4:49 AM Muhammet Orazov wrote: > Hello Rui, > > Thanks for the proposal! It looks good! > > I have minor clarification from my side: > > The ex

Re: [DISCUSS] FLIP-437: Support ML Models in Flink SQL

2024-03-27 Thread Hao Li
Hi Jark, I think we can start with supporting popular model providers such as openai, azureml, sagemaker for remote models. Thanks, Hao On Tue, Mar 26, 2024 at 8:15 PM Jark Wu wrote: > Thanks for the PoC and updating, > > The final syntax looks good to me, at least it is a nice and concise fir

Re: [DISCUSS] FLIP-427: Disaggregated State Store

2024-03-27 Thread Yun Tang
Hi Hangxiang, The design looks good, and I also support leaving space for proposal 1. As you know, loading index/filter/data blocks for querying across levels would introduce high IO access within the LSM tree for old data. If we need to introduce a JNI call during each filesystem call, that wo

Re: [DISCUSS] FLIP-428: Fault Tolerance/Rescale Integration for Disaggregated State

2024-03-27 Thread Yun Tang
Hi Jinzhong, The overall design looks good. I have two minor questions: 1. Why must we have another 'subTask-checkpoint-sub-dir' under the shared directory? if we don't consider making TM ownership in this FLIP, this design seems unnecessary. 2. This FLIP forgets to mention the cleanup of the

[jira] [Created] (FLINK-34953) Add github ci for flink-web to auto commit build files

2024-03-27 Thread Zhongqiang Gong (Jira)
Zhongqiang Gong created FLINK-34953: --- Summary: Add github ci for flink-web to auto commit build files Key: FLINK-34953 URL: https://issues.apache.org/jira/browse/FLINK-34953 Project: Flink

Re: [DISCUSS] FLIP-441: Show the JobType and remove Execution Mode on Flink WebUI

2024-03-27 Thread Muhammet Orazov
Hello Rui, Thanks for the proposal! It looks good! I have minor clarification from my side: The execution mode is also used for the DataStream API [1], would that also affect/hide the DataStream execution mode if we remove it from the WebUI? Best, Muhammet [1]: https://nightlies.apache.org/f

[VOTE] FLIP-428: Fault Tolerance/Rescale Integration for Disaggregated State

2024-03-27 Thread Jinzhong Li
Hi devs, I'd like to start a vote on the FLIP-428: Fault Tolerance/Rescale Integration for Disaggregated State [1]. The discussion thread is here [2]. The vote will be open for at least 72 hours unless there is an objection or insufficient votes. [1] https://cwiki.apache.org/confluence/x/UYp3E

[jira] [Created] (FLINK-34952) Flink CDC pipeline supports SourceFunction

2024-03-27 Thread Hongshun Wang (Jira)
Hongshun Wang created FLINK-34952: - Summary: Flink CDC pipeline supports SourceFunction Key: FLINK-34952 URL: https://issues.apache.org/jira/browse/FLINK-34952 Project: Flink Issue Type: Imp

Re: [DISCUSS] FLINK-34440 Support Debezium Protobuf Confluent Format

2024-03-27 Thread Anupam Aggarwal
Thanks Kevin, I will pick this up more actively starting early next week. On Wed, Mar 27, 2024 at 1:36 AM Kevin Lam wrote: > Thanks Anupam! Looking forward to it. > > On Thu, Mar 14, 2024 at 1:50 AM Anupam Aggarwal > > wrote: > > > Hi Kevin, > > > > Thanks, these are some great points. > > Just

[VOTE] FLIP-426: Grouping Remote State Access

2024-03-27 Thread Jinzhong Li
Hi devs, I'd like to start a vote on the FLIP-426: Grouping Remote State Access [1]. The discussion thread is here [2]. The vote will be open for at least 72 hours unless there is an objection or insufficient votes. [1] https://cwiki.apache.org/confluence/x/TYp3EQ [2] https://lists.apache.org/

[VOTE] FLIP-427: Disaggregated state Store

2024-03-27 Thread Hangxiang Yu
Hi devs, Thanks all for your valuable feedback about FLIP-427: Disaggregated state Store [1]. I'd like to start a vote on it. The discussion thread is here [2]. The vote will be open for at least 72 hours unless there is an objection or insufficient votes. [1] https://cwiki.apache.org/confluenc

[VOTE] FLIP-425: Asynchronous Execution Model

2024-03-27 Thread Yanfei Lei
Hi everyone, Thanks for all the feedback about the FLIP-425: Asynchronous Execution Model [1]. The discussion thread is here [2]. The vote will be open for at least 72 hours unless there is an objection or insufficient votes. [1] https://cwiki.apache.org/confluence/x/S4p3EQ [2] https://lists.apa

[VOTE] FLIP-424: Asynchronous State APIs

2024-03-27 Thread Zakelly Lan
Hi devs, I'd like to start a vote on the FLIP-424: Asynchronous State APIs [1]. The discussion thread is here [2]. The vote will be open for at least 72 hours unless there is an objection or insufficient votes. [1] https://cwiki.apache.org/confluence/x/SYp3EQ [2] https://lists.apache.org/thread/

Re: [DISCUSS] Flink Website Menu Adjustment

2024-03-27 Thread gongzhongqiang
Hi everyone, Thanks for feedback. It seems that there is no consensus on this change. I have opened a FLINK-34946 issue to follow up on this matter. Best regards, Zhongqiang Gong Hang Ruan 于2024年3月26日周二 14:11写道: > +1 for the proposal. > > B

Re: [DISCUSS] FLIP-423 ~FLIP-428: Introduce Disaggregated State Storage and Management in Flink 2.0

2024-03-27 Thread Zakelly Lan
Thank Piotrek for your valuable input! I will prepare the following FLIPs about faster checkpointing in the current async execution model and the new APIs. And I have added some brief description of this part in FLIP-423/424/425. Regarding your concern: > My main concern here, is to prevent a s

[jira] [Created] (FLINK-34951) Flink-ci-mirror stopped running for commits after 22nd of March

2024-03-27 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-34951: --- Summary: Flink-ci-mirror stopped running for commits after 22nd of March Key: FLINK-34951 URL: https://issues.apache.org/jira/browse/FLINK-34951 Project: Flink

[jira] [Created] (FLINK-34950) Disable spotless on Java 21 for connector-shared-utils

2024-03-27 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-34950: --- Summary: Disable spotless on Java 21 for connector-shared-utils Key: FLINK-34950 URL: https://issues.apache.org/jira/browse/FLINK-34950 Project: Flink

Community Over Code NA 2024 Travel Assistance Applications now open!

2024-03-27 Thread Gavin McDonald
Hello to all users, contributors and Committers! [ You are receiving this email as a subscriber to one or more ASF project dev or user mailing lists and is not being sent to you directly. It is important that we reach all of our users and contributors/committers so that they may get a chance t

[jira] [Created] (FLINK-34949) Suggestion notify / skip DDL at data inegration famework v3.X

2024-03-27 Thread Lee SeungMin (Jira)
Lee SeungMin created FLINK-34949: Summary: Suggestion notify / skip DDL at data inegration famework v3.X Key: FLINK-34949 URL: https://issues.apache.org/jira/browse/FLINK-34949 Project: Flink

Re: [DISCUSS] FLIP-423 ~FLIP-428: Introduce Disaggregated State Storage and Management in Flink 2.0

2024-03-27 Thread Piotr Nowojski
Hi! Yes, after some long offline discussions we agreed to proceed as planned here, but we should treat the current API as experimental. The issues are that either we can not checkpoint lambdas as they are currently defined, leading to problems caused by in-flight records draining under backpressur

[jira] [Created] (FLINK-34948) CDC RowType can not convert to flink type

2024-03-27 Thread Qishang Zhong (Jira)
Qishang Zhong created FLINK-34948: - Summary: CDC RowType can not convert to flink type Key: FLINK-34948 URL: https://issues.apache.org/jira/browse/FLINK-34948 Project: Flink Issue Type: Bug