Re: Bug: Kafka producer always writes to partition 0, because KafkaSerializationSchemaWrapper does not call open() method of FlinkKafkaPartitioner

2020-09-03 Thread DONG, Weike
Hi community, And by the way, during *FlinkKafkaProducer#initProducer*, the *flinkKafkaPartitioner* is only opened when is is NOT null, which is unfortunately not the case here, because it would be set to null if *KafkaSerializationSchemaWrapper *is provided in the arguments of the constructor.

Re: [VOTE] FLIP-141: Intra-Slot Managed Memory Sharing

2020-09-03 Thread Till Rohrmann
Hi Xintong, thanks for starting the vote. +1 for the proposal given that we find a proper name for the different memory consumers (specifically the batch/RocksDB consumer) and their corresponding weights. Cheers, Till On Thu, Sep 3, 2020 at 12:43 PM Xintong Song wrote: > Hi devs, > > I'd

Re: [VOTE] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-09-03 Thread Piotr Nowojski
+1 czw., 3 wrz 2020 o 10:08 Guowei Ma napisał(a): > +1 > Looking forward to having a unified datastream api. > Best, > Guowei > > > On Thu, Sep 3, 2020 at 3:46 PM Dawid Wysakowicz > wrote: > > > +1 > > > > I think it gives a clear idea why we should deprecate and eventually > > remove the

Re: [DISCUSS] FLIP-136: Improve interoperability between DataStream and Table API

2020-09-03 Thread Timo Walther
Hi Danny, "if ChangelogMode.INSERT is the default, existing pipelines should be compatible" It is not about changelog mode compatibility, it is about the type compatibility. The renaming to `toInsertStream` is only to have a mean of dealing with data type inconsistencies that could break

Re: [DISCUSS] Releasing Flink 1.11.2

2020-09-03 Thread Dawid Wysakowicz
User has just reported another issue FLINK-19133 which I think should be a blocker for the 1.11.2 release. I'll try to prepare a fix as soon as possible. On 03/09/2020 09:36, Zhu Zhu wrote: > Thanks for the inputs! > I have made FLINK-14942 and FLINK-18641 blockers for 1.11.2. > > And thanks a

Re: [DISCUSS] FLIP-141: Intra-Slot Managed Memory Sharing

2020-09-03 Thread Till Rohrmann
Thanks for updating the FLIP Xintong. It looks good to me. One minor comment is that we could name the configuration parameter also taskmanager.memory.managed.consumer-weights which might be a bit more expressive what this option does. Cheers, Till On Thu, Sep 3, 2020 at 12:44 PM Xintong Song

[jira] [Created] (FLINK-19132) Failed to start jobs for consuming Secure Kafka after cluster restart

2020-09-03 Thread Olivier Zembri (Jira)
Olivier Zembri created FLINK-19132: -- Summary: Failed to start jobs for consuming Secure Kafka after cluster restart Key: FLINK-19132 URL: https://issues.apache.org/jira/browse/FLINK-19132 Project:

Bug: Kafka producer always writes to partition 0, because KafkaSerializationSchemaWrapper does not call open() method of FlinkKafkaPartitioner

2020-09-03 Thread DONG, Weike
Hi community, We have found a serious issue with the newly-introduced *KafkaSerializationSchemaWrapper *class, which eventually let *FlinkKafkaProducer *only write to partition 0 in the given Kafka topic under certain conditions. First let's look at this constructor in the universal version of

Re: [DISCUSS] FLIP-139: General Python User-Defined Aggregate Function on Table API

2020-09-03 Thread Wei Zhong
Hi everyone, Are there more comments about this FLIP? If not, I would like to bring up the VOTE. Best, Wei > 在 2020年9月1日,11:15,Wei Zhong 写道: > > Hi Timo, > > Thanks for your notification. I’ll remove it from the design doc. > > Best, > Wei > >> 在 2020年8月31日,21:11,Timo Walther 写道: >> >>

[jira] [Created] (FLINK-19133) User provided partitioners are not initialized correctly

2020-09-03 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-19133: Summary: User provided partitioners are not initialized correctly Key: FLINK-19133 URL: https://issues.apache.org/jira/browse/FLINK-19133 Project: Flink

[jira] [Created] (FLINK-19134) Fix the converter of array coder for Python DataStream API.

2020-09-03 Thread Shuiqiang Chen (Jira)
Shuiqiang Chen created FLINK-19134: -- Summary: Fix the converter of array coder for Python DataStream API. Key: FLINK-19134 URL: https://issues.apache.org/jira/browse/FLINK-19134 Project: Flink

Re: [DISCUSS] FLIP-138: Declarative Resource management

2020-09-03 Thread Till Rohrmann
Thanks for the feedback Xintong and Zhu Zhu. I've added a bit more details for the intended interface extensions, potential follow ups (removing the AllocationIDs) and the question about whether to reuse or return a slot if the profiles don't fully match. If nobody objects, then I would start a

Re: HadoopOutputFormat has issues with LocalExecutionEnvironment?

2020-09-03 Thread Ken Krugler
Hi Robert, I haven’t tried yet with 1.11, on my list. I’ll be spending time on this tomorrow, so hopefully more results. As for setting the algorithm version 2, I do it in code like this: Job job = Job.getInstance(); job.getConfiguration().set("io.serializations",

Re: [VOTE] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-09-03 Thread David Anderson
+1 On Thu, Sep 3, 2020 at 3:06 PM Piotr Nowojski wrote: > +1 > > czw., 3 wrz 2020 o 10:08 Guowei Ma napisał(a): > > > +1 > > Looking forward to having a unified datastream api. > > Best, > > Guowei > > > > > > On Thu, Sep 3, 2020 at 3:46 PM Dawid Wysakowicz > > wrote: > > > > > +1 > > > > >

[jira] [Created] (FLINK-19135) (Stream)ExecutionEnvironment.execute() should not throw ExecutionException

2020-09-03 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-19135: Summary: (Stream)ExecutionEnvironment.execute() should not throw ExecutionException Key: FLINK-19135 URL: https://issues.apache.org/jira/browse/FLINK-19135

[VOTE] FLIP-138: Declarative Resource Management

2020-09-03 Thread Till Rohrmann
Hi devs, I'd like to start a voting thread on FLIP-138 [1], which proposes to make the slot protocol declarative. The proposal has been discussed in [2]. The vote will be open for at least 72h + weekends. Hence, it can be closed on September 9, unless there is an objection or not enough votes.

Re: [DISCUSS] FLIP-141: Intra-Slot Managed Memory Sharing

2020-09-03 Thread Xintong Song
Thanks Till, `taskmanager.memory.managed.consumer-weights` sounds good to me. Thank you~ Xintong Song On Thu, Sep 3, 2020 at 8:44 PM Till Rohrmann wrote: > Thanks for updating the FLIP Xintong. It looks good to me. One minor > comment is that we could name the configuration parameter > also

Re: [DISCUSS] FLIP-138: Declarative Resource management

2020-09-03 Thread Xintong Song
Thanks Till, the changes look good to me. Looking forward to the vote. Thank you~ Xintong Song On Fri, Sep 4, 2020 at 12:31 AM Till Rohrmann wrote: > Thanks for the feedback Xintong and Zhu Zhu. I've added a bit more details > for the intended interface extensions, potential follow ups

Re: [VOTE] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-09-03 Thread Zhu Zhu
+1 >From the viewpoint of a developer, too many APIs also adds complexity and risk to the code change for job compilation stage and runtime stage. Looking forward to the day that we can drop DataSet API and having a unified one. Thanks, Zhu David Anderson 于2020年9月4日周五 上午2:23写道: > +1 > > > On

[jira] [Created] (FLINK-19136) MetricsAvailabilityITCase.testReporter failed with "Could not satisfy the predicate within the allowed time"

2020-09-03 Thread Dian Fu (Jira)
Dian Fu created FLINK-19136: --- Summary: MetricsAvailabilityITCase.testReporter failed with "Could not satisfy the predicate within the allowed time" Key: FLINK-19136 URL:

Re: [VOTE] FLIP-138: Declarative Resource Management

2020-09-03 Thread Xintong Song
Thanks for starting this vote. +1 from my side. Thank you~ Xintong Song On Fri, Sep 4, 2020 at 12:37 AM Till Rohrmann wrote: > Hi devs, > > I'd like to start a voting thread on FLIP-138 [1], which proposes to make > the slot protocol declarative. The proposal has been discussed in [2]. > >

[jira] [Created] (FLINK-19138) Python UDF supports directly specifying input_types as DataTypes.ROW

2020-09-03 Thread Huang Xingbo (Jira)
Huang Xingbo created FLINK-19138: Summary: Python UDF supports directly specifying input_types as DataTypes.ROW Key: FLINK-19138 URL: https://issues.apache.org/jira/browse/FLINK-19138 Project: Flink

Re: [DISCUSS] FLIP-138: Declarative Resource management

2020-09-03 Thread Zhu Zhu
The new edits look good to me. Looking forward to the vote. Thanks, Zhu Xintong Song 于2020年9月4日周五 上午9:49写道: > Thanks Till, the changes look good to me. Looking forward to the vote. > > Thank you~ > > Xintong Song > > > > On Fri, Sep 4, 2020 at 12:31 AM Till Rohrmann > wrote: > > > Thanks for

[jira] [Created] (FLINK-19137) Bump Apache Parquet to 1.11.1

2020-09-03 Thread ABC (Jira)
ABC created FLINK-19137: --- Summary: Bump Apache Parquet to 1.11.1 Key: FLINK-19137 URL: https://issues.apache.org/jira/browse/FLINK-19137 Project: Flink Issue Type: Improvement Components:

[VOTE] FLIP-137: Support Pandas UDAF in PyFlink

2020-09-03 Thread Xingbo Huang
Hi all, I would like to start the vote for FLIP-137[1], which is discussed and reached a consensus in the discussion thread[2]. The vote will be open for at least 72h, unless there is an objection or not enough votes. Best, Xingbo [1]

Re: [DISCUSS] FLIP-137: Support Pandas UDAF in PyFlink

2020-09-03 Thread Xingbo Huang
Hi everyone, Thanks all of you for the discussion. If there are no objections, I would like to start a vote thread tomorrow. Best, Xingbo Dian Fu 于2020年9月3日周四 下午5:45写道: > Thanks for preparing the FLIP, xingbo! > > LGTM overall and looking forward to the voting! > > Regards, > Dian > > > 在

Re: [DISCUSS] FLIP-141: Intra-Slot Managed Memory Sharing

2020-09-03 Thread Yangze Guo
Thanks for driving this! The newest version LGTM. +1 for this FLIP. Best, Yangze Guo On Thu, Sep 3, 2020 at 2:11 PM Dian Fu wrote: > > Thanks for driving this FLIP, Xintong! +1 to the updated version. > > > 在 2020年9月2日,下午6:09,Xintong Song 写道: > > > > Thanks for the input, Yu. > > > > I believe

Re: HadoopOutputFormat has issues with LocalExecutionEnvironment?

2020-09-03 Thread Robert Metzger
Hi Ken, sorry for the late reply. This could be a bug in Flink. Does the issue also occur on Flink 1.11? Have you set a breakpoint in the HadoopOutputFormat.finalizeGlobal() when running locally to validate that this method doesn't get called? What do you mean by "algorithm version 2"? Where can

[RESULT][VOTE] Remove deprecated DataStream#fold and DataStream#split in 1.12

2020-09-03 Thread Dawid Wysakowicz
The voting time for removing the DataStream#fold and DataStream#split has passed. I'm closing the vote now. There were 5 +1 votes (4 of which binding) : * Aljoscha (binding) * Paul Lam (non-binding) * David (binding) * Timo (binding) * Konstantin (binding) There were no -1 votes. I

Re: [DISCUSS] FLIP-141: Intra-Slot Managed Memory Sharing

2020-09-03 Thread Zhu Zhu
Thanks for proposing this improvement! @Xintong The proposal looks good to me. Agreed that we should make it as simple as possible for users to understand. Thanks, Zhu Dian Fu 于2020年9月3日周四 下午2:11写道: > Thanks for driving this FLIP, Xintong! +1 to the updated version. > > > 在

Re: [DISCUSS] FLIP-136: Improve interoperability between DataStream and Table API

2020-09-03 Thread Timo Walther
Thanks for the nice summary Dawid. I also see the pain points in this part of the API. Most of the users just want to add a time attribute. I'm not sure how much projection features we need to have in a `fromDataStream`. Users can do column renaming/reordering afterwards in a `.select()`.

Re: [DISCUSS] Releasing Flink 1.11.2

2020-09-03 Thread Congxian Qiu
Hi I'd like to include FLINK-14942 into 1.11.2. FLINK-14942 (this fixes a bug introduce in 1.11.0), there is a pr for it already. Best, Congxian Zhou, Brian 于2020年9月3日周四 上午11:21写道: > Hi, > > Thanks Becket for addressing the issue. FLINK-18641 is now a blocker for > Pravega connector

Re: [DISCUSS] FLIP-141: Intra-Slot Managed Memory Sharing

2020-09-03 Thread Dian Fu
Thanks for driving this FLIP, Xintong! +1 to the updated version. > 在 2020年9月2日,下午6:09,Xintong Song 写道: > > Thanks for the input, Yu. > > I believe the current proposal should work with RocksDB, or any other state > backend, using memory at either the slot or the scope. With the proposed >

Re: [DISCUSS] FLIP-137: Support Pandas UDAF in PyFlink

2020-09-03 Thread Xingbo Huang
Hi Jincheng, Yes, I agree that users can extend the class `AggregateFunction` if they want to define a Pandas UDAF by the way of custom classes. I have updated the part of the FLIP. Best, Xingbo jincheng sun 于2020年9月3日周四 下午1:48写道: > Thanks for the update Xingbo! > > Pandas UDAF can reuse the

Re: [VOTE] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-09-03 Thread Guowei Ma
+1 Looking forward to having a unified datastream api. Best, Guowei On Thu, Sep 3, 2020 at 3:46 PM Dawid Wysakowicz wrote: > +1 > > I think it gives a clear idea why we should deprecate and eventually > remove the DataSet API. > > Best, > > Dawid > > On 03/09/2020 09:37, Yun Gao wrote: > >

[jira] [Created] (FLINK-19131) Add py38 support in PyFlink

2020-09-03 Thread sunjincheng (Jira)
sunjincheng created FLINK-19131: --- Summary: Add py38 support in PyFlink Key: FLINK-19131 URL: https://issues.apache.org/jira/browse/FLINK-19131 Project: Flink Issue Type: New Feature

Re: [DISCUSS] FLIP-137: Support Pandas UDAF in PyFlink

2020-09-03 Thread jincheng sun
Thank you! looking forward to the voting :) Best, Jincheng Xingbo Huang 于2020年9月3日周四 下午2:39写道: > Hi Jincheng, > > Yes, I agree that users can extend the class `AggregateFunction` if they > want to define a Pandas UDAF by the way of custom classes. I have updated > the part of the FLIP. > >

Re: [DISCUSS] Releasing Flink 1.11.2

2020-09-03 Thread Zhu Zhu
Thanks for the inputs! I have made FLINK-14942 and FLINK-18641 blockers for 1.11.2. And thanks a lot for offering help, zhijiang! Thanks, Zhu Congxian Qiu 于2020年9月3日周四 下午3:18写道: > Hi > I'd like to include FLINK-14942 into 1.11.2. FLINK-14942 (this fixes a > bug introduce in 1.11.0), there

Re: [VOTE] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-09-03 Thread Yun Gao
Very thanks for bring this up! +1 for deprecating the DataSet API and providing a unified streaming/batch programming model to users. Best, Yun -- Sender:Aljoscha Krettek Date:2020/09/02 19:22:51 Recipient:Flink Dev

Re: [VOTE] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-09-03 Thread Dawid Wysakowicz
+1 I think it gives a clear idea why we should deprecate and eventually remove the DataSet API. Best, Dawid On 03/09/2020 09:37, Yun Gao wrote: > Very thanks for bring this up! +1 for deprecating the DataSet API and > providing a unified streaming/batch programming model to users. > > Best,

[VOTE] FLIP-141: Intra-Slot Managed Memory Sharing

2020-09-03 Thread Xintong Song
Hi devs, I'd like to start a voting thread on FLIP-141[1], which proposes how managed memory should be shared by various use cases within a slot. The proposal has been discussed in [2]. The vote will be open for at least 72h + weekends. I'll try to close it on September 8, unless there is an

Re: [DISCUSS] FLIP-136: Improve interoperability between DataStream and Table API

2020-09-03 Thread Danny Chan
"It is a more conservative approach to introduce that in a new method rather than changing the existing one under the hood and potentially break existing pipelines silently” I like the idea actually, but if ChangelogMode.INSERT is the default, existing pipelines should be compatible. We can see

Re: [DISCUSS] FLIP-137: Support Pandas UDAF in PyFlink

2020-09-03 Thread Dian Fu
Thanks for preparing the FLIP, xingbo! LGTM overall and looking forward to the voting! Regards, Dian > 在 2020年9月3日,下午5:22,jincheng sun 写道: > > Thank you! looking forward to the voting :) > > Best, > Jincheng > > > Xingbo Huang 于2020年9月3日周四 下午2:39写道: > >> Hi Jincheng, >> >> Yes, I agree

Re: [DISCUSS] FLIP-141: Intra-Slot Managed Memory Sharing

2020-09-03 Thread Xintong Song
Thanks all for the feedback. FYI, I've opened a voting thread[1] on this. Thank you~ Xintong Song [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-FLIP-141-Intra-Slot-Managed-Memory-Sharing-td44358.html On Thu, Sep 3, 2020 at 2:54 PM Zhu Zhu wrote: > Thanks for