[jira] [Created] (FLINK-22862) Support profiling in PyFlink
Huang Xingbo created FLINK-22862: Summary: Support profiling in PyFlink Key: FLINK-22862 URL: https://issues.apache.org/jira/browse/FLINK-22862 Project: Flink Issue Type: Improvement Components: API / Python Affects Versions: 1.14.0 Reporter: Huang Xingbo Assignee: Huang Xingbo Fix For: 1.14.0 We will support profiling to help users to analyze the performance bottleneck in their python udf -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [Discuss] Planning Flink 1.14
Thanks for bringing this up. I have one thought about the release period. In a short word: shall we try to extend the release period for 1 month? There are a couple of reasons why I want to bring up this proposal. 1) I observed that lots of users are actually far behind the current Flink version. For example, we are now actively developing 1.14 but most users I know who have a migration or upgrade plan are planning to upgrade to 1.12. This means we need to back port bug fixes to 1.12 and 1.13. If we extend the release period by 1 month, I think there may be some chances that users can have a proper time frame to upgrade to the previous released version. Then we can have a good development cycle which looks like "actively developing the current version and making the previous version stable, not 2 ~ 3 versions before". Always far away from Flink's latest version also suppresses the motivation to contribute to Flink from users perspective. 2) Increasing the release period also eases the workload of committers which I think can improve the contributor experience. I have seen several times that when some contributors want to do some new features or improvements, we have to response with "sorry we are right now focusing with implementing/stabilizing planned feature for this version", and the contributions are mostly like being stalled and never brought up again. BTW extending the release period also has downsides. It slows down the delivery speed of new features. And I'm also not sure how much it can improve the above 2 issues. Looking forward to hearing some feedback from the community, both users and developers. Best, Kurt On Wed, Jun 2, 2021 at 8:39 PM JING ZHANG wrote: > Hi Dawid, Joe & Xintong, > > Thanks for starting the discussion. > > I would like to polish Window TVFs[1][2] which is a popular feature in SQL > introduced in 1.13. > > The detailed items are as follows. > 1. Add more computations based on Window TVF > * Window Join (which is already merged in master branch) > * Window Table Function > * Window Deduplicate > 2. Finish related JIRA to improve user experience >* Add offset support for TUMBLE, HOP, session window > 3. Complement the missing functions compared to the group window, which is > a precondition of deprecating the legacy Grouped Window Function in the > later versions. >* Support Session windows >* Support allow-lateness >* Support retract input stream >* Support window TVF in batch mode > > [1] https://issues.apache.org/jira/browse/FLINK-19604 > [2] > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-145%3A+Support+SQL+windowing+table-valued+function#FLIP145:SupportSQLwindowingtablevaluedfunction-CumulatingWindows > > Best regards, > JING ZHANG > > Xintong Song 于2021年6月2日周三 下午6:45写道: > > > Hi all, > > > > As 1.13 has been released for a while, I think it is a good time to start > > planning for the 1.14 release cycle. > > > > - Release managers: This time we'd like to have a team of 3 release > > managers. Dawid, Joe and I would like to volunteer for it. What do you > > think about it? > > > > - Timeline: According to our approximate 4 months release period, we > > propose to aim for a feature freeze roughly in early August (which could > > mean something like early September for the 1.14. release). Does it work > > for everyone? > > > > - Collecting features: It would be helpful to have a rough overview of > the > > new features that will likely be included in this release. We have > created > > a wiki page [1] for collecting such information. We'd like to kindly ask > > all committers to fill in the page with features that they intend to work > > on. > > > > We would also like to emphasize some aspects of the engineering process: > > > > - Stability of master: This has been an issue during the 1.13 feature > > freeze phase and it is still going on. We encourage every committer to > not > > merge PRs through the Github button, but do this manually, with caution > for > > the commits merged after the CI being triggered. It would be appreciated > to > > always build the project before merging to master. > > > > - Documentation: Please try to see documentation as an integrated part of > > the engineering process and don't push it to the feature freeze phase or > > even after. You might even think about going documentation first. We, as > > the Flink community, are adding great stuff, that is pushing the limits > of > > streaming data processors, with every release. We should also make this > > stuff usable for our users by documenting it well. > > > > - Promotion of 1.14: What applies to documentation also applies to all > the > > activity around the release. We encourage every contributor to also think > > about, plan and prepare activities like blog posts and talk, that will > > promote and spread the release once it is done. > > > > Please let us know what you think. > > > > Thank you~ > > Dawid, Joe & Xintong > > > > [1]
[jira] [Created] (FLINK-22861) TIMESTAMPADD + timestamp_ltz type throws CodeGenException when comparing with timestamp type
Caizhi Weng created FLINK-22861: --- Summary: TIMESTAMPADD + timestamp_ltz type throws CodeGenException when comparing with timestamp type Key: FLINK-22861 URL: https://issues.apache.org/jira/browse/FLINK-22861 Project: Flink Issue Type: Bug Components: Table SQL / Planner, Table SQL / Runtime Affects Versions: 1.14.0, 1.13.2 Reporter: Caizhi Weng Add the following test case to {{org.apache.flink.table.planner.runtime.batch.sql.CalcITCase}} to reproduce this issue. {code:scala} @Test def myTest(): Unit = { checkResult("SELECT TIMESTAMPADD(MINUTE, 10, CURRENT_TIMESTAMP) < TO_TIMESTAMP('2021-06-01 00:00:00')", Seq()) } {code} The exception stack is {code} org.apache.flink.table.planner.codegen.CodeGenException: Incomparable types: TIMESTAMP_LTZ(3) NOT NULL and TIMESTAMP(3) NOT NULL at org.apache.flink.table.planner.codegen.calls.ScalarOperatorGens$.generateComparison(ScalarOperatorGens.scala:633) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.generateCallExpression(ExprCodeGenerator.scala:661) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.visitCall(ExprCodeGenerator.scala:529) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.visitCall(ExprCodeGenerator.scala:56) at org.apache.calcite.rex.RexCall.accept(RexCall.java:174) at org.apache.flink.table.planner.codegen.ExprCodeGenerator$$anonfun$11.apply(ExprCodeGenerator.scala:526) at org.apache.flink.table.planner.codegen.ExprCodeGenerator$$anonfun$11.apply(ExprCodeGenerator.scala:517) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.visitCall(ExprCodeGenerator.scala:517) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.visitCall(ExprCodeGenerator.scala:56) at org.apache.calcite.rex.RexCall.accept(RexCall.java:174) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.generateExpression(ExprCodeGenerator.scala:155) at org.apache.flink.table.planner.codegen.CalcCodeGenerator$$anonfun$2.apply(CalcCodeGenerator.scala:141) at org.apache.flink.table.planner.codegen.CalcCodeGenerator$$anonfun$2.apply(CalcCodeGenerator.scala:141) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.flink.table.planner.codegen.CalcCodeGenerator$.produceProjectionCode$1(CalcCodeGenerator.scala:141) at org.apache.flink.table.planner.codegen.CalcCodeGenerator$.generateProcessCode(CalcCodeGenerator.scala:167) at org.apache.flink.table.planner.codegen.CalcCodeGenerator$.generateCalcOperator(CalcCodeGenerator.scala:50) at org.apache.flink.table.planner.codegen.CalcCodeGenerator.generateCalcOperator(CalcCodeGenerator.scala) at org.apache.flink.table.planner.plan.nodes.exec.common.CommonExecCalc.translateToPlanInternal(CommonExecCalc.java:94) at org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase.translateToPlan(ExecNodeBase.java:134) at org.apache.flink.table.planner.plan.nodes.exec.ExecEdge.translateToPlan(ExecEdge.java:247) at org.apache.flink.table.planner.plan.nodes.exec.batch.BatchExecSink.translateToPlanInternal(BatchExecSink.java:58) at org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase.translateToPlan(ExecNodeBase.java:134) at org.apache.flink.table.planner.delegation.BatchPlanner$$anonfun$translateToPlan$1.apply(BatchPlanner.scala:80) at org.apache.flink.table.planner.delegation.BatchPlanner$$anonfun$translateToPlan$1.apply(BatchPlanner.scala:79) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.Iterator$class.foreach(Iterator.scala:891) at scala.collection.AbstractIterator.foreach(Iterator.scala:1334) at
Re: Integrating new connector with Flink SQL.
Hi santhosh, 1.I recommend you use the new source with ScanTablesource. 2.You can use `org.apache.flink.table.connector.source.SourceProvider` to integrate to ScanTablesource. (Introduced in 1.12) 3.You can just implement a new source, this one can be used by both Flink DataStream and Flink SQL. (As well as SourceFunction, it can be used by both Flink DataStream and Flink SQL too) Actually, the connector of the table is just a wrapper of DataStream. They should not have core differences. I believe we should migrate KafkaDynamicSource to the new source KafkaSource in the Flink 1.14. Maybe @Qingsheng Ren is working on this. Best, Jingsong On Thu, Jun 3, 2021 at 5:19 AM santhosh venkat wrote: > Hi, > > Please correct me if I'm wrong anywhere. I'm just new to Flink and trying > to navigate the landscape. > > Within my company, currently we're trying to develop a connector for our > internal change data capture system(brooklin) for flink. We are planning to > use Flink SQL as a primary API to build streaming applications. > > When exploring flink contracts, we noticed that there are two different > flavors of APIs available in Flink for Source integration. > > a) Flink Table API : The Flink ScanTableSource abstractions which are > currently relying upon the SourceFunction interfaces for integrating with > the underlying messaging-client libraries. For instance, KafkaDynamicSource > and KinesisDynamicSource are currently using the FlinkKafkaConsumer(an > implementation of RichParallelSourceFunction) and KinesisConsumer(an > implementation of RichParallelSourceFunction) respectively to read from the > broker. > > b) FLIP-27 style connector implementations: There are connectors which > implement SplitEnumerator and SourceReader abstractions, where the > Enumerator runs with the JobMaster and the Readers runs within the > TaskManager processes respectively. > > Questions: > > 1. If I want to integrate a new connector and want to use Flink SQL, then > what is the recommendation? Are the users supposed to implement the > RichParallelSourceFunction, CheckpointListener etc similar to > FlinkKafkaConsumer and wire into the ScanTableSource API? > > 2. Just wondering, what is the long term plan for the ScanTablesource APIs? > Are there plans for them to use and integrate with the SplitEnumerator and > SourceReader abstractions? > > 3. If I want to offer my connector implementation to both Flink DataStream > and Flink SQL APIs, then should I implement both the flavors of source > APIs(SplitEnumerator/SourceReader as well as SourceFunction) in flink? > > I would really appreciate it if someone can help and answer the above > questions. > > Thanks. > -- Best, Jingsong Lee
[jira] [Created] (FLINK-22860) Supplementary 'HELP' command prompt message for SQL-Cli.
Roc Marshal created FLINK-22860: --- Summary: Supplementary 'HELP' command prompt message for SQL-Cli. Key: FLINK-22860 URL: https://issues.apache.org/jira/browse/FLINK-22860 Project: Flink Issue Type: Improvement Components: Table SQL / Client Reporter: Roc Marshal Attachments: attach.png !attach.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-22859) Wordcount on Docker test (custom fs plugin) fails due to output hash mismatch
Xintong Song created FLINK-22859: Summary: Wordcount on Docker test (custom fs plugin) fails due to output hash mismatch Key: FLINK-22859 URL: https://issues.apache.org/jira/browse/FLINK-22859 Project: Flink Issue Type: Bug Components: Tests Affects Versions: 1.14.0 Reporter: Xintong Song [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18568=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529=1213] And the CI hangs until timeout after this test failure. -- This message was sent by Atlassian Jira (v8.3.4#803005)
RE: [VOTE] Watermark propagation with Sink API
+1 (non-binding) Thanks Eron, looking forward to seeing this feature soon. Thanks, Brian -Original Message- From: Arvid Heise Sent: Wednesday, June 2, 2021 15:44 To: dev Subject: Re: [VOTE] Watermark propagation with Sink API [EXTERNAL EMAIL] +1 (binding) Thanks Eron for driving this effort; it will open up new exciting use cases. On Tue, Jun 1, 2021 at 7:17 PM Eron Wright wrote: > After some good discussion about how to enhance the Sink API to > process watermarks, I believe we're ready to proceed with a vote. > Voting will be open until at least Friday, June 4th, 2021. > > Reference: > > https://urldefense.com/v3/__https://cwiki.apache.org/confluence/displa > y/FLINK/FLIP-167*3A*Watermarks*for*Sink*API__;JSsrKys!!LpKI!zkBBhuqEEi > fxF_fDQqAjqsbuWWFmnAvwRfEAWxeC63viFWXPLul-GCBb-PTq$ > [cwiki[.]apache[.]org] > > Discussion thread: > > https://urldefense.com/v3/__https://lists.apache.org/thread.html/r5194 > e1cf157d1fd5ba7ca9b567cb01723bd582aa12dda57d25bca37e*40*3Cdev.flink.ap > ache.org*3E__;JSUl!!LpKI!zkBBhuqEEifxF_fDQqAjqsbuWWFmnAvwRfEAWxeC63viF > WXPLul-GJXlxwqN$ [lists[.]apache[.]org] > > Implementation Issue: > https://urldefense.com/v3/__https://issues.apache.org/jira/browse/FLIN > K-22700__;!!LpKI!zkBBhuqEEifxF_fDQqAjqsbuWWFmnAvwRfEAWxeC63viFWXPLul-G > N6AJm4h$ [issues[.]apache[.]org] > > Thanks, > Eron Wright > StreamNative >
Integrating new connector with Flink SQL.
Hi, Please correct me if I'm wrong anywhere. I'm just new to Flink and trying to navigate the landscape. Within my company, currently we're trying to develop a connector for our internal change data capture system(brooklin) for flink. We are planning to use Flink SQL as a primary API to build streaming applications. When exploring flink contracts, we noticed that there are two different flavors of APIs available in Flink for Source integration. a) Flink Table API : The Flink ScanTableSource abstractions which are currently relying upon the SourceFunction interfaces for integrating with the underlying messaging-client libraries. For instance, KafkaDynamicSource and KinesisDynamicSource are currently using the FlinkKafkaConsumer(an implementation of RichParallelSourceFunction) and KinesisConsumer(an implementation of RichParallelSourceFunction) respectively to read from the broker. b) FLIP-27 style connector implementations: There are connectors which implement SplitEnumerator and SourceReader abstractions, where the Enumerator runs with the JobMaster and the Readers runs within the TaskManager processes respectively. Questions: 1. If I want to integrate a new connector and want to use Flink SQL, then what is the recommendation? Are the users supposed to implement the RichParallelSourceFunction, CheckpointListener etc similar to FlinkKafkaConsumer and wire into the ScanTableSource API? 2. Just wondering, what is the long term plan for the ScanTablesource APIs? Are there plans for them to use and integrate with the SplitEnumerator and SourceReader abstractions? 3. If I want to offer my connector implementation to both Flink DataStream and Flink SQL APIs, then should I implement both the flavors of source APIs(SplitEnumerator/SourceReader as well as SourceFunction) in flink? I would really appreciate it if someone can help and answer the above questions. Thanks.
[jira] [Created] (FLINK-22858) avro-confluent doesn't support confluent schema registry that has security enabled
TAO XIAO created FLINK-22858: Summary: avro-confluent doesn't support confluent schema registry that has security enabled Key: FLINK-22858 URL: https://issues.apache.org/jira/browse/FLINK-22858 Project: Flink Issue Type: Improvement Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table SQL / Ecosystem Affects Versions: 1.13.1 Reporter: TAO XIAO Schema registry supports HTTP authentication via configuration[1] however avro-confluent doesn't pass format options to schema registry client which results in no authentication connection to schema registry only. To fix this we can pass the format options to CachedSchemaCoderProvider.java -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-22857) Add possibility to call built-in functions in SpecializedFunction
Timo Walther created FLINK-22857: Summary: Add possibility to call built-in functions in SpecializedFunction Key: FLINK-22857 URL: https://issues.apache.org/jira/browse/FLINK-22857 Project: Flink Issue Type: Sub-task Components: Table SQL / API Reporter: Timo Walther Assignee: Timo Walther This is the last missing piece to avoid code generation when developing built-in functions. Core operations such as CAST, EQUALS, etc. will still use code generation but other built-in functions should be able to use these core operations without the need for generating code. It should be possible to call other built-in functions via a SpecializedFunction. -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: recover from svaepoint
Hi, I think there is no generic way. If this error has happened indeed after starting a second job from the same savepoint, or something like that, user can change Sink's operator UID. If this is an issue of intentional recovery from an earlier checkpoint/savepoint, maybe `FlinkKafkaProducer#setLogFailuresOnly` will be helpful. Best, Piotrek wt., 1 cze 2021 o 15:16 Till Rohrmann napisał(a): > The error message says that we are trying to reuse a transaction id that is > currently being used or has expired. > > I am not 100% sure how this can happen. My suspicion is that you have > resumed a job multiple times from the same savepoint. Have you checked that > there is no other job which has been resumed from the same savepoint and > which is currently running or has run and completed checkpoints? > > @pnowojski @Becket Qin how > does the transaction id generation ensures that we don't have a clash of > transaction ids if we resume the same job multiple times from the same > savepoint? From the code, I do see that we have a TransactionalIdsGenerator > which is initialized with the taskName and the operator UID. > > fyi: @Arvid Heise > > Cheers, > Till > > > On Mon, May 31, 2021 at 11:10 AM 周瑞 wrote: > > > HI: > > When "sink.semantic = exactly-once", the following exception is > > thrown when recovering from svaepoint > > > >public static final String KAFKA_TABLE_FORMAT = > > "CREATE TABLE "+TABLE_NAME+" (\n" + > > " "+COLUMN_NAME+" STRING\n" + > > ") WITH (\n" + > > " 'connector' = 'kafka',\n" + > > " 'topic' = '%s',\n" + > > " 'properties.bootstrap.servers' = '%s',\n" + > > " 'sink.semantic' = 'exactly-once',\n" + > > " 'properties.transaction.timeout.ms' = > > '90',\n" + > > " 'sink.partitioner' = > > 'com.woqutench.qmatrix.cdc.extractor.PkPartitioner',\n" + > > " 'format' = 'dbz-json'\n" + > > ")\n"; > > [] - Source: TableSourceScan(table=[[default_catalog, default_database, > > debezium_source]], fields=[data]) -> Sink: Sink > > (table=[default_catalog.default_database.KafkaTable], fields=[data]) (1/1 > > )#859 (075273be72ab01bf1afd3c066876aaa6) switched from INITIALIZING to > > FAILED with failure cause: org.apache.kafka.common.KafkaException: > > Unexpected error in InitProducerIdResponse; Producer attempted an > > operation with an old epoch. Either there is a newer producer with the > > same transactionalId, or the producer's transaction has been expired by > > the broker. > > at org.apache.kafka.clients.producer.internals. > > > TransactionManager$InitProducerIdHandler.handleResponse(TransactionManager > > .java:1352) > > at org.apache.kafka.clients.producer.internals. > > TransactionManager$TxnRequestHandler.onComplete(TransactionManager.java: > > 1260) > > at org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse > > .java:109) > > at org.apache.kafka.clients.NetworkClient.completeResponses( > > NetworkClient.java:572) > > at > org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:564) > > at org.apache.kafka.clients.producer.internals.Sender > > .maybeSendAndPollTransactionalRequest(Sender.java:414) > > at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender > > .java:312) > > at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java: > > 239) > > at java.lang.Thread.run(Thread.java:748) > > >
[jira] [Created] (FLINK-22856) Move our Azure pipelines away from Ubuntu 16.04 by September
Robert Metzger created FLINK-22856: -- Summary: Move our Azure pipelines away from Ubuntu 16.04 by September Key: FLINK-22856 URL: https://issues.apache.org/jira/browse/FLINK-22856 Project: Flink Issue Type: Bug Components: Build System / Azure Pipelines Reporter: Robert Metzger Fix For: 1.14.0 Azure won't support Ubuntu 16.04 starting from October, hence we need to migrate to a newer ubuntu version. We should do this at a time when the builds are relatively stable to be able to clearly identify issues relating to the version upgrade. Also, we shouldn't do this before a feature freeze ;) -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [Discuss] Planning Flink 1.14
Hi Dawid, Joe & Xintong, Thanks for starting the discussion. I would like to polish Window TVFs[1][2] which is a popular feature in SQL introduced in 1.13. The detailed items are as follows. 1. Add more computations based on Window TVF * Window Join (which is already merged in master branch) * Window Table Function * Window Deduplicate 2. Finish related JIRA to improve user experience * Add offset support for TUMBLE, HOP, session window 3. Complement the missing functions compared to the group window, which is a precondition of deprecating the legacy Grouped Window Function in the later versions. * Support Session windows * Support allow-lateness * Support retract input stream * Support window TVF in batch mode [1] https://issues.apache.org/jira/browse/FLINK-19604 [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-145%3A+Support+SQL+windowing+table-valued+function#FLIP145:SupportSQLwindowingtablevaluedfunction-CumulatingWindows Best regards, JING ZHANG Xintong Song 于2021年6月2日周三 下午6:45写道: > Hi all, > > As 1.13 has been released for a while, I think it is a good time to start > planning for the 1.14 release cycle. > > - Release managers: This time we'd like to have a team of 3 release > managers. Dawid, Joe and I would like to volunteer for it. What do you > think about it? > > - Timeline: According to our approximate 4 months release period, we > propose to aim for a feature freeze roughly in early August (which could > mean something like early September for the 1.14. release). Does it work > for everyone? > > - Collecting features: It would be helpful to have a rough overview of the > new features that will likely be included in this release. We have created > a wiki page [1] for collecting such information. We'd like to kindly ask > all committers to fill in the page with features that they intend to work > on. > > We would also like to emphasize some aspects of the engineering process: > > - Stability of master: This has been an issue during the 1.13 feature > freeze phase and it is still going on. We encourage every committer to not > merge PRs through the Github button, but do this manually, with caution for > the commits merged after the CI being triggered. It would be appreciated to > always build the project before merging to master. > > - Documentation: Please try to see documentation as an integrated part of > the engineering process and don't push it to the feature freeze phase or > even after. You might even think about going documentation first. We, as > the Flink community, are adding great stuff, that is pushing the limits of > streaming data processors, with every release. We should also make this > stuff usable for our users by documenting it well. > > - Promotion of 1.14: What applies to documentation also applies to all the > activity around the release. We encourage every contributor to also think > about, plan and prepare activities like blog posts and talk, that will > promote and spread the release once it is done. > > Please let us know what you think. > > Thank you~ > Dawid, Joe & Xintong > > [1] https://cwiki.apache.org/confluence/display/FLINK/1.14+Release >
[jira] [Created] (FLINK-22855) Translate the 'Overview of Python API' page into Chinese.
Roc Marshal created FLINK-22855: --- Summary: Translate the 'Overview of Python API' page into Chinese. Key: FLINK-22855 URL: https://issues.apache.org/jira/browse/FLINK-22855 Project: Flink Issue Type: Improvement Components: chinese-translation Affects Versions: 1.14.0 Reporter: Roc Marshal target link: https://ci.apache.org/projects/flink/flink-docs-release-1.13/zh/docs/dev/python/overview/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-22854) Translate 'Apache Flink Documentation' index page to Chinese
Roc Marshal created FLINK-22854: --- Summary: Translate 'Apache Flink Documentation' index page to Chinese Key: FLINK-22854 URL: https://issues.apache.org/jira/browse/FLINK-22854 Project: Flink Issue Type: Technical Debt Components: chinese-translation Affects Versions: 1.14.0 Reporter: Roc Marshal target page : https://ci.apache.org/projects/flink/flink-docs-release-1.13/zh/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-22853) FLinkSql聚合函数max/min/sum返回结果重复
Raypon Wang created FLINK-22853: --- Summary: FLinkSql聚合函数max/min/sum返回结果重复 Key: FLINK-22853 URL: https://issues.apache.org/jira/browse/FLINK-22853 Project: Flink Issue Type: Bug Components: Table SQL / API Affects Versions: 1.12.1 Reporter: Raypon Wang mysql数据如下: id offset 1 1 1 3 1 2 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[Discuss] Planning Flink 1.14
Hi all, As 1.13 has been released for a while, I think it is a good time to start planning for the 1.14 release cycle. - Release managers: This time we'd like to have a team of 3 release managers. Dawid, Joe and I would like to volunteer for it. What do you think about it? - Timeline: According to our approximate 4 months release period, we propose to aim for a feature freeze roughly in early August (which could mean something like early September for the 1.14. release). Does it work for everyone? - Collecting features: It would be helpful to have a rough overview of the new features that will likely be included in this release. We have created a wiki page [1] for collecting such information. We'd like to kindly ask all committers to fill in the page with features that they intend to work on. We would also like to emphasize some aspects of the engineering process: - Stability of master: This has been an issue during the 1.13 feature freeze phase and it is still going on. We encourage every committer to not merge PRs through the Github button, but do this manually, with caution for the commits merged after the CI being triggered. It would be appreciated to always build the project before merging to master. - Documentation: Please try to see documentation as an integrated part of the engineering process and don't push it to the feature freeze phase or even after. You might even think about going documentation first. We, as the Flink community, are adding great stuff, that is pushing the limits of streaming data processors, with every release. We should also make this stuff usable for our users by documenting it well. - Promotion of 1.14: What applies to documentation also applies to all the activity around the release. We encourage every contributor to also think about, plan and prepare activities like blog posts and talk, that will promote and spread the release once it is done. Please let us know what you think. Thank you~ Dawid, Joe & Xintong [1] https://cwiki.apache.org/confluence/display/FLINK/1.14+Release
[jira] [Created] (FLINK-22852) Add SPNEGO authentication support
Gabor Somogyi created FLINK-22852: - Summary: Add SPNEGO authentication support Key: FLINK-22852 URL: https://issues.apache.org/jira/browse/FLINK-22852 Project: Flink Issue Type: Sub-task Components: Runtime / REST, Runtime / Web Frontend Affects Versions: 1.13.1 Reporter: Gabor Somogyi -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-22851) Add basic authentication support
Gabor Somogyi created FLINK-22851: - Summary: Add basic authentication support Key: FLINK-22851 URL: https://issues.apache.org/jira/browse/FLINK-22851 Project: Flink Issue Type: Sub-task Components: Runtime / REST, Runtime / Web Frontend Affects Versions: 1.13.1 Reporter: Gabor Somogyi -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [DISCUSS] Dashboard/HistoryServer authentication
Hi team, Happy to be here and hope I can provide quality additions in the future. Thank you all for helpful the suggestions! Considering them the FLIP has been modified and the work continues on the already existing Jira. BR, G On Wed, Jun 2, 2021 at 11:23 AM Márton Balassi wrote: > Thanks, Chesney - I totally missed that. Answered on the ticket too, let > us continue there then. > > Till, I agree that we should keep this codepath as slim as possible. It is > an important design decision that we aim to keep the list of authentication > protocols to a minimum. We believe that this should not be a primary > concern of Flink and a trusted proxy service (for example Apache Knox) > should be used to enable a multitude of enduser authentication mechanisms. > The bare minimum of authentication mechanisms to support consequently > consist of a single strong authentication protocol for which Kerberos is > the enterprise solution and HTTP Basic primary for development and > light-weight scenarios. > > Added the above wording to G's doc. > > https://docs.google.com/document/d/1NMPeJ9H0G49TGy3AzTVVJVKmYC0okwOtqLTSPnGqzHw/edit > > > > On Tue, Jun 1, 2021 at 11:47 AM Chesnay Schepler > wrote: > >> There's a related effort: >> https://issues.apache.org/jira/browse/FLINK-21108 >> >> On 6/1/2021 10:14 AM, Till Rohrmann wrote: >> > Hi Gabor, welcome to the Flink community! >> > >> > Thanks for sharing this proposal with the community Márton. In general, >> I >> > agree that authentication is missing and that this is required for using >> > Flink within an enterprise. The thing I am wondering is whether this >> > feature strictly needs to be implemented inside of Flink or whether a >> proxy >> > setup could do the job? Have you considered this option? If yes, then it >> > would be good to list it under the point of rejected alternatives. >> > >> > I do see the benefit of implementing this feature inside of Flink if >> many >> > users need it. If not, then it might be easier for the project to not >> > increase the surface area since it makes the overall maintenance harder. >> > >> > Cheers, >> > Till >> > >> > On Mon, May 31, 2021 at 4:57 PM Márton Balassi >> wrote: >> > >> >> Hi team, >> >> >> >> Firstly I would like to introduce Gabor or G [1] for short to the >> >> community, he is a Spark committer who has recently transitioned to the >> >> Flink Engineering team at Cloudera and is looking forward to >> contributing >> >> to Apache Flink. Previously G primarily focused on Spark Streaming and >> >> security. >> >> >> >> Based on requests from our customers G has implemented Kerberos and >> HTTP >> >> Basic Authentication for the Flink Dashboard and HistoryServer. >> Previously >> >> lacked an authentication story. >> >> >> >> We are looking to contribute this functionality back to the community, >> we >> >> believe that given Flink's maturity there should be a common code >> solution >> >> for this general pattern. >> >> >> >> We are looking forward to your feedback on G's design. [2] >> >> >> >> [1] http://gaborsomogyi.com/ >> >> [2] >> >> >> >> >> https://docs.google.com/document/d/1NMPeJ9H0G49TGy3AzTVVJVKmYC0okwOtqLTSPnGqzHw/edit >> >> >> >>
[jira] [Created] (FLINK-22850) org.apache.flink.configuration.ConfigOptions.defaultValue() and noDefaultValue() is Deprecated.
lixiaobao created FLINK-22850: - Summary: org.apache.flink.configuration.ConfigOptions.defaultValue() and noDefaultValue() is Deprecated. Key: FLINK-22850 URL: https://issues.apache.org/jira/browse/FLINK-22850 Project: Flink Issue Type: Improvement Components: API / Core Affects Versions: 1.12.4, 1.12.3, 1.13.0, 1.11.1, 1.10.1 Reporter: lixiaobao Fix For: 1.13.0 {code:java} //代码占位符 @Deprecated public ConfigOption defaultValue(T value) { checkNotNull(value); return new ConfigOption<>( key, value.getClass(), ConfigOption.EMPTY_DESCRIPTION, value, false); } @Deprecated public ConfigOption noDefaultValue() { return new ConfigOption<>( key, String.class, ConfigOption.EMPTY_DESCRIPTION, null, false); } {code} The method is marked as Deprecated, shoud define the type explicitly first with one of the intType(), stringType(), etc -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [DISCUSS] Dashboard/HistoryServer authentication
Thanks, Chesney - I totally missed that. Answered on the ticket too, let us continue there then. Till, I agree that we should keep this codepath as slim as possible. It is an important design decision that we aim to keep the list of authentication protocols to a minimum. We believe that this should not be a primary concern of Flink and a trusted proxy service (for example Apache Knox) should be used to enable a multitude of enduser authentication mechanisms. The bare minimum of authentication mechanisms to support consequently consist of a single strong authentication protocol for which Kerberos is the enterprise solution and HTTP Basic primary for development and light-weight scenarios. Added the above wording to G's doc. https://docs.google.com/document/d/1NMPeJ9H0G49TGy3AzTVVJVKmYC0okwOtqLTSPnGqzHw/edit On Tue, Jun 1, 2021 at 11:47 AM Chesnay Schepler wrote: > There's a related effort: > https://issues.apache.org/jira/browse/FLINK-21108 > > On 6/1/2021 10:14 AM, Till Rohrmann wrote: > > Hi Gabor, welcome to the Flink community! > > > > Thanks for sharing this proposal with the community Márton. In general, I > > agree that authentication is missing and that this is required for using > > Flink within an enterprise. The thing I am wondering is whether this > > feature strictly needs to be implemented inside of Flink or whether a > proxy > > setup could do the job? Have you considered this option? If yes, then it > > would be good to list it under the point of rejected alternatives. > > > > I do see the benefit of implementing this feature inside of Flink if many > > users need it. If not, then it might be easier for the project to not > > increase the surface area since it makes the overall maintenance harder. > > > > Cheers, > > Till > > > > On Mon, May 31, 2021 at 4:57 PM Márton Balassi > wrote: > > > >> Hi team, > >> > >> Firstly I would like to introduce Gabor or G [1] for short to the > >> community, he is a Spark committer who has recently transitioned to the > >> Flink Engineering team at Cloudera and is looking forward to > contributing > >> to Apache Flink. Previously G primarily focused on Spark Streaming and > >> security. > >> > >> Based on requests from our customers G has implemented Kerberos and HTTP > >> Basic Authentication for the Flink Dashboard and HistoryServer. > Previously > >> lacked an authentication story. > >> > >> We are looking to contribute this functionality back to the community, > we > >> believe that given Flink's maturity there should be a common code > solution > >> for this general pattern. > >> > >> We are looking forward to your feedback on G's design. [2] > >> > >> [1] http://gaborsomogyi.com/ > >> [2] > >> > >> > https://docs.google.com/document/d/1NMPeJ9H0G49TGy3AzTVVJVKmYC0okwOtqLTSPnGqzHw/edit > >> > >
Re: [VOTE] Watermark propagation with Sink API
+1 (binding) Thanks Eron for driving this effort; it will open up new exciting use cases. On Tue, Jun 1, 2021 at 7:17 PM Eron Wright wrote: > After some good discussion about how to enhance the Sink API to process > watermarks, I believe we're ready to proceed with a vote. Voting will be > open until at least Friday, June 4th, 2021. > > Reference: > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-167%3A+Watermarks+for+Sink+API > > Discussion thread: > > https://lists.apache.org/thread.html/r5194e1cf157d1fd5ba7ca9b567cb01723bd582aa12dda57d25bca37e%40%3Cdev.flink.apache.org%3E > > Implementation Issue: > https://issues.apache.org/jira/browse/FLINK-22700 > > Thanks, > Eron Wright > StreamNative >
[jira] [Created] (FLINK-22849) Drop remaining usages of legacy planner in E2E tests and Python
Timo Walther created FLINK-22849: Summary: Drop remaining usages of legacy planner in E2E tests and Python Key: FLINK-22849 URL: https://issues.apache.org/jira/browse/FLINK-22849 Project: Flink Issue Type: Sub-task Components: API / Python, Table SQL / Legacy Planner, Test Infrastructure Reporter: Timo Walther Assignee: Timo Walther This removes the remaining usages of legacy planner outside of the {{flink-table}} module. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-22848) Deprecate unquoted options for SET / RESET
Ingo Bürk created FLINK-22848: - Summary: Deprecate unquoted options for SET / RESET Key: FLINK-22848 URL: https://issues.apache.org/jira/browse/FLINK-22848 Project: Flink Issue Type: Improvement Components: Table SQL / Client Affects Versions: 1.14.0 Reporter: Ingo Bürk Eventually we should agree to a version in which to deprecate, and a version in which to remove, the unquoted syntax for SET / RESET: {code:java} // To be deprecated / removed SET a = b; RESET a; // New SET 'a' = 'b'; RESET 'a';{code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-22847) SET should print options quoted
Ingo Bürk created FLINK-22847: - Summary: SET should print options quoted Key: FLINK-22847 URL: https://issues.apache.org/jira/browse/FLINK-22847 Project: Flink Issue Type: Improvement Components: Table SQL / Client Affects Versions: 1.14.0, 1.13.2 Reporter: Ingo Bürk Assignee: Ingo Bürk In FLINK-22770 we exposed SET/RESET in the parser and introduced quoting the options when using SET, but kept the unquoted support in SQL client for now as well. To facilitate the quoted syntax for a future deprecation of the unquoted one, SET; should print the current options using quotes. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-22846) Add SPNEGO authentication support
Gabor Somogyi created FLINK-22846: - Summary: Add SPNEGO authentication support Key: FLINK-22846 URL: https://issues.apache.org/jira/browse/FLINK-22846 Project: Flink Issue Type: Sub-task Components: Runtime / REST, Runtime / Web Frontend Affects Versions: 1.13.1 Reporter: Gabor Somogyi -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-22845) Add Basic authentication support
Gabor Somogyi created FLINK-22845: - Summary: Add Basic authentication support Key: FLINK-22845 URL: https://issues.apache.org/jira/browse/FLINK-22845 Project: Flink Issue Type: Sub-task Components: Runtime / REST, Runtime / Web Frontend Affects Versions: 1.13.1 Reporter: Gabor Somogyi -- This message was sent by Atlassian Jira (v8.3.4#803005)