Re: I create a hotfix pull request in `MiniCluster.java`, thanks!
Hi Jark, thanks for the advice! I wish I can contribute more to the Flink project! Yuan Zuo (左元) | Email: zuoyua...@126.com On 11/26/2020 13:57, Jark Wu wrote: Hi Yuan, Thanks for contributing to Flink. I have helped to merge this PR. For pull requests without a JIRA id, it would be better to ping/request a review from the committers in the PR (there is a suggested reviewer in the right sidebar), because such pull requests usually don't get noticed by committers in time. Best, Jark On Wed, 25 Nov 2020 at 16:45, 左元 wrote: > The Pull Request Number is #14211. > > > Fix typo `dispatcherResourceManagreComponentRpcServiceFactory` -> > `dispatcherResourceManagerComponentRpcServiceFactory` in `MiniCluster.java` > > > > Best > Regards > > Yuan Zuo
[jira] [Created] (FLINK-20367) Show the in-use config of job to users
zlzhang0122 created FLINK-20367: --- Summary: Show the in-use config of job to users Key: FLINK-20367 URL: https://issues.apache.org/jira/browse/FLINK-20367 Project: Flink Issue Type: Improvement Reporter: zlzhang0122 Currently the config can be set both from the global cluster configuration and from the code of a single job, so we cannot be absolutely sure which config is in use unless we check the start-up log. I think we could show the in-use config of a job to users; this would be helpful!  -- This message was sent by Atlassian Jira (v8.3.4#803005)
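The ambiguity described above arises because a key set in the cluster-wide configuration can be silently overridden in job code. A minimal sketch of resolving the effective, in-use value per key (hypothetical helper names, not Flink's actual API):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not Flink code: job-level settings override
// cluster-wide defaults, and the merged view is what is actually in use.
public class EffectiveConfig {
    public static Map<String, String> effective(Map<String, String> cluster,
                                                Map<String, String> job) {
        Map<String, String> merged = new LinkedHashMap<>(cluster);
        merged.putAll(job); // job code wins over flink-conf.yaml
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> cluster = Map.of("parallelism.default", "1");
        Map<String, String> job = Map.of("parallelism.default", "4");
        // Without a report like this, only the start-up log reveals the winner.
        System.out.println(EffectiveConfig.effective(cluster, job));
    }
}
```

Exposing such a merged view in the UI or REST API would answer "which config is in use" without digging through logs.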
[jira] [Created] (FLINK-20366) ColumnIntervalUtil#getColumnIntervalWithFilter does not consider the case when the predicate is a false constant
Caizhi Weng created FLINK-20366: --- Summary: ColumnIntervalUtil#getColumnIntervalWithFilter does not consider the case when the predicate is a false constant Key: FLINK-20366 URL: https://issues.apache.org/jira/browse/FLINK-20366 Project: Flink Issue Type: Bug Components: Table SQL / Planner Reporter: Caizhi Weng Fix For: 1.12.0 To reproduce this bug, add the following test case to {{DeadlockBreakupTest.scala}} {code:scala} @Test def testSubplanReuse_DeadlockCausedByReusingExchangeInAncestor(): Unit = { util.tableEnv.getConfig.getConfiguration.setBoolean( OptimizerConfigOptions.TABLE_OPTIMIZER_REUSE_SUB_PLAN_ENABLED, true) util.tableEnv.getConfig.getConfiguration.setBoolean( OptimizerConfigOptions.TABLE_OPTIMIZER_MULTIPLE_INPUT_ENABLED, false) util.tableEnv.getConfig.getConfiguration.setString( ExecutionConfigOptions.TABLE_EXEC_DISABLED_OPERATORS, "NestedLoopJoin,SortMergeJoin") val sqlQuery = """ |WITH T1 AS (SELECT x1.*, x2.a AS k, x2.b AS v FROM x x1 LEFT JOIN x x2 ON x1.a = x2.a WHERE x2.b > 0) |SELECT x.a, T1.* FROM x LEFT JOIN T1 ON x.a = T1.k WHERE x.b > 0 AND T1.v = 0 |""".stripMargin util.verifyPlan(sqlQuery) } {code} And we'll get the exception stack {code} java.lang.RuntimeException: Error while applying rule FlinkLogicalJoinConverter(in:NONE,out:LOGICAL), args [rel#414:LogicalJoin.NONE.any.[](left=RelSubset#406,right=RelSubset#413,condition==($0, $4),joinType=inner)] at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:256) at org.apache.calcite.plan.volcano.IterativeRuleDriver.drive(IterativeRuleDriver.java:58) at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:510) at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:312) at org.apache.flink.table.planner.plan.optimize.program.FlinkVolcanoProgram.optimize(FlinkVolcanoProgram.scala:64) at org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:62) at 
org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:58) at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157) at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157) at scala.collection.Iterator$class.foreach(Iterator.scala:891) at scala.collection.AbstractIterator.foreach(Iterator.scala:1334) at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157) at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:104) at org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram.optimize(FlinkChainedProgram.scala:57) at org.apache.flink.table.planner.plan.optimize.BatchCommonSubGraphBasedOptimizer.optimizeTree(BatchCommonSubGraphBasedOptimizer.scala:86) at org.apache.flink.table.planner.plan.optimize.BatchCommonSubGraphBasedOptimizer.org$apache$flink$table$planner$plan$optimize$BatchCommonSubGraphBasedOptimizer$$optimizeBlock(BatchCommonSubGraphBasedOptimizer.scala:57) at org.apache.flink.table.planner.plan.optimize.BatchCommonSubGraphBasedOptimizer$$anonfun$doOptimize$1.apply(BatchCommonSubGraphBasedOptimizer.scala:45) at org.apache.flink.table.planner.plan.optimize.BatchCommonSubGraphBasedOptimizer$$anonfun$doOptimize$1.apply(BatchCommonSubGraphBasedOptimizer.scala:45) at scala.collection.immutable.List.foreach(List.scala:392) at org.apache.flink.table.planner.plan.optimize.BatchCommonSubGraphBasedOptimizer.doOptimize(BatchCommonSubGraphBasedOptimizer.scala:45) at org.apache.flink.table.planner.plan.optimize.CommonSubGraphBasedOptimizer.optimize(CommonSubGraphBasedOptimizer.scala:77) at org.apache.flink.table.planner.delegation.PlannerBase.optimize(PlannerBase.scala:286) at 
org.apache.flink.table.planner.utils.TableTestUtilBase.getOptimizedPlan(TableTestBase.scala:431) at org.apache.flink.table.planner.utils.TableTestUtilBase.doVerifyPlan(TableTestBase.scala:348) at org.apache.flink.table.planner.utils.TableTestUtilBase.verifyPlan(TableTestBase.scala:271) at org.apache.flink.table.planner.plan.batch.sql.DeadlockBreakupTest.testSubplanReuse_DeadlockCausedByReusingExchangeInAncestor(DeadlockBreakupTest.scala:248) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at
[ANNOUNCE] release-1.12 branch cut
Hi all, We have already cut the release-1.12 branch from the master branch, based on commit c008907d2a629449c8d0ad9725d13b0604fc2141. Please make sure that a PR is merged to both the master and the release-1.12 branch if you want it to be present in 1.12.0. Please also set the correct fix version in the JIRA, according to which branch you have merged your code into. Pay special attention to this if you have merged something to master today (Thursday), as your commit might have ended up before or after the release cut. Regards, Robert & Dian
Re: I create a hotfix pull request in `MiniCluster.java`, thanks!
Hi Yuan, Thanks for contributing to Flink. I have helped to merge this PR. For pull requests without a JIRA id, it would be better to ping/request a review from the committers in the PR (there is a suggested reviewer in the right sidebar), because such pull requests usually don't get noticed by committers in time. Best, Jark On Wed, 25 Nov 2020 at 16:45, 左元 wrote: > The Pull Request Number is #14211. > > > Fix typo `dispatcherResourceManagreComponentRpcServiceFactory` -> > `dispatcherResourceManagerComponentRpcServiceFactory` in `MiniCluster.java` > > > > Best > Regards > > Yuan Zuo
[jira] [Created] (FLINK-20365) The native k8s cluster could not be unregistered when executing Python DataStream application attachedly.
Shuiqiang Chen created FLINK-20365: -- Summary: The native k8s cluster could not be unregistered when executing Python DataStream application attachedly. Key: FLINK-20365 URL: https://issues.apache.org/jira/browse/FLINK-20365 Project: Flink Issue Type: Bug Components: API / Python Affects Versions: 1.12.0 Reporter: Shuiqiang Chen -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20364) Add support for scheduling with slot sharing
Guruh Fajar Samudra created FLINK-20364: --- Summary: Add support for scheduling with slot sharing Key: FLINK-20364 URL: https://issues.apache.org/jira/browse/FLINK-20364 Project: Flink Issue Type: Test Components: Runtime / Coordination Affects Versions: statefun-2.2.1 Reporter: Guruh Fajar Samudra Fix For: statefun-2.2.2 In order to reach feature equivalence with the old code base, we should add support for scheduling with slot sharing to the SlotPool. This will also allow us to run all the IT cases based on the {{AbstractTestBase}} on the Flip-6 {{MiniCluster}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20363) "Batch SQL end-to-end test" failed during shutdown
Dian Fu created FLINK-20363: --- Summary: "Batch SQL end-to-end test" failed during shutdown Key: FLINK-20363 URL: https://issues.apache.org/jira/browse/FLINK-20363 Project: Flink Issue Type: Test Components: Table SQL / Planner Affects Versions: 1.11.2 Reporter: Dian Fu https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10138=logs=08866332-78f7-59e4-4f7e-49a56faa3179=3e8647c1-5a28-5917-dd93-bf78594ea994 {code} 2020-11-25T23:03:39.0657020Z == 2020-11-25T23:03:39.0658778Z Running 'Batch SQL end-to-end test' 2020-11-25T23:03:39.0659508Z == 2020-11-25T23:03:39.0802908Z TEST_DATA_DIR: /home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-39079497862 2020-11-25T23:03:39.2712316Z Flink dist directory: /home/vsts/work/1/s/flink-dist/target/flink-1.11-SNAPSHOT-bin/flink-1.11-SNAPSHOT 2020-11-25T23:03:39.3940809Z Starting cluster. 2020-11-25T23:03:40.2900610Z Starting standalonesession daemon on host fv-az510-522. 2020-11-25T23:03:41.8889378Z Starting taskexecutor daemon on host fv-az510-522. 2020-11-25T23:03:41.9309757Z Waiting for Dispatcher REST endpoint to come up... 2020-11-25T23:03:42.9869550Z Waiting for Dispatcher REST endpoint to come up... 2020-11-25T23:03:44.0431842Z Waiting for Dispatcher REST endpoint to come up... 2020-11-25T23:03:45.4007523Z Waiting for Dispatcher REST endpoint to come up... 2020-11-25T23:03:46.5129880Z Dispatcher REST endpoint is up. 2020-11-25T23:03:53.3634621Z Job has been submitted with JobID 2abe4546a6de428f2c19d51de93a5280 2020-11-25T23:03:56.1830568Z pass BatchSQL 2020-11-25T23:03:56.5054769Z Stopping taskexecutor daemon (pid: 56867) on host fv-az510-522. 2020-11-25T23:03:56.7417091Z Stopping standalonesession daemon (pid: 56577) on host fv-az510-522. 2020-11-25T23:03:57.1488127Z Skipping taskexecutor daemon (pid: 56806), because it is not running anymore on fv-az510-522. 2020-11-25T23:03:57.1490284Z Skipping taskexecutor daemon (pid: 57177), because it is not running anymore on fv-az510-522. 
2020-11-25T23:03:57.1492572Z Skipping taskexecutor daemon (pid: 57526), because it is not running anymore on fv-az510-522. 2020-11-25T23:03:57.1493779Z Stopping taskexecutor daemon (pid: 57905) on host fv-az510-522. 2020-11-25T23:03:57.1495287Z /home/vsts/work/1/s/flink-dist/target/flink-1.11-SNAPSHOT-bin/flink-1.11-SNAPSHOT/bin/taskmanager.sh: line 99: 57905 Terminated "${FLINK_BIN_DIR}"/flink-daemon.sh $STARTSTOP $ENTRYPOINT "${ARGS[@]}" 2020-11-25T23:03:57.1499330Z [FAIL] Test script contains errors. {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20362) Broken Link in dev/table/sourceSinks.zh.md
Huang Xingbo created FLINK-20362: Summary: Broken Link in dev/table/sourceSinks.zh.md Key: FLINK-20362 URL: https://issues.apache.org/jira/browse/FLINK-20362 Project: Flink Issue Type: Bug Components: Documentation, Table SQL / API Affects Versions: 1.12.0 Reporter: Huang Xingbo Fix For: 1.12.0 When executing the script build_docs.sh, it will throw the following exception: {code:java} Liquid Exception: Could not find document 'dev/table/legacySourceSinks.md' in tag 'link'. Make sure the document exists and the path is correct. in dev/table/sourceSinks.zh.md Could not find document 'dev/table/legacySourceSinks.md' in tag 'link'. {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20361) Using sliding window with duration of hours in Table API returns wrong time
Aleksandra Cz created FLINK-20361: - Summary: Using sliding window with duration of hours in Table API returns wrong time Key: FLINK-20361 URL: https://issues.apache.org/jira/browse/FLINK-20361 Project: Flink Issue Type: Bug Components: Table SQL / API Affects Versions: 1.11.2, 1.11.1, 1.11.0 Environment: Java 11, test executed in IntelliJ IDE on macOS. Reporter: Aleksandra Cz Fix For: 1.11.2 If, in the [Table walkthrough|https://github.com/apache/flink-playgrounds/blob/master/table-walkthrough/src/main/java/org/apache/flink/playgrounds/spendreport/SpendReport.java], the *report* method were implemented as follows: {code:java} public static Table report(Table transactions) { return transactions .window(Slide.over(lit(1).hours()).every(lit(5).minute()).on($("transaction_time")).as("log_ts")) .groupBy($("log_ts"),$("account_id")) .select( $("log_ts").start().as("log_ts_start"), $("log_ts").end().as("log_ts_end"), $("account_id"), $("amount").sum().as("amount")); } {code} then the resulting sliding window start and end timestamps would be in the years 1969/1970. Please see the first 3 elements of the resulting table: {code:java} [1969-12-31T23:05,1970-01-01T00:05,3,432, 1969-12-31T23:10,1970-01-01T00:10,3,432, 1969-12-31T23:15,1970-01-01T00:15,3,432]{code} This behaviour also occurs when using SQL instead of the Table API; it does not occur for window durations of minutes, nor for tumbling windows. -- This message was sent by Atlassian Jira (v8.3.4#803005)
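The 1969/1970 values line up with windows computed relative to epoch timestamp 0: for a record at t = 0 ms, a 1-hour window sliding every 5 minutes has its earliest start 55 minutes before midnight on 1969-12-31, matching the first row above. A minimal sketch of the standard sliding-window start arithmetic (illustrative only, not Flink's assigner code):

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Sketch (not Flink code): enumerate the starts of all sliding windows
// that cover a record with the given epoch-millisecond timestamp.
public class SlidingWindowStarts {
    public static List<Instant> windowStarts(long ts, long sizeMs, long slideMs) {
        long lastStart = ts - Math.floorMod(ts, slideMs); // latest start <= ts
        List<Instant> starts = new ArrayList<>();
        for (long start = lastStart; start > ts - sizeMs; start -= slideMs) {
            starts.add(Instant.ofEpochMilli(start));
        }
        return starts;
    }

    public static void main(String[] args) {
        List<Instant> starts = windowStarts(
                0L, Duration.ofHours(1).toMillis(), Duration.ofMinutes(5).toMillis());
        // Earliest window covering t = 0 starts 55 minutes before the epoch:
        System.out.println(starts.get(starts.size() - 1)); // 1969-12-31T23:05:00Z
    }
}
```

This suggests the windowing itself is consistent; the timestamps fed into it appear to be at (or interpreted as) the epoch rather than the actual event times.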
[jira] [Created] (FLINK-20360) AbstractFetcherWatermarksTest$PeriodicWatermarksSuite.testPeriodicWatermarks is unstable
Robert Metzger created FLINK-20360: -- Summary: AbstractFetcherWatermarksTest$PeriodicWatermarksSuite.testPeriodicWatermarks is unstable Key: FLINK-20360 URL: https://issues.apache.org/jira/browse/FLINK-20360 Project: Flink Issue Type: Bug Components: Connectors / Kafka Affects Versions: 1.12.0 Reporter: Robert Metzger Fix For: 1.12.0 This is a CI PR run, but the change is unrelated: https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10119=logs=c5f0071e-1851-543e-9a45-9ac140befc32=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5 {code} [ERROR] Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.046 s <<< FAILURE! - in org.apache.flink.streaming.connectors.kafka.internals.AbstractFetcherWatermarksTest [ERROR] testPeriodicWatermarks[0](org.apache.flink.streaming.connectors.kafka.internals.AbstractFetcherWatermarksTest$PeriodicWatermarksSuite) Time elapsed: 0.004 s <<< FAILURE! java.lang.AssertionError: expected:<3> but was:<102> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:834) at org.junit.Assert.assertEquals(Assert.java:645) at org.junit.Assert.assertEquals(Assert.java:631) at org.apache.flink.streaming.connectors.kafka.internals.AbstractFetcherWatermarksTest$PeriodicWatermarksSuite.testPeriodicWatermarks(AbstractFetcherWatermarksTest.java:139) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20359) Support adding Owner Reference to Job Manager in native kubernetes setup
Boris Lublinsky created FLINK-20359: --- Summary: Support adding Owner Reference to Job Manager in native kubernetes setup Key: FLINK-20359 URL: https://issues.apache.org/jira/browse/FLINK-20359 Project: Flink Issue Type: New Feature Components: Deployment / Kubernetes Affects Versions: 1.11.2 Reporter: Boris Lublinsky Fix For: 1.12.0 A Flink deployment is often part of a larger application. As a result, synchronized management, i.e. cleaning up Flink resources when the main application is deleted, is important. In Kubernetes, a common approach for such clean-up is the use of owner references (https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/). Adding owner reference support to the Flink JobManager would be a nice addition to Flink's native Kubernetes support to accommodate such use cases. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20358) Support adding Owner Reference to Job Manager in native kubernetes setup
Boris Lublinsky created FLINK-20358: --- Summary: Support adding Owner Reference to Job Manager in native kubernetes setup Key: FLINK-20358 URL: https://issues.apache.org/jira/browse/FLINK-20358 Project: Flink Issue Type: Improvement Components: Deployment / Kubernetes Affects Versions: 1.11.2 Reporter: Boris Lublinsky Fix For: 1.12.0 Flink-based implementations are often used as part of a larger application. In this case, deletion of the main (parent) application typically requires deletion of the Flink cluster. A common way of achieving this in Kubernetes is the use of an owner reference (https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/). Adding an owner reference to the Flink JobManager will fulfill this requirement. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20357) Rework HA documentation page
Till Rohrmann created FLINK-20357: - Summary: Rework HA documentation page Key: FLINK-20357 URL: https://issues.apache.org/jira/browse/FLINK-20357 Project: Flink Issue Type: Sub-task Components: Documentation, Runtime / Coordination Affects Versions: 1.12.0 Reporter: Till Rohrmann Fix For: 1.12.0 We need to rework the HA documentation page. The first step is to split the existing documentation into general concepts as an overview page and HA service implementation specific sub pages. For the implementation specific sub pages we need to add Zookeeper and the K8s HA services. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20356) Rework Mesos deployment documentation page
Till Rohrmann created FLINK-20356: - Summary: Rework Mesos deployment documentation page Key: FLINK-20356 URL: https://issues.apache.org/jira/browse/FLINK-20356 Project: Flink Issue Type: Sub-task Components: Documentation, Runtime / Coordination Affects Versions: 1.12.0 Reporter: Till Rohrmann Fix For: 1.12.0 Similar to FLINK-20347, we need to rework the Mesos deployment documentation page. Additionally, we should validate that everything which is stated in the documentation actually works. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20355) Rework K8s deployment documentation page
Till Rohrmann created FLINK-20355: - Summary: Rework K8s deployment documentation page Key: FLINK-20355 URL: https://issues.apache.org/jira/browse/FLINK-20355 Project: Flink Issue Type: Sub-task Components: Documentation, Runtime / Coordination Affects Versions: 1.12.0 Reporter: Till Rohrmann Fix For: 1.12.0 Similar to FLINK-20347, we need to update the K8s deployment documentation page. Additionally, we should ensure that everything works which is stated in the documentation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20354) Rework standalone deployment documentation page
Till Rohrmann created FLINK-20354: - Summary: Rework standalone deployment documentation page Key: FLINK-20354 URL: https://issues.apache.org/jira/browse/FLINK-20354 Project: Flink Issue Type: Sub-task Components: Documentation, Runtime / Coordination Affects Versions: 1.12.0 Reporter: Till Rohrmann Fix For: 1.12.0 Similar to FLINK-20347 we need to update the standalone deployment documentation page. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20353) Rework logging documentation page
Till Rohrmann created FLINK-20353: - Summary: Rework logging documentation page Key: FLINK-20353 URL: https://issues.apache.org/jira/browse/FLINK-20353 Project: Flink Issue Type: Sub-task Components: Documentation, Runtime / Coordination Affects Versions: 1.12.0 Reporter: Till Rohrmann Fix For: 1.12.0 The logging documentation page needs to be updated and verified. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20352) Rework command line interface documentation page
Till Rohrmann created FLINK-20352: - Summary: Rework command line interface documentation page Key: FLINK-20352 URL: https://issues.apache.org/jira/browse/FLINK-20352 Project: Flink Issue Type: Sub-task Components: Documentation, Runtime / Coordination Affects Versions: 1.12.0 Reporter: Till Rohrmann Fix For: 1.12.0 The command line interface documentation page is quite outdated and not very easy to read. A large part of it is simply the help message from the CLI, which is a wall of text. Ideally, we can loosen the page up a bit and update the examples. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20351) Execution.transitionState does not properly log slot location
Till Rohrmann created FLINK-20351: - Summary: Execution.transitionState does not properly log slot location Key: FLINK-20351 URL: https://issues.apache.org/jira/browse/FLINK-20351 Project: Flink Issue Type: Bug Components: Runtime / Coordination Affects Versions: 1.11.2, 1.12.0 Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 1.12.0, 1.11.3 {{Execution.transitionState}} does not properly log the slot location when reporting the state transition. The problem is that we rely on {{LogicalSlot.toString}} for this information. I suggest explicitly logging the location information, consisting of the hostname and the {{ResourceID}} of the machine on which the {{Execution}} is running. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20350) [Kinesis][GCP PubSub] Incompatible Connectors due to Guava conflict
Danny Cranmer created FLINK-20350: - Summary: [Kinesis][GCP PubSub] Incompatible Connectors due to Guava conflict Key: FLINK-20350 URL: https://issues.apache.org/jira/browse/FLINK-20350 Project: Flink Issue Type: Bug Components: Connectors / Google Cloud PubSub, Connectors / Kinesis Affects Versions: 1.11.2, 1.11.1 Reporter: Danny Cranmer *Problem* Kinesis and GCP PubSub connector do not work together. The following error is thrown. {code} java.lang.NoClassDefFoundError: Could not initialize class io.grpc.netty.shaded.io.grpc.netty.NettyChannelBuilder at org.apache.flink.streaming.connectors.gcp.pubsub.DefaultPubSubSubscriberFactory.getSubscriber(DefaultPubSubSubscriberFactory.java:52) ~[flink-connector-gcp-pubsub_2.11-1.11.1.jar:1.11.1] at org.apache.flink.streaming.connectors.gcp.pubsub.PubSubSource.createAndSetPubSubSubscriber(PubSubSource.java:213) ~[flink-connector-gcp-pubsub_2.11-1.11.1.jar:1.11.1] at org.apache.flink.streaming.connectors.gcp.pubsub.PubSubSource.open(PubSubSource.java:102) ~[flink-connector-gcp-pubsub_2.11-1.11.1.jar:1.11.1] at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) ~[flink-core-1.11.1.jar:1.11.1] at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:102) ~[flink-streaming-java_2.11-1.11.1.jar:1.11.1] at org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:291) ~[flink-streaming-java_2.11-1.11.1.jar:1.11.1] at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$0(StreamTask.java:473) ~[flink-streaming-java_2.11-1.11.1.jar:1.11.1] at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:92) ~[flink-streaming-java_2.11-1.11.1.jar:1.11.1] at org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:469) ~[flink-streaming-java_2.11-1.11.1.jar:1.11.1] at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:522) ~[flink-streaming-java_2.11-1.11.1.jar:1.11.1] at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721) ~[flink-runtime_2.11-1.11.1.jar:1.11.1] at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546) ~[flink-runtime_2.11-1.11.1.jar:1.11.1] at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_252] {code} *Cause* This is caused by a Guava dependency conflict: - Kinesis Consumer > {{18.0}} - GCP PubSub > {{26.0-android}} {{NettyChannelBuilder}} fails to initialise due to missing method in guava: - {{com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;CLjava/lang/Object;)V}} *Possible Fixes* - Align Guava versions - Shade Guava in either connector -- This message was sent by Atlassian Jira (v8.3.4#803005)
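The JVM descriptor quoted above, {{(ZLjava/lang/String;CLjava/lang/Object;)V}}, corresponds to {{Preconditions.checkArgument(boolean, String, char, Object)}}, an overload that the Guava 18.0 pulled in by the Kinesis consumer lacks. A diagnostic sketch (not part of either connector) that probes whether a class on the classpath exposes that overload:

```java
// Diagnostic sketch: check whether the named class has the
// checkArgument(boolean, String, char, Object) overload that
// grpc-netty's NettyChannelBuilder needs. An older Guava pulled in
// first by another connector will make this return false.
public class GuavaProbe {
    public static boolean hasCheckArgument(String className) {
        try {
            Class.forName(className).getMethod("checkArgument",
                    boolean.class, String.class, char.class, Object.class);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // On an application classpath, probe the effective Guava:
        System.out.println(hasCheckArgument("com.google.common.base.Preconditions"));
    }
}
```

Running such a probe inside the job helps confirm which Guava "wins" on the user classpath before choosing between aligning versions and shading.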
[jira] [Created] (FLINK-20349) Query fails with "A conflict is detected. This is unexpected."
Rui Li created FLINK-20349: -- Summary: Query fails with "A conflict is detected. This is unexpected." Key: FLINK-20349 URL: https://issues.apache.org/jira/browse/FLINK-20349 Project: Flink Issue Type: Bug Components: Table SQL / Planner Reporter: Rui Li Fix For: 1.13.0 The test case to reproduce: {code} @Test public void test() throws Exception { tableEnv.executeSql("create table src(key string,val string)"); tableEnv.executeSql("SELECT sum(char_length(src5.src1_value)) FROM " + "(SELECT src3.*, src4.val as src4_value, src4.key as src4_key FROM src src4 JOIN " + "(SELECT src2.*, src1.key as src1_key, src1.val as src1_value FROM src src1 JOIN src src2 ON src1.key = src2.key) src3 " + "ON src3.src1_key = src4.key) src5").collect(); } {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20348) Make "schema-registry.subject" optional for Kafka sink with avro-confluent format
Jark Wu created FLINK-20348: --- Summary: Make "schema-registry.subject" optional for Kafka sink with avro-confluent format Key: FLINK-20348 URL: https://issues.apache.org/jira/browse/FLINK-20348 Project: Flink Issue Type: Improvement Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table SQL / Ecosystem Reporter: Jark Wu Fix For: 1.12.0 Currently, the configuration "schema-registry.subject" in the avro-confluent format is required by the sink. However, it is quite verbose to set it manually. By default, it can be set to {{<topic>-key}} and {{<topic>-value}} when the format is used with the kafka or upsert-kafka connector. This would also make the 'avro-confluent' format handier and work better with the Kafka/Confluent ecosystem. {code:sql} CREATE TABLE kafka_gmv ( day_str STRING, gmv BIGINT, PRIMARY KEY (day_str) NOT ENFORCED ) WITH ( 'connector' = 'upsert-kafka', 'topic' = 'kafka_gmv', 'properties.bootstrap.servers' = 'localhost:9092', -- 'key.format' = 'raw', 'key.format' = 'avro-confluent', 'key.avro-confluent.schema-registry.url' = 'http://localhost:8181', 'key.avro-confluent.schema-registry.subject' = 'kafka_gmv-key', 'value.format' = 'avro-confluent', 'value.avro-confluent.schema-registry.url' = 'http://localhost:8181', 'value.avro-confluent.schema-registry.subject' = 'kafka_gmv-value' ); {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
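The proposed default mirrors the Confluent topic-name convention already visible in the DDL above ('kafka_gmv-key' / 'kafka_gmv-value' for topic 'kafka_gmv'). A minimal sketch of the fallback (hypothetical helper, not the actual Flink implementation):

```java
import java.util.Optional;

// Hypothetical sketch of the proposed fallback: when
// 'schema-registry.subject' is not configured, derive it from the Kafka
// topic name, matching the Confluent <topic>-key / <topic>-value convention.
public class SubjectDefaults {
    public static String subjectFor(Optional<String> configured,
                                    String topic, boolean isKeyFormat) {
        return configured.orElseGet(() -> topic + (isKeyFormat ? "-key" : "-value"));
    }

    public static void main(String[] args) {
        System.out.println(subjectFor(Optional.empty(), "kafka_gmv", true));  // kafka_gmv-key
        System.out.println(subjectFor(Optional.empty(), "kafka_gmv", false)); // kafka_gmv-value
        // An explicitly configured subject still takes precedence:
        System.out.println(subjectFor(Optional.of("custom-subject"), "kafka_gmv", true));
    }
}
```

With such a fallback, the two 'schema-registry.subject' lines in the DDL above could simply be dropped.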
[jira] [Created] (FLINK-20347) Rework YARN deployment documentation page
Robert Metzger created FLINK-20347: -- Summary: Rework YARN deployment documentation page Key: FLINK-20347 URL: https://issues.apache.org/jira/browse/FLINK-20347 Project: Flink Issue Type: Sub-task Components: Deployment / YARN, Documentation Reporter: Robert Metzger Assignee: Robert Metzger Fix For: 1.12.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20346) Explain ChangelogMode for sinks
Timo Walther created FLINK-20346: Summary: Explain ChangelogMode for sinks Key: FLINK-20346 URL: https://issues.apache.org/jira/browse/FLINK-20346 Project: Flink Issue Type: Improvement Components: Table SQL / Planner Reporter: Timo Walther When explaining an `INSERT INTO` statement, the output does not show a changelog mode. However, this might be useful for users to know which kind of updates end up in a connector such as Upsert Kafka. For example: {code} String initialValues = "INSERT INTO upsert_kafka\n" + "VALUES\n" + " (1, 'name 1', TIMESTAMP '2020-03-08 13:12:11.123', 100, 41, 'payload 1'),\n" + " (2, 'name 2', TIMESTAMP '2020-03-09 13:12:11.123', 101, 42, 'payload 2'),\n" + " (3, 'name 3', TIMESTAMP '2020-03-10 13:12:11.123', 102, 43, 'payload 3'),\n" + " (2, 'name 2', TIMESTAMP '2020-03-11 13:12:11.123', 101, 42, 'payload')"; System.out.println(tEnv.explainSql(initialValues, ExplainDetail.CHANGELOG_MODE)); {code} Leads to `changelogMode=[NONE]`: {code} == Optimized Logical Plan == Sink(table=[default_catalog.default_database.upsert_kafka], fields=[k_user_id, name, k_event_id, user_id, payload, timestamp], changelogMode=[NONE]) +- Calc(select=[CAST(EXPR$0) AS k_user_id, CAST(EXPR$1) AS name, CAST(EXPR$3) AS k_event_id, CAST(EXPR$4) AS user_id, CAST(EXPR$5) AS payload, CAST(EXPR$2) AS timestamp], changelogMode=[I]) +- Values(type=[RecordType(INTEGER EXPR$0, CHAR(6) EXPR$1, TIMESTAMP(3) EXPR$2, INTEGER EXPR$3, INTEGER EXPR$4, VARCHAR(9) EXPR$5)], tuples=[[{ 1, _UTF-16LE'name 1', 2020-03-08 13:12:11.123, 100, 41, _UTF-16LE'payload 1' }, { 2, _UTF-16LE'name 2', 2020-03-09 13:12:11.123, 101, 42, _UTF-16LE'payload 2' }, { 3, _UTF-16LE'name 3', 2020-03-10 13:12:11.123, 102, 43, _UTF-16LE'payload 3' }, { 2, _UTF-16LE'name 2', 2020-03-11 13:12:11.123, 101, 42, _UTF-16LE'payload' }]], changelogMode=[I]) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20345) Adds an Expand node only when there is more than one distinct aggregate function in an Aggregate when executing SplitAggregateRule
zhangqingru created FLINK-20345: --- Summary: Adds an Expand node only when there is more than one distinct aggregate function in an Aggregate when executing SplitAggregateRule Key: FLINK-20345 URL: https://issues.apache.org/jira/browse/FLINK-20345 Project: Flink Issue Type: Improvement Components: Table SQL / Planner Affects Versions: 1.11.2 Reporter: zhangqingru Fix For: 1.11.3 As mentioned in the [Flink Document|https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/tuning/streaming_aggregation_optimization.html], we can split a distinct aggregation to solve data skew on distinct keys, which is a very good optimization. However, an unnecessary `Expand` node is generated in some special cases, like the following SQL. {code:java} SELECT COUNT(c) AS pv, COUNT(DISTINCT c) AS uv FROM T GROUP BY a {code} The plan is as follows; the Expand node and the filter conditions in the aggregate functions could be removed. {code:java} Sink(name=[DataStreamTableSink], fields=[pv, uv]) +- Calc(select=[pv, uv]) +- GroupAggregate(groupBy=[a], partialFinalType=[FINAL], select=[a, $SUM0_RETRACT($f2_0) AS $f1, $SUM0_RETRACT($f3) AS $f2]) +- Exchange(distribution=[hash[a]]) +- GroupAggregate(groupBy=[a, $f2], partialFinalType=[PARTIAL], select=[a, $f2, COUNT(c) FILTER $g_1 AS $f2_0, COUNT(DISTINCT c) FILTER $g_0 AS $f3]) +- Exchange(distribution=[hash[a, $f2]]) +- Calc(select=[a, c, $f2, =($e, 1) AS $g_1, =($e, 0) AS $g_0]) +- Expand(projects=[{a=[$0], c=[$1], $f2=[$2], $e=[0]}, {a=[$0], c=[$1], $f2=[null], $e=[1]}]) +- Calc(select=[a, c, MOD(HASH_CODE(c), 1024) AS $f2]) +- MiniBatchAssigner(interval=[1000ms], mode=[ProcTime]) +- DataStreamScan(table=[[default_catalog, default_database, T]], fields=[a, b, c]){code} An `Expand` node is only necessary when multiple aggregate functions with different distinct keys appear in one Aggregate. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20344) Modify the default value of the flink-conf savepoint folder to distinguish the checkpoint folder
OpenOpened created FLINK-20344: -- Summary: Modify the default value of the flink-conf savepoint folder to distinguish the checkpoint folder Key: FLINK-20344 URL: https://issues.apache.org/jira/browse/FLINK-20344 Project: Flink Issue Type: Bug Components: Documentation, flink-contrib Affects Versions: 1.11.2 Reporter: OpenOpened Shouldn't the savepoints directory in the flink-conf.yaml file default to a path ending in flink-savepoints, instead of the same value as the flink-checkpoints configuration? Currently: state.*checkpoints*.dir: hdfs://namenode-host:port/flink-checkpoints state.*savepoints*.dir: hdfs://namenode-host:port/flink-checkpoints After modification: state.*savepoints*.dir: hdfs://namenode-host:port/flink-*savepoints* -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20343) Add overview / reference architecture page
Robert Metzger created FLINK-20343: -- Summary: Add overview / reference architecture page Key: FLINK-20343 URL: https://issues.apache.org/jira/browse/FLINK-20343 Project: Flink Issue Type: Sub-task Components: Documentation, Runtime / Coordination Reporter: Robert Metzger Assignee: Robert Metzger Fix For: 1.12.0 To properly guide users, we should add some generic overview of the deployment concepts. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20342) Revisit page structure
Robert Metzger created FLINK-20342: -- Summary: Revisit page structure Key: FLINK-20342 URL: https://issues.apache.org/jira/browse/FLINK-20342 Project: Flink Issue Type: Sub-task Components: Documentation, Runtime / Coordination Reporter: Robert Metzger Assignee: Till Rohrmann Fix For: 1.12.0 Clean up page structure -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20341) Rework Deployment / Coordination Documentation
Robert Metzger created FLINK-20341: -- Summary: Rework Deployment / Coordination Documentation Key: FLINK-20341 URL: https://issues.apache.org/jira/browse/FLINK-20341 Project: Flink Issue Type: Task Components: Documentation, Runtime / Coordination Reporter: Robert Metzger Assignee: Robert Metzger Fix For: 1.12.0 Problems: - The Clusters & Deployment pages are very inhomogeneous - The overview page has good intentions, but is a huge wall of text - Native K8s and YARN have a “Background / Internals” page - The difference between Local Cluster and Standalone Cluster is unclear Goals: - Deploying a Flink cluster is one of the first tasks when getting to know Flink. We need proper guidance for making these steps a success - We need a proper separation between general concepts (HA, session/per-job mode) and implementations of them (ZK HA, K8s HA, YARN session, …). Also orthogonal aspects such as FileSystems, Plugins, Security etc. Related work: https://cwiki.apache.org/confluence/display/FLINK/FLIP-42%3A+Rework+Flink+Documentation (see “Deployment Section”) -- This message was sent by Atlassian Jira (v8.3.4#803005)
I create a hotfix pull request in `MiniCluster.java`, thanks!
The Pull Request Number is #14211. Fix typo `dispatcherResourceManagreComponentRpcServiceFactory` -> `dispatcherResourceManagerComponentRpcServiceFactory` in `MiniCluster.java` Best Regards Yuan Zuo