[jira] [Commented] (FLINK-34528) Disconnect TM in JM when TM was killed to further reduce the job restart time
[ https://issues.apache.org/jira/browse/FLINK-34528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17821132#comment-17821132 ] Lijie Wang commented on FLINK-34528: Does this exception always mean that the TM is killed or unavailable? I'm a bit doubtful. > Disconnect TM in JM when TM was killed to further reduce the job restart time > - > > Key: FLINK-34528 > URL: https://issues.apache.org/jira/browse/FLINK-34528 > Project: Flink > Issue Type: Sub-task >Reporter: junzhong qin >Assignee: junzhong qin >Priority: Not a Priority > Attachments: image-2024-02-27-16-35-04-464.png > > > In https://issues.apache.org/jira/browse/FLINK-34526 we disconnect the killed > TM in the RM. But in the following scenario, we can further reduce the restart > time. > h3. Phenomenon > In the test case, the pipeline looks like: > !image-2024-02-27-16-35-04-464.png! > The Source: Custom Source operator generates strings, and the job keys the > strings before Sink: Unnamed. > # parallelism = 100 > # taskmanager.numberOfTaskSlots = 2 > # checkpointing disabled > The worker was killed at: > {code:java} > 2024-02-27 16:41:49,982 INFO > org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Sink: > Unnamed (6/100) > (2f1c7b22098a273f5471e3e8f794e1d3_0a448493b4782967b150582570326227_5_0) > switched from RUNNING to FAILED on > container_e2472_1705993319725_62292_01_46 @ xxx > (dataPort=38827).org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: > Connection unexpectedly closed by remote task manager > 'xxx/10.169.18.138:35983 [ > container_e2472_1705993319725_62292_01_10(xxx:5454) ] '. 
This might > indicate that the remote task manager was lost.at > org.apache.flink.runtime.io.network.netty.CreditBasedPartitionRequestClientHandler.channelInactive(CreditBasedPartitionRequestClientHandler.java:134) > at > org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) > at > org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) > at > org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241) > at > org.apache.flink.shaded.netty4.io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:81) > at > org.apache.flink.runtime.io.network.netty.NettyMessageClientDecoderDelegate.channelInactive(NettyMessageClientDecoderDelegate.java:94) > at > org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) > at > org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) > at > org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241) > at > org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1405) > at > org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) > at > org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) > at > org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:901) > at > 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:813) > at > org.apache.flink.shaded.netty4.io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) > at > org.apache.flink.shaded.netty4.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) > at > org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) > at > org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:403) > at > org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) > at > org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) > at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_252]{code} > {code:java} > // The task was scheduled to a task manager that had already been killed > 2024-02-27 16:41:53,506 INFO > org.apache.flink.runtime.execu
[jira] [Comment Edited] (FLINK-34358) flink-connector-jdbc nightly fails with "Expecting code to raise a throwable"
[ https://issues.apache.org/jira/browse/FLINK-34358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818346#comment-17818346 ] Lijie Wang edited comment on FLINK-34358 at 2/19/24 6:15 AM: - The test passes in Flink 1.18 but fails in Flink 1.19. The root cause is that after FLINK-34316, the code that threw this exception is no longer called. In Flink 1.18 the exception was thrown in {{JdbcDynamicTableSource.getScanRuntimeProvider}}, a code path that FLINK-34316 omitted. Calling {{executeSql}} instead of {{sqlQuery}} should fix it. I will prepare a PR soon. was (Author: wanglijie95): The test passes in Flink 1.18 but fails in Flink 1.19. The root cause is that after FLINK-34316, the code that caused the exception is no longer called. In Flink 1.18 the exception was thrown in {{JdbcDynamicTableSource.getScanRuntimeProvider}}, a code path that FLINK-34316 omitted. Calling {{executeSql}} instead of {{sqlQuery}} should fix it. I will prepare a PR soon. > flink-connector-jdbc nightly fails with "Expecting code to raise a throwable" > - > > Key: FLINK-34358 > URL: https://issues.apache.org/jira/browse/FLINK-34358 > Project: Flink > Issue Type: Bug > Components: Connectors / JDBC >Reporter: Martijn Visser >Assignee: Lijie Wang >Priority: Blocker > Labels: pull-request-available, test-stability > > https://github.com/apache/flink-connector-jdbc/actions/runs/7770283211/job/21190280602#step:14:346 > {code:java} > [INFO] Running > org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest > Error: Tests run: 19, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: > 0.554 s <<< FAILURE! - in > org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest > Error: > org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest.testDataTypeValidate(TestItem)[19] > Time elapsed: 0.018 s <<< FAILURE! > java.lang.AssertionError: > Expecting code to raise a throwable. 
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 > s - in org.apache.flink.connector.jdbc.catalog.JdbcCatalogUtilsTest > [INFO] Running org.apache.flink.architecture.ProductionCodeArchitectureTest > [INFO] Running org.apache.flink.architecture.ProductionCodeArchitectureBase > [INFO] Running org.apache.flink.architecture.rules.ApiAnnotationRules > [INFO] Tests run: 20, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.155 > s - in org.apache.flink.connector.jdbc.dialect.JdbcDialectTypeTest > [INFO] Running org.apache.flink.architecture.TestCodeArchitectureTest > [INFO] Running org.apache.flink.architecture.TestCodeArchitectureTestBase > [INFO] Running org.apache.flink.architecture.rules.ITCaseRules > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.109 > s - in org.apache.flink.architecture.rules.ApiAnnotationRules > [INFO] Running org.apache.flink.architecture.rules.TableApiRules > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.024 > s - in org.apache.flink.architecture.rules.TableApiRules > [INFO] Running org.apache.flink.architecture.rules.ConnectorRules > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.31 s > - in org.apache.flink.architecture.rules.ConnectorRules > [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.464 > s - in org.apache.flink.architecture.ProductionCodeArchitectureBase > [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.468 > s - in org.apache.flink.architecture.ProductionCodeArchitectureTest > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.758 > s - in org.apache.flink.architecture.rules.ITCaseRules > [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.761 > s - in org.apache.flink.architecture.TestCodeArchitectureTestBase > [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.775 > s - in 
org.apache.flink.architecture.TestCodeArchitectureTest > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 110.38 > s - in > org.apache.flink.connector.jdbc.databases.oracle.xa.OracleExactlyOnceSinkE2eTest > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: > 172.591 s - in > org.apache.flink.connector.jdbc.databases.db2.xa.Db2ExactlyOnceSinkE2eTest > [INFO] > [INFO] Results: > [INFO] > Error: Failures: > Error: > PostgresDialectTypeTest>JdbcDialectTypeTest.testDataTypeValidate:102 > Expecting code to raise a throwable. > Error:TrinoDialectTypeTest>JdbcDialectTypeTest.testDataTypeValidate:102 > Expecting code to raise a throwable. > Error:CrateDBDialectTypeTest>JdbcDialectTypeTest.testDataTypeValidate:102 > Expecting code to raise a throwable. > [INFO
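The fix in the comment above hinges on *when* planning happens: the suggestion is that {{sqlQuery}} no longer reaches the code path that throws, while {{executeSql}} forces full translation so the exception surfaces again. Below is a minimal lazy-vs-eager sketch of that distinction in plain Java, with no Flink dependencies; the method and type names are illustrative stand-ins, not Flink's API.

```java
import java.util.function.Supplier;

// Illustrative model (not Flink's API) of why a deferred call can hide
// an exception that an eager call surfaces.
public class LazyPlanning {

    // Planning is where the "Unsupported type" exception originates.
    static String plan(String columnType) {
        if (columnType.startsWith("TIMESTAMP_LTZ")) {
            throw new UnsupportedOperationException("Unsupported type:" + columnType);
        }
        return "plan[" + columnType + "]";
    }

    // Models a sqlQuery-style call that defers planning: nothing is
    // validated until the returned plan is actually used.
    static Supplier<String> deferredQuery(String columnType) {
        return () -> plan(columnType);
    }

    // Models an executeSql-style call that plans eagerly, so the
    // exception is thrown right here.
    static String eagerExecute(String columnType) {
        return plan(columnType);
    }

    public static void main(String[] args) {
        // Deferred: returns without throwing, so an assertThatThrownBy-style
        // test around this call would see nothing raised.
        Supplier<String> pending = deferredQuery("TIMESTAMP_LTZ(3)");
        System.out.println("deferred call returned without throwing");
        try {
            pending.get(); // forcing the plan finally throws
        } catch (UnsupportedOperationException e) {
            System.out.println("on use: " + e.getMessage());
        }
        try {
            eagerExecute("TIMESTAMP_LTZ(3)"); // eager path throws immediately
        } catch (UnsupportedOperationException e) {
            System.out.println("eager: " + e.getMessage());
        }
    }
}
```

In this model, switching the test from the deferred call to the eager one is exactly what makes "expecting code to raise a throwable" pass again.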
[jira] [Commented] (FLINK-34358) flink-connector-jdbc nightly fails with "Expecting code to raise a throwable"
[ https://issues.apache.org/jira/browse/FLINK-34358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818347#comment-17818347 ] Lijie Wang commented on FLINK-34358: The exception stack in 1.18: {code:java} java.lang.UnsupportedOperationException: Unsupported type:TIMESTAMP_LTZ(3) at org.apache.flink.connector.jdbc.converter.AbstractJdbcRowConverter.createInternalConverter(AbstractJdbcRowConverter.java:187) at org.apache.flink.connector.jdbc.databases.trino.dialect.TrinoRowConverter.createInternalConverter(TrinoRowConverter.java:60) at org.apache.flink.connector.jdbc.converter.AbstractJdbcRowConverter.createNullableInternalConverter(AbstractJdbcRowConverter.java:118) at org.apache.flink.connector.jdbc.converter.AbstractJdbcRowConverter.(AbstractJdbcRowConverter.java:68) at org.apache.flink.connector.jdbc.databases.trino.dialect.TrinoRowConverter.(TrinoRowConverter.java:40) at org.apache.flink.connector.jdbc.databases.trino.dialect.TrinoDialect.getRowConverter(TrinoDialect.java:49) at org.apache.flink.connector.jdbc.table.JdbcDynamicTableSource.getScanRuntimeProvider(JdbcDynamicTableSource.java:184) at org.apache.flink.table.planner.connectors.DynamicSourceUtils.validateScanSource(DynamicSourceUtils.java:478) at org.apache.flink.table.planner.connectors.DynamicSourceUtils.prepareDynamicSource(DynamicSourceUtils.java:161) at org.apache.flink.table.planner.connectors.DynamicSourceUtils.convertSourceToRel(DynamicSourceUtils.java:125) at org.apache.flink.table.planner.plan.schema.CatalogSourceTable.toRel(CatalogSourceTable.java:118) at org.apache.calcite.sql2rel.SqlToRelConverter.toRel(SqlToRelConverter.java:4002) at org.apache.calcite.sql2rel.SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:2872) at org.apache.calcite.sql2rel.SqlToRelConverter.convertFrom(SqlToRelConverter.java:2432) at org.apache.calcite.sql2rel.SqlToRelConverter.convertFrom(SqlToRelConverter.java:2346) at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertFrom(SqlToRelConverter.java:2291) at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:728) at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:714) at org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3848) at org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:618) at org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$rel(FlinkPlannerImpl.scala:229) at org.apache.flink.table.planner.calcite.FlinkPlannerImpl.rel(FlinkPlannerImpl.scala:205) at org.apache.flink.table.planner.operations.SqlNodeConvertContext.toRelRoot(SqlNodeConvertContext.java:69) at org.apache.flink.table.planner.operations.converters.SqlQueryConverter.convertSqlNode(SqlQueryConverter.java:48) at org.apache.flink.table.planner.operations.converters.SqlNodeConverters.convertSqlNode(SqlNodeConverters.java:73) at org.apache.flink.table.planner.operations.SqlNodeToOperationConversion.convertValidatedSqlNode(SqlNodeToOperationConversion.java:272) at org.apache.flink.table.planner.operations.SqlNodeToOperationConversion.convert(SqlNodeToOperationConversion.java:262) at org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:106) at org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:708) at org.apache.flink.connector.jdbc.dialect.JdbcDialectTypeTest.testDataTypeValidate(JdbcDialectTypeTest.java:101) {code} > flink-connector-jdbc nightly fails with "Expecting code to raise a throwable" > - > > Key: FLINK-34358 > URL: https://issues.apache.org/jira/browse/FLINK-34358 > Project: Flink > Issue Type: Bug > Components: Connectors / JDBC >Reporter: Martijn Visser >Assignee: Lijie Wang >Priority: Blocker > Labels: test-stability > > 
https://github.com/apache/flink-connector-jdbc/actions/runs/7770283211/job/21190280602#step:14:346 > {code:java} > [INFO] Running > org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest > Error: Tests run: 19, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: > 0.554 s <<< FAILURE! - in > org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest > Error: > org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest.testDataTypeValidate(TestItem)[19] > Time elapsed: 0.018 s <<< FAILURE! > java.lang.AssertionError: > Expecting code to raise a throwable. > [INFO]
[jira] [Assigned] (FLINK-34358) flink-connector-jdbc nightly fails with "Expecting code to raise a throwable"
[ https://issues.apache.org/jira/browse/FLINK-34358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-34358: -- Assignee: Lijie Wang > flink-connector-jdbc nightly fails with "Expecting code to raise a throwable" > - > > Key: FLINK-34358 > URL: https://issues.apache.org/jira/browse/FLINK-34358 > Project: Flink > Issue Type: Bug > Components: Connectors / JDBC >Reporter: Martijn Visser >Assignee: Lijie Wang >Priority: Blocker > Labels: test-stability > > https://github.com/apache/flink-connector-jdbc/actions/runs/7770283211/job/21190280602#step:14:346 > {code:java} > [INFO] Running > org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest > Error: Tests run: 19, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: > 0.554 s <<< FAILURE! - in > org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest > Error: > org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest.testDataTypeValidate(TestItem)[19] > Time elapsed: 0.018 s <<< FAILURE! > java.lang.AssertionError: > Expecting code to raise a throwable. 
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 > s - in org.apache.flink.connector.jdbc.catalog.JdbcCatalogUtilsTest > [INFO] Running org.apache.flink.architecture.ProductionCodeArchitectureTest > [INFO] Running org.apache.flink.architecture.ProductionCodeArchitectureBase > [INFO] Running org.apache.flink.architecture.rules.ApiAnnotationRules > [INFO] Tests run: 20, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.155 > s - in org.apache.flink.connector.jdbc.dialect.JdbcDialectTypeTest > [INFO] Running org.apache.flink.architecture.TestCodeArchitectureTest > [INFO] Running org.apache.flink.architecture.TestCodeArchitectureTestBase > [INFO] Running org.apache.flink.architecture.rules.ITCaseRules > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.109 > s - in org.apache.flink.architecture.rules.ApiAnnotationRules > [INFO] Running org.apache.flink.architecture.rules.TableApiRules > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.024 > s - in org.apache.flink.architecture.rules.TableApiRules > [INFO] Running org.apache.flink.architecture.rules.ConnectorRules > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.31 s > - in org.apache.flink.architecture.rules.ConnectorRules > [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.464 > s - in org.apache.flink.architecture.ProductionCodeArchitectureBase > [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.468 > s - in org.apache.flink.architecture.ProductionCodeArchitectureTest > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.758 > s - in org.apache.flink.architecture.rules.ITCaseRules > [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.761 > s - in org.apache.flink.architecture.TestCodeArchitectureTestBase > [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.775 > s - in 
org.apache.flink.architecture.TestCodeArchitectureTest > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 110.38 > s - in > org.apache.flink.connector.jdbc.databases.oracle.xa.OracleExactlyOnceSinkE2eTest > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: > 172.591 s - in > org.apache.flink.connector.jdbc.databases.db2.xa.Db2ExactlyOnceSinkE2eTest > [INFO] > [INFO] Results: > [INFO] > Error: Failures: > Error: > PostgresDialectTypeTest>JdbcDialectTypeTest.testDataTypeValidate:102 > Expecting code to raise a throwable. > Error:TrinoDialectTypeTest>JdbcDialectTypeTest.testDataTypeValidate:102 > Expecting code to raise a throwable. > Error:CrateDBDialectTypeTest>JdbcDialectTypeTest.testDataTypeValidate:102 > Expecting code to raise a throwable. > [INFO] > Error: Tests run: 394, Failures: 3, Errors: 0, Skipped: 1 > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-34358) flink-connector-jdbc nightly fails with "Expecting code to raise a throwable"
[ https://issues.apache.org/jira/browse/FLINK-34358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818346#comment-17818346 ] Lijie Wang commented on FLINK-34358: The test passes in Flink 1.18 but fails in Flink 1.19. The root cause is that after FLINK-34316, the code that caused the exception is no longer called. In Flink 1.18 the exception was thrown in {{JdbcDynamicTableSource.getScanRuntimeProvider}}, a code path that FLINK-34316 omitted. Calling {{executeSql}} instead of {{sqlQuery}} should fix it. I will prepare a PR soon. > flink-connector-jdbc nightly fails with "Expecting code to raise a throwable" > - > > Key: FLINK-34358 > URL: https://issues.apache.org/jira/browse/FLINK-34358 > Project: Flink > Issue Type: Bug > Components: Connectors / JDBC >Reporter: Martijn Visser >Priority: Blocker > Labels: test-stability > > https://github.com/apache/flink-connector-jdbc/actions/runs/7770283211/job/21190280602#step:14:346 > {code:java} > [INFO] Running > org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest > Error: Tests run: 19, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: > 0.554 s <<< FAILURE! - in > org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest > Error: > org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest.testDataTypeValidate(TestItem)[19] > Time elapsed: 0.018 s <<< FAILURE! > java.lang.AssertionError: > Expecting code to raise a throwable. 
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 > s - in org.apache.flink.connector.jdbc.catalog.JdbcCatalogUtilsTest > [INFO] Running org.apache.flink.architecture.ProductionCodeArchitectureTest > [INFO] Running org.apache.flink.architecture.ProductionCodeArchitectureBase > [INFO] Running org.apache.flink.architecture.rules.ApiAnnotationRules > [INFO] Tests run: 20, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.155 > s - in org.apache.flink.connector.jdbc.dialect.JdbcDialectTypeTest > [INFO] Running org.apache.flink.architecture.TestCodeArchitectureTest > [INFO] Running org.apache.flink.architecture.TestCodeArchitectureTestBase > [INFO] Running org.apache.flink.architecture.rules.ITCaseRules > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.109 > s - in org.apache.flink.architecture.rules.ApiAnnotationRules > [INFO] Running org.apache.flink.architecture.rules.TableApiRules > [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.024 > s - in org.apache.flink.architecture.rules.TableApiRules > [INFO] Running org.apache.flink.architecture.rules.ConnectorRules > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.31 s > - in org.apache.flink.architecture.rules.ConnectorRules > [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.464 > s - in org.apache.flink.architecture.ProductionCodeArchitectureBase > [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.468 > s - in org.apache.flink.architecture.ProductionCodeArchitectureTest > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.758 > s - in org.apache.flink.architecture.rules.ITCaseRules > [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.761 > s - in org.apache.flink.architecture.TestCodeArchitectureTestBase > [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.775 > s - in 
org.apache.flink.architecture.TestCodeArchitectureTest > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 110.38 > s - in > org.apache.flink.connector.jdbc.databases.oracle.xa.OracleExactlyOnceSinkE2eTest > [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: > 172.591 s - in > org.apache.flink.connector.jdbc.databases.db2.xa.Db2ExactlyOnceSinkE2eTest > [INFO] > [INFO] Results: > [INFO] > Error: Failures: > Error: > PostgresDialectTypeTest>JdbcDialectTypeTest.testDataTypeValidate:102 > Expecting code to raise a throwable. > Error:TrinoDialectTypeTest>JdbcDialectTypeTest.testDataTypeValidate:102 > Expecting code to raise a throwable. > Error:CrateDBDialectTypeTest>JdbcDialectTypeTest.testDataTypeValidate:102 > Expecting code to raise a throwable. > [INFO] > Error: Tests run: 394, Failures: 3, Errors: 0, Skipped: 1 > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-34365) [docs] Delete repeated pages in Chinese Flink website and correct the Paimon url
[ https://issues.apache.org/jira/browse/FLINK-34365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang closed FLINK-34365. -- Resolution: Fixed > [docs] Delete repeated pages in Chinese Flink website and correct the Paimon > url > > > Key: FLINK-34365 > URL: https://issues.apache.org/jira/browse/FLINK-34365 > Project: Flink > Issue Type: Improvement > Components: Documentation >Reporter: Waterking >Assignee: Waterking >Priority: Major > Labels: pull-request-available > Attachments: 微信截图_20240205214854.png > > Original Estimate: 1h > Remaining Estimate: 1h > > The "教程" column on the [Flink > 中文网|https://flink.apache.org/zh/how-to-contribute/contribute-documentation/] > currently has two "[With Paimon(incubating) (formerly Flink Table > Store)|https://paimon.apache.org/docs/master/engines/flink/%22]";. > Therefore, I delete one for brevity. > Also, the current link is wrong and I correct it with this link "[With > Paimon(incubating) (formerly Flink Table > Store)|https://paimon.apache.org/docs/master/engines/flink]"; -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-34365) [docs] Delete repeated pages in Chinese Flink website and correct the Paimon url
[ https://issues.apache.org/jira/browse/FLINK-34365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17815053#comment-17815053 ] Lijie Wang commented on FLINK-34365: Fixed via branch asf-site(flink-web): ec2e5c2b4a312fe56e44e13c57b84c6f1331b992 > [docs] Delete repeated pages in Chinese Flink website and correct the Paimon > url > > > Key: FLINK-34365 > URL: https://issues.apache.org/jira/browse/FLINK-34365 > Project: Flink > Issue Type: Improvement > Components: Documentation >Reporter: Waterking >Assignee: Waterking >Priority: Major > Labels: pull-request-available > Attachments: 微信截图_20240205214854.png > > Original Estimate: 1h > Remaining Estimate: 1h > > The "教程" column on the [Flink > 中文网|https://flink.apache.org/zh/how-to-contribute/contribute-documentation/] > currently has two "[With Paimon(incubating) (formerly Flink Table > Store)|https://paimon.apache.org/docs/master/engines/flink/%22]";. > Therefore, I delete one for brevity. > Also, the current link is wrong and I correct it with this link "[With > Paimon(incubating) (formerly Flink Table > Store)|https://paimon.apache.org/docs/master/engines/flink]"; -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-34365) [docs] Delete repeated pages in Chinese Flink website and correct the Paimon url
[ https://issues.apache.org/jira/browse/FLINK-34365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-34365: -- Assignee: Waterking > [docs] Delete repeated pages in Chinese Flink website and correct the Paimon > url > > > Key: FLINK-34365 > URL: https://issues.apache.org/jira/browse/FLINK-34365 > Project: Flink > Issue Type: Improvement > Components: Documentation >Reporter: Waterking >Assignee: Waterking >Priority: Major > Labels: pull-request-available > Attachments: 微信截图_20240205214854.png > > Original Estimate: 1h > Remaining Estimate: 1h > > The "教程" column on the [Flink > 中文网|https://flink.apache.org/zh/how-to-contribute/contribute-documentation/] > currently has two "[With Paimon(incubating) (formerly Flink Table > Store)|https://paimon.apache.org/docs/master/engines/flink/%22]";. > Therefore, I delete one for brevity. > Also, the current link is wrong and I correct it with this link "[With > Paimon(incubating) (formerly Flink Table > Store)|https://paimon.apache.org/docs/master/engines/flink]"; -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-34365) [docs] Delete repeated pages in Chinese Flink website and correct the Paimon url
[ https://issues.apache.org/jira/browse/FLINK-34365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814628#comment-17814628 ] Lijie Wang commented on FLINK-34365: Assigned to you [~waterking] :) > [docs] Delete repeated pages in Chinese Flink website and correct the Paimon > url > > > Key: FLINK-34365 > URL: https://issues.apache.org/jira/browse/FLINK-34365 > Project: Flink > Issue Type: Improvement > Components: Documentation >Reporter: Waterking >Assignee: Waterking >Priority: Major > Labels: pull-request-available > Attachments: 微信截图_20240205214854.png > > Original Estimate: 1h > Remaining Estimate: 1h > > The "教程" column on the [Flink > 中文网|https://flink.apache.org/zh/how-to-contribute/contribute-documentation/] > currently has two "[With Paimon(incubating) (formerly Flink Table > Store)|https://paimon.apache.org/docs/master/engines/flink/%22]";. > Therefore, I delete one for brevity. > Also, the current link is wrong and I correct it with this link "[With > Paimon(incubating) (formerly Flink Table > Store)|https://paimon.apache.org/docs/master/engines/flink]"; -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-34132) Batch WordCount job fails when run with AdaptiveBatch scheduler
[ https://issues.apache.org/jira/browse/FLINK-34132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17807697#comment-17807697 ] Lijie Wang commented on FLINK-34132: [~prabhujoseph] AFAIK, the adaptive batch scheduler does not support DataSet jobs; you can see the details in [FLIP-283|https://cwiki.apache.org/confluence/display/FLINK/FLIP-283%3A+Use+adaptive+batch+scheduler+as+default+scheduler+for+batch+jobs]. But maybe we should make the exception message friendlier. cc [~JunRuiLi] > Batch WordCount job fails when run with AdaptiveBatch scheduler > --- > > Key: FLINK-34132 > URL: https://issues.apache.org/jira/browse/FLINK-34132 > Project: Flink > Issue Type: Bug >Affects Versions: 1.17.1, 1.18.1 >Reporter: Prabhu Joseph >Priority: Major > > Batch WordCount job fails when run with the AdaptiveBatch scheduler. > *Repro Steps* > {code:java} > flink-yarn-session -Djobmanager.scheduler=adaptive -d > flink run -d /usr/lib/flink/examples/batch/WordCount.jar --input > s3://prabhuflinks3/INPUT --output s3://prabhuflinks3/OUT > {code} > *Error logs* > {code:java} > The program finished with the following exception: > org.apache.flink.client.program.ProgramInvocationException: The main method > caused an error: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: > org.apache.flink.runtime.client.JobInitializationException: Could not start > the JobMaster. 
> at > org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:372) > at > org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222) > at > org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:105) > at > org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:851) > at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:245) > at > org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1095) > at > org.apache.flink.client.cli.CliFrontend.lambda$mainInternal$9(CliFrontend.java:1189) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899) > at > org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) > at > org.apache.flink.client.cli.CliFrontend.mainInternal(CliFrontend.java:1189) > at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1157) > Caused by: java.lang.RuntimeException: > java.util.concurrent.ExecutionException: java.lang.RuntimeException: > org.apache.flink.runtime.client.JobInitializationException: Could not start > the JobMaster. 
> at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1067) > at > org.apache.flink.client.program.ContextEnvironment.executeAsync(ContextEnvironment.java:144) > at > org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:73) > at > org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:106) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355) > ... 12 more > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: > org.apache.flink.runtime.client.JobInitializationException: Could not start > the JobMaster. > at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1062) > ... 20 more > Caused by: java.lang.RuntimeException: > org.apache.flink.runtime.client.JobInitializationException: Could not start > the JobMaster. > at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321) > at > org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:75) > at > java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616) > at > java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591) > at > java.util.concurrent.CompletableFutu
[jira] [Commented] (FLINK-34025) Show data skew score on Flink Dashboard
[ https://issues.apache.org/jira/browse/FLINK-34025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804720#comment-17804720 ] Lijie Wang commented on FLINK-34025: Hi [~iemre], I think your proposal needs to be discussed on the dev mailing list via a FLIP (as it involves UI and metrics additions/changes). You can find more details [here|https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals] > Show data skew score on Flink Dashboard > --- > > Key: FLINK-34025 > URL: https://issues.apache.org/jira/browse/FLINK-34025 > Project: Flink > Issue Type: New Feature > Components: Runtime / Web Frontend >Affects Versions: 1.19.0 >Reporter: Emre Kartoglu >Priority: Major > Labels: dashboard > Attachments: skew_proposal.png, skew_tab.png > > > *Problem:* Currently users have to click on every operator and check how much > data each subtask is processing to see if there is data skew. This is > particularly cumbersome and error-prone for jobs with big job graphs. Data > skew is an important metric that should be more visible. > > *Proposed solution:* > * Show a data skew score on each operator (see screenshot below). This would > be an improvement, but would not be sufficient, as it would still not be easy > to see the data skew score for jobs with very large job graphs (it'd require > a lot of zooming in/out). > * Show the data skew score for each operator under a new "Data Skew" tab next to > the Exceptions tab. See screenshot below > !skew_tab.png|width=1226,height=719! . > > !skew_proposal.png|width=845,height=253! -- This message was sent by Atlassian Jira (v8.20.10#820010)
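The proposal above leaves the scoring formula open. As a purely illustrative sketch (not the proposed Flink implementation, and the class/method names are invented for this example), one plausible per-operator skew score is the percentage by which the busiest subtask exceeds the mean subtask load:

```java
import java.util.Arrays;

// Hypothetical sketch of a per-operator data skew score: how far the busiest
// subtask deviates from the mean load, as a percentage. This is NOT the
// actual Flink dashboard implementation, just one plausible definition.
public class SkewScore {

    // recordsPerSubtask: number of records processed by each parallel subtask.
    public static double skewPercent(long[] recordsPerSubtask) {
        double mean = Arrays.stream(recordsPerSubtask).average().orElse(0);
        if (mean == 0) {
            return 0; // no data processed yet -> no measurable skew
        }
        long max = Arrays.stream(recordsPerSubtask).max().getAsLong();
        return (max - mean) / mean * 100.0;
    }

    public static void main(String[] args) {
        // Perfectly balanced load -> score 0.0
        System.out.println(skewPercent(new long[] {100, 100, 100, 100}));
        // One hot subtask -> a high score
        System.out.println(skewPercent(new long[] {400, 100, 100, 100}));
    }
}
```

A score of 0 means perfectly balanced subtasks; the hot-subtask example scores roughly 129%. A real implementation would additionally need to pick the input metric (records, bytes, or busy time) and an aggregation window.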
[jira] [Updated] (FLINK-33968) Compute the number of subpartitions when initializing execution job vertices
[ https://issues.apache.org/jira/browse/FLINK-33968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang updated FLINK-33968: --- Description: Currently, when using dynamic graphs, the subpartition-num of a task is lazily calculated at task deployment time, which may lead to some uncertainties in job recovery scenarios: Before the JM crashes, when deploying upstream tasks, the parallelism of the downstream vertex may be unknown, so the subpartition-num will be the max parallelism of the downstream job vertex. However, after the JM restarts, when deploying upstream tasks, the parallelism of the downstream job vertex may be known (it was calculated before the JM crashed and recovered after the JM restarted), so the subpartition-num will be the actual parallelism of the downstream job vertex. This difference in the calculated subpartition-num means that partitions generated before the JM crash cannot be reused after the JM restarts. We will solve this problem by advancing the calculation of the subpartition-num to the moment of initializing the execution job vertex (in the constructor of IntermediateResultPartition) was: Currently, when using dynamic graphs, the subpartition-num of a task is lazily calculated at task deployment time, which may lead to some uncertainties in job recovery scenarios. Before the JM crashes, when deploying upstream tasks, the parallelism of the downstream vertex may be unknown, so the subpartition-num will be the max parallelism of the downstream job vertex. However, after the JM restarts, when deploying upstream tasks, the parallelism of the downstream job vertex may be known (it was calculated before the JM crashed and recovered after the JM restarted), so the subpartition-num will be the actual parallelism of the downstream job vertex. This difference in the calculated subpartition-num means that partitions generated before the JM crash cannot be reused after the JM restarts. 
We will solve this problem by advancing the calculation of the subpartition-num to the moment of initializing the execution job vertex (in the constructor of IntermediateResultPartition) > Compute the number of subpartitions when initializing execution job vertices > --- > > Key: FLINK-33968 > URL: https://issues.apache.org/jira/browse/FLINK-33968 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > > Currently, when using dynamic graphs, the subpartition-num of a task is > lazily calculated at task deployment time, which may lead to some > uncertainties in job recovery scenarios: > Before the JM crashes, when deploying upstream tasks, the parallelism of the > downstream vertex may be unknown, so the subpartition-num will be the max > parallelism of the downstream job vertex. However, after the JM restarts, when > deploying upstream tasks, the parallelism of the downstream job vertex may be > known (it was calculated before the JM crashed and recovered after the JM > restarted), so the subpartition-num will be the actual parallelism of the > downstream job vertex. This difference in the calculated subpartition-num > means that partitions generated before the JM crash cannot be reused after > the JM restarts. > We will solve this problem by advancing the calculation of the subpartition-num > to the moment of initializing the execution job vertex (in the constructor of > IntermediateResultPartition) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33968) Compute the number of subpartitions when initializing execution job vertices
Lijie Wang created FLINK-33968: -- Summary: Compute the number of subpartitions when initializing execution job vertices Key: FLINK-33968 URL: https://issues.apache.org/jira/browse/FLINK-33968 Project: Flink Issue Type: Sub-task Components: Runtime / Coordination Reporter: Lijie Wang Assignee: Lijie Wang Currently, when using dynamic graphs, the subpartition-num of a task is lazily calculated at task deployment time, which may lead to some uncertainties in job recovery scenarios. Before the JM crashes, when deploying upstream tasks, the parallelism of the downstream vertex may be unknown, so the subpartition-num will be the max parallelism of the downstream job vertex. However, after the JM restarts, when deploying upstream tasks, the parallelism of the downstream job vertex may be known (it was calculated before the JM crashed and recovered after the JM restarted), so the subpartition-num will be the actual parallelism of the downstream job vertex. This difference in the calculated subpartition-num means that partitions generated before the JM crash cannot be reused after the JM restarts. We will solve this problem by advancing the calculation of the subpartition-num to the moment of initializing the execution job vertex (in the constructor of IntermediateResultPartition) -- This message was sent by Atlassian Jira (v8.20.10#820010)
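The inconsistency described above can be condensed into a small sketch (the names below are illustrative, not Flink's actual classes): the subpartition count chosen at deployment time depends on whether the downstream parallelism has already been decided, so two deployments of the same producer can disagree.

```java
// Hypothetical sketch of why a lazily computed subpartition count can differ
// across JobManager failovers, per the ticket description. Names are
// illustrative, not Flink's actual classes or methods.
public class SubpartitionCount {

    // Before the proposed fix: the count is decided at task deployment time,
    // so it depends on whether the downstream parallelism is known yet.
    // downstreamDecidedParallelism == null models "not yet decided".
    public static int atDeployment(int downstreamMaxParallelism,
                                   Integer downstreamDecidedParallelism) {
        return downstreamDecidedParallelism == null
                ? downstreamMaxParallelism      // unknown -> fall back to max parallelism
                : downstreamDecidedParallelism; // known -> use the actual parallelism
    }

    public static void main(String[] args) {
        // First run: downstream parallelism not yet decided -> 128 subpartitions.
        int beforeCrash = atDeployment(128, null);
        // After JM restart: a decided parallelism (say 32) was recovered -> 32
        // subpartitions, so partitions written with 128 cannot be reused.
        int afterRestart = atDeployment(128, 32);
        System.out.println(beforeCrash + " vs " + afterRestart);
    }
}
```

Computing the count once, when the execution job vertex is initialized, removes this deployment-time dependency and makes the value stable across failovers.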
[jira] [Assigned] (FLINK-33892) FLIP-383: Support Job Recovery for Batch Jobs
[ https://issues.apache.org/jira/browse/FLINK-33892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-33892: -- Assignee: Junrui Li (was: Lijie Wang) > FLIP-383: Support Job Recovery for Batch Jobs > - > > Key: FLINK-33892 > URL: https://issues.apache.org/jira/browse/FLINK-33892 > Project: Flink > Issue Type: New Feature > Components: Runtime / Coordination >Reporter: Lijie Wang >Assignee: Junrui Li >Priority: Major > > This is the umbrella ticket for > [FLIP-383|https://cwiki.apache.org/confluence/display/FLINK/FLIP-383%3A+Support+Job+Recovery+for+Batch+Jobs] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)
[ https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800679#comment-17800679 ] Lijie Wang commented on FLINK-33943: Maybe you can see the comments in FLINK-19358 and [FLIP-85|https://cwiki.apache.org/confluence/display/FLINK/FLIP-85%3A+Flink+Application+Mode] > Apache flink: Issues after configuring HA (using zookeeper setting) > --- > > Key: FLINK-33943 > URL: https://issues.apache.org/jira/browse/FLINK-33943 > Project: Flink > Issue Type: Bug > Components: Build System >Affects Versions: 1.18.0 > Environment: Flink version: 1.18 > Zookeeper version: 3.7.2 > Env: Custom flink docker image (with embedded application class) deployed > over kubernetes (v1.26.11). > >Reporter: Vijay >Priority: Major > > Hi Team, > *Note:* Not sure whether I have picked the right component while raising the > issue. > Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink > cluster. Job manager has been deployed on "Application mode" and when HA is > disabled (high-availability.type: NONE) we are able to start multiple jobs > (using env.executeAsync()) for a single application. But when I set up the > Zookeeper as the HA type (high-availability.type: zookeeper), we are > seeing only one job getting executed on the Flink dashboard. Following are > the parameters set for the Zookeeper based HA setup in flink-conf.yaml. > Please let us know if anyone has experienced similar issues and have any > suggestions. Thanks in advance for your assistance. > *Note:* We are using a Streaming application and following are the > flink-config.yaml configurations. > *Additional query:* Does "Session mode" of deployment support HA for multiple > execute() executions? 
> # high-availability.storageDir: /opt/flink/data > # high-availability.cluster-id: test > # high-availability.zookeeper.quorum: localhost:2181 > # high-availability.type: zookeeper > # high-availability.zookeeper.path.root: /dp/configs/flinkha -- This message was sent by Atlassian Jira (v8.20.10#820010)
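Collected into one place, the five settings quoted in the report correspond to a flink-conf.yaml fragment like the following (the values are the reporter's own examples, not recommendations):

```yaml
# Illustrative ZooKeeper HA fragment for flink-conf.yaml, assembled from the
# keys quoted above. Values are the reporter's examples, not recommendations.
high-availability.type: zookeeper
high-availability.zookeeper.quorum: localhost:2181
high-availability.zookeeper.path.root: /dp/configs/flinkha
high-availability.cluster-id: test
high-availability.storageDir: /opt/flink/data
```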
[jira] [Closed] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)
[ https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang closed FLINK-33943. -- Resolution: Not A Bug > Apache flink: Issues after configuring HA (using zookeeper setting) > --- > > Key: FLINK-33943 > URL: https://issues.apache.org/jira/browse/FLINK-33943 > Project: Flink > Issue Type: Bug > Components: Build System >Affects Versions: 1.18.0 > Environment: Flink version: 1.18 > Zookeeper version: 3.7.2 > Env: Custom flink docker image (with embedded application class) deployed > over kubernetes (v1.26.11). > >Reporter: Vijay >Priority: Major > > Hi Team, > *Note:* Not sure whether I have picked the right component while raising the > issue. > Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink > cluster. Job manager has been deployed on "Application mode" and when HA is > disabled (high-availability.type: NONE) we are able to start multiple jobs > (using env.executeAsync()) for a single application. But when I set up the > Zookeeper as the HA type (high-availability.type: zookeeper), we are > seeing only one job getting executed on the Flink dashboard. Following are > the parameters set for the Zookeeper based HA setup in flink-conf.yaml. > Please let us know if anyone has experienced similar issues and have any > suggestions. Thanks in advance for your assistance. > *Note:* We are using a Streaming application and following are the > flink-config.yaml configurations. > *Additional query:* Does "Session mode" of deployment support HA for multiple > execute() executions? > # high-availability.storageDir: /opt/flink/data > # high-availability.cluster-id: test > # high-availability.zookeeper.quorum: localhost:2181 > # high-availability.type: zookeeper > # high-availability.zookeeper.path.root: /dp/configs/flinkha -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)
[ https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800676#comment-17800676 ] Lijie Wang commented on FLINK-33943: I'll close this issue because it's not a bug. cc [~vrang...@in.ibm.com] > Apache flink: Issues after configuring HA (using zookeeper setting) > --- > > Key: FLINK-33943 > URL: https://issues.apache.org/jira/browse/FLINK-33943 > Project: Flink > Issue Type: Bug > Components: Build System >Affects Versions: 1.18.0 > Environment: Flink version: 1.18 > Zookeeper version: 3.7.2 > Env: Custom flink docker image (with embedded application class) deployed > over kubernetes (v1.26.11). > >Reporter: Vijay >Priority: Major > > Hi Team, > *Note:* Not sure whether I have picked the right component while raising the > issue. > Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink > cluster. Job manager has been deployed on "Application mode" and when HA is > disabled (high-availability.type: NONE) we are able to start multiple jobs > (using env.executeAsync()) for a single application. But when I set up the > Zookeeper as the HA type (high-availability.type: zookeeper), we are > seeing only one job getting executed on the Flink dashboard. Following are > the parameters set for the Zookeeper based HA setup in flink-conf.yaml. > Please let us know if anyone has experienced similar issues and have any > suggestions. Thanks in advance for your assistance. > *Note:* We are using a Streaming application and following are the > flink-config.yaml configurations. > *Additional query:* Does "Session mode" of deployment support HA for multiple > execute() executions? 
> # high-availability.storageDir: /opt/flink/data > # high-availability.cluster-id: test > # high-availability.zookeeper.quorum: localhost:2181 > # high-availability.type: zookeeper > # high-availability.zookeeper.path.root: /dp/configs/flinkha -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)
[ https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800671#comment-17800671 ] Lijie Wang commented on FLINK-33943: [~vrang...@in.ibm.com] I think it should work if you use session mode instead of application mode. > Apache flink: Issues after configuring HA (using zookeeper setting) > --- > > Key: FLINK-33943 > URL: https://issues.apache.org/jira/browse/FLINK-33943 > Project: Flink > Issue Type: Bug > Components: Build System >Affects Versions: 1.18.0 > Environment: Flink version: 1.18 > Zookeeper version: 3.7.2 > Env: Custom flink docker image (with embedded application class) deployed > over kubernetes (v1.26.11). > >Reporter: Vijay >Priority: Major > > Hi Team, > Note: Not sure whether I have picked the right component while raising the > issue. > Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink > cluster. Job manager has been deployed on "Application mode" and when HA is > disabled (high-availability.type: NONE) we are able to start multiple jobs > (using env.executeAsync()) for a single application. But when I set up the > Zookeeper as the HA type (high-availability.type: zookeeper), we are > seeing only one job getting executed on the Flink dashboard. Following are > the parameters set for the Zookeeper based HA setup in flink-conf.yaml. > Please let us know if anyone has experienced similar issues and have any > suggestions. Thanks in advance for your assistance. > Note: We are using a Streaming application and following are the > flink-config.yaml configurations. > # high-availability.storageDir: /opt/flink/data > # high-availability.cluster-id: test > # high-availability.zookeeper.quorum: localhost:2181 > # high-availability.type: zookeeper > # high-availability.zookeeper.path.root: /dp/configs/flinkha -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)
[ https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800668#comment-17800668 ] Lijie Wang commented on FLINK-33943: Hi [~vrang...@in.ibm.com], currently, HA in Application Mode is only supported for single-execute() applications. You can find more details in [flink docs|https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/deployment/overview/#application-mode]. > Apache flink: Issues after configuring HA (using zookeeper setting) > --- > > Key: FLINK-33943 > URL: https://issues.apache.org/jira/browse/FLINK-33943 > Project: Flink > Issue Type: Bug > Components: Build System >Affects Versions: 1.18.0 > Environment: Flink version: 1.18 > Zookeeper version: 3.7.2 > Env: Custom flink docker image (with embedded application class) deployed > over kubernetes (v1.26.11). > >Reporter: Vijay >Priority: Major > > Hi Team, > Note: Not sure whether I have picked the right component while raising the > issue. > Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink > cluster. Job manager has been deployed on "Application mode" and when HA is > disabled (high-availability.type: NONE) we are able to start multiple jobs > (using env.executeAsync()) for a single application. But when I set up the > Zookeeper as the HA type (high-availability.type: zookeeper), we are > seeing only one job getting executed on the Flink dashboard. Following are > the parameters set for the Zookeeper based HA setup in flink-conf.yaml. > Please let us know if anyone has experienced similar issues and have any > suggestions. Thanks in advance for your assistance. > Note: We are using a Streaming application and following are the > flink-config.yaml configurations. 
> # high-availability.storageDir: /opt/flink/data > # high-availability.cluster-id: test > # high-availability.zookeeper.quorum: localhost:2181 > # high-availability.type: zookeeper > # high-availability.zookeeper.path.root: /dp/configs/flinkha -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-31650) Incorrect busyMsTimePerSecond metric value for FINISHED task
[ https://issues.apache.org/jira/browse/FLINK-31650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang closed FLINK-31650. -- Fix Version/s: 1.19.0 1.18.1 1.17.3 Resolution: Fixed > Incorrect busyMsTimePerSecond metric value for FINISHED task > > > Key: FLINK-31650 > URL: https://issues.apache.org/jira/browse/FLINK-31650 > Project: Flink > Issue Type: Bug > Components: Runtime / Metrics, Runtime / REST >Affects Versions: 1.17.0, 1.16.1, 1.15.4 >Reporter: Lijie Wang >Assignee: Zhanghao Chen >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0, 1.18.1, 1.17.3 > > Attachments: busyMsTimePerSecond.png > > > As shown in the figure below, the busyMsTimePerSecond of the FINISHED task is > 100%, which is obviously unreasonable. > !busyMsTimePerSecond.png|width=1048,height=432! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31650) Incorrect busyMsTimePerSecond metric value for FINISHED task
[ https://issues.apache.org/jira/browse/FLINK-31650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800398#comment-17800398 ] Lijie Wang commented on FLINK-31650: Fixed via: master(1.19) : dd028282e8ab19c0d1cd05fad02b63bbda6c1358 release-1.18: ff1ef789eba612d008424d5fc28fe5905f96fb9c release-1.17: 024e970e27781a9fd1925d6c5938efbb364c6462 > Incorrect busyMsTimePerSecond metric value for FINISHED task > > > Key: FLINK-31650 > URL: https://issues.apache.org/jira/browse/FLINK-31650 > Project: Flink > Issue Type: Bug > Components: Runtime / Metrics, Runtime / REST >Affects Versions: 1.17.0, 1.16.1, 1.15.4 >Reporter: Lijie Wang >Assignee: Zhanghao Chen >Priority: Major > Labels: pull-request-available > Attachments: busyMsTimePerSecond.png > > > As shown in the figure below, the busyMsTimePerSecond of the FINISHED task is > 100%, which is obviously unreasonable. > !busyMsTimePerSecond.png|width=1048,height=432! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33905) FLIP-382: Unify the Provision of Diverse Metadata for Context-like APIs
[ https://issues.apache.org/jira/browse/FLINK-33905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-33905: -- Assignee: Wencong Liu > FLIP-382: Unify the Provision of Diverse Metadata for Context-like APIs > --- > > Key: FLINK-33905 > URL: https://issues.apache.org/jira/browse/FLINK-33905 > Project: Flink > Issue Type: Improvement > Components: API / Core >Affects Versions: 1.19.0 >Reporter: Wencong Liu >Assignee: Wencong Liu >Priority: Major > Labels: pull-request-available > > This ticket is proposed for > [FLIP-382|https://cwiki.apache.org/confluence/display/FLINK/FLIP-382%3A+Unify+the+Provision+of+Diverse+Metadata+for+Context-like+APIs]. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33892) FLIP-383: Support Job Recovery for Batch Jobs
[ https://issues.apache.org/jira/browse/FLINK-33892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-33892: -- Assignee: (was: Lijie Wang) > FLIP-383: Support Job Recovery for Batch Jobs > - > > Key: FLINK-33892 > URL: https://issues.apache.org/jira/browse/FLINK-33892 > Project: Flink > Issue Type: New Feature > Components: Runtime / Coordination >Reporter: Lijie Wang >Priority: Major > > This is the umbrella ticket for > [FLIP-383|https://cwiki.apache.org/confluence/display/FLINK/FLIP-383%3A+Support+Job+Recovery+for+Batch+Jobs] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33892) FLIP-383: Support Job Recovery for Batch Jobs
[ https://issues.apache.org/jira/browse/FLINK-33892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-33892: -- Assignee: Lijie Wang > FLIP-383: Support Job Recovery for Batch Jobs > - > > Key: FLINK-33892 > URL: https://issues.apache.org/jira/browse/FLINK-33892 > Project: Flink > Issue Type: New Feature > Components: Runtime / Coordination >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > > This is the umbrella ticket for > [FLIP-383|https://cwiki.apache.org/confluence/display/FLINK/FLIP-383%3A+Support+Job+Recovery+for+Batch+Jobs] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33892) FLIP-383: Support Job Recovery for Batch Jobs
[ https://issues.apache.org/jira/browse/FLINK-33892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-33892: -- Assignee: Lijie Wang > FLIP-383: Support Job Recovery for Batch Jobs > - > > Key: FLINK-33892 > URL: https://issues.apache.org/jira/browse/FLINK-33892 > Project: Flink > Issue Type: New Feature > Components: Runtime / Coordination >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > > This is the umbrella ticket for > [FLIP-383|https://cwiki.apache.org/confluence/display/FLINK/FLIP-383%3A+Support+Job+Recovery+for+Batch+Jobs] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33892) FLIP-383: Support Job Recovery for Batch Jobs
Lijie Wang created FLINK-33892: -- Summary: FLIP-383: Support Job Recovery for Batch Jobs Key: FLINK-33892 URL: https://issues.apache.org/jira/browse/FLINK-33892 Project: Flink Issue Type: New Feature Components: Runtime / Coordination Reporter: Lijie Wang This is the umbrella ticket for [FLIP-383|https://cwiki.apache.org/confluence/display/FLINK/FLIP-383%3A+Support+Job+Recovery+for+Batch+Jobs] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33713) Deprecate RuntimeContext#getExecutionConfig
[ https://issues.apache.org/jira/browse/FLINK-33713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-33713: -- Assignee: Junrui Li > Deprecate RuntimeContext#getExecutionConfig > --- > > Key: FLINK-33713 > URL: https://issues.apache.org/jira/browse/FLINK-33713 > Project: Flink > Issue Type: Sub-task > Components: API / Core >Reporter: Junrui Li >Assignee: Junrui Li >Priority: Major > Labels: pull-request-available > > Deprecate RuntimeContext#getExecutionConfig -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33714) Update documentation about the usage of RuntimeContext#getExecutionConfig
[ https://issues.apache.org/jira/browse/FLINK-33714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-33714: -- Assignee: Junrui Li > Update documentation about the usage of RuntimeContext#getExecutionConfig > - > > Key: FLINK-33714 > URL: https://issues.apache.org/jira/browse/FLINK-33714 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Reporter: Junrui Li >Assignee: Junrui Li >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33721) Extend BashJavaUtils to Support Reading and Writing Standard YAML Files
[ https://issues.apache.org/jira/browse/FLINK-33721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-33721: -- Assignee: Junrui Li > Extend BashJavaUtils to Support Reading and Writing Standard YAML Files > --- > > Key: FLINK-33721 > URL: https://issues.apache.org/jira/browse/FLINK-33721 > Project: Flink > Issue Type: Sub-task > Components: API / Core >Reporter: Junrui Li >Assignee: Junrui Li >Priority: Major > > Currently, Flink's shell scripts, such as those used for end-to-end (e2e) > testing and Docker image building, require the ability to read from and > modify Flink's configuration files. With the introduction of standard YAML > files as the configuration format, the existing shell scripts are not > equipped to correctly handle read and modify operations on these files. > To accommodate this requirement and enhance our script capabilities, we > propose an extension to the BashJavaUtils functionality. This extension will > enable BashJavaUtils to support the reading and modifying of standard YAML > files, ensuring that our shell scripts can seamlessly interact with the new > configuration format. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33581) FLIP-381: Deprecate configuration getters/setters that return/set complex Java objects
[ https://issues.apache.org/jira/browse/FLINK-33581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-33581: -- Assignee: Junrui Li > FLIP-381: Deprecate configuration getters/setters that return/set complex > Java objects > -- > > Key: FLINK-33581 > URL: https://issues.apache.org/jira/browse/FLINK-33581 > Project: Flink > Issue Type: Technical Debt > Components: API / DataStream >Reporter: Junrui Li >Assignee: Junrui Li >Priority: Major > > Deprecate the non-ConfigOption objects in StreamExecutionEnvironment, > CheckpointConfig, and ExecutionConfig, and ultimately remove them in Flink > 2.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33576) Introduce new Flink conf file "config.yaml" supporting standard YAML syntax
[ https://issues.apache.org/jira/browse/FLINK-33576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-33576: -- Assignee: Junrui Li > Introduce new Flink conf file "config.yaml" supporting standard YAML syntax > --- > > Key: FLINK-33576 > URL: https://issues.apache.org/jira/browse/FLINK-33576 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Configuration >Reporter: Junrui Li >Assignee: Junrui Li >Priority: Major > > Introduce new Flink conf file config.yaml, and this file will be parsed by > standard YAML syntax. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33577) Make "conf.yaml" as the default Flink configuration file
[ https://issues.apache.org/jira/browse/FLINK-33577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-33577: -- Assignee: Junrui Li > Make "conf.yaml" as the default Flink configuration file > > > Key: FLINK-33577 > URL: https://issues.apache.org/jira/browse/FLINK-33577 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Configuration >Reporter: Junrui Li >Assignee: Junrui Li >Priority: Major > > This update ensures that the flink-dist package in FLINK will include the new > configuration file "conf.yaml" by default. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33356) The navigation bar on Flink’s official website is messed up.
[ https://issues.apache.org/jira/browse/FLINK-33356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781986#comment-17781986 ] Lijie Wang commented on FLINK-33356: The website is normal now, thanks [~JunRuiLi] [~Wencong Liu] :D > The navigation bar on Flink’s official website is messed up. > > > Key: FLINK-33356 > URL: https://issues.apache.org/jira/browse/FLINK-33356 > Project: Flink > Issue Type: Bug > Components: Project Website >Reporter: Junrui Li >Assignee: Wencong Liu >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > Attachments: image-2023-10-25-11-55-52-653.png, > image-2023-10-25-12-34-22-790.png > > > The side navigation bar on the Flink official website at the following link: > [https://nightlies.apache.org/flink/flink-docs-master/] appears to be messed > up, as shown in the attached screenshot. > !image-2023-10-25-11-55-52-653.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-33356) The navigation bar on Flink’s official website is messed up.
[ https://issues.apache.org/jira/browse/FLINK-33356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang closed FLINK-33356. -- Fix Version/s: 1.19.0 Resolution: Fixed > The navigation bar on Flink’s official website is messed up. > > > Key: FLINK-33356 > URL: https://issues.apache.org/jira/browse/FLINK-33356 > Project: Flink > Issue Type: Bug > Components: Project Website >Reporter: Junrui Li >Assignee: Wencong Liu >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > Attachments: image-2023-10-25-11-55-52-653.png, > image-2023-10-25-12-34-22-790.png > > > The side navigation bar on the Flink official website at the following link: > [https://nightlies.apache.org/flink/flink-docs-master/] appears to be messed > up, as shown in the attached screenshot. > !image-2023-10-25-11-55-52-653.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33356) The navigation bar on Flink’s official website is messed up.
[ https://issues.apache.org/jira/browse/FLINK-33356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781274#comment-17781274 ] Lijie Wang commented on FLINK-33356: Fixed via master 780a673d8e2c3845d685c86d95166d9169601726 > The navigation bar on Flink’s official website is messed up. > > > Key: FLINK-33356 > URL: https://issues.apache.org/jira/browse/FLINK-33356 > Project: Flink > Issue Type: Bug > Components: Project Website >Reporter: Junrui Li >Assignee: Wencong Liu >Priority: Major > Labels: pull-request-available > Attachments: image-2023-10-25-11-55-52-653.png, > image-2023-10-25-12-34-22-790.png > > > The side navigation bar on the Flink official website at the following link: > [https://nightlies.apache.org/flink/flink-docs-master/] appears to be messed > up, as shown in the attached screenshot. > !image-2023-10-25-11-55-52-653.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33364) Introduce standard YAML for flink configuration
[ https://issues.apache.org/jira/browse/FLINK-33364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-33364: -- Assignee: Junrui Li > Introduce standard YAML for flink configuration > --- > > Key: FLINK-33364 > URL: https://issues.apache.org/jira/browse/FLINK-33364 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Configuration >Affects Versions: 1.19.0 >Reporter: Junrui Li >Assignee: Junrui Li >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33356) The navigation bar on Flink’s official website is messed up.
[ https://issues.apache.org/jira/browse/FLINK-33356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779345#comment-17779345 ] Lijie Wang commented on FLINK-33356: [~Wencong Liu] Assigned to you. > The navigation bar on Flink’s official website is messed up. > > > Key: FLINK-33356 > URL: https://issues.apache.org/jira/browse/FLINK-33356 > Project: Flink > Issue Type: Bug > Components: Project Website >Reporter: Junrui Li >Assignee: Wencong Liu >Priority: Major > Attachments: image-2023-10-25-11-55-52-653.png, > image-2023-10-25-12-34-22-790.png > > > The side navigation bar on the Flink official website at the following link: > [https://nightlies.apache.org/flink/flink-docs-master/] appears to be messed > up, as shown in the attached screenshot. > !image-2023-10-25-11-55-52-653.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33356) The navigation bar on Flink’s official website is messed up.
[ https://issues.apache.org/jira/browse/FLINK-33356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-33356: -- Assignee: Wencong Liu > The navigation bar on Flink’s official website is messed up. > > > Key: FLINK-33356 > URL: https://issues.apache.org/jira/browse/FLINK-33356 > Project: Flink > Issue Type: Bug > Components: Project Website >Reporter: Junrui Li >Assignee: Wencong Liu >Priority: Major > Attachments: image-2023-10-25-11-55-52-653.png, > image-2023-10-25-12-34-22-790.png > > > The side navigation bar on the Flink official website at the following link: > [https://nightlies.apache.org/flink/flink-docs-master/] appears to be messed > up, as shown in the attached screenshot. > !image-2023-10-25-11-55-52-653.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32974) RestClusterClient always leaks flink-rest-client-jobgraphs* directories
[ https://issues.apache.org/jira/browse/FLINK-32974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767357#comment-17767357 ] Lijie Wang commented on FLINK-32974: Fix via master: 6034d5ff335dc970672c290851811235399452fb release-1.18: 2aeb99804ba56c008df0a1730f3246d3fea856b9 release-1.17: 7c9e05ea8c67b12c657b60cd5e6d1bea52b4f9a3 > RestClusterClient always leaks flink-rest-client-jobgraphs* directories > --- > > Key: FLINK-32974 > URL: https://issues.apache.org/jira/browse/FLINK-32974 > Project: Flink > Issue Type: Bug > Components: Client / Job Submission >Affects Versions: 1.18.0, 1.17.2 >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Critical > > After FLINK-32226, a temporary directory(named > {{flink-rest-client-jobgraphs*}}) is created when creating a new > RestClusterClient, but this directory will never be cleaned up. > This will result in a lot of {{flink-rest-client-jobgraphs*}} directories > under {{/tmp}}, especially when using > CollectDynamicSink/CollectResultFetcher, which may cause the inode to be used > up. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-32974) RestClusterClient always leaks flink-rest-client-jobgraphs* directories
[ https://issues.apache.org/jira/browse/FLINK-32974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang resolved FLINK-32974. Fix Version/s: 1.18.0 1.17.2 Resolution: Fixed > RestClusterClient always leaks flink-rest-client-jobgraphs* directories > --- > > Key: FLINK-32974 > URL: https://issues.apache.org/jira/browse/FLINK-32974 > Project: Flink > Issue Type: Bug > Components: Client / Job Submission >Affects Versions: 1.18.0, 1.17.2 >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Critical > Fix For: 1.18.0, 1.17.2 > > > After FLINK-32226, a temporary directory(named > {{flink-rest-client-jobgraphs*}}) is created when creating a new > RestClusterClient, but this directory will never be cleaned up. > This will result in a lot of {{flink-rest-client-jobgraphs*}} directories > under {{/tmp}}, especially when using > CollectDynamicSink/CollectResultFetcher, which may cause the inode to be used > up. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32974) RestClusterClient always leaks flink-rest-client-jobgraphs* directories
[ https://issues.apache.org/jira/browse/FLINK-32974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang updated FLINK-32974: --- Description: After FLINK-32226, a temporary directory(named {{flink-rest-client-jobgraphs*}}) is created when creating a new RestClusterClient, but this directory will never be cleaned up. This will result in a lot of {{flink-rest-client-jobgraphs*}} directories under {{/tmp}}, especially when using CollectDynamicSink/CollectResultFetcher, which may cause the inode to be used up. was: After FLINK-32226, a temporary directory(named {{flink-rest-client-jobgraphs*}}) is created when creating a new RestClusterClient, but this directory will never be cleaned up. This will result in a lot of {{flink-rest-client-jobgraphs*}} directories under {{/tmp}}, especially when using CollectDynamicSink/CollectResultFetcher. > RestClusterClient always leaks flink-rest-client-jobgraphs* directories > --- > > Key: FLINK-32974 > URL: https://issues.apache.org/jira/browse/FLINK-32974 > Project: Flink > Issue Type: Bug > Components: Client / Job Submission >Affects Versions: 1.18.0, 1.17.2 >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Critical > > After FLINK-32226, a temporary directory(named > {{flink-rest-client-jobgraphs*}}) is created when creating a new > RestClusterClient, but this directory will never be cleaned up. > This will result in a lot of {{flink-rest-client-jobgraphs*}} directories > under {{/tmp}}, especially when using > CollectDynamicSink/CollectResultFetcher, which may cause the inode to be used > up. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32974) RestClusterClient always leaks flink-rest-client-jobgraphs* directories
[ https://issues.apache.org/jira/browse/FLINK-32974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang updated FLINK-32974: --- Affects Version/s: 1.17.2 > RestClusterClient always leaks flink-rest-client-jobgraphs* directories > --- > > Key: FLINK-32974 > URL: https://issues.apache.org/jira/browse/FLINK-32974 > Project: Flink > Issue Type: Bug > Components: Client / Job Submission >Affects Versions: 1.18.0, 1.17.2 >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Critical > > After FLINK-32226, a temporary directory(named > {{flink-rest-client-jobgraphs*}}) is created when creating a new > RestClusterClient, but this directory will never be cleaned up. > This will result in a lot of {{flink-rest-client-jobgraphs*}} directories > under {{/tmp}}, especially when using CollectDynamicSink/CollectResultFetcher. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32974) RestClusterClient always leaks flink-rest-client-jobgraphs* directories
[ https://issues.apache.org/jira/browse/FLINK-32974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang updated FLINK-32974: --- Priority: Critical (was: Major) > RestClusterClient always leaks flink-rest-client-jobgraphs* directories > --- > > Key: FLINK-32974 > URL: https://issues.apache.org/jira/browse/FLINK-32974 > Project: Flink > Issue Type: Bug > Components: Client / Job Submission >Affects Versions: 1.18.0 >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Critical > > After FLINK-32226, a temporary directory(named > {{flink-rest-client-jobgraphs*}}) is created when creating a new > RestClusterClient, but this directory will never be cleaned up. > This will result in a lot of {{flink-rest-client-jobgraphs*}} directories > under {{/tmp}}, especially when using CollectDynamicSink/CollectResultFetcher. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33084) Migrate globalJobParameter in ExecutionConfig to configuration instance
[ https://issues.apache.org/jira/browse/FLINK-33084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-33084: -- Assignee: Junrui Li > Migrate globalJobParameter in ExecutionConfig to configuration instance > --- > > Key: FLINK-33084 > URL: https://issues.apache.org/jira/browse/FLINK-33084 > Project: Flink > Issue Type: Technical Debt > Components: Runtime / Configuration >Reporter: Junrui Li >Assignee: Junrui Li >Priority: Major > Fix For: 1.19.0 > > > Currently, the globalJobParameter field in ExecutionConfig has not been > migrated to the Configuration. Considering the goal of unifying configuration > options, it is necessary to migrate it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33080) The checkpoint storage configured in the job level by config option will not take effect
[ https://issues.apache.org/jira/browse/FLINK-33080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-33080: -- Assignee: Junrui Li > The checkpoint storage configured in the job level by config option will not > take effect > > > Key: FLINK-33080 > URL: https://issues.apache.org/jira/browse/FLINK-33080 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Reporter: Junrui Li >Assignee: Junrui Li >Priority: Major > Fix For: 1.19.0 > > > When we configure the checkpoint storage at the job level, it can only be > done through the following method: > {code:java} > StreamExecutionEnvironment env = > StreamExecutionEnvironment.getExecutionEnvironment(); > env.getCheckpointConfig().setCheckpointStorage(xxx); {code} > However, configuring the checkpoint storage via the job-side configuration as > follows will not take effect: > {code:java} > Configuration configuration = new Configuration(); > StreamExecutionEnvironment env = > StreamExecutionEnvironment.getExecutionEnvironment(configuration); > configuration.set(CheckpointingOptions.CHECKPOINT_STORAGE, "filesystem"); > configuration.set(CheckpointingOptions.CHECKPOINTS_DIRECTORY, checkpointDir); > {code} > This behavior is unexpected; this way of configuring should take effect as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-32974) RestClusterClient always leaks flink-rest-client-jobgraphs* directories
[ https://issues.apache.org/jira/browse/FLINK-32974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-32974: -- Assignee: Lijie Wang > RestClusterClient always leaks flink-rest-client-jobgraphs* directories > --- > > Key: FLINK-32974 > URL: https://issues.apache.org/jira/browse/FLINK-32974 > Project: Flink > Issue Type: Bug > Components: Client / Job Submission >Affects Versions: 1.18.0 >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > > After FLINK-32226, a temporary directory(named > {{flink-rest-client-jobgraphs*}}) is created when creating a new > RestClusterClient, but this directory will never be cleaned up. > This will result in a lot of {{flink-rest-client-jobgraphs*}} directories > under {{/tmp}}, especially when using CollectDynamicSink/CollectResultFetcher. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-32865) DynamicFilteringDataCollectorOperator can't chain with the upstream operator when the parallelism is inconsistent
[ https://issues.apache.org/jira/browse/FLINK-32865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang closed FLINK-32865. -- Assignee: Lijie Wang Resolution: Fixed > DynamicFilteringDataCollectorOperator can't chain with the upstream operator > when the parallelism is inconsistent > - > > Key: FLINK-32865 > URL: https://issues.apache.org/jira/browse/FLINK-32865 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.16.2, 1.18.0, 1.17.1 >Reporter: dalongliu >Assignee: Lijie Wang >Priority: Major > Fix For: 1.18.0 > > Attachments: image-2023-08-14-19-17-22-109.png > > > !image-2023-08-14-19-17-22-109.png! > > If the DynamicFilteringDataCollectorOperator parallelism is not consistent > with the upstream operator, they can't chain together; this will cause the > DynamicFilteringDataCollectorOperator to execute after the fact source, so > the DPP won't work. Because the operator parallelism is decided at runtime, > we should forcibly add a scheduling dependency in the compile phase. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32865) DynamicFilteringDataCollectorOperator can't chain with the upstream operator when the parallelism is inconsistent
[ https://issues.apache.org/jira/browse/FLINK-32865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang updated FLINK-32865: --- Fix Version/s: 1.19.0 > DynamicFilteringDataCollectorOperator can't chain with the upstream operator > when the parallelism is inconsistent > - > > Key: FLINK-32865 > URL: https://issues.apache.org/jira/browse/FLINK-32865 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.16.2, 1.18.0, 1.17.1 >Reporter: dalongliu >Assignee: Lijie Wang >Priority: Major > Fix For: 1.18.0, 1.19.0 > > Attachments: image-2023-08-14-19-17-22-109.png > > > !image-2023-08-14-19-17-22-109.png! > > If the DynamicFilteringDataCollectorOperator parallelism is not consistent > with the upstream operator, they can't chain together; this will cause the > DynamicFilteringDataCollectorOperator to execute after the fact source, so > the DPP won't work. Because the operator parallelism is decided at runtime, > we should forcibly add a scheduling dependency in the compile phase. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32865) DynamicFilteringDataCollectorOperator can't chain with the upstream operator when the parallelism is inconsistent
[ https://issues.apache.org/jira/browse/FLINK-32865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17760389#comment-17760389 ] Lijie Wang commented on FLINK-32865: Fixed via: master(1.19): 5ce5c8142c8c360bc10804fa1e52b13a3376a50f 3fffed00df3234106464642f555275b846ff74b3 release-1.18: 83915d909dce3832ba2c3dbd9b1624715466aa2d 94ba97a13abc7e70a55d929fd5fdf85d52417639 > DynamicFilteringDataCollectorOperator can't chain with the upstream operator > when the parallelism is inconsistent > - > > Key: FLINK-32865 > URL: https://issues.apache.org/jira/browse/FLINK-32865 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.16.2, 1.18.0, 1.17.1 >Reporter: dalongliu >Priority: Major > Fix For: 1.18.0 > > Attachments: image-2023-08-14-19-17-22-109.png > > > !image-2023-08-14-19-17-22-109.png! > > If the DynamicFilteringDataCollectorOperator parallelism is not consistent > with the upstream operator, they can't chain together; this will cause the > DynamicFilteringDataCollectorOperator to execute after the fact source, so > the DPP won't work. Because the operator parallelism is decided at runtime, > we should forcibly add a scheduling dependency in the compile phase. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31650) Incorrect busyMsTimePerSecond metric value for FINISHED task
[ https://issues.apache.org/jira/browse/FLINK-31650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17760197#comment-17760197 ] Lijie Wang commented on FLINK-31650: Assigned to you [~Zhanghao Chen] > Incorrect busyMsTimePerSecond metric value for FINISHED task > > > Key: FLINK-31650 > URL: https://issues.apache.org/jira/browse/FLINK-31650 > Project: Flink > Issue Type: Bug > Components: Runtime / Metrics, Runtime / REST >Affects Versions: 1.17.0, 1.16.1, 1.15.4 >Reporter: Lijie Wang >Assignee: Junrui Li >Priority: Major > Attachments: busyMsTimePerSecond.png > > > As shown in the figure below, the busyMsTimePerSecond of the FINISHED task is > 100%, which is obviously unreasonable. > !busyMsTimePerSecond.png|width=1048,height=432! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-31650) Incorrect busyMsTimePerSecond metric value for FINISHED task
[ https://issues.apache.org/jira/browse/FLINK-31650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-31650: -- Assignee: Zhanghao Chen (was: Junrui Li) > Incorrect busyMsTimePerSecond metric value for FINISHED task > > > Key: FLINK-31650 > URL: https://issues.apache.org/jira/browse/FLINK-31650 > Project: Flink > Issue Type: Bug > Components: Runtime / Metrics, Runtime / REST >Affects Versions: 1.17.0, 1.16.1, 1.15.4 >Reporter: Lijie Wang >Assignee: Zhanghao Chen >Priority: Major > Attachments: busyMsTimePerSecond.png > > > As shown in the figure below, the busyMsTimePerSecond of the FINISHED task is > 100%, which is obviously unreasonable. > !busyMsTimePerSecond.png|width=1048,height=432! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32974) RestClusterClient always leaks flink-rest-client-jobgraphs* directories
[ https://issues.apache.org/jira/browse/FLINK-32974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang updated FLINK-32974: --- Summary: RestClusterClient always leaks flink-rest-client-jobgraphs* directories (was: RestClusterClient leaks flink-rest-client-jobgraphs* directories) > RestClusterClient always leaks flink-rest-client-jobgraphs* directories > --- > > Key: FLINK-32974 > URL: https://issues.apache.org/jira/browse/FLINK-32974 > Project: Flink > Issue Type: Bug > Components: Client / Job Submission >Affects Versions: 1.18.0 >Reporter: Lijie Wang >Priority: Major > > After FLINK-32226, a temporary directory(named > {{flink-rest-client-jobgraphs*}}) is created when creating a new > RestClusterClient, but this directory will never be cleaned up. > This will result in a lot of {{flink-rest-client-jobgraphs*}} directories > under {{/tmp}}, especially when using CollectDynamicSink/CollectResultFetcher. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32974) RestClusterClient leaks flink-rest-client-jobgraphs* directories
Lijie Wang created FLINK-32974: -- Summary: RestClusterClient leaks flink-rest-client-jobgraphs* directories Key: FLINK-32974 URL: https://issues.apache.org/jira/browse/FLINK-32974 Project: Flink Issue Type: Bug Components: Client / Job Submission Affects Versions: 1.18.0 Reporter: Lijie Wang After FLINK-32226, a temporary directory (named {{flink-rest-client-jobgraphs*}}) is created when creating a new RestClusterClient, but this directory will never be cleaned up. This will result in a lot of {{flink-rest-client-jobgraphs*}} directories under {{/tmp}}, especially when using CollectDynamicSink/CollectResultFetcher. -- This message was sent by Atlassian Jira (v8.20.10#820010)
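The leak reported in FLINK-32974 above follows a generic pattern: a per-client temp directory whose cleanup is not tied to the client's lifetime. A minimal sketch of the general remedy (this is a hypothetical `TempDirClient`, not Flink's actual `RestClusterClient` or its eventual fix): delete the directory when the client is closed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Minimal sketch (not Flink's actual client): tie the temp directory's
// lifetime to the client so nothing piles up under /tmp.
public class TempDirClient implements AutoCloseable {
    private final Path jobGraphDir;

    public TempDirClient() {
        try {
            // Same naming scheme as in the report, created under java.io.tmpdir.
            this.jobGraphDir = Files.createTempDirectory("flink-rest-client-jobgraphs");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public Path getJobGraphDir() {
        return jobGraphDir;
    }

    @Override
    public void close() {
        // Deepest paths sort last, so delete in reverse order:
        // children first, then the directory itself.
        try (Stream<Path> paths = Files.walk(jobGraphDir)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir;
        try (TempDirClient client = new TempDirClient()) {
            dir = client.getJobGraphDir();
            System.out.println(Files.exists(dir)); // prints true while open
        }
        System.out.println(Files.exists(dir)); // prints false after close()
    }
}
```

In the real fix the cleanup would have to hook into the actual client's close path; the point of the sketch is only that the directory's deletion must be owned by the object that created it.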
[jira] [Closed] (FLINK-32831) RuntimeFilterProgram should aware join type when looking for the build side
[ https://issues.apache.org/jira/browse/FLINK-32831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang closed FLINK-32831. -- Fix Version/s: 1.18.0 1.19.0 Resolution: Fixed > RuntimeFilterProgram should aware join type when looking for the build side > --- > > Key: FLINK-32831 > URL: https://issues.apache.org/jira/browse/FLINK-32831 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0, 1.19.0 > > > Currently, the runtime filter program will try to look for an {{Exchange}} as > the build side to avoid affecting {{MultiInput}}. It will try to push down the > runtime filter builder if the original build side is not an {{Exchange}}. > Currently, the builder push-down is not aware of the join type, which may lead > to incorrect results (for example, pushing the builder down to the right input > of a left join). We should only support the following cases: > 1. Inner join: builder can push to left + right input > 2. Semi join: builder can push to left + right input > 3. Left join: builder can only push to the left input > 4. Right join: builder can only push to the right input -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32831) RuntimeFilterProgram should aware join type when looking for the build side
[ https://issues.apache.org/jira/browse/FLINK-32831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759405#comment-17759405 ] Lijie Wang commented on FLINK-32831: Fixed via master: 43bcf319ff6104e1af829e34ae8a430c6417622d f518f8a5bdfcde4912961f60075e530399160f43 07888af59bbc2d24afa049e8a6aedcd9eb822986 a68dd419718b4304343c2b27dab94394c88c67b5 release-1.18: ac3f3cf4802d0271349636322dbd16772e86453b e48a02628349cdfdecf7a1aedd2a576fa2a7caf3 6938b928e0104f6b222a4281fbdc7bde20260f3b fa2ab5e3bfe401d0c599c8ffd224aafb69d6d503 > RuntimeFilterProgram should aware join type when looking for the build side > --- > > Key: FLINK-32831 > URL: https://issues.apache.org/jira/browse/FLINK-32831 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > > Currently, the runtime filter program will try to look for an {{Exchange}} as > the build side to avoid affecting {{MultiInput}}. It will try to push down the > runtime filter builder if the original build side is not an {{Exchange}}. > Currently, the builder push-down is not aware of the join type, which may lead > to incorrect results (for example, pushing the builder down to the right input > of a left join). We should only support the following cases: > 1. Inner join: builder can push to left + right input > 2. Semi join: builder can push to left + right input > 3. Left join: builder can only push to the left input > 4. Right join: builder can only push to the right input -- This message was sent by Atlassian Jira (v8.20.10#820010)
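The four cases listed in FLINK-32831 above amount to a small decision table. A hedged sketch of that rule in plain Java (the class and enum names here are made up for illustration; this is not the actual RuntimeFilterProgram code):

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch of the push-down rule from the issue: given a join
// type, return the inputs the runtime filter builder may be pushed into
// without risking wrong results.
public class BuilderPushDownRule {
    public enum JoinType { INNER, SEMI, LEFT, RIGHT }
    public enum Input { LEFT_INPUT, RIGHT_INPUT }

    public static Set<Input> pushableInputs(JoinType joinType) {
        switch (joinType) {
            case INNER: // inner join: both inputs are safe
            case SEMI:  // semi join: both inputs are safe
                return EnumSet.of(Input.LEFT_INPUT, Input.RIGHT_INPUT);
            case LEFT:  // left join preserves unmatched left rows, so per the
                        // issue the builder may only go to the left input
                return EnumSet.of(Input.LEFT_INPUT);
            case RIGHT: // symmetric to the left-join case
                return EnumSet.of(Input.RIGHT_INPUT);
            default:
                throw new IllegalArgumentException("unsupported join type: " + joinType);
        }
    }

    public static void main(String[] args) {
        System.out.println(pushableInputs(JoinType.LEFT)); // prints [LEFT_INPUT]
    }
}
```

Pushing the builder into a disallowed input (the right input of a left join, as the issue's example) would filter rows that the join is required to preserve, which is exactly the wrong-results scenario described above.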
[jira] [Updated] (FLINK-32780) Release Testing: Verify FLIP-324: Introduce Runtime Filter for Flink Batch Jobs
[ https://issues.apache.org/jira/browse/FLINK-32780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang updated FLINK-32780: --- Description: This issue aims to verify FLIP-324: https://cwiki.apache.org/confluence/display/FLINK/FLIP-324%3A+Introduce+Runtime+Filter+for+Flink+Batch+Jobs We can enable the runtime filter by setting: table.optimizer.runtime-filter.enabled: true 1. Create two tables, one small table (small amount of data), one large table (large amount of data), and then run a join query on these two tables (such as the example in the FLIP doc: SELECT * FROM fact, dim WHERE x = a AND z = 2). The Flink table planner should be able to obtain the statistical information of these two tables (for example, a Hive table), and the data volume of the small table should be less than "table.optimizer.runtime-filter.max-build-data-size", and the data volume of the large table should be larger than "table.optimizer.runtime-filter.min-probe-data-size". 2. Show the plan of the join query. The plan should include nodes such as LocalRuntimeFilterBuilder, GlobalRuntimeFilterBuilder and RuntimeFilter. We can also verify the plan for various variants of the above query. 3. Execute the above plan, and: * Check whether the data in the large table has been successfully filtered * Verify the execution result; it should be the same as when the runtime filter is disabled. > Release Testing: Verify FLIP-324: Introduce Runtime Filter for Flink Batch > Jobs > --- > > Key: FLINK-32780 > URL: https://issues.apache.org/jira/browse/FLINK-32780 > Project: Flink > Issue Type: Sub-task > Components: Tests >Affects Versions: 1.18.0 >Reporter: Qingsheng Ren >Assignee: dalongliu >Priority: Major > Fix For: 1.18.0 > > > This issue aims to verify FLIP-324: > https://cwiki.apache.org/confluence/display/FLINK/FLIP-324%3A+Introduce+Runtime+Filter+for+Flink+Batch+Jobs > We can enable the runtime filter by setting: table.optimizer.runtime-filter.enabled: > true > 1.
Create two tables, one small table (small amount of data), one large table > (large amount of data), and then run a join query on these two tables (such as > the example in the FLIP doc: SELECT * FROM fact, dim WHERE x = a AND z = 2). The > Flink table planner should be able to obtain the statistical information of > these two tables (for example, a Hive table), and the data volume of the small > table should be less than > "table.optimizer.runtime-filter.max-build-data-size", and the data volume of > the large table should be larger than > "table.optimizer.runtime-filter.min-probe-data-size". > 2. Show the plan of the join query. The plan should include nodes such as > LocalRuntimeFilterBuilder, GlobalRuntimeFilterBuilder and RuntimeFilter. We > can also verify the plan for various variants of the above query. > 3. Execute the above plan, and: > * Check whether the data in the large table has been successfully filtered > * Verify the execution result; it should be the same as when the runtime > filter is disabled. -- This message was sent by Atlassian Jira (v8.20.10#820010)
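The three verification steps in FLINK-32780 above can be sketched as a SQL session. This is a hedged sketch: the query and option name come from the issue, but the `fact`/`dim` tables are assumed to exist already, and the exact `SET` syntax and plan node names should be checked against the release being tested.

```sql
-- Step 0: enable the runtime filter (option name from the issue)
SET 'table.optimizer.runtime-filter.enabled' = 'true';

-- Step 2: the plan should contain LocalRuntimeFilterBuilder,
-- GlobalRuntimeFilterBuilder and RuntimeFilter nodes
EXPLAIN SELECT * FROM fact, dim WHERE x = a AND z = 2;

-- Step 3: run the query, then rerun with the option set to 'false'
-- and compare the results; they should be identical
SELECT * FROM fact, dim WHERE x = a AND z = 2;
```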
[jira] [Assigned] (FLINK-32780) Release Testing: Verify FLIP-324: Introduce Runtime Filter for Flink Batch Jobs
[ https://issues.apache.org/jira/browse/FLINK-32780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-32780: -- Assignee: dalongliu > Release Testing: Verify FLIP-324: Introduce Runtime Filter for Flink Batch > Jobs > --- > > Key: FLINK-32780 > URL: https://issues.apache.org/jira/browse/FLINK-32780 > Project: Flink > Issue Type: Sub-task > Components: Tests >Affects Versions: 1.18.0 >Reporter: Qingsheng Ren >Assignee: dalongliu >Priority: Major > Fix For: 1.18.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32827) Operator fusion codegen may not take effect when enable runtime filter
[ https://issues.apache.org/jira/browse/FLINK-32827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758317#comment-17758317 ] Lijie Wang commented on FLINK-32827: Fixed via master: 679436390db2ac1b54584dbafb6e1091f2f16ada release-1.18: 9cd04bd004fbf2d0fa4266cf07909c0bbcc94813 > Operator fusion codegen may not take effect when enable runtime filter > -- > > Key: FLINK-32827 > URL: https://issues.apache.org/jira/browse/FLINK-32827 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Lijie Wang >Assignee: dalongliu >Priority: Major > Labels: pull-request-available > > Currently, the RuntimeFilterOperator does not support operator fusion > codegen (OFCG), which means the runtime filter and OFCG cannot take effect > together; we should fix it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-32827) Operator fusion codegen may not take effect when enable runtime filter
[ https://issues.apache.org/jira/browse/FLINK-32827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang closed FLINK-32827. -- Fix Version/s: 1.18.0 Resolution: Fixed > Operator fusion codegen may not take effect when enable runtime filter > -- > > Key: FLINK-32827 > URL: https://issues.apache.org/jira/browse/FLINK-32827 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Lijie Wang >Assignee: dalongliu >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > > Currently, the RuntimeFilterOperator does not support operator fusion > codegen (OFCG), which means the runtime filter and OFCG cannot take effect > together; we should fix it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-32844) Runtime Filter should not be applied if the field is already filtered by DPP
[ https://issues.apache.org/jira/browse/FLINK-32844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang closed FLINK-32844. -- Fix Version/s: 1.18.0 Resolution: Fixed > Runtime Filter should not be applied if the field is already filtered by DPP > > > Key: FLINK-32844 > URL: https://issues.apache.org/jira/browse/FLINK-32844 > Project: Flink > Issue Type: Bug >Affects Versions: 1.18.0 >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > > Currently, the runtime filter and DPP may take effect on the same key. In > this case, the runtime filter may be redundant because the data may have been > filtered out by the DPP. We should avoid this because redundant runtime > filters can have negative effects. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32844) Runtime Filter should not be applied if the field is already filtered by DPP
[ https://issues.apache.org/jira/browse/FLINK-32844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17754890#comment-17754890 ] Lijie Wang commented on FLINK-32844: Fix via master(1.18): 6b7d6d67808b7fcdd6fc93222c6c7055c4343475 > Runtime Filter should not be applied if the field is already filtered by DPP > > > Key: FLINK-32844 > URL: https://issues.apache.org/jira/browse/FLINK-32844 > Project: Flink > Issue Type: Bug >Affects Versions: 1.18.0 >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > > Currently, the runtime filter and DPP may take effect on the same key. In > this case, the runtime filter may be redundant because the data may have been > filtered out by the DPP. We should avoid this because redundant runtime > filters can have negative effects. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-32876) ExecutionTimeBasedSlowTaskDetector treats unscheduled tasks as slow tasks and causes speculative execution to fail.
[ https://issues.apache.org/jira/browse/FLINK-32876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-32876: -- Assignee: Junrui Li > ExecutionTimeBasedSlowTaskDetector treats unscheduled tasks as slow tasks and > causes speculative execution to fail. > --- > > Key: FLINK-32876 > URL: https://issues.apache.org/jira/browse/FLINK-32876 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.18.0 >Reporter: Junrui Li >Assignee: Junrui Li >Priority: Major > Fix For: 1.18.0 > > > When we enable speculative execution and configure a job with the following > configuration: > {code:java} > execution.batch.speculative.enabled: true > slow-task-detector.execution-time.baseline-ratio: 0.0 > slow-task-detector.execution-time.baseline-lower-bound: 0s{code} > the ExecutionTimeBasedSlowTaskDetector will identify ExecutionJobVertices that > have not yet been scheduled as slow tasks and report them to the > SpeculativeScheduler. However, the SpeculativeScheduler requires that the > corresponding ExecutionVertex has entered the scheduled state before > scheduling backup tasks. If this requirement is not met, it will result in > speculative execution failure.
> The exception stack trace is as follows: > {code:java} > java.lang.IllegalStateException: Execution vertex > b3f44e8b1dc132ff2a47f7955c75ef7d_0 does not have a recorded version at > org.apache.flink.util.Preconditions.checkState(Preconditions.java:215) > ~[flink-dist-1.18-SNAPSHOT.jar:1.18-SNAPSHOT] at > org.apache.flink.runtime.scheduler.ExecutionVertexVersioner.getCurrentVersion(ExecutionVertexVersioner.java:71) > ~[flink-dist-1.18-SNAPSHOT.jar:1.18-SNAPSHOT] at > org.apache.flink.runtime.scheduler.ExecutionVertexVersioner.lambda$getExecutionVertexVersions$1(ExecutionVertexVersioner.java:89) > ~[flink-dist-1.18-SNAPSHOT.jar:1.18-SNAPSHOT] at > java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) > ~[?:1.8.0_333] at > java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1580) > ~[?:1.8.0_333] at > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) > ~[?:1.8.0_333] at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) > ~[?:1.8.0_333] at > java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) > ~[?:1.8.0_333] at > java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) > ~[?:1.8.0_333] at > java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) > ~[?:1.8.0_333] at > org.apache.flink.runtime.scheduler.ExecutionVertexVersioner.getExecutionVertexVersions(ExecutionVertexVersioner.java:90) > ~[flink-dist-1.18-SNAPSHOT.jar:1.18-SNAPSHOT] at > org.apache.flink.runtime.scheduler.adaptivebatch.SpeculativeScheduler.notifySlowTasks(SpeculativeScheduler.java:377) > ~[flink-dist-1.18-SNAPSHOT.jar:1.18-SNAPSHOT] at > org.apache.flink.runtime.scheduler.slowtaskdetector.ExecutionTimeBasedSlowTaskDetector.lambda$scheduleTask$1(ExecutionTimeBasedSlowTaskDetector.java:129) > ~[flink-dist-1.18-SNAPSHOT.jar:1.18-SNAPSHOT] at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[?:1.8.0_333] at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[?:1.8.0_333] at > org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.lambda$handleRunAsync$4(PekkoRpcActor.java:451) > ~[flink-rpc-akka48a43f0a-d73c-494a-a57b-ded9f5d82a84.jar:1.18-SNAPSHOT] at > org.apache.flink.runtime.concurrent.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) > ~[flink-dist-1.18-SNAPSHOT.jar:1.18-SNAPSHOT] at > org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRunAsync(PekkoRpcActor.java:451) > ~[flink-rpc-akka48a43f0a-d73c-494a-a57b-ded9f5d82a84.jar:1.18-SNAPSHOT] at > org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRpcMessage(PekkoRpcActor.java:218) > ~[flink-rpc-akka48a43f0a-d73c-494a-a57b-ded9f5d82a84.jar:1.18-SNAPSHOT] at > org.apache.flink.runtime.rpc.pekko.FencedPekkoRpcActor.handleRpcMessage(FencedPekkoRpcActor.java:85) > ~[flink-rpc-akka48a43f0a-d73c-494a-a57b-ded9f5d82a84.jar:1.18-SNAPSHOT] at > org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleMessage(PekkoRpcActor.java:168) > ~[flink-rpc-akka48a43f0a-d73c-494a-a57b-ded9f5d82a84.jar:1.18-SNAPSHOT] at > org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:33) > [flink-rpc-akka48a43f0a-d73c-494a-a57b-ded9f5d82a84.jar:1.18-SNAPSHOT] at > org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:29) > [flink-rpc-akka48a43f0a-d73c-494a-a57b-ded9f5d82a84.jar:1.18-SNAPSHOT] at > scala.PartialFunction.applyOrElse(PartialFunction.
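The misconfiguration above can be illustrated with a small, self-contained sketch. This is a hypothetical simplification (the class, the {{Task}} type, and the baseline formula are made up for illustration), not Flink's actual ExecutionTimeBasedSlowTaskDetector: with a baseline ratio of 0.0 and a 0s lower bound, the computed baseline collapses to 0 ms, so even a vertex that was never scheduled compares at or above it and gets reported as slow.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of an execution-time-based slow-task check; names and the
// baseline formula are assumptions for illustration, not Flink's real code.
public class SlowTaskBaselineSketch {

    static final class Task {
        final String id;
        final long startMillis; // < 0 means the task was never scheduled
        Task(String id, long startMillis) { this.id = id; this.startMillis = startMillis; }
    }

    // baseline = max(ratio * referenceRuntime, lowerBound); with ratio 0.0 and
    // lower bound 0s this collapses to 0 ms.
    static long baseline(double ratio, long lowerBoundMillis, long referenceRuntimeMillis) {
        return Math.max((long) (ratio * referenceRuntimeMillis), lowerBoundMillis);
    }

    static List<String> detectSlow(List<Task> tasks, double ratio, long lowerBoundMillis, long nowMillis) {
        long referenceRuntimeMillis = 60_000; // assume a typical task runs for one minute
        long threshold = baseline(ratio, lowerBoundMillis, referenceRuntimeMillis);
        List<String> slow = new ArrayList<>();
        for (Task t : tasks) {
            // An unscheduled task has no runtime at all, yet 0 ms still compares
            // >= a 0 ms baseline, so it gets flagged as "slow".
            long executionTime = (t.startMillis < 0) ? 0 : nowMillis - t.startMillis;
            if (executionTime >= threshold) {
                slow.add(t.id);
            }
        }
        return slow;
    }

    public static void main(String[] args) {
        List<Task> tasks = List.of(new Task("scheduled", 90_000), new Task("unscheduled", -1));
        // Misconfigured: baseline 0 -> every task, scheduled or not, is flagged.
        System.out.println(detectSlow(tasks, 0.0, 0, 100_000));
        // With a positive baseline, the never-scheduled task is no longer flagged.
        System.out.println(detectSlow(tasks, 0.1, 0, 100_000));
    }
}
```

In this sketch the guard the SpeculativeScheduler expects (only scheduled vertices may be reported) is absent, which mirrors why the {{does not have a recorded version}} precondition fails.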
[jira] [Assigned] (FLINK-32844) Runtime Filter should not be applied if the field is already filtered by DPP
[ https://issues.apache.org/jira/browse/FLINK-32844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-32844: -- Assignee: Lijie Wang > Runtime Filter should not be applied if the field is already filtered by DPP > > > Key: FLINK-32844 > URL: https://issues.apache.org/jira/browse/FLINK-32844 > Project: Flink > Issue Type: Bug >Affects Versions: 1.18.0 >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > > Currently, the runtime filter and DPP may take effect on the same key. In > this case, the runtime filter may be redundant because the data may have been > filtered out by the DPP. We should avoid this because redundant runtime > filters can have negative effects. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32844) Runtime Filter should not be applied if the field is already filtered by DPP
Lijie Wang created FLINK-32844: -- Summary: Runtime Filter should not be applied if the field is already filtered by DPP Key: FLINK-32844 URL: https://issues.apache.org/jira/browse/FLINK-32844 Project: Flink Issue Type: Bug Affects Versions: 1.18.0 Reporter: Lijie Wang Currently, the runtime filter and DPP may take effect on the same key. In this case, the runtime filter may be redundant because the data may have been filtered out by the DPP. We should avoid this because redundant runtime filters can have negative effects. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-32831) RuntimeFilterProgram should aware join type when looking for the build side
[ https://issues.apache.org/jira/browse/FLINK-32831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-32831: -- Assignee: Lijie Wang > RuntimeFilterProgram should aware join type when looking for the build side > --- > > Key: FLINK-32831 > URL: https://issues.apache.org/jira/browse/FLINK-32831 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > > Currently, the runtime filter program tries to use an {{Exchange}} as the > build side to avoid affecting {{MultiInput}}, and it tries to push down the > runtime filter builder if the original build side is not an {{Exchange}}. > However, the builder push-down is not aware of the join type, which may lead > to incorrect results (for example, pushing the builder down to the right input > of a left join). > We should only support the following cases: > 1. Inner join: the builder can be pushed to the left or right input > 2. Semi join: the builder can be pushed to the left or right input > 3. Left join: the builder can only be pushed to the left input > 4. Right join: the builder can only be pushed to the right input -- This message was sent by Atlassian Jira (v8.20.10#820010)
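The four cases listed above amount to a small lookup. The sketch below is a hypothetical illustration of the intended rule ({{JoinType}}, {{Side}}, and {{legalBuilderTargets}} are made-up names, not the planner's actual API):

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch of the intended builder push-down rule; not Flink's
// planner code. Mirrors the four cases from the issue description.
public class RuntimeFilterPushDownRule {

    enum JoinType { INNER, SEMI, LEFT, RIGHT }

    enum Side { LEFT_INPUT, RIGHT_INPUT }

    // Inputs into which the runtime filter builder may safely be pushed down
    // for a given join type.
    static Set<Side> legalBuilderTargets(JoinType type) {
        switch (type) {
            case INNER:
            case SEMI:
                return EnumSet.of(Side.LEFT_INPUT, Side.RIGHT_INPUT);
            case LEFT:
                return EnumSet.of(Side.LEFT_INPUT);
            case RIGHT:
                return EnumSet.of(Side.RIGHT_INPUT);
            default:
                throw new IllegalArgumentException("unexpected join type: " + type);
        }
    }

    public static void main(String[] args) {
        // A left join only tolerates builder push-down into its left input;
        // pushing into the right input is the incorrect-results case above.
        System.out.println(legalBuilderTargets(JoinType.LEFT));
    }
}
```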
[jira] [Updated] (FLINK-32831) RuntimeFilterProgram should aware join type when looking for the build side
[ https://issues.apache.org/jira/browse/FLINK-32831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang updated FLINK-32831: --- Description: Currently, the runtime filter program tries to use an {{Exchange}} as the build side to avoid affecting {{MultiInput}}, and it tries to push down the runtime filter builder if the original build side is not an {{Exchange}}. However, the builder push-down is not aware of the join type, which may lead to incorrect results (for example, pushing the builder down to the right input of a left join). We should only support the following cases: 1. Inner join: the builder can be pushed to the left or right input 2. Semi join: the builder can be pushed to the left or right input 3. Left join: the builder can only be pushed to the left input 4. Right join: the builder can only be pushed to the right input was: Currently, runtime filter program will try to look for an {{Exchange}} as build side to avoid affecting {{MultiInput}}. It will try to push down the runtime filter builder if the original build side is not {{Exchange}}. Currenlty, the builder-push-down does not aware the join type, which may lead to incorrect results(For example, push down the builder to the right input of left-join). We should only support following cases: 1. Inner join: builder can push to left + right input 2. semi join: builder can push to left + right input 3. left join: builder can only push to the left input 4. right join: builder can only push to the right input > RuntimeFilterProgram should aware join type when looking for the build side > --- > > Key: FLINK-32831 > URL: https://issues.apache.org/jira/browse/FLINK-32831 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Lijie Wang >Priority: Major > > Currently, the runtime filter program tries to use an {{Exchange}} as the > build side to avoid affecting {{MultiInput}}, and it tries to push down the > runtime filter builder if the original build side is not an {{Exchange}}. 
> However, the builder push-down is not aware of the join type, which may lead > to incorrect results (for example, pushing the builder down to the right input > of a left join). > We should only support the following cases: > 1. Inner join: the builder can be pushed to the left or right input > 2. Semi join: the builder can be pushed to the left or right input > 3. Left join: the builder can only be pushed to the left input > 4. Right join: the builder can only be pushed to the right input -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32831) RuntimeFilterProgram should aware join type when looking for the build side
Lijie Wang created FLINK-32831: -- Summary: RuntimeFilterProgram should aware join type when looking for the build side Key: FLINK-32831 URL: https://issues.apache.org/jira/browse/FLINK-32831 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.18.0 Reporter: Lijie Wang Currently, the runtime filter program tries to use an {{Exchange}} as the build side to avoid affecting {{MultiInput}}, and it tries to push down the runtime filter builder if the original build side is not an {{Exchange}}. However, the builder push-down is not aware of the join type, which may lead to incorrect results (for example, pushing the builder down to the right input of a left join). We should only support the following cases: 1. Inner join: the builder can be pushed to the left or right input 2. Semi join: the builder can be pushed to the left or right input 3. Left join: the builder can only be pushed to the left input 4. Right join: the builder can only be pushed to the right input -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-32827) Operator fusion codegen may not take effect when enable runtime filter
[ https://issues.apache.org/jira/browse/FLINK-32827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-32827: -- Assignee: dalongliu > Operator fusion codegen may not take effect when enable runtime filter > -- > > Key: FLINK-32827 > URL: https://issues.apache.org/jira/browse/FLINK-32827 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Lijie Wang >Assignee: dalongliu >Priority: Major > > Currently, the RuntimeFilterOperator does not support operator fusion > codegen (OFCG), which means the Runtime Filter and OFCG cannot take effect > together; we should fix it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32827) Operator fusion codegen may not take effect when enable runtime filter
[ https://issues.apache.org/jira/browse/FLINK-32827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752724#comment-17752724 ] Lijie Wang commented on FLINK-32827: cc [~lsy] > Operator fusion codegen may not take effect when enable runtime filter > -- > > Key: FLINK-32827 > URL: https://issues.apache.org/jira/browse/FLINK-32827 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Lijie Wang >Assignee: dalongliu >Priority: Major > > Currently, the RuntimeFilterOperator does not support operator fusion > codegen (OFCG), which means the Runtime Filter and OFCG cannot take effect > together; we should fix it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32827) Operator fusion codegen may not take effect when enable runtime filter
Lijie Wang created FLINK-32827: -- Summary: Operator fusion codegen may not take effect when enable runtime filter Key: FLINK-32827 URL: https://issues.apache.org/jira/browse/FLINK-32827 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.18.0 Reporter: Lijie Wang Currently, the RuntimeFilterOperator does not support operator fusion codegen (OFCG), which means the Runtime Filter and OFCG cannot take effect together; we should fix it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-32788) SpeculativeScheduler do not handle allocate task errors when schedule speculative tasks may causes resource leakage.
[ https://issues.apache.org/jira/browse/FLINK-32788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-32788: -- Assignee: Junrui Li > SpeculativeScheduler do not handle allocate task errors when schedule > speculative tasks may causes resource leakage. > > > Key: FLINK-32788 > URL: https://issues.apache.org/jira/browse/FLINK-32788 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.18.0 >Reporter: Junrui Li >Assignee: Junrui Li >Priority: Major > > When the SpeculativeScheduler allocates slots for speculative tasks, > exceptions may occur, but there is currently no exception handling mechanism > in place. This can lead to resource leakage (such as FLINK-32768) when errors > occur. In such cases, a fatalError should be triggered. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32768) SpeculativeSchedulerITCase.testSpeculativeExecutionOfInputFormatSource times out
[ https://issues.apache.org/jira/browse/FLINK-32768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang updated FLINK-32768: --- Priority: Blocker (was: Critical) > SpeculativeSchedulerITCase.testSpeculativeExecutionOfInputFormatSource times > out > > > Key: FLINK-32768 > URL: https://issues.apache.org/jira/browse/FLINK-32768 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination, Tests >Affects Versions: 1.18.0 >Reporter: Matthias Pohl >Assignee: Junrui Li >Priority: Blocker > Labels: test-stability > > SpeculativeSchedulerITCase.testSpeculativeExecutionOfInputFormatSource is > timing out: > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=52009&view=logs&j=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3&t=0c010d0c-3dec-5bf1-d408-7b18988b1b2b&l=9022 > {code} > 2023-08-06T05:34:27.1867230Z Aug 06 05:34:27 "ForkJoinPool-1-worker-1" #14 > daemon prio=5 os_prio=0 tid=0x7fb7d4e82800 nid=0x6dde7 waiting on > condition [0x7fb7834a4000] > 2023-08-06T05:34:27.1867541Z Aug 06 05:34:27java.lang.Thread.State: > WAITING (parking) > 2023-08-06T05:34:27.186Z Aug 06 05:34:27 at sun.misc.Unsafe.park(Native > Method) > 2023-08-06T05:34:27.1868191Z Aug 06 05:34:27 - parking to wait for > <0xa77360d8> (a java.util.concurrent.CompletableFuture$Signaller) > 2023-08-06T05:34:27.1868571Z Aug 06 05:34:27 at > java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > 2023-08-06T05:34:27.1868896Z Aug 06 05:34:27 at > java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707) > 2023-08-06T05:34:27.1869240Z Aug 06 05:34:27 at > java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3313) > 2023-08-06T05:34:27.1869682Z Aug 06 05:34:27 at > java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742) > 2023-08-06T05:34:27.1870022Z Aug 06 05:34:27 at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) > 2023-08-06T05:34:27.1870395Z Aug 06 05:34:27 at > 
org.apache.flink.test.scheduling.SpeculativeSchedulerITCase.executeJob(SpeculativeSchedulerITCase.java:229) > 2023-08-06T05:34:27.1870858Z Aug 06 05:34:27 at > org.apache.flink.test.scheduling.SpeculativeSchedulerITCase.testSpeculativeExecutionOfInputFormatSource(SpeculativeSchedulerITCase.java:165) > 2023-08-06T05:34:27.1871251Z Aug 06 05:34:27 at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > [...] > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-32680) Job vertex names get messed up once there is a source vertex chained with a MultipleInput vertex in job graph
[ https://issues.apache.org/jira/browse/FLINK-32680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang closed FLINK-32680. -- Fix Version/s: 1.18.0 1.16.3 1.17.2 Resolution: Fixed > Job vertex names get messed up once there is a source vertex chained with a > MultipleInput vertex in job graph > - > > Key: FLINK-32680 > URL: https://issues.apache.org/jira/browse/FLINK-32680 > Project: Flink > Issue Type: Bug >Affects Versions: 1.16.2, 1.18.0, 1.17.1 >Reporter: Lijie Wang >Assignee: Junrui Li >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0, 1.16.3, 1.17.2 > > Attachments: image-2023-07-26-15-23-29-551.png, > image-2023-07-26-15-24-24-077.png > > > Take the following test(put it to {{MultipleInputITCase}}) as example: > {code:java} > @Test > public void testMultipleInputDoesNotChainedWithSource() throws Exception { > testJobVertexName(false); > } > > @Test > public void testMultipleInputChainedWithSource() throws Exception { > testJobVertexName(true); > } > public void testJobVertexName(boolean chain) throws Exception { > StreamExecutionEnvironment env = > StreamExecutionEnvironment.getExecutionEnvironment(); > env.setParallelism(1); > TestListResultSink resultSink = new TestListResultSink<>(); > DataStream source1 = env.fromSequence(0L, 3L).name("source1"); > DataStream source2 = env.fromElements(4L, 6L).name("source2"); > DataStream source3 = env.fromElements(7L, 9L).name("source3"); > KeyedMultipleInputTransformation transform = > new KeyedMultipleInputTransformation<>( > "MultipleInput", > new KeyedSumMultipleInputOperatorFactory(), > BasicTypeInfo.LONG_TYPE_INFO, > 1, > BasicTypeInfo.LONG_TYPE_INFO); > if (chain) { > transform.setChainingStrategy(ChainingStrategy.HEAD_WITH_SOURCES); > } > KeySelector keySelector = (KeySelector) value > -> value % 3; > env.addOperator( > transform > .addInput(source1.getTransformation(), keySelector) > .addInput(source2.getTransformation(), keySelector) > 
.addInput(source3.getTransformation(), keySelector)); > new > MultipleConnectedStreams(env).transform(transform).rebalance().addSink(resultSink).name("sink"); > env.execute(); > }{code} > > When we run {{testMultipleInputDoesNotChainedWithSource}} , all job vertex > names are normal: > !image-2023-07-26-15-24-24-077.png|width=494,height=246! > When we run {{testMultipleInputChainedWithSource}} (the MultipleInput chained > with source1), job vertex names get messed up (all job vertex names contain > {{{}Source: source1{}}}): > !image-2023-07-26-15-23-29-551.png|width=515,height=182! > > I think it's a bug. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32680) Job vertex names get messed up once there is a source vertex chained with a MultipleInput vertex in job graph
[ https://issues.apache.org/jira/browse/FLINK-32680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17749791#comment-17749791 ] Lijie Wang commented on FLINK-32680: Fixed via: master(1.18): 9e6f8e8e127e8dd81ded4a2214fd710c8ff8180a release-1.17: 5a87cfac875feabf29c0f001e5591ca48a9e5e94 release-1.16: 3c46ffb1e27d4a369c7016b631fb8d5f944c3e50 > Job vertex names get messed up once there is a source vertex chained with a > MultipleInput vertex in job graph > - > > Key: FLINK-32680 > URL: https://issues.apache.org/jira/browse/FLINK-32680 > Project: Flink > Issue Type: Bug >Affects Versions: 1.16.2, 1.18.0, 1.17.1 >Reporter: Lijie Wang >Assignee: Junrui Li >Priority: Major > Labels: pull-request-available > Attachments: image-2023-07-26-15-23-29-551.png, > image-2023-07-26-15-24-24-077.png > > > Take the following test(put it to {{MultipleInputITCase}}) as example: > {code:java} > @Test > public void testMultipleInputDoesNotChainedWithSource() throws Exception { > testJobVertexName(false); > } > > @Test > public void testMultipleInputChainedWithSource() throws Exception { > testJobVertexName(true); > } > public void testJobVertexName(boolean chain) throws Exception { > StreamExecutionEnvironment env = > StreamExecutionEnvironment.getExecutionEnvironment(); > env.setParallelism(1); > TestListResultSink resultSink = new TestListResultSink<>(); > DataStream source1 = env.fromSequence(0L, 3L).name("source1"); > DataStream source2 = env.fromElements(4L, 6L).name("source2"); > DataStream source3 = env.fromElements(7L, 9L).name("source3"); > KeyedMultipleInputTransformation transform = > new KeyedMultipleInputTransformation<>( > "MultipleInput", > new KeyedSumMultipleInputOperatorFactory(), > BasicTypeInfo.LONG_TYPE_INFO, > 1, > BasicTypeInfo.LONG_TYPE_INFO); > if (chain) { > transform.setChainingStrategy(ChainingStrategy.HEAD_WITH_SOURCES); > } > KeySelector keySelector = (KeySelector) value > -> value % 3; > env.addOperator( > transform > 
.addInput(source1.getTransformation(), keySelector) > .addInput(source2.getTransformation(), keySelector) > .addInput(source3.getTransformation(), keySelector)); > new > MultipleConnectedStreams(env).transform(transform).rebalance().addSink(resultSink).name("sink"); > env.execute(); > }{code} > > When we run {{testMultipleInputDoesNotChainedWithSource}} , all job vertex > names are normal: > !image-2023-07-26-15-24-24-077.png|width=494,height=246! > When we run {{testMultipleInputChainedWithSource}} (the MultipleInput chained > with source1), job vertex names get messed up (all job vertex names contain > {{{}Source: source1{}}}): > !image-2023-07-26-15-23-29-551.png|width=515,height=182! > > I think it's a bug. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-32486) FLIP-324: Introduce Runtime Filter for Flink Batch Jobs
[ https://issues.apache.org/jira/browse/FLINK-32486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang closed FLINK-32486. -- Release Note: We introduce runtime filter for batch jobs in 1.18, which is designed to improve join performance. It dynamically generates filter conditions for certain join queries at runtime to reduce the amount of scanned or shuffled data, avoid unnecessary I/O and network transmission, and speed up the query. It works by first building a filter (e.g. a bloom filter) based on the data on the small-table side (the build side), then passing this filter to the large-table side (the probe side) to filter out irrelevant data there. This reduces the amount of data reaching the join and improves performance. Resolution: Done > FLIP-324: Introduce Runtime Filter for Flink Batch Jobs > --- > > Key: FLINK-32486 > URL: https://issues.apache.org/jira/browse/FLINK-32486 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner, Table SQL / Runtime >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > Fix For: 1.18.0 > > > This is an umbrella ticket for > [FLIP-324|https://cwiki.apache.org/confluence/display/FLINK/FLIP-324%3A+Introduce+Runtime+Filter+for+Flink+Batch+Jobs] -- This message was sent by Atlassian Jira (v8.20.10#820010)
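The build-then-probe principle described in the release note can be sketched with a toy bloom filter. This is an illustrative simplification (the class, the hash scheme, and the sizes are made up), not Flink's actual RuntimeFilterOperator:

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy bloom-filter-style runtime filter: build on the small side, probe on the
// large side. Illustrative only; not Flink's RuntimeFilterOperator.
public class RuntimeFilterSketch {

    private final BitSet bits;
    private final int size;

    RuntimeFilterSketch(int size) {
        this.size = size;
        this.bits = new BitSet(size);
    }

    // Two cheap derived hash positions; enough for a sketch.
    private int[] positions(Object key) {
        int h = key.hashCode();
        return new int[] {Math.floorMod(h, size), Math.floorMod(h * 31 + 17, size)};
    }

    // Build side (small table): record every observed join key.
    void add(Object key) {
        for (int p : positions(key)) {
            bits.set(p);
        }
    }

    // Probe side (large table): false means the key is definitely absent from
    // the build side, so the row cannot join and can be dropped early; true may
    // be a false positive, which is safe because the join itself still filters.
    boolean mightContain(Object key) {
        for (int p : positions(key)) {
            if (!bits.get(p)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        RuntimeFilterSketch filter = new RuntimeFilterSketch(1024);
        List.of("DE", "FR").forEach(filter::add); // build-side join keys

        List<String> surviving = new ArrayList<>();
        for (String key : List.of("DE", "US", "FR", "JP")) { // probe-side rows
            if (filter.mightContain(key)) {
                surviving.add(key);
            }
        }
        System.out.println(surviving); // rows that still reach the shuffle/join
    }
}
```

Rows rejected by {{mightContain}} never reach the shuffle or the join operator, which is exactly the I/O and network saving the release note describes.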
[jira] [Commented] (FLINK-32486) FLIP-324: Introduce Runtime Filter for Flink Batch Jobs
[ https://issues.apache.org/jira/browse/FLINK-32486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17747806#comment-17747806 ] Lijie Wang commented on FLINK-32486: Thanks for the reminder, [~knaufk]. This feature is a performance optimization for joins, and we intend to enable it by default soon (maybe in the next Flink version), so we think it's not necessary to prepare a separate document for it (the description of the config option is enough); we will mention it in the release notes. > FLIP-324: Introduce Runtime Filter for Flink Batch Jobs > --- > > Key: FLINK-32486 > URL: https://issues.apache.org/jira/browse/FLINK-32486 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner, Table SQL / Runtime >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > Fix For: 1.18.0 > > > This is an umbrella ticket for > [FLIP-324|https://cwiki.apache.org/confluence/display/FLINK/FLIP-324%3A+Introduce+Runtime+Filter+for+Flink+Batch+Jobs] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-32680) Job vertex names get messed up once there is a source vertex chained with a MultipleInput vertex in job graph
[ https://issues.apache.org/jira/browse/FLINK-32680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-32680: -- Assignee: Junrui Li > Job vertex names get messed up once there is a source vertex chained with a > MultipleInput vertex in job graph > - > > Key: FLINK-32680 > URL: https://issues.apache.org/jira/browse/FLINK-32680 > Project: Flink > Issue Type: Bug >Affects Versions: 1.16.2, 1.18.0, 1.17.1 >Reporter: Lijie Wang >Assignee: Junrui Li >Priority: Major > Attachments: image-2023-07-26-15-23-29-551.png, > image-2023-07-26-15-24-24-077.png > > > Take the following test(put it to {{MultipleInputITCase}}) as example: > {code:java} > @Test > public void testMultipleInputDoesNotChainedWithSource() throws Exception { > testJobVertexName(false); > } > > @Test > public void testMultipleInputChainedWithSource() throws Exception { > testJobVertexName(true); > } > public void testJobVertexName(boolean chain) throws Exception { > StreamExecutionEnvironment env = > StreamExecutionEnvironment.getExecutionEnvironment(); > env.setParallelism(1); > TestListResultSink resultSink = new TestListResultSink<>(); > DataStream source1 = env.fromSequence(0L, 3L).name("source1"); > DataStream source2 = env.fromElements(4L, 6L).name("source2"); > DataStream source3 = env.fromElements(7L, 9L).name("source3"); > KeyedMultipleInputTransformation transform = > new KeyedMultipleInputTransformation<>( > "MultipleInput", > new KeyedSumMultipleInputOperatorFactory(), > BasicTypeInfo.LONG_TYPE_INFO, > 1, > BasicTypeInfo.LONG_TYPE_INFO); > if (chain) { > transform.setChainingStrategy(ChainingStrategy.HEAD_WITH_SOURCES); > } > KeySelector keySelector = (KeySelector) value > -> value % 3; > env.addOperator( > transform > .addInput(source1.getTransformation(), keySelector) > .addInput(source2.getTransformation(), keySelector) > .addInput(source3.getTransformation(), keySelector)); > new > 
MultipleConnectedStreams(env).transform(transform).rebalance().addSink(resultSink).name("sink"); > env.execute(); > }{code} > > When we run {{testMultipleInputDoesNotChainedWithSource}} , all job vertex > names are normal: > !image-2023-07-26-15-24-24-077.png|width=494,height=246! > When we run {{testMultipleInputChainedWithSource}} (the MultipleInput chained > with source1), job vertex names get messed up (all job vertex names contain > {{{}Source: source1{}}}): > !image-2023-07-26-15-23-29-551.png|width=515,height=182! > > I think it's a bug. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32680) Job vertex names get messed up once there is a source vertex chained with a MultipleInput vertex in job graph
[ https://issues.apache.org/jira/browse/FLINK-32680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17747411#comment-17747411 ] Lijie Wang commented on FLINK-32680: Thanks [~JunRuiLi] , assiged to you. > Job vertex names get messed up once there is a source vertex chained with a > MultipleInput vertex in job graph > - > > Key: FLINK-32680 > URL: https://issues.apache.org/jira/browse/FLINK-32680 > Project: Flink > Issue Type: Bug >Affects Versions: 1.16.2, 1.18.0, 1.17.1 >Reporter: Lijie Wang >Priority: Major > Attachments: image-2023-07-26-15-23-29-551.png, > image-2023-07-26-15-24-24-077.png > > > Take the following test(put it to {{MultipleInputITCase}}) as example: > {code:java} > @Test > public void testMultipleInputDoesNotChainedWithSource() throws Exception { > testJobVertexName(false); > } > > @Test > public void testMultipleInputChainedWithSource() throws Exception { > testJobVertexName(true); > } > public void testJobVertexName(boolean chain) throws Exception { > StreamExecutionEnvironment env = > StreamExecutionEnvironment.getExecutionEnvironment(); > env.setParallelism(1); > TestListResultSink resultSink = new TestListResultSink<>(); > DataStream source1 = env.fromSequence(0L, 3L).name("source1"); > DataStream source2 = env.fromElements(4L, 6L).name("source2"); > DataStream source3 = env.fromElements(7L, 9L).name("source3"); > KeyedMultipleInputTransformation transform = > new KeyedMultipleInputTransformation<>( > "MultipleInput", > new KeyedSumMultipleInputOperatorFactory(), > BasicTypeInfo.LONG_TYPE_INFO, > 1, > BasicTypeInfo.LONG_TYPE_INFO); > if (chain) { > transform.setChainingStrategy(ChainingStrategy.HEAD_WITH_SOURCES); > } > KeySelector keySelector = (KeySelector) value > -> value % 3; > env.addOperator( > transform > .addInput(source1.getTransformation(), keySelector) > .addInput(source2.getTransformation(), keySelector) > .addInput(source3.getTransformation(), keySelector)); > new > 
MultipleConnectedStreams(env).transform(transform).rebalance().addSink(resultSink).name("sink"); > env.execute(); > }{code} > > When we run {{testMultipleInputDoesNotChainedWithSource}} , all job vertex > names are normal: > !image-2023-07-26-15-24-24-077.png|width=494,height=246! > When we run {{testMultipleInputChainedWithSource}} (the MultipleInput chained > with source1), job vertex names get messed up (all job vertex names contain > {{{}Source: source1{}}}): > !image-2023-07-26-15-23-29-551.png|width=515,height=182! > > I think it's a bug. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32680) Job vertex names get messed up once there is a source vertex chained with a MultipleInput vertex in job graph
[ https://issues.apache.org/jira/browse/FLINK-32680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang updated FLINK-32680: --- Description: Take the following test(put it to {{MultipleInputITCase}}) as example: {code:java} @Test public void testMultipleInputDoesNotChainedWithSource() throws Exception { testJobVertexName(false); } @Test public void testMultipleInputChainedWithSource() throws Exception { testJobVertexName(true); } public void testJobVertexName(boolean chain) throws Exception { StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); env.setParallelism(1); TestListResultSink resultSink = new TestListResultSink<>(); DataStream source1 = env.fromSequence(0L, 3L).name("source1"); DataStream source2 = env.fromElements(4L, 6L).name("source2"); DataStream source3 = env.fromElements(7L, 9L).name("source3"); KeyedMultipleInputTransformation transform = new KeyedMultipleInputTransformation<>( "MultipleInput", new KeyedSumMultipleInputOperatorFactory(), BasicTypeInfo.LONG_TYPE_INFO, 1, BasicTypeInfo.LONG_TYPE_INFO); if (chain) { transform.setChainingStrategy(ChainingStrategy.HEAD_WITH_SOURCES); } KeySelector keySelector = (KeySelector) value -> value % 3; env.addOperator( transform .addInput(source1.getTransformation(), keySelector) .addInput(source2.getTransformation(), keySelector) .addInput(source3.getTransformation(), keySelector)); new MultipleConnectedStreams(env).transform(transform).rebalance().addSink(resultSink).name("sink"); env.execute(); }{code} When we run {{testMultipleInputDoesNotChainedWithSource}} , all job vertex names are normal: !image-2023-07-26-15-24-24-077.png|width=494,height=246! When we run {{testMultipleInputChainedWithSource}} (the MultipleInput chained with source1), job vertex names get messed up (all job vertex names contain {{{}Source: source1{}}}): !image-2023-07-26-15-23-29-551.png|width=515,height=182! I think it's a bug. 
was: Take the following test(put it to {{{}MultipleInputITCase{}}}) as example: {code:java} @Test public void testMultipleInputDoesNotChainedWithSource() throws Exception { testJobVertexName(false); } @Test public void testMultipleInputChainedWithSource() throws Exception { testJobVertexName(true); } public void testJobVertexName(boolean chain) throws Exception { StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); env.setParallelism(1); TestListResultSink resultSink = new TestListResultSink<>(); DataStream source1 = env.fromSequence(0L, 3L).name("source1"); DataStream source2 = env.fromElements(4L, 6L).name("source2"); DataStream source3 = env.fromElements(7L, 9L).name("source3"); KeyedMultipleInputTransformation transform = new KeyedMultipleInputTransformation<>( "MultipleInput", new KeyedSumMultipleInputOperatorFactory(), BasicTypeInfo.LONG_TYPE_INFO, 1, BasicTypeInfo.LONG_TYPE_INFO); if (chain) { transform.setChainingStrategy(ChainingStrategy.HEAD_WITH_SOURCES); } KeySelector keySelector = (KeySelector) value -> value % 3; env.addOperator( transform .addInput(source1.getTransformation(), keySelector) .addInput(source2.getTransformation(), keySelector) .addInput(source3.getTransformation(), keySelector)); new MultipleConnectedStreams(env).transform(transform).rebalance().addSink(resultSink).name("sink"); env.execute(); }{code} When we run {{{}testMultipleInputDoesNotChainedWithSource{}}}, all job vertex names are normal: !image-2023-07-26-15-24-24-077.png|width=494,height=246! When we run {{{}testMultipleInputChainedWithSource{}}}, job vertex names get messed up (all names contain {{{}Source: source1{}}}): !image-2023-07-26-15-23-29-551.png|width=515,height=182! I think it's a bug. 
> Job vertex names get messed up once there is a source vertex chained with a > MultipleInput vertex in job graph > - > > Key: FLINK-32680 > URL: https://issues.apache.org/jira/browse/FLINK-32680 > Project: Flink > Issue Type: Bug >Affects Versions: 1.1
[jira] [Updated] (FLINK-32680) Job vertex names get messed up once there is a source vertex chained with a MultipleInput vertex in job graph
[ https://issues.apache.org/jira/browse/FLINK-32680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang updated FLINK-32680: --- Attachment: (was: image-2023-07-26-15-01-51-886.png)
> Job vertex names get messed up once there is a source vertex chained with a
> MultipleInput vertex in job graph
> -
>
> Key: FLINK-32680
> URL: https://issues.apache.org/jira/browse/FLINK-32680
> Project: Flink
> Issue Type: Bug
> Affects Versions: 1.16.2, 1.18.0, 1.17.1
> Reporter: Lijie Wang
> Priority: Major
> Attachments: image-2023-07-26-15-23-29-551.png, image-2023-07-26-15-24-24-077.png
>
> Take the following test (put it into {{MultipleInputITCase}}) as an example:
> {code:java}
> @Test
> public void testMultipleInputDoesNotChainedWithSource() throws Exception {
>     testJobVertexName(false);
> }
>
> @Test
> public void testMultipleInputChainedWithSource() throws Exception {
>     testJobVertexName(true);
> }
>
> public void testJobVertexName(boolean chain) throws Exception {
>     StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
>     env.setParallelism(1);
>     TestListResultSink<Long> resultSink = new TestListResultSink<>();
>     DataStream<Long> source1 = env.fromSequence(0L, 3L).name("source1");
>     DataStream<Long> source2 = env.fromElements(4L, 6L).name("source2");
>     DataStream<Long> source3 = env.fromElements(7L, 9L).name("source3");
>     KeyedMultipleInputTransformation<Long> transform =
>             new KeyedMultipleInputTransformation<>(
>                     "MultipleInput",
>                     new KeyedSumMultipleInputOperatorFactory(),
>                     BasicTypeInfo.LONG_TYPE_INFO,
>                     1,
>                     BasicTypeInfo.LONG_TYPE_INFO);
>     if (chain) {
>         transform.setChainingStrategy(ChainingStrategy.HEAD_WITH_SOURCES);
>     }
>     KeySelector<Long, Long> keySelector = (KeySelector<Long, Long>) value -> value % 3;
>     env.addOperator(
>             transform
>                     .addInput(source1.getTransformation(), keySelector)
>                     .addInput(source2.getTransformation(), keySelector)
>                     .addInput(source3.getTransformation(), keySelector));
>     new MultipleConnectedStreams(env).transform(transform).rebalance().addSink(resultSink).name("sink");
>     env.execute();
> }{code}
>
> When we run {{testMultipleInputDoesNotChainedWithSource}}, all job vertex names are normal:
> !image-2023-07-26-15-24-24-077.png|width=494,height=246!
> When we run {{testMultipleInputChainedWithSource}}, job vertex names get messed up (all names contain {{Source: source1}}):
> !image-2023-07-26-15-23-29-551.png|width=515,height=182!
>
> I think it's a bug.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32680) Job vertex names get messed up once there is a source vertex chained with a MultipleInput vertex in job graph
[ https://issues.apache.org/jira/browse/FLINK-32680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang updated FLINK-32680: --- Description:
Take the following test (put it into {{MultipleInputITCase}}) as an example:
{code:java}
@Test
public void testMultipleInputDoesNotChainedWithSource() throws Exception {
    testJobVertexName(false);
}

@Test
public void testMultipleInputChainedWithSource() throws Exception {
    testJobVertexName(true);
}

public void testJobVertexName(boolean chain) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.setParallelism(1);
    TestListResultSink<Long> resultSink = new TestListResultSink<>();
    DataStream<Long> source1 = env.fromSequence(0L, 3L).name("source1");
    DataStream<Long> source2 = env.fromElements(4L, 6L).name("source2");
    DataStream<Long> source3 = env.fromElements(7L, 9L).name("source3");
    KeyedMultipleInputTransformation<Long> transform =
            new KeyedMultipleInputTransformation<>(
                    "MultipleInput",
                    new KeyedSumMultipleInputOperatorFactory(),
                    BasicTypeInfo.LONG_TYPE_INFO,
                    1,
                    BasicTypeInfo.LONG_TYPE_INFO);
    if (chain) {
        transform.setChainingStrategy(ChainingStrategy.HEAD_WITH_SOURCES);
    }
    KeySelector<Long, Long> keySelector = (KeySelector<Long, Long>) value -> value % 3;
    env.addOperator(
            transform
                    .addInput(source1.getTransformation(), keySelector)
                    .addInput(source2.getTransformation(), keySelector)
                    .addInput(source3.getTransformation(), keySelector));
    new MultipleConnectedStreams(env).transform(transform).rebalance().addSink(resultSink).name("sink");
    env.execute();
}{code}
When we run {{testMultipleInputDoesNotChainedWithSource}}, all job vertex names are normal:
!image-2023-07-26-15-24-24-077.png|width=494,height=246!
When we run {{testMultipleInputChainedWithSource}}, job vertex names get messed up (all names contain {{Source: source1}}):
!image-2023-07-26-15-23-29-551.png|width=515,height=182!
I think it's a bug.

was:
Take the following test (put it into {{MultipleInputITCase}}) as an example:
{code:java}
@Test
public void testMultipleInputDoesNotChainedWithSource() throws Exception {
    testJobVertexName(false);
}

@Test
public void testMultipleInputChainedWithSource() throws Exception {
    testJobVertexName(true);
}

public void testJobVertexName(boolean chain) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.setParallelism(1);
    TestListResultSink<Long> resultSink = new TestListResultSink<>();
    DataStream<Long> source1 = env.fromSequence(0L, 3L).name("source1");
    DataStream<Long> source2 = env.fromElements(4L, 6L).name("source2");
    DataStream<Long> source3 = env.fromElements(7L, 9L).name("source3");
    KeyedMultipleInputTransformation<Long> transform =
            new KeyedMultipleInputTransformation<>(
                    "MultipleInput",
                    new KeyedSumMultipleInputOperatorFactory(),
                    BasicTypeInfo.LONG_TYPE_INFO,
                    1,
                    BasicTypeInfo.LONG_TYPE_INFO);
    if (chain) {
        transform.setChainingStrategy(ChainingStrategy.HEAD_WITH_SOURCES);
    }
    KeySelector<Long, Long> keySelector = (KeySelector<Long, Long>) value -> value % 3;
    env.addOperator(
            transform
                    .addInput(source1.getTransformation(), keySelector)
                    .addInput(source2.getTransformation(), keySelector)
                    .addInput(source3.getTransformation(), keySelector));
    new MultipleConnectedStreams(env).transform(transform).rebalance().addSink(resultSink).name("sink");
    env.execute();
}{code}
When we run {{testMultipleInputDoesNotChainedWithSource}}, all job vertex names are normal.
!image-2023-07-26-15-24-24-077.png|width=494,height=246!
When we run {{testMultipleInputChainedWithSource}}, job vertex names get messed up (all names contain {{Source: source1}}). I think it's a bug.
!image-2023-07-26-15-23-29-551.png|width=515,height=182!
> Job vertex names get messed up once there is a source vertex chained with a > MultipleInput vertex in job graph > - > > Key: FLINK-32680 > URL: https://issues.apache.org/jira/browse/FLINK-32680 > Project: Flink > Issue Type: Bug >Affects Versions: 1.16.2, 1.18.0, 1.17.1 >Reporter: Lij
[jira] [Created] (FLINK-32680) Job vertex names get messed up once there is a source vertex chained with a MultipleInput vertex in job graph
Lijie Wang created FLINK-32680:
--
Summary: Job vertex names get messed up once there is a source vertex chained with a MultipleInput vertex in job graph
Key: FLINK-32680
URL: https://issues.apache.org/jira/browse/FLINK-32680
Project: Flink
Issue Type: Bug
Affects Versions: 1.17.1, 1.16.2, 1.18.0
Reporter: Lijie Wang
Attachments: image-2023-07-26-15-01-51-886.png, image-2023-07-26-15-23-29-551.png, image-2023-07-26-15-24-24-077.png

Take the following test (put it into {{MultipleInputITCase}}) as an example:
{code:java}
@Test
public void testMultipleInputDoesNotChainedWithSource() throws Exception {
    testJobVertexName(false);
}

@Test
public void testMultipleInputChainedWithSource() throws Exception {
    testJobVertexName(true);
}

public void testJobVertexName(boolean chain) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.setParallelism(1);
    TestListResultSink<Long> resultSink = new TestListResultSink<>();
    DataStream<Long> source1 = env.fromSequence(0L, 3L).name("source1");
    DataStream<Long> source2 = env.fromElements(4L, 6L).name("source2");
    DataStream<Long> source3 = env.fromElements(7L, 9L).name("source3");
    KeyedMultipleInputTransformation<Long> transform =
            new KeyedMultipleInputTransformation<>(
                    "MultipleInput",
                    new KeyedSumMultipleInputOperatorFactory(),
                    BasicTypeInfo.LONG_TYPE_INFO,
                    1,
                    BasicTypeInfo.LONG_TYPE_INFO);
    if (chain) {
        transform.setChainingStrategy(ChainingStrategy.HEAD_WITH_SOURCES);
    }
    KeySelector<Long, Long> keySelector = (KeySelector<Long, Long>) value -> value % 3;
    env.addOperator(
            transform
                    .addInput(source1.getTransformation(), keySelector)
                    .addInput(source2.getTransformation(), keySelector)
                    .addInput(source3.getTransformation(), keySelector));
    new MultipleConnectedStreams(env).transform(transform).rebalance().addSink(resultSink).name("sink");
    env.execute();
}{code}
When we run {{testMultipleInputDoesNotChainedWithSource}}, all job vertex names are normal.
!image-2023-07-26-15-24-24-077.png|width=494,height=246!
When we run {{testMultipleInputChainedWithSource}}, job vertex names get messed up (all names contain {{Source: source1}}). I think it's a bug.
!image-2023-07-26-15-23-29-551.png|width=515,height=182!
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32492) Introduce FlinkRuntimeFilterProgram to inject runtime filter
[ https://issues.apache.org/jira/browse/FLINK-32492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17746058#comment-17746058 ] Lijie Wang commented on FLINK-32492: Done via master(1.18): 8659dd788d0e9bc5e534377fd065f925a3e33bbb ad20b19fff808bf3a191279f0f137952a78c083a 72bee90ccb40b71e760d251b26c8d48e1b110307 9f73d3d81a5471998d854001a09c7299a10f1424 > Introduce FlinkRuntimeFilterProgram to inject runtime filter > > > Key: FLINK-32492 > URL: https://issues.apache.org/jira/browse/FLINK-32492 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-32492) Introduce FlinkRuntimeFilterProgram to inject runtime filter
[ https://issues.apache.org/jira/browse/FLINK-32492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang closed FLINK-32492. -- Resolution: Done > Introduce FlinkRuntimeFilterProgram to inject runtime filter > > > Key: FLINK-32492 > URL: https://issues.apache.org/jira/browse/FLINK-32492 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32045) optimize task deployment performance for large-scale jobs
[ https://issues.apache.org/jira/browse/FLINK-32045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang updated FLINK-32045: --- Fix Version/s: 1.18.0
> optimize task deployment performance for large-scale jobs
> -
>
> Key: FLINK-32045
> URL: https://issues.apache.org/jira/browse/FLINK-32045
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Reporter: Weihua Hu
> Assignee: Weihua Hu
> Priority: Major
> Fix For: 1.18.0
>
> h1. Background
> In FLINK-21110, we cache shuffle descriptors on the job manager side and support using blob servers to offload these descriptors in order to reduce the cost of task deployment.
> I think there are further improvements we could make for large-scale jobs.
> # The default minimum size to enable distribution via the blob server is 1MB. But for a large wordcount job with 2 parallelism, the size of the serialized shuffle descriptors is only 300KB. This means users need to lower "blob.offload.minsize", but the value is hard for users to decide.
> # The task executor side still needs to load blob files and deserialize shuffle descriptors for each task. Since these operations run in the main thread, they may delay other RPCs from the job manager.
> h1. Propose
> # Enable distributing shuffle descriptors via the blob server automatically. This could be decided by the edge number of the current shuffle descriptor: blob offload is enabled when the edge number exceeds an internal threshold.
> # Introduce a cache of deserialized shuffle descriptors on the task executor side. This could reduce the cost of reading from local blob files and deserialization. Of course, the cache should have a TTL to avoid occupying too much memory, and the cache should have the same switch mechanism as the blob server offload.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
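The automatic offload switch proposed above can be sketched as follows. This is an illustrative sketch only, not Flink's actual implementation: the class name, method name, and threshold value are all hypothetical.

```java
// Hypothetical sketch of the proposed automatic blob offload switch:
// offload serialized shuffle descriptors via the blob server only when the
// edge count of the shuffle descriptor group crosses an internal threshold,
// instead of relying on users to tune "blob.offload.minsize".
class ShuffleDescriptorOffloadDecision {

    // Hypothetical internal threshold; a real value would be tuned empirically.
    static final long EDGE_COUNT_THRESHOLD = 4096L;

    /**
     * Decide offload by edge count = consumers x descriptors, so the decision
     * scales with job size rather than with a serialized-bytes setting.
     */
    static boolean shouldOffload(int numConsumers, int numDescriptors) {
        long edgeCount = (long) numConsumers * numDescriptors;
        return edgeCount > EDGE_COUNT_THRESHOLD;
    }
}
```

With such a rule, a small job keeps its descriptors inline in the deployment RPC, while a large all-to-all job offloads them automatically without any user configuration.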
[jira] [Updated] (FLINK-32386) Add ShuffleDescriptorsCache in TaskExecutor to cache ShuffleDescriptors
[ https://issues.apache.org/jira/browse/FLINK-32386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang updated FLINK-32386: --- Fix Version/s: 1.18.0
> Add ShuffleDescriptorsCache in TaskExecutor to cache ShuffleDescriptors
> ---
>
> Key: FLINK-32386
> URL: https://issues.apache.org/jira/browse/FLINK-32386
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Reporter: Weihua Hu
> Assignee: Weihua Hu
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.18.0
>
> Introduce a new struct named ShuffleDescriptorsCache to cache ShuffleDescriptorAndIndex entries which are offloaded by the blob server.
> The cache should have the following capabilities:
> 1. Entries expire after exceeding the TTL.
> 2. Limit the size of the cache: remove the oldest element from the cache when its maximum size has been exceeded.
> 3. Clear elements belonging to a job when it disconnects from the TaskExecutor.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
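The three capabilities listed above (TTL expiry, a bounded size with oldest-first eviction, and per-job clearing) can be sketched with a plain LinkedHashMap. This is a toy sketch, not Flink's actual ShuffleDescriptorsCache; all names are illustrative.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy sketch of the cache capabilities described in FLINK-32386.
// Not Flink's actual ShuffleDescriptorsCache; names are illustrative.
class TtlBoundedCache<K, V> {

    private static final class Entry<V> {
        final V value;
        final String jobId;        // owning job, so we can clear per job
        final long expireAtMillis; // capability 1: TTL
        Entry(V value, String jobId, long expireAtMillis) {
            this.value = value;
            this.jobId = jobId;
            this.expireAtMillis = expireAtMillis;
        }
    }

    private final int maxSize;
    private final long ttlMillis;
    // access-order LinkedHashMap: iteration starts at the least recently used entry
    private final LinkedHashMap<K, Entry<V>> entries = new LinkedHashMap<>(16, 0.75f, true);

    TtlBoundedCache(int maxSize, long ttlMillis) {
        this.maxSize = maxSize;
        this.ttlMillis = ttlMillis;
    }

    synchronized void put(K key, V value, String jobId) {
        entries.put(key, new Entry<>(value, jobId, System.currentTimeMillis() + ttlMillis));
        // capability 2: evict the oldest entries once the maximum size is exceeded
        Iterator<Map.Entry<K, Entry<V>>> it = entries.entrySet().iterator();
        while (entries.size() > maxSize && it.hasNext()) {
            it.next();
            it.remove();
        }
    }

    synchronized V get(K key) {
        Entry<V> e = entries.get(key);
        if (e == null) {
            return null;
        }
        // capability 1: drop entries whose TTL has expired
        if (System.currentTimeMillis() > e.expireAtMillis) {
            entries.remove(key);
            return null;
        }
        return e.value;
    }

    // capability 3: clear every entry that belongs to the given job
    synchronized void clearCacheForJob(String jobId) {
        entries.values().removeIf(e -> jobId.equals(e.jobId));
    }

    synchronized int size() {
        return entries.size();
    }
}
```

A production cache would additionally want a background cleanup of expired entries and sizing by serialized bytes rather than entry count, but the sketch covers the three listed behaviors.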
[jira] [Closed] (FLINK-32387) InputGateDeploymentDescriptor uses cache to avoid deserializing shuffle descriptors multiple times
[ https://issues.apache.org/jira/browse/FLINK-32387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang closed FLINK-32387. -- Fix Version/s: 1.18.0 Resolution: Done
> InputGateDeploymentDescriptor uses cache to avoid deserializing shuffle
> descriptors multiple times
> --
>
> Key: FLINK-32387
> URL: https://issues.apache.org/jira/browse/FLINK-32387
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Reporter: Weihua Hu
> Assignee: Weihua Hu
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.18.0
>
> InputGateDeploymentDescriptor uses a cache to avoid deserializing shuffle descriptors multiple times.
> The cache only takes effect when the shuffle descriptors are offloaded by the blob server, i.e., when the shuffle descriptors are large enough to make caching worthwhile.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32387) InputGateDeploymentDescriptor uses cache to avoid deserializing shuffle descriptors multiple times
[ https://issues.apache.org/jira/browse/FLINK-32387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17746040#comment-17746040 ] Lijie Wang commented on FLINK-32387: Done via master(1.18): 7a9efbff0a2ea3e4e736c48a57a362cc9c5bbf37 > InputGateDeploymentDescriptor uses cache to avoid deserializing shuffle > descriptors multiple times > -- > > Key: FLINK-32387 > URL: https://issues.apache.org/jira/browse/FLINK-32387 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Weihua Hu >Assignee: Weihua Hu >Priority: Major > Labels: pull-request-available > > InputGateDeploymentDescriptor uses cache to avoid deserializing shuffle > descriptors multiple times. > The cache only affects when the shuffle descriptors are offloaded by the blob > server. This means the shuffle descriptors size is large enough to use caches. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32385) Introduce a struct SerializedShuffleDescriptorAndIndices to identify a group of ShuffleDescriptorAndIndex
[ https://issues.apache.org/jira/browse/FLINK-32385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang updated FLINK-32385: --- Fix Version/s: 1.18.0 > Introduce a struct SerializedShuffleDescriptorAndIndices to identify a group > of ShuffleDescriptorAndIndex > - > > Key: FLINK-32385 > URL: https://issues.apache.org/jira/browse/FLINK-32385 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Weihua Hu >Assignee: Weihua Hu >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > > Introduce a new struct named SerializedShuffleDescriptorAndIndices to > identify a group of ShuffleDescriptorAndIndex. > Then we could cache these ShuffleDescriptorAndIndex entries on the TaskExecutor side. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-32386) Add ShuffleDescriptorsCache in TaskExecutor to cache ShuffleDescriptors
[ https://issues.apache.org/jira/browse/FLINK-32386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang closed FLINK-32386. -- Resolution: Done > Add ShuffleDescriptorsCache in TaskExecutor to cache ShuffleDescriptors > --- > > Key: FLINK-32386 > URL: https://issues.apache.org/jira/browse/FLINK-32386 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Weihua Hu >Assignee: Weihua Hu >Priority: Major > Labels: pull-request-available > > Introduce a new struct named ShuffleDescriptorsCache to cache > ShuffleDescriptorAndIndex which are offloaded by the blob server. > The cache should have the following capabilities: > 1. Expired after exceeding the TTL. > 2. Limit the size of the cache. Remove the oldest element from the cache when > its maximum size has been exceeded. > 3. Clear elements belong to a job when it disconnects from TaskExecutor. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32386) Add ShuffleDescriptorsCache in TaskExecutor to cache ShuffleDescriptors
[ https://issues.apache.org/jira/browse/FLINK-32386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17746039#comment-17746039 ] Lijie Wang commented on FLINK-32386: Done via master(1.18): 4d642635c9409883724bd0127f2a00ef14993bf0 > Add ShuffleDescriptorsCache in TaskExecutor to cache ShuffleDescriptors > --- > > Key: FLINK-32386 > URL: https://issues.apache.org/jira/browse/FLINK-32386 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Weihua Hu >Assignee: Weihua Hu >Priority: Major > Labels: pull-request-available > > Introduce a new struct named ShuffleDescriptorsCache to cache > ShuffleDescriptorAndIndex which are offloaded by the blob server. > The cache should have the following capabilities: > 1. Expired after exceeding the TTL. > 2. Limit the size of the cache. Remove the oldest element from the cache when > its maximum size has been exceeded. > 3. Clear elements belong to a job when it disconnects from TaskExecutor. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-32385) Introduce a struct SerializedShuffleDescriptorAndIndices to identify a group of ShuffleDescriptorAndIndex
[ https://issues.apache.org/jira/browse/FLINK-32385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang closed FLINK-32385. -- Resolution: Done > Introduce a struct SerializedShuffleDescriptorAndIndices to identify a group > of ShuffleDescriptorAndIndex > - > > Key: FLINK-32385 > URL: https://issues.apache.org/jira/browse/FLINK-32385 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Weihua Hu >Assignee: Weihua Hu >Priority: Major > Labels: pull-request-available > > Introduce a new struct named SerializedShuffleDescriptorAndIndices to > identify a group of ShuffleDescriptorAndIndex. > Then we could cache these ShuffleDescriptorAndIndex in TaskExecutor side -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32385) Introduce a struct SerializedShuffleDescriptorAndIndices to identify a group of ShuffleDescriptorAndIndex
[ https://issues.apache.org/jira/browse/FLINK-32385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17746038#comment-17746038 ] Lijie Wang commented on FLINK-32385: Done via master(1.18): fa4518960fd009531a584ca5450604649e94bac0 > Introduce a struct SerializedShuffleDescriptorAndIndices to identify a group > of ShuffleDescriptorAndIndex > - > > Key: FLINK-32385 > URL: https://issues.apache.org/jira/browse/FLINK-32385 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Weihua Hu >Assignee: Weihua Hu >Priority: Major > Labels: pull-request-available > > Introduce a new struct named SerializedShuffleDescriptorAndIndices to > identify a group of ShuffleDescriptorAndIndex. > Then we could cache these ShuffleDescriptorAndIndex in TaskExecutor side -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-32491) Introduce RuntimeFilterOperator to support runtime filter which can reduce the shuffle data size before shuffle join
[ https://issues.apache.org/jira/browse/FLINK-32491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang closed FLINK-32491. -- Resolution: Done > Introduce RuntimeFilterOperator to support runtime filter which can reduce > the shuffle data size before shuffle join > > > Key: FLINK-32491 > URL: https://issues.apache.org/jira/browse/FLINK-32491 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Runtime >Affects Versions: 1.18.0 >Reporter: dalongliu >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32491) Introduce RuntimeFilterOperator to support runtime filter which can reduce the shuffle data size before shuffle join
[ https://issues.apache.org/jira/browse/FLINK-32491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17745872#comment-17745872 ] Lijie Wang commented on FLINK-32491: Done via master(1.18): 59850a91f96ddb5c58d8554b76e46bc6245dda7b > Introduce RuntimeFilterOperator to support runtime filter which can reduce > the shuffle data size before shuffle join > > > Key: FLINK-32491 > URL: https://issues.apache.org/jira/browse/FLINK-32491 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Runtime >Affects Versions: 1.18.0 >Reporter: dalongliu >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-32493) Introduce RuntimeFilterBuilderOperator to build a BloomFilter from build side data of shuffle join
[ https://issues.apache.org/jira/browse/FLINK-32493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang closed FLINK-32493. -- Resolution: Done > Introduce RuntimeFilterBuilderOperator to build a BloomFilter from build side > data of shuffle join > -- > > Key: FLINK-32493 > URL: https://issues.apache.org/jira/browse/FLINK-32493 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Runtime >Affects Versions: 1.18.0 >Reporter: dalongliu >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32493) Introduce RuntimeFilterBuilderOperator to build a BloomFilter from build side data of shuffle join
[ https://issues.apache.org/jira/browse/FLINK-32493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17745467#comment-17745467 ] Lijie Wang commented on FLINK-32493: Done via master(1.18) 47596ea9250415bc333fa612c6b1ec5c7407fdd6 and 41b35260bba91463bd9e44a4661beaa74c4cbe10 > Introduce RuntimeFilterBuilderOperator to build a BloomFilter from build side > data of shuffle join > -- > > Key: FLINK-32493 > URL: https://issues.apache.org/jira/browse/FLINK-32493 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Runtime >Affects Versions: 1.18.0 >Reporter: dalongliu >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
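The runtime filter idea behind FLINK-32491/FLINK-32493 (build a BloomFilter from the build side of a shuffle join, then drop probe-side records that cannot possibly match before they are shuffled) can be sketched with a toy filter. This is not Flink's actual operator code; the class and all names are illustrative, and real implementations use more and better hash functions.

```java
import java.util.BitSet;
import java.util.List;
import java.util.stream.Collectors;

// Toy runtime-filter sketch: a builder side collects build-side join keys into
// a Bloom filter; the filter side drops probe-side records that definitely do
// not match. False positives are possible, false negatives are not, so the
// join result is unchanged while shuffle volume shrinks.
class ToyBloomFilter {
    private final BitSet bits;
    private final int numBits;

    ToyBloomFilter(int numBits) {
        this.numBits = numBits;
        this.bits = new BitSet(numBits);
    }

    // two cheap probes derived from hashCode; illustrative only
    private int h1(Object key) { return Math.floorMod(key.hashCode(), numBits); }
    private int h2(Object key) { return Math.floorMod(key.hashCode() * 31 + 17, numBits); }

    void add(Object key) {
        bits.set(h1(key));
        bits.set(h2(key));
    }

    boolean mightContain(Object key) {
        return bits.get(h1(key)) && bits.get(h2(key));
    }

    // "builder operator": ingest build-side keys; "filter operator": keep only
    // probe-side keys that might match, before the shuffle.
    static List<Long> filterProbeSide(List<Long> buildKeys, List<Long> probeKeys) {
        ToyBloomFilter filter = new ToyBloomFilter(1024);
        buildKeys.forEach(filter::add);
        return probeKeys.stream().filter(filter::mightContain).collect(Collectors.toList());
    }
}
```

Because the filter never produces false negatives, it can only remove rows that would have been discarded by the join anyway; the actual join operator downstream still performs the exact match.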
[jira] [Updated] (FLINK-32632) Run Kubernetes test is unstable on AZP
[ https://issues.apache.org/jira/browse/FLINK-32632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang updated FLINK-32632: --- Priority: Blocker (was: Critical) > Run Kubernetes test is unstable on AZP > -- > > Key: FLINK-32632 > URL: https://issues.apache.org/jira/browse/FLINK-32632 > Project: Flink > Issue Type: Bug >Affects Versions: 1.18.0 >Reporter: Sergey Nuyanzin >Priority: Blocker > Labels: test-stability > > This test > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51447&view=logs&j=bea52777-eaf8-5663-8482-18fbc3630e81&t=43ba8ce7-ebbf-57cd-9163-444305d74117&l=6213 > fails with > {noformat} > 2023-07-19T17:14:49.8144730Z Jul 19 17:14:49 > deployment.apps/flink-task-manager created > 2023-07-19T17:15:03.7983703Z Jul 19 17:15:03 job.batch/flink-job-cluster > condition met > 2023-07-19T17:15:04.0937620Z error: Internal error occurred: error executing > command in container: http: invalid Host header > 2023-07-19T17:15:04.0988752Z sort: cannot read: > '/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-11919909188/out/kubernetes_wc_out*': > No such file or directory > 2023-07-19T17:15:04.1017388Z Jul 19 17:15:04 FAIL WordCount: Output hash > mismatch. Got d41d8cd98f00b204e9800998ecf8427e, expected > e682ec6622b5e83f2eb614617d5ab2cf. > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32632) Run Kubernetes test is unstable on AZP
[ https://issues.apache.org/jira/browse/FLINK-32632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17745367#comment-17745367 ] Lijie Wang commented on FLINK-32632: https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51495&view=logs&j=81be5d54-0dc6-5130-d390-233dd2956037&t=cfb9de70-be4e-5162-887e-653276e3edee > Run Kubernetes test is unstable on AZP > -- > > Key: FLINK-32632 > URL: https://issues.apache.org/jira/browse/FLINK-32632 > Project: Flink > Issue Type: Bug >Affects Versions: 1.18.0 >Reporter: Sergey Nuyanzin >Priority: Critical > Labels: test-stability > > This test > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51447&view=logs&j=bea52777-eaf8-5663-8482-18fbc3630e81&t=43ba8ce7-ebbf-57cd-9163-444305d74117&l=6213 > fails with > {noformat} > 2023-07-19T17:14:49.8144730Z Jul 19 17:14:49 > deployment.apps/flink-task-manager created > 2023-07-19T17:15:03.7983703Z Jul 19 17:15:03 job.batch/flink-job-cluster > condition met > 2023-07-19T17:15:04.0937620Z error: Internal error occurred: error executing > command in container: http: invalid Host header > 2023-07-19T17:15:04.0988752Z sort: cannot read: > '/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-11919909188/out/kubernetes_wc_out*': > No such file or directory > 2023-07-19T17:15:04.1017388Z Jul 19 17:15:04 FAIL WordCount: Output hash > mismatch. Got d41d8cd98f00b204e9800998ecf8427e, expected > e682ec6622b5e83f2eb614617d5ab2cf. > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-32491) Introduce RuntimeFilterOperator to support runtime filter which can reduce the shuffle data size before shuffle join
[ https://issues.apache.org/jira/browse/FLINK-32491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang reassigned FLINK-32491: -- Assignee: Lijie Wang (was: dalongliu) > Introduce RuntimeFilterOperator to support runtime filter which can reduce > the shuffle data size before shuffle join > > > Key: FLINK-32491 > URL: https://issues.apache.org/jira/browse/FLINK-32491 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Runtime >Affects Versions: 1.18.0 >Reporter: dalongliu >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)