[jira] [Commented] (FLINK-34528) Disconnect TM in JM when TM was killed to further reduce the job restart time

2024-02-27 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821132#comment-17821132
 ] 

Lijie Wang commented on FLINK-34528:


Does this exception always mean that the TM is killed or unavailable? I'm a bit 
doubtful.

> Disconnect TM in JM when TM was killed to further reduce the job restart time
> -
>
> Key: FLINK-34528
> URL: https://issues.apache.org/jira/browse/FLINK-34528
> Project: Flink
>  Issue Type: Sub-task
>Reporter: junzhong qin
>Assignee: junzhong qin
>Priority: Not a Priority
> Attachments: image-2024-02-27-16-35-04-464.png
>
>
> In https://issues.apache.org/jira/browse/FLINK-34526 we disconnect the killed 
> TM in RM. But in the following scenario, we can further reduce the restart 
> time.
> h3. Phenomenon
> In the test case, the pipeline looks like:
> !image-2024-02-27-16-35-04-464.png!
> The Source: Custom Source generates strings, and the job keys the stream by 
> those strings before writing to Sink: Unnamed (a sketch of this pipeline 
> appears at the end of this message).
>  # parallelism = 100
>  # taskmanager.numberOfTaskSlots = 2
>  # disable checkpoint
> The worker was killed at 
> {code:java}
> 2024-02-27 16:41:49,982 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Sink: 
> Unnamed (6/100) 
> (2f1c7b22098a273f5471e3e8f794e1d3_0a448493b4782967b150582570326227_5_0) 
> switched from RUNNING to FAILED on 
> container_e2472_1705993319725_62292_01_46 @ xxx (dataPort=38827).
> org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: 
> Connection unexpectedly closed by remote task manager 
> 'xxx/10.169.18.138:35983 [ 
> container_e2472_1705993319725_62292_01_10(xxx:5454) ] '. This might 
> indicate that the remote task manager was lost.
> at 
> org.apache.flink.runtime.io.network.netty.CreditBasedPartitionRequestClientHandler.channelInactive(CreditBasedPartitionRequestClientHandler.java:134)
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241)
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:81)
> at 
> org.apache.flink.runtime.io.network.netty.NettyMessageClientDecoderDelegate.channelInactive(NettyMessageClientDecoderDelegate.java:94)
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241)
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1405)
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:901)
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:813)
> at 
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
> at 
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
> at 
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:403)
> at 
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
> at 
> org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_252]{code}
> {code:java}
> // The task was scheduled to a task manager that had already been killed
> 2024-02-27 16:41:53,506 INFO 
> 
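
For readers trying to reproduce the setup, an illustrative sketch of the 
pipeline described above is shown here (the source implementation and class 
names are assumptions, not the reporter's actual code; 
taskmanager.numberOfTaskSlots=2 is cluster configuration and is not expressible 
in the job code):
{code:java}
// Hypothetical sketch of the described test job: a custom string source,
// keyBy on the strings, and an unnamed sink; parallelism 100, no checkpointing.
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.DiscardingSink;
import org.apache.flink.streaming.api.functions.source.SourceFunction;

public class RestartTimeTestJob {

    /** Stand-in for "Source: Custom Source": keeps emitting strings. */
    public static class RandomStringSource implements SourceFunction<String> {
        private volatile boolean running = true;

        @Override
        public void run(SourceContext<String> ctx) throws Exception {
            long i = 0;
            while (running) {
                synchronized (ctx.getCheckpointLock()) {
                    ctx.collect("key-" + (i++ % 1000));
                }
            }
        }

        @Override
        public void cancel() {
            running = false;
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(100);                  // parallelism = 100
        // Checkpointing is simply never enabled, matching the test setup.

        env.addSource(new RandomStringSource())
                .keyBy(s -> s)                    // keyBy the generated strings
                .addSink(new DiscardingSink<>()); // "Sink: Unnamed"

        env.execute();
    }
}
{code}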

[jira] [Comment Edited] (FLINK-34358) flink-connector-jdbc nightly fails with "Expecting code to raise a throwable"

2024-02-18 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818346#comment-17818346
 ] 

Lijie Wang edited comment on FLINK-34358 at 2/19/24 6:15 AM:
-

The test passes in Flink 1.18 but fails in Flink 1.19. The root cause is that 
after FLINK-34316, the code that threw this exception is no longer called.

In Flink 1.18 the exception is thrown in 
{{JdbcDynamicTableSource.getScanRuntimeProvider}}, a call path that is no 
longer reached after FLINK-34316.

Calling {{executeSql}} instead of {{sqlQuery}} should fix it. I will prepare a 
PR soon.


was (Author: wanglijie95):
The test passes in flink 1.18 but fails in flink 1.19. The root cause is that 
after FLINK-34316, the code that caused the exception was not called. 

The exception thrown in {{JdbcDynamicTableSource.getScanRuntimeProvider}} in 
flink 1.18, which was omitted in FLINK-34316.

Call {{executeSql}} instead of {{sqlQuery}} should fix it. I will prepare a PR 
soon.

> flink-connector-jdbc nightly fails with "Expecting code to raise a throwable"
> -
>
> Key: FLINK-34358
> URL: https://issues.apache.org/jira/browse/FLINK-34358
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / JDBC
>Reporter: Martijn Visser
>Assignee: Lijie Wang
>Priority: Blocker
>  Labels: pull-request-available, test-stability
>
> https://github.com/apache/flink-connector-jdbc/actions/runs/7770283211/job/21190280602#step:14:346
> {code:java}
> [INFO] Running 
> org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest
> Error:  Tests run: 19, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 0.554 s <<< FAILURE! - in 
> org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest
> Error:  
> org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest.testDataTypeValidate(TestItem)[19]
>   Time elapsed: 0.018 s  <<< FAILURE!
> java.lang.AssertionError: 
> Expecting code to raise a throwable.
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 
> s - in org.apache.flink.connector.jdbc.catalog.JdbcCatalogUtilsTest
> [INFO] Running org.apache.flink.architecture.ProductionCodeArchitectureTest
> [INFO] Running org.apache.flink.architecture.ProductionCodeArchitectureBase
> [INFO] Running org.apache.flink.architecture.rules.ApiAnnotationRules
> [INFO] Tests run: 20, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.155 
> s - in org.apache.flink.connector.jdbc.dialect.JdbcDialectTypeTest
> [INFO] Running org.apache.flink.architecture.TestCodeArchitectureTest
> [INFO] Running org.apache.flink.architecture.TestCodeArchitectureTestBase
> [INFO] Running org.apache.flink.architecture.rules.ITCaseRules
> [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.109 
> s - in org.apache.flink.architecture.rules.ApiAnnotationRules
> [INFO] Running org.apache.flink.architecture.rules.TableApiRules
> [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.024 
> s - in org.apache.flink.architecture.rules.TableApiRules
> [INFO] Running org.apache.flink.architecture.rules.ConnectorRules
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.31 s 
> - in org.apache.flink.architecture.rules.ConnectorRules
> [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.464 
> s - in org.apache.flink.architecture.ProductionCodeArchitectureBase
> [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.468 
> s - in org.apache.flink.architecture.ProductionCodeArchitectureTest
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.758 
> s - in org.apache.flink.architecture.rules.ITCaseRules
> [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.761 
> s - in org.apache.flink.architecture.TestCodeArchitectureTestBase
> [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.775 
> s - in org.apache.flink.architecture.TestCodeArchitectureTest
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 110.38 
> s - in 
> org.apache.flink.connector.jdbc.databases.oracle.xa.OracleExactlyOnceSinkE2eTest
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
> 172.591 s - in 
> org.apache.flink.connector.jdbc.databases.db2.xa.Db2ExactlyOnceSinkE2eTest
> [INFO] 
> [INFO] Results:
> [INFO] 
> Error:  Failures: 
> Error:
> PostgresDialectTypeTest>JdbcDialectTypeTest.testDataTypeValidate:102 
> Expecting code to raise a throwable.
> Error:TrinoDialectTypeTest>JdbcDialectTypeTest.testDataTypeValidate:102 
> Expecting code to raise a throwable.
> Error:CrateDBDialectTypeTest>JdbcDialectTypeTest.testDataTypeValidate:102 
> Expecting code to raise a throwable.
> [INFO] 
> Error:  

[jira] [Commented] (FLINK-34358) flink-connector-jdbc nightly fails with "Expecting code to raise a throwable"

2024-02-18 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818347#comment-17818347
 ] 

Lijie Wang commented on FLINK-34358:


The exception stack in 1.18:
{code:java}
java.lang.UnsupportedOperationException: Unsupported type:TIMESTAMP_LTZ(3)

at 
org.apache.flink.connector.jdbc.converter.AbstractJdbcRowConverter.createInternalConverter(AbstractJdbcRowConverter.java:187)
at 
org.apache.flink.connector.jdbc.databases.trino.dialect.TrinoRowConverter.createInternalConverter(TrinoRowConverter.java:60)
at 
org.apache.flink.connector.jdbc.converter.AbstractJdbcRowConverter.createNullableInternalConverter(AbstractJdbcRowConverter.java:118)
at 
org.apache.flink.connector.jdbc.converter.AbstractJdbcRowConverter.<init>(AbstractJdbcRowConverter.java:68)
at 
org.apache.flink.connector.jdbc.databases.trino.dialect.TrinoRowConverter.<init>(TrinoRowConverter.java:40)
at 
org.apache.flink.connector.jdbc.databases.trino.dialect.TrinoDialect.getRowConverter(TrinoDialect.java:49)
at 
org.apache.flink.connector.jdbc.table.JdbcDynamicTableSource.getScanRuntimeProvider(JdbcDynamicTableSource.java:184)
at 
org.apache.flink.table.planner.connectors.DynamicSourceUtils.validateScanSource(DynamicSourceUtils.java:478)
at 
org.apache.flink.table.planner.connectors.DynamicSourceUtils.prepareDynamicSource(DynamicSourceUtils.java:161)
at 
org.apache.flink.table.planner.connectors.DynamicSourceUtils.convertSourceToRel(DynamicSourceUtils.java:125)
at 
org.apache.flink.table.planner.plan.schema.CatalogSourceTable.toRel(CatalogSourceTable.java:118)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.toRel(SqlToRelConverter.java:4002)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:2872)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertFrom(SqlToRelConverter.java:2432)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertFrom(SqlToRelConverter.java:2346)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertFrom(SqlToRelConverter.java:2291)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:728)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:714)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3848)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:618)
at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$rel(FlinkPlannerImpl.scala:229)
at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.rel(FlinkPlannerImpl.scala:205)
at 
org.apache.flink.table.planner.operations.SqlNodeConvertContext.toRelRoot(SqlNodeConvertContext.java:69)
at 
org.apache.flink.table.planner.operations.converters.SqlQueryConverter.convertSqlNode(SqlQueryConverter.java:48)
at 
org.apache.flink.table.planner.operations.converters.SqlNodeConverters.convertSqlNode(SqlNodeConverters.java:73)
at 
org.apache.flink.table.planner.operations.SqlNodeToOperationConversion.convertValidatedSqlNode(SqlNodeToOperationConversion.java:272)
at 
org.apache.flink.table.planner.operations.SqlNodeToOperationConversion.convert(SqlNodeToOperationConversion.java:262)
at 
org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:106)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:708)
at 
org.apache.flink.connector.jdbc.dialect.JdbcDialectTypeTest.testDataTypeValidate(JdbcDialectTypeTest.java:101)
{code}

> flink-connector-jdbc nightly fails with "Expecting code to raise a throwable"
> -
>
> Key: FLINK-34358
> URL: https://issues.apache.org/jira/browse/FLINK-34358
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / JDBC
>Reporter: Martijn Visser
>Assignee: Lijie Wang
>Priority: Blocker
>  Labels: test-stability
>
> https://github.com/apache/flink-connector-jdbc/actions/runs/7770283211/job/21190280602#step:14:346
> {code:java}
> [INFO] Running 
> org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest
> Error:  Tests run: 19, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 0.554 s <<< FAILURE! - in 
> org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest
> Error:  
> org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest.testDataTypeValidate(TestItem)[19]
>   Time elapsed: 0.018 s  <<< FAILURE!
> java.lang.AssertionError: 
> Expecting code to raise a throwable.
> [INFO] Tests run: 2, 

[jira] [Assigned] (FLINK-34358) flink-connector-jdbc nightly fails with "Expecting code to raise a throwable"

2024-02-18 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-34358:
--

Assignee: Lijie Wang

> flink-connector-jdbc nightly fails with "Expecting code to raise a throwable"
> -
>
> Key: FLINK-34358
> URL: https://issues.apache.org/jira/browse/FLINK-34358
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / JDBC
>Reporter: Martijn Visser
>Assignee: Lijie Wang
>Priority: Blocker
>  Labels: test-stability
>
> https://github.com/apache/flink-connector-jdbc/actions/runs/7770283211/job/21190280602#step:14:346
> {code:java}
> [INFO] Running 
> org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest
> Error:  Tests run: 19, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 0.554 s <<< FAILURE! - in 
> org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest
> Error:  
> org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest.testDataTypeValidate(TestItem)[19]
>   Time elapsed: 0.018 s  <<< FAILURE!
> java.lang.AssertionError: 
> Expecting code to raise a throwable.
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 
> s - in org.apache.flink.connector.jdbc.catalog.JdbcCatalogUtilsTest
> [INFO] Running org.apache.flink.architecture.ProductionCodeArchitectureTest
> [INFO] Running org.apache.flink.architecture.ProductionCodeArchitectureBase
> [INFO] Running org.apache.flink.architecture.rules.ApiAnnotationRules
> [INFO] Tests run: 20, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.155 
> s - in org.apache.flink.connector.jdbc.dialect.JdbcDialectTypeTest
> [INFO] Running org.apache.flink.architecture.TestCodeArchitectureTest
> [INFO] Running org.apache.flink.architecture.TestCodeArchitectureTestBase
> [INFO] Running org.apache.flink.architecture.rules.ITCaseRules
> [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.109 
> s - in org.apache.flink.architecture.rules.ApiAnnotationRules
> [INFO] Running org.apache.flink.architecture.rules.TableApiRules
> [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.024 
> s - in org.apache.flink.architecture.rules.TableApiRules
> [INFO] Running org.apache.flink.architecture.rules.ConnectorRules
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.31 s 
> - in org.apache.flink.architecture.rules.ConnectorRules
> [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.464 
> s - in org.apache.flink.architecture.ProductionCodeArchitectureBase
> [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.468 
> s - in org.apache.flink.architecture.ProductionCodeArchitectureTest
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.758 
> s - in org.apache.flink.architecture.rules.ITCaseRules
> [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.761 
> s - in org.apache.flink.architecture.TestCodeArchitectureTestBase
> [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.775 
> s - in org.apache.flink.architecture.TestCodeArchitectureTest
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 110.38 
> s - in 
> org.apache.flink.connector.jdbc.databases.oracle.xa.OracleExactlyOnceSinkE2eTest
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
> 172.591 s - in 
> org.apache.flink.connector.jdbc.databases.db2.xa.Db2ExactlyOnceSinkE2eTest
> [INFO] 
> [INFO] Results:
> [INFO] 
> Error:  Failures: 
> Error:
> PostgresDialectTypeTest>JdbcDialectTypeTest.testDataTypeValidate:102 
> Expecting code to raise a throwable.
> Error:TrinoDialectTypeTest>JdbcDialectTypeTest.testDataTypeValidate:102 
> Expecting code to raise a throwable.
> Error:CrateDBDialectTypeTest>JdbcDialectTypeTest.testDataTypeValidate:102 
> Expecting code to raise a throwable.
> [INFO] 
> Error:  Tests run: 394, Failures: 3, Errors: 0, Skipped: 1
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34358) flink-connector-jdbc nightly fails with "Expecting code to raise a throwable"

2024-02-18 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818346#comment-17818346
 ] 

Lijie Wang commented on FLINK-34358:


The test passes in Flink 1.18 but fails in Flink 1.19. The root cause is that 
after FLINK-34316, the code that caused the exception is no longer called.

In Flink 1.18 the exception is thrown in 
{{JdbcDynamicTableSource.getScanRuntimeProvider}}, a call path that is no 
longer reached after FLINK-34316.

Calling {{executeSql}} instead of {{sqlQuery}} should fix it. I will prepare a 
PR soon.
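
For context, a minimal sketch of the suggested change (the table name, column 
type, and connector options below are made-up examples, not the actual test 
code): with {{sqlQuery}}, planning in 1.19 no longer reaches 
{{JdbcDynamicTableSource#getScanRuntimeProvider}}, so the expected exception 
never surfaces, while {{executeSql}} still does.
{code:java}
// Illustrative sketch only: table name, type, and options are invented here.
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class ExecuteSqlVsSqlQuerySketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        tEnv.executeSql(
                "CREATE TABLE t (ts TIMESTAMP_LTZ(3)) WITH ("
                        + " 'connector' = 'jdbc',"
                        + " 'url' = 'jdbc:trino://localhost:8080/test',"
                        + " 'table-name' = 't')");

        // In 1.19, sqlQuery(...) stops short of
        // JdbcDynamicTableSource#getScanRuntimeProvider, so the expected
        // "Unsupported type:TIMESTAMP_LTZ(3)" exception is never raised:
        // tEnv.sqlQuery("SELECT * FROM t");

        // executeSql(...) still triggers it during planning, as in 1.18:
        tEnv.executeSql("SELECT * FROM t");
    }
}
{code}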

> flink-connector-jdbc nightly fails with "Expecting code to raise a throwable"
> -
>
> Key: FLINK-34358
> URL: https://issues.apache.org/jira/browse/FLINK-34358
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / JDBC
>Reporter: Martijn Visser
>Priority: Blocker
>  Labels: test-stability
>
> https://github.com/apache/flink-connector-jdbc/actions/runs/7770283211/job/21190280602#step:14:346
> {code:java}
> [INFO] Running 
> org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest
> Error:  Tests run: 19, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 0.554 s <<< FAILURE! - in 
> org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest
> Error:  
> org.apache.flink.connector.jdbc.dialect.cratedb.CrateDBDialectTypeTest.testDataTypeValidate(TestItem)[19]
>   Time elapsed: 0.018 s  <<< FAILURE!
> java.lang.AssertionError: 
> Expecting code to raise a throwable.
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 
> s - in org.apache.flink.connector.jdbc.catalog.JdbcCatalogUtilsTest
> [INFO] Running org.apache.flink.architecture.ProductionCodeArchitectureTest
> [INFO] Running org.apache.flink.architecture.ProductionCodeArchitectureBase
> [INFO] Running org.apache.flink.architecture.rules.ApiAnnotationRules
> [INFO] Tests run: 20, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.155 
> s - in org.apache.flink.connector.jdbc.dialect.JdbcDialectTypeTest
> [INFO] Running org.apache.flink.architecture.TestCodeArchitectureTest
> [INFO] Running org.apache.flink.architecture.TestCodeArchitectureTestBase
> [INFO] Running org.apache.flink.architecture.rules.ITCaseRules
> [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.109 
> s - in org.apache.flink.architecture.rules.ApiAnnotationRules
> [INFO] Running org.apache.flink.architecture.rules.TableApiRules
> [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.024 
> s - in org.apache.flink.architecture.rules.TableApiRules
> [INFO] Running org.apache.flink.architecture.rules.ConnectorRules
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.31 s 
> - in org.apache.flink.architecture.rules.ConnectorRules
> [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.464 
> s - in org.apache.flink.architecture.ProductionCodeArchitectureBase
> [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.468 
> s - in org.apache.flink.architecture.ProductionCodeArchitectureTest
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.758 
> s - in org.apache.flink.architecture.rules.ITCaseRules
> [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.761 
> s - in org.apache.flink.architecture.TestCodeArchitectureTestBase
> [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.775 
> s - in org.apache.flink.architecture.TestCodeArchitectureTest
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 110.38 
> s - in 
> org.apache.flink.connector.jdbc.databases.oracle.xa.OracleExactlyOnceSinkE2eTest
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
> 172.591 s - in 
> org.apache.flink.connector.jdbc.databases.db2.xa.Db2ExactlyOnceSinkE2eTest
> [INFO] 
> [INFO] Results:
> [INFO] 
> Error:  Failures: 
> Error:
> PostgresDialectTypeTest>JdbcDialectTypeTest.testDataTypeValidate:102 
> Expecting code to raise a throwable.
> Error:TrinoDialectTypeTest>JdbcDialectTypeTest.testDataTypeValidate:102 
> Expecting code to raise a throwable.
> Error:CrateDBDialectTypeTest>JdbcDialectTypeTest.testDataTypeValidate:102 
> Expecting code to raise a throwable.
> [INFO] 
> Error:  Tests run: 394, Failures: 3, Errors: 0, Skipped: 1
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-34365) [docs] Delete repeated pages in Chinese Flink website and correct the Paimon url

2024-02-06 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang closed FLINK-34365.
--
Resolution: Fixed

> [docs] Delete repeated pages in Chinese Flink website and correct the Paimon 
> url
> 
>
> Key: FLINK-34365
> URL: https://issues.apache.org/jira/browse/FLINK-34365
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Waterking
>Assignee: Waterking
>Priority: Major
>  Labels: pull-request-available
> Attachments: 微信截图_20240205214854.png
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> The "教程" column on the [Flink 
> 中文网|https://flink.apache.org/zh/how-to-contribute/contribute-documentation/] 
> currently has two "[With Paimon(incubating) (formerly Flink Table 
> Store)|https://paimon.apache.org/docs/master/engines/flink/%22];.
> Therefore, I delete one for brevity.
> Also, the current link is wrong and I correct it with this link "[With 
> Paimon(incubating) (formerly Flink Table 
> Store)|https://paimon.apache.org/docs/master/engines/flink];



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34365) [docs] Delete repeated pages in Chinese Flink website and correct the Paimon url

2024-02-06 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815053#comment-17815053
 ] 

Lijie Wang commented on FLINK-34365:


Fixed via branch asf-site(flink-web): ec2e5c2b4a312fe56e44e13c57b84c6f1331b992

> [docs] Delete repeated pages in Chinese Flink website and correct the Paimon 
> url
> 
>
> Key: FLINK-34365
> URL: https://issues.apache.org/jira/browse/FLINK-34365
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Waterking
>Assignee: Waterking
>Priority: Major
>  Labels: pull-request-available
> Attachments: 微信截图_20240205214854.png
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> The "教程" column on the [Flink 
> 中文网|https://flink.apache.org/zh/how-to-contribute/contribute-documentation/] 
> currently has two "[With Paimon(incubating) (formerly Flink Table 
> Store)|https://paimon.apache.org/docs/master/engines/flink/%22];.
> Therefore, I delete one for brevity.
> Also, the current link is wrong and I correct it with this link "[With 
> Paimon(incubating) (formerly Flink Table 
> Store)|https://paimon.apache.org/docs/master/engines/flink];



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-34365) [docs] Delete repeated pages in Chinese Flink website and correct the Paimon url

2024-02-05 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-34365:
--

Assignee: Waterking

> [docs] Delete repeated pages in Chinese Flink website and correct the Paimon 
> url
> 
>
> Key: FLINK-34365
> URL: https://issues.apache.org/jira/browse/FLINK-34365
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Waterking
>Assignee: Waterking
>Priority: Major
>  Labels: pull-request-available
> Attachments: 微信截图_20240205214854.png
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> The "教程" column on the [Flink 
> 中文网|https://flink.apache.org/zh/how-to-contribute/contribute-documentation/] 
> currently has two "[With Paimon(incubating) (formerly Flink Table 
> Store)|https://paimon.apache.org/docs/master/engines/flink/%22];.
> Therefore, I delete one for brevity.
> Also, the current link is wrong and I correct it with this link "[With 
> Paimon(incubating) (formerly Flink Table 
> Store)|https://paimon.apache.org/docs/master/engines/flink];



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34365) [docs] Delete repeated pages in Chinese Flink website and correct the Paimon url

2024-02-05 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814628#comment-17814628
 ] 

Lijie Wang commented on FLINK-34365:


Assigned to you [~waterking] :)

> [docs] Delete repeated pages in Chinese Flink website and correct the Paimon 
> url
> 
>
> Key: FLINK-34365
> URL: https://issues.apache.org/jira/browse/FLINK-34365
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Waterking
>Assignee: Waterking
>Priority: Major
>  Labels: pull-request-available
> Attachments: 微信截图_20240205214854.png
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> The "教程" column on the [Flink 
> 中文网|https://flink.apache.org/zh/how-to-contribute/contribute-documentation/] 
> currently has two "[With Paimon(incubating) (formerly Flink Table 
> Store)|https://paimon.apache.org/docs/master/engines/flink/%22];.
> Therefore, I delete one for brevity.
> Also, the current link is wrong and I correct it with this link "[With 
> Paimon(incubating) (formerly Flink Table 
> Store)|https://paimon.apache.org/docs/master/engines/flink];



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34132) Batch WordCount job fails when run with AdaptiveBatch scheduler

2024-01-17 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17807697#comment-17807697
 ] 

Lijie Wang commented on FLINK-34132:


[~prabhujoseph] AFAIK, the adaptive batch scheduler does not support DataSet 
jobs; you can find the details in 
[FLIP-283|https://cwiki.apache.org/confluence/display/FLINK/FLIP-283%3A+Use+adaptive+batch+scheduler+as+default+scheduler+for+batch+jobs].

But maybe we should make the exception message friendlier. cc [~JunRuiLi]
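
For illustration (not part of the original report): a minimal batch WordCount 
written against the DataStream API in BATCH execution mode, which the adaptive 
batch scheduler does support, in contrast to the DataSet-based example jar used 
in the reproduction steps below. Input data and class name are made up.
{code:java}
// Hypothetical sketch: a DataStream-API batch WordCount that the adaptive
// batch scheduler can schedule (unlike the DataSet-based example jar).
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

public class StreamingBatchWordCount {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setRuntimeMode(RuntimeExecutionMode.BATCH); // bounded job, batch scheduling

        env.fromElements("to be or not to be")
                .flatMap((String line, Collector<Tuple2<String, Integer>> out) -> {
                    for (String word : line.split("\\s+")) {
                        out.collect(Tuple2.of(word, 1));
                    }
                })
                .returns(Types.TUPLE(Types.STRING, Types.INT)) // lambdas need explicit result types
                .keyBy(t -> t.f0)
                .sum(1)
                .print();

        env.execute("streaming-batch-wordcount");
    }
}
{code}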

> Batch WordCount job fails when run with AdaptiveBatch scheduler
> ---
>
> Key: FLINK-34132
> URL: https://issues.apache.org/jira/browse/FLINK-34132
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.17.1, 1.18.1
>Reporter: Prabhu Joseph
>Priority: Major
>
> Batch WordCount job fails when run with AdaptiveBatch scheduler.
> *Repro Steps*
> {code:java}
> flink-yarn-session -Djobmanager.scheduler=adaptive -d
>  flink run -d /usr/lib/flink/examples/batch/WordCount.jar --input 
> s3://prabhuflinks3/INPUT --output s3://prabhuflinks3/OUT
> {code}
> *Error logs*
> {code:java}
>  The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: The main method 
> caused an error: java.util.concurrent.ExecutionException: 
> java.lang.RuntimeException: 
> org.apache.flink.runtime.client.JobInitializationException: Could not start 
> the JobMaster.
>   at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:372)
>   at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222)
>   at 
> org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:105)
>   at 
> org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:851)
>   at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:245)
>   at 
> org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1095)
>   at 
> org.apache.flink.client.cli.CliFrontend.lambda$mainInternal$9(CliFrontend.java:1189)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
>   at 
> org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>   at 
> org.apache.flink.client.cli.CliFrontend.mainInternal(CliFrontend.java:1189)
>   at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1157)
> Caused by: java.lang.RuntimeException: 
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.flink.runtime.client.JobInitializationException: Could not start 
> the JobMaster.
>   at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321)
>   at 
> org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1067)
>   at 
> org.apache.flink.client.program.ContextEnvironment.executeAsync(ContextEnvironment.java:144)
>   at 
> org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:73)
>   at 
> org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:106)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355)
>   ... 12 more
> Caused by: java.util.concurrent.ExecutionException: 
> java.lang.RuntimeException: 
> org.apache.flink.runtime.client.JobInitializationException: Could not start 
> the JobMaster.
>   at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
>   at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
>   at 
> org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1062)
>   ... 20 more
> Caused by: java.lang.RuntimeException: 
> org.apache.flink.runtime.client.JobInitializationException: Could not start 
> the JobMaster.
>   at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321)
>   at 
> org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:75)
>   at 
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
>   at 
> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
>   at 
> 

[jira] [Commented] (FLINK-34025) Show data skew score on Flink Dashboard

2024-01-09 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17804720#comment-17804720
 ] 

Lijie Wang commented on FLINK-34025:


Hi [~iemre], I think your proposal needs to be discussed on the dev mailing 
list via a FLIP (as it involves UI and metric additions/changes). You can find 
more details 
[here|https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals]

> Show data skew score on Flink Dashboard
> ---
>
> Key: FLINK-34025
> URL: https://issues.apache.org/jira/browse/FLINK-34025
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Web Frontend
>Affects Versions: 1.19.0
>Reporter: Emre Kartoglu
>Priority: Major
>  Labels: dashboard
> Attachments: skew_proposal.png, skew_tab.png
>
>
> *Problem:* Currently users have to click on every operator and check how much 
> data each subtask is processing to see if there is data skew. This is 
> particularly cumbersome and error-prone for jobs with big job graphs. Data 
> skew is an important metric that should be more visible.
>  
> *Proposed solution:*
>  * Show a data skew score on each operator (see screenshot below). This would 
> be an improvement, but would not be sufficient, as it would still not be easy 
> to see the data skew score for jobs with very large job graphs (it would 
> require a lot of zooming in/out).
>  * Show data skew score for each operator under a new "Data Skew" tab next to 
> the Exceptions tab. See screenshot below  
> !skew_tab.png|width=1226,height=719! .
>  
> !skew_proposal.png|width=845,height=253!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33968) Compute the number of subpartitions when initializing executon job vertices

2024-01-02 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-33968:
---
Description: 
Currently, when using dynamic graphs, the subpartition-num of a task is 
calculated lazily, only at task deployment time. This may lead to some 
uncertainties in job recovery scenarios:

Before the JM crashes, when deploying upstream tasks, the parallelism of the 
downstream vertex may be unknown, so the subpartition-num will be the max 
parallelism of the downstream job vertex. However, after the JM restarts, when 
deploying upstream tasks, the parallelism of the downstream job vertex may be 
known (it was calculated before the JM crashed and recovered after the JM 
restarted), so the subpartition-num will be the actual parallelism of the 
downstream job vertex. This difference in the calculated subpartition-num means 
that partitions generated before the JM crashed cannot be reused after the JM 
restarts.

We will solve this problem by advancing the calculation of the subpartition-num 
to the moment the execution job vertex is initialized (in the constructor of 
IntermediateResultPartition).

  was:
Currently, when using dynamic graphs, the subpartition-num of a task is lazily 
calculated until the task deployment moment, this may lead to some 
uncertainties in job recovery scenarios.

Before jm crashs, when deploying upstream tasks, the parallelism of downstream 
vertex may be unknown, so the subpartiton-num will be the max parallelism of 
downstream job vertex. However, after jm restarts, when deploying upstream 
tasks, the parallelism of downstream job vertex may be known(has been 
calculated before jm crashs and been recovered after jm restarts), so the 
subpartiton-num will be the actual parallelism of downstream job vertex.
 
The difference of calculated subpartition-num will lead to the partitions 
generated before jm crashs cannot be reused after jm restarts.

We will solve this problem by advancing the calculation of subpartitoin-num to 
the moment of initializing executon job vertex (in ctor of 
IntermediateResultPartition)


> Compute the number of subpartitions when initializing executon job vertices
> ---
>
> Key: FLINK-33968
> URL: https://issues.apache.org/jira/browse/FLINK-33968
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>
> Currently, when using dynamic graphs, the subpartition-num of a task is 
> calculated lazily, only at task deployment time. This may lead to some 
> uncertainties in job recovery scenarios:
> Before the JM crashes, when deploying upstream tasks, the parallelism of the 
> downstream vertex may be unknown, so the subpartition-num will be the max 
> parallelism of the downstream job vertex. However, after the JM restarts, 
> when deploying upstream tasks, the parallelism of the downstream job vertex 
> may be known (it was calculated before the JM crashed and recovered after 
> the JM restarted), so the subpartition-num will be the actual parallelism of 
> the downstream job vertex. This difference in the calculated subpartition-num 
> means that partitions generated before the JM crashed cannot be reused after 
> the JM restarts.
> We will solve this problem by advancing the calculation of the 
> subpartition-num to the moment the execution job vertex is initialized (in 
> the constructor of IntermediateResultPartition).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33968) Compute the number of subpartitions when initializing executon job vertices

2024-01-02 Thread Lijie Wang (Jira)
Lijie Wang created FLINK-33968:
--

 Summary: Compute the number of subpartitions when initializing 
executon job vertices
 Key: FLINK-33968
 URL: https://issues.apache.org/jira/browse/FLINK-33968
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Lijie Wang
Assignee: Lijie Wang


Currently, when using dynamic graphs, the subpartition-num of a task is 
calculated lazily, only at task deployment time. This may lead to some 
uncertainties in job recovery scenarios.

Before the JM crashes, when deploying upstream tasks, the parallelism of the 
downstream vertex may be unknown, so the subpartition-num will be the max 
parallelism of the downstream job vertex. However, after the JM restarts, when 
deploying upstream tasks, the parallelism of the downstream job vertex may be 
known (it was calculated before the JM crashed and recovered after the JM 
restarted), so the subpartition-num will be the actual parallelism of the 
downstream job vertex.

The difference in the calculated subpartition-num means that partitions 
generated before the JM crashed cannot be reused after the JM restarts.

We will solve this problem by advancing the calculation of the subpartition-num 
to the moment the execution job vertex is initialized (in the constructor of 
IntermediateResultPartition).
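
For illustration only, a sketch (not actual Flink code) of the decision 
described above, which is why the computed subpartition count can differ before 
and after a JM failover:
{code:java}
// Illustrative sketch, not Flink source code.
public class SubpartitionNumSketch {

    static int numSubpartitions(boolean downstreamParallelismDecided,
                                int downstreamParallelism,
                                int downstreamMaxParallelism) {
        // Not yet decided (typical before the JM crash): fall back to the max
        // parallelism. Decided (recovered after the JM restart): use the
        // actual parallelism.
        return downstreamParallelismDecided ? downstreamParallelism : downstreamMaxParallelism;
    }

    public static void main(String[] args) {
        // Same upstream partition, different answers around a failover:
        System.out.println(numSubpartitions(false, -1, 128));  // before the crash -> 128
        System.out.println(numSubpartitions(true, 100, 128));  // after the restart -> 100
    }
}
{code}
Computing the value once, when the execution job vertex is initialized (in the 
IntermediateResultPartition constructor), fixes it before any deployment, so 
partitions produced before a failover stay reusable.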



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33892) FLIP-383: Support Job Recovery for Batch Jobs

2024-01-02 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-33892:
--

Assignee: Junrui Li  (was: Lijie Wang)

> FLIP-383: Support Job Recovery for Batch Jobs
> -
>
> Key: FLINK-33892
> URL: https://issues.apache.org/jira/browse/FLINK-33892
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination
>Reporter: Lijie Wang
>Assignee: Junrui Li
>Priority: Major
>
> This is the umbrella ticket for 
> [FLIP-383|https://cwiki.apache.org/confluence/display/FLINK/FLIP-383%3A+Support+Job+Recovery+for+Batch+Jobs]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800679#comment-17800679
 ] 

Lijie Wang commented on FLINK-33943:


You may want to look at the comments in FLINK-19358 and 
[FLIP-85|https://cwiki.apache.org/confluence/display/FLINK/FLIP-85%3A+Flink+Application+Mode]

> Apache flink: Issues after configuring HA (using zookeeper setting)
> ---
>
> Key: FLINK-33943
> URL: https://issues.apache.org/jira/browse/FLINK-33943
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
>  
>Reporter: Vijay
>Priority: Major
>
> Hi Team,
> *Note:* Not sure whether I have picked the right component while raising the 
> issue.
> Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink 
> cluster. The Job manager has been deployed in "Application mode", and when HA 
> is disabled (high-availability.type: NONE) we are able to start multiple jobs 
> (using env.executeAsync()) for a single application. But when I set up 
> Zookeeper as the HA type (high-availability.type: zookeeper), we see only one 
> job getting executed on the Flink dashboard. Following are the parameters set 
> for the Zookeeper-based HA setup in flink-conf.yaml. 
> Please let us know if anyone has experienced similar issues and have any 
> suggestions. Thanks in advance for your assistance.
> *Note:* We are using a Streaming application and following are the 
> flink-config.yaml configurations.
> *Additional query:* Does "Session mode" of deployment support HA for multiple 
> execute() executions?
>  # high-availability.storageDir: /opt/flink/data
>  # high-availability.cluster-id: test
>  # high-availability.zookeeper.quorum: localhost:2181
>  # high-availability.type: zookeeper
>  # high-availability.zookeeper.path.root: /dp/configs/flinkha



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang closed FLINK-33943.
--
Resolution: Not A Bug

> Apache flink: Issues after configuring HA (using zookeeper setting)
> ---
>
> Key: FLINK-33943
> URL: https://issues.apache.org/jira/browse/FLINK-33943
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
>  
>Reporter: Vijay
>Priority: Major
>
> Hi Team,
> *Note:* Not sure whether I have picked the right component while raising the 
> issue.
> Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink 
> cluster. The Job manager has been deployed in "Application mode", and when HA 
> is disabled (high-availability.type: NONE) we are able to start multiple jobs 
> (using env.executeAsync()) for a single application. But when I set up 
> Zookeeper as the HA type (high-availability.type: zookeeper), we see only one 
> job getting executed on the Flink dashboard. Following are the parameters set 
> for the Zookeeper-based HA setup in flink-conf.yaml. 
> Please let us know if anyone has experienced similar issues and have any 
> suggestions. Thanks in advance for your assistance.
> *Note:* We are using a Streaming application and following are the 
> flink-config.yaml configurations.
> *Additional query:* Does "Session mode" of deployment support HA for multiple 
> execute() executions?
>  # high-availability.storageDir: /opt/flink/data
>  # high-availability.cluster-id: test
>  # high-availability.zookeeper.quorum: localhost:2181
>  # high-availability.type: zookeeper
>  # high-availability.zookeeper.path.root: /dp/configs/flinkha



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800676#comment-17800676
 ] 

Lijie Wang commented on FLINK-33943:


I'll close this issue because it's not a bug. cc [~vrang...@in.ibm.com]

 

> Apache flink: Issues after configuring HA (using zookeeper setting)
> ---
>
> Key: FLINK-33943
> URL: https://issues.apache.org/jira/browse/FLINK-33943
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
>  
>Reporter: Vijay
>Priority: Major
>
> Hi Team,
> *Note:* Not sure whether I have picked the right component while raising the 
> issue.
> Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink 
> cluster. The Job manager has been deployed in "Application mode", and when HA 
> is disabled (high-availability.type: NONE) we are able to start multiple jobs 
> (using env.executeAsync()) for a single application. But when I set up 
> Zookeeper as the HA type (high-availability.type: zookeeper), we see only one 
> job getting executed on the Flink dashboard. Following are the parameters set 
> for the Zookeeper-based HA setup in flink-conf.yaml. 
> Please let us know if anyone has experienced similar issues and have any 
> suggestions. Thanks in advance for your assistance.
> *Note:* We are using a Streaming application and following are the 
> flink-config.yaml configurations.
> *Additional query:* Does "Session mode" of deployment support HA for multiple 
> execute() executions?
>  # high-availability.storageDir: /opt/flink/data
>  # high-availability.cluster-id: test
>  # high-availability.zookeeper.quorum: localhost:2181
>  # high-availability.type: zookeeper
>  # high-availability.zookeeper.path.root: /dp/configs/flinkha



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800671#comment-17800671
 ] 

Lijie Wang commented on FLINK-33943:


[~vrang...@in.ibm.com] I think it should work if you use session mode instead 
of application mode.

> Apache flink: Issues after configuring HA (using zookeeper setting)
> ---
>
> Key: FLINK-33943
> URL: https://issues.apache.org/jira/browse/FLINK-33943
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
>  
>Reporter: Vijay
>Priority: Major
>
> Hi Team,
> Note: Not sure whether I have picked the right component while raising the 
> issue.
> Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink 
> cluster. The Job manager has been deployed in "Application mode", and when HA 
> is disabled (high-availability.type: NONE) we are able to start multiple jobs 
> (using env.executeAsync()) for a single application. But when I set up 
> Zookeeper as the HA type (high-availability.type: zookeeper), we see only one 
> job getting executed on the Flink dashboard. Following are the parameters set 
> for the Zookeeper-based HA setup in flink-conf.yaml. 
> Please let us know if anyone has experienced similar issues and have any 
> suggestions. Thanks in advance for your assistance.
> Note: We are using a Streaming application and following are the 
> flink-config.yaml configurations.
>  # high-availability.storageDir: /opt/flink/data
>  # high-availability.cluster-id: test
>  # high-availability.zookeeper.quorum: localhost:2181
>  # high-availability.type: zookeeper
>  # high-availability.zookeeper.path.root: /dp/configs/flinkha



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800668#comment-17800668
 ] 

Lijie Wang commented on FLINK-33943:


Hi [~vrang...@in.ibm.com], currently, HA in Application Mode is only supported 
for single-execute() applications. You can find more details in [flink 
docs|https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/deployment/overview/#application-mode].
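
For illustration (a hypothetical minimal example, not taken from this report): 
the single-execute() application shape that Application Mode HA covers. 
Submitting several jobs from the same main() via env.executeAsync(...) is the 
multi-execution case that is not supported.
{code:java}
// Hypothetical sketch: exactly one execute() call per application main().
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SingleExecuteApplication {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromElements("a", "b", "c").print();
        env.execute("single-job-application"); // single job -> compatible with Application Mode HA
    }
}
{code}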


> Apache flink: Issues after configuring HA (using zookeeper setting)
> ---
>
> Key: FLINK-33943
> URL: https://issues.apache.org/jira/browse/FLINK-33943
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
>  
>Reporter: Vijay
>Priority: Major
>
> Hi Team,
> Note: Not sure whether I have picked the right component while raising the 
> issue.
> Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink 
> cluster. Job manager has been deployed on "Application mode" and when HA is 
> disabled (high-availability.type: NONE) we are able to start multiple jobs 
> (using env.executeAsyn()) for a single application. But when I setup the 
> Zookeeper as the HA type (high-availability.type: zookeeper), we are only 
> seeing only one job is getting executed on the Flink dashboard. Following are 
> the parameters setup for the Zookeeper based HA setup on the flink-conf.yaml. 
> Please let us know if anyone has experienced similar issues and have any 
> suggestions. Thanks in advance for your assistance.
> Note: We are using a Streaming application and following are the 
> flink-config.yaml configurations.
>  # high-availability.storageDir: /opt/flink/data
>  # high-availability.cluster-id: test
>  # high-availability.zookeeper.quorum: localhost:2181
>  # high-availability.type: zookeeper
>  # high-availability.zookeeper.path.root: /dp/configs/flinkha



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-31650) Incorrect busyMsTimePerSecond metric value for FINISHED task

2023-12-25 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang closed FLINK-31650.
--
Fix Version/s: 1.19.0
   1.18.1
   1.17.3
   Resolution: Fixed

> Incorrect busyMsTimePerSecond metric value for FINISHED task
> 
>
> Key: FLINK-31650
> URL: https://issues.apache.org/jira/browse/FLINK-31650
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Metrics, Runtime / REST
>Affects Versions: 1.17.0, 1.16.1, 1.15.4
>Reporter: Lijie Wang
>Assignee: Zhanghao Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0, 1.18.1, 1.17.3
>
> Attachments: busyMsTimePerSecond.png
>
>
> As shown in the figure below, the busyMsTimePerSecond of the FINISHED task is 
> 100%, which is obviously unreasonable.
> !busyMsTimePerSecond.png|width=1048,height=432!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31650) Incorrect busyMsTimePerSecond metric value for FINISHED task

2023-12-25 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800398#comment-17800398
 ] 

Lijie Wang commented on FLINK-31650:


Fixed via:
master(1.19) : dd028282e8ab19c0d1cd05fad02b63bbda6c1358
release-1.18: ff1ef789eba612d008424d5fc28fe5905f96fb9c
release-1.17: 024e970e27781a9fd1925d6c5938efbb364c6462

> Incorrect busyMsTimePerSecond metric value for FINISHED task
> 
>
> Key: FLINK-31650
> URL: https://issues.apache.org/jira/browse/FLINK-31650
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Metrics, Runtime / REST
>Affects Versions: 1.17.0, 1.16.1, 1.15.4
>Reporter: Lijie Wang
>Assignee: Zhanghao Chen
>Priority: Major
>  Labels: pull-request-available
> Attachments: busyMsTimePerSecond.png
>
>
> As shown in the figure below, the busyMsTimePerSecond of the FINISHED task is 
> 100%, which is obviously unreasonable.
> !busyMsTimePerSecond.png|width=1048,height=432!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33905) FLIP-382: Unify the Provision of Diverse Metadata for Context-like APIs

2023-12-23 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-33905:
--

Assignee: Wencong Liu

> FLIP-382: Unify the Provision of Diverse Metadata for Context-like APIs
> ---
>
> Key: FLINK-33905
> URL: https://issues.apache.org/jira/browse/FLINK-33905
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Core
>Affects Versions: 1.19.0
>Reporter: Wencong Liu
>Assignee: Wencong Liu
>Priority: Major
>  Labels: pull-request-available
>
> This ticket is proposed for 
> [FLIP-382|https://cwiki.apache.org/confluence/display/FLINK/FLIP-382%3A+Unify+the+Provision+of+Diverse+Metadata+for+Context-like+APIs].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33892) FLIP-383: Support Job Recovery for Batch Jobs

2023-12-19 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-33892:
--

Assignee: (was: Lijie Wang)

> FLIP-383: Support Job Recovery for Batch Jobs
> -
>
> Key: FLINK-33892
> URL: https://issues.apache.org/jira/browse/FLINK-33892
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination
>Reporter: Lijie Wang
>Priority: Major
>
> This is the umbrella ticket for 
> [FLIP-383|https://cwiki.apache.org/confluence/display/FLINK/FLIP-383%3A+Support+Job+Recovery+for+Batch+Jobs]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33892) FLIP-383: Support Job Recovery for Batch Jobs

2023-12-19 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-33892:
--

Assignee: Lijie Wang

> FLIP-383: Support Job Recovery for Batch Jobs
> -
>
> Key: FLINK-33892
> URL: https://issues.apache.org/jira/browse/FLINK-33892
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>
> This is the umbrella ticket for 
> [FLIP-383|https://cwiki.apache.org/confluence/display/FLINK/FLIP-383%3A+Support+Job+Recovery+for+Batch+Jobs]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33892) FLIP-383: Support Job Recovery for Batch Jobs

2023-12-19 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-33892:
--

Assignee: Lijie Wang

> FLIP-383: Support Job Recovery for Batch Jobs
> -
>
> Key: FLINK-33892
> URL: https://issues.apache.org/jira/browse/FLINK-33892
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>
> This is the umbrella ticket for 
> [FLIP-383|https://cwiki.apache.org/confluence/display/FLINK/FLIP-383%3A+Support+Job+Recovery+for+Batch+Jobs]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33892) FLIP-383: Support Job Recovery for Batch Jobs

2023-12-19 Thread Lijie Wang (Jira)
Lijie Wang created FLINK-33892:
--

 Summary: FLIP-383: Support Job Recovery for Batch Jobs
 Key: FLINK-33892
 URL: https://issues.apache.org/jira/browse/FLINK-33892
 Project: Flink
  Issue Type: New Feature
  Components: Runtime / Coordination
Reporter: Lijie Wang


This is the umbrella ticket for 
[FLIP-383|https://cwiki.apache.org/confluence/display/FLINK/FLIP-383%3A+Support+Job+Recovery+for+Batch+Jobs]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33713) Deprecate RuntimeContext#getExecutionConfig

2023-12-01 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-33713:
--

Assignee: Junrui Li

> Deprecate RuntimeContext#getExecutionConfig
> ---
>
> Key: FLINK-33713
> URL: https://issues.apache.org/jira/browse/FLINK-33713
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Core
>Reporter: Junrui Li
>Assignee: Junrui Li
>Priority: Major
>  Labels: pull-request-available
>
> Deprecate RuntimeContext#getExecutionConfig



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33714) Update documentation about the usage of RuntimeContext#getExecutionConfig

2023-12-01 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-33714:
--

Assignee: Junrui Li

> Update documentation about the usage of RuntimeContext#getExecutionConfig
> -
>
> Key: FLINK-33714
> URL: https://issues.apache.org/jira/browse/FLINK-33714
> Project: Flink
>  Issue Type: Sub-task
>  Components: Documentation
>Reporter: Junrui Li
>Assignee: Junrui Li
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33721) Extend BashJavaUtils to Support Reading and Writing Standard YAML Files

2023-12-01 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-33721:
--

Assignee: Junrui Li

> Extend BashJavaUtils to Support Reading and Writing Standard YAML Files
> ---
>
> Key: FLINK-33721
> URL: https://issues.apache.org/jira/browse/FLINK-33721
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Core
>Reporter: Junrui Li
>Assignee: Junrui Li
>Priority: Major
>
> Currently, Flink's shell scripts, such as those used for end-to-end (e2e) 
> testing and Docker image building, require the ability to read from and 
> modify Flink's configuration files. With the introduction of standard YAML 
> files as the configuration format, the existing shell scripts are not 
> equipped to correctly handle read and modify operations on these files.
> To accommodate this requirement and enhance our script capabilities, we 
> propose an extension to the BashJavaUtils functionality. This extension will 
> enable BashJavaUtils to support the reading and modifying of standard YAML 
> files, ensuring that our shell scripts can seamlessly interact with the new 
> configuration format.
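
As a rough illustration only (this is not the actual BashJavaUtils code; the class below is hypothetical and SnakeYAML is assumed to be on the classpath), reading and modifying a standard YAML file from Java could look roughly like this:

{code:java}
import org.yaml.snakeyaml.DumperOptions;
import org.yaml.snakeyaml.Yaml;

import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.Map;

/** Hypothetical sketch of reading and rewriting a standard YAML config file. */
public class YamlConfigSketch {

    public static void main(String[] args) throws IOException {
        DumperOptions options = new DumperOptions();
        options.setDefaultFlowStyle(DumperOptions.FlowStyle.BLOCK);
        Yaml yaml = new Yaml(options);

        // Read the whole file as a nested map (standard YAML, not the legacy flat format).
        Map<String, Object> config;
        try (FileReader reader = new FileReader("conf/config.yaml")) {
            config = yaml.load(reader);
        }

        // Modify a nested value, e.g. taskmanager.numberOfTaskSlots (assumes the key exists).
        @SuppressWarnings("unchecked")
        Map<String, Object> taskmanager = (Map<String, Object>) config.get("taskmanager");
        taskmanager.put("numberOfTaskSlots", 4);

        // Write the updated configuration back in block style.
        try (FileWriter writer = new FileWriter("conf/config.yaml")) {
            yaml.dump(config, writer);
        }
    }
}
{code}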



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33581) FLIP-381: Deprecate configuration getters/setters that return/set complex Java objects

2023-11-16 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-33581:
--

Assignee: Junrui Li

> FLIP-381: Deprecate configuration getters/setters that return/set complex 
> Java objects
> --
>
> Key: FLINK-33581
> URL: https://issues.apache.org/jira/browse/FLINK-33581
> Project: Flink
>  Issue Type: Technical Debt
>  Components: API / DataStream
>Reporter: Junrui Li
>Assignee: Junrui Li
>Priority: Major
>
> Deprecate the non-ConfigOption objects in StreamExecutionEnvironment, 
> CheckpointConfig, and ExecutionConfig, and ultimately remove them in Flink 
> 2.0.
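
For context, the ConfigOption-based style that would replace those setters looks roughly like the following. This is only a minimal sketch of the target style, not part of the FLIP itself; the option names are taken from current Flink and are assumed to still apply.

{code:java}
import org.apache.flink.configuration.CheckpointingOptions;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.RestartStrategyOptions;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ConfigOptionStyleSketch {

    public static void main(String[] args) {
        // Instead of env.setRestartStrategy(...) or
        // env.getCheckpointConfig().setCheckpointStorage(...), everything is
        // expressed through ConfigOptions on a Configuration instance.
        Configuration conf = new Configuration();
        conf.set(RestartStrategyOptions.RESTART_STRATEGY, "fixed-delay");
        conf.set(RestartStrategyOptions.RESTART_STRATEGY_FIXED_DELAY_ATTEMPTS, 3);
        conf.set(CheckpointingOptions.CHECKPOINT_STORAGE, "filesystem");
        conf.set(CheckpointingOptions.CHECKPOINTS_DIRECTORY, "file:///tmp/checkpoints");

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(conf);
        // ... build and execute the pipeline ...
    }
}
{code}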



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33576) Introduce new Flink conf file "config.yaml" supporting standard YAML syntax

2023-11-16 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-33576:
--

Assignee: Junrui Li

> Introduce new Flink conf file "config.yaml" supporting standard YAML syntax
> ---
>
> Key: FLINK-33576
> URL: https://issues.apache.org/jira/browse/FLINK-33576
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Configuration
>Reporter: Junrui Li
>Assignee: Junrui Li
>Priority: Major
>
> Introduce a new Flink conf file, config.yaml, which will be parsed with 
> standard YAML syntax.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33577) Make "conf.yaml" as the default Flink configuration file

2023-11-16 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-33577:
--

Assignee: Junrui Li

> Make "conf.yaml" as the default Flink configuration file
> 
>
> Key: FLINK-33577
> URL: https://issues.apache.org/jira/browse/FLINK-33577
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Configuration
>Reporter: Junrui Li
>Assignee: Junrui Li
>Priority: Major
>
> This update ensures that the Flink flink-dist package will include the new 
> configuration file "conf.yaml" by default.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33356) The navigation bar on Flink’s official website is messed up.

2023-11-01 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17781986#comment-17781986
 ] 

Lijie Wang commented on FLINK-33356:


The website is normal now, thanks [~JunRuiLi] [~Wencong Liu] :D

> The navigation bar on Flink’s official website is messed up.
> 
>
> Key: FLINK-33356
> URL: https://issues.apache.org/jira/browse/FLINK-33356
> Project: Flink
>  Issue Type: Bug
>  Components: Project Website
>Reporter: Junrui Li
>Assignee: Wencong Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
> Attachments: image-2023-10-25-11-55-52-653.png, 
> image-2023-10-25-12-34-22-790.png
>
>
> The side navigation bar on the Flink official website at the following link: 
> [https://nightlies.apache.org/flink/flink-docs-master/] appears to be messed 
> up, as shown in the attached screenshot.
> !image-2023-10-25-11-55-52-653.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-33356) The navigation bar on Flink’s official website is messed up.

2023-10-31 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang closed FLINK-33356.
--
Fix Version/s: 1.19.0
   Resolution: Fixed

> The navigation bar on Flink’s official website is messed up.
> 
>
> Key: FLINK-33356
> URL: https://issues.apache.org/jira/browse/FLINK-33356
> Project: Flink
>  Issue Type: Bug
>  Components: Project Website
>Reporter: Junrui Li
>Assignee: Wencong Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
> Attachments: image-2023-10-25-11-55-52-653.png, 
> image-2023-10-25-12-34-22-790.png
>
>
> The side navigation bar on the Flink official website at the following link: 
> [https://nightlies.apache.org/flink/flink-docs-master/] appears to be messed 
> up, as shown in the attached screenshot.
> !image-2023-10-25-11-55-52-653.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33356) The navigation bar on Flink’s official website is messed up.

2023-10-31 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17781274#comment-17781274
 ] 

Lijie Wang commented on FLINK-33356:


Fixed via master 780a673d8e2c3845d685c86d95166d9169601726

> The navigation bar on Flink’s official website is messed up.
> 
>
> Key: FLINK-33356
> URL: https://issues.apache.org/jira/browse/FLINK-33356
> Project: Flink
>  Issue Type: Bug
>  Components: Project Website
>Reporter: Junrui Li
>Assignee: Wencong Liu
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2023-10-25-11-55-52-653.png, 
> image-2023-10-25-12-34-22-790.png
>
>
> The side navigation bar on the Flink official website at the following link: 
> [https://nightlies.apache.org/flink/flink-docs-master/] appears to be messed 
> up, as shown in the attached screenshot.
> !image-2023-10-25-11-55-52-653.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33364) Introduce standard YAML for flink configuration

2023-10-25 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-33364:
--

Assignee: Junrui Li

> Introduce standard YAML for flink configuration
> ---
>
> Key: FLINK-33364
> URL: https://issues.apache.org/jira/browse/FLINK-33364
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Configuration
>Affects Versions: 1.19.0
>Reporter: Junrui Li
>Assignee: Junrui Li
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33356) The navigation bar on Flink’s official website is messed up.

2023-10-25 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779345#comment-17779345
 ] 

Lijie Wang commented on FLINK-33356:


[~Wencong Liu] Assigned to you.

> The navigation bar on Flink’s official website is messed up.
> 
>
> Key: FLINK-33356
> URL: https://issues.apache.org/jira/browse/FLINK-33356
> Project: Flink
>  Issue Type: Bug
>  Components: Project Website
>Reporter: Junrui Li
>Assignee: Wencong Liu
>Priority: Major
> Attachments: image-2023-10-25-11-55-52-653.png, 
> image-2023-10-25-12-34-22-790.png
>
>
> The side navigation bar on the Flink official website at the following link: 
> [https://nightlies.apache.org/flink/flink-docs-master/] appears to be messed 
> up, as shown in the attached screenshot.
> !image-2023-10-25-11-55-52-653.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33356) The navigation bar on Flink’s official website is messed up.

2023-10-25 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-33356:
--

Assignee: Wencong Liu

> The navigation bar on Flink’s official website is messed up.
> 
>
> Key: FLINK-33356
> URL: https://issues.apache.org/jira/browse/FLINK-33356
> Project: Flink
>  Issue Type: Bug
>  Components: Project Website
>Reporter: Junrui Li
>Assignee: Wencong Liu
>Priority: Major
> Attachments: image-2023-10-25-11-55-52-653.png, 
> image-2023-10-25-12-34-22-790.png
>
>
> The side navigation bar on the Flink official website at the following link: 
> [https://nightlies.apache.org/flink/flink-docs-master/] appears to be messed 
> up, as shown in the attached screenshot.
> !image-2023-10-25-11-55-52-653.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32974) RestClusterClient always leaks flink-rest-client-jobgraphs* directories

2023-09-20 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767357#comment-17767357
 ] 

Lijie Wang commented on FLINK-32974:


Fixed via
master: 6034d5ff335dc970672c290851811235399452fb
release-1.18: 2aeb99804ba56c008df0a1730f3246d3fea856b9
release-1.17: 7c9e05ea8c67b12c657b60cd5e6d1bea52b4f9a3

> RestClusterClient always leaks flink-rest-client-jobgraphs* directories
> ---
>
> Key: FLINK-32974
> URL: https://issues.apache.org/jira/browse/FLINK-32974
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Affects Versions: 1.18.0, 1.17.2
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Critical
>
> After FLINK-32226, a temporary directory (named 
> {{flink-rest-client-jobgraphs*}}) is created when creating a new 
> RestClusterClient, but this directory will never be cleaned up.
> This will result in a lot of {{flink-rest-client-jobgraphs*}} directories 
> under {{/tmp}}, especially when using 
> CollectDynamicSink/CollectResultFetcher, which may exhaust the available 
> inodes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-32974) RestClusterClient always leaks flink-rest-client-jobgraphs* directories

2023-09-20 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang resolved FLINK-32974.

Fix Version/s: 1.18.0
   1.17.2
   Resolution: Fixed

> RestClusterClient always leaks flink-rest-client-jobgraphs* directories
> ---
>
> Key: FLINK-32974
> URL: https://issues.apache.org/jira/browse/FLINK-32974
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Affects Versions: 1.18.0, 1.17.2
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Critical
> Fix For: 1.18.0, 1.17.2
>
>
> After FLINK-32226, a temporary directory (named 
> {{flink-rest-client-jobgraphs*}}) is created when creating a new 
> RestClusterClient, but this directory will never be cleaned up.
> This will result in a lot of {{flink-rest-client-jobgraphs*}} directories 
> under {{/tmp}}, especially when using 
> CollectDynamicSink/CollectResultFetcher, which may exhaust the available 
> inodes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32974) RestClusterClient always leaks flink-rest-client-jobgraphs* directories

2023-09-19 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-32974:
---
Description: 
After FLINK-32226, a temporary directory (named 
{{flink-rest-client-jobgraphs*}}) is created when creating a new 
RestClusterClient, but this directory will never be cleaned up.

This will result in a lot of {{flink-rest-client-jobgraphs*}} directories under 
{{/tmp}}, especially when using CollectDynamicSink/CollectResultFetcher, which 
may exhaust the available inodes.

  was:
After FLINK-32226, a temporary directory(named 
{{flink-rest-client-jobgraphs*}}) is created when creating a new 
RestClusterClient, but this directory will never be cleaned up.

This will result in a lot of {{flink-rest-client-jobgraphs*}} directories under 
{{/tmp}}, especially when using CollectDynamicSink/CollectResultFetcher.


> RestClusterClient always leaks flink-rest-client-jobgraphs* directories
> ---
>
> Key: FLINK-32974
> URL: https://issues.apache.org/jira/browse/FLINK-32974
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Affects Versions: 1.18.0, 1.17.2
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Critical
>
> After FLINK-32226, a temporary directory (named 
> {{flink-rest-client-jobgraphs*}}) is created when creating a new 
> RestClusterClient, but this directory will never be cleaned up.
> This will result in a lot of {{flink-rest-client-jobgraphs*}} directories 
> under {{/tmp}}, especially when using 
> CollectDynamicSink/CollectResultFetcher, which may exhaust the available 
> inodes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32974) RestClusterClient always leaks flink-rest-client-jobgraphs* directories

2023-09-18 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-32974:
---
Affects Version/s: 1.17.2

> RestClusterClient always leaks flink-rest-client-jobgraphs* directories
> ---
>
> Key: FLINK-32974
> URL: https://issues.apache.org/jira/browse/FLINK-32974
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Affects Versions: 1.18.0, 1.17.2
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Critical
>
> After FLINK-32226, a temporary directory (named 
> {{flink-rest-client-jobgraphs*}}) is created when creating a new 
> RestClusterClient, but this directory will never be cleaned up.
> This will result in a lot of {{flink-rest-client-jobgraphs*}} directories 
> under {{/tmp}}, especially when using CollectDynamicSink/CollectResultFetcher.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32974) RestClusterClient always leaks flink-rest-client-jobgraphs* directories

2023-09-18 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-32974:
---
Priority: Critical  (was: Major)

> RestClusterClient always leaks flink-rest-client-jobgraphs* directories
> ---
>
> Key: FLINK-32974
> URL: https://issues.apache.org/jira/browse/FLINK-32974
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Affects Versions: 1.18.0
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Critical
>
> After FLINK-32226, a temporary directory (named 
> {{flink-rest-client-jobgraphs*}}) is created when creating a new 
> RestClusterClient, but this directory will never be cleaned up.
> This will result in a lot of {{flink-rest-client-jobgraphs*}} directories 
> under {{/tmp}}, especially when using CollectDynamicSink/CollectResultFetcher.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33084) Migrate globalJobParameter in ExecutionConfig to configuration instance

2023-09-13 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-33084:
--

Assignee: Junrui Li

> Migrate globalJobParameter in ExecutionConfig to configuration instance
> ---
>
> Key: FLINK-33084
> URL: https://issues.apache.org/jira/browse/FLINK-33084
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Runtime / Configuration
>Reporter: Junrui Li
>Assignee: Junrui Li
>Priority: Major
> Fix For: 1.19.0
>
>
> Currently, the globalJobParameter field in ExecutionConfig has not been 
> migrated to the Configuration. Considering the goal of unifying configuration 
> options, it is necessary to migrate it.
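
For illustration, the user-facing difference of such a migration could look roughly like the following. This is only a sketch assuming the existing pipeline.global-job-parameters option is the migration target; the actual implementation details are up to this ticket.

{code:java}
import org.apache.flink.api.java.utils.ParameterTool;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.PipelineOptions;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

import java.util.Collections;
import java.util.Map;

public class GlobalJobParametersSketch {

    public static void main(String[] args) {
        Map<String, String> params = Collections.singletonMap("source.topic", "orders");

        // Today: the parameters live in a dedicated ExecutionConfig field.
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        env.getConfig().setGlobalJobParameters(ParameterTool.fromMap(params));

        // After the migration: the same information would be carried by a
        // ConfigOption (pipeline.global-job-parameters), i.e. as part of the
        // unified Configuration.
        Configuration conf = new Configuration();
        conf.set(PipelineOptions.GLOBAL_JOB_PARAMETERS, params);
        StreamExecutionEnvironment envFromConf =
                StreamExecutionEnvironment.getExecutionEnvironment(conf);
    }
}
{code}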



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33080) The checkpoint storage configured in the job level by config option will not take effect

2023-09-13 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-33080:
--

Assignee: Junrui Li

> The checkpoint storage configured in the job level by config option will not 
> take effect
> 
>
> Key: FLINK-33080
> URL: https://issues.apache.org/jira/browse/FLINK-33080
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Reporter: Junrui Li
>Assignee: Junrui Li
>Priority: Major
> Fix For: 1.19.0
>
>
> When we configure the checkpoint storage at the job level, it can only be 
> done through the following method:
> {code:java}
> StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
> env.getCheckpointConfig().setCheckpointStorage(xxx); {code}
> However, configuring the checkpoint storage via the job-side configuration, as 
> in the following, will not take effect:
> {code:java}
> Configuration configuration = new Configuration();
> StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment(configuration);
> configuration.set(CheckpointingOptions.CHECKPOINT_STORAGE, "filesystem");
> configuration.set(CheckpointingOptions.CHECKPOINTS_DIRECTORY, checkpointDir); 
> {code}
> This behavior is unexpected; we should make this approach take effect as well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-32974) RestClusterClient always leaks flink-rest-client-jobgraphs* directories

2023-09-05 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-32974:
--

Assignee: Lijie Wang

> RestClusterClient always leaks flink-rest-client-jobgraphs* directories
> ---
>
> Key: FLINK-32974
> URL: https://issues.apache.org/jira/browse/FLINK-32974
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Affects Versions: 1.18.0
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>
> After FLINK-32226, a temporary directory (named 
> {{flink-rest-client-jobgraphs*}}) is created when creating a new 
> RestClusterClient, but this directory will never be cleaned up.
> This will result in a lot of {{flink-rest-client-jobgraphs*}} directories 
> under {{/tmp}}, especially when using CollectDynamicSink/CollectResultFetcher.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32865) DynamicFilteringDataCollectorOperator can't chain with the upstream operator when the parallelism is inconsistent

2023-08-30 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang closed FLINK-32865.
--
  Assignee: Lijie Wang
Resolution: Fixed

> DynamicFilteringDataCollectorOperator can't chain with the upstream operator 
> when the parallelism is inconsistent
> -
>
> Key: FLINK-32865
> URL: https://issues.apache.org/jira/browse/FLINK-32865
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.16.2, 1.18.0, 1.17.1
>Reporter: dalongliu
>Assignee: Lijie Wang
>Priority: Major
> Fix For: 1.18.0
>
> Attachments: image-2023-08-14-19-17-22-109.png
>
>
> !image-2023-08-14-19-17-22-109.png!
>  
> If the DynamicFilteringDataCollectorOperator parallelism is not consistent 
> with the upstream operator, they can't chain together, which causes the 
> DynamicFilteringDataCollectorOperator to execute after the fact source, so 
> the DPP won't work. Because the operator parallelism is decided at runtime, 
> we should forcibly add a scheduler dependency in the compile phase.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32865) DynamicFilteringDataCollectorOperator can't chain with the upstream operator when the parallelism is inconsistent

2023-08-30 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-32865:
---
Fix Version/s: 1.19.0

> DynamicFilteringDataCollectorOperator can't chain with the upstream operator 
> when the parallelism is inconsistent
> -
>
> Key: FLINK-32865
> URL: https://issues.apache.org/jira/browse/FLINK-32865
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.16.2, 1.18.0, 1.17.1
>Reporter: dalongliu
>Assignee: Lijie Wang
>Priority: Major
> Fix For: 1.18.0, 1.19.0
>
> Attachments: image-2023-08-14-19-17-22-109.png
>
>
> !image-2023-08-14-19-17-22-109.png!
>  
> If the DynamicFilteringDataCollectorOperator parallelism is not consistent 
> with the upstream operator, they can't chain together, which causes the 
> DynamicFilteringDataCollectorOperator to execute after the fact source, so 
> the DPP won't work. Because the operator parallelism is decided at runtime, 
> we should forcibly add a scheduler dependency in the compile phase.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32865) DynamicFilteringDataCollectorOperator can't chain with the upstream operator when the parallelism is inconsistent

2023-08-30 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760389#comment-17760389
 ] 

Lijie Wang commented on FLINK-32865:


Fixed via:
master(1.19):
5ce5c8142c8c360bc10804fa1e52b13a3376a50f
3fffed00df3234106464642f555275b846ff74b3

release-1.18:
83915d909dce3832ba2c3dbd9b1624715466aa2d
94ba97a13abc7e70a55d929fd5fdf85d52417639

> DynamicFilteringDataCollectorOperator can't chain with the upstream operator 
> when the parallelism is inconsistent
> -
>
> Key: FLINK-32865
> URL: https://issues.apache.org/jira/browse/FLINK-32865
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.16.2, 1.18.0, 1.17.1
>Reporter: dalongliu
>Priority: Major
> Fix For: 1.18.0
>
> Attachments: image-2023-08-14-19-17-22-109.png
>
>
> !image-2023-08-14-19-17-22-109.png!
>  
> If the DynamicFilteringDataCollectorOperator parallelism is not consistent 
> with the upstream operator, they can't chain together, which causes the 
> DynamicFilteringDataCollectorOperator to execute after the fact source, so 
> the DPP won't work. Because the operator parallelism is decided at runtime, 
> we should forcibly add a scheduler dependency in the compile phase.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31650) Incorrect busyMsTimePerSecond metric value for FINISHED task

2023-08-29 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760197#comment-17760197
 ] 

Lijie Wang commented on FLINK-31650:


Assigned to you [~Zhanghao Chen] 

> Incorrect busyMsTimePerSecond metric value for FINISHED task
> 
>
> Key: FLINK-31650
> URL: https://issues.apache.org/jira/browse/FLINK-31650
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Metrics, Runtime / REST
>Affects Versions: 1.17.0, 1.16.1, 1.15.4
>Reporter: Lijie Wang
>Assignee: Junrui Li
>Priority: Major
> Attachments: busyMsTimePerSecond.png
>
>
> As shown in the figure below, the busyMsTimePerSecond of the FINISHED task is 
> 100%, which is obviously unreasonable.
> !busyMsTimePerSecond.png|width=1048,height=432!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-31650) Incorrect busyMsTimePerSecond metric value for FINISHED task

2023-08-29 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-31650:
--

Assignee: Zhanghao Chen  (was: Junrui Li)

> Incorrect busyMsTimePerSecond metric value for FINISHED task
> 
>
> Key: FLINK-31650
> URL: https://issues.apache.org/jira/browse/FLINK-31650
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Metrics, Runtime / REST
>Affects Versions: 1.17.0, 1.16.1, 1.15.4
>Reporter: Lijie Wang
>Assignee: Zhanghao Chen
>Priority: Major
> Attachments: busyMsTimePerSecond.png
>
>
> As shown in the figure below, the busyMsTimePerSecond of the FINISHED task is 
> 100%, which is obviously unreasonable.
> !busyMsTimePerSecond.png|width=1048,height=432!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32974) RestClusterClient always leaks flink-rest-client-jobgraphs* directories

2023-08-28 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-32974:
---
Summary: RestClusterClient always leaks flink-rest-client-jobgraphs* 
directories  (was: RestClusterClient leaks flink-rest-client-jobgraphs* 
directories)

> RestClusterClient always leaks flink-rest-client-jobgraphs* directories
> ---
>
> Key: FLINK-32974
> URL: https://issues.apache.org/jira/browse/FLINK-32974
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Affects Versions: 1.18.0
>Reporter: Lijie Wang
>Priority: Major
>
> After FLINK-32226, a temporary directory (named 
> {{flink-rest-client-jobgraphs*}}) is created when creating a new 
> RestClusterClient, but this directory will never be cleaned up.
> This will result in a lot of {{flink-rest-client-jobgraphs*}} directories 
> under {{/tmp}}, especially when using CollectDynamicSink/CollectResultFetcher.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32974) RestClusterClient leaks flink-rest-client-jobgraphs* directories

2023-08-28 Thread Lijie Wang (Jira)
Lijie Wang created FLINK-32974:
--

 Summary: RestClusterClient leaks flink-rest-client-jobgraphs* 
directories
 Key: FLINK-32974
 URL: https://issues.apache.org/jira/browse/FLINK-32974
 Project: Flink
  Issue Type: Bug
  Components: Client / Job Submission
Affects Versions: 1.18.0
Reporter: Lijie Wang


After FLINK-32226, a temporary directory (named 
{{flink-rest-client-jobgraphs*}}) is created when creating a new 
RestClusterClient, but this directory will never be cleaned up.

This will result in a lot of {{flink-rest-client-jobgraphs*}} directories under 
{{/tmp}}, especially when using CollectDynamicSink/CollectResultFetcher.
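
For illustration only (this is not the committed fix, just a generic JDK-based sketch of the missing cleanup step), the temporary directory could be deleted recursively when the client is closed:

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

/** Sketch of the kind of cleanup the client could perform on close (illustrative only). */
public final class TempDirCleanupSketch {

    /** Recursively deletes the given temporary directory, children first. */
    public static void deleteRecursively(Path tempDir) throws IOException {
        if (!Files.exists(tempDir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(tempDir)) {
            paths.sorted(Comparator.reverseOrder())
                    .forEach(
                            path -> {
                                try {
                                    Files.deleteIfExists(path);
                                } catch (IOException e) {
                                    // Best effort: a leftover file is better than failing close().
                                }
                            });
        }
    }
}
{code}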



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32831) RuntimeFilterProgram should aware join type when looking for the build side

2023-08-27 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang closed FLINK-32831.
--
Fix Version/s: 1.18.0
   1.19.0
   Resolution: Fixed

> RuntimeFilterProgram should aware join type when looking for the build side
> ---
>
> Key: FLINK-32831
> URL: https://issues.apache.org/jira/browse/FLINK-32831
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0, 1.19.0
>
>
> Currently, runtime filter program will try to look for an {{Exchange}} as 
> build side to avoid affecting {{MultiInput}}. It will try to push down the 
> runtime filter builder if the original build side is not {{Exchange}}.
> Currenlty, the builder-push-down does not aware the join type, which may lead 
> to incorrect results(For example, push down the builder to the right input of 
> left-join).
> We should only support following cases:
> 1. Inner join: builder can push to left + right input
> 2. semi join: builder can push to left + right input
> 3. left join: builder can only push to the left input
> 4. right join: builder can only push to the right input



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32831) RuntimeFilterProgram should aware join type when looking for the build side

2023-08-27 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759405#comment-17759405
 ] 

Lijie Wang commented on FLINK-32831:


Fixed via
master:
43bcf319ff6104e1af829e34ae8a430c6417622d
f518f8a5bdfcde4912961f60075e530399160f43
07888af59bbc2d24afa049e8a6aedcd9eb822986
a68dd419718b4304343c2b27dab94394c88c67b5

release-1.18:
ac3f3cf4802d0271349636322dbd16772e86453b
e48a02628349cdfdecf7a1aedd2a576fa2a7caf3
6938b928e0104f6b222a4281fbdc7bde20260f3b
fa2ab5e3bfe401d0c599c8ffd224aafb69d6d503

> RuntimeFilterProgram should aware join type when looking for the build side
> ---
>
> Key: FLINK-32831
> URL: https://issues.apache.org/jira/browse/FLINK-32831
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
>
> Currently, runtime filter program will try to look for an {{Exchange}} as 
> build side to avoid affecting {{MultiInput}}. It will try to push down the 
> runtime filter builder if the original build side is not {{Exchange}}.
> Currenlty, the builder-push-down does not aware the join type, which may lead 
> to incorrect results(For example, push down the builder to the right input of 
> left-join).
> We should only support following cases:
> 1. Inner join: builder can push to left + right input
> 2. semi join: builder can push to left + right input
> 3. left join: builder can only push to the left input
> 4. right join: builder can only push to the right input



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32780) Release Testing: Verify FLIP-324: Introduce Runtime Filter for Flink Batch Jobs

2023-08-25 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-32780:
---
Description: 
This issue aims to verify FLIP-324: 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-324%3A+Introduce+Runtime+Filter+for+Flink+Batch+Jobs

We can enable the runtime filter by setting 
table.optimizer.runtime-filter.enabled: true (see the sketch below).

1. Create two tables, one small table (small amount of data) and one large table 
(large amount of data), and then run a join query on these two tables (such as the 
example in the FLIP doc: SELECT * FROM fact, dim WHERE x = a AND z = 2). The Flink 
table planner should be able to obtain the statistics of these two tables (for 
example, Hive tables). The data volume of the small table should be less than 
"table.optimizer.runtime-filter.max-build-data-size", and the data volume of the 
large table should be larger than 
"table.optimizer.runtime-filter.min-probe-data-size".

2. Show the plan of the join query. The plan should include nodes such as 
LocalRuntimeFilterBuilder, GlobalRuntimeFilterBuilder and RuntimeFilter. We can 
also verify the plan for various variants of the above query.

3. Execute the above plan, and: 
* Check whether the data from the large table has been successfully filtered.
* Verify the execution result; it should be the same as when the runtime filter is 
disabled.
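
A minimal Table API sketch of the steps above (it assumes the fact and dim tables are already registered in the catalog, e.g. as Hive tables with statistics available, and uses the option key exactly as given in the description):

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class RuntimeFilterVerificationSketch {

    public static void main(String[] args) {
        // Batch mode, since the runtime filter targets batch jobs.
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inBatchMode());

        // Step 0: enable the runtime filter (disabled by default).
        tEnv.getConfig()
                .getConfiguration()
                .setString("table.optimizer.runtime-filter.enabled", "true");

        // Step 2: the plan should contain LocalRuntimeFilterBuilder,
        // GlobalRuntimeFilterBuilder and RuntimeFilter nodes.
        System.out.println(
                tEnv.explainSql("SELECT * FROM fact, dim WHERE x = a AND z = 2"));

        // Step 3: execute and compare the result with a run where the option is "false".
        tEnv.executeSql("SELECT * FROM fact, dim WHERE x = a AND z = 2").print();
    }
}
{code}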

> Release Testing: Verify FLIP-324: Introduce Runtime Filter for Flink Batch 
> Jobs
> ---
>
> Key: FLINK-32780
> URL: https://issues.apache.org/jira/browse/FLINK-32780
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 1.18.0
>Reporter: Qingsheng Ren
>Assignee: dalongliu
>Priority: Major
> Fix For: 1.18.0
>
>
> This issue aims to verify FLIP-324: 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-324%3A+Introduce+Runtime+Filter+for+Flink+Batch+Jobs
> We can enable the runtime filter by setting 
> table.optimizer.runtime-filter.enabled: true.
> 1. Create two tables, one small table (small amount of data) and one large table 
> (large amount of data), and then run a join query on these two tables (such as 
> the example in the FLIP doc: SELECT * FROM fact, dim WHERE x = a AND z = 2). The 
> Flink table planner should be able to obtain the statistics of these two tables 
> (for example, Hive tables). The data volume of the small table should be less 
> than "table.optimizer.runtime-filter.max-build-data-size", and the data volume 
> of the large table should be larger than 
> "table.optimizer.runtime-filter.min-probe-data-size".
> 2. Show the plan of the join query. The plan should include nodes such as 
> LocalRuntimeFilterBuilder, GlobalRuntimeFilterBuilder and RuntimeFilter. We 
> can also verify the plan for various variants of the above query.
> 3. Execute the above plan, and: 
> * Check whether the data from the large table has been successfully filtered.
> * Verify the execution result; it should be the same as when the runtime filter 
> is disabled.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-32780) Release Testing: Verify FLIP-324: Introduce Runtime Filter for Flink Batch Jobs

2023-08-24 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-32780:
--

Assignee: dalongliu

> Release Testing: Verify FLIP-324: Introduce Runtime Filter for Flink Batch 
> Jobs
> ---
>
> Key: FLINK-32780
> URL: https://issues.apache.org/jira/browse/FLINK-32780
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 1.18.0
>Reporter: Qingsheng Ren
>Assignee: dalongliu
>Priority: Major
> Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32827) Operator fusion codegen may not take effect when enable runtime filter

2023-08-23 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758317#comment-17758317
 ] 

Lijie Wang commented on FLINK-32827:


Fixed via
master: 679436390db2ac1b54584dbafb6e1091f2f16ada
release-1.18: 9cd04bd004fbf2d0fa4266cf07909c0bbcc94813

> Operator fusion codegen may not take effect when enable runtime filter
> --
>
> Key: FLINK-32827
> URL: https://issues.apache.org/jira/browse/FLINK-32827
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Lijie Wang
>Assignee: dalongliu
>Priority: Major
>  Labels: pull-request-available
>
> Currently, the RuntimeFilterOperator does not support operator fusion 
> codegen (OFCG), which means the runtime filter and OFCG cannot take effect 
> together; we should fix it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32827) Operator fusion codegen may not take effect when enable runtime filter

2023-08-23 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang closed FLINK-32827.
--
Fix Version/s: 1.18.0
   Resolution: Fixed

> Operator fusion codegen may not take effect when enable runtime filter
> --
>
> Key: FLINK-32827
> URL: https://issues.apache.org/jira/browse/FLINK-32827
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Lijie Wang
>Assignee: dalongliu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>
> Currently, the RuntimeFilterOperator does not support operator fusion 
> codegen (OFCG), which means the runtime filter and OFCG cannot take effect 
> together; we should fix it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32844) Runtime Filter should not be applied if the field is already filtered by DPP

2023-08-16 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang closed FLINK-32844.
--
Fix Version/s: 1.18.0
   Resolution: Fixed

> Runtime Filter should not be applied if the field is already filtered by DPP
> 
>
> Key: FLINK-32844
> URL: https://issues.apache.org/jira/browse/FLINK-32844
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.18.0
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>
> Currently, the runtime filter and DPP may take effect on the same key. In 
> this case, the runtime filter may be redundant because the data may have been 
> filtered out by the DPP. We should avoid this because redundant runtime 
> filters can have negative effects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32844) Runtime Filter should not be applied if the field is already filtered by DPP

2023-08-16 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17754890#comment-17754890
 ] 

Lijie Wang commented on FLINK-32844:


Fixed via master (1.18): 6b7d6d67808b7fcdd6fc93222c6c7055c4343475

> Runtime Filter should not be applied if the field is already filtered by DPP
> 
>
> Key: FLINK-32844
> URL: https://issues.apache.org/jira/browse/FLINK-32844
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.18.0
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
>
> Currently, the runtime filter and DPP may take effect on the same key. In 
> this case, the runtime filter may be redundant because the data may have been 
> filtered out by the DPP. We should avoid this because redundant runtime 
> filters can have negative effects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-32876) ExecutionTimeBasedSlowTaskDetector treats unscheduled tasks as slow tasks and causes speculative execution to fail.

2023-08-15 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-32876:
--

Assignee: Junrui Li

> ExecutionTimeBasedSlowTaskDetector treats unscheduled tasks as slow tasks and 
> causes speculative execution to fail.
> ---
>
> Key: FLINK-32876
> URL: https://issues.apache.org/jira/browse/FLINK-32876
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.18.0
>Reporter: Junrui Li
>Assignee: Junrui Li
>Priority: Major
> Fix For: 1.18.0
>
>
> When we enable speculative execution and configure the job with the following 
> configuration:
> {code:java}
> execution.batch.speculative.enabled: true
> slow-task-detector.execution-time.baseline-ratio: 0.0
> slow-task-detector.execution-time.baseline-lower-bound: 0s{code}
> The ExecutionTimeBasedSlowTaskDetector will treat the tasks of an 
> ExecutionJobVertex that has not yet been scheduled as slow tasks and report 
> them to the SpeculativeScheduler. However, the SpeculativeScheduler requires that the 
> corresponding ExecutionVertex has entered the scheduled state before 
> scheduling backup tasks. If this requirement is not met, it will result in 
> speculative execution failure.
> The exception stack trace is as follows:
> {code:java}
> java.lang.IllegalStateException: Execution vertex 
> b3f44e8b1dc132ff2a47f7955c75ef7d_0 does not have a recorded version  at 
> org.apache.flink.util.Preconditions.checkState(Preconditions.java:215) 
> ~[flink-dist-1.18-SNAPSHOT.jar:1.18-SNAPSHOT]  at 
> org.apache.flink.runtime.scheduler.ExecutionVertexVersioner.getCurrentVersion(ExecutionVertexVersioner.java:71)
>  ~[flink-dist-1.18-SNAPSHOT.jar:1.18-SNAPSHOT]  at 
> org.apache.flink.runtime.scheduler.ExecutionVertexVersioner.lambda$getExecutionVertexVersions$1(ExecutionVertexVersioner.java:89)
>  ~[flink-dist-1.18-SNAPSHOT.jar:1.18-SNAPSHOT]  at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) 
> ~[?:1.8.0_333]  at 
> java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1580) 
> ~[?:1.8.0_333]  at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) 
> ~[?:1.8.0_333]  at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) 
> ~[?:1.8.0_333]  at 
> java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) 
> ~[?:1.8.0_333]  at 
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) 
> ~[?:1.8.0_333]  at 
> java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) 
> ~[?:1.8.0_333]  at 
> org.apache.flink.runtime.scheduler.ExecutionVertexVersioner.getExecutionVertexVersions(ExecutionVertexVersioner.java:90)
>  ~[flink-dist-1.18-SNAPSHOT.jar:1.18-SNAPSHOT]  at 
> org.apache.flink.runtime.scheduler.adaptivebatch.SpeculativeScheduler.notifySlowTasks(SpeculativeScheduler.java:377)
>  ~[flink-dist-1.18-SNAPSHOT.jar:1.18-SNAPSHOT]  at 
> org.apache.flink.runtime.scheduler.slowtaskdetector.ExecutionTimeBasedSlowTaskDetector.lambda$scheduleTask$1(ExecutionTimeBasedSlowTaskDetector.java:129)
>  ~[flink-dist-1.18-SNAPSHOT.jar:1.18-SNAPSHOT]  at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[?:1.8.0_333]  at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[?:1.8.0_333]  at 
> org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.lambda$handleRunAsync$4(PekkoRpcActor.java:451)
>  ~[flink-rpc-akka48a43f0a-d73c-494a-a57b-ded9f5d82a84.jar:1.18-SNAPSHOT]  at 
> org.apache.flink.runtime.concurrent.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68)
>  ~[flink-dist-1.18-SNAPSHOT.jar:1.18-SNAPSHOT]  at 
> org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRunAsync(PekkoRpcActor.java:451)
>  ~[flink-rpc-akka48a43f0a-d73c-494a-a57b-ded9f5d82a84.jar:1.18-SNAPSHOT]  at 
> org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRpcMessage(PekkoRpcActor.java:218)
>  ~[flink-rpc-akka48a43f0a-d73c-494a-a57b-ded9f5d82a84.jar:1.18-SNAPSHOT]  at 
> org.apache.flink.runtime.rpc.pekko.FencedPekkoRpcActor.handleRpcMessage(FencedPekkoRpcActor.java:85)
>  ~[flink-rpc-akka48a43f0a-d73c-494a-a57b-ded9f5d82a84.jar:1.18-SNAPSHOT]  at 
> org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleMessage(PekkoRpcActor.java:168)
>  ~[flink-rpc-akka48a43f0a-d73c-494a-a57b-ded9f5d82a84.jar:1.18-SNAPSHOT]  at 
> org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:33) 
> [flink-rpc-akka48a43f0a-d73c-494a-a57b-ded9f5d82a84.jar:1.18-SNAPSHOT]  at 
> org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:29) 
> [flink-rpc-akka48a43f0a-d73c-494a-a57b-ded9f5d82a84.jar:1.18-SNAPSHOT]  at 
> 

[jira] [Assigned] (FLINK-32844) Runtime Filter should not be applied if the field is already filtered by DPP

2023-08-11 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-32844:
--

Assignee: Lijie Wang

> Runtime Filter should not be applied if the field is already filtered by DPP
> 
>
> Key: FLINK-32844
> URL: https://issues.apache.org/jira/browse/FLINK-32844
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.18.0
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>
> Currently, the runtime filter and DPP may take effect on the same key. In 
> this case, the runtime filter may be redundant because the data may have been 
> filtered out by the DPP. We should avoid this because redundant runtime 
> filters can have negative effects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32844) Runtime Filter should not be applied if the field is already filtered by DPP

2023-08-11 Thread Lijie Wang (Jira)
Lijie Wang created FLINK-32844:
--

 Summary: Runtime Filter should not be applied if the field is 
already filtered by DPP
 Key: FLINK-32844
 URL: https://issues.apache.org/jira/browse/FLINK-32844
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.18.0
Reporter: Lijie Wang


Currently, the runtime filter and DPP may take effect on the same key. In this 
case, the runtime filter may be redundant because the data may have been 
filtered out by the DPP. We should avoid this because redundant runtime filters 
can have negative effects.
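
The intended check can be illustrated with a tiny hypothetical helper (the names below are purely illustrative; the actual change lives in the planner's runtime filter program):

{code:java}
import java.util.Set;

/** Hypothetical helper illustrating the intended check. */
public final class RuntimeFilterDppCheckSketch {

    /**
     * Returns true if injecting a runtime filter on the given keys is still useful,
     * i.e. the keys are not already covered by dynamic partition pruning (DPP).
     */
    public static boolean shouldInjectRuntimeFilter(
            Set<String> runtimeFilterKeys, Set<String> dppKeys) {
        // If every runtime-filter key is already filtered by DPP, the runtime
        // filter is redundant and would only add overhead.
        return !dppKeys.containsAll(runtimeFilterKeys);
    }
}
{code}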



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-32831) RuntimeFilterProgram should aware join type when looking for the build side

2023-08-10 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-32831:
--

Assignee: Lijie Wang

> RuntimeFilterProgram should aware join type when looking for the build side
> ---
>
> Key: FLINK-32831
> URL: https://issues.apache.org/jira/browse/FLINK-32831
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>
> Currently, the runtime filter program will try to look for an {{Exchange}} as 
> the build side to avoid affecting {{MultiInput}}. It will try to push down the 
> runtime filter builder if the original build side is not an {{Exchange}}.
> Currently, the builder push-down is not aware of the join type, which may lead 
> to incorrect results (for example, pushing the builder down to the right input 
> of a left join).
> We should only support the following cases:
> 1. Inner join: the builder can be pushed to the left + right input
> 2. Semi join: the builder can be pushed to the left + right input
> 3. Left join: the builder can only be pushed to the left input
> 4. Right join: the builder can only be pushed to the right input



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32831) RuntimeFilterProgram should aware join type when looking for the build side

2023-08-10 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-32831:
---
Description: 
Currently, the runtime filter program will try to look for an {{Exchange}} as the 
build side to avoid affecting {{MultiInput}}. It will try to push down the runtime 
filter builder if the original build side is not an {{Exchange}}.

Currently, the builder push-down is not aware of the join type, which may lead 
to incorrect results (for example, pushing the builder down to the right input of 
a left join).

We should only support the following cases:

1. Inner join: the builder can be pushed to the left + right input
2. Semi join: the builder can be pushed to the left + right input
3. Left join: the builder can only be pushed to the left input
4. Right join: the builder can only be pushed to the right input

  was:
Currently, runtime filter program will try to look for an {{Exchange}} as build 
side to avoid affecting {{MultiInput}}. It will try to push down the runtime 
filter builder if the original build side is not {{Exchange}}.

Currenlty, the builder-push-down does not aware the join type, which may lead 
to incorrect results(For example, push down the builder to the right input of 
left-join).

We should only support following cases:
1. Inner join: builder can push to left + right input
2. semi join: builder can push to left + right input
3. left join: builder can only push to the left input
4. right join: builder can only push to the right input


> RuntimeFilterProgram should aware join type when looking for the build side
> ---
>
> Key: FLINK-32831
> URL: https://issues.apache.org/jira/browse/FLINK-32831
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Lijie Wang
>Priority: Major
>
> Currently, the runtime filter program will try to look for an {{Exchange}} as 
> the build side to avoid affecting {{MultiInput}}. It will try to push down the 
> runtime filter builder if the original build side is not an {{Exchange}}.
> Currently, the builder push-down is not aware of the join type, which may lead 
> to incorrect results (for example, pushing the builder down to the right input 
> of a left join).
> We should only support the following cases:
> 1. Inner join: the builder can be pushed to the left + right input
> 2. Semi join: the builder can be pushed to the left + right input
> 3. Left join: the builder can only be pushed to the left input
> 4. Right join: the builder can only be pushed to the right input



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32831) RuntimeFilterProgram should aware join type when looking for the build side

2023-08-10 Thread Lijie Wang (Jira)
Lijie Wang created FLINK-32831:
--

 Summary: RuntimeFilterProgram should aware join type when looking 
for the build side
 Key: FLINK-32831
 URL: https://issues.apache.org/jira/browse/FLINK-32831
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.18.0
Reporter: Lijie Wang


Currently, the runtime filter program will try to look for an {{Exchange}} as the 
build side to avoid affecting {{MultiInput}}. It will try to push down the runtime 
filter builder if the original build side is not an {{Exchange}}.

Currently, the builder push-down is not aware of the join type, which may lead to 
incorrect results (for example, pushing the builder down to the right input of a 
left join).

We should only support the following cases (see the sketch below):
1. Inner join: the builder can be pushed to the left + right input
2. Semi join: the builder can be pushed to the left + right input
3. Left join: the builder can only be pushed to the left input
4. Right join: the builder can only be pushed to the right input
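
The four cases can be summarized in a small sketch (illustrative only; it uses a local enum rather than the planner's actual join-type classes):

{code:java}
/** Illustrative only: the real rule is implemented in the runtime filter planner program. */
public final class RuntimeFilterPushDownSketch {

    enum JoinType { INNER, SEMI, LEFT, RIGHT }

    /** Whether the runtime filter builder may be pushed into the left input of the join. */
    static boolean canPushBuilderToLeftInput(JoinType joinType) {
        return joinType == JoinType.INNER
                || joinType == JoinType.SEMI
                || joinType == JoinType.LEFT;
    }

    /** Whether the runtime filter builder may be pushed into the right input of the join. */
    static boolean canPushBuilderToRightInput(JoinType joinType) {
        return joinType == JoinType.INNER
                || joinType == JoinType.SEMI
                || joinType == JoinType.RIGHT;
    }
}
{code}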



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-32827) Operator fusion codegen may not take effect when enable runtime filter

2023-08-10 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-32827:
--

Assignee: dalongliu

> Operator fusion codegen may not take effect when enable runtime filter
> --
>
> Key: FLINK-32827
> URL: https://issues.apache.org/jira/browse/FLINK-32827
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Lijie Wang
>Assignee: dalongliu
>Priority: Major
>
> Currently, the RuntimeFilterOperator does not support operator fusion 
> codegen (OFCG), which means the runtime filter and OFCG cannot take effect 
> together; we should fix it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32827) Operator fusion codegen may not take effect when enable runtime filter

2023-08-10 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17752724#comment-17752724
 ] 

Lijie Wang commented on FLINK-32827:


cc [~lsy]

> Operator fusion codegen may not take effect when enable runtime filter
> --
>
> Key: FLINK-32827
> URL: https://issues.apache.org/jira/browse/FLINK-32827
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Lijie Wang
>Assignee: dalongliu
>Priority: Major
>
> Currently, the RuntimeFilterOperator does not support operator fusion 
> codegen (OFCG), which means the runtime filter and OFCG cannot take effect 
> together; we should fix it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32827) Operator fusion codegen may not take effect when enable runtime filter

2023-08-10 Thread Lijie Wang (Jira)
Lijie Wang created FLINK-32827:
--

 Summary: Operator fusion codegen may not take effect when enable 
runtime filter
 Key: FLINK-32827
 URL: https://issues.apache.org/jira/browse/FLINK-32827
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.18.0
Reporter: Lijie Wang


Currently, the RuntimeFilterOperator does not support operator fusion 
codegen (OFCG), which means the runtime filter and OFCG cannot take effect 
together; we should fix it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-32788) SpeculativeScheduler do not handle allocate task errors when schedule speculative tasks may causes resource leakage.

2023-08-08 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-32788:
--

Assignee: Junrui Li

> SpeculativeScheduler do not handle allocate task errors when schedule 
> speculative tasks may causes resource leakage.
> 
>
> Key: FLINK-32788
> URL: https://issues.apache.org/jira/browse/FLINK-32788
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.18.0
>Reporter: Junrui Li
>Assignee: Junrui Li
>Priority: Major
>
> When the SpeculativeScheduler allocates slots for speculative tasks, 
> exceptions may occur, but there is currently no exception handling mechanism 
> in place. This can lead to resource leakage (such as FLINK-32768) when errors 
> occur. In such cases, a fatalError should be triggered.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32768) SpeculativeSchedulerITCase.testSpeculativeExecutionOfInputFormatSource times out

2023-08-07 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-32768:
---
Priority: Blocker  (was: Critical)

> SpeculativeSchedulerITCase.testSpeculativeExecutionOfInputFormatSource times 
> out
> 
>
> Key: FLINK-32768
> URL: https://issues.apache.org/jira/browse/FLINK-32768
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination, Tests
>Affects Versions: 1.18.0
>Reporter: Matthias Pohl
>Assignee: Junrui Li
>Priority: Blocker
>  Labels: test-stability
>
> SpeculativeSchedulerITCase.testSpeculativeExecutionOfInputFormatSource is 
> timing out:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=52009=logs=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3=0c010d0c-3dec-5bf1-d408-7b18988b1b2b=9022
> {code}
> 2023-08-06T05:34:27.1867230Z Aug 06 05:34:27 "ForkJoinPool-1-worker-1" #14 
> daemon prio=5 os_prio=0 tid=0x7fb7d4e82800 nid=0x6dde7 waiting on 
> condition [0x7fb7834a4000]
> 2023-08-06T05:34:27.1867541Z Aug 06 05:34:27java.lang.Thread.State: 
> WAITING (parking)
> 2023-08-06T05:34:27.186Z Aug 06 05:34:27  at sun.misc.Unsafe.park(Native 
> Method)
> 2023-08-06T05:34:27.1868191Z Aug 06 05:34:27  - parking to wait for  
> <0xa77360d8> (a java.util.concurrent.CompletableFuture$Signaller)
> 2023-08-06T05:34:27.1868571Z Aug 06 05:34:27  at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> 2023-08-06T05:34:27.1868896Z Aug 06 05:34:27  at 
> java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)
> 2023-08-06T05:34:27.1869240Z Aug 06 05:34:27  at 
> java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3313)
> 2023-08-06T05:34:27.1869682Z Aug 06 05:34:27  at 
> java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
> 2023-08-06T05:34:27.1870022Z Aug 06 05:34:27  at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
> 2023-08-06T05:34:27.1870395Z Aug 06 05:34:27  at 
> org.apache.flink.test.scheduling.SpeculativeSchedulerITCase.executeJob(SpeculativeSchedulerITCase.java:229)
> 2023-08-06T05:34:27.1870858Z Aug 06 05:34:27  at 
> org.apache.flink.test.scheduling.SpeculativeSchedulerITCase.testSpeculativeExecutionOfInputFormatSource(SpeculativeSchedulerITCase.java:165)
> 2023-08-06T05:34:27.1871251Z Aug 06 05:34:27  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [...]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32680) Job vertex names get messed up once there is a source vertex chained with a MultipleInput vertex in job graph

2023-08-01 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang closed FLINK-32680.
--
Fix Version/s: 1.18.0
   1.16.3
   1.17.2
   Resolution: Fixed

> Job vertex names get messed up once there is a source vertex chained with a 
> MultipleInput vertex in job graph
> -
>
> Key: FLINK-32680
> URL: https://issues.apache.org/jira/browse/FLINK-32680
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.16.2, 1.18.0, 1.17.1
>Reporter: Lijie Wang
>Assignee: Junrui Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0, 1.16.3, 1.17.2
>
> Attachments: image-2023-07-26-15-23-29-551.png, 
> image-2023-07-26-15-24-24-077.png
>
>
> Take the following test (put it into {{MultipleInputITCase}}) as an example:
> {code:java}
> @Test
> public void testMultipleInputDoesNotChainedWithSource() throws Exception {
> testJobVertexName(false);
> }
> 
> @Test
> public void testMultipleInputChainedWithSource() throws Exception {
> testJobVertexName(true);
> }
> public void testJobVertexName(boolean chain) throws Exception {
> StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
> env.setParallelism(1);
> TestListResultSink<Long> resultSink = new TestListResultSink<>();
> DataStream<Long> source1 = env.fromSequence(0L, 3L).name("source1");
> DataStream<Long> source2 = env.fromElements(4L, 6L).name("source2");
> DataStream<Long> source3 = env.fromElements(7L, 9L).name("source3");
> KeyedMultipleInputTransformation<Long> transform =
> new KeyedMultipleInputTransformation<>(
> "MultipleInput",
> new KeyedSumMultipleInputOperatorFactory(),
> BasicTypeInfo.LONG_TYPE_INFO,
> 1,
> BasicTypeInfo.LONG_TYPE_INFO);
> if (chain) {
> transform.setChainingStrategy(ChainingStrategy.HEAD_WITH_SOURCES);
> }
> KeySelector<Long, Long> keySelector = (KeySelector<Long, Long>) value 
> -> value % 3;
> env.addOperator(
> transform
> .addInput(source1.getTransformation(), keySelector)
> .addInput(source2.getTransformation(), keySelector)
> .addInput(source3.getTransformation(), keySelector));
> new 
> MultipleConnectedStreams(env).transform(transform).rebalance().addSink(resultSink).name("sink");
> env.execute();
> }{code}
>  
> When we run {{testMultipleInputDoesNotChainedWithSource}} , all job vertex 
> names are normal:
> !image-2023-07-26-15-24-24-077.png|width=494,height=246!
> When we run {{testMultipleInputChainedWithSource}} (the MultipleInput chained 
> with source1), job vertex names get messed up (all job vertex names contain 
> {{{}Source: source1{}}}):
> !image-2023-07-26-15-23-29-551.png|width=515,height=182!
>  
> I think it's a bug.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32680) Job vertex names get messed up once there is a source vertex chained with a MultipleInput vertex in job graph

2023-08-01 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17749791#comment-17749791
 ] 

Lijie Wang commented on FLINK-32680:


Fixed via:
master(1.18): 9e6f8e8e127e8dd81ded4a2214fd710c8ff8180a
release-1.17: 5a87cfac875feabf29c0f001e5591ca48a9e5e94
release-1.16: 3c46ffb1e27d4a369c7016b631fb8d5f944c3e50

> Job vertex names get messed up once there is a source vertex chained with a 
> MultipleInput vertex in job graph
> -
>
> Key: FLINK-32680
> URL: https://issues.apache.org/jira/browse/FLINK-32680
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.16.2, 1.18.0, 1.17.1
>Reporter: Lijie Wang
>Assignee: Junrui Li
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2023-07-26-15-23-29-551.png, 
> image-2023-07-26-15-24-24-077.png
>
>
> Take the following test (put it into {{MultipleInputITCase}}) as an example:
> {code:java}
> @Test
> public void testMultipleInputDoesNotChainedWithSource() throws Exception {
> testJobVertexName(false);
> }
> 
> @Test
> public void testMultipleInputChainedWithSource() throws Exception {
> testJobVertexName(true);
> }
> public void testJobVertexName(boolean chain) throws Exception {
> StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
> env.setParallelism(1);
> TestListResultSink<Long> resultSink = new TestListResultSink<>();
> DataStream<Long> source1 = env.fromSequence(0L, 3L).name("source1");
> DataStream<Long> source2 = env.fromElements(4L, 6L).name("source2");
> DataStream<Long> source3 = env.fromElements(7L, 9L).name("source3");
> KeyedMultipleInputTransformation<Long> transform =
> new KeyedMultipleInputTransformation<>(
> "MultipleInput",
> new KeyedSumMultipleInputOperatorFactory(),
> BasicTypeInfo.LONG_TYPE_INFO,
> 1,
> BasicTypeInfo.LONG_TYPE_INFO);
> if (chain) {
> transform.setChainingStrategy(ChainingStrategy.HEAD_WITH_SOURCES);
> }
> KeySelector<Long, Long> keySelector = (KeySelector<Long, Long>) value 
> -> value % 3;
> env.addOperator(
> transform
> .addInput(source1.getTransformation(), keySelector)
> .addInput(source2.getTransformation(), keySelector)
> .addInput(source3.getTransformation(), keySelector));
> new 
> MultipleConnectedStreams(env).transform(transform).rebalance().addSink(resultSink).name("sink");
> env.execute();
> }{code}
>  
> When we run {{testMultipleInputDoesNotChainedWithSource}} , all job vertex 
> names are normal:
> !image-2023-07-26-15-24-24-077.png|width=494,height=246!
> When we run {{testMultipleInputChainedWithSource}} (the MultipleInput chained 
> with source1), job vertex names get messed up (all job vertex names contain 
> {{{}Source: source1{}}}):
> !image-2023-07-26-15-23-29-551.png|width=515,height=182!
>  
> I think it's a bug.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32486) FLIP-324: Introduce Runtime Filter for Flink Batch Jobs

2023-07-27 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang closed FLINK-32486.
--
Release Note: 
We introduce runtime filters for batch jobs in 1.18, which are designed to 
improve join performance. A runtime filter dynamically generates filter 
conditions for certain join queries at runtime to reduce the amount of scanned 
or shuffled data, avoid unnecessary I/O and network transmission, and speed up 
the query. It works by first building a filter (e.g. a Bloom filter) from the 
data on the small table side (build side), then passing this filter to the 
large table side (probe side) to filter out irrelevant data there; this reduces 
the data reaching the join and improves performance. 

  Resolution: Done
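
For illustration, the build/probe principle can be sketched in a few lines of 
self-contained Java (a toy Bloom-filter-like structure with made-up names, not 
Flink's actual runtime filter operators):

{code:java}
import java.util.BitSet;
import java.util.List;
import java.util.stream.Collectors;

public class RuntimeFilterSketch {

    /** A very small Bloom-filter-like structure over long keys (illustration only). */
    static class SimpleBloomFilter {
        private final BitSet bits;
        private final int size;

        SimpleBloomFilter(int size) {
            this.size = size;
            this.bits = new BitSet(size);
        }

        void add(long key) {
            bits.set(hash1(key));
            bits.set(hash2(key));
        }

        boolean mightContain(long key) {
            return bits.get(hash1(key)) && bits.get(hash2(key));
        }

        private int hash1(long key) {
            return Math.floorMod(Long.hashCode(key), size);
        }

        private int hash2(long key) {
            return Math.floorMod(Long.hashCode(key * 31 + 17), size);
        }
    }

    public static void main(String[] args) {
        // "Build side": the small table's join keys.
        List<Long> buildKeys = List.of(1L, 3L, 5L);
        SimpleBloomFilter filter = new SimpleBloomFilter(1024);
        buildKeys.forEach(filter::add);

        // "Probe side": the large table is filtered before it reaches the join,
        // so rows that cannot possibly match are dropped early.
        List<Long> probeKeys = List.of(1L, 2L, 3L, 4L, 5L, 6L);
        List<Long> survivors =
                probeKeys.stream().filter(filter::mightContain).collect(Collectors.toList());

        System.out.println("Rows reaching the join (may contain false positives): " + survivors);
    }
}
{code}

In the real feature the filter is built by a dedicated operator on the build side and 
shipped to the probe side, but the effect is the same: rows that cannot match are 
dropped before they reach the join.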

> FLIP-324: Introduce Runtime Filter for Flink Batch Jobs
> ---
>
> Key: FLINK-32486
> URL: https://issues.apache.org/jira/browse/FLINK-32486
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner, Table SQL / Runtime
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
> Fix For: 1.18.0
>
>
> This is an umbrella ticket for 
> [FLIP-324|https://cwiki.apache.org/confluence/display/FLINK/FLIP-324%3A+Introduce+Runtime+Filter+for+Flink+Batch+Jobs]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32486) FLIP-324: Introduce Runtime Filter for Flink Batch Jobs

2023-07-27 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17747806#comment-17747806
 ] 

Lijie Wang commented on FLINK-32486:


Thanks for the reminder [~knaufk]. This feature is a performance optimization for 
joins, and we intend to enable it by default soon (maybe in the next Flink version), 
so we don't think a separate document is necessary (the description of the config 
option is enough); we will mention it in the release notes.

> FLIP-324: Introduce Runtime Filter for Flink Batch Jobs
> ---
>
> Key: FLINK-32486
> URL: https://issues.apache.org/jira/browse/FLINK-32486
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner, Table SQL / Runtime
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
> Fix For: 1.18.0
>
>
> This is an umbrella ticket for 
> [FLIP-324|https://cwiki.apache.org/confluence/display/FLINK/FLIP-324%3A+Introduce+Runtime+Filter+for+Flink+Batch+Jobs]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-32680) Job vertex names get messed up once there is a source vertex chained with a MultipleInput vertex in job graph

2023-07-26 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-32680:
--

Assignee: Junrui Li

> Job vertex names get messed up once there is a source vertex chained with a 
> MultipleInput vertex in job graph
> -
>
> Key: FLINK-32680
> URL: https://issues.apache.org/jira/browse/FLINK-32680
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.16.2, 1.18.0, 1.17.1
>Reporter: Lijie Wang
>Assignee: Junrui Li
>Priority: Major
> Attachments: image-2023-07-26-15-23-29-551.png, 
> image-2023-07-26-15-24-24-077.png
>
>
> Take the following test (put it into {{MultipleInputITCase}}) as an example:
> {code:java}
> @Test
> public void testMultipleInputDoesNotChainedWithSource() throws Exception {
> testJobVertexName(false);
> }
> 
> @Test
> public void testMultipleInputChainedWithSource() throws Exception {
> testJobVertexName(true);
> }
> public void testJobVertexName(boolean chain) throws Exception {
> StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
> env.setParallelism(1);
> TestListResultSink<Long> resultSink = new TestListResultSink<>();
> DataStream<Long> source1 = env.fromSequence(0L, 3L).name("source1");
> DataStream<Long> source2 = env.fromElements(4L, 6L).name("source2");
> DataStream<Long> source3 = env.fromElements(7L, 9L).name("source3");
> KeyedMultipleInputTransformation<Long> transform =
> new KeyedMultipleInputTransformation<>(
> "MultipleInput",
> new KeyedSumMultipleInputOperatorFactory(),
> BasicTypeInfo.LONG_TYPE_INFO,
> 1,
> BasicTypeInfo.LONG_TYPE_INFO);
> if (chain) {
> transform.setChainingStrategy(ChainingStrategy.HEAD_WITH_SOURCES);
> }
> KeySelector<Long, Long> keySelector = (KeySelector<Long, Long>) value 
> -> value % 3;
> env.addOperator(
> transform
> .addInput(source1.getTransformation(), keySelector)
> .addInput(source2.getTransformation(), keySelector)
> .addInput(source3.getTransformation(), keySelector));
> new 
> MultipleConnectedStreams(env).transform(transform).rebalance().addSink(resultSink).name("sink");
> env.execute();
> }{code}
>  
> When we run {{testMultipleInputDoesNotChainedWithSource}} , all job vertex 
> names are normal:
> !image-2023-07-26-15-24-24-077.png|width=494,height=246!
> When we run {{testMultipleInputChainedWithSource}} (the MultipleInput chained 
> with source1), job vertex names get messed up (all job vertex names contain 
> {{{}Source: source1{}}}):
> !image-2023-07-26-15-23-29-551.png|width=515,height=182!
>  
> I think it's a bug.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32680) Job vertex names get messed up once there is a source vertex chained with a MultipleInput vertex in job graph

2023-07-26 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17747411#comment-17747411
 ] 

Lijie Wang commented on FLINK-32680:


Thanks [~JunRuiLi], assigned to you.

> Job vertex names get messed up once there is a source vertex chained with a 
> MultipleInput vertex in job graph
> -
>
> Key: FLINK-32680
> URL: https://issues.apache.org/jira/browse/FLINK-32680
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.16.2, 1.18.0, 1.17.1
>Reporter: Lijie Wang
>Priority: Major
> Attachments: image-2023-07-26-15-23-29-551.png, 
> image-2023-07-26-15-24-24-077.png
>
>
> Take the following test (put it into {{MultipleInputITCase}}) as an example:
> {code:java}
> @Test
> public void testMultipleInputDoesNotChainedWithSource() throws Exception {
> testJobVertexName(false);
> }
> 
> @Test
> public void testMultipleInputChainedWithSource() throws Exception {
> testJobVertexName(true);
> }
> public void testJobVertexName(boolean chain) throws Exception {
> StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
> env.setParallelism(1);
> TestListResultSink<Long> resultSink = new TestListResultSink<>();
> DataStream<Long> source1 = env.fromSequence(0L, 3L).name("source1");
> DataStream<Long> source2 = env.fromElements(4L, 6L).name("source2");
> DataStream<Long> source3 = env.fromElements(7L, 9L).name("source3");
> KeyedMultipleInputTransformation<Long> transform =
> new KeyedMultipleInputTransformation<>(
> "MultipleInput",
> new KeyedSumMultipleInputOperatorFactory(),
> BasicTypeInfo.LONG_TYPE_INFO,
> 1,
> BasicTypeInfo.LONG_TYPE_INFO);
> if (chain) {
> transform.setChainingStrategy(ChainingStrategy.HEAD_WITH_SOURCES);
> }
> KeySelector<Long, Long> keySelector = (KeySelector<Long, Long>) value 
> -> value % 3;
> env.addOperator(
> transform
> .addInput(source1.getTransformation(), keySelector)
> .addInput(source2.getTransformation(), keySelector)
> .addInput(source3.getTransformation(), keySelector));
> new 
> MultipleConnectedStreams(env).transform(transform).rebalance().addSink(resultSink).name("sink");
> env.execute();
> }{code}
>  
> When we run {{testMultipleInputDoesNotChainedWithSource}} , all job vertex 
> names are normal:
> !image-2023-07-26-15-24-24-077.png|width=494,height=246!
> When we run {{testMultipleInputChainedWithSource}} (the MultipleInput chained 
> with source1), job vertex names get messed up (all job vertex names contain 
> {{{}Source: source1{}}}):
> !image-2023-07-26-15-23-29-551.png|width=515,height=182!
>  
> I think it's a bug.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32680) Job vertex names get messed up once there is a source vertex chained with a MultipleInput vertex in job graph

2023-07-26 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-32680:
---
Description: 
Take the following test (put it into {{MultipleInputITCase}}) as an example:
{code:java}
@Test
public void testMultipleInputDoesNotChainedWithSource() throws Exception {
testJobVertexName(false);
}

@Test
public void testMultipleInputChainedWithSource() throws Exception {
testJobVertexName(true);
}

public void testJobVertexName(boolean chain) throws Exception {
StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(1);

TestListResultSink<Long> resultSink = new TestListResultSink<>();

DataStream<Long> source1 = env.fromSequence(0L, 3L).name("source1");
DataStream<Long> source2 = env.fromElements(4L, 6L).name("source2");
DataStream<Long> source3 = env.fromElements(7L, 9L).name("source3");

KeyedMultipleInputTransformation<Long> transform =
new KeyedMultipleInputTransformation<>(
"MultipleInput",
new KeyedSumMultipleInputOperatorFactory(),
BasicTypeInfo.LONG_TYPE_INFO,
1,
BasicTypeInfo.LONG_TYPE_INFO);
if (chain) {
transform.setChainingStrategy(ChainingStrategy.HEAD_WITH_SOURCES);
}
KeySelector<Long, Long> keySelector = (KeySelector<Long, Long>) value 
-> value % 3;

env.addOperator(
transform
.addInput(source1.getTransformation(), keySelector)
.addInput(source2.getTransformation(), keySelector)
.addInput(source3.getTransformation(), keySelector));

new 
MultipleConnectedStreams(env).transform(transform).rebalance().addSink(resultSink).name("sink");

env.execute();
}{code}
 

When we run {{testMultipleInputDoesNotChainedWithSource}} , all job vertex 
names are normal:

!image-2023-07-26-15-24-24-077.png|width=494,height=246!

When we run {{testMultipleInputChainedWithSource}} (the MultipleInput chained 
with source1), job vertex names get messed up (all job vertex names contain 
{{{}Source: source1{}}}):

!image-2023-07-26-15-23-29-551.png|width=515,height=182!

 

I think it's a bug.

  was:
Take the following test (put it into {{{}MultipleInputITCase{}}}) as an example:
{code:java}
@Test
public void testMultipleInputDoesNotChainedWithSource() throws Exception {
testJobVertexName(false);
}

@Test
public void testMultipleInputChainedWithSource() throws Exception {
testJobVertexName(true);
}

public void testJobVertexName(boolean chain) throws Exception {
StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(1);

TestListResultSink<Long> resultSink = new TestListResultSink<>();

DataStream<Long> source1 = env.fromSequence(0L, 3L).name("source1");
DataStream<Long> source2 = env.fromElements(4L, 6L).name("source2");
DataStream<Long> source3 = env.fromElements(7L, 9L).name("source3");

KeyedMultipleInputTransformation<Long> transform =
new KeyedMultipleInputTransformation<>(
"MultipleInput",
new KeyedSumMultipleInputOperatorFactory(),
BasicTypeInfo.LONG_TYPE_INFO,
1,
BasicTypeInfo.LONG_TYPE_INFO);
if (chain) {
transform.setChainingStrategy(ChainingStrategy.HEAD_WITH_SOURCES);
}
KeySelector<Long, Long> keySelector = (KeySelector<Long, Long>) value 
-> value % 3;

env.addOperator(
transform
.addInput(source1.getTransformation(), keySelector)
.addInput(source2.getTransformation(), keySelector)
.addInput(source3.getTransformation(), keySelector));

new 
MultipleConnectedStreams(env).transform(transform).rebalance().addSink(resultSink).name("sink");

env.execute();
}{code}
 

When we run {{{}testMultipleInputDoesNotChainedWithSource{}}}, all job vertex 
names are normal:

!image-2023-07-26-15-24-24-077.png|width=494,height=246!

When we run {{{}testMultipleInputChainedWithSource{}}}, job vertex names get 
messed up (all names contain {{{}Source: source1{}}}):

!image-2023-07-26-15-23-29-551.png|width=515,height=182!

 

I think it's a bug.


> Job vertex names get messed up once there is a source vertex chained with a 
> MultipleInput vertex in job graph
> -
>
> Key: FLINK-32680
> URL: https://issues.apache.org/jira/browse/FLINK-32680
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 

[jira] [Updated] (FLINK-32680) Job vertex names get messed up once there is a source vertex chained with a MultipleInput vertex in job graph

2023-07-26 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-32680:
---
Attachment: (was: image-2023-07-26-15-01-51-886.png)

> Job vertex names get messed up once there is a source vertex chained with a 
> MultipleInput vertex in job graph
> -
>
> Key: FLINK-32680
> URL: https://issues.apache.org/jira/browse/FLINK-32680
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.16.2, 1.18.0, 1.17.1
>Reporter: Lijie Wang
>Priority: Major
> Attachments: image-2023-07-26-15-23-29-551.png, 
> image-2023-07-26-15-24-24-077.png
>
>
> Take the following test (put it into {{{}MultipleInputITCase{}}}) as an example:
> {code:java}
> @Test
> public void testMultipleInputDoesNotChainedWithSource() throws Exception {
> testJobVertexName(false);
> }
> 
> @Test
> public void testMultipleInputChainedWithSource() throws Exception {
> testJobVertexName(true);
> }
> public void testJobVertexName(boolean chain) throws Exception {
> StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
> env.setParallelism(1);
> TestListResultSink<Long> resultSink = new TestListResultSink<>();
> DataStream<Long> source1 = env.fromSequence(0L, 3L).name("source1");
> DataStream<Long> source2 = env.fromElements(4L, 6L).name("source2");
> DataStream<Long> source3 = env.fromElements(7L, 9L).name("source3");
> KeyedMultipleInputTransformation<Long> transform =
> new KeyedMultipleInputTransformation<>(
> "MultipleInput",
> new KeyedSumMultipleInputOperatorFactory(),
> BasicTypeInfo.LONG_TYPE_INFO,
> 1,
> BasicTypeInfo.LONG_TYPE_INFO);
> if (chain) {
> transform.setChainingStrategy(ChainingStrategy.HEAD_WITH_SOURCES);
> }
> KeySelector<Long, Long> keySelector = (KeySelector<Long, Long>) value 
> -> value % 3;
> env.addOperator(
> transform
> .addInput(source1.getTransformation(), keySelector)
> .addInput(source2.getTransformation(), keySelector)
> .addInput(source3.getTransformation(), keySelector));
> new 
> MultipleConnectedStreams(env).transform(transform).rebalance().addSink(resultSink).name("sink");
> env.execute();
> }{code}
>  
> When we run {{{}testMultipleInputDoesNotChainedWithSource{}}}, all job vertex 
> names are normal:
> !image-2023-07-26-15-24-24-077.png|width=494,height=246!
> When we run {{{}testMultipleInputChainedWithSource{}}}, job vertex names get 
> messed up (all names contain {{{}Source: source1{}}}):
> !image-2023-07-26-15-23-29-551.png|width=515,height=182!
>  
> I think it's a bug.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32680) Job vertex names get messed up once there is a source vertex chained with a MultipleInput vertex in job graph

2023-07-26 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-32680:
---
Description: 
Take the following test (put it into {{{}MultipleInputITCase{}}}) as an example:
{code:java}
@Test
public void testMultipleInputDoesNotChainedWithSource() throws Exception {
testJobVertexName(false);
}

@Test
public void testMultipleInputChainedWithSource() throws Exception {
testJobVertexName(true);
}

public void testJobVertexName(boolean chain) throws Exception {
StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(1);

TestListResultSink<Long> resultSink = new TestListResultSink<>();

DataStream<Long> source1 = env.fromSequence(0L, 3L).name("source1");
DataStream<Long> source2 = env.fromElements(4L, 6L).name("source2");
DataStream<Long> source3 = env.fromElements(7L, 9L).name("source3");

KeyedMultipleInputTransformation<Long> transform =
new KeyedMultipleInputTransformation<>(
"MultipleInput",
new KeyedSumMultipleInputOperatorFactory(),
BasicTypeInfo.LONG_TYPE_INFO,
1,
BasicTypeInfo.LONG_TYPE_INFO);
if (chain) {
transform.setChainingStrategy(ChainingStrategy.HEAD_WITH_SOURCES);
}
KeySelector<Long, Long> keySelector = (KeySelector<Long, Long>) value 
-> value % 3;

env.addOperator(
transform
.addInput(source1.getTransformation(), keySelector)
.addInput(source2.getTransformation(), keySelector)
.addInput(source3.getTransformation(), keySelector));

new 
MultipleConnectedStreams(env).transform(transform).rebalance().addSink(resultSink).name("sink");

env.execute();
}{code}
 

When we run {{{}testMultipleInputDoesNotChainedWithSource{}}}, all job vertex 
names are normal:

!image-2023-07-26-15-24-24-077.png|width=494,height=246!

When we run {{{}testMultipleInputChainedWithSource{}}}, job vertex names get 
messed up (all names contain {{{}Source: source1{}}}):

!image-2023-07-26-15-23-29-551.png|width=515,height=182!

 

I think it's a bug.

  was:
Take the following test (put it into {{{}MultipleInputITCase{}}}) as an example:
{code:java}
@Test
public void testMultipleInputDoesNotChainedWithSource() throws Exception {
testJobVertexName(false);
}

@Test
public void testMultipleInputChainedWithSource() throws Exception {
testJobVertexName(true);
}

public void testJobVertexName(boolean chain) throws Exception {
StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(1);

TestListResultSink<Long> resultSink = new TestListResultSink<>();

DataStream<Long> source1 = env.fromSequence(0L, 3L).name("source1");
DataStream<Long> source2 = env.fromElements(4L, 6L).name("source2");
DataStream<Long> source3 = env.fromElements(7L, 9L).name("source3");

KeyedMultipleInputTransformation<Long> transform =
new KeyedMultipleInputTransformation<>(
"MultipleInput",
new KeyedSumMultipleInputOperatorFactory(),
BasicTypeInfo.LONG_TYPE_INFO,
1,
BasicTypeInfo.LONG_TYPE_INFO);
if (chain) {
transform.setChainingStrategy(ChainingStrategy.HEAD_WITH_SOURCES);
}
KeySelector<Long, Long> keySelector = (KeySelector<Long, Long>) value 
-> value % 3;

env.addOperator(
transform
.addInput(source1.getTransformation(), keySelector)
.addInput(source2.getTransformation(), keySelector)
.addInput(source3.getTransformation(), keySelector));

new 
MultipleConnectedStreams(env).transform(transform).rebalance().addSink(resultSink).name("sink");

env.execute();
}{code}
 

When we run {{{}testMultipleInputDoesNotChainedWithSource{}}}, all job vertex 
names are normal.

!image-2023-07-26-15-24-24-077.png|width=494,height=246!

When we run {{{}testMultipleInputChainedWithSource{}}}, job vertex names get 
messed up (all names contain {{{}Source: source1{}}}). I think it's a bug.

!image-2023-07-26-15-23-29-551.png|width=515,height=182!


> Job vertex names get messed up once there is a source vertex chained with a 
> MultipleInput vertex in job graph
> -
>
> Key: FLINK-32680
> URL: https://issues.apache.org/jira/browse/FLINK-32680
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.16.2, 1.18.0, 1.17.1
>Reporter: 

[jira] [Created] (FLINK-32680) Job vertex names get messed up once there is a source vertex chained with a MultipleInput vertex in job graph

2023-07-26 Thread Lijie Wang (Jira)
Lijie Wang created FLINK-32680:
--

 Summary: Job vertex names get messed up once there is a source 
vertex chained with a MultipleInput vertex in job graph
 Key: FLINK-32680
 URL: https://issues.apache.org/jira/browse/FLINK-32680
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.17.1, 1.16.2, 1.18.0
Reporter: Lijie Wang
 Attachments: image-2023-07-26-15-01-51-886.png, 
image-2023-07-26-15-23-29-551.png, image-2023-07-26-15-24-24-077.png

Take the following test (put it into {{{}MultipleInputITCase{}}}) as an example:
{code:java}
@Test
public void testMultipleInputDoesNotChainedWithSource() throws Exception {
testJobVertexName(false);
}

@Test
public void testMultipleInputChainedWithSource() throws Exception {
testJobVertexName(true);
}

public void testJobVertexName(boolean chain) throws Exception {
StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(1);

TestListResultSink<Long> resultSink = new TestListResultSink<>();

DataStream<Long> source1 = env.fromSequence(0L, 3L).name("source1");
DataStream<Long> source2 = env.fromElements(4L, 6L).name("source2");
DataStream<Long> source3 = env.fromElements(7L, 9L).name("source3");

KeyedMultipleInputTransformation<Long> transform =
new KeyedMultipleInputTransformation<>(
"MultipleInput",
new KeyedSumMultipleInputOperatorFactory(),
BasicTypeInfo.LONG_TYPE_INFO,
1,
BasicTypeInfo.LONG_TYPE_INFO);
if (chain) {
transform.setChainingStrategy(ChainingStrategy.HEAD_WITH_SOURCES);
}
KeySelector<Long, Long> keySelector = (KeySelector<Long, Long>) value 
-> value % 3;

env.addOperator(
transform
.addInput(source1.getTransformation(), keySelector)
.addInput(source2.getTransformation(), keySelector)
.addInput(source3.getTransformation(), keySelector));

new 
MultipleConnectedStreams(env).transform(transform).rebalance().addSink(resultSink).name("sink");

env.execute();
}{code}
 

When we run {{{}testMultipleInputDoesNotChainedWithSource{}}}, all job vertex 
names are normal.

!image-2023-07-26-15-24-24-077.png|width=494,height=246!

When we run {{{}testMultipleInputChainedWithSource{}}}, job vertex names get 
messed up (all names contain {{{}Source: source1{}}}). I think it's a bug.

!image-2023-07-26-15-23-29-551.png|width=515,height=182!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32492) Introduce FlinkRuntimeFilterProgram to inject runtime filter

2023-07-23 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746058#comment-17746058
 ] 

Lijie Wang commented on FLINK-32492:


Done via master(1.18):
8659dd788d0e9bc5e534377fd065f925a3e33bbb
ad20b19fff808bf3a191279f0f137952a78c083a
72bee90ccb40b71e760d251b26c8d48e1b110307
9f73d3d81a5471998d854001a09c7299a10f1424

> Introduce FlinkRuntimeFilterProgram to inject runtime filter
> 
>
> Key: FLINK-32492
> URL: https://issues.apache.org/jira/browse/FLINK-32492
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32492) Introduce FlinkRuntimeFilterProgram to inject runtime filter

2023-07-23 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang closed FLINK-32492.
--
Resolution: Done

> Introduce FlinkRuntimeFilterProgram to inject runtime filter
> 
>
> Key: FLINK-32492
> URL: https://issues.apache.org/jira/browse/FLINK-32492
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32045) optimize task deployment performance for large-scale jobs

2023-07-23 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-32045:
---
Fix Version/s: 1.18.0

> optimize task deployment performance for large-scale jobs
> -
>
> Key: FLINK-32045
> URL: https://issues.apache.org/jira/browse/FLINK-32045
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Reporter: Weihua Hu
>Assignee: Weihua Hu
>Priority: Major
> Fix For: 1.18.0
>
>
> h1. Background
> In FLINK-21110, we cache shuffle descriptors on the job manager side and 
> support using blob servers to offload these descriptors in order to reduce 
> the cost of tasks deployment.
> I think there are also some improvements we could make for large-scale jobs.
>  # The default min size to enable distribution via the blob server is 1MB. But 
> for a large wordcount job with 2 parallelism, the size of the serialized 
> shuffle descriptors is only 300KB. This means users need to lower 
> "blob.offload.minsize", but the value is hard for users to decide.
>  # The task executor side still needs to load blob files and deserialize 
> shuffle descriptors for each task. Since these operations run in the 
> main thread, they may hold up other RPCs from the job manager.
> h1. Propose
>  # Enable distributing shuffle descriptors via the blob server automatically. This 
> could be decided by the edge count of the current shuffle descriptor: blob 
> offload is enabled when the edge count exceeds an internal threshold (a sketch 
> of this decision follows below).
>  # Introduce a cache of deserialized shuffle descriptors on the task executor 
> side. This could reduce the cost of reading local blob files and of 
> deserialization. Of course, the cache should have a TTL to avoid occupying too 
> much memory, and it should use the same switch mechanism as the blob 
> server offload.
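
As an illustration of the first proposal, a minimal sketch of such a threshold decision; 
the threshold value and the exact definition of "edge number" here are assumptions, not 
Flink's actual implementation:

{code:java}
public class ShuffleDescriptorOffloadDecision {

    // Hypothetical threshold; the real internal value may differ.
    private static final int EDGE_COUNT_THRESHOLD = 4096;

    /**
     * Decides whether a group of shuffle descriptors should be distributed via
     * the blob server instead of being shipped directly in the deployment RPC.
     *
     * @param numProducers number of upstream subtasks (one descriptor each)
     * @param numConsumers number of downstream subtasks consuming the descriptors
     */
    static boolean offloadViaBlobServer(int numProducers, int numConsumers) {
        // The edge count grows with producers * consumers for all-to-all
        // connections, which is what makes direct shipping expensive.
        long edgeCount = (long) numProducers * numConsumers;
        return edgeCount > EDGE_COUNT_THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(offloadViaBlobServer(10, 50));     // false: small job
        System.out.println(offloadViaBlobServer(2000, 2000)); // true: large all-to-all edge
    }
}
{code}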



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32386) Add ShuffleDescriptorsCache in TaskExecutor to cache ShuffleDescriptors

2023-07-23 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-32386:
---
Fix Version/s: 1.18.0

> Add ShuffleDescriptorsCache in TaskExecutor to cache ShuffleDescriptors
> ---
>
> Key: FLINK-32386
> URL: https://issues.apache.org/jira/browse/FLINK-32386
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Weihua Hu
>Assignee: Weihua Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>
> Introduce a new struct named ShuffleDescriptorsCache to cache the 
> ShuffleDescriptorAndIndex entries that are offloaded by the blob server. 
> The cache should have the following capabilities:
> 1. Entries expire after exceeding the TTL.
> 2. Limit the size of the cache: remove the oldest element from the cache when 
> its maximum size has been exceeded.
> 3. Clear elements belonging to a job when it disconnects from the TaskExecutor.
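
A minimal sketch of a cache with these three capabilities; the names and structure are 
illustrative assumptions, not Flink's actual ShuffleDescriptorsCache:

{code:java}
import java.time.Duration;
import java.util.Iterator;
import java.util.LinkedHashMap;

/**
 * Sketch of a cache with TTL-based expiry, a maximum size with oldest-first
 * eviction, and removal of all entries belonging to a job.
 */
public class DescriptorCacheSketch<K, V> {

    private static final class Entry<V> {
        final V value;
        final String jobId;
        final long expiresAtMillis;

        Entry(V value, String jobId, long expiresAtMillis) {
            this.value = value;
            this.jobId = jobId;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    // Insertion order doubles as age order for oldest-first eviction.
    private final LinkedHashMap<K, Entry<V>> entries = new LinkedHashMap<>();
    private final int maxSize;
    private final Duration ttl;

    public DescriptorCacheSketch(int maxSize, Duration ttl) {
        this.maxSize = maxSize;
        this.ttl = ttl;
    }

    public synchronized void put(K key, V value, String jobId) {
        // Evict the oldest entry when the maximum size has been reached.
        if (entries.size() >= maxSize) {
            Iterator<K> it = entries.keySet().iterator();
            it.next();
            it.remove();
        }
        entries.put(key, new Entry<>(value, jobId, System.currentTimeMillis() + ttl.toMillis()));
    }

    public synchronized V get(K key) {
        Entry<V> e = entries.get(key);
        if (e == null) {
            return null;
        }
        if (e.expiresAtMillis < System.currentTimeMillis()) {
            entries.remove(key); // expired after exceeding the TTL
            return null;
        }
        return e.value;
    }

    /** Clears all elements belonging to a job, e.g. when it disconnects from the TaskExecutor. */
    public synchronized void clearCacheForJob(String jobId) {
        entries.values().removeIf(e -> jobId.equals(e.jobId));
    }

    public static void main(String[] args) {
        DescriptorCacheSketch<String, String> cache =
                new DescriptorCacheSketch<>(100, Duration.ofMinutes(5));
        cache.put("partition-1", "serialized descriptors", "job-A");
        System.out.println(cache.get("partition-1")); // serialized descriptors
        cache.clearCacheForJob("job-A");
        System.out.println(cache.get("partition-1")); // null
    }
}
{code}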



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32387) InputGateDeploymentDescriptor uses cache to avoid deserializing shuffle descriptors multiple times

2023-07-23 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang closed FLINK-32387.
--
Fix Version/s: 1.18.0
   Resolution: Done

> InputGateDeploymentDescriptor uses cache to avoid deserializing shuffle 
> descriptors multiple times
> --
>
> Key: FLINK-32387
> URL: https://issues.apache.org/jira/browse/FLINK-32387
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Weihua Hu
>Assignee: Weihua Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>
> InputGateDeploymentDescriptor uses cache to avoid deserializing shuffle 
> descriptors multiple times.
> The cache only takes effect when the shuffle descriptors are offloaded by the 
> blob server, i.e. when the shuffle descriptors are large enough to be worth caching.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32387) InputGateDeploymentDescriptor uses cache to avoid deserializing shuffle descriptors multiple times

2023-07-23 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746040#comment-17746040
 ] 

Lijie Wang commented on FLINK-32387:


Done via master(1.18): 7a9efbff0a2ea3e4e736c48a57a362cc9c5bbf37

> InputGateDeploymentDescriptor uses cache to avoid deserializing shuffle 
> descriptors multiple times
> --
>
> Key: FLINK-32387
> URL: https://issues.apache.org/jira/browse/FLINK-32387
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Weihua Hu
>Assignee: Weihua Hu
>Priority: Major
>  Labels: pull-request-available
>
> InputGateDeploymentDescriptor uses cache to avoid deserializing shuffle 
> descriptors multiple times.
> The cache only takes effect when the shuffle descriptors are offloaded by the 
> blob server, i.e. when the shuffle descriptors are large enough to be worth caching.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32385) Introduce a struct SerializedShuffleDescriptorAndIndices to identify a group of ShuffleDescriptorAndIndex

2023-07-23 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-32385:
---
Fix Version/s: 1.18.0

> Introduce a struct SerializedShuffleDescriptorAndIndices to identify a group 
> of ShuffleDescriptorAndIndex
> -
>
> Key: FLINK-32385
> URL: https://issues.apache.org/jira/browse/FLINK-32385
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Weihua Hu
>Assignee: Weihua Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>
> Introduce a new struct named SerializedShuffleDescriptorAndIndices to 
> identify a group of ShuffleDescriptorAndIndex. 
> Then we could cache these ShuffleDescriptorAndIndex on the TaskExecutor side.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32386) Add ShuffleDescriptorsCache in TaskExecutor to cache ShuffleDescriptors

2023-07-23 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang closed FLINK-32386.
--
Resolution: Done

> Add ShuffleDescriptorsCache in TaskExecutor to cache ShuffleDescriptors
> ---
>
> Key: FLINK-32386
> URL: https://issues.apache.org/jira/browse/FLINK-32386
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Weihua Hu
>Assignee: Weihua Hu
>Priority: Major
>  Labels: pull-request-available
>
> Introduce a new struct named ShuffleDescriptorsCache to cache the 
> ShuffleDescriptorAndIndex entries that are offloaded by the blob server. 
> The cache should have the following capabilities:
> 1. Entries expire after exceeding the TTL.
> 2. Limit the size of the cache: remove the oldest element from the cache when 
> its maximum size has been exceeded.
> 3. Clear elements belonging to a job when it disconnects from the TaskExecutor.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32386) Add ShuffleDescriptorsCache in TaskExecutor to cache ShuffleDescriptors

2023-07-23 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746039#comment-17746039
 ] 

Lijie Wang commented on FLINK-32386:


Done via master(1.18): 4d642635c9409883724bd0127f2a00ef14993bf0

> Add ShuffleDescriptorsCache in TaskExecutor to cache ShuffleDescriptors
> ---
>
> Key: FLINK-32386
> URL: https://issues.apache.org/jira/browse/FLINK-32386
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Weihua Hu
>Assignee: Weihua Hu
>Priority: Major
>  Labels: pull-request-available
>
> Introduce a new struct named ShuffleDescriptorsCache to cache the 
> ShuffleDescriptorAndIndex entries that are offloaded by the blob server. 
> The cache should have the following capabilities:
> 1. Entries expire after exceeding the TTL.
> 2. Limit the size of the cache: remove the oldest element from the cache when 
> its maximum size has been exceeded.
> 3. Clear elements belonging to a job when it disconnects from the TaskExecutor.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32385) Introduce a struct SerializedShuffleDescriptorAndIndices to identify a group of ShuffleDescriptorAndIndex

2023-07-23 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang closed FLINK-32385.
--
Resolution: Done

> Introduce a struct SerializedShuffleDescriptorAndIndices to identify a group 
> of ShuffleDescriptorAndIndex
> -
>
> Key: FLINK-32385
> URL: https://issues.apache.org/jira/browse/FLINK-32385
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Weihua Hu
>Assignee: Weihua Hu
>Priority: Major
>  Labels: pull-request-available
>
> Introduce a new struct named SerializedShuffleDescriptorAndIndices to 
> identify a group of ShuffleDescriptorAndIndex. 
> Then we could cache these ShuffleDescriptorAndIndex on the TaskExecutor side.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32385) Introduce a struct SerializedShuffleDescriptorAndIndices to identify a group of ShuffleDescriptorAndIndex

2023-07-23 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746038#comment-17746038
 ] 

Lijie Wang commented on FLINK-32385:


Done via master(1.18): fa4518960fd009531a584ca5450604649e94bac0

> Introduce a struct SerializedShuffleDescriptorAndIndices to identify a group 
> of ShuffleDescriptorAndIndex
> -
>
> Key: FLINK-32385
> URL: https://issues.apache.org/jira/browse/FLINK-32385
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Weihua Hu
>Assignee: Weihua Hu
>Priority: Major
>  Labels: pull-request-available
>
> Introduce a new struct named SerializedShuffleDescriptorAndIndices to 
> identify a group of ShuffleDescriptorAndIndex. 
> Then we could cache these ShuffleDescriptorAndIndex on the TaskExecutor side.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32491) Introduce RuntimeFilterOperator to support runtime filter which can reduce the shuffle data size before shuffle join

2023-07-21 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang closed FLINK-32491.
--
Resolution: Done

> Introduce RuntimeFilterOperator to support runtime filter which can reduce 
> the shuffle data size before shuffle join
> 
>
> Key: FLINK-32491
> URL: https://issues.apache.org/jira/browse/FLINK-32491
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Affects Versions: 1.18.0
>Reporter: dalongliu
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32491) Introduce RuntimeFilterOperator to support runtime filter which can reduce the shuffle data size before shuffle join

2023-07-21 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745872#comment-17745872
 ] 

Lijie Wang commented on FLINK-32491:


Done via master(1.18): 59850a91f96ddb5c58d8554b76e46bc6245dda7b

> Introduce RuntimeFilterOperator to support runtime filter which can reduce 
> the shuffle data size before shuffle join
> 
>
> Key: FLINK-32491
> URL: https://issues.apache.org/jira/browse/FLINK-32491
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Affects Versions: 1.18.0
>Reporter: dalongliu
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32493) Introduce RuntimeFilterBuilderOperator to build a BloomFilter from build side data of shuffle join

2023-07-21 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang closed FLINK-32493.
--
Resolution: Done

> Introduce RuntimeFilterBuilderOperator to build a BloomFilter from build side 
> data of shuffle join
> --
>
> Key: FLINK-32493
> URL: https://issues.apache.org/jira/browse/FLINK-32493
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Affects Versions: 1.18.0
>Reporter: dalongliu
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32493) Introduce RuntimeFilterBuilderOperator to build a BloomFilter from build side data of shuffle join

2023-07-21 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745467#comment-17745467
 ] 

Lijie Wang commented on FLINK-32493:


Done via master(1.18) 47596ea9250415bc333fa612c6b1ec5c7407fdd6 and 
41b35260bba91463bd9e44a4661beaa74c4cbe10

> Introduce RuntimeFilterBuilderOperator to build a BloomFilter from build side 
> data of shuffle join
> --
>
> Key: FLINK-32493
> URL: https://issues.apache.org/jira/browse/FLINK-32493
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Affects Versions: 1.18.0
>Reporter: dalongliu
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32632) Run Kubernetes test is unstable on AZP

2023-07-20 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-32632:
---
Priority: Blocker  (was: Critical)

> Run Kubernetes test is unstable on AZP
> --
>
> Key: FLINK-32632
> URL: https://issues.apache.org/jira/browse/FLINK-32632
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.18.0
>Reporter: Sergey Nuyanzin
>Priority: Blocker
>  Labels: test-stability
>
> This test 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51447=logs=bea52777-eaf8-5663-8482-18fbc3630e81=43ba8ce7-ebbf-57cd-9163-444305d74117=6213
> fails with
> {noformat}
> 2023-07-19T17:14:49.8144730Z Jul 19 17:14:49 
> deployment.apps/flink-task-manager created
> 2023-07-19T17:15:03.7983703Z Jul 19 17:15:03 job.batch/flink-job-cluster 
> condition met
> 2023-07-19T17:15:04.0937620Z error: Internal error occurred: error executing 
> command in container: http: invalid Host header
> 2023-07-19T17:15:04.0988752Z sort: cannot read: 
> '/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-11919909188/out/kubernetes_wc_out*':
>  No such file or directory
> 2023-07-19T17:15:04.1017388Z Jul 19 17:15:04 FAIL WordCount: Output hash 
> mismatch.  Got d41d8cd98f00b204e9800998ecf8427e, expected 
> e682ec6622b5e83f2eb614617d5ab2cf.
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32632) Run Kubernetes test is unstable on AZP

2023-07-20 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745367#comment-17745367
 ] 

Lijie Wang commented on FLINK-32632:


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51495=logs=81be5d54-0dc6-5130-d390-233dd2956037=cfb9de70-be4e-5162-887e-653276e3edee

> Run Kubernetes test is unstable on AZP
> --
>
> Key: FLINK-32632
> URL: https://issues.apache.org/jira/browse/FLINK-32632
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.18.0
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: test-stability
>
> This test 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51447=logs=bea52777-eaf8-5663-8482-18fbc3630e81=43ba8ce7-ebbf-57cd-9163-444305d74117=6213
> fails with
> {noformat}
> 2023-07-19T17:14:49.8144730Z Jul 19 17:14:49 
> deployment.apps/flink-task-manager created
> 2023-07-19T17:15:03.7983703Z Jul 19 17:15:03 job.batch/flink-job-cluster 
> condition met
> 2023-07-19T17:15:04.0937620Z error: Internal error occurred: error executing 
> command in container: http: invalid Host header
> 2023-07-19T17:15:04.0988752Z sort: cannot read: 
> '/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-11919909188/out/kubernetes_wc_out*':
>  No such file or directory
> 2023-07-19T17:15:04.1017388Z Jul 19 17:15:04 FAIL WordCount: Output hash 
> mismatch.  Got d41d8cd98f00b204e9800998ecf8427e, expected 
> e682ec6622b5e83f2eb614617d5ab2cf.
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-32491) Introduce RuntimeFilterOperator to support runtime filter which can reduce the shuffle data size before shuffle join

2023-07-17 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang reassigned FLINK-32491:
--

Assignee: Lijie Wang  (was: dalongliu)

> Introduce RuntimeFilterOperator to support runtime filter which can reduce 
> the shuffle data size before shuffle join
> 
>
> Key: FLINK-32491
> URL: https://issues.apache.org/jira/browse/FLINK-32491
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Affects Versions: 1.18.0
>Reporter: dalongliu
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

