[GitHub] [flink] flinkbot edited a comment on pull request #16352: [FLINK-23102][runtime] Accessing FlameGraphs while not being enabled …
flinkbot edited a comment on pull request #16352: URL: https://github.com/apache/flink/pull/16352#issuecomment-872812232

## CI report:

* 6a7d378800616b57c9551b849a550ebb1716315f Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20170)
* fd2bddea66cd255bf9e36fbdd6b383c890dbb1bf Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20214)

Bot commands: The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-23289) BinarySection should add null check in constructor method
[ https://issues.apache.org/jira/browse/FLINK-23289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377823#comment-17377823 ]

Caizhi Weng commented on FLINK-23289:
-------------------------------------

I've added some description after discussing with [~Terry1897] offline. I'd like to take this issue.

> BinarySection should add null check in constructor method
> ---------------------------------------------------------
>
>                 Key: FLINK-23289
>                 URL: https://issues.apache.org/jira/browse/FLINK-23289
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Runtime
>            Reporter: Terry Wang
>            Priority: Major
>
> {{BinarySection}} currently does not check if {{MemorySegment[]}} is null in
> its constructor. This might cause {{NullPointerException}} somewhere else and
> makes it harder to debug (as we don't know who sets the null value into
> {{BinarySection}}).
> {code:java}
> Caused by: java.lang.NullPointerException
>     at org.apache.flink.table.data.binary.BinarySegmentUtils.inFirstSegment(BinarySegmentUtils.java:411)
>     at org.apache.flink.table.data.binary.BinarySegmentUtils.copyToBytes(BinarySegmentUtils.java:132)
>     at org.apache.flink.table.data.binary.BinarySegmentUtils.copyToBytes(BinarySegmentUtils.java:118)
>     at org.apache.flink.table.data.binary.BinaryStringData.copy(BinaryStringData.java:360)
>     at org.apache.flink.table.runtime.typeutils.StringDataSerializer.copy(StringDataSerializer.java:59)
>     at org.apache.flink.table.runtime.typeutils.StringDataSerializer.copy(StringDataSerializer.java:37)
>     at org.apache.flink.table.runtime.typeutils.ArrayDataSerializer.copyGenericArray(ArrayDataSerializer.java:128)
>     at org.apache.flink.table.runtime.typeutils.ArrayDataSerializer.copy(ArrayDataSerializer.java:86)
>     at org.apache.flink.table.runtime.typeutils.ArrayDataSerializer.copy(ArrayDataSerializer.java:47)
>     at org.apache.flink.table.runtime.typeutils.RowDataSerializer.copyRowData(RowDataSerializer.java:170)
>     at org.apache.flink.table.runtime.typeutils.RowDataSerializer.copy(RowDataSerializer.java:131)
>     at org.apache.flink.table.runtime.typeutils.RowDataSerializer.copy(RowDataSerializer.java:48)
>     at org.apache.flink.table.runtime.operators.join.lookup.AsyncLookupJoinWithCalcRunner$CalcCollectionCollector.collect(AsyncLookupJoinWithCalcRunner.java:152)
>     at org.apache.flink.table.runtime.operators.join.lookup.AsyncLookupJoinWithCalcRunner$CalcCollectionCollector.collect(AsyncLookupJoinWithCalcRunner.java:142)
> {code}

-- This message was sent by Atlassian Jira (v8.3.4#803005)
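The fix proposed in FLINK-23289 above can be sketched as follows. This is a minimal illustrative example, not Flink's actual `BinarySection` code: `Object[]` stands in for Flink's `MemorySegment[]`, and the field set is simplified. The point is simply to fail fast in the constructor so the faulty caller is visible, instead of hitting a `NullPointerException` much later inside `BinarySegmentUtils`.

```java
// Hedged sketch of a null check in the constructor (hypothetical, simplified).
// Object[] stands in for Flink's MemorySegment[].
class BinarySection {
    final Object[] segments;
    final int offset;
    final int sizeInBytes;

    BinarySection(Object[] segments, int offset, int sizeInBytes) {
        if (segments == null) {
            // Failing here makes it obvious *who* constructed the section with a
            // null segment array, instead of failing in a distant serializer.
            throw new NullPointerException("segments must not be null");
        }
        this.segments = segments;
        this.offset = offset;
        this.sizeInBytes = sizeInBytes;
    }
}
```

With this check, the stack trace in the report would point at the code that created the section with a null `MemorySegment[]`, rather than at `BinarySegmentUtils.inFirstSegment`.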
[jira] [Updated] (FLINK-23289) BinarySection should add null check in constructor method
[ https://issues.apache.org/jira/browse/FLINK-23289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Caizhi Weng updated FLINK-23289:
--------------------------------
    Description:
{{BinarySection}} currently does not check if {{MemorySegment[]}} is null in its constructor. This might cause {{NullPointerException}} somewhere else and makes it harder to debug (as we don't know who sets the null value into {{BinarySection}}).
{code:java}
Caused by: java.lang.NullPointerException
    at org.apache.flink.table.data.binary.BinarySegmentUtils.inFirstSegment(BinarySegmentUtils.java:411)
    at org.apache.flink.table.data.binary.BinarySegmentUtils.copyToBytes(BinarySegmentUtils.java:132)
    at org.apache.flink.table.data.binary.BinarySegmentUtils.copyToBytes(BinarySegmentUtils.java:118)
    at org.apache.flink.table.data.binary.BinaryStringData.copy(BinaryStringData.java:360)
    at org.apache.flink.table.runtime.typeutils.StringDataSerializer.copy(StringDataSerializer.java:59)
    at org.apache.flink.table.runtime.typeutils.StringDataSerializer.copy(StringDataSerializer.java:37)
    at org.apache.flink.table.runtime.typeutils.ArrayDataSerializer.copyGenericArray(ArrayDataSerializer.java:128)
    at org.apache.flink.table.runtime.typeutils.ArrayDataSerializer.copy(ArrayDataSerializer.java:86)
    at org.apache.flink.table.runtime.typeutils.ArrayDataSerializer.copy(ArrayDataSerializer.java:47)
    at org.apache.flink.table.runtime.typeutils.RowDataSerializer.copyRowData(RowDataSerializer.java:170)
    at org.apache.flink.table.runtime.typeutils.RowDataSerializer.copy(RowDataSerializer.java:131)
    at org.apache.flink.table.runtime.typeutils.RowDataSerializer.copy(RowDataSerializer.java:48)
    at org.apache.flink.table.runtime.operators.join.lookup.AsyncLookupJoinWithCalcRunner$CalcCollectionCollector.collect(AsyncLookupJoinWithCalcRunner.java:152)
    at org.apache.flink.table.runtime.operators.join.lookup.AsyncLookupJoinWithCalcRunner$CalcCollectionCollector.collect(AsyncLookupJoinWithCalcRunner.java:142)
{code}

  was:
{code:java}
Caused by: java.lang.NullPointerException
    at org.apache.flink.table.data.binary.BinarySegmentUtils.inFirstSegment(BinarySegmentUtils.java:411)
    at org.apache.flink.table.data.binary.BinarySegmentUtils.copyToBytes(BinarySegmentUtils.java:132)
    at org.apache.flink.table.data.binary.BinarySegmentUtils.copyToBytes(BinarySegmentUtils.java:118)
    at org.apache.flink.table.data.binary.BinaryStringData.copy(BinaryStringData.java:360)
    at org.apache.flink.table.runtime.typeutils.StringDataSerializer.copy(StringDataSerializer.java:59)
    at org.apache.flink.table.runtime.typeutils.StringDataSerializer.copy(StringDataSerializer.java:37)
    at org.apache.flink.table.runtime.typeutils.ArrayDataSerializer.copyGenericArray(ArrayDataSerializer.java:128)
    at org.apache.flink.table.runtime.typeutils.ArrayDataSerializer.copy(ArrayDataSerializer.java:86)
    at org.apache.flink.table.runtime.typeutils.ArrayDataSerializer.copy(ArrayDataSerializer.java:47)
    at org.apache.flink.table.runtime.typeutils.RowDataSerializer.copyRowData(RowDataSerializer.java:170)
    at org.apache.flink.table.runtime.typeutils.RowDataSerializer.copy(RowDataSerializer.java:131)
    at org.apache.flink.table.runtime.typeutils.RowDataSerializer.copy(RowDataSerializer.java:48)
    at org.apache.flink.table.runtime.operators.join.lookup.AsyncLookupJoinWithCalcRunner$CalcCollectionCollector.collect(AsyncLookupJoinWithCalcRunner.java:152)
    at org.apache.flink.table.runtime.operators.join.lookup.AsyncLookupJoinWithCalcRunner$CalcCollectionCollector.collect(AsyncLookupJoinWithCalcRunner.java:142)
{code}

> BinarySection should add null check in constructor method
> ---------------------------------------------------------
>
>                 Key: FLINK-23289
>                 URL: https://issues.apache.org/jira/browse/FLINK-23289
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Runtime
>            Reporter: Terry Wang
>            Priority: Major
>
> {{BinarySection}} currently does not check if {{MemorySegment[]}} is null in
> its constructor. This might cause {{NullPointerException}} somewhere else and
> makes it harder to debug (as we don't know who sets the null value into
> {{BinarySection}}).
> {code:java}
> Caused by: java.lang.NullPointerException
>     at org.apache.flink.table.data.binary.BinarySegmentUtils.inFirstSegment(BinarySegmentUtils.java:411)
>     at org.apache.flink.table.data.binary.BinarySegmentUtils.copyToBytes(BinarySegmentUtils.java:132)
>     at org.apache.flink.table.data.binary.BinarySegmentUtils.copyToBytes(BinarySegmentUtils.java:118)
>     at org.apache.flink.table.data.binary.BinaryStringData.copy(BinaryStringData.java:360)
>     at org.apache.flink.table.runtime.typeutils.StringDataSerializer.copy(StringDataSerializer.java:59)
>     at org.apache.flink.table.runtime.typeutils.StringDataSerializer.copy(StringDataSerializer.java:37)
>     at
[jira] [Comment Edited] (FLINK-23219) temproary join ttl configruation does not take effect
[ https://issues.apache.org/jira/browse/FLINK-23219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377821#comment-17377821 ]

Caizhi Weng edited comment on FLINK-23219 at 7/9/21, 5:40 AM:
--------------------------------------------------------------

[~waywtdcc] If I'm not mistaken, the keys on your right side may appear only a few times and then never again, and you want to remove them from the state by setting an expiration time? If that is the case, Flink currently does not have an option to achieve such behavior. I'd see this as a feature request or an "improvement". Shall we add an option for this? [~Leonard Xu]

was (Author: tsreaper):
[~waywtdcc] If I'm not mistaken, it is possible for your keys on the right side to appear only a few times and will not appear any more, and you want to remove them by setting an expiration time? If it is the case, Flink currently does not have an option to achieve such behavior. I'd like to see it as a feature request or an "improvement". Shall we add an option for this? [~Leonard Xu]

> temproary join ttl configruation does not take effect
> -----------------------------------------------------
>
>                 Key: FLINK-23219
>                 URL: https://issues.apache.org/jira/browse/FLINK-23219
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / API, Table SQL / Runtime
>    Affects Versions: 1.12.2
>            Reporter: waywtdcc
>            Priority: Major
>              Labels: flink, pull-request-available, sql
>         Attachments: image-2021-07-02-16-29-40-310.png
>
>
> * version: flink 1.12.2
> * problem: I run a job where table A temporal left joins table B, and set the table.exec.state.ttl configuration to 3 hours (or 2 seconds for testing). But the task state keeps growing for more than 7 days.
> * code
> {code:java}
> tableEnv.getConfig().setIdleStateRetention(Duration.ofSeconds(2));
> tableEnv.executeSql("drop table if exists persons_table_kafka2");
> String kafka_source_sql = "CREATE TABLE persons_table_kafka2 (\n" +
>     " `id` BIGINT,\n" +
>     " `name` STRING,\n" +
>     " `age` INT,\n" +
>     " proctime as PROCTIME(),\n" +
>     " `ts` TIMESTAMP(3),\n" +
>     " WATERMARK FOR ts AS ts\n" +
>     ") WITH (\n" +
>     " 'connector' = 'kafka',\n" +
>     " 'topic' = 'persons_test_auto',\n" +
>     " 'properties.bootstrap.servers' = 'node2:6667',\n" +
>     " 'properties.group.id' = 'testGrodsu1765',\n" +
>     " 'scan.startup.mode' = 'group-offsets',\n" +
>     " 'format' = 'json'\n" +
>     ")";
> tableEnv.executeSql(kafka_source_sql);
> tableEnv.executeSql("drop table if exists persons_message_table_kafka2");
> String kafka_source_sql2 = "CREATE TABLE persons_message_table_kafka2 (\n" +
>     " `id` BIGINT,\n" +
>     " `name` STRING,\n" +
>     " `message` STRING,\n" +
>     " `ts` TIMESTAMP(3) ," +
>     // " WATERMARK FOR ts AS ts - INTERVAL '5' SECOND\n" +
>     " WATERMARK FOR ts AS ts\n" +
>     ") WITH (\n" +
>     " 'connector' = 'kafka',\n" +
>     " 'topic' = 'persons_extra_message_auto',\n" +
>     " 'properties.bootstrap.servers' = 'node2:6667',\n" +
>     " 'properties.group.id' = 'testGroud125313',\n" +
>     " 'scan.startup.mode' = 'group-offsets',\n" +
>     " 'format' = 'json'\n" +
>     ")";
> tableEnv.executeSql(kafka_source_sql2);
> tableEnv.executeSql(
>     "CREATE TEMPORARY VIEW persons_message_table22 AS \n" +
>     "SELECT id, name, message, ts \n" +
>     " FROM (\n" +
>     "   SELECT *,\n" +
>     "     ROW_NUMBER() OVER (PARTITION BY name \n" +
>     "       ORDER BY ts DESC) AS rowNum \n" +
>     "   FROM persons_message_table_kafka2 " +
>     " )\n" +
>     "WHERE rowNum = 1");
> tableEnv.executeSql("" +
>     "CREATE TEMPORARY VIEW result_data_view " +
>     " as " +
>     " select " +
>     " t1.name, t1.age, t1.ts as ts1, t2.message, cast(t2.ts as string) as ts2 " +
>     " from persons_table_kafka2 t1 " +
>     " left join persons_message_table22 FOR SYSTEM_TIME AS OF t1.ts AS t2 on t1.name = t2.name "
> );
> Table resultTable = tableEnv.from("result_data_view");
> DataStream<RowData> rowDataDataStream = tableEnv.toAppendStream(resultTable, RowData.class);
> rowDataDataStream.print();
> env.execute("test_it");
> {code}
> * the result looks like !image-2021-07-02-16-29-40-310.png!
[jira] [Commented] (FLINK-23219) temproary join ttl configruation does not take effect
[ https://issues.apache.org/jira/browse/FLINK-23219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377821#comment-17377821 ]

Caizhi Weng commented on FLINK-23219:
-------------------------------------

[~waywtdcc] If I'm not mistaken, the keys on your right side may appear only a few times and then never again, and you want to remove them by setting an expiration time? If that is the case, Flink currently does not have an option to achieve such behavior. I'd see this as a feature request or an "improvement". Shall we add an option for this? [~Leonard Xu]

> temproary join ttl configruation does not take effect
> -----------------------------------------------------
>
>                 Key: FLINK-23219
>                 URL: https://issues.apache.org/jira/browse/FLINK-23219
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / API, Table SQL / Runtime
>    Affects Versions: 1.12.2
>            Reporter: waywtdcc
>            Priority: Major
>              Labels: flink, pull-request-available, sql
>         Attachments: image-2021-07-02-16-29-40-310.png
>
>
> * version: flink 1.12.2
> * problem: I run a job where table A temporal left joins table B, and set the table.exec.state.ttl configuration to 3 hours (or 2 seconds for testing). But the task state keeps growing for more than 7 days.
> * code
> {code:java}
> tableEnv.getConfig().setIdleStateRetention(Duration.ofSeconds(2));
> tableEnv.executeSql("drop table if exists persons_table_kafka2");
> String kafka_source_sql = "CREATE TABLE persons_table_kafka2 (\n" +
>     " `id` BIGINT,\n" +
>     " `name` STRING,\n" +
>     " `age` INT,\n" +
>     " proctime as PROCTIME(),\n" +
>     " `ts` TIMESTAMP(3),\n" +
>     " WATERMARK FOR ts AS ts\n" +
>     ") WITH (\n" +
>     " 'connector' = 'kafka',\n" +
>     " 'topic' = 'persons_test_auto',\n" +
>     " 'properties.bootstrap.servers' = 'node2:6667',\n" +
>     " 'properties.group.id' = 'testGrodsu1765',\n" +
>     " 'scan.startup.mode' = 'group-offsets',\n" +
>     " 'format' = 'json'\n" +
>     ")";
> tableEnv.executeSql(kafka_source_sql);
> tableEnv.executeSql("drop table if exists persons_message_table_kafka2");
> String kafka_source_sql2 = "CREATE TABLE persons_message_table_kafka2 (\n" +
>     " `id` BIGINT,\n" +
>     " `name` STRING,\n" +
>     " `message` STRING,\n" +
>     " `ts` TIMESTAMP(3) ," +
>     // " WATERMARK FOR ts AS ts - INTERVAL '5' SECOND\n" +
>     " WATERMARK FOR ts AS ts\n" +
>     ") WITH (\n" +
>     " 'connector' = 'kafka',\n" +
>     " 'topic' = 'persons_extra_message_auto',\n" +
>     " 'properties.bootstrap.servers' = 'node2:6667',\n" +
>     " 'properties.group.id' = 'testGroud125313',\n" +
>     " 'scan.startup.mode' = 'group-offsets',\n" +
>     " 'format' = 'json'\n" +
>     ")";
> tableEnv.executeSql(kafka_source_sql2);
> tableEnv.executeSql(
>     "CREATE TEMPORARY VIEW persons_message_table22 AS \n" +
>     "SELECT id, name, message, ts \n" +
>     " FROM (\n" +
>     "   SELECT *,\n" +
>     "     ROW_NUMBER() OVER (PARTITION BY name \n" +
>     "       ORDER BY ts DESC) AS rowNum \n" +
>     "   FROM persons_message_table_kafka2 " +
>     " )\n" +
>     "WHERE rowNum = 1");
> tableEnv.executeSql("" +
>     "CREATE TEMPORARY VIEW result_data_view " +
>     " as " +
>     " select " +
>     " t1.name, t1.age, t1.ts as ts1, t2.message, cast(t2.ts as string) as ts2 " +
>     " from persons_table_kafka2 t1 " +
>     " left join persons_message_table22 FOR SYSTEM_TIME AS OF t1.ts AS t2 on t1.name = t2.name "
> );
> Table resultTable = tableEnv.from("result_data_view");
> DataStream<RowData> rowDataDataStream = tableEnv.toAppendStream(resultTable, RowData.class);
> rowDataDataStream.print();
> env.execute("test_it");
> {code}
> * the result looks like !image-2021-07-02-16-29-40-310.png!
[jira] [Commented] (FLINK-23262) FileReadingWatermarkITCase.testWatermarkEmissionWithChaining fails on azure
[ https://issues.apache.org/jira/browse/FLINK-23262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377819#comment-17377819 ]

Xintong Song commented on FLINK-23262:
--------------------------------------

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20201=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=f508e270-48d6-5f1e-3138-42a17e0714f0=4820

> FileReadingWatermarkITCase.testWatermarkEmissionWithChaining fails on azure
> ---------------------------------------------------------------------------
>
>                 Key: FLINK-23262
>                 URL: https://issues.apache.org/jira/browse/FLINK-23262
>             Project: Flink
>          Issue Type: Bug
>          Components: API / DataStream
>    Affects Versions: 1.14.0
>            Reporter: Xintong Song
>            Assignee: Roman Khachatryan
>            Priority: Major
>              Labels: pull-request-available, test-stability
>             Fix For: 1.14.0
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=19942=logs=219e462f-e75e-506c-3671-5017d866ccf6=4c5dc768-5c82-5ab0-660d-086cb90b76a0=5584
> {code}
> Jul 05 22:19:00 [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 4.334 s <<< FAILURE! - in org.apache.flink.test.streaming.api.FileReadingWatermarkITCase
> Jul 05 22:19:00 [ERROR] testWatermarkEmissionWithChaining(org.apache.flink.test.streaming.api.FileReadingWatermarkITCase)  Time elapsed: 4.16 s  <<< FAILURE!
> Jul 05 22:19:00 java.lang.AssertionError: too few watermarks emitted: 4
> Jul 05 22:19:00 	at org.junit.Assert.fail(Assert.java:89)
> Jul 05 22:19:00 	at org.junit.Assert.assertTrue(Assert.java:42)
> Jul 05 22:19:00 	at org.apache.flink.test.streaming.api.FileReadingWatermarkITCase.testWatermarkEmissionWithChaining(FileReadingWatermarkITCase.java:65)
> Jul 05 22:19:00 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> Jul 05 22:19:00 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Jul 05 22:19:00 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Jul 05 22:19:00 	at java.lang.reflect.Method.invoke(Method.java:498)
> Jul 05 22:19:00 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> Jul 05 22:19:00 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> Jul 05 22:19:00 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> Jul 05 22:19:00 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> Jul 05 22:19:00 	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> Jul 05 22:19:00 	at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> Jul 05 22:19:00 	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> Jul 05 22:19:00 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> Jul 05 22:19:00 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> Jul 05 22:19:00 	at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> Jul 05 22:19:00 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> Jul 05 22:19:00 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> Jul 05 22:19:00 	at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> Jul 05 22:19:00 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> Jul 05 22:19:00 	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> Jul 05 22:19:00 	at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> Jul 05 22:19:00 	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> Jul 05 22:19:00 	at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> Jul 05 22:19:00 	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> Jul 05 22:19:00 	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> Jul 05 22:19:00 	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
> Jul 05 22:19:00 	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
> Jul 05 22:19:00 	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
> Jul 05 22:19:00 	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}
[jira] [Commented] (FLINK-23218) Distribute the ShuffleDescriptors via blob server
[ https://issues.apache.org/jira/browse/FLINK-23218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377811#comment-17377811 ]

Zhu Zhu commented on FLINK-23218:
---------------------------------

1. To avoid affecting existing users, I'd prefer the limit not to be too small; otherwise deployment performance may suffer, and more downloads can put a heavier load on the master/DFS. 1 GB sounds good to me. Note that for most users, jobs are small-scale, so the {{ShuffleDescriptors}} cache will be very small and will not be shipped via blobs. A large limit therefore will not cause new issues (compared with the current behavior, where there is no limit at all).

2. For small-scale jobs, the {{ShuffleDescriptors}} cache will not be shipped via blobs, so residue problems will not get worse. Even for large-scale jobs, IIRC, the compressed {{ShuffleDescriptors}} cache of an 8000x8000 shuffle is 200k+ bytes, which still does not exceed the 1 MB blob offloading threshold. Therefore I think we can document the configuration "blob.offload.minsize" to make users aware of the residuals and the blob size limit.

> Distribute the ShuffleDescriptors via blob server
> -------------------------------------------------
>
>                 Key: FLINK-23218
>                 URL: https://issues.apache.org/jira/browse/FLINK-23218
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>            Reporter: Zhilong Hong
>            Priority: Major
>             Fix For: 1.14.0
>
> h3. Introduction
> The optimizations introduced in FLINK-21110 so far have improved the performance of job initialization, failover and partition releasing. However, task deployment is still slow. For a job with two vertices, each with 8k parallelism and connected with an all-to-all edge, it takes 32.611 s to deploy all the tasks and make them transition to running. If the parallelisms are 16k, it may take more than 2 minutes.
> As the creation of TaskDeploymentDescriptors runs in the main thread of the jobmanager, this means the jobmanager cannot deal with other akka messages (heartbeats, task status updates, etc.) for more than two minutes.
> All in all, there are currently two issues in the deployment of tasks for large-scale jobs:
> # It takes a long time to deploy tasks, especially for all-to-all edges.
> # Heartbeat timeouts may happen during or after the task deployment procedure. For a streaming job, this causes the failover of the entire region. The job may never transition to running, since there would be another heartbeat timeout during the deployment of the new tasks.
> h3. Proposal
> Task deployment involves the following procedures:
> # The jobmanager creates a TaskDeploymentDescriptor for each task in the main thread.
> # The TaskDeploymentDescriptor is serialized in the future executor.
> # Akka transports the TaskDeploymentDescriptors to the TaskExecutors via RPC calls.
> # The TaskExecutors create a new task thread and execute it.
> The optimization contains two parts:
> *1. Cache the compressed serialized value of ShuffleDescriptors*
> ShuffleDescriptors in the TaskDeploymentDescriptor describe the IntermediateResultPartitions that a task consumes. For downstream vertices connected with an all-to-all edge of parallelism _N_, we currently calculate _N_ ShuffleDescriptors _N_ times. However, these vertices share the same ShuffleDescriptors, since they all consume the same IntermediateResultPartitions, so we don't need to calculate the ShuffleDescriptors for each downstream vertex individually; we can just cache them. This decreases the overall complexity of calculating TaskDeploymentDescriptors from O(N^2) to O(N).
> Furthermore, we don't need to serialize the same ShuffleDescriptors _N_ times, so we can cache the serialized value of the ShuffleDescriptors instead of the original objects. To decrease the size of akka messages and reduce the transmission of replicated data over the network, the serialized value can be compressed.
> *2. Distribute the ShuffleDescriptors via the blob server*
> For ShuffleDescriptors of vertices with 8k parallelism, the size of the serialized value is more than 700 kilobytes. After compression, it is about 200 kilobytes. The overall size of 8k TaskDeploymentDescriptors is more than 1.6 gigabytes. Since Akka cannot send the messages as fast as the TaskDeploymentDescriptors are created, these TaskDeploymentDescriptors become a heavy burden for the garbage collector.
> In the TaskDeploymentDescriptor, JobInformation and TaskInformation are distributed via the blob server if their sizes exceed a certain threshold (defined as {{blob.offload.minsize}}). TaskExecutors request the information from the blob server once they begin to process the
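The first part of the proposal quoted above (cache the compressed serialized ShuffleDescriptors once per consumed result and reuse the bytes for every downstream task) can be sketched as follows. This is a hedged illustration, not Flink's actual implementation: the class and method names are invented, a `String` stands in for the result-partition group id, and plain Java serialization plus `Deflater` compression stand in for Flink's own serialization and compression utilities.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.DeflaterOutputStream;

// Illustrative sketch of caching the compressed serialized value: the expensive
// serialize-and-compress step runs once per result id, and every downstream
// task that consumes the same result reuses the identical byte array, turning
// O(N^2) per-consumer work into O(N).
class SerializedShuffleDescriptorCache {
    private final Map<String, byte[]> cache = new HashMap<>();

    byte[] getOrCompute(String resultId, Serializable descriptors) {
        return cache.computeIfAbsent(resultId, id -> {
            try {
                ByteArrayOutputStream bos = new ByteArrayOutputStream();
                // Closing the ObjectOutputStream finishes the Deflater stream,
                // so bos holds the complete compressed payload afterwards.
                try (ObjectOutputStream out =
                        new ObjectOutputStream(new DeflaterOutputStream(bos))) {
                    out.writeObject(descriptors);
                }
                return bos.toByteArray();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }
}
```

With this shape, the second consumer of the same result gets the already-computed byte array back, so serialization and compression cost does not scale with the number of downstream tasks.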
[GitHub] [flink] pnowojski commented on a change in pull request #11877: [FLINK-16641][network] Announce sender's backlog to solve the deadlock issue without exclusive buffers
pnowojski commented on a change in pull request #11877: URL: https://github.com/apache/flink/pull/11877#discussion_r77997

## File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/PipelinedSubpartition.java

## @@ -312,6 +323,16 @@ BufferAndBacklog pollBuffer() {
             decreaseBuffersInBacklogUnsafe(bufferConsumer.isBuffer());
         }
+        // if we have an empty finished buffer and the exclusive credit is 0, we just return
+        // the empty buffer so that the downstream task can release the allocated credit for
+        // this empty buffer, this happens in two main scenarios currently:
+        // 1. all data of a buffer builder has been read and after that the buffer builder
+        //    is finished
+        // 2. in approximate recovery mode, a partial record takes a whole buffer builder
+        if (buffersPerChannel == 0 && bufferConsumer.isFinished()) {
+            break;
+        }
+

Review comment: Yes, you are right. It wouldn't be that simple. In that case, how complicated would it be to optimise the code to skip all of the empty buffers until:
1. a non-empty data buffer
2. an event (then send the empty buffer first)
3. the last empty buffer, without any events after it (here we would indeed need to send that empty buffer)
?
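The skipping strategy proposed in the review comment above could look roughly like the following. This is a hedged sketch with invented names (`Buf`, `poll`), not the actual `PipelinedSubpartition` code: empty finished data buffers are dropped until we reach a non-empty buffer, an event (where the preceding empty buffer must still be sent first), or the very last empty buffer (which must be sent so the downstream credit accounting is released).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Each entry stands for a finished BufferConsumer: payload size in bytes,
// and whether it carries an event. Names are illustrative only.
class Buf {
    final int size;
    final boolean isEvent;
    Buf(int size, boolean isEvent) { this.size = size; this.isEvent = isEvent; }
}

class EmptyBufferSkipper {
    // Returns the next buffer to emit, applying the three rules from the
    // review comment: stop skipping at (1) a non-empty data buffer, (2) an
    // event (emit the empty buffer in front of it first), or (3) the last
    // empty buffer in the queue.
    static Buf poll(Deque<Buf> queue) {
        while (!queue.isEmpty()) {
            Buf head = queue.pollFirst();
            if (head.size > 0 || head.isEvent) {
                return head;                      // rule 1, or the event itself
            }
            Buf next = queue.peekFirst();
            if (next == null || next.isEvent) {
                return head;                      // rule 3 (last) or rule 2 (event follows)
            }
            // Otherwise: drop this empty buffer and keep scanning.
        }
        return null;                              // nothing left to emit
    }
}
```

The interesting property is that a run of empty buffers followed by real data collapses to just the data buffer, while an empty buffer directly before an event or at the very end is still emitted.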
[GitHub] [flink] flinkbot edited a comment on pull request #16368: ignore
flinkbot edited a comment on pull request #16368: URL: https://github.com/apache/flink/pull/16368#issuecomment-873854539

## CI report:

* 4812b42d100c84091daf0841abff818eef20ada9 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20144)
* 09ac6e50da595b080878d1f18815d2a9fb251f13 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20213)

Bot commands: The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot edited a comment on pull request #16352: [FLINK-23102][runtime] Accessing FlameGraphs while not being enabled …
flinkbot edited a comment on pull request #16352: URL: https://github.com/apache/flink/pull/16352#issuecomment-872812232

## CI report:

* 6a7d378800616b57c9551b849a550ebb1716315f Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20170)
* fd2bddea66cd255bf9e36fbdd6b383c890dbb1bf UNKNOWN

Bot commands: The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot edited a comment on pull request #16432: [FLINK-23233][runtime] Failing checkpoints before failover for failed events in OperatorCoordinator
flinkbot edited a comment on pull request #16432: URL: https://github.com/apache/flink/pull/16432#issuecomment-876475935

## CI report:

* 473db5d7954ebc197a64bb49d3b91d5c09d2c595 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20203)
* 1b3a0d615d4f69eafd023cb3a7dd6cd70e3d6673 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20209)

Bot commands: The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot edited a comment on pull request #16435: [FLINK-23184][table-runtime-blink] Fix compile error in code generation of unary plus and minus
flinkbot edited a comment on pull request #16435: URL: https://github.com/apache/flink/pull/16435#issuecomment-876899708

## CI report:

* 3032756fb594f7e8ca62c59c896b933542453659 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20210)

Bot commands: The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] flinkbot edited a comment on pull request #16368: ignore
flinkbot edited a comment on pull request #16368: URL: https://github.com/apache/flink/pull/16368#issuecomment-873854539

## CI report:

* 4812b42d100c84091daf0841abff818eef20ada9 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20144)
* 09ac6e50da595b080878d1f18815d2a9fb251f13 UNKNOWN

Bot commands: The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] rkhachatryan edited a comment on pull request #16415: [FLINK-23262][tests] Check watermark frequency instead of count
rkhachatryan edited a comment on pull request #16415: URL: https://github.com/apache/flink/pull/16415#issuecomment-876898158 Sorry, my bad, I should have applied the same change to record the start time. Now (with 499dfac) the test passed (also checked locally with 10k runs, with a smaller `FILE_SIZE_LINES = 1_000_000`). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] rkhachatryan commented on pull request #16404: [FLINK-23277][state/changelog] Store and recover TTL metadata using changelog
rkhachatryan commented on pull request #16404: URL: https://github.com/apache/flink/pull/16404#issuecomment-876908810 Thanks for the review @Myasuka, I've replied to your comments and updated the PR, please take a look. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (FLINK-23144) Add percentiles to checkpoint stats
[ https://issues.apache.org/jira/browse/FLINK-23144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roman Khachatryan reassigned FLINK-23144: - Assignee: Roman Khachatryan > Add percentiles to checkpoint stats > --- > > Key: FLINK-23144 > URL: https://issues.apache.org/jira/browse/FLINK-23144 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Checkpointing, Runtime / Web Frontend >Reporter: Roman Khachatryan >Assignee: Roman Khachatryan >Priority: Minor > Fix For: 1.14.0 > > > Currently, only min/avg/max are shown, which doesn't make it easy to assess > checkpointing times. > Ideally, with a breakdown by operator/channel state write times and sync/async > phases (exact requirements up to the implementer) -- This message was sent by Atlassian Jira (v8.3.4#803005)
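The ticket above proposes adding percentiles alongside the existing min/avg/max checkpoint stats. As a rough illustration of what such a statistic involves (not Flink's actual implementation; the class and method names here are made up), a nearest-rank percentile over sampled checkpoint durations can be computed like this:

```java
import java.util.Arrays;

// Hypothetical sketch, not Flink's actual code: nearest-rank percentiles
// over a sample of checkpoint durations, the kind of statistic the ticket
// proposes to show alongside min/avg/max.
public class CheckpointPercentiles {

    // Nearest-rank method: the p-th percentile is the value at rank
    // ceil(p/100 * n) in the sorted sample (1-based).
    static long percentile(long[] durationsMillis, double p) {
        long[] sorted = durationsMillis.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // One slow outlier dominates the max but barely moves the median.
        long[] samples = {120, 80, 200, 95, 110, 5000, 130, 90, 105, 115};
        System.out.println("p50=" + percentile(samples, 50)); // 110
        System.out.println("p90=" + percentile(samples, 90)); // 200
        System.out.println("p99=" + percentile(samples, 99)); // 5000
    }
}
```

This also shows why percentiles matter here: a p50 of 110 ms describes typical checkpointing far better than a max skewed by one 5 s outlier.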
[jira] [Commented] (FLINK-23235) SinkITCase.writerAndCommitterAndGlobalCommitterExecuteInStreamingMode fails on azure
[ https://issues.apache.org/jira/browse/FLINK-23235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377791#comment-17377791 ] Guowei Ma commented on FLINK-23235: --- I thought about it again. In fact, we only need to exclude the "end of input" marker contained in GLOBAL_COMMIT_QUEUE. Such a modification will be simpler than modifying TestSink. Once the Final Checkpoint work is finished in the future, it will be easy to resume the detection of "end of input". > SinkITCase.writerAndCommitterAndGlobalCommitterExecuteInStreamingMode fails > on azure > > > Key: FLINK-23235 > URL: https://issues.apache.org/jira/browse/FLINK-23235 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Affects Versions: 1.13.1 >Reporter: Xintong Song >Priority: Major > Labels: test-stability > Fix For: 1.13.2 > > Attachments: screenshot-1.png > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=19867=logs=02c4e775-43bf-5625-d1cc-542b5209e072=e5961b24-88d9-5c77-efd3-955422674c25=9972 > {code} > Jul 03 23:57:29 [ERROR] Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, > Time elapsed: 5.53 s <<< FAILURE! - in > org.apache.flink.test.streaming.runtime.SinkITCase > Jul 03 23:57:29 [ERROR] > writerAndCommitterAndGlobalCommitterExecuteInStreamingMode(org.apache.flink.test.streaming.runtime.SinkITCase) > Time elapsed: 0.68 s <<< FAILURE! 
> Jul 03 23:57:29 java.lang.AssertionError: > Jul 03 23:57:29 > Jul 03 23:57:29 Expected: iterable over ["(895,null,-9223372036854775808)", > "(895,null,-9223372036854775808)", "(127,null,-9223372036854775808)", > "(127,null,-9223372036854775808)", "(148,null,-9223372036854775808)", > "(148,null,-9223372036854775808)", "(161,null,-9223372036854775808)", > "(161,null,-9223372036854775808)", "(148,null,-9223372036854775808)", > "(148,null,-9223372036854775808)", "(662,null,-9223372036854775808)", > "(662,null,-9223372036854775808)", "(822,null,-9223372036854775808)", > "(822,null,-9223372036854775808)", "(491,null,-9223372036854775808)", > "(491,null,-9223372036854775808)", "(275,null,-9223372036854775808)", > "(275,null,-9223372036854775808)", "(122,null,-9223372036854775808)", > "(122,null,-9223372036854775808)", "(850,null,-9223372036854775808)", > "(850,null,-9223372036854775808)", "(630,null,-9223372036854775808)", > "(630,null,-9223372036854775808)", "(682,null,-9223372036854775808)", > "(682,null,-9223372036854775808)", "(765,null,-9223372036854775808)", > "(765,null,-9223372036854775808)", "(434,null,-9223372036854775808)", > "(434,null,-9223372036854775808)", "(970,null,-9223372036854775808)", > "(970,null,-9223372036854775808)", "(714,null,-9223372036854775808)", > "(714,null,-9223372036854775808)", "(795,null,-9223372036854775808)", > "(795,null,-9223372036854775808)", "(288,null,-9223372036854775808)", > "(288,null,-9223372036854775808)", "(422,null,-9223372036854775808)", > "(422,null,-9223372036854775808)"] in any order > Jul 03 23:57:29 but: Not matched: "end of input" > Jul 03 23:57:29 at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > Jul 03 23:57:29 at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8) > Jul 03 23:57:29 at > org.apache.flink.test.streaming.runtime.SinkITCase.writerAndCommitterAndGlobalCommitterExecuteInStreamingMode(SinkITCase.java:139) > Jul 03 23:57:29 at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > Jul 03 23:57:29 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > Jul 03 23:57:29 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > Jul 03 23:57:29 at java.lang.reflect.Method.invoke(Method.java:498) > Jul 03 23:57:29 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > Jul 03 23:57:29 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > Jul 03 23:57:29 at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > Jul 03 23:57:29 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > Jul 03 23:57:29 at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > Jul 03 23:57:29 at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > Jul 03 23:57:29 at > org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) > Jul 03 23:57:29 at > org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > Jul 03 23:57:29 at org.junit.rules.RunRules.evaluate(RunRules.java:20) > Jul 03 23:57:29 at >
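The approach suggested in the comment above, excluding the "end of input" marker from the global committer's queue rather than changing TestSink, can be sketched as follows. All identifiers here are illustrative assumptions, not the actual SinkITCase/TestSink names:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch of the fix idea: drop the "end of input" marker from
// the global committer's queue before asserting on committed records.
// The queue handling and the marker constant are assumptions, not the
// actual test's identifiers.
public class GlobalCommitFilter {
    static final String END_OF_INPUT = "end of input";

    // Keep only the data records; the marker caused the mismatch above.
    static List<String> committedRecords(List<String> globalCommitQueue) {
        return globalCommitQueue.stream()
                .filter(record -> !END_OF_INPUT.equals(record))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> queue =
                Arrays.asList("(895,null,-9223372036854775808)", END_OF_INPUT);
        // Only the data record survives; the marker is excluded.
        System.out.println(committedRecords(queue));
    }
}
```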
[GitHub] [flink] flinkbot edited a comment on pull request #11877: [FLINK-16641][network] Announce sender's backlog to solve the deadlock issue without exclusive buffers
flinkbot edited a comment on pull request #11877: URL: https://github.com/apache/flink/pull/11877#issuecomment-618273998 ## CI report: * 01b2bc58b30a2a3730895f7c50ff59099bd273d2 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20184) * d4b35b61d395564b24dd98896b785e22c6e3ab30 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20208) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-23074) There is a class conflict between flink-connector-hive and flink-parquet
[ https://issues.apache.org/jira/browse/FLINK-23074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377786#comment-17377786 ] Rui Li commented on FLINK-23074: Fixed in release-1.13: 8359fa8a9149392d964f52e7492b4dc24d74bb15 Fixed in release-1.12: 09ac6e50da595b080878d1f18815d2a9fb251f13 > There is a class conflict between flink-connector-hive and flink-parquet > > > Key: FLINK-23074 > URL: https://issues.apache.org/jira/browse/FLINK-23074 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive >Affects Versions: 1.13.1, 1.12.4 >Reporter: Ada Wong >Assignee: Ada Wong >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > Attachments: E8C394D1-F970-4825-82CD-3EFA74C65B27.png, > image-2021-06-23-17-26-32-559.png, image-2021-07-01-18-23-47-105.png, > image-2021-07-01-18-40-00-991.png, image-2021-07-01-18-40-31-729.png, > screenshot-1.png, screenshot-3.png, screenshot-4.png > > > flink-connector-hive 2.3.6 include parquet-hadoop 1.8.1 version but > flink-parquet include 1.11.1. > org.apache.parquet.hadoop.example.GroupWriteSupport > is different. -- This message was sent by Atlassian Jira (v8.3.4#803005)
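The fix referenced above relocates the Parquet classes pulled in via hive-exec so they no longer clash with flink-parquet's newer parquet-hadoop. A hedged sketch of the kind of maven-shade-plugin relocation involved (the shaded package name is hypothetical, not taken from the actual PR):

```xml
<!-- Sketch of a maven-shade-plugin relocation of the kind the fix applies;
     the shadedPattern package is illustrative, not the name used in the
     actual flink-connector-hive change. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- Move the old parquet-hadoop 1.8.1 classes bundled via
                 hive-exec out of the way of flink-parquet's 1.11.1. -->
            <pattern>org.apache.parquet</pattern>
            <shadedPattern>org.apache.flink.hive.shaded.parquet</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

After relocation, hive-exec's internal references point at the shaded package, so both Parquet versions can coexist on the classpath.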
[GitHub] [flink] lirui-apache closed pull request #16424: [FLINK-23074][connector-hive] Shade parquet class in hive-exec to prevent conflict.
lirui-apache closed pull request #16424: URL: https://github.com/apache/flink/pull/16424 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] lirui-apache closed pull request #16423: [FLINK-23074][connector-hive] Shade parquet class in hive-exec to prevent conflict.
lirui-apache closed pull request #16423: URL: https://github.com/apache/flink/pull/16423 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #16435: [FLINK-23184][table-runtime-blink] Fix compile error in code generation of unary plus and minus
flinkbot commented on pull request #16435: URL: https://github.com/apache/flink/pull/16435#issuecomment-876899708 ## CI report: * 3032756fb594f7e8ca62c59c896b933542453659 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16434: [FLINK-23107][table-runtime] Separate implementation of deduplicate r…
flinkbot edited a comment on pull request #16434: URL: https://github.com/apache/flink/pull/16434#issuecomment-876881346 ## CI report: * e1f1b3df1ae855ffd1be6dded31e24bdb0b1f9db Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20204) * 122a5ecbc5d48497663dee76e0d915e198aa363e Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20207) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16420: [FLINK-23290][table-runtime] Fix NullPointerException when filter contains casting to boolean
flinkbot edited a comment on pull request #16420: URL: https://github.com/apache/flink/pull/16420#issuecomment-876213306 ## CI report: * cc63e6c45a73722102e26cf51e1b9c4b11554336 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20150) * 1fea76ef80b83b701c952e5b1d86a763c6a38724 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20206) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16349: [FLINK-23267][table-planner-blink] Enable Java code splitting for all generated classes
flinkbot edited a comment on pull request #16349: URL: https://github.com/apache/flink/pull/16349#issuecomment-872790415 ## CI report: * 0d0cd1ddbc25b0f960ed0864c02a0bf21c0867c1 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20165) * 4fc82d1d8342b1f69ca6720424257a928c4609ff Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20205) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] rkhachatryan edited a comment on pull request #16415: [FLINK-23262][tests] Check watermark frequency instead of count
rkhachatryan edited a comment on pull request #16415: URL: https://github.com/apache/flink/pull/16415#issuecomment-876898158 Sorry, my bad, I should have applied the same change to record the start time. Now (with 499dfac) the test passed (also checked locally with 10k runs). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] rkhachatryan commented on pull request #16415: [FLINK-23262][tests] Check watermark frequency instead of count
rkhachatryan commented on pull request #16415: URL: https://github.com/apache/flink/pull/16415#issuecomment-876898158 Sorry, my bad, I should have applied the same change to record the start time. Now the test passed (also checked locally with 10k runs). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
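The test strategy discussed above, checking watermark frequency rather than an exact count, can be sketched like this: record a start timestamp, then assert that the observed watermarks do not exceed what the elapsed time and the emit interval allow. Names and the slack policy are assumptions, not the actual FLINK-23262 test code:

```java
// Illustrative sketch, not the real test: frequency-based watermark check
// that tolerates timing jitter, unlike an exact-count assertion.
public class WatermarkFrequencyCheck {

    static boolean frequencyOk(
            long startNanos, long endNanos, int watermarkCount, long expectedIntervalMillis) {
        long elapsedMillis = (endNanos - startNanos) / 1_000_000;
        // Upper bound: one watermark per interval, plus one for the initial
        // watermark emitted right at the start.
        long maxExpected = elapsedMillis / expectedIntervalMillis + 1;
        return watermarkCount <= maxExpected;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // ... run the pipeline here, counting watermarks ...
        long end = start + 1_000_000_000L; // pretend 1 s elapsed
        // 1000 ms / 200 ms interval + 1 = at most 6 watermarks allowed.
        System.out.println(frequencyOk(start, end, 5, 200)); // true
    }
}
```

Recording the start time before the pipeline begins (the fix mentioned in the comment) matters because measuring from a later point shrinks the elapsed window and makes the bound spuriously tight.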
[GitHub] [flink] flinkbot edited a comment on pull request #11877: [FLINK-16641][network] Announce sender's backlog to solve the deadlock issue without exclusive buffers
flinkbot edited a comment on pull request #11877: URL: https://github.com/apache/flink/pull/11877#issuecomment-618273998 ## CI report: * 01b2bc58b30a2a3730895f7c50ff59099bd273d2 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20184) * d4b35b61d395564b24dd98896b785e22c6e3ab30 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] wsry commented on pull request #11877: [FLINK-16641][network] Announce sender's backlog to solve the deadlock issue without exclusive buffers
wsry commented on pull request #11877: URL: https://github.com/apache/flink/pull/11877#issuecomment-876895192 @pnowojski I have rebased onto the latest master branch and appended a fixup commit which addresses the review comments and enables 0 exclusive credit for the unaligned checkpoint ITCase. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-22672) Some enhancements for pluggable shuffle service framework
[ https://issues.apache.org/jira/browse/FLINK-22672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing updated FLINK-22672: - Description: "Pluggable shuffle service" in Flink provides an architecture that is unified for both streaming and batch jobs, allowing users to customize the process of data transfer between shuffle stages according to their scenarios. There are already a number of implementations of "remote shuffle service" on Spark, like [1][2][3]. Remote shuffle makes it possible to shuffle data from/to a remote cluster and achieves benefits like: # The lifecycle of computing resources can be decoupled from shuffle data; once a computing task is finished, idle computing nodes can be released, with their completed shuffle data accommodated on the remote shuffle cluster. # There is no need to reserve disk capacity for shuffle on computing nodes. A remote shuffle cluster serves shuffle requests with better scaling ability and alleviates local disk pressure on computing nodes when data is skewed. Based on "pluggable shuffle service", we built our own "remote shuffle service" on Flink, named Lattice, which targets providing functionality and improving performance for batch processing jobs. Basically it works as below: # The Lattice cluster works as an independent service for shuffle requests; # LatticeShuffleMaster extends ShuffleMaster, works inside the JM, and talks with the remote Lattice cluster for shuffle resource application and shuffle data lifecycle management; # LatticeShuffleEnvironment extends ShuffleEnvironment, works inside the TM, and provides an environment for shuffling data from/to the remote Lattice cluster; During the process of building Lattice we found some potential enhancements to "pluggable shuffle service". 
I will enumerate and create some sub JIRAs under this umbrella [1] [https://www.alibabacloud.com/blog/emr-remote-shuffle-service-a-powerful-elastic-tool-of-serverless-spark_597728] [2] [https://bestoreo.github.io/post/cosco/cosco/] [3] [https://github.com/uber/RemoteShuffleService] was: "Pluggable shuffle service" in Flink provides an architecture which are unified for both streaming and batch jobs, allowing user to customize the process of data transfer between shuffle stages according to scenarios. There are already a number of implementations of "remote shuffle service" on Spark like [1][2][3]. Remote shuffle enables to shuffle data from/to a remote cluster and achieves benefits like : # The lifecycle of computing resource can be decoupled with shuffle data, once computing task is finished, idle computing nodes can be released with its completed shuffle data accormmadated on remote shuffle cluster. # There is no need to reserve disk capacity for shuffle on computing nodes. Remote shuffle cluster serves shuffling request with better scaling ability and alleviates the local disk pressure on computing nodes when data skew. Based on "pluggable shuffle service", we build our own "remote shuffle service" on Flink –- Lattice, which targets to provide functionalities and improve performance for batch processing jobs. Basically it works as below: # Lattice cluster works as an independent service for shuffling request; # LatticeShuffleMaster extends ShuffleMaster, works inside JM and talks with remote Lattice cluster for shuffle resource application and shuffle data lifecycle management; # LatticeShuffleEnvironment extends ShuffleEnvironment, works inside TM and provides an environment for shuffling data from/to remote Lattice cluster; During the process of building Lattice we find some potential enhancements on "pluggable shuffle service". 
I will enumerate and create some sub JIRAs under this umbrella [1] [https://www.alibabacloud.com/blog/emr-remote-shuffle-service-a-powerful-elastic-tool-of-serverless-spark_597728] [2] [https://bestoreo.github.io/post/cosco/cosco/] [3] [https://github.com/uber/RemoteShuffleService] > Some enhancements for pluggable shuffle service framework > - > > Key: FLINK-22672 > URL: https://issues.apache.org/jira/browse/FLINK-22672 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network >Reporter: Jin Xing >Priority: Major > Fix For: 1.14.0 > > > "Pluggable shuffle service" in Flink provides an architecture which are > unified for both streaming and batch jobs, allowing user to customize the > process of data transfer between shuffle stages according to scenarios. > There are already a number of implementations of "remote shuffle service" on > Spark like [1][2][3]. Remote shuffle enables to shuffle data from/to a remote > cluster and achieves benefits like : > # The lifecycle of computing resource can be decoupled with shuffle data, > once computing task is finished, idle computing nodes can be
[jira] [Updated] (FLINK-22672) Some enhancements for pluggable shuffle service framework
[ https://issues.apache.org/jira/browse/FLINK-22672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing updated FLINK-22672: - Description: "Pluggable shuffle service" in Flink provides an architecture which are unified for both streaming and batch jobs, allowing user to customize the process of data transfer between shuffle stages according to scenarios. There are already a number of implementations of "remote shuffle service" on Spark like [1][2][3]. Remote shuffle enables to shuffle data from/to a remote cluster and achieves benefits like : # The lifecycle of computing resource can be decoupled with shuffle data, once computing task is finished, idle computing nodes can be released with its completed shuffle data accormmadated on remote shuffle cluster. # There is no need to reserve disk capacity for shuffle on computing nodes. Remote shuffle cluster serves shuffling request with better scaling ability and alleviates the local disk pressure on computing nodes when data skew. Based on "pluggable shuffle service", we build our own "remote shuffle service" on Flink –- Lattice, which targets to provide functionalities and improve performance for batch processing jobs. Basically it works as below: # Lattice cluster works as an independent service for shuffling request; # LatticeShuffleMaster extends ShuffleMaster, works inside JM and talks with remote Lattice cluster for shuffle resource application and shuffle data lifecycle management; # LatticeShuffleEnvironment extends ShuffleEnvironment, works inside TM and provides an environment for shuffling data from/to remote Lattice cluster; During the process of building Lattice we find some potential enhancements on "pluggable shuffle service". 
I will enumerate and create some sub JIRAs under this umbrella [1] [https://www.alibabacloud.com/blog/emr-remote-shuffle-service-a-powerful-elastic-tool-of-serverless-spark_597728] [2] [https://bestoreo.github.io/post/cosco/cosco/] [3] [https://github.com/uber/RemoteShuffleService] was: "Pluggable shuffle service" in Flink provides an architecture which are unified for both streaming and batch jobs, allowing user to customize the process of data transfer between shuffle stages according to scenarios. There are already a number of implementations of "remote shuffle service" on Spark like [1][2][3]. Remote shuffle enables to shuffle data from/to a remote cluster and achieves benefits like : # The lifecycle of computing resource can be decoupled with shuffle data, once computing task is finished, idle computing nodes can be released with its completed shuffle data accormadated on remote shuffle cluster. # There is no need to reserve disk capacity for shuffle on computing nodes. Remote shuffle cluster serves shuffling request with better scaling ability and alleviates the local disk pressure on computing nodes when data skew. Based "pluggable shuffle service", we build our own "remote shuffle service" on Flink –- Lattice, which targets to provide functionalities and improve performance for batch processing jobs. Basically it works as below: # Lattice cluster works as an independent service for shuffling request; # LatticeShuffleMaster extends ShuffleMaster, works inside JM and talks with remote Lattice cluster for shuffle resouce application and shuffle data lifecycle management; # LatticeShuffleEnvironmente extends ShuffleEnvironment, works inside TM and provides an environment for shuffling data from/to remote Lattice cluster; During the process of building Lattice we find some potential enhancements on "pluggable shuffle service". 
I will enumerate and create some sub JIRAs under this umbrella [1] [https://www.alibabacloud.com/blog/emr-remote-shuffle-service-a-powerful-elastic-tool-of-serverless-spark_597728] [2] [https://bestoreo.github.io/post/cosco/cosco/] [3] [https://github.com/uber/RemoteShuffleService] > Some enhancements for pluggable shuffle service framework > - > > Key: FLINK-22672 > URL: https://issues.apache.org/jira/browse/FLINK-22672 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network >Reporter: Jin Xing >Priority: Major > Fix For: 1.14.0 > > > "Pluggable shuffle service" in Flink provides an architecture which are > unified for both streaming and batch jobs, allowing user to customize the > process of data transfer between shuffle stages according to scenarios. > There are already a number of implementations of "remote shuffle service" on > Spark like [1][2][3]. Remote shuffle enables to shuffle data from/to a remote > cluster and achieves benefits like : > # The lifecycle of computing resource can be decoupled with shuffle data, > once computing task is finished, idle computing nodes can be
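The JM-side/TM-side split described in the ticket above can be mirrored in a deliberately simplified sketch. This is NOT Flink's actual ShuffleMaster/ShuffleEnvironment interfaces; it only illustrates the contract a remote shuffle implementation like Lattice plugs into, with all names invented for illustration:

```java
import java.util.concurrent.CompletableFuture;

// Simplified mirror of the pluggable shuffle contract: a JM-side component
// asks a (possibly remote) shuffle service for partition resources and
// manages their lifecycle. Not Flink's real SPI.
public class ShuffleContractSketch {

    interface SimpleShuffleMaster {
        // Ask the shuffle service where one result partition should live.
        CompletableFuture<String> registerPartition(String jobId, String partitionId);

        // Release the partition's data once downstream consumption is done.
        void releasePartition(String descriptor);
    }

    // In-memory stand-in for a remote shuffle cluster such as Lattice.
    static class InMemoryShuffleMaster implements SimpleShuffleMaster {
        @Override
        public CompletableFuture<String> registerPartition(String jobId, String partitionId) {
            // A remote implementation would contact the shuffle cluster here;
            // decoupling this from the TM is what lets idle nodes be released.
            return CompletableFuture.completedFuture(jobId + "/" + partitionId);
        }

        @Override
        public void releasePartition(String descriptor) {
            // A real implementation would tell the remote cluster to delete data.
        }
    }
}
```

The asynchronous registration result stands in for the shuffle-resource application the description mentions; the release call stands in for shuffle-data lifecycle management.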
[GitHub] [flink] flinkbot edited a comment on pull request #16434: [FLINK-23107][table-runtime] Separate implementation of deduplicate r…
flinkbot edited a comment on pull request #16434: URL: https://github.com/apache/flink/pull/16434#issuecomment-876881346 ## CI report: * e1f1b3df1ae855ffd1be6dded31e24bdb0b1f9db Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20204) * 122a5ecbc5d48497663dee76e0d915e198aa363e UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16432: [FLINK-23233][runtime] Failing checkpoints before failover for failed events in OperatorCoordinator
flinkbot edited a comment on pull request #16432: URL: https://github.com/apache/flink/pull/16432#issuecomment-876475935 ## CI report: * 473db5d7954ebc197a64bb49d3b91d5c09d2c595 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20203) * 1b3a0d615d4f69eafd023cb3a7dd6cd70e3d6673 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16420: [FLINK-23290][table-runtime] Fix NullPointerException when filter contains casting to boolean
flinkbot edited a comment on pull request #16420: URL: https://github.com/apache/flink/pull/16420#issuecomment-876213306 ## CI report: * cc63e6c45a73722102e26cf51e1b9c4b11554336 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20150) * 1fea76ef80b83b701c952e5b1d86a763c6a38724 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16349: [FLINK-23267][table-planner-blink] Enable Java code splitting for all generated classes
flinkbot edited a comment on pull request #16349: URL: https://github.com/apache/flink/pull/16349#issuecomment-872790415 ## CI report: * 0d0cd1ddbc25b0f960ed0864c02a0bf21c0867c1 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20165) * 4fc82d1d8342b1f69ca6720424257a928c4609ff UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #16435: [FLINK-23184][table-runtime-blink] Fix compile error in code generation of unary plus and minus
flinkbot commented on pull request #16435: URL: https://github.com/apache/flink/pull/16435#issuecomment-876890574 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 3032756fb594f7e8ca62c59c896b933542453659 (Fri Jul 09 03:45:35 UTC 2021) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-23323) HAQueryableStateRocksDBBackendITCase failed due to heap OOM
Xintong Song created FLINK-23323: Summary: HAQueryableStateRocksDBBackendITCase failed due to heap OOM Key: FLINK-23323 URL: https://issues.apache.org/jira/browse/FLINK-23323 Project: Flink Issue Type: Bug Components: Runtime / Queryable State Affects Versions: 1.14.0 Reporter: Xintong Song

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20195=logs=c91190b6-40ae-57b2-5999-31b869b0a7c1=43529380-51b4-5e90-5af4-2dccec0ef402=14431

{code}
Jul 08 21:43:22 [ERROR] Tests run: 12, Failures: 0, Errors: 9, Skipped: 1, Time elapsed: 246.345 s <<< FAILURE! - in org.apache.flink.queryablestate.itcases.HAQueryableStateRocksDBBackendITCase
Jul 08 21:43:22 [ERROR] testReducingState(org.apache.flink.queryablestate.itcases.HAQueryableStateRocksDBBackendITCase) Time elapsed: 241.454 s <<< ERROR!
Jul 08 21:43:22 java.lang.OutOfMemoryError: Java heap space
{code}

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23318) AkkaRpcActorTest#testOnStopFutureCompletionDirectlyTerminatesAkkaRpcActor fails on azure
[ https://issues.apache.org/jira/browse/FLINK-23318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377764#comment-17377764 ] Xintong Song commented on FLINK-23318: -- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20195=logs=f0ac5c25-1168-55a5-07ff-0e88223afed9=0dbaca5d-7c38-52e6-f4fe-2fb69ccb3ada=7356 > AkkaRpcActorTest#testOnStopFutureCompletionDirectlyTerminatesAkkaRpcActor > fails on azure > > > Key: FLINK-23318 > URL: https://issues.apache.org/jira/browse/FLINK-23318 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.14.0 >Reporter: Dawid Wysakowicz >Priority: Major > Labels: test-stability > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20163=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=5360d54c-8d94-5d85-304e-a89267eb785a=6023 > {code} > Jul 08 11:03:13 java.lang.AssertionError: > Jul 08 11:03:13 > Jul 08 11:03:13 Expected: is > Jul 08 11:03:13 but: was > Jul 08 11:03:13 at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > Jul 08 11:03:13 at org.junit.Assert.assertThat(Assert.java:964) > Jul 08 11:03:13 at org.junit.Assert.assertThat(Assert.java:930) > Jul 08 11:03:13 at > org.apache.flink.runtime.rpc.akka.AkkaRpcActorTest.testOnStopFutureCompletionDirectlyTerminatesAkkaRpcActor(AkkaRpcActorTest.java:375) > Jul 08 11:03:13 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > Jul 08 11:03:13 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > Jul 08 11:03:13 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > Jul 08 11:03:13 at java.lang.reflect.Method.invoke(Method.java:498) > Jul 08 11:03:13 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > Jul 08 11:03:13 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > Jul 08 11:03:13 at > 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > Jul 08 11:03:13 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > Jul 08 11:03:13 at > org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) > Jul 08 11:03:13 at > org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > Jul 08 11:03:13 at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > Jul 08 11:03:13 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > Jul 08 11:03:13 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > Jul 08 11:03:13 at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > Jul 08 11:03:13 at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner.run(ParentRunner.java:413) > Jul 08 11:03:13 at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > Jul 08 11:03:13 at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > Jul 08 11:03:13 at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > Jul 
08 11:03:13 at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > Jul 08 11:03:13 at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > Jul 08 11:03:13 at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > Jul 08 11:03:13 at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > Jul 08 11:03:13 at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > Jul 08
[jira] [Commented] (FLINK-23233) OperatorEventSendingCheckpointITCase.testOperatorEventLostWithReaderFailure fails on azure
[ https://issues.apache.org/jira/browse/FLINK-23233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377763#comment-17377763 ] Xintong Song commented on FLINK-23233: -- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20195=logs=219e462f-e75e-506c-3671-5017d866ccf6=4c5dc768-5c82-5ab0-660d-086cb90b76a0=5134 > OperatorEventSendingCheckpointITCase.testOperatorEventLostWithReaderFailure > fails on azure > -- > > Key: FLINK-23233 > URL: https://issues.apache.org/jira/browse/FLINK-23233 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.14.0, 1.12.3, 1.13.1 >Reporter: Xintong Song >Assignee: Yun Gao >Priority: Blocker > Labels: pull-request-available > Fix For: 1.14.0, 1.12.5, 1.13.2 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=19857=logs=4d4a0d10-fca2-5507-8eed-c07f0bdf4887=c2734c79-73b6-521c-e85a-67c7ecae9107=9382 > {code} > Jul 03 01:37:31 [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, > Time elapsed: 21.415 s <<< FAILURE! - in > org.apache.flink.runtime.operators.coordination.OperatorEventSendingCheckpointITCase > Jul 03 01:37:31 [ERROR] > testOperatorEventLostWithReaderFailure(org.apache.flink.runtime.operators.coordination.OperatorEventSendingCheckpointITCase) > Time elapsed: 3.623 s <<< FAILURE! 
> Jul 03 01:37:31 java.lang.AssertionError: expected:<[1, 2, 3, 4, 5, 6, 7, 8, > 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, > 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, > 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, > 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, > 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100]> but > was:<[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, > 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, > 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, > 59, 60, 61, 62, 63, 64, 65, 66, 67]> > Jul 03 01:37:31 at org.junit.Assert.fail(Assert.java:88) > Jul 03 01:37:31 at org.junit.Assert.failNotEquals(Assert.java:834) > Jul 03 01:37:31 at org.junit.Assert.assertEquals(Assert.java:118) > Jul 03 01:37:31 at org.junit.Assert.assertEquals(Assert.java:144) > Jul 03 01:37:31 at > org.apache.flink.runtime.operators.coordination.OperatorEventSendingCheckpointITCase.runTest(OperatorEventSendingCheckpointITCase.java:254) > Jul 03 01:37:31 at > org.apache.flink.runtime.operators.coordination.OperatorEventSendingCheckpointITCase.testOperatorEventLostWithReaderFailure(OperatorEventSendingCheckpointITCase.java:143) > Jul 03 01:37:31 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > Jul 03 01:37:31 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > Jul 03 01:37:31 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > Jul 03 01:37:31 at java.lang.reflect.Method.invoke(Method.java:498) > Jul 03 01:37:31 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > Jul 03 01:37:31 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > Jul 03 01:37:31 at > 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > Jul 03 01:37:31 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > Jul 03 01:37:31 at > org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) > Jul 03 01:37:31 at > org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > Jul 03 01:37:31 at org.junit.rules.RunRules.evaluate(RunRules.java:20) > Jul 03 01:37:31 at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > Jul 03 01:37:31 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > Jul 03 01:37:31 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] tsreaper opened a new pull request #16435: [FLINK-23184][table-runtime-blink] Fix compile error in code generation of unary plus and minus
tsreaper opened a new pull request #16435: URL: https://github.com/apache/flink/pull/16435 (This commit is cherry-picked from #16406)

## What is the purpose of the change

Currently, code generation of unary plus and minus does not cast the result term to its desired type. This causes a compile error if the desired type is `short` or `byte`. Consider the following Java code:

```java
short a = 1;
short b = -a;
```

This code does not compile, as `-a` is of type `int`. This PR fixes the issue by casting the result term.

## Brief change log

- Fix compile error in code generation of unary plus and minus

## Verifying this change

This change added tests and can be verified by running the added tests.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no

## Documentation

- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
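The promotion rule behind this bug can be checked directly: Java's unary numeric promotion widens a `short` (or `byte`) operand of unary minus to `int`, so assigning the result back to the narrow type requires an explicit cast, which is exactly what the generated code was missing. A minimal standalone sketch (the class and variable names are illustrative, not from the Flink codebase):

```java
public class UnaryMinusPromotion {
    public static void main(String[] args) {
        short a = 1;
        // short b = -a;        // javac error: -a is promoted to int, which
        //                      // cannot be assigned to short without a cast
        short b = (short) -a;   // explicit cast back to the narrow type compiles
        System.out.println(b);  // prints -1
    }
}
```

The fix in the PR applies the same idea inside the code generator: emit a cast to the desired result type around the unary expression.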
[jira] [Commented] (FLINK-23318) AkkaRpcActorTest#testOnStopFutureCompletionDirectlyTerminatesAkkaRpcActor fails on azure
[ https://issues.apache.org/jira/browse/FLINK-23318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377760#comment-17377760 ] Xintong Song commented on FLINK-23318: -- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20195=logs=d8d26c26-7ec2-5ed2-772e-7a1a1eb8317c=be5fb08e-1ad7-563c-4f1a-a97ad4ce4865=5993
[jira] [Commented] (FLINK-23318) AkkaRpcActorTest#testOnStopFutureCompletionDirectlyTerminatesAkkaRpcActor fails on azure
[ https://issues.apache.org/jira/browse/FLINK-23318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377762#comment-17377762 ] Xintong Song commented on FLINK-23318: -- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20195=logs=3b6ec2fd-a816-5e75-c775-06fb87cb6670=2aff8966-346f-518f-e6ce-de64002a5034=6874
[jira] [Commented] (FLINK-23318) AkkaRpcActorTest#testOnStopFutureCompletionDirectlyTerminatesAkkaRpcActor fails on azure
[ https://issues.apache.org/jira/browse/FLINK-23318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377761#comment-17377761 ] Xintong Song commented on FLINK-23318: -- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20195=logs=d89de3df-4600-5585-dadc-9bbc9a5e661c=19336553-69ec-5b03-471a-791a483cced6=6159
[jira] [Commented] (FLINK-22085) KafkaSourceLegacyITCase hangs/fails on azure
[ https://issues.apache.org/jira/browse/FLINK-22085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377758#comment-17377758 ] Xintong Song commented on FLINK-22085: -- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20195=logs=72d4811f-9f0d-5fd0-014a-0bc26b72b642=c1d93a6a-ba91-515d-3196-2ee8019fbda7=6529 > KafkaSourceLegacyITCase hangs/fails on azure > > > Key: FLINK-22085 > URL: https://issues.apache.org/jira/browse/FLINK-22085 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.13.0, 1.14.0 >Reporter: Dawid Wysakowicz >Assignee: Yun Gao >Priority: Critical > Labels: pull-request-available, test-stability > Fix For: 1.14.0 > > > 1) Observations > a) The Azure pipeline would occasionally hang without printing any test error > information. > [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15939=logs=c5f0071e-1851-543e-9a45-9ac140befc32=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5=8219] > b) By running the test KafkaSourceLegacyITCase::testBrokerFailure() with INFO > level logging, the the test would hang with the following error message > printed repeatedly: > {code:java} > 20451 [New I/O boss #50] ERROR > org.apache.flink.networking.NetworkFailureHandler [] - Closing communication > channel because of an exception > java.net.ConnectException: Connection refused: localhost/127.0.0.1:50073 > at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) > ~[?:1.8.0_151] > at > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) > ~[?:1.8.0_151] > at > org.apache.flink.shaded.testutils.org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152) > ~[flink-test-utils_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > at > org.apache.flink.shaded.testutils.org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105) > [flink-test-utils_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > at > 
org.apache.flink.shaded.testutils.org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79) > [flink-test-utils_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > at > org.apache.flink.shaded.testutils.org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) > [flink-test-utils_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > at > org.apache.flink.shaded.testutils.org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42) > [flink-test-utils_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > at > org.apache.flink.shaded.testutils.org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) > [flink-test-utils_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > at > org.apache.flink.shaded.testutils.org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) > [flink-test-utils_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [?:1.8.0_151] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [?:1.8.0_151] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_151] > {code} > *2) Root cause explanations* > The test would hang because it enters the following loop: > - closeOnFlush() is called for a given channel > - closeOnFlush() calls channel.write(..) > - channel.write() triggers the exceptionCaught(...) callback > - closeOnFlush() is called for the same channel again. > *3) Solution* > Update closeOnFlush() so that, if a channel is being closed by this method, > then closeOnFlush() would not try to write to this channel if it is called on > this channel again. -- This message was sent by Atlassian Jira (v8.3.4#803005)
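The solution in (3) amounts to a re-entrancy guard: remember which channels are already being closed and skip the write for those, so the write → exceptionCaught → closeOnFlush cycle cannot restart. A hedged, self-contained sketch of that idea; `CloseOnFlushGuard`, `writeAttempts`, and the simulated `exceptionCaught` wiring are illustrative stand-ins, not the actual netty/flink-test-utils API:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the re-entrancy guard described in the fix; the channel is modeled
// as a plain String and the failing write is simulated synchronously.
class CloseOnFlushGuard {
    private final Set<String> closingChannels = new HashSet<>();
    int writeAttempts = 0; // counts how many times we actually write to a channel

    void closeOnFlush(String channel) {
        // If this channel is already being closed, do not write to it again:
        // another write would re-trigger exceptionCaught -> closeOnFlush -> ...
        if (!closingChannels.add(channel)) {
            return;
        }
        try {
            writeAttempts++;
            // Simulate the failure mode from the report: the write fails and
            // the exception handler calls closeOnFlush on the same channel.
            exceptionCaught(channel);
        } finally {
            closingChannels.remove(channel);
        }
    }

    void exceptionCaught(String channel) {
        closeOnFlush(channel);
    }
}

public class GuardDemo {
    public static void main(String[] args) {
        CloseOnFlushGuard handler = new CloseOnFlushGuard();
        handler.closeOnFlush("localhost:50073"); // terminates instead of looping
        System.out.println(handler.writeAttempts); // prints 1
    }
}
```

Without the `closingChannels.add(channel)` check, the simulated callback would recurse until the stack overflows, which mirrors the endless "Closing communication channel because of an exception" loop in the log above.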
[jira] [Commented] (FLINK-23318) AkkaRpcActorTest#testOnStopFutureCompletionDirectlyTerminatesAkkaRpcActor fails on azure
[ https://issues.apache.org/jira/browse/FLINK-23318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377755#comment-17377755 ] Xintong Song commented on FLINK-23318: -- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20195=logs=77a9d8e1-d610-59b3-fc2a-4766541e0e33=7c61167f-30b3-5893-cc38-a9e3d057e392=7388
[jira] [Commented] (FLINK-23318) AkkaRpcActorTest#testOnStopFutureCompletionDirectlyTerminatesAkkaRpcActor fails on azure
[ https://issues.apache.org/jira/browse/FLINK-23318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377756#comment-17377756 ] Xintong Song commented on FLINK-23318: -- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20195=logs=4d4a0d10-fca2-5507-8eed-c07f0bdf4887=c2734c79-73b6-521c-e85a-67c7ecae9107=6024 > AkkaRpcActorTest#testOnStopFutureCompletionDirectlyTerminatesAkkaRpcActor > fails on azure > > > Key: FLINK-23318 > URL: https://issues.apache.org/jira/browse/FLINK-23318 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.14.0 >Reporter: Dawid Wysakowicz >Priority: Major > Labels: test-stability > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20163=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=5360d54c-8d94-5d85-304e-a89267eb785a=6023 > {code} > Jul 08 11:03:13 java.lang.AssertionError: > Jul 08 11:03:13 > Jul 08 11:03:13 Expected: is > Jul 08 11:03:13 but: was > Jul 08 11:03:13 at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > Jul 08 11:03:13 at org.junit.Assert.assertThat(Assert.java:964) > Jul 08 11:03:13 at org.junit.Assert.assertThat(Assert.java:930) > Jul 08 11:03:13 at > org.apache.flink.runtime.rpc.akka.AkkaRpcActorTest.testOnStopFutureCompletionDirectlyTerminatesAkkaRpcActor(AkkaRpcActorTest.java:375) > Jul 08 11:03:13 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > Jul 08 11:03:13 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > Jul 08 11:03:13 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > Jul 08 11:03:13 at java.lang.reflect.Method.invoke(Method.java:498) > Jul 08 11:03:13 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > Jul 08 11:03:13 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > Jul 08 11:03:13 at > 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > Jul 08 11:03:13 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > Jul 08 11:03:13 at > org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) > Jul 08 11:03:13 at > org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > Jul 08 11:03:13 at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > Jul 08 11:03:13 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > Jul 08 11:03:13 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > Jul 08 11:03:13 at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > Jul 08 11:03:13 at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner.run(ParentRunner.java:413) > Jul 08 11:03:13 at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > Jul 08 11:03:13 at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > Jul 08 11:03:13 at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > Jul 
08 11:03:13 at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > Jul 08 11:03:13 at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > Jul 08 11:03:13 at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > Jul 08 11:03:13 at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > Jul 08 11:03:13 at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > Jul 08
[GitHub] [flink] gaoyunhaii commented on pull request #16385: [FLINK-23080][datastream] Introduce SinkFunction#finish() method.
gaoyunhaii commented on pull request #16385: URL: https://github.com/apache/flink/pull/16385#issuecomment-876885109 No problem, I'll have a look~
[jira] [Commented] (FLINK-20431) KafkaSourceReaderTest.testCommitOffsetsWithoutAliveFetchers:133->lambda$testCommitOffsetsWithoutAliveFetchers$3:134 expected:<10> but was:<1>
[ https://issues.apache.org/jira/browse/FLINK-20431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377753#comment-17377753 ] Xintong Song commented on FLINK-20431: -- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20195=logs=c5f0071e-1851-543e-9a45-9ac140befc32=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5=6471 > KafkaSourceReaderTest.testCommitOffsetsWithoutAliveFetchers:133->lambda$testCommitOffsetsWithoutAliveFetchers$3:134 > expected:<10> but was:<1> > - > > Key: FLINK-20431 > URL: https://issues.apache.org/jira/browse/FLINK-20431 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.12.2, 1.13.0 >Reporter: Huang Xingbo >Priority: Major > Labels: auto-deprioritized-critical, auto-unassigned, > pull-request-available, test-stability > Fix For: 1.13.2, 1.12.6 > > > [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10351=logs=c5f0071e-1851-543e-9a45-9ac140befc32=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5] > [ERROR] Failures: > [ERROR] > KafkaSourceReaderTest.testCommitOffsetsWithoutAliveFetchers:133->lambda$testCommitOffsetsWithoutAliveFetchers$3:134 > expected:<10> but was:<1> > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20431) KafkaSourceReaderTest.testCommitOffsetsWithoutAliveFetchers:133->lambda$testCommitOffsetsWithoutAliveFetchers$3:134 expected:<10> but was:<1>
[ https://issues.apache.org/jira/browse/FLINK-20431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20431: - Labels: auto-unassigned pull-request-available test-stability (was: auto-deprioritized-critical auto-unassigned pull-request-available test-stability) > KafkaSourceReaderTest.testCommitOffsetsWithoutAliveFetchers:133->lambda$testCommitOffsetsWithoutAliveFetchers$3:134 > expected:<10> but was:<1> > - > > Key: FLINK-20431 > URL: https://issues.apache.org/jira/browse/FLINK-20431 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.12.2, 1.13.0 >Reporter: Huang Xingbo >Priority: Major > Labels: auto-unassigned, pull-request-available, test-stability > Fix For: 1.13.2, 1.12.6 > > > [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10351=logs=c5f0071e-1851-543e-9a45-9ac140befc32=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5] > [ERROR] Failures: > [ERROR] > KafkaSourceReaderTest.testCommitOffsetsWithoutAliveFetchers:133->lambda$testCommitOffsetsWithoutAliveFetchers$3:134 > expected:<10> but was:<1> > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] gaoyunhaii commented on pull request #16432: [FLINK-23233][runtime] Failing checkpoints before failover for failed events in OperatorCoordinator
gaoyunhaii commented on pull request #16432: URL: https://github.com/apache/flink/pull/16432#issuecomment-876884430 Hi Till, thanks a lot for the review! I have addressed the inline comments, and as for the other possible option, I think it might cause a deadlock.
[GitHub] [flink] cshuo commented on pull request #16434: [FLINK-23107][table-runtime] Separate implementation of deduplicate r…
cshuo commented on pull request #16434: URL: https://github.com/apache/flink/pull/16434#issuecomment-876884044 cc @JingsongLi plz have a look
[jira] [Commented] (FLINK-22889) JdbcExactlyOnceSinkE2eTest hangs on azure
[ https://issues.apache.org/jira/browse/FLINK-22889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377750#comment-17377750 ] Xintong Song commented on FLINK-22889: -- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20197=logs=ba53eb01-1462-56a3-8e98-0dd97fbcaab5=bfbc6239-57a0-5db0-63f3-41551b4f7d51=15222 > JdbcExactlyOnceSinkE2eTest hangs on azure > - > > Key: FLINK-22889 > URL: https://issues.apache.org/jira/browse/FLINK-22889 > Project: Flink > Issue Type: Bug > Components: Connectors / JDBC >Affects Versions: 1.14.0, 1.13.1 >Reporter: Dawid Wysakowicz >Assignee: Roman Khachatryan >Priority: Critical > Labels: pull-request-available, test-stability > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18690=logs=ba53eb01-1462-56a3-8e98-0dd97fbcaab5=bfbc6239-57a0-5db0-63f3-41551b4f7d51=16658 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] gaoyunhaii commented on a change in pull request #16432: [FLINK-23233][runtime] Failing checkpoints before failover for failed events in OperatorCoordinator
gaoyunhaii commented on a change in pull request #16432: URL: https://github.com/apache/flink/pull/16432#discussion_r44562 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/operators/coordination/OperatorCoordinatorHolderTest.java ## @@ -446,6 +525,27 @@ public OperatorCoordinator create(OperatorCoordinator.Context context) { return holder; } +private static class ReorderableManualExecutorService +extends ManuallyTriggeredScheduledExecutorService { + +private boolean pendingNewRunnables; + +private final Queue pendingRunnables = new ArrayDeque<>(); + +public void setPendingNewRunnables(boolean pendingNewRunnables) { Review comment: Perhaps we could change it to something like `delayNewRunnables`, and in the test eventually execute these `Runnables`?
[jira] [Commented] (FLINK-22387) UpsertKafkaTableITCase hangs when setting up kafka
[ https://issues.apache.org/jira/browse/FLINK-22387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377749#comment-17377749 ] Xintong Song commented on FLINK-22387: -- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20197=logs=72d4811f-9f0d-5fd0-014a-0bc26b72b642=c1d93a6a-ba91-515d-3196-2ee8019fbda7=6857 > UpsertKafkaTableITCase hangs when setting up kafka > -- > > Key: FLINK-22387 > URL: https://issues.apache.org/jira/browse/FLINK-22387 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka, Table SQL / Ecosystem >Affects Versions: 1.14.0, 1.13.1, 1.12.4 >Reporter: Dawid Wysakowicz >Priority: Major > Labels: auto-deprioritized-critical, test-stability > Fix For: 1.14.0, 1.13.2, 1.12.6 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=16901=logs=c5f0071e-1851-543e-9a45-9ac140befc32=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5=6932 > {code} > 2021-04-20T20:01:32.2276988Z Apr 20 20:01:32 "main" #1 prio=5 os_prio=0 > tid=0x7fe87400b000 nid=0x4028 runnable [0x7fe87df22000] > 2021-04-20T20:01:32.2277666Z Apr 20 20:01:32java.lang.Thread.State: > RUNNABLE > 2021-04-20T20:01:32.2278338Z Apr 20 20:01:32 at > org.testcontainers.shaded.okio.Buffer.getByte(Buffer.java:312) > 2021-04-20T20:01:32.2279325Z Apr 20 20:01:32 at > org.testcontainers.shaded.okio.RealBufferedSource.readHexadecimalUnsignedLong(RealBufferedSource.java:310) > 2021-04-20T20:01:32.2280656Z Apr 20 20:01:32 at > org.testcontainers.shaded.okhttp3.internal.http1.Http1ExchangeCodec$ChunkedSource.readChunkSize(Http1ExchangeCodec.java:492) > 2021-04-20T20:01:32.2281603Z Apr 20 20:01:32 at > org.testcontainers.shaded.okhttp3.internal.http1.Http1ExchangeCodec$ChunkedSource.read(Http1ExchangeCodec.java:471) > 2021-04-20T20:01:32.2282163Z Apr 20 20:01:32 at > org.testcontainers.shaded.okhttp3.internal.Util.skipAll(Util.java:204) > 2021-04-20T20:01:32.2282870Z Apr 20 20:01:32 at > 
org.testcontainers.shaded.okhttp3.internal.Util.discard(Util.java:186) > 2021-04-20T20:01:32.2283494Z Apr 20 20:01:32 at > org.testcontainers.shaded.okhttp3.internal.http1.Http1ExchangeCodec$ChunkedSource.close(Http1ExchangeCodec.java:511) > 2021-04-20T20:01:32.2284460Z Apr 20 20:01:32 at > org.testcontainers.shaded.okio.ForwardingSource.close(ForwardingSource.java:43) > 2021-04-20T20:01:32.2285183Z Apr 20 20:01:32 at > org.testcontainers.shaded.okhttp3.internal.connection.Exchange$ResponseBodySource.close(Exchange.java:313) > 2021-04-20T20:01:32.2285756Z Apr 20 20:01:32 at > org.testcontainers.shaded.okio.RealBufferedSource.close(RealBufferedSource.java:476) > 2021-04-20T20:01:32.2286287Z Apr 20 20:01:32 at > org.testcontainers.shaded.okhttp3.internal.Util.closeQuietly(Util.java:139) > 2021-04-20T20:01:32.2286795Z Apr 20 20:01:32 at > org.testcontainers.shaded.okhttp3.ResponseBody.close(ResponseBody.java:192) > 2021-04-20T20:01:32.2287270Z Apr 20 20:01:32 at > org.testcontainers.shaded.okhttp3.Response.close(Response.java:290) > 2021-04-20T20:01:32.2287913Z Apr 20 20:01:32 at > org.testcontainers.shaded.com.github.dockerjava.okhttp.OkDockerHttpClient$OkResponse.close(OkDockerHttpClient.java:285) > 2021-04-20T20:01:32.2288606Z Apr 20 20:01:32 at > org.testcontainers.shaded.com.github.dockerjava.core.DefaultInvocationBuilder.lambda$null$0(DefaultInvocationBuilder.java:272) > 2021-04-20T20:01:32.2289295Z Apr 20 20:01:32 at > org.testcontainers.shaded.com.github.dockerjava.core.DefaultInvocationBuilder$$Lambda$340/2058508175.close(Unknown > Source) > 2021-04-20T20:01:32.2289886Z Apr 20 20:01:32 at > com.github.dockerjava.api.async.ResultCallbackTemplate.close(ResultCallbackTemplate.java:77) > 2021-04-20T20:01:32.2290567Z Apr 20 20:01:32 at > org.testcontainers.utility.ResourceReaper.start(ResourceReaper.java:202) > 2021-04-20T20:01:32.2291051Z Apr 20 20:01:32 at > org.testcontainers.DockerClientFactory.client(DockerClientFactory.java:205) > 
2021-04-20T20:01:32.2291879Z Apr 20 20:01:32 - locked <0xe9cd50f8> > (a [Ljava.lang.Object;) > 2021-04-20T20:01:32.2292313Z Apr 20 20:01:32 at > org.testcontainers.LazyDockerClient.getDockerClient(LazyDockerClient.java:14) > 2021-04-20T20:01:32.2292870Z Apr 20 20:01:32 at > org.testcontainers.LazyDockerClient.authConfig(LazyDockerClient.java:12) > 2021-04-20T20:01:32.2293383Z Apr 20 20:01:32 at > org.testcontainers.containers.GenericContainer.start(GenericContainer.java:310) > 2021-04-20T20:01:32.2293890Z Apr 20 20:01:32 at > org.testcontainers.containers.GenericContainer.starting(GenericContainer.java:1029) > 2021-04-20T20:01:32.2294578Z Apr 20 20:01:32 at > org.testcontainers.containers.FailureDetectingExternalResource$1.evaluate(FailureDetectingExternalResource.java:29) >
[GitHub] [flink] gaoyunhaii commented on a change in pull request #16432: [FLINK-23233][runtime] Failing checkpoints before failover for failed events in OperatorCoordinator
gaoyunhaii commented on a change in pull request #16432: URL: https://github.com/apache/flink/pull/16432#discussion_r44248 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/operators/coordination/OperatorCoordinatorHolderTest.java ## @@ -446,6 +525,27 @@ public OperatorCoordinator create(OperatorCoordinator.Context context) { return holder; } +private static class ReorderableManualExecutorService +extends ManuallyTriggeredScheduledExecutorService { + +private boolean pendingNewRunnables; + +private final Queue pendingRunnables = new ArrayDeque<>(); + +public void setPendingNewRunnables(boolean pendingNewRunnables) { +this.pendingNewRunnables = pendingNewRunnables; +} + +@Override +public void execute(@Nonnull Runnable command) { +if (pendingNewRunnables) { +pendingRunnables.add(command); Review comment: In the real case these `Runnables` would eventually be executed, and executing them here would not change the result, since we are testing the logic before they run. I could also execute these `Runnables` at the end of the test to be more consistent with the real case.
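The executor behavior discussed in this review thread — queueing newly submitted runnables while a flag is set and executing them later — can be sketched outside of Flink's test utilities. The names below (`DelayingExecutor`, `flushPending`) are illustrative assumptions, not the actual `ReorderableManualExecutorService` API:

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Executor;

/** Hypothetical sketch of the "delay new runnables" test-executor pattern. */
public class DelayingExecutorDemo {

    /** While delaying is enabled, submitted tasks are queued; flushPending() runs them in order. */
    static class DelayingExecutor implements Executor {
        private boolean delayNewRunnables;
        private final Queue<Runnable> pendingRunnables = new ArrayDeque<>();

        void setDelayNewRunnables(boolean delay) {
            this.delayNewRunnables = delay;
        }

        @Override
        public void execute(Runnable command) {
            if (delayNewRunnables) {
                pendingRunnables.add(command); // hold the task until explicitly flushed
            } else {
                command.run(); // direct execution, as in a manually triggered test executor
            }
        }

        /** Eventually executes all delayed tasks in submission order. */
        void flushPending() {
            Runnable next;
            while ((next = pendingRunnables.poll()) != null) {
                next.run();
            }
        }
    }

    public static void main(String[] args) {
        DelayingExecutor executor = new DelayingExecutor();
        StringBuilder out = new StringBuilder();
        executor.setDelayNewRunnables(true);
        executor.execute(() -> out.append("checkpoint")); // queued, not run yet
        System.out.println("before flush: '" + out + "'");
        executor.flushPending(); // now the delayed task runs
        System.out.println("after flush: '" + out + "'");
    }
}
```

This mirrors the scenario in the test: the delayed runnable stands for the failover-triggering stage, and the code exercised between `execute` and `flushPending` is the logic under test.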
[jira] [Commented] (FLINK-22457) KafkaSourceLegacyITCase.testMultipleSourcesOnePartition fails because of timeout
[ https://issues.apache.org/jira/browse/FLINK-22457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377747#comment-17377747 ] Xintong Song commented on FLINK-22457: -- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20197=logs=c5f0071e-1851-543e-9a45-9ac140befc32=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5=6872 > KafkaSourceLegacyITCase.testMultipleSourcesOnePartition fails because of > timeout > > > Key: FLINK-22457 > URL: https://issues.apache.org/jira/browse/FLINK-22457 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.13.0, 1.14.0 >Reporter: Guowei Ma >Priority: Major > Labels: test-stability > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=17140=logs=1fc6e7bf-633c-5081-c32a-9dea24b05730=80a658d1-f7f6-5d93-2758-53ac19fd5b19=7045 > {code:java} > Apr 24 23:47:33 [ERROR] Tests run: 21, Failures: 0, Errors: 1, Skipped: 0, > Time elapsed: 174.335 s <<< FAILURE! - in > org.apache.flink.connector.kafka.source.KafkaSourceLegacyITCase > Apr 24 23:47:33 [ERROR] > testMultipleSourcesOnePartition(org.apache.flink.connector.kafka.source.KafkaSourceLegacyITCase) > Time elapsed: 60.019 s <<< ERROR! 
> Apr 24 23:47:33 org.junit.runners.model.TestTimedOutException: test timed out > after 6 milliseconds > Apr 24 23:47:33 at sun.misc.Unsafe.park(Native Method) > Apr 24 23:47:33 at > java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > Apr 24 23:47:33 at > java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707) > Apr 24 23:47:33 at > java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323) > Apr 24 23:47:33 at > java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742) > Apr 24 23:47:33 at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) > Apr 24 23:47:33 at > org.apache.flink.test.util.TestUtils.tryExecute(TestUtils.java:49) > Apr 24 23:47:33 at > org.apache.flink.streaming.connectors.kafka.KafkaConsumerTestBase.runMultipleSourcesOnePartitionExactlyOnceTest(KafkaConsumerTestBase.java:1112) > Apr 24 23:47:33 at > org.apache.flink.connector.kafka.source.KafkaSourceLegacyITCase.testMultipleSourcesOnePartition(KafkaSourceLegacyITCase.java:87) > Apr 24 23:47:33 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > Apr 24 23:47:33 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > Apr 24 23:47:33 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > Apr 24 23:47:33 at java.lang.reflect.Method.invoke(Method.java:498) > Apr 24 23:47:33 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > Apr 24 23:47:33 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > Apr 24 23:47:33 at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > Apr 24 23:47:33 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > Apr 24 23:47:33 at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > Apr 24 23:47:33 at > 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > Apr 24 23:47:33 at > java.util.concurrent.FutureTask.run(FutureTask.java:266) > Apr 24 23:47:33 at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot commented on pull request #16434: [FLINK-23107][table-runtime] Separate implementation of deduplicate r…
flinkbot commented on pull request #16434: URL: https://github.com/apache/flink/pull/16434#issuecomment-876881346 ## CI report: * e1f1b3df1ae855ffd1be6dded31e24bdb0b1f9db UNKNOWN
[GitHub] [flink] flinkbot edited a comment on pull request #16432: [FLINK-23233][runtime] Failing checkpoints before failover for failed events in OperatorCoordinator
flinkbot edited a comment on pull request #16432: URL: https://github.com/apache/flink/pull/16432#issuecomment-876475935 ## CI report: * fa6a87a6d47d79d400ef4964b168e8a767c8bad0 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20190) * 473db5d7954ebc197a64bb49d3b91d5c09d2c595 UNKNOWN
[GitHub] [flink] cshuo commented on a change in pull request #16434: [FLINK-23107][table-runtime] Separate implementation of deduplicate r…
cshuo commented on a change in pull request #16434: URL: https://github.com/apache/flink/pull/16434#discussion_r40410 ## File path: flink-table/flink-table-runtime/src/test/java/org/apache/flink/table/runtime/operators/rank/FastTop1FunctionTest.java ## @@ -0,0 +1,181 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.runtime.operators.rank; + +import org.apache.flink.runtime.checkpoint.OperatorSubtaskState; +import org.apache.flink.streaming.util.OneInputStreamOperatorTestHarness; +import org.apache.flink.table.data.RowData; + +import java.util.ArrayList; +import java.util.List; + +import static org.apache.flink.table.runtime.util.StreamRecordUtils.insertRecord; +import static org.apache.flink.table.runtime.util.StreamRecordUtils.updateAfterRecord; +import static org.apache.flink.table.runtime.util.StreamRecordUtils.updateBeforeRecord; + +/** Tests for {@link FastTop1Function}. */ Review comment: ...will add some more tests with UPDATE inputs -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[GitHub] [flink] gaoyunhaii commented on a change in pull request #16432: [FLINK-23233][runtime] Failing checkpoints before failover for failed events in OperatorCoordinator
gaoyunhaii commented on a change in pull request #16432: URL: https://github.com/apache/flink/pull/16432#discussion_r40109 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/operators/coordination/OperatorCoordinatorHolderTest.java ## @@ -370,6 +373,82 @@ private void checkpointEventValueAtomicity( } } +@Test +public void testFailingCheckpointsIfSendingEventFailed() throws Exception { +CompletableFuture eventSendingResult = new CompletableFuture<>(); +final EventReceivingTasks tasks = + EventReceivingTasks.createForRunningTasksWithRpcResult(eventSendingResult); +final OperatorCoordinatorHolder holder = +createCoordinatorHolder(tasks, TestingOperatorCoordinator::new); + +// Send one event without finishing it. +getCoordinator(holder).getSubtaskGateway(0).sendEvent(new TestOperatorEvent(0)); + +// Triggers one checkpoint. +CompletableFuture checkpointResult = new CompletableFuture<>(); +holder.checkpointCoordinator(1, checkpointResult); +getCoordinator(holder).getLastTriggeredCheckpoint().complete(new byte[0]); + +// Fails the event sending. +eventSendingResult.completeExceptionally(new RuntimeException("Artificial")); + +assertTrue(eventSendingResult.isCompletedExceptionally()); +} + +@Test +public void testFailingCheckpointIfFailedEventNotProcessed() throws Exception { +final ReorderableManualExecutorService executor = new ReorderableManualExecutorService(); +final ComponentMainThreadExecutor mainThreadExecutor = +new ComponentMainThreadExecutorServiceAdapter( +(ScheduledExecutorService) executor, Thread.currentThread()); + +CompletableFuture eventSendingResult = new CompletableFuture<>(); +final EventReceivingTasks tasks = + EventReceivingTasks.createForRunningTasksWithRpcResult(eventSendingResult); + +final OperatorCoordinatorHolder holder = +createCoordinatorHolder(tasks, TestingOperatorCoordinator::new, mainThreadExecutor); + +// Send one event without finishing it. 
+getCoordinator(holder).getSubtaskGateway(0).sendEvent(new TestOperatorEvent(0)); +executor.triggerAll(); + +// Finish the event sending. This will insert one runnable that handle +// failed events to the executor. And we pending this runnable to Review comment: This should be "delay the runnable". The initial thought was to test the case where a new checkpoint is triggered before the failover-triggering stage gets executed, so we need some way to delay this stage until the checkpoint is triggered.
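The scenario this test exercises — a checkpoint that must not complete while an operator event is still in flight, and that must fail when sending the event fails — can be sketched with plain `CompletableFuture` chaining. The names here (`CheckpointGateDemo`, `checkpointAfterEvents`) are hypothetical stand-ins, not Flink's coordinator API:

```java
import java.util.concurrent.CompletableFuture;

/** Illustrative sketch of gating a checkpoint on an in-flight event send. */
public class CheckpointGateDemo {

    static CompletableFuture<byte[]> checkpointAfterEvents(
            CompletableFuture<Void> pendingEventSend, byte[] coordinatorState) {
        // Chain the checkpoint result onto the event-send acknowledgement:
        // if the send completes exceptionally, the checkpoint future fails too.
        return pendingEventSend.thenApply(ack -> coordinatorState);
    }

    public static void main(String[] args) {
        // Case 1: the event send fails, so the checkpoint future must fail.
        CompletableFuture<Void> failingSend = new CompletableFuture<>();
        CompletableFuture<byte[]> checkpoint1 = checkpointAfterEvents(failingSend, new byte[0]);
        failingSend.completeExceptionally(new RuntimeException("Artificial"));
        System.out.println("checkpoint failed: " + checkpoint1.isCompletedExceptionally());

        // Case 2: the event is acknowledged, so the checkpoint completes normally.
        CompletableFuture<Void> okSend = CompletableFuture.completedFuture(null);
        CompletableFuture<byte[]> checkpoint2 = checkpointAfterEvents(okSend, new byte[] {42});
        System.out.println("checkpoint done: " + checkpoint2.isDone());
    }
}
```

The real holder has more moving parts (main-thread executor, acknowledged-event bookkeeping), but the future-chaining shape is the same: an exceptionally completed send propagates into the checkpoint result.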
[GitHub] [flink] flinkbot commented on pull request #16434: [FLINK-23107][table-runtime] Separate implementation of deduplicate r…
flinkbot commented on pull request #16434: URL: https://github.com/apache/flink/pull/16434#issuecomment-876877559 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit e1f1b3df1ae855ffd1be6dded31e24bdb0b1f9db (Fri Jul 09 03:04:26 UTC 2021) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[jira] [Updated] (FLINK-23107) Separate deduplicate rank from rank functions
[ https://issues.apache.org/jira/browse/FLINK-23107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-23107: --- Labels: pull-request-available (was: ) > Separate deduplicate rank from rank functions > - > > Key: FLINK-23107 > URL: https://issues.apache.org/jira/browse/FLINK-23107 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Runtime >Reporter: Jingsong Lee >Assignee: Shuo Cheng >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > SELECT * FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY d ORDER BY e DESC) > AS rownum from T) WHERE rownum=1 > Actually above sql is a deduplicate rank instead of a normal rank. We should > separate the implementation for optimize the deduplicate rank and reduce bugs. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] cshuo opened a new pull request #16434: [FLINK-23107][table-runtime] Separate implementation of deduplicate r…
cshuo opened a new pull request #16434: URL: https://github.com/apache/flink/pull/16434 …ank from other rank functions ## What is the purpose of the change Consider the following SQL: ``` SELECT * FROM ( SELECT *, ROW_NUMBER() OVER (PARTITION BY d ORDER BY e DESC) AS rownum from T ) WHERE rownum = 1 ``` It's very common for users to write the above SQL to achieve deduplication of records, but the SQL cannot be translated to `Deduplicate` when the field `e` is not a time attribute field; it is then translated to `Rank` instead. The common rank implementation is currently a bit complex and error-prone. This PR aims to separate the implementation of the deduplicate rank and reduce bugs. ## Brief change log - Add a new rank function `FastTop1Function` for deduplicate rank ## Verifying this change - Add unit tests for the new rank function - Add IT case ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): yes - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
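The idea behind the PR description above can be made concrete with a small sketch. This is not the actual `FastTop1Function` from the PR; it is a hypothetical, minimal illustration of why a `rownum = 1` deduplicate rank is cheaper than a general rank: per partition key `d`, only the current top-1 ordering value `e` needs to be kept in state, rather than a full ranked list of rows.

```java
import java.util.HashMap;
import java.util.Map;

public class Main {
    // Partition key d -> best ordering value e seen so far (ORDER BY e DESC).
    // A general rank would instead need to keep an ordered collection per key.
    static final Map<String, Integer> bestByKey = new HashMap<>();

    /** Returns true if the row is the new top-1 for its key and should be emitted. */
    static boolean accept(String d, int e) {
        Integer best = bestByKey.get(d);
        if (best == null || e > best) {
            bestByKey.put(d, e);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(accept("a", 1)); // first row for key "a": emitted
        System.out.println(accept("a", 3)); // new top-1: emitted
        System.out.println(accept("a", 2)); // not the current top-1: dropped
    }
}
```

A real implementation must additionally handle retractions and state TTL, but the single-value-per-key state is the core of the optimization.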
[jira] [Commented] (FLINK-23318) AkkaRpcActorTest#testOnStopFutureCompletionDirectlyTerminatesAkkaRpcActor fails on azure
[ https://issues.apache.org/jira/browse/FLINK-23318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377734#comment-17377734 ] Caizhi Weng commented on FLINK-23318: - After binary searching through the commits I found that this issue is caused by c47d8b92499fe25405c337be68907098309fd6c9 by [~trohrmann] . For developers blocked by this issue, you can rebase your pull requests on commit 09cd685d6afcffb0d5488a1dd6a9b3b742a5661d (which is one commit behind the above commit). > AkkaRpcActorTest#testOnStopFutureCompletionDirectlyTerminatesAkkaRpcActor > fails on azure > > > Key: FLINK-23318 > URL: https://issues.apache.org/jira/browse/FLINK-23318 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.14.0 >Reporter: Dawid Wysakowicz >Priority: Major > Labels: test-stability > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20163=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=5360d54c-8d94-5d85-304e-a89267eb785a=6023 > {code} > Jul 08 11:03:13 java.lang.AssertionError: > Jul 08 11:03:13 > Jul 08 11:03:13 Expected: is > Jul 08 11:03:13 but: was > Jul 08 11:03:13 at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > Jul 08 11:03:13 at org.junit.Assert.assertThat(Assert.java:964) > Jul 08 11:03:13 at org.junit.Assert.assertThat(Assert.java:930) > Jul 08 11:03:13 at > org.apache.flink.runtime.rpc.akka.AkkaRpcActorTest.testOnStopFutureCompletionDirectlyTerminatesAkkaRpcActor(AkkaRpcActorTest.java:375) > Jul 08 11:03:13 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > Jul 08 11:03:13 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > Jul 08 11:03:13 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > Jul 08 11:03:13 at java.lang.reflect.Method.invoke(Method.java:498) > Jul 08 11:03:13 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > Jul 08 11:03:13 at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > Jul 08 11:03:13 at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > Jul 08 11:03:13 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > Jul 08 11:03:13 at > org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) > Jul 08 11:03:13 at > org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > Jul 08 11:03:13 at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > Jul 08 11:03:13 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > Jul 08 11:03:13 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > Jul 08 11:03:13 at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > Jul 08 11:03:13 at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > Jul 08 11:03:13 at > org.junit.runners.ParentRunner.run(ParentRunner.java:413) > Jul 08 11:03:13 at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > Jul 08 11:03:13 at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > Jul 08 
11:03:13 at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > Jul 08 11:03:13 at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > Jul 08 11:03:13 at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > Jul 08 11:03:13 at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > Jul 08 11:03:13 at >
[GitHub] [flink] gaoyunhaii commented on a change in pull request #16432: [FLINK-23233][runtime] Failing checkpoints before failover for failed events in OperatorCoordinator
gaoyunhaii commented on a change in pull request #16432: URL: https://github.com/apache/flink/pull/16432#discussion_r38491 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/operators/coordination/OperatorCoordinatorHolderTest.java ## @@ -370,6 +373,82 @@ private void checkpointEventValueAtomicity( } } +@Test +public void testFailingCheckpointsIfSendingEventFailed() throws Exception { +CompletableFuture eventSendingResult = new CompletableFuture<>(); +final EventReceivingTasks tasks = + EventReceivingTasks.createForRunningTasksWithRpcResult(eventSendingResult); +final OperatorCoordinatorHolder holder = +createCoordinatorHolder(tasks, TestingOperatorCoordinator::new); + +// Send one event without finishing it. +getCoordinator(holder).getSubtaskGateway(0).sendEvent(new TestOperatorEvent(0)); + +// Triggers one checkpoint. +CompletableFuture checkpointResult = new CompletableFuture<>(); +holder.checkpointCoordinator(1, checkpointResult); +getCoordinator(holder).getLastTriggeredCheckpoint().complete(new byte[0]); + +// Fails the event sending. +eventSendingResult.completeExceptionally(new RuntimeException("Artificial")); + +assertTrue(eventSendingResult.isCompletedExceptionally()); +} Review comment: Very sorry for the carelessness; this should indeed be `assertTrue(checkpointResult.isCompletedExceptionally())`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
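The corrected assertion matters because the test fails `eventSendingResult` by hand, so asserting on that future is a tautology; the behavior under test is that the failure propagates to the pending checkpoint. The sketch below is a standalone illustration of that wiring with plain `CompletableFuture`s (it is not the real `OperatorCoordinatorHolder` code):

```java
import java.util.concurrent.CompletableFuture;

public class Main {
    public static void main(String[] args) {
        CompletableFuture<Void> eventSendingResult = new CompletableFuture<>();
        CompletableFuture<byte[]> checkpointResult = new CompletableFuture<>();

        // Illustrative wiring: if sending an operator event fails, the pending
        // checkpoint must be failed as well (the behavior the PR adds).
        eventSendingResult.whenComplete((ok, failure) -> {
            if (failure != null) {
                checkpointResult.completeExceptionally(failure);
            }
        });

        eventSendingResult.completeExceptionally(new RuntimeException("Artificial"));

        // eventSendingResult is exceptional by construction; the meaningful
        // assertion is the second one, on checkpointResult.
        System.out.println(eventSendingResult.isCompletedExceptionally());
        System.out.println(checkpointResult.isCompletedExceptionally());
    }
}
```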
[GitHub] [flink] gaoyunhaii commented on a change in pull request #16432: [FLINK-23233][runtime] Failing checkpoints before failover for failed events in OperatorCoordinator
gaoyunhaii commented on a change in pull request #16432: URL: https://github.com/apache/flink/pull/16432#discussion_r37899 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/operators/coordination/util/IncompleteFuturesTracker.java ## @@ -108,4 +94,22 @@ void removeFromSet(CompletableFuture future) { lock.unlock(); } } + +@VisibleForTesting +Collection> getCurrentIncomplete() { Review comment: Sorry, I made a mistake here: I initially wanted to make this method a pure getter that does not modify the state, and it would only be used in tests for verification. Do you think it would also be OK if I remove `incompleteFutures.clear()` and keep the name `getCurrentIncomplete()`? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-13400) Remove Hive and Hadoop dependencies from SQL Client
[ https://issues.apache.org/jira/browse/FLINK-13400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377733#comment-17377733 ] frank wang commented on FLINK-13400: Hi [~lirui], maybe you can assign this to me. > Remove Hive and Hadoop dependencies from SQL Client > --- > > Key: FLINK-13400 > URL: https://issues.apache.org/jira/browse/FLINK-13400 > Project: Flink > Issue Type: Bug > Components: Table SQL / Client >Reporter: Timo Walther >Priority: Major > Labels: auto-deprioritized-critical, auto-unassigned, stale-major > Fix For: 1.14.0 > > > 340/550 lines in the SQL Client {{pom.xml}} are just around Hive and Hadoop > dependencies. Hive has nothing to do with the SQL Client and it will be hard > to maintain the long list of exclusions there. Some dependencies are even in > a {{provided}} scope and not {{test}} scope. > We should remove all dependencies on Hive/Hadoop and replace catalog-related > tests by a testing catalog. Similar to how we test sources/sinks. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] tsreaper commented on a change in pull request #16420: [FLINK-23290][table-runtime] Fix NullPointerException when filter is a function call which returns null values
tsreaper commented on a change in pull request #16420: URL: https://github.com/apache/flink/pull/16420#discussion_r36299 ## File path: flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/codegen/GenerateUtils.scala ## @@ -176,7 +184,15 @@ object GenerateUtils { val resultTerm = ctx.addReusableLocalVariable(resultTypeTerm, "result") val isResultNullable = resultNullable || (isReference(returnType) && !isTemporal(returnType)) val nullTermCode = if (ctx.nullCheck && isResultNullable) { - s"$nullTerm = ($resultTerm == null);" + // we assume that result term of a primitive SQL type will never be null, + // violating this assumption might cause null pointer exception somewhere else. + // when using this column we'll first check its null term. + s""" + |$nullTerm = ($resultTerm == null); Review comment: Indeed. I used to think that this change would only affect non-primitive numbers and booleans. However, string is also a non-primitive type, so this change might affect a larger scope than I thought. I also thought of checking null terms when generating filter code; however, that would affect almost all generated code, which I also think is not efficient. Currently I can't come up with other places where null terms aren't checked before result terms are used, so I'd like to only change the `GeneratedExpression` produced by casting to boolean. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
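The null-term pattern discussed in this thread can be illustrated outside the code generator. The snippet below is a hand-written analogue (all names are invented for illustration, not real codegen output) of generated code that pairs a nullable result term with a null term: consumers must consult the null term before unboxing the result, otherwise a `NullPointerException` like the one this PR fixes can occur.

```java
public class Main {
    // Stand-in for a generated function call used as a filter predicate;
    // it may return null (SQL NULL), like a UDF returning a nullable BOOLEAN.
    static Boolean nullableFilterResult(int x) {
        return x < 0 ? null : x % 2 == 0;
    }

    public static void main(String[] args) {
        for (int x : new int[] {-1, 2, 3}) {
            Boolean result = nullableFilterResult(x);     // the "result term"
            boolean isNull = (result == null);            // the "null term"
            // Unboxing `result` without consulting the null term first would
            // throw a NullPointerException for x = -1. SQL three-valued logic
            // treats a NULL predicate as "do not emit the row".
            boolean emit = !isNull && result;
            System.out.println(x + " -> " + emit);
        }
    }
}
```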
[GitHub] [flink] flinkbot edited a comment on pull request #16415: [FLINK-23262][tests] Check watermark frequency instead of count
flinkbot edited a comment on pull request #16415: URL: https://github.com/apache/flink/pull/16415#issuecomment-875773502 ## CI report: * 499dfac89e8c2411c75e14c407e46a1195a4ac60 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20198) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[jira] [Commented] (FLINK-23237) Add log to print data that failed to deserialize when ignore-parse-error=true
[ https://issues.apache.org/jira/browse/FLINK-23237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377724#comment-17377724 ] Aiden Gong commented on FLINK-23237: [~hehuiyuan], you can catch the 'IOException' in your application and log the original data. That's the normal way to solve this. > Add log to print data that failed to deserialize when > ignore-parse-error=true > --- > > Key: FLINK-23237 > URL: https://issues.apache.org/jira/browse/FLINK-23237 > Project: Flink > Issue Type: Improvement > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Reporter: hehuiyuan >Priority: Minor > > Add a log to print the data that failed to deserialize when > `ignore-parse-error` = `true` is set > > {code:java} > public RowData deserialize(@Nullable byte[] message) throws IOException { > if (message == null) { > return null; > } > try { > final JsonNode root = objectReader.readValue(message); > return (RowData) runtimeConverter.convert(root); > } catch (Throwable t) { > if (ignoreParseErrors) { > return null; > } > throw new IOException( > String.format("Failed to deserialize CSV row '%s'.", new > String(message)), t); > } > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
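The improvement proposed in FLINK-23237 can be sketched as follows. This is an illustrative standalone version, not Flink's actual deserializer: the parsing logic is a dummy integer parser, and `java.util.logging` stands in for Flink's SLF4J logger; the only point being made is where the log call for the dropped raw data would go.

```java
import java.io.IOException;
import java.util.logging.Logger;

public class Main {
    static final Logger LOG = Logger.getLogger("deserializer");

    /** Illustrative deserializer: parses a base-10 integer from the raw bytes. */
    static Integer deserialize(byte[] message, boolean ignoreParseErrors) throws IOException {
        if (message == null) {
            return null;
        }
        try {
            return Integer.parseInt(new String(message));
        } catch (Throwable t) {
            if (ignoreParseErrors) {
                // The proposed improvement: log the raw data before silently
                // dropping the record, so bad input can be diagnosed later.
                LOG.warning("Failed to deserialize row, skipping it. Raw data: "
                        + new String(message));
                return null;
            }
            throw new IOException(
                    "Failed to deserialize row '" + new String(message) + "'.", t);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(deserialize("42".getBytes(), true));
        System.out.println(deserialize("oops".getBytes(), true)); // null, logged
    }
}
```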
[GitHub] [flink] JingsongLi commented on a change in pull request #16420: [FLINK-23290][table-runtime] Fix NullPointerException when filter is a function call which returns null values
JingsongLi commented on a change in pull request #16420: URL: https://github.com/apache/flink/pull/16420#discussion_r32299 ## File path: flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/codegen/GenerateUtils.scala ## @@ -176,7 +184,15 @@ object GenerateUtils { val resultTerm = ctx.addReusableLocalVariable(resultTypeTerm, "result") val isResultNullable = resultNullable || (isReference(returnType) && !isTemporal(returnType)) val nullTermCode = if (ctx.nullCheck && isResultNullable) { - s"$nullTerm = ($resultTerm == null);" + // we assume that result term of a primitive SQL type will never be null, + // violating this assumption might cause null pointer exception somewhere else. + // when using this column we'll first check its null term. + s""" + |$nullTerm = ($resultTerm == null); Review comment: I'm worried about adding a few lines of code here. This is the only way for each call, which will increase the code generated by CodeGen. Can we just fix the result term or add null check in the `CalcCodeGenerator`? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] gaoyunhaii commented on a change in pull request #16432: [FLINK-23233][runtime] Failing checkpoints before failover for failed events in OperatorCoordinator
gaoyunhaii commented on a change in pull request #16432: URL: https://github.com/apache/flink/pull/16432#discussion_r32156 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/operators/coordination/util/NonSuccessFuturesTrack.java ## @@ -0,0 +1,86 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.operators.coordination.util; + +import java.util.ArrayList; +import java.util.Collection; +import java.util.Collections; +import java.util.HashSet; +import java.util.concurrent.CompletableFuture; +import java.util.concurrent.locks.ReentrantLock; + +import static org.apache.flink.util.Preconditions.checkState; + +/** + * This tracker remembers the CompletableFutures that did not complete normally. It also allows + * the callers to remove the failed ones after they have handled the future. + */ +public class NonSuccessFuturesTrack { Review comment: Yes, `IncompleteFuturesTracker` is still used in the code. I renamed it to `IncompleteFuturesTracker` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[jira] [Closed] (FLINK-22994) Improve the performance of nesting udf invoking
[ https://issues.apache.org/jira/browse/FLINK-22994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee closed FLINK-22994. Fix Version/s: 1.14.0 Resolution: Fixed Implements via: master: 31fcb6c22baaac1308a72559e32e2d089fc8fcad > Improve the performance of nesting udf invoking > --- > > Key: FLINK-22994 > URL: https://issues.apache.org/jira/browse/FLINK-22994 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Runtime >Affects Versions: 1.12.4 > Environment: h5. >Reporter: lynn1.zhang >Assignee: lynn1.zhang >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > Attachments: StringConverterTest.java, Test.java, > image-2021-06-15-15-18-12-619.png, image-2021-06-15-15-19-01-103.png, > image-2021-06-15-15-27-26-739.png, image-2021-06-15-15-28-28-137.png, > image-2021-06-15-15-29-09-773.png, image-2021-06-15-15-30-14-775.png, > image-2021-06-15-15-42-08-065.png, new_projection_code, old_projection_code, > test.sql > > > h1. BackGround > In some nesting udf invoking cases, Flink convert the udf result to external > object and then convert to internalOrNull object as params for next udf > invoking. The performance of some converter is poor. > h1. Test Params > Source = Kafka, Schema = PB with snappy; Flink Slot = 1; > taskmanager.memory.process.size=4g; Linux Core = Intel(R) Xeon(R) Gold 5218 > CPU @ 2.30GHz > UDF Introduction: > * ipip: input: int ip, output: map ip_info, map size = 14. > * ip_2_country: input map ip_info, output: string country. > * ip_2_region: input map ip_info, output: string region. > * ip_2_isp_domain: input map ip_info, output: string isp. > * ip_2_timezone: input map ip_info, output: string timezone. > h1. Performance Compare with MapMapConverter > h5. The throughput of nesting udf invoking with MapMapConverter(5 times): > 41.42 k/s > !image-2021-06-15-15-29-09-773.png! > h1. Goal > For some cases, skip toInternalOrNull & toExternal, Use the udf result > directly. > h1. 
Performance Compare without MapMapConverter > h5. The throughput of nesting udf invoking without MapMapConverter: 174.41 k/s > !image-2021-06-15-15-30-14-775.png! > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] JingsongLi merged pull request #16163: [FLINK-22994][Table SQL / Planner] improve the performance of invoking…
JingsongLi merged pull request #16163: URL: https://github.com/apache/flink/pull/16163
[GitHub] [flink] JingsongLi commented on pull request #16163: [FLINK-22994][Table SQL / Planner] improve the performance of invoking…
JingsongLi commented on pull request #16163: URL: https://github.com/apache/flink/pull/16163#issuecomment-876864094 I think this PR is ready to merge. Merging...
[jira] [Commented] (FLINK-23237) Add log to print data that failed to deserialize when ignore-parse-error=true
[ https://issues.apache.org/jira/browse/FLINK-23237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377712#comment-17377712 ] hehuiyuan commented on FLINK-23237: --- Hi [~Aiden Gong], it will fail when `ignore-parse-error` is set to `false`. I hope it can run normally. > Add log to print data that failed to deserialize when > ignore-parse-error=true > --- > > Key: FLINK-23237 > URL: https://issues.apache.org/jira/browse/FLINK-23237 > Project: Flink > Issue Type: Improvement > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Reporter: hehuiyuan >Priority: Minor > > Add a log to print the data that failed to deserialize when > `ignore-parse-error` = `true` is set > > {code:java} > public RowData deserialize(@Nullable byte[] message) throws IOException { > if (message == null) { > return null; > } > try { > final JsonNode root = objectReader.readValue(message); > return (RowData) runtimeConverter.convert(root); > } catch (Throwable t) { > if (ignoreParseErrors) { > return null; > } > throw new IOException( > String.format("Failed to deserialize CSV row '%s'.", new > String(message)), t); > } > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] wsry commented on a change in pull request #11877: [FLINK-16641][network] Announce sender's backlog to solve the deadlock issue without exclusive buffers
wsry commented on a change in pull request #11877: URL: https://github.com/apache/flink/pull/11877#discussion_r26886 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/PipelinedSubpartition.java ## @@ -312,6 +323,16 @@ BufferAndBacklog pollBuffer() { decreaseBuffersInBacklogUnsafe(bufferConsumer.isBuffer()); } +// if we have an empty finished buffer and the exclusive credit is 0, we just return +// the empty buffer so that the downstream task can release the allocated credit for +// this empty buffer, this happens in two main scenarios currently: +// 1. all data of a buffer builder has been read and after that the buffer builder +// is finished +// 2. in approximate recovery mode, a partial record takes a whole buffer builder +if (buffersPerChannel == 0 && bufferConsumer.isFinished()) { +break; +} + Review comment: > One thing to add (unless we are not doing it already) would be to use this backlog information, to maybe release floating buffers if backlog dropped to 0? Currently, we are not doing that. Actually, I think it is a little complicated to do so, because we need to keep consistency between the sender-side available credit and the receiver-side floating buffers. If we just release the floating buffers at the receiver side without resetting the sender-side available credit, then data may be sent out without buffers at the receiver side to receive it. If we also reset the available credit at the sender side when the backlog is 0, there is a possibility that some AddCredit messages are on the way and that part would not be reset. Maybe one way is to not send any data out after sending a buffer with 0 backlog at the sender side; then the receivers clear all floating credits and send a reset message to the senders, and the senders reset all available credits. This process is similar to the channel blocking and resumption. I think this is a little complicated and can incur extra overhead. What do you think? 
Or is there any simple way?
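The consistency problem described above can be shown with a toy accounting model (all names invented for illustration, not the actual Flink classes): if the receiver releases floating buffers without a matching reset of the sender's credit, the invariant "sender credit ≤ receiver buffers" breaks, and the sender may ship data with nowhere to land.

```java
public class Main {
    // Invariant we want to keep: senderCredit <= receiverBuffers, i.e. the
    // sender never believes it may send more than the receiver can hold.
    static int senderCredit = 0;
    static int receiverBuffers = 0;

    // AddCredit path: both sides move together, invariant preserved.
    static void grantFloatingCredit(int n) {
        receiverBuffers += n;
        senderCredit += n;
    }

    // The unsafe shortcut discussed above: receiver frees buffers unilaterally,
    // sender credit deliberately untouched -> invariant can break.
    static void releaseFloatingBuffersOnly(int n) {
        receiverBuffers -= n;
    }

    public static void main(String[] args) {
        grantFloatingCredit(2);
        System.out.println(senderCredit <= receiverBuffers); // consistent

        // Backlog drops to 0; receiver frees its floating buffers on its own.
        releaseFloatingBuffersOnly(2);
        // The sender may still send 2 buffers that the receiver can no longer
        // accommodate, which is why a reset handshake (similar to channel
        // blocking/resumption) is needed.
        System.out.println(senderCredit <= receiverBuffers);
    }
}
```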
[GitHub] [flink] hehuiyuan edited a comment on pull request #16394: [FLINK-20321][formats] Fix NPE when using AvroDeserializationSchema to deserialize null input in 1.12 version
hehuiyuan edited a comment on pull request #16394: URL: https://github.com/apache/flink/pull/16394#issuecomment-875305823 Hi @wuchong, could you please review?
[GitHub] [flink] LongWangXX commented on pull request #16417: FLINK-22969
LongWangXX commented on pull request #16417: URL: https://github.com/apache/flink/pull/16417#issuecomment-876861105 @luoyuxia Hello, can you help me merge the code into the master branch? Thank you
[jira] [Commented] (FLINK-23322) RMQSourceITCase.testStopWithSavepoint fails on azure due to timeout
[ https://issues.apache.org/jira/browse/FLINK-23322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377701#comment-17377701 ] Xintong Song commented on FLINK-23322: -- cc [~arvid] [~cmick] > RMQSourceITCase.testStopWithSavepoint fails on azure due to timeout > --- > > Key: FLINK-23322 > URL: https://issues.apache.org/jira/browse/FLINK-23322 > Project: Flink > Issue Type: Bug > Components: Connectors/ RabbitMQ >Affects Versions: 1.12.4 >Reporter: Xintong Song >Priority: Major > Labels: test-stability > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20196=logs=d44f43ce-542c-597d-bf94-b0718c71e5e8=03dca39c-73e8-5aaf-601d-328ae5c35f20=13696 > {code} > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 41.237 s <<< FAILURE! - in > org.apache.flink.streaming.connectors.rabbitmq.RMQSourceITCase > [ERROR] > testStopWithSavepoint(org.apache.flink.streaming.connectors.rabbitmq.RMQSourceITCase) > Time elapsed: 7.609 s <<< ERROR! 
> java.util.concurrent.TimeoutException > at com.rabbitmq.utility.BlockingCell.get(BlockingCell.java:77) > at > com.rabbitmq.utility.BlockingCell.uninterruptibleGet(BlockingCell.java:120) > at > com.rabbitmq.utility.BlockingValueOrException.uninterruptibleGetValue(BlockingValueOrException.java:36) > at > com.rabbitmq.client.impl.AMQChannel$BlockingRpcContinuation.getReply(AMQChannel.java:502) > at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:326) > at > com.rabbitmq.client.impl.recovery.RecoveryAwareAMQConnectionFactory.newConnection(RecoveryAwareAMQConnectionFactory.java:64) > at > com.rabbitmq.client.impl.recovery.AutorecoveringConnection.init(AutorecoveringConnection.java:156) > at > com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:1130) > at > com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:1087) > at > com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:1045) > at > com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:1207) > at > org.apache.flink.streaming.connectors.rabbitmq.RMQSourceITCase.getRMQConnection(RMQSourceITCase.java:133) > at > org.apache.flink.streaming.connectors.rabbitmq.RMQSourceITCase.setUp(RMQSourceITCase.java:82) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) > at 
org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) > at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at > org.testcontainers.containers.FailureDetectingExternalResource$1.evaluate(FailureDetectingExternalResource.java:30) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at >
[jira] [Created] (FLINK-23322) RMQSourceITCase.testStopWithSavepoint fails on azure due to timeout
Xintong Song created FLINK-23322: Summary: RMQSourceITCase.testStopWithSavepoint fails on azure due to timeout Key: FLINK-23322 URL: https://issues.apache.org/jira/browse/FLINK-23322 Project: Flink Issue Type: Bug Components: Connectors/ RabbitMQ Affects Versions: 1.12.4 Reporter: Xintong Song https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20196=logs=d44f43ce-542c-597d-bf94-b0718c71e5e8=03dca39c-73e8-5aaf-601d-328ae5c35f20=13696 {code} [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 41.237 s <<< FAILURE! - in org.apache.flink.streaming.connectors.rabbitmq.RMQSourceITCase [ERROR] testStopWithSavepoint(org.apache.flink.streaming.connectors.rabbitmq.RMQSourceITCase) Time elapsed: 7.609 s <<< ERROR! java.util.concurrent.TimeoutException at com.rabbitmq.utility.BlockingCell.get(BlockingCell.java:77) at com.rabbitmq.utility.BlockingCell.uninterruptibleGet(BlockingCell.java:120) at com.rabbitmq.utility.BlockingValueOrException.uninterruptibleGetValue(BlockingValueOrException.java:36) at com.rabbitmq.client.impl.AMQChannel$BlockingRpcContinuation.getReply(AMQChannel.java:502) at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:326) at com.rabbitmq.client.impl.recovery.RecoveryAwareAMQConnectionFactory.newConnection(RecoveryAwareAMQConnectionFactory.java:64) at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.init(AutorecoveringConnection.java:156) at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:1130) at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:1087) at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:1045) at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:1207) at org.apache.flink.streaming.connectors.rabbitmq.RMQSourceITCase.getRMQConnection(RMQSourceITCase.java:133) at org.apache.flink.streaming.connectors.rabbitmq.RMQSourceITCase.setUp(RMQSourceITCase.java:82) at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.testcontainers.containers.FailureDetectingExternalResource$1.evaluate(FailureDetectingExternalResource.java:30) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-20374) Wrong result when shuffling changelog stream on non-primary-key columns
[ https://issues.apache.org/jira/browse/FLINK-20374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377700#comment-17377700 ] Jingsong Lee commented on FLINK-20374: -- [~twalthr] Sorry, I forgot to close this. This is solved by FLINK-22899 FLINK-22901 FLINK-23054. I will close this one. CC [~jark] Please help to confirm. > Wrong result when shuffling changelog stream on non-primary-key columns > --- > > Key: FLINK-20374 > URL: https://issues.apache.org/jira/browse/FLINK-20374 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Reporter: Jark Wu >Assignee: Jingsong Lee >Priority: Critical > Labels: auto-deprioritized-critical > Fix For: 1.14.0 > > > This is reported from user-zh ML: > http://apache-flink.147419.n8.nabble.com/flink-1-11-2-cdc-cdc-sql-sink-save-point-job-sink-td8593.html > {code:sql} > CREATE TABLE test ( > `id` INT, > `name` VARCHAR(255), > `time` TIMESTAMP(3), > `status` INT, > PRIMARY KEY(id) NOT ENFORCED > ) WITH ( > 'connector' = 'mysql-cdc', > 'hostname' = 'localhost', > 'port' = '3306', > 'username' = 'root', > 'password' = '1', > 'database-name' = 'ai_audio_lyric_task', > 'table-name' = 'test' > ) > CREATE TABLE status ( > `id` INT, > `name` VARCHAR(255), > PRIMARY KEY(id) NOT ENFORCED > ) WITH ( > 'connector' = 'mysql-cdc', > 'hostname' = 'localhost', > 'port' = '3306', > 'username' = 'root', > 'password' = '1', > 'database-name' = 'ai_audio_lyric_task', > 'table-name' = 'status' > ); > -- output > CREATE TABLE test_status ( > `id` INT, > `name` VARCHAR(255), > `time` TIMESTAMP(3), > `status` INT, > `status_name` VARCHAR(255) > PRIMARY KEY(id) NOT ENFORCED > ) WITH ( > 'connector' = 'elasticsearch-7', > 'hosts' = 'xxx', > 'index' = 'xxx', > 'username' = 'xxx', > 'password' = 'xxx', > 'sink.bulk-flush.backoff.max-retries' = '10', > 'sink.bulk-flush.backoff.strategy' = 'CONSTANT', > 'sink.bulk-flush.max-actions' = '5000', > 'sink.bulk-flush.max-size' = '10mb', > 'sink.bulk-flush.interval' = '1s' > ); > 
INSERT into test_status > SELECT t.*, s.name > FROM test AS t > LEFT JOIN status AS s ON t.status = s.id; > {code} > Data in mysql table: > {code} > test: > 0, name0, 2020-07-06 00:00:00 , 0 > 1, name1, 2020-07-06 00:00:00 , 1 > 2, name2, 2020-07-06 00:00:00 , 1 > . > status > 0, status0 > 1, status1 > 2, status2 > . > {code} > Operations: > 1. start job with parallelism=40, result in test_status sink is correct: > {code} > 0, name0, 2020-07-06 00:00:00 , 0, status0 > 1, name1, 2020-07-06 00:00:00 , 1, status1 > 2, name2, 2020-07-06 00:00:00 , 1, status1 > {code} > 2. Update {{status}} of {{id=2}} record in table {{test}} from {{1}} to {{2}}. > 3. Result is not correct because the {{id=2}} record is missing in the > result. > The reason is that it shuffles the changelog {{test}} on {{status}} column > which is not the primary key. Therefore, the ordering can't be guaranteed, > and the result is wrong. > The {{-U[2, name2, 2020-07-06 00:00:00 , 1]}} and {{+U[2, name2, 2020-07-06 > 00:00:00 , 2]}} will possibly be shuffled to different join tasks, so the > order of joined results is not guaranteed when they arrive at the sink task. > It is possible {{+U[2, name2, 2020-07-06 00:00:00 , status2]}} arrives > first, and then {{-U[2, name2, 2020-07-06 00:00:00 , status1]}} , then the > {{id=2}} record is missing in Elasticsearch. > It seems that we need a changelog ordering mechanism in the planner. -- This message was sent by Atlassian Jira (v8.3.4#803005)
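The -U/+U ordering hazard in this ticket can be reproduced outside Flink. Below is a hypothetical, framework-free Java sketch (class name and channel logic are invented for illustration, not Flink code): partitioning the retract/upsert pair by the non-primary-key `status` column sends the two records through different channels, and if the sink happens to apply the upsert before the retract, the `id=2` row disappears.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simulation of the FLINK-20374 hazard: the -U/+U pair for
// primary key id=2 is partitioned by the non-key 'status' column, so the
// two records travel through different channels and may reach the sink in
// either order.
public class ChangelogShuffleDemo {

    public static void main(String[] args) {
        // -U retracts the old row (id=2, status=1); +U inserts the new row
        // (id=2, status=2). Hashing by 'status' splits the pair.
        int retractChannel = 1 % 2; // -U goes to channel 1
        int upsertChannel = 2 % 2;  // +U goes to channel 0
        System.out.println("same channel: " + (retractChannel == upsertChannel)); // false

        // Unlucky interleaving at the sink: +U is applied first, then -U
        // deletes the row, and id=2 vanishes from the result.
        Map<Integer, Integer> sink = new HashMap<>();
        sink.put(2, 2);  // +U[2, name2, ..., 2] arrives first
        sink.remove(2);  // -U[2, name2, ..., 1] arrives second
        System.out.println("id=2 present: " + sink.containsKey(2)); // false
    }
}
```

Shuffling by the primary key instead (hash on `id`) keeps both records in one channel, which is why a changelog ordering mechanism in the planner resolves the issue.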
[jira] [Closed] (FLINK-20374) Wrong result when shuffling changelog stream on non-primary-key columns
[ https://issues.apache.org/jira/browse/FLINK-20374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee closed FLINK-20374. Resolution: Fixed > Wrong result when shuffling changelog stream on non-primary-key columns > --- > > Key: FLINK-20374 > URL: https://issues.apache.org/jira/browse/FLINK-20374 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Reporter: Jark Wu >Assignee: Jingsong Lee >Priority: Critical > Labels: auto-deprioritized-critical > Fix For: 1.14.0 > > > This is reported from user-zh ML: > http://apache-flink.147419.n8.nabble.com/flink-1-11-2-cdc-cdc-sql-sink-save-point-job-sink-td8593.html > {code:sql} > CREATE TABLE test ( > `id` INT, > `name` VARCHAR(255), > `time` TIMESTAMP(3), > `status` INT, > PRIMARY KEY(id) NOT ENFORCED > ) WITH ( > 'connector' = 'mysql-cdc', > 'hostname' = 'localhost', > 'port' = '3306', > 'username' = 'root', > 'password' = '1', > 'database-name' = 'ai_audio_lyric_task', > 'table-name' = 'test' > ) > CREATE TABLE status ( > `id` INT, > `name` VARCHAR(255), > PRIMARY KEY(id) NOT ENFORCED > ) WITH ( > 'connector' = 'mysql-cdc', > 'hostname' = 'localhost', > 'port' = '3306', > 'username' = 'root', > 'password' = '1', > 'database-name' = 'ai_audio_lyric_task', > 'table-name' = 'status' > ); > -- output > CREATE TABLE test_status ( > `id` INT, > `name` VARCHAR(255), > `time` TIMESTAMP(3), > `status` INT, > `status_name` VARCHAR(255) > PRIMARY KEY(id) NOT ENFORCED > ) WITH ( > 'connector' = 'elasticsearch-7', > 'hosts' = 'xxx', > 'index' = 'xxx', > 'username' = 'xxx', > 'password' = 'xxx', > 'sink.bulk-flush.backoff.max-retries' = '10', > 'sink.bulk-flush.backoff.strategy' = 'CONSTANT', > 'sink.bulk-flush.max-actions' = '5000', > 'sink.bulk-flush.max-size' = '10mb', > 'sink.bulk-flush.interval' = '1s' > ); > INSERT into test_status > SELECT t.*, s.name > FROM test AS t > LEFT JOIN status AS s ON t.status = s.id; > {code} > Data in mysql table: > {code} > test: > 0, name0, 
2020-07-06 00:00:00 , 0 > 1, name1, 2020-07-06 00:00:00 , 1 > 2, name2, 2020-07-06 00:00:00 , 1 > . > status > 0, status0 > 1, status1 > 2, status2 > . > {code} > Operations: > 1. start job with parallelism=40, result in test_status sink is correct: > {code} > 0, name0, 2020-07-06 00:00:00 , 0, status0 > 1, name1, 2020-07-06 00:00:00 , 1, status1 > 2, name2, 2020-07-06 00:00:00 , 1, status1 > {code} > 2. Update {{status}} of {{id=2}} record in table {{test}} from {{1}} to {{2}}. > 3. Result is not correct because the {{id=2}} record is missing in the > result. > The reason is that it shuffles the changelog {{test}} on {{status}} column > which is not the primary key. Therefore, the ordering can't be guaranteed, > and the result is wrong. > The {{-U[2, name2, 2020-07-06 00:00:00 , 1]}} and {{+U[2, name2, 2020-07-06 > 00:00:00 , 2]}} will possibly be shuffled to different join tasks, so the > order of joined results is not guaranteed when they arrive at the sink task. > It is possible {{+U[2, name2, 2020-07-06 00:00:00 , status2]}} arrives > first, and then {{-U[2, name2, 2020-07-06 00:00:00 , status1]}} , then the > {{id=2}} record is missing in Elasticsearch. > It seems that we need a changelog ordering mechanism in the planner. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] xintongsong commented on pull request #15599: [FLINK-11838][flink-gs-fs-hadoop] Create Google Storage file system with recoverable writer support
xintongsong commented on pull request #15599: URL: https://github.com/apache/flink/pull/15599#issuecomment-876855655 Hi @galenwarren, How are things going? Just to let you know that the feature freeze for the 1.14 release will be around early August, which is about 3 weeks from now. No worries if you need more time for this. We don't necessarily need to get this feature into the 1.14 release. This notice is just in case you'd like to see it in the release. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-23237) Add log to print data that failed to deserialize when ignore-parse-error=true
[ https://issues.apache.org/jira/browse/FLINK-23237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377693#comment-17377693 ] Aiden Gong commented on FLINK-23237: Hi [~hehuiyuan]. I think if you care about the accuracy of the data, you should set `ignore-parse-error` = `false`. > Add log to print data that failed to deserialize when > ignore-parse-error=true > --- > > Key: FLINK-23237 > URL: https://issues.apache.org/jira/browse/FLINK-23237 > Project: Flink > Issue Type: Improvement > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Reporter: hehuiyuan >Priority: Minor > > Add log to print error data that failed to deserialize when set > `ignore-parse-error` = `true` > > {code:java} > public RowData deserialize(@Nullable byte[] message) throws IOException { > if (message == null) { > return null; > } > try { > final JsonNode root = objectReader.readValue(message); > return (RowData) runtimeConverter.convert(root); > } catch (Throwable t) { > if (ignoreParseErrors) { > return null; > } > throw new IOException( > String.format("Failed to deserialize CSV row '%s'.", new > String(message)), t); > } > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
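For illustration, here is a hypothetical, framework-free Java sketch of the behavior this ticket asks for (the class name, the stand-in parser, and the log message are invented; the real code path is the quoted Flink deserializer with its `objectReader` and `runtimeConverter`): with `ignore-parse-error` enabled, the offending payload is logged before being dropped instead of disappearing silently.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.function.Function;
import java.util.logging.Logger;

// Hypothetical sketch of the FLINK-23237 proposal: when ignoreParseErrors
// is set, log the raw record at WARN level before returning null, so bad
// data stays traceable without failing the job.
public class LenientDeserializer {
    private static final Logger LOG = Logger.getLogger("LenientDeserializer");

    private final boolean ignoreParseErrors;
    private final Function<String, Integer> parser; // stands in for objectReader

    LenientDeserializer(boolean ignoreParseErrors, Function<String, Integer> parser) {
        this.ignoreParseErrors = ignoreParseErrors;
        this.parser = parser;
    }

    Integer deserialize(byte[] message) throws IOException {
        if (message == null) {
            return null;
        }
        String raw = new String(message, StandardCharsets.UTF_8);
        try {
            return parser.apply(raw);
        } catch (Throwable t) {
            if (ignoreParseErrors) {
                // The requested addition: keep the dropped record visible.
                LOG.warning("Skipping record that failed to deserialize: '" + raw + "'");
                return null;
            }
            throw new IOException("Failed to deserialize row '" + raw + "'.", t);
        }
    }

    public static void main(String[] args) throws IOException {
        LenientDeserializer lenient = new LenientDeserializer(true, Integer::parseInt);
        System.out.println(lenient.deserialize("42".getBytes(StandardCharsets.UTF_8)));   // 42
        System.out.println(lenient.deserialize("oops".getBytes(StandardCharsets.UTF_8))); // null, plus a WARN log line
    }
}
```

Whether such logging should be rate-limited on hot paths is a separate design question the ticket does not settle.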
[jira] [Closed] (FLINK-22195) YARNHighAvailabilityITCase.testClusterClientRetrieval because of TestTimedOutException
[ https://issues.apache.org/jira/browse/FLINK-22195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song closed FLINK-22195. Fix Version/s: 1.13.2 1.14.0 Assignee: Xintong Song Resolution: Fixed Fixed by FLINK-22662 > YARNHighAvailabilityITCase.testClusterClientRetrieval because of > TestTimedOutException > -- > > Key: FLINK-22195 > URL: https://issues.apache.org/jira/browse/FLINK-22195 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN, Runtime / Coordination >Affects Versions: 1.13.0 >Reporter: Guowei Ma >Assignee: Xintong Song >Priority: Major > Labels: test-stability > Fix For: 1.14.0, 1.13.2 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=16308=logs=8fd975ef-f478-511d-4997-6f15fe8a1fd3=ac0fa443-5d45-5a6b-3597-0310ecc1d2ab=31562 > {code:java} > org.junit.runners.model.TestTimedOutException: test timed out after 180 > milliseconds > at java.lang.Thread.sleep(Native Method) > at > org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:1223) > at > org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:593) > at > org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:418) > at > org.apache.flink.yarn.YARNHighAvailabilityITCase.deploySessionCluster(YARNHighAvailabilityITCase.java:356) > at > org.apache.flink.yarn.YARNHighAvailabilityITCase.lambda$testClusterClientRetrieval$2(YARNHighAvailabilityITCase.java:224) > at > org.apache.flink.yarn.YARNHighAvailabilityITCase$$Lambda$250/401276136.run(Unknown > Source) > at org.apache.flink.yarn.YarnTestBase.runTest(YarnTestBase.java:287) > at > org.apache.flink.yarn.YARNHighAvailabilityITCase.testClusterClientRetrieval(YARNHighAvailabilityITCase.java:219) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-22662) YARNHighAvailabilityITCase.testKillYarnSessionClusterEntrypoint fail
[ https://issues.apache.org/jira/browse/FLINK-22662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song closed FLINK-22662. Fix Version/s: 1.13.2 1.14.0 Resolution: Fixed Fixed via - master (1.14): 928b6897b7a0c609ab27f679b31852f18fa40684 - release-1.13: 76edcdade12c3ebc2157debf80f4fe398c541a12 > YARNHighAvailabilityITCase.testKillYarnSessionClusterEntrypoint fail > > > Key: FLINK-22662 > URL: https://issues.apache.org/jira/browse/FLINK-22662 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN >Affects Versions: 1.14.0, 1.13.1 >Reporter: Guowei Ma >Assignee: Xintong Song >Priority: Major > Labels: pull-request-available, test-stability > Fix For: 1.14.0, 1.13.2 > > > {code:java} > 2021-05-14T00:24:57.8487649Z May 14 00:24:57 [ERROR] > testKillYarnSessionClusterEntrypoint(org.apache.flink.yarn.YARNHighAvailabilityITCase) > Time elapsed: 34.667 s <<< ERROR! > 2021-05-14T00:24:57.8488567Z May 14 00:24:57 > java.util.concurrent.ExecutionException: > 2021-05-14T00:24:57.8489301Z May 14 00:24:57 > org.apache.flink.runtime.rest.util.RestClientException: > [org.apache.flink.runtime.rest.handler.RestHandlerException: > org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find > Flink job (610ed4b159ece04c8ee2ec40e7d0c143) > 2021-05-14T00:24:57.8493142Z May 14 00:24:57 at > org.apache.flink.runtime.rest.handler.job.JobExecutionResultHandler.propagateException(JobExecutionResultHandler.java:94) > 2021-05-14T00:24:57.8495823Z May 14 00:24:57 at > org.apache.flink.runtime.rest.handler.job.JobExecutionResultHandler.lambda$handleRequest$1(JobExecutionResultHandler.java:84) > 2021-05-14T00:24:57.8496733Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:884) > 2021-05-14T00:24:57.8497640Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:866) > 2021-05-14T00:24:57.8498491Z May 14 00:24:57 at > 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > 2021-05-14T00:24:57.8499222Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) > 2021-05-14T00:24:57.853Z May 14 00:24:57 at > org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:234) > 2021-05-14T00:24:57.8500872Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) > 2021-05-14T00:24:57.8501702Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) > 2021-05-14T00:24:57.8502662Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > 2021-05-14T00:24:57.8503472Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) > 2021-05-14T00:24:57.8504269Z May 14 00:24:57 at > org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:1079) > 2021-05-14T00:24:57.8504892Z May 14 00:24:57 at > akka.dispatch.OnComplete.internal(Future.scala:263) > 2021-05-14T00:24:57.8505565Z May 14 00:24:57 at > akka.dispatch.OnComplete.internal(Future.scala:261) > 2021-05-14T00:24:57.8506062Z May 14 00:24:57 at > akka.dispatch.japi$CallbackBridge.apply(Future.scala:191) > 2021-05-14T00:24:57.8506819Z May 14 00:24:57 at > akka.dispatch.japi$CallbackBridge.apply(Future.scala:188) > 2021-05-14T00:24:57.8507418Z May 14 00:24:57 at > scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36) > 2021-05-14T00:24:57.8508373Z May 14 00:24:57 at > org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:73) > 2021-05-14T00:24:57.8509144Z May 14 00:24:57 at > scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44) > 2021-05-14T00:24:57.8509972Z May 14 00:24:57 at > 
scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252) > 2021-05-14T00:24:57.8510675Z May 14 00:24:57 at > akka.pattern.PromiseActorRef.$bang(AskSupport.scala:572) > 2021-05-14T00:24:57.8511376Z May 14 00:24:57 at > akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:23) > 2021-05-14T00:24:57.851Z May 14 00:24:57 at > akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:21) > 2021-05-14T00:24:57.8513090Z May 14 00:24:57 at > scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:436) > 2021-05-14T00:24:57.8513835Z May 14 00:24:57 at > scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:435) > 2021-05-14T00:24:57.8514576Z
[GitHub] [flink] xintongsong closed pull request #16395: [FLINK-22662][yarn][test] Stabilize YARNHighAvailabilityITCase
xintongsong closed pull request #16395: URL: https://github.com/apache/flink/pull/16395 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-20329) Elasticsearch7DynamicSinkITCase hangs
[ https://issues.apache.org/jira/browse/FLINK-20329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377682#comment-17377682 ] Xintong Song commented on FLINK-20329: -- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20191=logs=d44f43ce-542c-597d-bf94-b0718c71e5e8=03dca39c-73e8-5aaf-601d-328ae5c35f20=11903 > Elasticsearch7DynamicSinkITCase hangs > - > > Key: FLINK-20329 > URL: https://issues.apache.org/jira/browse/FLINK-20329 > Project: Flink > Issue Type: Bug > Components: Connectors / ElasticSearch >Affects Versions: 1.12.0, 1.13.0 >Reporter: Dian Fu >Assignee: Yangze Guo >Priority: Major > Labels: pull-request-available, test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10052=logs=d44f43ce-542c-597d-bf94-b0718c71e5e8=03dca39c-73e8-5aaf-601d-328ae5c35f20 > {code} > 2020-11-24T16:04:05.9260517Z [INFO] Running > org.apache.flink.streaming.connectors.elasticsearch.table.Elasticsearch7DynamicSinkITCase > 2020-11-24T16:19:25.5481231Z > == > 2020-11-24T16:19:25.5483549Z Process produced no output for 900 seconds. 
> 2020-11-24T16:19:25.5484064Z > == > 2020-11-24T16:19:25.5484498Z > == > 2020-11-24T16:19:25.5484882Z The following Java processes are running (JPS) > 2020-11-24T16:19:25.5485475Z > == > 2020-11-24T16:19:25.5694497Z Picked up JAVA_TOOL_OPTIONS: > -XX:+HeapDumpOnOutOfMemoryError > 2020-11-24T16:19:25.7263048Z 16192 surefirebooter5057948964630155904.jar > 2020-11-24T16:19:25.7263515Z 18566 Jps > 2020-11-24T16:19:25.7263709Z 959 Launcher > 2020-11-24T16:19:25.7411148Z > == > 2020-11-24T16:19:25.7427013Z Printing stack trace of Java process 16192 > 2020-11-24T16:19:25.7427369Z > == > 2020-11-24T16:19:25.7484365Z Picked up JAVA_TOOL_OPTIONS: > -XX:+HeapDumpOnOutOfMemoryError > 2020-11-24T16:19:26.0848776Z 2020-11-24 16:19:26 > 2020-11-24T16:19:26.0849578Z Full thread dump OpenJDK 64-Bit Server VM > (25.275-b01 mixed mode): > 2020-11-24T16:19:26.0849831Z > 2020-11-24T16:19:26.0850185Z "Attach Listener" #32 daemon prio=9 os_prio=0 > tid=0x7fc148001000 nid=0x48e7 waiting on condition [0x] > 2020-11-24T16:19:26.0850595Zjava.lang.Thread.State: RUNNABLE > 2020-11-24T16:19:26.0850814Z > 2020-11-24T16:19:26.0851375Z "testcontainers-ryuk" #31 daemon prio=5 > os_prio=0 tid=0x7fc251232000 nid=0x3fb0 in Object.wait() > [0x7fc1012c4000] > 2020-11-24T16:19:26.0854688Zjava.lang.Thread.State: TIMED_WAITING (on > object monitor) > 2020-11-24T16:19:26.0855379Z at java.lang.Object.wait(Native Method) > 2020-11-24T16:19:26.0855844Z at > org.testcontainers.utility.ResourceReaper.lambda$null$1(ResourceReaper.java:142) > 2020-11-24T16:19:26.0857272Z - locked <0x8e2bd2d0> (a > java.util.ArrayList) > 2020-11-24T16:19:26.0857977Z at > org.testcontainers.utility.ResourceReaper$$Lambda$93/1981729428.run(Unknown > Source) > 2020-11-24T16:19:26.0858471Z at > org.rnorth.ducttape.ratelimits.RateLimiter.doWhenReady(RateLimiter.java:27) > 2020-11-24T16:19:26.0858961Z at > org.testcontainers.utility.ResourceReaper.lambda$start$2(ResourceReaper.java:133) > 2020-11-24T16:19:26.0859422Z at > 
org.testcontainers.utility.ResourceReaper$$Lambda$92/40191541.run(Unknown > Source) > 2020-11-24T16:19:26.0859788Z at java.lang.Thread.run(Thread.java:748) > 2020-11-24T16:19:26.0860030Z > 2020-11-24T16:19:26.0860371Z "process reaper" #24 daemon prio=10 os_prio=0 > tid=0x7fc0f803b800 nid=0x3f92 waiting on condition [0x7fc10296e000] > 2020-11-24T16:19:26.0860913Zjava.lang.Thread.State: TIMED_WAITING > (parking) > 2020-11-24T16:19:26.0861387Z at sun.misc.Unsafe.park(Native Method) > 2020-11-24T16:19:26.0862495Z - parking to wait for <0x8814bf30> (a > java.util.concurrent.SynchronousQueue$TransferStack) > 2020-11-24T16:19:26.0863253Z at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > 2020-11-24T16:19:26.0863760Z at > java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) > 2020-11-24T16:19:26.0864274Z at > java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362) > 2020-11-24T16:19:26.0864762Z at > java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941) > 2020-11-24T16:19:26.0865299Z
[jira] [Updated] (FLINK-23321) Make JM REST Netty thread pool count configurable
[ https://issues.apache.org/jira/browse/FLINK-23321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Raga updated FLINK-23321: - Description: Currently [rest.server.numThreads|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/#rest-server-numthreads] works to create more DispatcherRestEndpoint threads which help in high usage of functionality in the purview of the job dispatcher functionality. However, we cannot tune the number of netty threads used to handle requests. Especially when using the REST server as a monitoring endpoint, a lack of netty threads when processing of AbstractHandlers takes a long time causes extended response times. Such configuration already exists for taskmanagers (*taskmanager.network.netty.client/server.numThreads*) and such implementation of a *server.netty-boss.numThreads* and *server.netty-worker.numThreads* would be of great use. was: Currently [rest.server.numThreads|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/#rest-server-numthreads] works to create more DispatcherRestEndpoint threads which help in high usage of functionality in the purview of job the dispatcher functionality. However, we cannot tune the number of netty threads use to handle incoming connections. Especially when using the REST server as a monitoring endpoint, a lack of netty threads cause connection times and a long lags in responses. Such configuration already exists for taskmanagers (*taskmanager.network.netty.client/server.numThreads*) and such implementation of a *server.netty-boss.numThreads* and *server.netty-worker.numThreads* would be of great use. 
> Make JM REST Netty thread pool count configurable > - > > Key: FLINK-23321 > URL: https://issues.apache.org/jira/browse/FLINK-23321 > Project: Flink > Issue Type: Improvement > Components: Runtime / Metrics, Runtime / REST >Reporter: Nicolas Raga >Priority: Minor > Labels: Monitoring, Netty, REST > Original Estimate: 1h > Remaining Estimate: 1h > > Currently > [rest.server.numThreads|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/#rest-server-numthreads] > works to create more DispatcherRestEndpoint threads which help in high usage > of functionality in the purview of the job dispatcher functionality. However, > we cannot tune the number of netty threads used to handle requests. Especially > when using the REST server as a monitoring endpoint, a lack of netty threads > when processing of AbstractHandlers takes a long time causes extended response > times. > Such configuration already exists for taskmanagers > (*taskmanager.network.netty.client/server.numThreads*) and such > implementation of a *server.netty-boss.numThreads* and > *server.netty-worker.numThreads* would be of great use. -- This message was sent by Atlassian Jira (v8.3.4#803005)
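For reference, the existing TaskManager-side options mentioned in the description are ordinary `flink-conf.yaml` entries; the REST-side counterparts proposed in this ticket are hypothetical and shown commented out (names taken from the ticket, values illustrative only):

```yaml
# Existing, real TaskManager options for Netty thread counts:
taskmanager.network.netty.server.numThreads: 8
taskmanager.network.netty.client.numThreads: 8

# Proposed in FLINK-23321 (hypothetical, not implemented):
# server.netty-boss.numThreads: 2
# server.netty-worker.numThreads: 8
```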
[jira] [Updated] (FLINK-23321) Make JM REST Netty thread pool count configurable
[ https://issues.apache.org/jira/browse/FLINK-23321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Raga updated FLINK-23321: - Summary: Make JM REST Netty thread pool count configurable (was: Make REST Netty thread pool count configurable) > Make JM REST Netty thread pool count configurable > - > > Key: FLINK-23321 > URL: https://issues.apache.org/jira/browse/FLINK-23321 > Project: Flink > Issue Type: Improvement > Components: Runtime / Metrics, Runtime / REST >Reporter: Nicolas Raga >Priority: Minor > Labels: Monitoring, Netty, REST > Original Estimate: 1h > Remaining Estimate: 1h > > Currently > [rest.server.numThreads|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/#rest-server-numthreads] > works to create more DispatcherRestEndpoint threads which help in high usage > of functionality in the purview of job the dispatcher functionality. However, > we cannot tune the number of netty threads use to handle incoming > connections. Especially when using the REST server as a monitoring endpoint, > a lack of netty threads cause connection times and a long lags in responses. > > Such configuration already exists for taskmanagers > (*taskmanager.network.netty.client/server.numThreads*) and such > implementation of a *server.netty-boss.numThreads* and > *server.netty-worker.numThreads* would be of great use. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23230) Cannot compile Flink on MacOS with M1 chip
[ https://issues.apache.org/jira/browse/FLINK-23230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377661#comment-17377661 ] Osama Neiroukh commented on FLINK-23230: Thank you [~trohrmann]. Let me provide an update on what I did, and I'll let you decide what to do with this issue, as I'm able to make forward progress. I didn't want to block on these. For node, I installed the expected version 10.9.0 manually on my machine. Performance with Rosetta 2 (MacOS's x86/aarch64 emulation) is very good. So that's good enough for now. For protobuf within format/Parquet, I installed an x86 version in my local maven repository, registered it with Maven, and again Rosetta 2 is able to run it with no issues. The final issue, fyi, was in flink-python. This uses protoc-jar, which only supports very limited versions of protoc. I installed protoc x86 locally, modified flink-python/pom.xml to include protoc, and now it is using my locally installed protoc and working fine. With all this, I was able to compile Flink all the way, run an example locally, and verify that the web dashboard is also working well. Thanks Osama > Cannot compile Flink on MacOS with M1 chip > -- > > Key: FLINK-23230 > URL: https://issues.apache.org/jira/browse/FLINK-23230 > Project: Flink > Issue Type: Bug > Components: Build System >Affects Versions: 1.13.1 >Reporter: Osama Neiroukh >Priority: Minor > > Flink doesn't currently compile on MacOS with M1 silicon. > This is true for all recent versions (1.13.X) as well as master. > Some of the problems have potentially easy fixes, such as installing node > separately or updating the relevant pom.xml to use a newer version of node. I > am getting some errors about deprecated features being used which are not > supported by newer node, but on the surface they seem easy to resolve. > I've had less success with complex dependencies such as protobuf. > My long term objective is to use and contribute to Flink. 
If I can get some > help with the above issues, I am willing to make the modifications, submit > the changes as a pull request, and shepherd them to release. If compilation > on MacOS/M1 is not a priority, I can look for a virtual machine solution > instead. Feedback appreciated. > > Thanks > > Osama -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #16433: FLIP-180: Adjust StreamStatus and Idleness definition (POC)
flinkbot edited a comment on pull request #16433: URL: https://github.com/apache/flink/pull/16433#issuecomment-876747425 ## CI report: * 7e0fa4ef81529c8b80baa99a8bbd8c515b9b5c69 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20199) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-23321) Make REST Netty thread pool count configurable
[ https://issues.apache.org/jira/browse/FLINK-23321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Raga updated FLINK-23321: - Description: Currently [rest.server.numThreads|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/#rest-server-numthreads] works to create more DispatcherRestEndpoint threads, which helps under heavy use of the dispatcher functionality. However, we cannot tune the number of netty threads used to handle incoming connections. Especially when using the REST server as a monitoring endpoint, a lack of netty threads causes long connection times and lags in responses. Such configuration already exists for taskmanagers (*taskmanager.network.netty.client/server.numThreads*), and an implementation of a *server.netty-boss.numThreads* and *server.netty-worker.numThreads* would be of great use. was:Currently [rest.server.numThreads|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/#rest-server-numthreads] works to create more DispatcherRestEndpoint threads which help in the purview of job the dispatcher functionality. However, when used for the REST server for monitoring of a running job, AbstractRestHandlers actually run on a netty thread pool we have no control over as seen [here|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/rest/RestServerEndpoint.java#L188-L193]. Especially when monitoring unhealthy jobs, we would want to allow creation of more netty-worker threads that can process incoming connection requests. 
> Make REST Netty thread pool count configurable > -- > > Key: FLINK-23321 > URL: https://issues.apache.org/jira/browse/FLINK-23321 > Project: Flink > Issue Type: Improvement > Components: Runtime / Metrics, Runtime / REST >Reporter: Nicolas Raga >Priority: Minor > Labels: Monitoring, Netty, REST > Original Estimate: 1h > Remaining Estimate: 1h > > Currently > [rest.server.numThreads|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/#rest-server-numthreads] > works to create more DispatcherRestEndpoint threads, which helps under heavy > use of the dispatcher functionality. However, > we cannot tune the number of netty threads used to handle incoming > connections. Especially when using the REST server as a monitoring endpoint, > a lack of netty threads causes long connection times and lags in responses. > > Such configuration already exists for taskmanagers > (*taskmanager.network.netty.client/server.numThreads*), and an > implementation of a *server.netty-boss.numThreads* and > *server.netty-worker.numThreads* would be of great use. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23321) Make REST Netty thread county configurable
Nicolas Raga created FLINK-23321: Summary: Make REST Netty thread county configurable Key: FLINK-23321 URL: https://issues.apache.org/jira/browse/FLINK-23321 Project: Flink Issue Type: Improvement Components: Runtime / Metrics, Runtime / REST Reporter: Nicolas Raga Currently [rest.server.numThreads|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/#rest-server-numthreads] works to create more DispatcherRestEndpoint threads which help in the purview of job the dispatcher functionality. However, when used for the REST server for monitoring of a running job, AbstractRestHandlers actually run on a netty thread pool we have no control over as seen [here|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/rest/RestServerEndpoint.java#L188-L193]. Especially when monitoring unhealthy jobs, we would want to allow creation of more netty-worker threads that can process incoming connection requests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-23321) Make REST Netty thread pool count configurable
[ https://issues.apache.org/jira/browse/FLINK-23321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Raga updated FLINK-23321: - Summary: Make REST Netty thread pool count configurable (was: Make REST Netty thread county configurable) > Make REST Netty thread pool count configurable > -- > > Key: FLINK-23321 > URL: https://issues.apache.org/jira/browse/FLINK-23321 > Project: Flink > Issue Type: Improvement > Components: Runtime / Metrics, Runtime / REST >Reporter: Nicolas Raga >Priority: Minor > Labels: Monitoring, Netty, REST > Original Estimate: 1h > Remaining Estimate: 1h > > Currently > [rest.server.numThreads|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/#rest-server-numthreads] > works to create more DispatcherRestEndpoint threads which help in the > purview of job the dispatcher functionality. However, when used for the REST > server for monitoring of a running job, AbstractRestHandlers actually run on > a netty thread pool we have no control over as seen > [here|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/rest/RestServerEndpoint.java#L188-L193]. > Especially when monitoring unhealthy jobs, we would want to allow creation > of more netty-worker threads that can process incoming connection requests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
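The boss/worker split this ticket refers to is Netty's standard accept/process model: a small boss pool accepts connections and hands them to a larger worker pool for processing. As a rough sketch of that pattern using only `java.util.concurrent` (not Flink or Netty code — the thread counts are hypothetical stand-ins for the proposed `server.netty-boss.numThreads` / `server.netty-worker.numThreads` options):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BossWorkerSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical values; not real Flink config keys (yet).
        int bossThreads = 1;   // accepts incoming "connections"
        int workerThreads = 8; // processes the accepted requests
        ExecutorService boss = Executors.newFixedThreadPool(bossThreads);
        ExecutorService workers = Executors.newFixedThreadPool(workerThreads);

        int requests = 100;
        CountDownLatch handled = new CountDownLatch(requests);
        boss.submit(() -> {
            for (int i = 0; i < requests; i++) {
                // The boss does no processing itself; it only hands off work,
                // which is why a too-small worker pool stalls responses.
                workers.submit(handled::countDown);
            }
        });
        handled.await();
        System.out.println("handled=" + requests);
        boss.shutdown();
        workers.shutdown();
    }
}
```

In Flink's actual `RestServerEndpoint`, both pools are Netty `NioEventLoopGroup`s; the ticket's point is that their sizes are currently hard-coded rather than exposed as configuration.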
[GitHub] [flink] flinkbot edited a comment on pull request #16345: [FLINK-18783] Load AkkaRpcSystem through separate classloader
flinkbot edited a comment on pull request #16345: URL: https://github.com/apache/flink/pull/16345#issuecomment-872302449 ## CI report: * 3c74e6208e91e48260fb5d1036680fc40e58a7f5 UNKNOWN * b0edbbdef02426e81c0bafde9faeea2fd97483b8 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=20194) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-18112) Approximate Task-Local Recovery -- Milestone One
[ https://issues.apache.org/jira/browse/FLINK-18112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flink Jira Bot updated FLINK-18112: --- Labels: auto-deprioritized-major auto-unassigned (was: auto-unassigned stale-major) Priority: Minor (was: Major) This issue was labeled "stale-major" 7 days ago and has not received any updates, so it is being deprioritized. If this ticket is actually Major, please raise the priority and ask a committer to assign you the issue or revive the public discussion. > Approximate Task-Local Recovery -- Milestone One > > > Key: FLINK-18112 > URL: https://issues.apache.org/jira/browse/FLINK-18112 > Project: Flink > Issue Type: New Feature > Components: Runtime / Checkpointing, Runtime / Coordination, Runtime > / Network >Affects Versions: 1.12.0 >Reporter: Yuan Mei >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned > > This is the Jira ticket for Milestone One of [FLIP-135 Approximate Task-Local > Recovery|https://cwiki.apache.org/confluence/display/FLINK/FLIP-135+Approximate+Task-Local+Recovery] > In short, in Approximate Task-Local Recovery, if a task fails, only the > failed task restarts without affecting the rest of the job. To ease > discussion, we divide the problem of approximate task-local recovery into > three parts, with each part only focusing on addressing a set of problems. > This Jira ticket focuses on addressing the first milestone. > Milestone One: sink recovery. Here a sink task means a task with no consumers > reading data from it. In this scenario, if a sink vertex fails, the sink is > restarted from the last successfully completed checkpoint and data loss is > expected. If a non-sink vertex fails, a regional failover strategy takes > place. In milestone one, we focus on issues related to task failure handling > and upstream reconnection. 
> > Milestone one includes two areas of change: > *Part 1*: Network part: how the failed task is able to link to the upstream > Result(Sub)Partitions and continue processing data > *Part 2*: Scheduling part: a new failover strategy to restart only the sink > when the sink fails. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-22949) java.io.InvalidClassException With Flink Kafka Beam
[ https://issues.apache.org/jira/browse/FLINK-22949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flink Jira Bot updated FLINK-22949: --- I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community manage its development. I see this issue has been marked as a Blocker but is unassigned, and neither it nor its Sub-Tasks have been updated for 1 day. I have gone ahead and marked it "stale-blocker". If this ticket is a Blocker, please either assign yourself or give an update. Afterwards, please remove the label, or in 7 days the issue will be deprioritized. > java.io.InvalidClassException With Flink Kafka Beam > --- > > Key: FLINK-22949 > URL: https://issues.apache.org/jira/browse/FLINK-22949 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.12.0 >Reporter: Ravikiran Borse >Priority: Blocker > Labels: stale-blocker > > Beam: 2.30.0 > Flink: 1.12.0 > Kafka: 2.6.0 > ERROR:root:java.io.InvalidClassException: > org.apache.flink.streaming.api.graph.StreamConfig$NetworkInputConfig; local > class incompatible: stream classdesc serialVersionUID = 3698633776553163849, > local class serialVersionUID = -3137689219135046939 > > In Flink Logs > KafkaIO.Read.ReadFromKafkaViaSDF/{ParDo(GenerateKafkaSourceDescriptor), > KafkaIO.ReadSourceDescriptors} (1/1)#0 (b0c31371874208adb0ccaff85b971883) > switched from RUNNING to FAILED. 
> org.apache.flink.streaming.runtime.tasks.StreamTaskException: Could not > deserialize inputs > at > org.apache.flink.streaming.api.graph.StreamConfig.getInputs(StreamConfig.java:265) > ~[flink-dist_2.12-1.12.0.jar:1.12.0] > at > org.apache.flink.streaming.api.graph.StreamConfig.getTypeSerializerIn(StreamConfig.java:280) > ~[flink-dist_2.12-1.12.0.jar:1.12.0] > at > org.apache.flink.streaming.api.graph.StreamConfig.getTypeSerializerIn1(StreamConfig.java:271) > ~[flink-dist_2.12-1.12.0.jar:1.12.0] > at > org.apache.flink.streaming.runtime.tasks.OperatorChain.wrapOperatorIntoOutput(OperatorChain.java:639) > ~[flink-dist_2.12-1.12.0.jar:1.12.0] > at > org.apache.flink.streaming.runtime.tasks.OperatorChain.createOperatorChain(OperatorChain.java:591) > ~[flink-dist_2.12-1.12.0.jar:1.12.0] > at > org.apache.flink.streaming.runtime.tasks.OperatorChain.createOutputCollector(OperatorChain.java:526) > ~[flink-dist_2.12-1.12.0.jar:1.12.0] > at > org.apache.flink.streaming.runtime.tasks.OperatorChain.<init>(OperatorChain.java:164) > ~[flink-dist_2.12-1.12.0.jar:1.12.0] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:485) > ~[flink-dist_2.12-1.12.0.jar:1.12.0] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:531) > ~[flink-dist_2.12-1.12.0.jar:1.12.0] > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) > [flink-dist_2.12-1.12.0.jar:1.12.0] > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) > [flink-dist_2.12-1.12.0.jar:1.12.0] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282] > Caused by: java.io.InvalidClassException: > org.apache.flink.streaming.api.graph.StreamConfig$NetworkInputConfig; local > class incompatible: stream classdesc serialVersionUID = 3698633776553163849, > local class serialVersionUID = -3137689219135046939 -- This message was sent by Atlassian Jira (v8.3.4#803005)
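The `InvalidClassException` above is a serialized-form mismatch: the `StreamConfig$NetworkInputConfig` class on the submitting side and on the cluster were compiled from different Flink builds, so Java computed different default `serialVersionUID`s for them. The usual remedy is aligning the Flink versions on both sides; for one's own `Serializable` classes, pinning `serialVersionUID` explicitly keeps the serialized form stable across recompiles. A minimal illustration of the mechanism with a hypothetical class (not Flink code):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class VersionPinning {
    static class Config implements Serializable {
        // Explicitly pinned: recompiling this class (e.g. after adding a method)
        // keeps the same UID, so old serialized bytes still deserialize.
        // Without this line, the JVM derives a UID from the class shape, and
        // any structural change yields the mismatch seen in this report.
        private static final long serialVersionUID = 1L;
        int inputIndex = 3;
    }

    public static void main(String[] args) throws Exception {
        // Serialize, then deserialize in the same JVM: a round trip that
        // succeeds because the (pinned) UIDs trivially match.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(new Config());
        }
        try (ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            Config c = (Config) ois.readObject();
            System.out.println("inputIndex=" + c.inputIndex);
        }
    }
}
```

For the Beam-on-Flink case itself, no class-level fix applies from the user side: the Beam job server bundles Flink classes, so the bundled Flink version must match the cluster's (here, a Beam 2.30.0 runner against a Flink 1.12.0 cluster picked up incompatible builds of `StreamConfig`).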
[jira] [Updated] (FLINK-17826) Add missing custom query support on new jdbc connector
[ https://issues.apache.org/jira/browse/FLINK-17826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flink Jira Bot updated FLINK-17826: --- Labels: auto-deprioritized-major auto-unassigned pull-request-available (was: auto-unassigned pull-request-available stale-major) Priority: Minor (was: Major) This issue was labeled "stale-major" 7 days ago and has not received any updates, so it is being deprioritized. If this ticket is actually Major, please raise the priority and ask a committer to assign you the issue or revive the public discussion. > Add missing custom query support on new jdbc connector > -- > > Key: FLINK-17826 > URL: https://issues.apache.org/jira/browse/FLINK-17826 > Project: Flink > Issue Type: Bug > Components: Connectors / JDBC >Reporter: Jark Wu >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned, > pull-request-available > Fix For: 1.14.0 > > > In FLINK-17361, we added custom query support on JDBC tables, but missed adding the > same ability to the new jdbc connector (i.e. > {{JdbcDynamicTableSourceSinkFactory}}). > In the new jdbc connector, maybe we should call it {{scan.query}} to keep > consistent with other scan options; besides, we need to make {{"table-name"}} > optional, but add validation that "table-name" and "scan.query" shouldn't > both be empty, and "table-name" must not be empty when used as sink. -- This message was sent by Atlassian Jira (v8.3.4#803005)
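The validation the ticket proposes — "table-name" and "scan.query" must not both be empty, and "table-name" is mandatory for sinks — could be sketched as below. This is a hypothetical helper, not the actual `JdbcDynamicTableSourceSinkFactory` code; the option names match the ticket's proposal.

```java
import java.util.Optional;

public class JdbcOptionsValidation {
    // Source: at least one of 'table-name' / 'scan.query' must be present.
    static void validateSource(Optional<String> tableName, Optional<String> scanQuery) {
        if (!tableName.isPresent() && !scanQuery.isPresent()) {
            throw new IllegalArgumentException(
                    "Either 'table-name' or 'scan.query' must be set.");
        }
    }

    // Sink: 'scan.query' makes no sense for writing, so 'table-name' is required.
    static void validateSink(Optional<String> tableName) {
        if (!tableName.isPresent()) {
            throw new IllegalArgumentException(
                    "'table-name' must be set when the table is used as a sink.");
        }
    }

    public static void main(String[] args) {
        // A source defined only by a custom query passes validation.
        validateSource(Optional.empty(), Optional.of("SELECT id FROM t WHERE id > 10"));
        // A sink without a table name is rejected.
        try {
            validateSink(Optional.empty());
        } catch (IllegalArgumentException e) {
            System.out.println("sink check: " + e.getMessage());
        }
    }
}
```

The design point is that the two options overlap only on the read path, which is why the check has to be split by usage (source vs. sink) rather than applied uniformly at option-parsing time.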