[jira] [Commented] (FLINK-31811) Unsupported complex data type for Flink SQL
[ https://issues.apache.org/jira/browse/FLINK-31811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17712937#comment-17712937 ] Krzysztof Chmielewski commented on FLINK-31811: --- Hi [~jirawech.s] ??Could you share me reproducible code??? The code is attached to the issue I've created https://issues.apache.org/jira/browse/FLINK-31197 > Unsupported complex data type for Flink SQL > --- > > Key: FLINK-31811 > URL: https://issues.apache.org/jira/browse/FLINK-31811 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Affects Versions: 1.16.1 >Reporter: jirawech.s >Priority: Major > > I found this issue when I tried to write data on local filesystem using Flink > SQL > {code:java} > 19:51:32,966 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph > [] - compact-operator (1/4) > (4f2a09b638c786f74262c675d248afd9_80fe6c4f32f605d447b391cdb16cc1ff_0_4) > switched from RUNNING to FAILED on 69ed2306-371b-4bfc-a98e-bf75fb41748f @ > localhost (dataPort=-1). 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 > at java.util.ArrayList.rangeCheck(ArrayList.java:659) ~[?:1.8.0_301] > at java.util.ArrayList.get(ArrayList.java:435) ~[?:1.8.0_301] > at org.apache.parquet.schema.GroupType.getType(GroupType.java:216) > ~[parquet-column-1.12.2.jar:1.12.2] > at > org.apache.flink.formats.parquet.vector.ParquetSplitReaderUtil.createWritableColumnVector(ParquetSplitReaderUtil.java:523) > ~[flink-parquet-1.16.1.jar:1.16.1] > at > org.apache.flink.formats.parquet.vector.ParquetSplitReaderUtil.createWritableColumnVector(ParquetSplitReaderUtil.java:503) > ~[flink-parquet-1.16.1.jar:1.16.1] > at > org.apache.flink.formats.parquet.ParquetVectorizedInputFormat.createWritableVectors(ParquetVectorizedInputFormat.java:281) > ~[flink-parquet-1.16.1.jar:1.16.1] > at > org.apache.flink.formats.parquet.ParquetVectorizedInputFormat.createReaderBatch(ParquetVectorizedInputFormat.java:270) > ~[flink-parquet-1.16.1.jar:1.16.1] > at > org.apache.flink.formats.parquet.ParquetVectorizedInputFormat.createPoolOfBatches(ParquetVectorizedInputFormat.java:260) > ~[flink-parquet-1.16.1.jar:1.16.1] > {code} > What I tried to do is write a complex data type to a parquet file. > Here is the schema of the sink table. The problematic data type is > ARRAY> > {code:java} > CREATE TEMPORARY TABLE local_table ( > `user_id` STRING, `order_id` STRING, `amount` INT, `restaurant_id` STRING, > `experiment` ARRAY>, `dt` STRING > ) PARTITIONED BY (`dt`) WITH ( > 'connector'='filesystem', > 'path'='file:///tmp/test_hadoop_write', > 'format'='parquet', > 'auto-compaction'='true', > 'sink.partition-commit.policy.kind'='success-file' > ) {code} > PS. It used to work in Flink version 1.15.1. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
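The stack trace bottoms out in GroupType.getType(1) on a group that has only one child. As an illustration only (a simplified stand-in model, not the real org.apache.parquet.schema API, and the field names below are assumptions), the shape of the failure looks like this: Parquet's standard three-level LIST layout wraps the array element in a repeated group with exactly one field, so any reader code that assumes a second child at index 1 hits exactly this IndexOutOfBoundsException:

```java
import java.util.List;

// Simplified stand-in for org.apache.parquet.schema.GroupType: a named node
// with a list of child fields. Illustrative model only, not the real API.
public class ListEncodingSketch {
    record Group(String name, List<Group> children) {
        // Mirrors GroupType.getType(int): a plain index lookup that throws
        // IndexOutOfBoundsException when the index exceeds the child count.
        Group getType(int i) {
            return children.get(i);
        }
    }

    public static void main(String[] args) {
        // Three-level LIST layout for an ARRAY column: the repeated middle
        // group wraps a SINGLE element field.
        Group element = new Group("element", List.of());
        Group repeatedList = new Group("list", List.of(element));
        Group experiment = new Group("experiment", List.of(repeatedList));

        // Reader code that assumes a second child at index 1 fails here,
        // matching the "Index: 1, Size: 1" in the reported stack trace.
        try {
            repeatedList.getType(1);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("IndexOutOfBoundsException: " + e.getMessage());
        }
    }
}
```

This is only a model of the index arithmetic; where exactly the 1.16.1 vectorized reader makes that assumption is what FLINK-31197 discusses.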
[jira] [Comment Edited] (FLINK-31811) Unsupported complex data type for Flink SQL
[ https://issues.apache.org/jira/browse/FLINK-31811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17712410#comment-17712410 ] Krzysztof Chmielewski edited comment on FLINK-31811 at 4/14/23 2:18 PM: [~jirawech.s] I've pasted a wrong ticket number and have already edited my previous comment, sorry. I was talking about this one: https://issues.apache.org/jira/browse/FLINK-31197 which is about the parquet writer.
[jira] [Comment Edited] (FLINK-31811) Unsupported complex data type for Flink SQL
[ https://issues.apache.org/jira/browse/FLINK-31811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17712390#comment-17712390 ] Krzysztof Chmielewski edited comment on FLINK-31811 at 4/14/23 2:18 PM: I think this might be a duplicate of https://issues.apache.org/jira/browse/FLINK-31197 that manifests in the SQL API. P.S. Are you sure that this was working in 1.15.1? I know that writing "simple" complex types like arrays of integers or strings worked, but I'm not sure about this one, arrays of maps.
[jira] [Comment Edited] (FLINK-31811) Unsupported complex data type for Flink SQL
[ https://issues.apache.org/jira/browse/FLINK-31811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17712410#comment-17712410 ] Krzysztof Chmielewski edited comment on FLINK-31811 at 4/14/23 2:17 PM: [~jirawech.s] I've pasted a wrong ticket number and have already edited my previous comment. I was talking about this one: https://issues.apache.org/jira/browse/FLINK-31197 which is about the parquet writer. Sorry.
[jira] [Commented] (FLINK-31811) Unsupported complex data type for Flink SQL
[ https://issues.apache.org/jira/browse/FLINK-31811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17712410#comment-17712410 ] Krzysztof Chmielewski commented on FLINK-31811: --- [~jirawech.s] I've pasted a wrong ticket number and have already edited my previous comment. I was talking about this one: https://issues.apache.org/jira/browse/FLINK-31197 which is about the parquet writer. Sorry.
[jira] [Comment Edited] (FLINK-31811) Unsupported complex data type for Flink SQL
[ https://issues.apache.org/jira/browse/FLINK-31811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17712390#comment-17712390 ] Krzysztof Chmielewski edited comment on FLINK-31811 at 4/14/23 2:16 PM: I think this might be a duplicate of https://issues.apache.org/jira/browse/FLINK-31197 that manifests in the SQL API. P.S. Are you sure that this was working in 1.15.1? I know that writing "simple" complex types like arrays of integers or strings worked, but I'm not sure about this one, arrays of maps.
[jira] [Comment Edited] (FLINK-31811) Unsupported complex data type for Flink SQL
[ https://issues.apache.org/jira/browse/FLINK-31811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17712390#comment-17712390 ] Krzysztof Chmielewski edited comment on FLINK-31811 at 4/14/23 2:15 PM: I think this might be a duplicate of https://issues.apache.org/jira/browse/FLINK-31197 that manifests in the SQL API. P.S. Are you sure that this was working in 1.15.1? I know that writing "simple" complex types like arrays of integers or strings worked, but I'm not sure about this one, arrays of maps.
[jira] [Commented] (FLINK-31811) Unsupported complex data type for Flink SQL
[ https://issues.apache.org/jira/browse/FLINK-31811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17712390#comment-17712390 ] Krzysztof Chmielewski commented on FLINK-31811: --- I think this might be a duplicate of https://issues.apache.org/jira/browse/FLINK-31202 that manifests in the SQL API. P.S. Are you sure that this was working in 1.15.1? I know that writing "simple" complex types like arrays of integers or strings worked, but I'm not sure about this one, arrays of maps.
[jira] [Comment Edited] (FLINK-26051) one sql has row_number =1 and the subsequent SQL has "case when" and "where" statement result Exception : The window can only be ordered in ASCENDING mode
[ https://issues.apache.org/jira/browse/FLINK-26051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17700264#comment-17700264 ] Krzysztof Chmielewski edited comment on FLINK-26051 at 4/4/23 11:00 AM: Hi [~qingyue], is the problem that compCnt is wrongly calculated as zero for this query/rule, or that the zero is wrongly handled later in the code? With compCnt + 1 it seems that you add a bias value to every computation cost value, so I'm not surprised that so many plans have changed and tests are failing. > one sql has row_number =1 and the subsequent SQL has "case when" and "where" > statement result Exception : The window can only be ordered in ASCENDING mode > -- > > Key: FLINK-26051 > URL: https://issues.apache.org/jira/browse/FLINK-26051 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.12.2, 1.14.4 >Reporter: chuncheng wu >Assignee: Jane Chan >Priority: Major > Attachments: image-2022-02-10-20-13-14-424.png, > image-2022-02-11-11-18-20-594.png, image-2022-06-17-21-28-54-886.png > > > hello, > I have two SQL queries. One (sql0) is "select xx from ( ROW_NUMBER statement) where rn=1" and the other (sql1) is "select ${fields} from result where ${filter_conditions}". The fields quoted in sql1 include one "case when" field. The two queries work well separately, but combining them results in the following exception.
It happens when the > logical plan is translated into the physical plan: > > {code:java} > org.apache.flink.table.api.TableException: The window can only be ordered in > ASCENDING mode. > at > org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecOverAggregate.translateToPlanInternal(StreamExecOverAggregate.scala:98) > at > org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecOverAggregate.translateToPlanInternal(StreamExecOverAggregate.scala:52) > at > org.apache.flink.table.planner.plan.nodes.exec.ExecNode$class.translateToPlan(ExecNode.scala:59) > at > org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecOverAggregateBase.translateToPlan(StreamExecOverAggregateBase.scala:42) > at > org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:54) > at > org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:39) > at > org.apache.flink.table.planner.plan.nodes.exec.ExecNode$class.translateToPlan(ExecNode.scala:59) > at > org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalcBase.translateToPlan(StreamExecCalcBase.scala:38) > at > org.apache.flink.table.planner.delegation.StreamPlanner$$anonfun$translateToPlan$1.apply(StreamPlanner.scala:66) > at > org.apache.flink.table.planner.delegation.StreamPlanner$$anonfun$translateToPlan$1.apply(StreamPlanner.scala:65) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at scala.collection.Iterator$class.foreach(Iterator.scala:891) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1334) > at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) > at scala.collection.AbstractIterable.foreach(Iterable.scala:54) > at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) > at 
scala.collection.AbstractTraversable.map(Traversable.scala:104) > at > org.apache.flink.table.planner.delegation.StreamPlanner.translateToPlan(StreamPlanner.scala:65) > at > org.apache.flink.table.planner.delegation.StreamPlanner.explain(StreamPlanner.scala:103) > at > org.apache.flink.table.planner.delegation.StreamPlanner.explain(StreamPlanner.scala:42) > at > org.apache.flink.table.api.internal.TableEnvironmentImpl.explainInternal(TableEnvironmentImpl.java:630) > at > org.apache.flink.table.api.internal.TableImpl.explain(TableImpl.java:582) > at > com.meituan.grocery.data.flink.test.BugTest.testRowNumber(BugTest.java:69) >
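For reference, a minimal sketch of the query shape being described. All table and column names below are hypothetical placeholders, since the report elides the real ones: sql0 deduplicates with ROW_NUMBER() and keeps rn = 1, and sql1 layers a CASE WHEN projection and a WHERE filter on top of sql0's result. Planning the combined statement is what triggers the TableException above.

```java
// Hypothetical placeholder names throughout (src, k, v, ts, label); this only
// sketches the query shape that triggers "The window can only be ordered in
// ASCENDING mode" once the two statements are planned together.
public class QueryShapeSketch {

    // sql0 from the report: deduplicate with ROW_NUMBER(), keep one row per key
    static String buildSql0() {
        return "SELECT k, v FROM ("
             + " SELECT k, v, ROW_NUMBER() OVER (PARTITION BY k ORDER BY ts DESC) AS rn FROM src"
             + " ) WHERE rn = 1";
    }

    // sql1 from the report: a CASE WHEN projection plus a WHERE filter over sql0's result
    static String buildCombined() {
        return "SELECT k, CASE WHEN v > 0 THEN 'pos' ELSE 'neg' END AS label"
             + " FROM (" + buildSql0() + ") t"
             + " WHERE k IS NOT NULL";
    }

    public static void main(String[] args) {
        System.out.println(buildCombined());
    }
}
```

Either query on its own is reported to plan fine; it is the nesting of the CASE WHEN / WHERE block over the ROW_NUMBER deduplication that the planner mishandles.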
[jira] [Commented] (FLINK-26051) one sql has row_number =1 and the subsequent SQL has "case when" and "where" statement result Exception : The window can only be ordered in ASCENDING mode
[ https://issues.apache.org/jira/browse/FLINK-26051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17700264#comment-17700264 ] Krzysztof Chmielewski commented on FLINK-26051: --- Hi [~qingyue], I wonder, isn't the issue that for this query the compCnt (CPU cost) is wrongly calculated as zero? In other words, is the issue that compCnt is wrongly calculated as zero for this query/rule, or that the zero is wrongly handled later in the code? With compCnt + 1 it seems that you add a bias value to every computation cost value, so I'm not surprised that so many plans have changed and tests are failing.
[jira] [Commented] (FLINK-26051) one sql has row_number =1 and the subsequent SQL has "case when" and "where" statement result Exception : The window can only be ordered in ASCENDING mode
[ https://issues.apache.org/jira/browse/FLINK-26051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17699685#comment-17699685 ] Krzysztof Chmielewski commented on FLINK-26051:
---
Hi [~qingyue], I would like to help with solving this issue. Could you tell me where we are with this one? Reading the comments, I'm not sure whether you still have the branch/solution that was proposed by [~zhangbinzaifendou], or whether you have something new. I would like to work on this one, but I would need some help getting started. I read your comment about `CommonCalc cannot produce right CPU cost for computations.` The switch conditions there look really straightforward; do you have an idea what might be missing there?
> one sql has row_number =1 and the subsequent SQL has "case when" and "where"
> statement result Exception : The window can only be ordered in ASCENDING mode
> --
>
> Key: FLINK-26051
> URL: https://issues.apache.org/jira/browse/FLINK-26051
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Affects Versions: 1.12.2, 1.14.4
> Reporter: chuncheng wu
> Assignee: Jane Chan
> Priority: Major
> Attachments: image-2022-02-10-20-13-14-424.png,
> image-2022-02-11-11-18-20-594.png, image-2022-06-17-21-28-54-886.png
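The compCnt question in the comments above can be illustrated with a toy cost model. This is a simplified sketch with invented class and field names, not Flink's actual planner code: when the CPU component of a cost is (wrongly) computed as zero, two genuinely different plans become indistinguishable to the optimizer, and adding a constant bias such as `compCnt + 1` shifts every cost value, which is why so many plan tests change after such a patch.

```java
// Toy cost model for the discussion above. NOT Flink's planner code;
// the Cost class and its fields are invented for this example.
public class CostBiasDemo {
    public static final class Cost {
        public final double cpu; // plays the role of compCnt in the discussion
        public final double io;
        public Cost(double cpu, double io) { this.cpu = cpu; this.io = io; }
        public double total() { return cpu + io; }
    }

    public static void main(String[] args) {
        // Two different plans whose CPU cost was (wrongly) computed as zero:
        Cost planA = new Cost(0.0, 10.0); // a Calc doing real work, cost hidden
        Cost planB = new Cost(0.0, 10.0); // a genuinely cheap Calc
        // With cpu == 0 the optimizer cannot tell them apart:
        System.out.println(planA.total() == planB.total()); // true

        // A "+1" bias applied to every CPU cost changes *all* totals,
        // so previously equal-cost plan choices can flip everywhere:
        Cost biasedA = new Cost(planA.cpu + 1, planA.io);
        System.out.println(biasedA.total()); // 11.0
    }
}
```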
[jira] [Updated] (FLINK-31202) Add support for reading Parquet files containing Arrays with complex types.
[ https://issues.apache.org/jira/browse/FLINK-31202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krzysztof Chmielewski updated FLINK-31202:
--
Attachment: ParquetSourceArrayOfArraysIssue.java
ParquetSourceArrayOfRowIssue.java
> Add support for reading Parquet files containing Arrays with complex types.
> ---
>
> Key: FLINK-31202
> URL: https://issues.apache.org/jira/browse/FLINK-31202
> Project: Flink
> Issue Type: New Feature
> Affects Versions: 1.16.0, 1.17.0, 1.16.1, 1.16.2, 1.17.1
> Reporter: Krzysztof Chmielewski
> Priority: Major
> Attachments: ParquetSourceArrayOfArraysIssue.java,
> ParquetSourceArrayOfRowIssue.java, arrayOfArrayOfInts.snappy.parquet,
> arrayOfrows.snappy.parquet
>
>
> Reading complex types from Parquet has been possible since Flink 1.16, after
> implementing https://issues.apache.org/jira/browse/FLINK-24614
> However, this implementation lacks support for reading complex nested types
> such as:
> * Array<Row>
> * Array<Array>
> * Array<Map>
> This ticket is about adding support for reading these types from Parquet
> format files.
> Currently, when trying to read a Parquet file containing a column with such a
> type, the exception below is thrown:
> {code:java}
> Caused by: java.lang.RuntimeException: Unsupported type in the list: ROW<`f1`
> INT>
> at
> org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.readPrimitiveTypedRow(ArrayColumnReader.java:175)
> at
> org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.fetchNextValue(ArrayColumnReader.java:113)
> at
> org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.readToVector(ArrayColumnReader.java:81)
> {code}
> OR:
> {code:java}
> Caused by: java.lang.RuntimeException: Unsupported type in the list:
> ARRAY
> at
> org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.readPrimitiveTypedRow(ArrayColumnReader.java:175)
> at
> org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.fetchNextValue(ArrayColumnReader.java:113)
> at
> org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.readToVector(ArrayColumnReader.java:81)
> {code}
> Parquet files and reproducer code are attached to the ticket
--
This message was sent by Atlassian Jira (v8.20.10#820010)
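The "Unsupported type in the list" failure mode quoted above can be sketched in isolation. This is a deliberately simplified illustration with invented names, not the actual ArrayColumnReader code: a reader that dispatches on a list's element type and only knows primitive elements must reject ARRAY columns whose elements are ROW, ARRAY or MAP.

```java
// Simplified sketch of a list-element dispatch that only supports
// primitives, mirroring the failures reported above. Names are invented
// for the example; this is not Flink's ArrayColumnReader.
public class ListDispatchDemo {
    public enum Kind { INT, DOUBLE, ROW, ARRAY, MAP }

    public static Object readPrimitiveTypedRow(Kind elementKind) {
        switch (elementKind) {
            case INT:    return 0;
            case DOUBLE: return 0.0d;
            default:
                // Nested element types fall through to a generic error,
                // just like "Unsupported type in the list: ROW<`f1` INT>".
                throw new RuntimeException(
                    "Unsupported type in the list: " + elementKind);
        }
    }

    public static void main(String[] args) {
        System.out.println(readPrimitiveTypedRow(Kind.INT)); // primitive: ok
        try {
            readPrimitiveTypedRow(Kind.ROW);                 // nested: fails
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```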
[jira] [Updated] (FLINK-31202) Add support for reading Parquet files containing Arrays with complex types.
[ https://issues.apache.org/jira/browse/FLINK-31202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krzysztof Chmielewski updated FLINK-31202:
--
Attachment: arrayOfArrayOfInts.snappy.parquet
arrayOfrows.snappy.parquet
> Add support for reading Parquet files containing Arrays with complex types.
> ---
>
> Key: FLINK-31202
> URL: https://issues.apache.org/jira/browse/FLINK-31202
> Project: Flink
> Issue Type: New Feature
> Affects Versions: 1.16.0, 1.17.0, 1.16.1, 1.16.2, 1.17.1
> Reporter: Krzysztof Chmielewski
> Priority: Major
> Attachments: ParquetSourceArrayOfArraysIssue.java,
> ParquetSourceArrayOfRowIssue.java, arrayOfArrayOfInts.snappy.parquet,
> arrayOfrows.snappy.parquet
[jira] [Updated] (FLINK-31202) Add support for reading Parquet files containing Arrays with complex types.
[ https://issues.apache.org/jira/browse/FLINK-31202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krzysztof Chmielewski updated FLINK-31202:
--
Description:
Reading complex types from Parquet has been possible since Flink 1.16, after implementing https://issues.apache.org/jira/browse/FLINK-24614
However, this implementation lacks support for reading complex nested types such as:
* Array<Row>
* Array<Array>
* Array<Map>
This ticket is about adding support for reading these types from Parquet format files.
Currently, when trying to read a Parquet file containing a column with such a type, the exception below is thrown:
{code:java}
Caused by: java.lang.RuntimeException: Unsupported type in the list: ROW<`f1` INT>
at org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.readPrimitiveTypedRow(ArrayColumnReader.java:175)
at org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.fetchNextValue(ArrayColumnReader.java:113)
at org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.readToVector(ArrayColumnReader.java:81)
{code}
OR:
{code:java}
Caused by: java.lang.RuntimeException: Unsupported type in the list: ARRAY
at org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.readPrimitiveTypedRow(ArrayColumnReader.java:175)
at org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.fetchNextValue(ArrayColumnReader.java:113)
at org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.readToVector(ArrayColumnReader.java:81)
{code}
Parquet files and reproducer code are attached to the ticket
> Add support for reading Parquet files containing Arrays with complex types.
> ---
>
> Key: FLINK-31202
> URL: https://issues.apache.org/jira/browse/FLINK-31202
> Project: Flink
> Issue Type: New Feature
> Affects Versions: 1.16.0, 1.17.0, 1.16.1, 1.16.2, 1.17.1
> Reporter: Krzysztof Chmielewski
> Priority: Major
[jira] [Created] (FLINK-31202) Add support for reading Parquet files containing Arrays with complex types.
Krzysztof Chmielewski created FLINK-31202:
-
Summary: Add support for reading Parquet files containing Arrays with complex types.
Key: FLINK-31202
URL: https://issues.apache.org/jira/browse/FLINK-31202
Project: Flink
Issue Type: New Feature
Affects Versions: 1.16.1, 1.16.0, 1.17.0, 1.16.2, 1.17.1
Reporter: Krzysztof Chmielewski
Reading complex types from Parquet has been possible since Flink 1.16, after implementing https://issues.apache.org/jira/browse/FLINK-24614
However, this implementation lacks support for reading complex nested types such as:
* Array<Row>
* Array<Array>
* Array<Map>
This ticket is about adding support for reading these types from Parquet format files.
Currently, when trying to read a Parquet file containing a column with such a type, the exception below is thrown:
{code:java}
Caused by: java.lang.RuntimeException: Unsupported type in the list: ROW<`f1` INT>
at org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.readPrimitiveTypedRow(ArrayColumnReader.java:175)
at org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.fetchNextValue(ArrayColumnReader.java:113)
at org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.readToVector(ArrayColumnReader.java:81)
{code}
OR:
{code:java}
Caused by: java.lang.RuntimeException: Unsupported type in the list: ARRAY
at org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.readPrimitiveTypedRow(ArrayColumnReader.java:175)
at org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.fetchNextValue(ArrayColumnReader.java:113)
at org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.readToVector(ArrayColumnReader.java:81)
{code}
[jira] [Updated] (FLINK-31197) Unable to write Parquet files containing Arrays with complex types.
[ https://issues.apache.org/jira/browse/FLINK-31197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krzysztof Chmielewski updated FLINK-31197:
--
Summary: Unable to write Parquet files containing Arrays with complex types. (was: Exception while writing Parqeut files containing Arrays with complex types.)
> Unable to write Parquet files containing Arrays with complex types.
> ---
>
> Key: FLINK-31197
> URL: https://issues.apache.org/jira/browse/FLINK-31197
> Project: Flink
> Issue Type: Bug
> Components: Connectors / FileSystem, Formats (JSON, Avro, Parquet,
> ORC, SequenceFile)
> Affects Versions: 1.15.0, 1.15.1, 1.16.0, 1.17.0, 1.15.2, 1.15.3, 1.16.1,
> 1.15.4, 1.16.2, 1.17.1, 1.15.5
> Reporter: Krzysztof Chmielewski
> Priority: Major
> Attachments: ParquetSinkArrayOfArraysIssue.java
>
>
> After https://issues.apache.org/jira/browse/FLINK-17782 it should be possible
> to write complex types with the File sink using the Parquet format.
> However, it turns out that it is still impossible to write types such as:
> * Array<Row>
> * Array<Array>
> * Array<Map>
> When trying to write a Parquet row with such types, the exception below is
> thrown:
> {code:java}
> Caused by: java.lang.RuntimeException:
> org.apache.parquet.io.ParquetEncodingException: empty fields are illegal, the
> field should be ommited completely instead
> at
> org.apache.flink.formats.parquet.row.ParquetRowDataBuilder$ParquetWriteSupport.write(ParquetRowDataBuilder.java:91)
> at
> org.apache.flink.formats.parquet.row.ParquetRowDataBuilder$ParquetWriteSupport.write(ParquetRowDataBuilder.java:71)
> at
> org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:138)
> at org.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:310)
> at
> org.apache.flink.formats.parquet.ParquetBulkWriter.addElement(ParquetBulkWriter.java:52)
> at
> org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.write(BulkPartWriter.java:51)
> at
> org.apache.flink.connector.file.sink.writer.FileWriterBucket.write(FileWriterBucket.java:191)
> {code}
> The exception is misleading and does not show the real problem.
> The reason why those complex types still do not work is that during
> development of https://issues.apache.org/jira/browse/FLINK-17782
> the code paths for those types were left without implementation: no
> UnsupportedOperationException, nothing, simply empty methods. In
> https://github.com/apache/flink/blob/release-1.16.1/flink-formats/flink-parquet/src/main/java/org/apache/flink/formats/parquet/row/ParquetRowDataWriter.java
> you will see
> {code:java}
> @Override
> public void write(ArrayData arrayData, int ordinal) {}
> {code}
> for MapWriter, ArrayWriter and RowWriter.
> I see two problems here:
> 1. Writing those three types is still not possible.
> 2. Flink throws an exception that gives no hint about the real issue
> here. It could throw "Unsupported operation" for now. Maybe this should be
> an item for a different ticket?
> The code to reproduce this issue is attached to the ticket. It tries to write
> a single row with one column of type Array<Array> to a Parquet file.
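The chain described above, where a silently empty write method surfaces much later as a misleading ParquetEncodingException, can be sketched with plain Java. All names here are invented for the example; this is not the real ParquetRowDataWriter: the no-op nested writer leaves the field empty, and only the downstream encoding step complains, hiding the real cause.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of why an empty write() surfaces as a misleading downstream
// error. Names are invented; this is not Flink's ParquetRowDataWriter.
public class EmptyWriterDemo {
    public interface FieldWriter { void write(List<Object> out, Object value); }

    // The buggy pattern: a nested-array writer left as an empty method.
    public static final FieldWriter ARRAY_WRITER = (out, value) -> { /* no-op */ };
    public static final FieldWriter INT_WRITER = (out, value) -> out.add(value);

    // A stand-in for the later encoding step that rejects empty fields,
    // echoing the verbatim Parquet message quoted in the stack trace above.
    public static void encode(List<Object> field) {
        if (field.isEmpty()) {
            throw new RuntimeException(
                "empty fields are illegal, the field should be ommited completely instead");
        }
    }

    public static void main(String[] args) {
        List<Object> field = new ArrayList<>();
        ARRAY_WRITER.write(field, new int[][] {{1, 2}}); // silently writes nothing
        try {
            encode(field); // the real cause is hidden behind this message
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```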
[jira] [Updated] (FLINK-31197) Exception while writing Parqeut files containing Arrays with complex types.
[ https://issues.apache.org/jira/browse/FLINK-31197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krzysztof Chmielewski updated FLINK-31197:
--
Component/s: Connectors / FileSystem
Formats (JSON, Avro, Parquet, ORC, SequenceFile)
> Exception while writing Parqeut files containing Arrays with complex types.
> ---
>
> Key: FLINK-31197
> URL: https://issues.apache.org/jira/browse/FLINK-31197
> Project: Flink
> Issue Type: Bug
> Components: Connectors / FileSystem, Formats (JSON, Avro, Parquet,
> ORC, SequenceFile)
> Affects Versions: 1.15.0, 1.15.1, 1.16.0, 1.17.0, 1.15.2, 1.15.3, 1.16.1,
> 1.15.4, 1.16.2, 1.17.1, 1.15.5
> Reporter: Krzysztof Chmielewski
> Priority: Major
> Attachments: ParquetSinkArrayOfArraysIssue.java
[jira] [Updated] (FLINK-31197) Exception while writing Parqeut files containing Arrays with complex types.
[ https://issues.apache.org/jira/browse/FLINK-31197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krzysztof Chmielewski updated FLINK-31197:
--
Description:
After https://issues.apache.org/jira/browse/FLINK-17782 it should be possible to write complex types with the File sink using the Parquet format.
However, it turns out that it is still impossible to write types such as:
* Array<Row>
* Array<Array>
* Array<Map>
When trying to write a Parquet row with such types, the exception below is thrown:
{code:java}
Caused by: java.lang.RuntimeException: org.apache.parquet.io.ParquetEncodingException: empty fields are illegal, the field should be ommited completely instead
at org.apache.flink.formats.parquet.row.ParquetRowDataBuilder$ParquetWriteSupport.write(ParquetRowDataBuilder.java:91)
at org.apache.flink.formats.parquet.row.ParquetRowDataBuilder$ParquetWriteSupport.write(ParquetRowDataBuilder.java:71)
at org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:138)
at org.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:310)
at org.apache.flink.formats.parquet.ParquetBulkWriter.addElement(ParquetBulkWriter.java:52)
at org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.write(BulkPartWriter.java:51)
at org.apache.flink.connector.file.sink.writer.FileWriterBucket.write(FileWriterBucket.java:191)
{code}
The exception is misleading and does not show the real problem.
The reason why those complex types still do not work is that during development of https://issues.apache.org/jira/browse/FLINK-17782 the code paths for those types were left without implementation: no UnsupportedOperationException, nothing, simply empty methods. In https://github.com/apache/flink/blob/release-1.16.1/flink-formats/flink-parquet/src/main/java/org/apache/flink/formats/parquet/row/ParquetRowDataWriter.java you will see
{code:java}
@Override
public void write(ArrayData arrayData, int ordinal) {}
{code}
for MapWriter, ArrayWriter and RowWriter.
I see two problems here:
1. Writing those three types is still not possible.
2. Flink throws an exception that gives no hint about the real issue here. It could throw "Unsupported operation" for now. Maybe this should be an item for a different ticket?
The code to reproduce this issue is attached to the ticket. It tries to write a single row with one column of type Array<Array> to a Parquet file.
> Exception while writing Parqeut files containing Arrays with complex types.
> ---
>
>
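Point 2 in the description above, failing fast instead of silently writing nothing, could look like the following. This is a sketch against an invented interface, not the actual Flink writer classes:

```java
// Sketch of the fail-fast alternative suggested in point 2 above.
// The interface is invented for the example, not Flink's actual API.
public class FailFastWriterDemo {
    public interface ArrayElementWriter { void write(Object arrayData, int ordinal); }

    // Instead of an empty method body, surface the limitation immediately:
    public static final ArrayElementWriter ROW_IN_ARRAY_WRITER = (arrayData, ordinal) -> {
        throw new UnsupportedOperationException(
            "Writing ROW elements inside ARRAY columns is not supported yet");
    };

    public static void main(String[] args) {
        try {
            ROW_IN_ARRAY_WRITER.write(new Object(), 0);
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage()); // a clear, actionable error
        }
    }
}
```

With this pattern the user sees the unsupported type at write time, rather than a confusing "empty fields are illegal" error from deep inside the Parquet encoder.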
[jira] [Updated] (FLINK-31197) Exception while writing Parqeut files containing Arrays with complex types.
[ https://issues.apache.org/jira/browse/FLINK-31197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krzysztof Chmielewski updated FLINK-31197:
--
Description: (minor formatting changes)
> Exception while writing Parqeut files containing Arrays with complex types.
> ---
>
>
[jira] [Created] (FLINK-31197) Exception while writing Parquet files containing Arrays with complex types.
Krzysztof Chmielewski created FLINK-31197: - Summary: Exception while writing Parquet files containing Arrays with complex types. Key: FLINK-31197 URL: https://issues.apache.org/jira/browse/FLINK-31197 Project: Flink Issue Type: Bug Affects Versions: 1.16.1, 1.15.3, 1.15.2, 1.16.0, 1.15.1, 1.15.0, 1.17.0, 1.15.4, 1.16.2, 1.17.1, 1.15.5 Reporter: Krzysztof Chmielewski Attachments: ParquetSinkArrayOfArraysIssue.java After https://issues.apache.org/jira/browse/FLINK-17782 it should be possible to write complex types with the File sink using the Parquet format. However, it turns out that it is still impossible to write types such as: Array Array Array When trying to write a Parquet row with such types, the below exception is thrown: {code:java} Caused by: java.lang.RuntimeException: org.apache.parquet.io.ParquetEncodingException: empty fields are illegal, the field should be ommited completely instead at org.apache.flink.formats.parquet.row.ParquetRowDataBuilder$ParquetWriteSupport.write(ParquetRowDataBuilder.java:91) at org.apache.flink.formats.parquet.row.ParquetRowDataBuilder$ParquetWriteSupport.write(ParquetRowDataBuilder.java:71) at org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:138) at org.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:310) at org.apache.flink.formats.parquet.ParquetBulkWriter.addElement(ParquetBulkWriter.java:52) at org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.write(BulkPartWriter.java:51) at org.apache.flink.connector.file.sink.writer.FileWriterBucket.write(FileWriterBucket.java:191) {code} The exception is misleading, not showing the real problem. The reason why those complex types are still not working is that during development of https://issues.apache.org/jira/browse/FLINK-17782 the code paths for those types were left without implementation: no UnsupportedOperationException, nothing, simply empty methods. 
In https://github.com/apache/flink/blob/release-1.16.1/flink-formats/flink-parquet/src/main/java/org/apache/flink/formats/parquet/row/ParquetRowDataWriter.java you will see {code:java} @Override public void write(ArrayData arrayData, int ordinal) {} {code} for MapWriter, ArrayWriter and RowWriter. I see two problems here: 1. Writing those three types is still not possible. 2. Flink is throwing an exception that gives no hint about the real issue here. It could throw "Unsupported operation" for now. Maybe this should be an item for a different ticket? The code to reproduce this issue is attached to the ticket. It tries to write to a Parquet file a single row with one column of type Array> -- This message was sent by Atlassian Jira (v8.20.10#820010)
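The empty writer bodies described above can be illustrated with a minimal sketch. This is not Flink's actual ParquetRowDataWriter code; the class name and signature merely mirror the shape of its inner writers, and the thrown UnsupportedOperationException shows the fail-fast behavior the comment suggests as an interim fix:

```java
// Illustrative sketch only: mirrors the shape of the empty inner writers in
// ParquetRowDataWriter (release-1.16.1). The original method body is empty,
// so the array element is silently dropped and Parquet later fails with
// "empty fields are illegal". Throwing early makes the limitation explicit.
public class ArrayWriterSketch {

    public void write(Object arrayData, int ordinal) {
        // release-1.16.1 behavior: empty body, element never written.
        // Suggested interim behavior instead:
        throw new UnsupportedOperationException(
                "Writing ARRAY/MAP/ROW elements nested inside arrays is not supported");
    }

    public static void main(String[] args) {
        try {
            new ArrayWriterSketch().write(new int[] {1, 2, 3}, 0);
        } catch (UnsupportedOperationException e) {
            // The caller now sees the real limitation instead of a
            // misleading ParquetEncodingException further downstream.
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```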
[jira] [Comment Edited] (FLINK-31021) JavaCodeSplitter doesn't split static method properly
[ https://issues.apache.org/jira/browse/FLINK-31021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687449#comment-17687449 ] Krzysztof Chmielewski edited comment on FLINK-31021 at 2/11/23 2:10 PM: Ok, so I verified, and it seems that this is NOT a regression caused by my recent change to the Code Splitter. It seems that splitting static methods was never supported; I've checked on the 1.15 branch. I think the question we should ask here is whether this is in fact a bug and whether we need to make the Code Splitter handle static methods. Quoting [~TsReaper] from https://github.com/apache/flink/pull/21393#pullrequestreview-1273870828 {code:java} Our code splitter is not a universal solution. It only works for Flink generated code under several restrictions. {code} Having said that, [~xccui] is there a Flink SQL query that makes the Planner generate Java code with static methods? If so, could you provide one? If this is in fact a needed feature, I can work on it, since I've recently made bigger changes to the code splitter and it would be fairly easy for me to add this. [~TsReaper] What do you think? was (Author: kristoffsc): Ok, so I verified, and it seems that this is NOT a regression caused by my recent change to the Code Splitter. It seems that splitting static methods was never supported; I've checked on the 1.15 branch. I think the question we should ask here is whether this is in fact a bug and whether we need to make the Code Splitter handle static methods. Quoting [~TsReaper] from https://github.com/apache/flink/pull/21393#pullrequestreview-1273870828 {code:java} Our code splitter is not a universal solution. It only works for Flink generated code under several restrictions. {code} [~xccui] is there any Flink SQL query that makes the Planner generate Java code with static methods? If so, could you provide one? If this is in fact a needed feature, I can work on it, since I've recently made bigger changes to the code splitter and it would be fairly easy for me to add this. 
[~TsReaper] What do you think? > JavaCodeSplitter doesn't split static method properly > - > > Key: FLINK-31021 > URL: https://issues.apache.org/jira/browse/FLINK-31021 > Project: Flink > Issue Type: Bug >Affects Versions: 1.14.4, 1.15.3, 1.16.1 >Reporter: Xingcan Cui >Priority: Minor > > The exception while compiling the generated source > {code:java} > cause=org.codehaus.commons.compiler.CompileException: Line 3383, Column 90: > Instance method "default void > org.apache.flink.formats.protobuf.deserialize.GeneratedProtoToRow_655d75db1cf943838f5500013edfba82.decodeImpl(foo.bar.LogData)" > cannot be invoked in static context,{code} > The original method header > {code:java} > public static RowData decode(foo.bar.LogData message){{code} > The code after split > > {code:java} > Line 3383: public static RowData decode(foo.bar.LogData message){ > decodeImpl(message); return decodeReturnValue$0; } > Line 3384: > Line 3385: void decodeImpl(foo.bar.LogData message) {{code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
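The failure mode above can be modeled in miniature. In this hedged sketch (names modeled on the generated decode/decodeImpl pair shown in the report, not taken from real generated code), the split compiles only because the extracted method keeps the static modifier; removing it reproduces the "Instance method cannot be invoked in static context" error:

```java
// Miniature model of a *correct* split of a static method. The code splitter
// moved the body of the generated static decode(...) into decodeImpl(...) but
// left decodeImpl non-static, so the call from the static decode failed to
// compile. Keeping both methods static, as below, is the consistent form.
public class SplitSketch {

    // Stand-in for the splitter's generated return-value holder field.
    static int decodeReturnValue$0;

    public static int decode(int message) {
        decodeImpl(message); // must target a static method in this static context
        return decodeReturnValue$0;
    }

    // Hypothetical body: the real generated decodeImpl holds the original
    // method's logic; here it just doubles the input for demonstration.
    static void decodeImpl(int message) {
        decodeReturnValue$0 = message * 2;
    }

    public static void main(String[] args) {
        System.out.println(decode(21));
    }
}
```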
[jira] [Commented] (FLINK-31021) JavaCodeSplitter doesn't split static method properly
[ https://issues.apache.org/jira/browse/FLINK-31021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687449#comment-17687449 ] Krzysztof Chmielewski commented on FLINK-31021: --- Ok, so I verified, and it seems that this is NOT a regression caused by my recent change to the Code Splitter. It seems that splitting static methods was never supported; I've checked on the 1.15 branch. I think the question we should ask here is whether this is in fact a bug and whether we need to make the Code Splitter handle static methods. Quoting [~TsReaper] from https://github.com/apache/flink/pull/21393#pullrequestreview-1273870828 {code:java} Our code splitter is not a universal solution. It only works for Flink generated code under several restrictions. {code} [~xccui] is there any Flink SQL query that makes the Planner generate Java code with static methods? If so, could you provide one? If this is in fact a needed feature, I can work on it, since I've recently made bigger changes to the code splitter and it would be fairly easy for me to add this. [~TsReaper] What do you think? 
> JavaCodeSplitter doesn't split static method properly > - > > Key: FLINK-31021 > URL: https://issues.apache.org/jira/browse/FLINK-31021 > Project: Flink > Issue Type: Bug >Affects Versions: 1.14.4, 1.15.3, 1.16.1 >Reporter: Xingcan Cui >Priority: Minor > > The exception while compiling the generated source > {code:java} > cause=org.codehaus.commons.compiler.CompileException: Line 3383, Column 90: > Instance method "default void > org.apache.flink.formats.protobuf.deserialize.GeneratedProtoToRow_655d75db1cf943838f5500013edfba82.decodeImpl(foo.bar.LogData)" > cannot be invoked in static context,{code} > The original method header > {code:java} > public static RowData decode(foo.bar.LogData message){{code} > The code after split > > {code:java} > Line 3383: public static RowData decode(foo.bar.LogData message){ > decodeImpl(message); return decodeReturnValue$0; } > Line 3384: > Line 3385: void decodeImpl(foo.bar.LogData message) {{code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31021) JavaCodeSplitter doesn't split static method properly
[ https://issues.apache.org/jira/browse/FLINK-31021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687368#comment-17687368 ] Krzysztof Chmielewski commented on FLINK-31021: --- Hi, I have a few questions. 1. Could you provide the full body of the original decode method? 2. Do you have a SQL query that reproduces the problem? 3. You marked the affected versions as 1.16.1 and below. Did you in fact see this on those, or on the current master? I'm asking because recently a change to the code splitter was merged to master (1.17) and the 1.16 release branch that is not included in 1.16.1, so I'm wondering whether this is a regression or something new. Let me know, cheers. > JavaCodeSplitter doesn't split static method properly > - > > Key: FLINK-31021 > URL: https://issues.apache.org/jira/browse/FLINK-31021 > Project: Flink > Issue Type: Bug >Affects Versions: 1.14.4, 1.15.3, 1.16.1 >Reporter: Xingcan Cui >Priority: Minor > > The exception while compiling the generated source > {code:java} > cause=org.codehaus.commons.compiler.CompileException: Line 3383, Column 90: > Instance method "default void > org.apache.flink.formats.protobuf.deserialize.GeneratedProtoToRow_655d75db1cf943838f5500013edfba82.decodeImpl(foo.bar.LogData)" > cannot be invoked in static context,{code} > The original method header > {code:java} > public static RowData decode(foo.bar.LogData message){{code} > The code after split > > {code:java} > Line 3383: public static RowData decode(foo.bar.LogData message){ > decodeImpl(message); return decodeReturnValue$0; } > Line 3384: > Line 3385: void decodeImpl(foo.bar.LogData message) {{code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-31018) SQL Client -j option does not load user jars to classpath.
[ https://issues.apache.org/jira/browse/FLINK-31018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687218#comment-17687218 ] Krzysztof Chmielewski edited comment on FLINK-31018 at 2/10/23 5:39 PM: [~martijnvisser] yes, it seems that this is the case. I've used DynamicTableFactory.Context#getClassLoader instead of Thread.currentThread().getContextClassLoader() as suggested in one of the comments, and it seems the problem disappeared. Thanks, the ticket can be closed. was (Author: kristoffsc): [~martijnvisser] yes, it seems that this is the case. I've used `DynamicTableFactory.Context#getClassLoader` instead of `Thread.currentThread().getContextClassLoader()` as suggested in one of the comments, and it seems the problem disappeared. Thanks, the ticket can be closed. > SQL Client -j option does not load user jars to classpath. > -- > > Key: FLINK-31018 > URL: https://issues.apache.org/jira/browse/FLINK-31018 > Project: Flink > Issue Type: Bug > Components: Table SQL / Client >Affects Versions: 1.17.0, 1.16.1 >Reporter: Krzysztof Chmielewski >Priority: Minor > Attachments: image-2023-02-10-15-53-39-330.png, > image-2023-02-10-15-54-32-537.png, image-2023-02-10-16-05-12-407.png > > > SQL Client '-j' option does not load custom jars to classpath as it was for > example in Flink 1.15 > As a result Flink 1.16 SQL Client is not able to discover classes through > Flink's Factory discovery mechanism throwing an error like: > {code:java} > [ERROR] Could not execute SQL statement. Reason: > org.apache.flink.table.api.ValidationException: Could not find any factories > that implement 'com.getindata.connectors.http.LookupQueryCreatorFactory' in > the classpath. > {code} > The same Jar and sample job are working fine with Flink 1.15. > Flink 1.15.2 > ./bin/sql-client.sh -j flink-http-connector-0.9.0.jar > !image-2023-02-10-15-53-39-330.png! > Flink 1.16.1 > ./bin/sql-client.sh -j flink-http-connector-0.9.0.jar > !image-2023-02-10-15-54-32-537.png! 
> ADD JAR command does not solve " Could not find any factories" issue although > jar seems to be added: > !image-2023-02-10-16-05-12-407.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-31018) SQL Client -j option does not load user jars to classpath.
[ https://issues.apache.org/jira/browse/FLINK-31018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krzysztof Chmielewski closed FLINK-31018. - Resolution: Not A Bug > SQL Client -j option does not load user jars to classpath. > -- > > Key: FLINK-31018 > URL: https://issues.apache.org/jira/browse/FLINK-31018 > Project: Flink > Issue Type: Bug > Components: Table SQL / Client >Affects Versions: 1.17.0, 1.16.1 >Reporter: Krzysztof Chmielewski >Priority: Minor > Attachments: image-2023-02-10-15-53-39-330.png, > image-2023-02-10-15-54-32-537.png, image-2023-02-10-16-05-12-407.png > > > SQL Client '-j' option does not load custom jars to classpath as it was for > example in Flink 1.15 > As a result Flink 1.16 SQL Client is not able to discover classes through > Flink's Factory discovery mechanism throwing an error like: > {code:java} > [ERROR] Could not execute SQL statement. Reason: > org.apache.flink.table.api.ValidationException: Could not find any factories > that implement 'com.getindata.connectors.http.LookupQueryCreatorFactory' in > the classpath. > {code} > The same Jar and sample job are working fine with Flink 1.15. > Flink 1.15.2 > ./bin/sql-client.sh -j flink-http-connector-0.9.0.jar > !image-2023-02-10-15-53-39-330.png! > Flink 1.16.1 > ./bin/sql-client.sh -j flink-http-connector-0.9.0.jar > !image-2023-02-10-15-54-32-537.png! > ADD JAR command does not solve " Could not find any factories" issue although > jar seems to be added: > !image-2023-02-10-16-05-12-407.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31018) SQL Client -j option does not load user jars to classpath.
[ https://issues.apache.org/jira/browse/FLINK-31018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687218#comment-17687218 ] Krzysztof Chmielewski commented on FLINK-31018: --- [~martijnvisser] yes, it seems that this is the case. I've used `DynamicTableFactory.Context#getClassLoader` instead of `Thread.currentThread().getContextClassLoader()` as suggested in one of the comments, and it seems the problem disappeared. Thanks, the ticket can be closed. > SQL Client -j option does not load user jars to classpath. > -- > > Key: FLINK-31018 > URL: https://issues.apache.org/jira/browse/FLINK-31018 > Project: Flink > Issue Type: Bug > Components: Table SQL / Client >Affects Versions: 1.17.0, 1.16.1 >Reporter: Krzysztof Chmielewski >Priority: Minor > Attachments: image-2023-02-10-15-53-39-330.png, > image-2023-02-10-15-54-32-537.png, image-2023-02-10-16-05-12-407.png > > > SQL Client '-j' option does not load custom jars to classpath as it was for > example in Flink 1.15 > As a result Flink 1.16 SQL Client is not able to discover classes through > Flink's Factory discovery mechanism throwing an error like: > {code:java} > [ERROR] Could not execute SQL statement. Reason: > org.apache.flink.table.api.ValidationException: Could not find any factories > that implement 'com.getindata.connectors.http.LookupQueryCreatorFactory' in > the classpath. > {code} > The same Jar and sample job are working fine with Flink 1.15. > Flink 1.15.2 > ./bin/sql-client.sh -j flink-http-connector-0.9.0.jar > !image-2023-02-10-15-53-39-330.png! > Flink 1.16.1 > ./bin/sql-client.sh -j flink-http-connector-0.9.0.jar > !image-2023-02-10-15-54-32-537.png! > ADD JAR command does not solve " Could not find any factories" issue although > jar seems to be added: > !image-2023-02-10-16-05-12-407.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
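The class-loader fix discussed in the comment above can be sketched generically. LookupFactory below is a hypothetical stand-in for the connector's factory interface (such as com.getindata.connectors.http.LookupQueryCreatorFactory); the point is only that factory discovery should consult an explicitly supplied ClassLoader (as DynamicTableFactory.Context#getClassLoader provides) rather than the thread context class loader, which in the 1.16 SQL Client may not see the user jar:

```java
import java.util.ServiceLoader;

public class FactoryDiscovery {

    // Hypothetical stand-in for a connector factory interface.
    public interface LookupFactory {}

    // Count providers visible through the given loader. If the loader does
    // not see the user jar (the 1.16 SQL Client symptom), this returns 0 and
    // the caller reports "Could not find any factories".
    public static int countFactories(ClassLoader classLoader) {
        int count = 0;
        for (LookupFactory ignored : ServiceLoader.load(LookupFactory.class, classLoader)) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // No META-INF/services provider is registered in this sketch, so
        // discovery finds nothing; with the connector jar on the right
        // loader it would find the factory.
        System.out.println(countFactories(FactoryDiscovery.class.getClassLoader()));
    }
}
```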
[jira] [Created] (FLINK-31018) SQL Client -j option does not load user jars to classpath.
Krzysztof Chmielewski created FLINK-31018: - Summary: SQL Client -j option does not load user jars to classpath. Key: FLINK-31018 URL: https://issues.apache.org/jira/browse/FLINK-31018 Project: Flink Issue Type: Bug Components: Table SQL / Client Affects Versions: 1.16.1, 1.17.0 Reporter: Krzysztof Chmielewski Attachments: image-2023-02-10-15-53-39-330.png, image-2023-02-10-15-54-32-537.png, image-2023-02-10-16-05-12-407.png SQL Client '-j' option does not load custom jars to classpath as it was for example in Flink 1.15 As a result Flink 1.16 SQL Client is not able to discover classes through Flink's Factory discovery mechanism throwing an error like: {code:java} [ERROR] Could not execute SQL statement. Reason: org.apache.flink.table.api.ValidationException: Could not find any factories that implement 'com.getindata.connectors.http.LookupQueryCreatorFactory' in the classpath. {code} The same Jar and sample job are working fine with Flink 1.15. Flink 1.15.2 ./bin/sql-client.sh -j flink-http-connector-0.9.0.jar !image-2023-02-10-15-53-39-330.png! Flink 1.16.1 ./bin/sql-client.sh -j flink-http-connector-0.9.0.jar !image-2023-02-10-15-54-32-537.png! ADD JAR command does not solve " Could not find any factories" issue although jar seems to be added: !image-2023-02-10-16-05-12-407.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-23016) Job client must be a Coordination Request Gateway when submit a job on web ui
[ https://issues.apache.org/jira/browse/FLINK-23016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687090#comment-17687090 ] Krzysztof Chmielewski commented on FLINK-23016: --- FYI, got the same error on Flink 1.16.1 > Job client must be a Coordination Request Gateway when submit a job on web ui > -- > > Key: FLINK-23016 > URL: https://issues.apache.org/jira/browse/FLINK-23016 > Project: Flink > Issue Type: Bug > Components: Runtime / Web Frontend >Affects Versions: 1.13.1 > Environment: flink: 1.13.1 > flink-cdc: com.alibaba.ververica:flink-connector-postgres-cdc:1.4.0 > jdk:1.8 >Reporter: wen qi >Priority: Not a Priority > Labels: auto-deprioritized-critical, auto-deprioritized-major, > auto-deprioritized-minor > Attachments: WechatIMG10.png, WechatIMG11.png, WechatIMG8.png > > > I used Flink CDC to collect data, and the Table API to transform the data and > write it to another table. > Everything works when I run the code in the IDE or submit the job jar via the > CLI, but not via the web UI. > When I use StreamTableEnvironment.from('table-path').execute(), it fails! > Please check my attachments; it seems to be a bug in the web UI? > > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684496#comment-17684496 ] Krzysztof Chmielewski edited comment on FLINK-27246 at 2/7/23 11:20 AM: master commit: af9a1128f728c691b896bc9c591e9be1327601c4 (included in 1.17 branch) 1.16 backport PR https://github.com/apache/flink/pull/21860 (contains bugfix https://github.com/apache/flink/pull/21871) was (Author: kristoffsc): master commit: af9a1128f728c691b896bc9c591e9be1327601c4 (included in 1.17 branch) 1.16 PR https://github.com/apache/flink/pull/21860 (contains bugfix https://github.com/apache/flink/pull/21871) > Code of method > "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" > of class "HashAggregateWithKeys$9211" grows beyond 64 KB > - > > Key: FLINK-27246 > URL: https://issues.apache.org/jira/browse/FLINK-27246 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.14.3, 1.15.3, 1.16.1 >Reporter: Maciej Bryński >Assignee: Krzysztof Chmielewski >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0, 1.16.2 > > Attachments: endInput_falseFilter9123_split9704.txt > > > I think this bug should get fixed in > https://issues.apache.org/jira/browse/FLINK-23007 > Unfortunately I spotted it on Flink 1.14.3 > {code} > java.lang.RuntimeException: Could not instantiate generated class > 'HashAggregateWithKeys$9211' > at > org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:85) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.operators.CodeGenOperatorFactory.createStreamOperator(CodeGenOperatorFactory.java:40) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.api.operators.StreamOperatorFactoryUtil.createOperator(StreamOperatorFactoryUtil.java:81) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > 
org.apache.flink.streaming.runtime.tasks.OperatorChain.(OperatorChain.java:198) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.(RegularOperatorChain.java:63) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:666) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:654) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at java.lang.Thread.run(Unknown Source) ~[?:?] > Caused by: org.apache.flink.util.FlinkRuntimeException: > org.apache.flink.api.common.InvalidProgramException: Table program cannot be > compiled. This is a bug. Please file an issue. > at > org.apache.flink.table.runtime.generated.CompileUtils.compile(CompileUtils.java:76) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.generated.GeneratedClass.compile(GeneratedClass.java:102) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:83) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > ... 
11 more > Caused by: > org.apache.flink.shaded.guava30.com.google.common.util.concurrent.UncheckedExecutionException: > org.apache.flink.api.common.InvalidProgramException: Table program cannot be > compiled. This is a bug. Please file an issue. > at > org.apache.flink.shaded.guava30.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2051) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.shaded.guava30.com.google.common.cache.LocalCache.get(LocalCache.java:3962) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.shaded.guava30.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4859) >
[jira] [Comment Edited] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684496#comment-17684496 ] Krzysztof Chmielewski edited comment on FLINK-27246 at 2/7/23 11:19 AM: master commit: af9a1128f728c691b896bc9c591e9be1327601c4 (included in 1.17 branch) 1.16 PR https://github.com/apache/flink/pull/21860 (contains bugfix https://github.com/apache/flink/pull/21871) was (Author: kristoffsc): master merge commit: af9a1128f728c691b896bc9c591e9be1327601c4 (included in 1.17 branch) 1.16 PR https://github.com/apache/flink/pull/21860 (contains bugfix https://github.com/apache/flink/pull/21871) > Code of method > "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" > of class "HashAggregateWithKeys$9211" grows beyond 64 KB > - > > Key: FLINK-27246 > URL: https://issues.apache.org/jira/browse/FLINK-27246 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.14.3, 1.15.3, 1.16.1 >Reporter: Maciej Bryński >Assignee: Krzysztof Chmielewski >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0, 1.16.2 > > Attachments: endInput_falseFilter9123_split9704.txt > > > I think this bug should get fixed in > https://issues.apache.org/jira/browse/FLINK-23007 > Unfortunately I spotted it on Flink 1.14.3 > {code} > java.lang.RuntimeException: Could not instantiate generated class > 'HashAggregateWithKeys$9211' > at > org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:85) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.operators.CodeGenOperatorFactory.createStreamOperator(CodeGenOperatorFactory.java:40) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.api.operators.StreamOperatorFactoryUtil.createOperator(StreamOperatorFactoryUtil.java:81) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > 
org.apache.flink.streaming.runtime.tasks.OperatorChain.(OperatorChain.java:198) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.(RegularOperatorChain.java:63) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:666) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:654) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at java.lang.Thread.run(Unknown Source) ~[?:?] > Caused by: org.apache.flink.util.FlinkRuntimeException: > org.apache.flink.api.common.InvalidProgramException: Table program cannot be > compiled. This is a bug. Please file an issue. > at > org.apache.flink.table.runtime.generated.CompileUtils.compile(CompileUtils.java:76) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.generated.GeneratedClass.compile(GeneratedClass.java:102) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:83) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > ... 
11 more > Caused by: > org.apache.flink.shaded.guava30.com.google.common.util.concurrent.UncheckedExecutionException: > org.apache.flink.api.common.InvalidProgramException: Table program cannot be > compiled. This is a bug. Please file an issue. > at > org.apache.flink.shaded.guava30.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2051) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.shaded.guava30.com.google.common.cache.LocalCache.get(LocalCache.java:3962) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.shaded.guava30.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4859) >
[jira] [Comment Edited] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684496#comment-17684496 ] Krzysztof Chmielewski edited comment on FLINK-27246 at 2/7/23 11:15 AM: master merge commit: af9a1128f728c691b896bc9c591e9be1327601c4 (included in 1.17 branch) 1.16 PR https://github.com/apache/flink/pull/21860 (contains bugfix https://github.com/apache/flink/pull/21871) was (Author: kristoffsc): master merge commit: af9a1128f728c691b896bc9c591e9be1327601c4 (included in 1.17 branch) 1.16 PR https://github.com/apache/flink/pull/21860 > Code of method > "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" > of class "HashAggregateWithKeys$9211" grows beyond 64 KB > - > > Key: FLINK-27246 > URL: https://issues.apache.org/jira/browse/FLINK-27246 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.14.3, 1.15.3, 1.16.1 >Reporter: Maciej Bryński >Assignee: Krzysztof Chmielewski >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0, 1.16.2 > > Attachments: endInput_falseFilter9123_split9704.txt > > > I think this bug should get fixed in > https://issues.apache.org/jira/browse/FLINK-23007 > Unfortunately I spotted it on Flink 1.14.3 > {code} > java.lang.RuntimeException: Could not instantiate generated class > 'HashAggregateWithKeys$9211' > at > org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:85) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.operators.CodeGenOperatorFactory.createStreamOperator(CodeGenOperatorFactory.java:40) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.api.operators.StreamOperatorFactoryUtil.createOperator(StreamOperatorFactoryUtil.java:81) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.OperatorChain.(OperatorChain.java:198) > 
~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.(RegularOperatorChain.java:63) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:666) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:654) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at java.lang.Thread.run(Unknown Source) ~[?:?] > Caused by: org.apache.flink.util.FlinkRuntimeException: > org.apache.flink.api.common.InvalidProgramException: Table program cannot be > compiled. This is a bug. Please file an issue. > at > org.apache.flink.table.runtime.generated.CompileUtils.compile(CompileUtils.java:76) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.generated.GeneratedClass.compile(GeneratedClass.java:102) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:83) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > ... 11 more > Caused by: > org.apache.flink.shaded.guava30.com.google.common.util.concurrent.UncheckedExecutionException: > org.apache.flink.api.common.InvalidProgramException: Table program cannot be > compiled. This is a bug. Please file an issue. 
> at > org.apache.flink.shaded.guava30.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2051) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.shaded.guava30.com.google.common.cache.LocalCache.get(LocalCache.java:3962) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.shaded.guava30.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4859) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at >
[jira] [Comment Edited] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684496#comment-17684496 ] Krzysztof Chmielewski edited comment on FLINK-27246 at 2/7/23 11:12 AM: master merge commit: af9a1128f728c691b896bc9c591e9be1327601c4 (included in 1.17 branch) 1.16 PR https://github.com/apache/flink/pull/21860 was (Author: kristoffsc): master merge commit: af9a1128f728c691b896bc9c591e9be1327601c4 1.16 PR https://github.com/apache/flink/pull/21860 > Code of method > "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" > of class "HashAggregateWithKeys$9211" grows beyond 64 KB > - > > Key: FLINK-27246 > URL: https://issues.apache.org/jira/browse/FLINK-27246 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.14.3, 1.15.3, 1.16.1 >Reporter: Maciej Bryński >Assignee: Krzysztof Chmielewski >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0, 1.16.2 > > Attachments: endInput_falseFilter9123_split9704.txt > > > I think this bug should get fixed in > https://issues.apache.org/jira/browse/FLINK-23007 > Unfortunately I spotted it on Flink 1.14.3 > {code} > java.lang.RuntimeException: Could not instantiate generated class > 'HashAggregateWithKeys$9211' > at > org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:85) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.operators.CodeGenOperatorFactory.createStreamOperator(CodeGenOperatorFactory.java:40) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.api.operators.StreamOperatorFactoryUtil.createOperator(StreamOperatorFactoryUtil.java:81) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.OperatorChain.(OperatorChain.java:198) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > 
org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.(RegularOperatorChain.java:63) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:666) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:654) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at java.lang.Thread.run(Unknown Source) ~[?:?] > Caused by: org.apache.flink.util.FlinkRuntimeException: > org.apache.flink.api.common.InvalidProgramException: Table program cannot be > compiled. This is a bug. Please file an issue. > at > org.apache.flink.table.runtime.generated.CompileUtils.compile(CompileUtils.java:76) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.generated.GeneratedClass.compile(GeneratedClass.java:102) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:83) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > ... 11 more > Caused by: > org.apache.flink.shaded.guava30.com.google.common.util.concurrent.UncheckedExecutionException: > org.apache.flink.api.common.InvalidProgramException: Table program cannot be > compiled. This is a bug. Please file an issue. 
> at > org.apache.flink.shaded.guava30.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2051) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.shaded.guava30.com.google.common.cache.LocalCache.get(LocalCache.java:3962) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.shaded.guava30.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4859) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.generated.CompileUtils.compile(CompileUtils.java:74) >
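For context on the "grows beyond 64 KB" failure quoted above: the JVM rejects any single method whose bytecode exceeds 64 KB, which is why Flink's code splitter rewrites oversized generated methods (such as `processElement`) into chains of smaller helpers. Below is a minimal, hypothetical sketch of the chunking idea only; the names (`MethodSplitSketch`, `splitStatements`, `MAX_STATEMENTS_PER_METHOD`) are illustrative and are not Flink's actual code-splitter API.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: split a flat list of generated statements into numbered helper
// methods so that no single emitted method exceeds a statement budget.
class MethodSplitSketch {
    // Tiny budget so the example produces multiple chunks; the real limit
    // is driven by the JVM's 64 KB bytecode cap per method.
    static final int MAX_STATEMENTS_PER_METHOD = 2;

    // Returns one source string per emitted helper method.
    static List<String> splitStatements(String baseName, List<String> statements) {
        List<String> methods = new ArrayList<>();
        for (int i = 0; i < statements.size(); i += MAX_STATEMENTS_PER_METHOD) {
            int end = Math.min(i + MAX_STATEMENTS_PER_METHOD, statements.size());
            StringBuilder body = new StringBuilder();
            for (String stmt : statements.subList(i, end)) {
                body.append("  ").append(stmt).append(";\n");
            }
            // A unique numeric suffix per chunk keeps the helper signatures distinct.
            methods.add("void " + baseName + "_" + methods.size() + "() {\n" + body + "}");
        }
        return methods;
    }
}
```

Chaining calls from `baseName_0` into `baseName_1` and so on (omitted here) is what lets the original oversized method shrink to a sequence of small dispatches.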
[jira] [Commented] (FLINK-30927) Several tests started generate output with two non-abstract methods have the same parameter types, declaring type and return type
[ https://issues.apache.org/jira/browse/FLINK-30927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685220#comment-17685220 ] Krzysztof Chmielewski commented on FLINK-30927: --- master: 96a296db723575d64857482a1278744e4c41201f PR for 1.17 - https://github.com/apache/flink/pull/21879 > Several tests started generate output with two non-abstract methods have the > same parameter types, declaring type and return type > -- > > Key: FLINK-30927 > URL: https://issues.apache.org/jira/browse/FLINK-30927 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.17.0, 1.16.2 >Reporter: Sergey Nuyanzin >Assignee: Krzysztof Chmielewski >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0, 1.16.2 > > > e.g. > org.apache.flink.table.planner.runtime.stream.sql.MatchRecognizeITCase#testUserDefinedFunctions > > it seems during code splitter it starts generating some methods with same > signature > > {noformat} > org.codehaus.janino.InternalCompilerException: Compiling > "MatchRecognizePatternProcessFunction$77": Two non-abstract methods "default > void MatchRecognizePatternProcessFunction$77.processMatch_0(java.util.Map, > org.apache.flink.cep.functions.PatternProcessFunction$Context, > org.apache.flink.util.Collector) throws java.lang.Exception" have the same > parameter types, declaring type and return type > {noformat} > > Probably could be a side effect of > https://issues.apache.org/jira/browse/FLINK-27246 -- This message was sent by Atlassian Jira (v8.20.10#820010)
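The failure mode reported in FLINK-30927 is a naming collision: if split-out helper methods are named from the base method name alone, two chunks can end up with identical signatures ("processMatch_0" twice) and Janino rejects the class. A hypothetical sketch of counter-based name deduplication follows; `NameDeduper` and `uniqueName` are illustrative and not Flink's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: hand out a unique numbered name for each requested base name,
// so no two generated methods can share a signature.
class NameDeduper {
    private final Map<String, Integer> counters = new HashMap<>();

    String uniqueName(String base) {
        // merge() increments the per-name counter (starting it at 1),
        // so the first request for a base name gets suffix 0.
        int n = counters.merge(base, 1, Integer::sum) - 1;
        return base + "_" + n;
    }
}
```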
[jira] (FLINK-30927) Several tests started generate output with two non-abstract methods have the same parameter types, declaring type and return type
[ https://issues.apache.org/jira/browse/FLINK-30927 ] Krzysztof Chmielewski deleted comment on FLINK-30927: --- was (Author: kristoffsc): master: 96a296db723575d64857482a1278744e4c41201f
[jira] [Commented] (FLINK-30927) Several tests started generate output with two non-abstract methods have the same parameter types, declaring type and return type
[ https://issues.apache.org/jira/browse/FLINK-30927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685215#comment-17685215 ] Krzysztof Chmielewski commented on FLINK-30927: --- master: 96a296db723575d64857482a1278744e4c41201f
[jira] [Commented] (FLINK-30927) Several tests started generate output with two non-abstract methods have the same parameter types, declaring type and return type
[ https://issues.apache.org/jira/browse/FLINK-30927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685175#comment-17685175 ] Krzysztof Chmielewski commented on FLINK-30927: --- The PR needs to be merged to the 1.17 branch as well.
[jira] [Commented] (FLINK-30927) Several tests started generate output with two non-abstract methods have the same parameter types, declaring type and return type
[ https://issues.apache.org/jira/browse/FLINK-30927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684943#comment-17684943 ] Krzysztof Chmielewski commented on FLINK-30927: --- OK, the CI build is green for the provided PR, and I don't see any `InternalCompilerException ... Two non-abstract methods` exceptions in table_ci_table or in other tests from flink-table-planner. I would appreciate a review of this small bug-fix PR, and sorry for any inconvenience caused by this.
[jira] [Comment Edited] (FLINK-30927) Several tests started generate output with two non-abstract methods have the same parameter types, declaring type and return type
[ https://issues.apache.org/jira/browse/FLINK-30927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684726#comment-17684726 ] Krzysztof Chmielewski edited comment on FLINK-30927 at 2/6/23 4:31 PM: --- The PR provided above fixes the reported issue. However, the CI build was not failing due to this problem. The reason it was not failing is that the code splitter has a safety net: whenever the rewritten code fails compilation, Flink falls back to the original code and prints the failing class into the logs. That is how the problem was spotted. Maybe it would be worth adding an enhancement so that this issue would in fact fail the build? A separate issue? was (Author: kristoffsc): Provided PR above is fixing reported issue. However CI build was not failing due to this problem. The reason why it was not failing is that code splitter has a safety net, that whenever rewritten code fails the compilation, Flink tries to use original code + print failing class into the logs. That is how the problem was spotted. Maybe it would worth to add an enhancement such this issue would in fact failed the build? A separate issue?
[jira] [Commented] (FLINK-30927) Several tests started generate output with two non-abstract methods have the same parameter types, declaring type and return type
[ https://issues.apache.org/jira/browse/FLINK-30927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684726#comment-17684726 ] Krzysztof Chmielewski commented on FLINK-30927: --- The PR provided above fixes the reported issue. However, the CI build was not failing due to this problem. The reason it was not failing is that the code splitter has a safety net: whenever the rewritten code fails compilation, Flink falls back to the original code and prints the failing class into the logs. That is how the problem was spotted. Maybe it would be worth adding an enhancement so that this issue would in fact fail the build? A separate issue?
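The safety net described in this comment can be sketched as a try/fallback around compilation. `Compiler` and `compileWithFallback` below are stand-ins for illustration, not Flink's actual `CompileUtils` API.

```java
// Sketch: prefer the rewritten (split) code; if it fails to compile,
// log the failing source and fall back to the original code. This is
// why the CI build stayed green: only the logs revealed the bad split code.
class FallbackCompileSketch {
    interface Compiler {
        Object compile(String source);
    }

    static Object compileWithFallback(Compiler c, String splitCode, String originalCode) {
        try {
            return c.compile(splitCode);
        } catch (RuntimeException e) {
            System.err.println("Split code failed to compile, using original:\n" + splitCode);
            return c.compile(originalCode);
        }
    }
}
```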
[jira] [Commented] (FLINK-30927) Several tests started generate output with two non-abstract methods have the same parameter types, declaring type and return type
[ https://issues.apache.org/jira/browse/FLINK-30927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684706#comment-17684706 ] Krzysztof Chmielewski commented on FLINK-30927: --- Pull request available https://github.com/apache/flink/pull/21871
[jira] [Commented] (FLINK-30927) Several tests started generate output with two non-abstract methods have the same parameter types, declaring type and return type
[ https://issues.apache.org/jira/browse/FLINK-30927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684687#comment-17684687 ] Krzysztof Chmielewski commented on FLINK-30927: --- I already have a fix for this and will provide a PR shortly. It's caused by https://github.com/apache/flink/pull/21393. Could someone assign this ticket to me?
[jira] [Comment Edited] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684496#comment-17684496 ] Krzysztof Chmielewski edited comment on FLINK-27246 at 2/6/23 8:48 AM: --- master merge commit: af9a1128f728c691b896bc9c591e9be1327601c4 1.16 PR https://github.com/apache/flink/pull/21860 was (Author: kristoffsc): master merge commit: af9a1128f728c691b896bc9c591e9be1327601c4 Preparing backport to 1.16
[jira] [Commented] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684496#comment-17684496 ] Krzysztof Chmielewski commented on FLINK-27246: --- master merge commit: af9a1128f728c691b896bc9c591e9be1327601c4 Preparing backports to 1.16
[jira] [Comment Edited] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684496#comment-17684496 ] Krzysztof Chmielewski edited comment on FLINK-27246 at 2/6/23 8:12 AM: --- master merge commit: af9a1128f728c691b896bc9c591e9be1327601c4 Preparing backport to 1.16 was (Author: kristoffsc): master merge commit: af9a1128f728c691b896bc9c591e9be1327601c4 Preparing backports to 1.16
[jira] [Commented] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655573#comment-17655573 ] Krzysztof Chmielewski commented on FLINK-27246: --- Hi [~TsReaper], [~twalthr] and [~jingge] I have finally finished working on my PR for this issue. The PR is here: https://github.com/apache/flink/pull/21393 In short, I've created a new rewriter that can rewrite IF/ELSE and WHILE blocks, including combined and nested statements. I also removed IfStatementRewriter since its logic is covered by my new BlockStatementRewriter. I ran the new code splitter against the SQL query attached to this ticket and it works perfectly. I would appreciate it if you could take a look and review. I added a detailed description of my change/solution to the PR. Please let me know what you think. Cheers.
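The transformation described in the comment above (extracting IF/ELSE and WHILE block bodies into separate methods so no generated method exceeds the JVM's 64 KB bytecode limit) can be pictured with a minimal, hypothetical sketch. The class and method names below are purely illustrative, not Flink's actual generated identifiers:

```java
// Hypothetical shape of a split "processElement": each branch body is
// extracted into its own method so every method stays under the JVM's
// 64 KB per-method bytecode limit. Names are illustrative only.
class SplitSketch {
    long acc = 0;

    // Before the split, all branch logic lived inside processElement itself,
    // which in generated code can easily exceed the 64 KB limit.
    void processElement(long value) {
        if (value > 0) {
            processElement_ifBody0(value);   // extracted "then" branch
        } else {
            processElement_elseBody0(value); // extracted "else" branch
        }
    }

    void processElement_ifBody0(long value) { acc += value; }
    void processElement_elseBody0(long value) { acc -= value; }
}
```

The observable behavior is unchanged; only the method boundaries move, which is why such a rewrite is safe for generated code.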
[jira] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246 ] Krzysztof Chmielewski deleted comment on FLINK-27246: --- was (Author: kristoffsc): Hi [~TsReaper], [~twalthr] and [~jingge] I have finally finished working on my PR for this issue. The PR is here: https://github.com/apache/flink/pull/21393 In short, I've created a new rewriter that can rewrite IF/ELSE and WHILE blocks, including combined and nested statements. I also removed IfStatementRewriter since its logic is covered by my new BlockStatementRewriter. I ran the new code splitter against the SQL query attached to this ticket and it works perfectly. I would appreciate it if you could take a look and review. I added a detailed description of my change/solution to the PR. Please let me know what you think. Cheers.
[jira] [Comment Edited] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655572#comment-17655572 ] Krzysztof Chmielewski edited comment on FLINK-27246 at 1/6/23 8:16 PM: --- Hi [~TsReaper], [~twalthr] and [~jingge] I have finally finished working on my PR for this issue. The PR is here: https://github.com/apache/flink/pull/21393 In short, I've created a new rewriter that can rewrite IF/ELSE and WHILE blocks, including combined and nested statements. I also removed IfStatementRewriter since its logic is covered by my new BlockStatementRewriter. I ran the new code splitter against the SQL query attached to this ticket and it works perfectly. I would appreciate it if you could take a look and review. I added a detailed description of my change/solution to the PR. Please let me know what you think. Cheers. was (Author: kristoffsc): Hi [~TsReaper] I have finally finished working on my PR for this issue. The PR is here: https://github.com/apache/flink/pull/21393 In short, I've created a new rewriter that can rewrite IF/ELSE and WHILE blocks, including combined and nested statements. I also removed IfStatementRewriter since its logic is covered by my new BlockStatementRewriter. I ran the new code splitter against the SQL query attached to this ticket and it works perfectly. I would appreciate it if you could take a look and review. I added a detailed description of my change/solution to the PR. Please let me know what you think. Cheers.
[jira] [Comment Edited] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655572#comment-17655572 ] Krzysztof Chmielewski edited comment on FLINK-27246 at 1/6/23 8:12 PM: --- Hi [~TsReaper] I have finally finished working on my PR for this issue. The PR is here: https://github.com/apache/flink/pull/21393 In short, I've created a new rewriter that can rewrite IF/ELSE and WHILE blocks, including combined and nested statements. I also removed IfStatementRewriter since its logic is covered by my new BlockStatementRewriter. I ran the new code splitter against the SQL query attached to this ticket and it works perfectly. I would appreciate it if you could take a look and review. I added a detailed description of my change/solution to the PR. Please let me know what you think. Cheers. was (Author: kristoffsc): Hi [~TsReaper] I have finally finished working on my PR for this issue. The PR is here: https://github.com/apache/flink/pull/21393 I would appreciate it if you could take a look and review. I added a detailed description of my change/solution to the PR. Please let me know what you think. Cheers.
[jira] [Commented] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655572#comment-17655572 ] Krzysztof Chmielewski commented on FLINK-27246: --- Hi [~TsReaper] I have finally finished working on my PR for this issue. The PR is here: https://github.com/apache/flink/pull/21393 I would appreciate it if you could take a look and review. I added a detailed description of my change/solution to the PR. Please let me know what you think. Cheers.
[jira] [Commented] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17653156#comment-17653156 ] Krzysztof Chmielewski commented on FLINK-27246: --- Hi [~TsReaper] I have modified my PR. I reverted all changes to FunctionSplitter and introduced a completely new BlockStatementRewriter that uses the new classes from the original PR, BlockStatementSplitter and BlockStatementGrouper. The new BlockStatementRewriter can handle If/Else/While statements. For IF/ELSE statements it produces a similar, but not identical, result to what IfStatementRewriter did. However, to make the original problem go away I had to use both rewriters, so JavaCodeSplitter now has this in splitImpl:
{code:java}
return Optional.ofNullable(
                new DeclarationRewriter(returnValueRewrittenCode, maxMethodLength)
                        .rewrite())
        .map(text -> new IfStatementRewriter(text, maxMethodLength).rewrite())
        .map(text -> new BlockStatementRewriter(text, maxMethodLength).rewrite())
        .map(text -> new FunctionSplitter(text, maxMethodLength).rewrite())
        .map(text -> new MemberFieldRewriter(text, maxClassMemberCount).rewrite())
        .orElse(code);
}
{code}
The good news is that all tests on CI/CD are passing. Still, I have to investigate more. My goal is to drop IfStatementRewriter and use only BlockStatementRewriter, unless I find a reason not to. I will keep you posted on this.
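The Optional-chained rewriter pipeline quoted in the comment above can be sketched with trivial stand-in rewriters. The real Flink rewriters parse Java source and take a maxMethodLength; this toy version only demonstrates the chaining pattern, and its names are hypothetical:

```java
import java.util.Optional;
import java.util.function.UnaryOperator;

// Toy model of JavaCodeSplitter's splitImpl: each rewriter is applied in
// turn via Optional.map, and the original code is returned unchanged if
// the chain starts from null. Stand-in rewriters, not Flink's real ones.
class PipelineSketch {
    @SafeVarargs
    static String split(String code, UnaryOperator<String>... rewriters) {
        Optional<String> result = Optional.ofNullable(code);
        for (UnaryOperator<String> r : rewriters) {
            result = result.map(r); // skipped entirely when empty
        }
        return result.orElse(code);
    }
}
```

The design benefit of the Optional chain is that a null intermediate result short-circuits all remaining rewriters and falls back to the unsplit code.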
[jira] [Commented] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17652279#comment-17652279 ] Krzysztof Chmielewski commented on FLINK-27246: --- Hi [~TsReaper] Thanks for the feedback. Regarding IfStatementRewriter, it's a little bit tricky for me. I think my new approach might handle more cases than IfStatementRewriter did. Also, IfStatementRewriter and the existing rewriters seem to expect a method declaration plus body, whereas my BlockStatementGrouper and BlockStatementSplitter process individual block statements. They are called from FunctionSplitter::FunctionSplitVisitor while processing block statements from a method's body. It seems that after my change, FunctionSplitter is also rewriting the code, similar to IfStatementRewriter, and maybe that is not the best thing to do from a clean code/architecture perspective. The problem with IfStatementRewriter is that it will not rewrite an If/Else branch if the branch contains a "while" statement in it, or when the entire if/else statement is inside a while statement, which was the original problem. So now I'm wondering whether this should be extracted from FunctionSplitter into a new BlockStatementRewriter that will handle while/if/else statements in combination, or whether it can stay inside FunctionSplitter as it is now.
[jira] [Commented] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642072#comment-17642072 ] Krzysztof Chmielewski commented on FLINK-27246: --- [~TsReaper] I would love your feedback on my PoC fix. Thanks.
[jira] [Commented] (FLINK-26051) one sql has row_number =1 and the subsequent SQL has "case when" and "where" statement result Exception : The window can only be ordered in ASCENDING mode
[ https://issues.apache.org/jira/browse/FLINK-26051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17640037#comment-17640037 ] Krzysztof Chmielewski commented on FLINK-26051: --- Hi :) [~qingyue] Do you have any update on this one? :) > one sql has row_number =1 and the subsequent SQL has "case when" and "where" > statement result Exception : The window can only be ordered in ASCENDING mode > -- > > Key: FLINK-26051 > URL: https://issues.apache.org/jira/browse/FLINK-26051 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.12.2, 1.14.4 >Reporter: chuncheng wu >Assignee: Jane Chan >Priority: Major > Attachments: image-2022-02-10-20-13-14-424.png, > image-2022-02-11-11-18-20-594.png, image-2022-06-17-21-28-54-886.png > > > hello, > I have two SQLs. One (sql0) is "select xx from ( ROW_NUMBER statement) > where rn=1" and the other (sql1) is "select ${fields} > from result where ${filter_conditions}". The fields selected in sql1 > include one "case when" field. The two SQLs work well separately, but when > combined they result in the exception below. It happens when the > logical plan is turned into the physical plan: > > {code:java} > org.apache.flink.table.api.TableException: The window can only be ordered in > ASCENDING mode. 
> at > org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecOverAggregate.translateToPlanInternal(StreamExecOverAggregate.scala:98) > at > org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecOverAggregate.translateToPlanInternal(StreamExecOverAggregate.scala:52) > at > org.apache.flink.table.planner.plan.nodes.exec.ExecNode$class.translateToPlan(ExecNode.scala:59) > at > org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecOverAggregateBase.translateToPlan(StreamExecOverAggregateBase.scala:42) > at > org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:54) > at > org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:39) > at > org.apache.flink.table.planner.plan.nodes.exec.ExecNode$class.translateToPlan(ExecNode.scala:59) > at > org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalcBase.translateToPlan(StreamExecCalcBase.scala:38) > at > org.apache.flink.table.planner.delegation.StreamPlanner$$anonfun$translateToPlan$1.apply(StreamPlanner.scala:66) > at > org.apache.flink.table.planner.delegation.StreamPlanner$$anonfun$translateToPlan$1.apply(StreamPlanner.scala:65) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at scala.collection.Iterator$class.foreach(Iterator.scala:891) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1334) > at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) > at scala.collection.AbstractIterable.foreach(Iterable.scala:54) > at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) > at scala.collection.AbstractTraversable.map(Traversable.scala:104) > at > org.apache.flink.table.planner.delegation.StreamPlanner.translateToPlan(StreamPlanner.scala:65) > at > 
org.apache.flink.table.planner.delegation.StreamPlanner.explain(StreamPlanner.scala:103) > at > org.apache.flink.table.planner.delegation.StreamPlanner.explain(StreamPlanner.scala:42) > at > org.apache.flink.table.api.internal.TableEnvironmentImpl.explainInternal(TableEnvironmentImpl.java:630) > at > org.apache.flink.table.api.internal.TableImpl.explain(TableImpl.java:582) > at > com.meituan.grocery.data.flink.test.BugTest.testRowNumber(BugTest.java:69) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:725) > at > org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60){code} > In the stack trace above, row_number()'s physical rel, which is normally > StreamExecRank, changes to StreamExecOverAggregate. The > StreamExecOverAggregate rel has a window= ROWS
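For anyone trying to reproduce, here is a hedged sketch of the combined query shape described in the issue, written against the Flink Table API. The table name, connector, and field names are assumptions of mine; the reporter's exact schema is not in the ticket.

```java
// Hypothetical repro sketch (assumed schema and names, not the reporter's code).
// Shape: dedup via ROW_NUMBER ... WHERE rn = 1 (sql0), then a second query
// over it with a CASE WHEN projection and a WHERE filter (sql1).
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;

public class RowNumberCaseWhenRepro {
    public static void main(String[] args) {
        TableEnvironment tEnv =
            TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        tEnv.executeSql(
            "CREATE TEMPORARY TABLE src (id STRING, ts TIMESTAMP(3), v INT) "
          + "WITH ('connector' = 'datagen')");
        // sql0: keep the latest row per key
        Table dedup = tEnv.sqlQuery(
            "SELECT id, ts, v FROM ("
          + "  SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY ts DESC) AS rn"
          + "  FROM src"
          + ") WHERE rn = 1");
        tEnv.createTemporaryView("result", dedup);
        // sql1: CASE WHEN projection plus WHERE filter on top of the dedup view
        Table filtered = tEnv.sqlQuery(
            "SELECT id, CASE WHEN v > 0 THEN 'pos' ELSE 'neg' END AS sign "
          + "FROM result WHERE v <> 0");
        // On affected versions, planning/explaining the combined query is where
        // "The window can only be ordered in ASCENDING mode" surfaces.
        System.out.println(filtered.explain());
    }
}
```

This matches the stack trace above, where the exception is thrown from TableEnvironmentImpl.explainInternal / TableImpl.explain during translation to the physical plan.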
[jira] [Comment Edited] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17638614#comment-17638614 ] Krzysztof Chmielewski edited comment on FLINK-27246 at 11/25/22 7:21 PM: - Hi [~TsReaper] I'm sorry for the long delay. I was actually trying to develop a PoC fix for this problem, and I think I managed to at least prove the concept. You can find my draft PR here -> https://github.com/apache/flink/pull/21393 The code from this PR made the SQL from this ticket compile and execute, which is at least something :) The idea is to enhance FunctionSplitter so that every code block (getMergedCodeBlocks method) bigger than "maxMethodLength" is split further by calling two new splitters that I've created: 1. BlockStatementSplitter 2. BlockStatementGrouper The BlockStatementSplitter splits the bodies of WHILE and IF/ELSE statements into new methods, and the original statement is rewritten to call those new methods. Next, the *BlockStatementGrouper* groups the calls created by *BlockStatementSplitter* into blocks with lengths < "maxMethodLength" and extracts each group into another new method. Finally, *BlockStatementGrouper* rewrites the original code block to call the methods it created. 
For example, an input statement:
{code:java}
while ((kvPair$9687 = (org.apache.flink.api.java.tuple.Tuple2) iterator.next()) != null) {
    key$6207 = (org.apache.flink.table.data.binary.BinaryRowData) kvPair$9687.f0;
    val$9688 = (org.apache.flink.table.data.binary.BinaryRowData) kvPair$9687.f1;
    local$5912.replace(key$6207, val$9688);
    if (lastKey$6208 == null) {
        lastKey$6208 = key$6207.copy();
        agg0_sumIsNull = true;
        agg0_sum = ((org.apache.flink.table.data.DecimalData) null);
        agg1_sumIsNull = true;
        agg1_sum = ((org.apache.flink.table.data.DecimalData) null);
        agg3_sum = ((org.apache.flink.table.data.DecimalData) null);
        agg3_sum = ((org.apache.flink.table.data.DecimalData) null);
    } else if (lastKey$6209 == null) {
        agg2_sum = ((org.apache.flink.table.data.DecimalData) null);
        agg3_sumIsNull = true;
    } else {
        agg2_sumIsNull = true;
    }
};
{code}
will be converted by BlockStatementSplitter to:
{code:java}
while ((kvPair$9687 = (org.apache.flink.api.java.tuple.Tuple2) iterator.next()) != null) {
    top_whileBody0_0();
    if (lastKey$6208 == null) {
        top_whileBody0_0_ifBody0();
    } else if (lastKey$6209 == null) {
        top_whileBody0_0_ifBody1_ifBody0();
    } else {
        top_whileBody0_0_ifBody1_ifBody1();
    }
};
{code}
Further, this will be converted by BlockStatementGrouper, with the maxMethodLength parameter set to 4000, to:
{code:java}
while ((kvPair$9687 = (org.apache.flink.api.java.tuple.Tuple2) iterator.next()) != null) {
    top_rewriteGroup_0();
}
{code}
The bodies of the new methods would be:
{code:java}
private void top_whileBody0_0_ifBody1_ifBody0() {
    agg2_sum = ((org.apache.flink.table.data.DecimalData) null);
    agg3_sumIsNull = true;
}

private void top_whileBody0_0_ifBody1_ifBody1() {
    agg2_sumIsNull = true;
}

private void top_whileBody0_0() {
    key$6207 = (org.apache.flink.table.data.binary.BinaryRowData) kvPair$9687.f0;
    val$9688 = (org.apache.flink.table.data.binary.BinaryRowData) kvPair$9687.f1;
    local$5912.replace(key$6207, val$9688);
}

private void top_whileBody0_0_ifBody0() {
    lastKey$6208 = key$6207.copy();
    agg0_sumIsNull = true;
    agg0_sum = ((org.apache.flink.table.data.DecimalData) null);
    agg1_sumIsNull = true;
    agg1_sum = ((org.apache.flink.table.data.DecimalData) null);
    agg3_sum = ((org.apache.flink.table.data.DecimalData) null);
    agg3_sum = ((org.apache.flink.table.data.DecimalData) null);
}

void top_rewriteGroup_0() {
    top_whileBody0_0();
    if (lastKey$6208 == null) {
        top_whileBody0_0_ifBody0();
    } else if (lastKey$6209 == null) {
        top_whileBody0_0_ifBody1_ifBody0();
    } else {
        top_whileBody0_0_ifBody1_ifBody1();
    }
}
{code}
What do you think [~TsReaper]?
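To make the grouping step concrete, here is a toy, standalone sketch (my own simplification, not the actual BlockStatementGrouper code): extracted-method calls are packed into groups whose combined source length stays under a budget, and each group would then become one top_rewriteGroup_N() method.

```java
// Simplified illustration of length-budgeted grouping; not the Flink implementation.
import java.util.ArrayList;
import java.util.List;

public class GrouperSketch {
    // Pack consecutive statements into groups whose total character length
    // stays at or below maxLength; a statement longer than the budget still
    // gets its own group.
    static List<List<String>> group(List<String> statements, int maxLength) {
        List<List<String>> groups = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int currentLength = 0;
        for (String stmt : statements) {
            if (currentLength + stmt.length() > maxLength && !current.isEmpty()) {
                groups.add(current);
                current = new ArrayList<>();
                currentLength = 0;
            }
            current.add(stmt);
            currentLength += stmt.length();
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> calls = List.of(
            "top_whileBody0_0();",
            "top_whileBody0_0_ifBody0();",
            "top_whileBody0_0_ifBody1_ifBody0();");
        // With a small budget, each call lands in its own group.
        System.out.println(group(calls, 25).size()); // 3
    }
}
```

In the real splitter the budget would be "maxMethodLength", so each generated top_rewriteGroup_N() stays compilable on its own.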
[jira] [Comment Edited] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17638614#comment-17638614 ] Krzysztof Chmielewski edited comment on FLINK-27246 at 11/25/22 7:20 PM: - Hi [~TsReaper] Im sory for a long delay. I was actyally trying to develop a PoC fix for this problem. I think I managed to at least proof a concept. You can find my raft PR here -> https://github.com/apache/flink/pull/21393 The code from this PR made the SQL from this ticket to compile and execute which is at least something :) The idea is to enhance FunctionSplitter that for every codeBlock (getMergedCodeBlocks method) that is bigger than "maxMethodLength" try to further split it by calling two new splitters that I've created: 1. BlockStatementSplitter 2. BlockStatementGrouper The BlockStatementSplitter splits body of WHILE, IF/ELSE statements to new methods. The original statement is rewritten that will call those new methods. Next ,the *BlockStatementGrouper* is groping calls created by *BlockStatementSplitter* to blocks with lengths < "maxMethodLength" and extracting those to another new method. Finally *BlockStatementGrouper* rewrites original code block to call methods created by *BlockStatementGrouper*. 
For example, an input statement: {code:java} while ( (kvPair$9687 = (org.apache.flink.api.java.tuple.Tuple2) iterator.next()) != null) { key$6207 = (org.apache.flink.table.data.binary.BinaryRowData) kvPair$9687.f0; val$9688 = (org.apache.flink.table.data.binary.BinaryRowData) kvPair$9687.f1; // prepare input local$5912.replace(key$6207, val$9688); if (lastKey$6208 == null) { lastKey$6208 = key$6207.copy(); agg0_sumIsNull = true; agg0_sum = ((org.apache.flink.table.data.DecimalData) null); agg1_sumIsNull = true; agg1_sum = ((org.apache.flink.table.data.DecimalData) null); agg3_sum = ((org.apache.flink.table.data.DecimalData) null); agg3_sum = ((org.apache.flink.table.data.DecimalData) null); } else if (lastKey$6209 == null) { agg2_sum = ((org.apache.flink.table.data.DecimalData) null); agg3_sumIsNull = true; } else { agg2_sumIsNull = true; }}; {code} will be converted by BlockStatementSplitter to: {code:java} while ( (kvPair$9687 = (org.apache.flink.api.java.tuple.Tuple2) iterator.next()) != null) { top_whileBody0_0(); if (lastKey$6208 == null) { top_whileBody0_0_ifBody0(); } else if (lastKey$6209 == null) { top_whileBody0_0_ifBody1_ifBody0(); } else { top_whileBody0_0_ifBody1_ifBody1(); }}; {code} Further this will be converted by BlockStatementGrouper with maxMethodLength parameter set to 4000 to: {code:java} while ( (kvPair$9687 = (org.apache.flink.api.java.tuple.Tuple2) iterator.next()) != null) { top_rewriteGroup_0(); } {code} Body for the new methods would be: {code:java} private void top_whileBody0_0_ifBody1_ifBody0 { agg2_sum = ((org.apache.flink.table.data.DecimalData) null); agg3_sumIsNull = true; } private void top_whileBody0_0_ifBody1_ifBody1 { agg2_sumIsNull = true; } private void top_whileBody0_0 { key$6207 = (org.apache.flink.table.data.binary.BinaryRowData) kvPair$9687.f0; val$9688 = (org.apache.flink.table.data.binary.BinaryRowData) kvPair$9687.f1; local$5912.replace(key$6207, val$9688); } private void top_whileBody0_0_ifBody0 { lastKey$6208 = 
key$6207.copy(); agg0_sumIsNull = true; agg0_sum = ((org.apache.flink.table.data.DecimalData) null); agg1_sumIsNull = true; agg1_sum = ((org.apache.flink.table.data.DecimalData) null); agg3_sum = ((org.apache.flink.table.data.DecimalData) null); agg3_sum = ((org.apache.flink.table.data.DecimalData) null); } void top_rewriteGroup_0() { top_whileBody0_0(); if (lastKey$6208 == null) { top_whileBody0_0_ifBody0(); } else if (lastKey$6209 == null) { top_whileBody0_0_ifBody1_ifBody0(); } else { top_whileBody0_0_ifBody1_ifBody1(); } } {code} What do you think [~TsReaper]? was (Author: kristoffsc): Hi [~TsReaper] Im sory for a long delay. I was actyally trying to develop a PoC fix for this problem. I think I managed to at least proof a concept. You can find my raft PR here -> https://github.com/apache/flink/pull/21393 The code from this PR made the SQL from this ticket to compile and execute which is at least something :) The idea is to enhance FunctionSplitter that for every codeBlock (getMergedCodeBlocks method) that is bigger than "maxMethodLength" try to further split it by calling two new splitters that I've created: 1. BlockStatementSplitter 2. BlockStatementGrouper The BlockStatementSplitter splits body of WHILE, IF/ELSE statements to new methods. The original statement is rewritten that will
[jira] [Comment Edited] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17638614#comment-17638614 ] Krzysztof Chmielewski edited comment on FLINK-27246 at 11/25/22 10:23 AM: -- Hi [~TsReaper] Im sory for a long delay. I was actyally trying to develop a PoC fix for this problem. I think I managed to at least proof a concept. You can find my raft PR here -> https://github.com/apache/flink/pull/21393 The code from this PR made the SQL from this ticket to compile and execute which is at least something :) The idea is to enhance FunctionSplitter that for every codeBlock (getMergedCodeBlocks method) that is bigger than "maxMethodLength" try to further split it by calling two new splitters that I've created: 1. BlockStatementSplitter 2. BlockStatementGrouper The BlockStatementSplitter splits body of WHILE, IF/ELSE statements to new methods. The original statement is rewritten that will call those new methods. Next ,the *BlockStatementGrouper* is groping calls created by *BlockStatementSplitter* to blocks with lengths < "maxMethodLength" and extracting those to another new method. Finally *BlockStatementGrouper* rewrites original code block to call methods created by *BlockStatementGrouper*. 
For example, an input statement: {code:java} while ( (kvPair$9687 = (org.apache.flink.api.java.tuple.Tuple2) iterator.next()) != null) { key$6207 = (org.apache.flink.table.data.binary.BinaryRowData) kvPair$9687.f0; val$9688 = (org.apache.flink.table.data.binary.BinaryRowData) kvPair$9687.f1; // prepare input local$5912.replace(key$6207, val$9688); if (lastKey$6208 == null) { // found first key group lastKey$6208 = key$6207.copy(); agg0_sumIsNull = true; agg0_sum = ((org.apache.flink.table.data.DecimalData) null); agg1_sumIsNull = true; agg1_sum = ((org.apache.flink.table.data.DecimalData) null); agg3_sum = ((org.apache.flink.table.data.DecimalData) null); agg3_sum = ((org.apache.flink.table.data.DecimalData) null); } else if (lastKey$6209 == null) { agg2_sum = ((org.apache.flink.table.data.DecimalData) null); agg3_sumIsNull = true; } else { agg2_sumIsNull = true; }}; {code} will be converted by BlockStatementSplitter to: {code:java} while ( (kvPair$9687 = (org.apache.flink.api.java.tuple.Tuple2) iterator.next()) != null) { top_whileBody0_0(); if (lastKey$6208 == null) { // found first key group top_whileBody0_0_ifBody0(); } else if (lastKey$6209 == null) { top_whileBody0_0_ifBody1_ifBody0(); } else { top_whileBody0_0_ifBody1_ifBody1(); }}; {code} Further this will be converted by BlockStatementGrouper with maxMethodLength parameter set to 4000 to: {code:java} while ( (kvPair$9687 = (org.apache.flink.api.java.tuple.Tuple2) iterator.next()) != null) { top_rewriteGroup_0(); } {code} Body for the new methods would be: {code:java} private void top_whileBody0_0_ifBody1_ifBody0 { agg2_sum = ((org.apache.flink.table.data.DecimalData) null); agg3_sumIsNull = true; } private void top_whileBody0_0_ifBody1_ifBody1 { agg2_sumIsNull = true; } private void top_whileBody0_0 { key$6207 = (org.apache.flink.table.data.binary.BinaryRowData) kvPair$9687.f0; val$9688 = (org.apache.flink.table.data.binary.BinaryRowData) kvPair$9687.f1; local$5912.replace(key$6207, val$9688); } private 
void top_whileBody0_0_ifBody0 { lastKey$6208 = key$6207.copy(); agg0_sumIsNull = true; agg0_sum = ((org.apache.flink.table.data.DecimalData) null); agg1_sumIsNull = true; agg1_sum = ((org.apache.flink.table.data.DecimalData) null); agg3_sum = ((org.apache.flink.table.data.DecimalData) null); agg3_sum = ((org.apache.flink.table.data.DecimalData) null); } void top_rewriteGroup_0() { top_whileBody0_0(); if (lastKey$6208 == null) { // found first key group top_whileBody0_0_ifBody0(); } else if (lastKey$6209 == null) { top_whileBody0_0_ifBody1_ifBody0(); } else { top_whileBody0_0_ifBody1_ifBody1(); } } {code} What do you think [~TsReaper]? was (Author: kristoffsc): Hi [~TsReaper] Im sory for a long delay. I was actyally trying to develop a PoC fix for this problem. I think I managed to at least proof a concept. You can find my raft PR here -> https://github.com/apache/flink/pull/21393 The code from this PR made the SQL from this ticket to compile and execute which is at least something :) The idea is to enhance FunctionSplitter that for every codeBlock (getMergedCodeBlocks method) that is bigger than "maxMethodLength" try to further split it by calling two new splitters that I've created: 1. BlockStatementSplitter 2. BlockStatementGrouper The
[jira] [Comment Edited] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17638614#comment-17638614 ] Krzysztof Chmielewski edited comment on FLINK-27246 at 11/25/22 10:23 AM: -- Hi [~TsReaper] Im sory for a long delay. I was actyally trying to develop a PoC fix for this problem. I think I managed to at least proof a concept. You can find my raft PR here -> https://github.com/apache/flink/pull/21393 The code from this PR made the SQL from this ticket to compile and execute which is at least something :) The idea is to enhance FunctionSplitter that for every codeBlock (getMergedCodeBlocks method) that is bigger than "maxMethodLength" try to further split it by calling two new splitters that I've created: 1. BlockStatementSplitter 2. BlockStatementGrouper The BlockStatementSplitter splits body of WHILE, IF/ELSE statements to new methods. The original statement is rewritten that will call those new methods. Next ,the *BlockStatementGrouper* is groping calls created by *BlockStatementSplitter* to blocks with lengths < "maxMethodLength" and extracting those to another new method. Finally *BlockStatementGrouper* rewrites original code block to call methods created by *BlockStatementGrouper*. 
For example, an input statement:
{code:java}
while ((kvPair$9687 = (org.apache.flink.api.java.tuple.Tuple2) iterator.next()) != null) {
    key$6207 = (org.apache.flink.table.data.binary.BinaryRowData) kvPair$9687.f0;
    val$9688 = (org.apache.flink.table.data.binary.BinaryRowData) kvPair$9687.f1;
    // prepare input
    local$5912.replace(key$6207, val$9688);
    if (lastKey$6208 == null) {
        // found first key group
        lastKey$6208 = key$6207.copy();
        agg0_sumIsNull = true;
        agg0_sum = ((org.apache.flink.table.data.DecimalData) null);
        agg1_sumIsNull = true;
        agg1_sum = ((org.apache.flink.table.data.DecimalData) null);
        agg3_sum = ((org.apache.flink.table.data.DecimalData) null);
        agg3_sum = ((org.apache.flink.table.data.DecimalData) null);
    } else if (lastKey$6209 == null) {
        agg2_sum = ((org.apache.flink.table.data.DecimalData) null);
        agg3_sumIsNull = true;
    } else {
        agg2_sumIsNull = true;
    }
};
{code}
will be converted by BlockStatementSplitter to:
{code:java}
while ((kvPair$9687 = (org.apache.flink.api.java.tuple.Tuple2) iterator.next()) != null) {
    top_whileBody0_0();
    if (lastKey$6208 == null) {
        // found first key group
        top_whileBody0_0_ifBody0();
    } else if (lastKey$6209 == null) {
        top_whileBody0_0_ifBody1_ifBody0();
    } else {
        top_whileBody0_0_ifBody1_ifBody1();
    }
};
{code}
Next, this will be converted by BlockStatementGrouper (with the maxMethodLength parameter set to 4000) to:
{code:java}
while ((kvPair$9687 = (org.apache.flink.api.java.tuple.Tuple2) iterator.next()) != null) {
    top_rewriteGroup_0();
}
{code}
The new methods' bodies would be:
{code:java}
private void top_whileBody0_0_ifBody1_ifBody0() {
    agg2_sum = ((org.apache.flink.table.data.DecimalData) null);
    agg3_sumIsNull = true;
}

private void top_whileBody0_0_ifBody1_ifBody1() {
    agg2_sumIsNull = true;
}

private void top_whileBody0_0() {
    key$6207 = (org.apache.flink.table.data.binary.BinaryRowData) kvPair$9687.f0;
    val$9688 = (org.apache.flink.table.data.binary.BinaryRowData) kvPair$9687.f1;
    local$5912.replace(key$6207, val$9688);
}

private void top_whileBody0_0_ifBody0() {
    lastKey$6208 = key$6207.copy();
    agg0_sumIsNull = true;
    agg0_sum = ((org.apache.flink.table.data.DecimalData) null);
    agg1_sumIsNull = true;
    agg1_sum = ((org.apache.flink.table.data.DecimalData) null);
    agg3_sum = ((org.apache.flink.table.data.DecimalData) null);
    agg3_sum = ((org.apache.flink.table.data.DecimalData) null);
}

void top_rewriteGroup_0() {
    top_whileBody0_0();
    if (lastKey$6208 == null) {
        // found first key group
        top_whileBody0_0_ifBody0();
    } else if (lastKey$6209 == null) {
        top_whileBody0_0_ifBody1_ifBody0();
    } else {
        top_whileBody0_0_ifBody1_ifBody1();
    }
}
{code}
What do you think [~TsReaper]?
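As a rough illustration of the grouping step described above, here is a simplified, self-contained sketch: it packs consecutive extracted-method calls into groups whose combined source length stays below a limit, and rewrites the original body into one call per group. The names (StatementGrouperSketch, groupStatements, maxGroupLength) are hypothetical and do not reflect the actual BlockStatementGrouper implementation in the draft PR.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of the grouping idea: pack consecutive statements into
// groups whose combined length stays below maxGroupLength, then the caller
// can emit one extracted method per group and a rewritten body that only
// calls them. Illustrative only, not Flink's BlockStatementGrouper.
public class StatementGrouperSketch {

    static List<List<String>> groupStatements(List<String> statements, int maxGroupLength) {
        List<List<String>> groups = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int currentLength = 0;
        for (String stmt : statements) {
            // Close the current group once adding another statement would
            // push it past the limit (never emit an empty group).
            if (!current.isEmpty() && currentLength + stmt.length() >= maxGroupLength) {
                groups.add(current);
                current = new ArrayList<>();
                currentLength = 0;
            }
            current.add(stmt);
            currentLength += stmt.length();
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    // Rewrites the original block into one call per extracted group method.
    static String rewriteBody(int groupCount, String prefix) {
        StringBuilder body = new StringBuilder();
        for (int i = 0; i < groupCount; i++) {
            body.append(prefix).append("_rewriteGroup_").append(i).append("();\n");
        }
        return body.toString();
    }

    public static void main(String[] args) {
        List<String> stmts = List.of(
            "top_whileBody0_0();",
            "top_whileBody0_0_ifBody0();",
            "top_whileBody0_0_ifBody1_ifBody0();",
            "top_whileBody0_0_ifBody1_ifBody1();");
        List<List<String>> groups = groupStatements(stmts, 60);
        System.out.println(groups.size() + " groups");
        System.out.print(rewriteBody(groups.size(), "top"));
    }
}
```

With a much larger limit (e.g. the 4000 characters mentioned above) all four calls would land in a single group, matching the single top_rewriteGroup_0() call in the example.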
[jira] [Comment Edited] (FLINK-25920) Allow receiving updates of CommittableSummary
[ https://issues.apache.org/jira/browse/FLINK-25920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631277#comment-17631277 ] Krzysztof Chmielewski edited comment on FLINK-25920 at 11/9/22 8:41 PM: [~bdine] and [~qinjunjerry] could you share what kind of Sink you are using? Recently we were fixing various issues with the Sink architecture for Flink 1.15: https://issues.apache.org/jira/browse/FLINK-29509 https://issues.apache.org/jira/browse/FLINK-29512 https://issues.apache.org/jira/browse/FLINK-29627 One of the symptoms was this issue for setups with aligned checkpoints. You would need all 3 of those fixes, so you would need to use Flink 1.15.3 or 1.16.1 (both not yet released). > Allow receiving updates of CommittableSummary > - > > Key: FLINK-25920 > URL: https://issues.apache.org/jira/browse/FLINK-25920 > Project: Flink > Issue Type: Sub-task > Components: API / DataStream, Connectors / Common >Affects Versions: 1.15.0, 1.16.0 >Reporter: Fabian Paul >Priority: Major > > In the case of unaligned checkpoints, it might happen that the checkpoint > barrier overtakes the records and an empty committable summary is emitted > that needs to be corrected at a later point when the records arrive. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17628218#comment-17628218 ] Krzysztof Chmielewski edited comment on FLINK-27246 at 11/3/22 9:31 AM: [~TsReaper] I see why my comment was causing confusion, apologies for that. For a moment I thought that FunctionSplitter was rewriting this while block, but now I see clearly it does not. At the same time I saw that some "rewrite" process was applied to this block, but now I also see it was the MemberFieldRewriter logic. Long story short, I did not want to run JavaCodeSplitter again, but just to enhance the current logic to handle this "while" case. was (Author: kristoffsc): [~TsReaper] I see why my comment was causing confusion, apologies for that. For a moment I thought that FunctionSplitter was rewriting this while block, but now I see clearly it does not. At the same time I saw that some "rewrite" process was applied to this block, but now I also see it was the MemberFieldRewriter logic. Long story short, I did not want to run JavaCodeSplitter again, but just to extend the rewrite to handle this "while" case. 
> Code of method > "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" > of class "HashAggregateWithKeys$9211" grows beyond 64 KB > - > > Key: FLINK-27246 > URL: https://issues.apache.org/jira/browse/FLINK-27246 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.14.3 >Reporter: Maciej Bryński >Priority: Major > Attachments: endInput_falseFilter9123_split9704.txt > > > I think this bug should get fixed in > https://issues.apache.org/jira/browse/FLINK-23007 > Unfortunately I spotted it on Flink 1.14.3 > {code} > java.lang.RuntimeException: Could not instantiate generated class > 'HashAggregateWithKeys$9211' > at > org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:85) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.operators.CodeGenOperatorFactory.createStreamOperator(CodeGenOperatorFactory.java:40) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.api.operators.StreamOperatorFactoryUtil.createOperator(StreamOperatorFactoryUtil.java:81) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.OperatorChain.(OperatorChain.java:198) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.(RegularOperatorChain.java:63) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:666) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:654) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > 
org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at java.lang.Thread.run(Unknown Source) ~[?:?] > Caused by: org.apache.flink.util.FlinkRuntimeException: > org.apache.flink.api.common.InvalidProgramException: Table program cannot be > compiled. This is a bug. Please file an issue. > at > org.apache.flink.table.runtime.generated.CompileUtils.compile(CompileUtils.java:76) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.generated.GeneratedClass.compile(GeneratedClass.java:102) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:83) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > ... 11 more > Caused by: > org.apache.flink.shaded.guava30.com.google.common.util.concurrent.UncheckedExecutionException: > org.apache.flink.api.common.InvalidProgramException: Table program cannot be > compiled. This is a bug. Please file an issue. > at > org.apache.flink.shaded.guava30.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2051) >
[jira] [Commented] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17627602#comment-17627602 ] Krzysztof Chmielewski commented on FLINK-27246: --- Hi [~TsReaper] Thanks for replying. The extracted while-loop method contains many, many references to rewrite$index[] methods and a lot of self-contained if/else blocks. There are no break, return, or continue statements. I've attached the method body to this ticket. [^endInput_falseFilter9123_split9704.txt] So it looks like the body of this method was already reprocessed and rewritten by the Splitter. Given this, I was thinking that maybe if we were to "rewrite" it again, to further split it into smaller groups, we could fix the problem. What do you think? > Code of method > "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" > of class "HashAggregateWithKeys$9211" grows beyond 64 KB > - > > Key: FLINK-27246 > URL: https://issues.apache.org/jira/browse/FLINK-27246 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.14.3 >Reporter: Maciej Bryński >Priority: Major > Attachments: endInput_falseFilter9123_split9704.txt > > > I think this bug should get fixed in > https://issues.apache.org/jira/browse/FLINK-23007 > Unfortunately I spotted it on Flink 1.14.3 > {code} > java.lang.RuntimeException: Could not instantiate generated class > 'HashAggregateWithKeys$9211' > at > org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:85) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.operators.CodeGenOperatorFactory.createStreamOperator(CodeGenOperatorFactory.java:40) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.api.operators.StreamOperatorFactoryUtil.createOperator(StreamOperatorFactoryUtil.java:81) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > 
org.apache.flink.streaming.runtime.tasks.OperatorChain.(OperatorChain.java:198) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.(RegularOperatorChain.java:63) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:666) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:654) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at java.lang.Thread.run(Unknown Source) ~[?:?] > Caused by: org.apache.flink.util.FlinkRuntimeException: > org.apache.flink.api.common.InvalidProgramException: Table program cannot be > compiled. This is a bug. Please file an issue. > at > org.apache.flink.table.runtime.generated.CompileUtils.compile(CompileUtils.java:76) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.generated.GeneratedClass.compile(GeneratedClass.java:102) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:83) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > ... 
11 more > Caused by: > org.apache.flink.shaded.guava30.com.google.common.util.concurrent.UncheckedExecutionException: > org.apache.flink.api.common.InvalidProgramException: Table program cannot be > compiled. This is a bug. Please file an issue. > at > org.apache.flink.shaded.guava30.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2051) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.shaded.guava30.com.google.common.cache.LocalCache.get(LocalCache.java:3962) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.shaded.guava30.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4859) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at >
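The "split it again" idea from the FLINK-27246 comments above (extracting self-contained if/else branch bodies into their own methods so that no single generated method grows past the JVM's 64 KB bytecode limit) can be sketched, in a hypothetical string-rewriting form, as follows. The class and method names here (IfElseSplitSketch, extractBranches, renderMethods) are illustrative only and are not part of Flink's code splitter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: take named branch bodies of an if/else statement,
// generate one extracted method per branch, and let the caller rewrite the
// original statement into calls to those methods. Illustrative only.
public class IfElseSplitSketch {

    // Maps a generated method name to the branch body it replaces,
    // preserving branch order.
    static Map<String, String> extractBranches(String prefix, String... bodies) {
        Map<String, String> methods = new LinkedHashMap<>();
        for (int i = 0; i < bodies.length; i++) {
            methods.put(prefix + "_ifBody" + i, bodies[i]);
        }
        return methods;
    }

    // Renders the extracted methods as Java source.
    static String renderMethods(Map<String, String> methods) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : methods.entrySet()) {
            sb.append("private void ").append(e.getKey()).append("() { ")
              .append(e.getValue()).append(" }\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Branch bodies loosely modeled on the generated aggregation code
        // quoted in the ticket.
        Map<String, String> methods = extractBranches(
            "top_whileBody0_0",
            "agg2_sum = null; agg3_sumIsNull = true;",
            "agg2_sumIsNull = true;");
        System.out.print(renderMethods(methods));
    }
}
```

Each branch body then counts against a separate method's 64 KB budget rather than against the single processElement method.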
[jira] [Updated] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krzysztof Chmielewski updated FLINK-27246: -- Attachment: endInput_falseFilter9123_split9704.txt > Code of method > "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" > of class "HashAggregateWithKeys$9211" grows beyond 64 KB > - > > Key: FLINK-27246 > URL: https://issues.apache.org/jira/browse/FLINK-27246 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.14.3 >Reporter: Maciej Bryński >Priority: Major > Attachments: endInput_falseFilter9123_split9704.txt > > > I think this bug should get fixed in > https://issues.apache.org/jira/browse/FLINK-23007 > Unfortunately I spotted it on Flink 1.14.3 > {code} > java.lang.RuntimeException: Could not instantiate generated class > 'HashAggregateWithKeys$9211' > at > org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:85) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.operators.CodeGenOperatorFactory.createStreamOperator(CodeGenOperatorFactory.java:40) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.api.operators.StreamOperatorFactoryUtil.createOperator(StreamOperatorFactoryUtil.java:81) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.OperatorChain.(OperatorChain.java:198) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.(RegularOperatorChain.java:63) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:666) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:654) > 
~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at java.lang.Thread.run(Unknown Source) ~[?:?] > Caused by: org.apache.flink.util.FlinkRuntimeException: > org.apache.flink.api.common.InvalidProgramException: Table program cannot be > compiled. This is a bug. Please file an issue. > at > org.apache.flink.table.runtime.generated.CompileUtils.compile(CompileUtils.java:76) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.generated.GeneratedClass.compile(GeneratedClass.java:102) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:83) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > ... 11 more > Caused by: > org.apache.flink.shaded.guava30.com.google.common.util.concurrent.UncheckedExecutionException: > org.apache.flink.api.common.InvalidProgramException: Table program cannot be > compiled. This is a bug. Please file an issue. 
> at > org.apache.flink.shaded.guava30.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2051) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.shaded.guava30.com.google.common.cache.LocalCache.get(LocalCache.java:3962) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.shaded.guava30.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4859) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.generated.CompileUtils.compile(CompileUtils.java:74) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.generated.GeneratedClass.compile(GeneratedClass.java:102) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:83) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > ... 11 more > Caused by: org.apache.flink.api.common.InvalidProgramException: Table
[jira] [Comment Edited] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17625871#comment-17625871 ] Krzysztof Chmielewski edited comment on FLINK-27246 at 10/29/22 8:24 AM: - Hi, I would like to try to fix this problem but I would appreciate any guidance. FYI, I've verified that it still occurs on the latest master branch. From what I've debugged, the problem is caused by one rewritten method created in: [JavaCodeSplitter -> new FunctionSplitter(text, maxMethodLength).rewrite()|https://github.com/apache/flink/blob/master/flink-table/flink-table-code-splitter/src/main/java/org/apache/flink/table/codesplit/JavaCodeSplitter.java#:~:text=new%20FunctionSplitter(text%2C%20maxMethodLength).rewrite()] The FunctionSplitter::visitMethodDeclaration method iterates through the JavaParser.BlockStatementContext elements from ctx.methodBody().block().blockStatement() -> [code|https://github.com/apache/flink/blob/87c33711fa3a4844598772ceafd66dd4a776eea9/flink-table/flink-table-code-splitter/src/main/java/org/apache/flink/table/codesplit/FunctionSplitter.java#L100:~:text=for%20(JavaParser.BlockStatementContext%20blockStatementContext] For every element we get its context string and add it to the splitFuncBodies list. Later we iterate through this list and merge the elements into [Code Blocks|https://github.com/apache/flink/blob/87c33711fa3a4844598772ceafd66dd4a776eea9/flink-table/flink-table-code-splitter/src/main/java/org/apache/flink/table/codesplit/FunctionSplitter.java#L100:~:text=getMergedCodeBlocks(List%3CString%3E%20codeBlock)] according to maxMethodLength. However, the problem is that in our case a single JavaParser.BlockStatementContext element from ctx.methodBody().block().blockStatement() is by itself larger than maxMethodLength. Its entire body is converted into one method by FunctionSplitter::getMergedCodeBlocks, and this causes the exception.
The code block is a big while loop that contains a bunch of calls to rewrite$ methods like so: {code:java} rewrite$9722[10418] = ((org.apache.flink.table.data.DecimalData) null); rewrite$9729[10431] = true; rewrite$9722[10419] = ((org.apache.flink.table.data.DecimalData) null); rewrite$9729[10432] = true; rewrite$9722[10420] = ((org.apache.flink.table.data.DecimalData) null); rewrite$9729[10433] = true; {code} These assignments could, in my opinion, easily be extracted into separate methods, which would solve the problem. I would like to ask: 1. Is my understanding, and the proposed high-level solution of splitting the problematic code block into smaller chunks, correct? 2. Should this be done by FunctionSplitter, or should it be implemented in the Scala code generation? I'm quite familiar with Antlr4 so I do understand what is happening there.
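The split proposed above (breaking a single oversized statement block into chunks that each fit under maxMethodLength, then wrapping each chunk in its own generated helper method) can be sketched roughly as follows. This is a hypothetical illustration, not Flink's actual FunctionSplitter code; the names splitOversizedBlock and blockChunk_N are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: split a statement list whose total size exceeds
// maxMethodLength into several generated helper methods. Not Flink's
// actual code-splitter implementation.
class BlockChunker {
    static List<String> splitOversizedBlock(List<String> statements, int maxMethodLength) {
        List<String> methods = new ArrayList<>();
        StringBuilder chunk = new StringBuilder();
        for (String stmt : statements) {
            // flush the current chunk before it would grow past the limit
            if (chunk.length() > 0 && chunk.length() + stmt.length() > maxMethodLength) {
                methods.add(wrap(methods.size(), chunk.toString()));
                chunk.setLength(0);
            }
            chunk.append(stmt).append('\n');
        }
        if (chunk.length() > 0) {
            methods.add(wrap(methods.size(), chunk.toString()));
        }
        return methods;
    }

    private static String wrap(int idx, String body) {
        return "void blockChunk_" + idx + "() {\n" + body + "}\n";
    }
}
```

The rewritten processElement body would then simply call blockChunk_0(); blockChunk_1(); ... in order, keeping each generated method under the JVM's 64 KB bytecode-per-method limit.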
[jira] [Commented] (FLINK-27246) Code of method "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" of class "HashAggregateWithKeys$9211" grows beyond 64 KB
[ https://issues.apache.org/jira/browse/FLINK-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17625871#comment-17625871 ] Krzysztof Chmielewski commented on FLINK-27246: --- Hi, I would like to try to fix this problem but I would appreciate any guidance. FYI, I've verified that it still occurs on the latest master branch. From what I've debugged, the problem is caused by one rewritten method created in: [JavaCodeSplitter -> new FunctionSplitter(text, maxMethodLength).rewrite()|https://github.com/apache/flink/blob/master/flink-table/flink-table-code-splitter/src/main/java/org/apache/flink/table/codesplit/JavaCodeSplitter.java#:~:text=new%20FunctionSplitter(text%2C%20maxMethodLength).rewrite()] The FunctionSplitter::visitMethodDeclaration method iterates through the JavaParser.BlockStatementContext elements from ctx.methodBody().block().blockStatement() -> [code|https://github.com/apache/flink/blob/87c33711fa3a4844598772ceafd66dd4a776eea9/flink-table/flink-table-code-splitter/src/main/java/org/apache/flink/table/codesplit/FunctionSplitter.java#L100:~:text=for%20(JavaParser.BlockStatementContext%20blockStatementContext] For every element we get its context string and add it to the splitFuncBodies list. Later we iterate through this list and merge the elements into [Code Blocks|https://github.com/apache/flink/blob/87c33711fa3a4844598772ceafd66dd4a776eea9/flink-table/flink-table-code-splitter/src/main/java/org/apache/flink/table/codesplit/FunctionSplitter.java#L100:~:text=getMergedCodeBlocks(List%3CString%3E%20codeBlock)] according to maxMethodLength. However, the problem is that in our case a single JavaParser.BlockStatementContext element from ctx.methodBody().block().blockStatement() is by itself larger than maxMethodLength. Its entire body is converted into one method by FunctionSplitter::getMergedCodeBlocks, and this causes the exception.
The code block is a big while loop that contains a bunch of calls to rewrite$ methods like so: {code:java} rewrite$9722[10418] = ((org.apache.flink.table.data.DecimalData) null); rewrite$9729[10431] = true; rewrite$9722[10419] = ((org.apache.flink.table.data.DecimalData) null); rewrite$9729[10432] = true; rewrite$9722[10420] = ((org.apache.flink.table.data.DecimalData) null); rewrite$9729[10433] = true; {code} These assignments could, in my opinion, easily be extracted into separate methods, which would solve the problem. I would like to ask: 1. Is my understanding, and the proposed high-level solution of splitting the problematic code block into smaller chunks, correct? 2. Should this be done by FunctionSplitter, or should it be implemented in the Scala code generation? I'm quite familiar with Antlr4 so I do understand what is happening there. > Code of method > "processElement(Lorg/apache/flink/streaming/runtime/streamrecord/StreamRecord;)V" > of class "HashAggregateWithKeys$9211" grows beyond 64 KB > - > > Key: FLINK-27246 > URL: https://issues.apache.org/jira/browse/FLINK-27246 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.14.3 >Reporter: Maciej Bryński >Priority: Major > > > I think this bug should get fixed in > https://issues.apache.org/jira/browse/FLINK-23007 > Unfortunately I spotted it on Flink 1.14.3 > {code} > java.lang.RuntimeException: Could not instantiate generated class > 'HashAggregateWithKeys$9211' > at > org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:85) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.table.runtime.operators.CodeGenOperatorFactory.createStreamOperator(CodeGenOperatorFactory.java:40) > ~[flink-table_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.api.operators.StreamOperatorFactoryUtil.createOperator(StreamOperatorFactoryUtil.java:81) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > 
org.apache.flink.streaming.runtime.tasks.OperatorChain.(OperatorChain.java:198) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.(RegularOperatorChain.java:63) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:666) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:654) > ~[flink-dist_2.12-1.14.3-stream1.jar:1.14.3-stream1] > at >
[jira] [Commented] (FLINK-29459) Sink v2 has bugs in supporting legacy v1 implementations with global committer
[ https://issues.apache.org/jira/browse/FLINK-29459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17621187#comment-17621187 ] Krzysztof Chmielewski commented on FLINK-29459: --- FYI, tickets FLINK-29509, FLINK-29512 and FLINK-29627 fix issues with Task Manager recovery for the Sink architecture with a global committer. FLINK-29583 is about recovering the Flink 1.14 unified sink committer state and migrating it to the extended unified model. > Sink v2 has bugs in supporting legacy v1 implementations with global committer > -- > > Key: FLINK-29459 > URL: https://issues.apache.org/jira/browse/FLINK-29459 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Affects Versions: 1.16.0, 1.17.0, 1.15.2 >Reporter: Yun Gao >Assignee: Yun Gao >Priority: Major > Fix For: 1.17.0, 1.15.3, 1.16.1 > > > Currently, when supporting Sink implementations written against the version 1 interface, > there are issues after restoring from a checkpoint after a failover: > # In the global committer operator, when restoring a SubtaskCommittableManager, > the subtask id is replaced with the one of the current operator. This means > that the id is originally the id of the sender task (0 ~ N - 1), but after > restoring it becomes 0, which causes a duplicate key exception during restore. > # For the Committer operator, the subtaskId of CheckpointCommittableManagerImpl > is always restored to 0 after failover for all subtasks. This means the > summary sent to the Global Committer carries the wrong subtask id. > # For the Committer operator, the checkpoint id of SubtaskCommittableManager is > always restored to 1 after failover, so the subsequent committables sent > to the global committer carry the wrong checkpoint id. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-29589) Data Loss in Sink GlobalCommitter during Task Manager recovery
[ https://issues.apache.org/jira/browse/FLINK-29589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17616307#comment-17616307 ] Krzysztof Chmielewski edited comment on FLINK-29589 at 10/20/22 9:10 AM: - Hi [~chesnay] Sink V2 on 1.15, 1.16 and 1.17 has its own issues that we have found, and we are actually working with Fabian Paul to fix them. https://issues.apache.org/jira/browse/FLINK-29509 https://issues.apache.org/jira/browse/FLINK-29583 https://issues.apache.org/jira/browse/FLINK-29512 https://issues.apache.org/jira/browse/FLINK-29627 With those still on the plate we can't really tell whether there is data loss in V2, since the Task Manager fails to start during recovery when running a Sink with a global committer. > Data Loss in Sink GlobalCommitter during Task Manager recovery > -- > > Key: FLINK-29589 > URL: https://issues.apache.org/jira/browse/FLINK-29589 > Project: Flink > Issue Type: Bug >Affects Versions: 1.14.0 >Reporter: Krzysztof Chmielewski >Priority: Blocker > > Flink's Sink architecture with a global committer seems to be vulnerable to > data loss during Task Manager recovery. An entire checkpoint can be lost by > the _GlobalCommitter_, resulting in data loss. > The issue was observed in the Delta Sink connector on a real 1.14.x cluster and was > replicated using Flink's 1.14.6 test utility classes. > Scenario: > # A streaming source emits a constant number of events per checkpoint (20 > events per commit, 5 commits in total, which gives 100 records). 
> # Sink with parallelism > 1 with committer and _GlobalCommitter_ elements. > # _Committers_ processed committables for *checkpointId 2*. > # _GlobalCommitter_ throws an exception (the desired exception) during > *checkpointId 2* (the third commit) while processing data from *checkpoint 1* (it > is expected in the global committer architecture to lag one commit behind the > rest of the pipeline). > # Task Manager recovery, source resumes sending data. > # Streaming source ends. > # We are missing 20 records (one checkpoint). > What is happening is that during recovery, committers are performing a "retry" > on committables for *checkpointId 2*, however those committables, reprocessed > by the "retry" task, are not emitted downstream to the global committer. > The issue can be reproduced using a JUnit test built with Flink's TestSink. > The test was [implemented > here|https://github.com/kristoffSC/flink/blob/Flink_1.14_DataLoss_SinkGlobalCommitter/flink-tests/src/test/java/org/apache/flink/test/streaming/runtime/SinkITCase.java#:~:text=testGlobalCommitterMissingRecordsDuringRecovery] > and it is based on other tests from the `SinkITCase.java` class. > The test reproduces the issue in more than 90% of runs. > I believe the problem is somewhere around the > *SinkOperator::notifyCheckpointComplete* method. In there we see that the retry > async task is scheduled, however its result is never emitted downstream, as is > done for the regular flow one line above. -- This message was sent by Atlassian Jira (v8.20.10#820010)
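The suspected fix described above (forwarding the async retry task's results downstream, just as the regular flow does one line above) can be sketched roughly as follows. This is a hypothetical illustration with made-up names (retryAndEmit, emitDownstream), not Flink's actual SinkOperator code.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

// Hypothetical sketch of the suspected fix: the committables reprocessed by
// the scheduled retry task must be forwarded to the downstream consumer,
// mirroring what the regular (non-retry) flow already does. Not Flink code.
class RetrySketch {
    static void retryAndEmit(List<String> retriedCommittables,
                             Consumer<String> emitDownstream) {
        CompletableFuture
                .supplyAsync(() -> retriedCommittables)       // stand-in for the async retry task
                .thenAccept(cs -> cs.forEach(emitDownstream)) // the step the report says is missing
                .join();
    }
}
```

If the retry result is only computed but never handed to the downstream emitter, the global committer never sees the recovered checkpoint's committables, which matches the missing 20 records in the scenario.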
[jira] [Commented] (FLINK-29627) Sink - Duplicate key exception during recover more than 1 committable.
[ https://issues.apache.org/jira/browse/FLINK-29627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17620355#comment-17620355 ] Krzysztof Chmielewski commented on FLINK-29627: --- Backports: 1.15 - https://github.com/apache/flink/pull/21113 1.16 - https://github.com/apache/flink/pull/21115 > Sink - Duplicate key exception during recover more than 1 committable. > -- > > Key: FLINK-29627 > URL: https://issues.apache.org/jira/browse/FLINK-29627 > Project: Flink > Issue Type: Bug >Affects Versions: 1.16.0, 1.17.0, 1.15.2, 1.16.1 >Reporter: Krzysztof Chmielewski >Assignee: Krzysztof Chmielewski >Priority: Critical > > Recovering more than one committable causes an `IllegalStateException` and > prevents the cluster from starting. > When we recover the `CheckpointCommittableManager` we deserialize > SubtaskCommittableManager instances from the recovery state and put them into a > `Map<Integer, SubtaskCommittableManager>`. The key of this map is the > subtaskId of the recovered manager. However, this will fail if we have to > recover more than one committable. > What we should do is call `SubtaskCommittableManager::merge` if we have already > deserialized a manager for this subtaskId. > Stack Trace: > {code:java} > 28603 [flink-akka.actor.default-dispatcher-8] INFO > org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Sink: Global > Committer (1/1) > (485dc57aca56235b9d1ab803c8c966ad_47d89856a1cf553f16e7063d953b7d42_0_1) > switched from INITIALIZING to FAILED on 2ed5c848-d360-48ae-9a92-730b022c8a39 > @ kubernetes.docker.internal (dataPort=-1). > java.lang.IllegalStateException: Duplicate key 0 (attempted merging values > org.apache.flink.streaming.runtime.operators.sink.committables.SubtaskCommittableManager@631940ac > and > org.apache.flink.streaming.runtime.operators.sink.committables.SubtaskCommittableManager@7ff3bd7) > at > java.util.stream.Collectors.duplicateKeyException(Collectors.java:133) ~[?:?] > at > java.util.stream.Collectors.lambda$uniqKeysMapAccumulator$1(Collectors.java:180) > ~[?:?] 
> at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169) > ~[?:?] > at > java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655) > ~[?:?] > at > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?] > at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) > ~[?:?] > at > java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913) > ~[?:?] > at > java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?] > at > java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578) ~[?:?] > at > org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer$CheckpointSimpleVersionedSerializer.deserialize(CommittableCollectorSerializer.java:153) > ~[classes/:?] > at > org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer$CheckpointSimpleVersionedSerializer.deserialize(CommittableCollectorSerializer.java:124) > ~[classes/:?] > at > org.apache.flink.core.io.SimpleVersionedSerialization.readVersionAndDeserializeList(SimpleVersionedSerialization.java:148) > ~[classes/:?] > at > org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer.deserializeV2(CommittableCollectorSerializer.java:105) > ~[classes/:?] > at > org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer.deserialize(CommittableCollectorSerializer.java:82) > ~[classes/:?] > at > org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer.deserialize(CommittableCollectorSerializer.java:41) > ~[classes/:?] > at > org.apache.flink.core.io.SimpleVersionedSerialization.readVersionAndDeSerialize(SimpleVersionedSerialization.java:121) > ~[classes/:?] > at > org.apache.flink.streaming.api.connector.sink2.GlobalCommitterSerializer.deserializeV2(GlobalCommitterSerializer.java:128) > ~[classes/:?] 
> at > org.apache.flink.streaming.api.connector.sink2.GlobalCommitterSerializer.deserialize(GlobalCommitterSerializer.java:99) > ~[classes/:?] > at > org.apache.flink.streaming.api.connector.sink2.GlobalCommitterSerializer.deserialize(GlobalCommitterSerializer.java:42) > ~[classes/:?] > at > org.apache.flink.core.io.SimpleVersionedSerialization.readVersionAndDeSerialize(SimpleVersionedSerialization.java:227) > ~[classes/:?] > at > org.apache.flink.streaming.api.operators.util.SimpleVersionedListState$DeserializingIterator.next(SimpleVersionedListState.java:138) > ~[classes/:?] > at
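The fix described in the issue body (merging a newly deserialized SubtaskCommittableManager into an existing one instead of failing on a duplicate subtaskId key) can be illustrated with a toy example. Plain integers stand in for SubtaskCommittableManager here; this is not Flink's actual serializer code.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Toy illustration of the failure mode and fix. Collectors.toMap without a
// merge function throws IllegalStateException ("Duplicate key ...") when two
// recovered entries share a subtaskId; supplying a merge function (the
// analogue of SubtaskCommittableManager::merge) combines them instead.
class RecoveryMergeSketch {
    // each recovered entry: {subtaskId, pendingCommittableCount}
    static Map<Integer, Integer> restore(List<int[]> recovered) {
        return recovered.stream().collect(Collectors.toMap(
                e -> e[0],        // key: subtaskId
                e -> e[1],        // value: stand-in for the manager
                Integer::sum));   // merge on duplicate subtaskId instead of throwing
    }
}
```

Dropping the third argument to Collectors.toMap reproduces the "Duplicate key 0" exception from the stack trace above as soon as two recovered entries share subtaskId 0.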
[jira] [Commented] (FLINK-29627) Sink - Duplicate key exception during recover more than 1 committable.
[ https://issues.apache.org/jira/browse/FLINK-29627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17619570#comment-17619570 ] Krzysztof Chmielewski commented on FLINK-29627: --- New PR without SinkItTest https://github.com/apache/flink/pull/21101 > Sink - Duplicate key exception during recover more than 1 committable. > -- > > Key: FLINK-29627 > URL: https://issues.apache.org/jira/browse/FLINK-29627 > Project: Flink > Issue Type: Bug >Affects Versions: 1.16.0, 1.17.0, 1.15.2, 1.16.1 >Reporter: Krzysztof Chmielewski >Assignee: Krzysztof Chmielewski >Priority: Critical > > Recovering more than one committable causes an `IllegalStateException` and > prevents the cluster from starting. > When we recover the `CheckpointCommittableManager` we deserialize > SubtaskCommittableManager instances from the recovery state and put them into a > `Map<Integer, SubtaskCommittableManager>`. The key of this map is the > subtaskId of the recovered manager. However, this will fail if we have to > recover more than one committable. > What we should do is call `SubtaskCommittableManager::merge` if we have already > deserialized a manager for this subtaskId. > Stack Trace: > {code:java} > 28603 [flink-akka.actor.default-dispatcher-8] INFO > org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Sink: Global > Committer (1/1) > (485dc57aca56235b9d1ab803c8c966ad_47d89856a1cf553f16e7063d953b7d42_0_1) > switched from INITIALIZING to FAILED on 2ed5c848-d360-48ae-9a92-730b022c8a39 > @ kubernetes.docker.internal (dataPort=-1). > java.lang.IllegalStateException: Duplicate key 0 (attempted merging values > org.apache.flink.streaming.runtime.operators.sink.committables.SubtaskCommittableManager@631940ac > and > org.apache.flink.streaming.runtime.operators.sink.committables.SubtaskCommittableManager@7ff3bd7) > at > java.util.stream.Collectors.duplicateKeyException(Collectors.java:133) ~[?:?] > at > java.util.stream.Collectors.lambda$uniqKeysMapAccumulator$1(Collectors.java:180) > ~[?:?] 
> at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169) > ~[?:?] > at > java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655) > ~[?:?] > at > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?] > at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) > ~[?:?] > at > java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913) > ~[?:?] > at > java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?] > at > java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578) ~[?:?] > at > org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer$CheckpointSimpleVersionedSerializer.deserialize(CommittableCollectorSerializer.java:153) > ~[classes/:?] > at > org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer$CheckpointSimpleVersionedSerializer.deserialize(CommittableCollectorSerializer.java:124) > ~[classes/:?] > at > org.apache.flink.core.io.SimpleVersionedSerialization.readVersionAndDeserializeList(SimpleVersionedSerialization.java:148) > ~[classes/:?] > at > org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer.deserializeV2(CommittableCollectorSerializer.java:105) > ~[classes/:?] > at > org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer.deserialize(CommittableCollectorSerializer.java:82) > ~[classes/:?] > at > org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer.deserialize(CommittableCollectorSerializer.java:41) > ~[classes/:?] > at > org.apache.flink.core.io.SimpleVersionedSerialization.readVersionAndDeSerialize(SimpleVersionedSerialization.java:121) > ~[classes/:?] > at > org.apache.flink.streaming.api.connector.sink2.GlobalCommitterSerializer.deserializeV2(GlobalCommitterSerializer.java:128) > ~[classes/:?] 
> at > org.apache.flink.streaming.api.connector.sink2.GlobalCommitterSerializer.deserialize(GlobalCommitterSerializer.java:99) > ~[classes/:?] > at > org.apache.flink.streaming.api.connector.sink2.GlobalCommitterSerializer.deserialize(GlobalCommitterSerializer.java:42) > ~[classes/:?] > at > org.apache.flink.core.io.SimpleVersionedSerialization.readVersionAndDeSerialize(SimpleVersionedSerialization.java:227) > ~[classes/:?] > at > org.apache.flink.streaming.api.operators.util.SimpleVersionedListState$DeserializingIterator.next(SimpleVersionedListState.java:138) > ~[classes/:?] > at java.lang.Iterable.forEach(Iterable.java:74) ~[?:?] > at >
[jira] [Comment Edited] (FLINK-29627) Sink - Duplicate key exception during recover more than 1 committable.
[ https://issues.apache.org/jira/browse/FLINK-29627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17617611#comment-17617611 ] Krzysztof Chmielewski edited comment on FLINK-29627 at 10/14/22 9:54 AM: - PR ready: https://github.com/apache/flink/pull/21052 Pending on https://github.com/apache/flink/pull/21022 to be merged. was (Author: kristoffsc): PR ready but waiting on https://github.com/apache/flink/pull/21022 to be merged. https://github.com/apache/flink/pull/21052 > Sink - Duplicate key exception during recover more than 1 committable. > -- > > Key: FLINK-29627 > URL: https://issues.apache.org/jira/browse/FLINK-29627 > Project: Flink > Issue Type: Bug >Affects Versions: 1.16.0, 1.17.0, 1.15.2, 1.16.1 >Reporter: Krzysztof Chmielewski >Assignee: Krzysztof Chmielewski >Priority: Critical > > Recovering more than one committable causes an `IllegalStateException` and > prevents the cluster from starting. > When we recover the `CheckpointCommittableManager` we deserialize > SubtaskCommittableManager instances from the recovery state and put them into a > `Map<Integer, SubtaskCommittableManager>`. The key of this map is the > subtaskId of the recovered manager. However, this will fail if we have to > recover more than one committable. > What we should do is call `SubtaskCommittableManager::merge` if we have already > deserialized a manager for this subtaskId. > Stack Trace: > {code:java} > 28603 [flink-akka.actor.default-dispatcher-8] INFO > org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Sink: Global > Committer (1/1) > (485dc57aca56235b9d1ab803c8c966ad_47d89856a1cf553f16e7063d953b7d42_0_1) > switched from INITIALIZING to FAILED on 2ed5c848-d360-48ae-9a92-730b022c8a39 > @ kubernetes.docker.internal (dataPort=-1). 
> java.lang.IllegalStateException: Duplicate key 0 (attempted merging values > org.apache.flink.streaming.runtime.operators.sink.committables.SubtaskCommittableManager@631940ac > and > org.apache.flink.streaming.runtime.operators.sink.committables.SubtaskCommittableManager@7ff3bd7) > at > java.util.stream.Collectors.duplicateKeyException(Collectors.java:133) ~[?:?] > at > java.util.stream.Collectors.lambda$uniqKeysMapAccumulator$1(Collectors.java:180) > ~[?:?] > at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169) > ~[?:?] > at > java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655) > ~[?:?] > at > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?] > at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) > ~[?:?] > at > java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913) > ~[?:?] > at > java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?] > at > java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578) ~[?:?] > at > org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer$CheckpointSimpleVersionedSerializer.deserialize(CommittableCollectorSerializer.java:153) > ~[classes/:?] > at > org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer$CheckpointSimpleVersionedSerializer.deserialize(CommittableCollectorSerializer.java:124) > ~[classes/:?] > at > org.apache.flink.core.io.SimpleVersionedSerialization.readVersionAndDeserializeList(SimpleVersionedSerialization.java:148) > ~[classes/:?] > at > org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer.deserializeV2(CommittableCollectorSerializer.java:105) > ~[classes/:?] > at > org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer.deserialize(CommittableCollectorSerializer.java:82) > ~[classes/:?] 
> at > org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer.deserialize(CommittableCollectorSerializer.java:41) > ~[classes/:?] > at > org.apache.flink.core.io.SimpleVersionedSerialization.readVersionAndDeSerialize(SimpleVersionedSerialization.java:121) > ~[classes/:?] > at > org.apache.flink.streaming.api.connector.sink2.GlobalCommitterSerializer.deserializeV2(GlobalCommitterSerializer.java:128) > ~[classes/:?] > at > org.apache.flink.streaming.api.connector.sink2.GlobalCommitterSerializer.deserialize(GlobalCommitterSerializer.java:99) > ~[classes/:?] > at > org.apache.flink.streaming.api.connector.sink2.GlobalCommitterSerializer.deserialize(GlobalCommitterSerializer.java:42) > ~[classes/:?] > at > org.apache.flink.core.io.SimpleVersionedSerialization.readVersionAndDeSerialize(SimpleVersionedSerialization.java:227) > ~[classes/:?] >
[jira] [Comment Edited] (FLINK-29627) Sink - Duplicate key exception during recover more than 1 committable.
[ https://issues.apache.org/jira/browse/FLINK-29627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17617611#comment-17617611 ] Krzysztof Chmielewski edited comment on FLINK-29627 at 10/14/22 9:54 AM: - PR ready but waiting on https://github.com/apache/flink/pull/21022 to be merged. https://github.com/apache/flink/pull/21052 was (Author: kristoffsc): PR ready but waiting on #21022 to be merged. https://github.com/apache/flink/pull/21052 > Sink - Duplicate key exception during recover more than 1 committable. > -- > > Key: FLINK-29627 > URL: https://issues.apache.org/jira/browse/FLINK-29627 > Project: Flink > Issue Type: Bug >Affects Versions: 1.16.0, 1.17.0, 1.15.2, 1.16.1 >Reporter: Krzysztof Chmielewski >Assignee: Krzysztof Chmielewski >Priority: Critical > > Recovering more than one committable causes an `IllegalStateException` and > prevents the cluster from starting. > When we recover the `CheckpointCommittableManager` we deserialize > SubtaskCommittableManager instances from the recovery state and put them into a > `Map<Integer, SubtaskCommittableManager>`. The key of this map is the > subtaskId of the recovered manager. However, this will fail if we have to > recover more than one committable. > What we should do is call `SubtaskCommittableManager::merge` if we have already > deserialized a manager for this subtaskId. > Stack Trace: > {code:java} > 28603 [flink-akka.actor.default-dispatcher-8] INFO > org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Sink: Global > Committer (1/1) > (485dc57aca56235b9d1ab803c8c966ad_47d89856a1cf553f16e7063d953b7d42_0_1) > switched from INITIALIZING to FAILED on 2ed5c848-d360-48ae-9a92-730b022c8a39 > @ kubernetes.docker.internal (dataPort=-1). 
> java.lang.IllegalStateException: Duplicate key 0 (attempted merging values > org.apache.flink.streaming.runtime.operators.sink.committables.SubtaskCommittableManager@631940ac > and > org.apache.flink.streaming.runtime.operators.sink.committables.SubtaskCommittableManager@7ff3bd7) > at > java.util.stream.Collectors.duplicateKeyException(Collectors.java:133) ~[?:?] > at > java.util.stream.Collectors.lambda$uniqKeysMapAccumulator$1(Collectors.java:180) > ~[?:?] > at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169) > ~[?:?] > at > java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655) > ~[?:?] > at > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?] > at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) > ~[?:?] > at > java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913) > ~[?:?] > at > java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?] > at > java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578) ~[?:?] > at > org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer$CheckpointSimpleVersionedSerializer.deserialize(CommittableCollectorSerializer.java:153) > ~[classes/:?] > at > org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer$CheckpointSimpleVersionedSerializer.deserialize(CommittableCollectorSerializer.java:124) > ~[classes/:?] > at > org.apache.flink.core.io.SimpleVersionedSerialization.readVersionAndDeserializeList(SimpleVersionedSerialization.java:148) > ~[classes/:?] > at > org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer.deserializeV2(CommittableCollectorSerializer.java:105) > ~[classes/:?] > at > org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer.deserialize(CommittableCollectorSerializer.java:82) > ~[classes/:?] 
> at > org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer.deserialize(CommittableCollectorSerializer.java:41) > ~[classes/:?] > at > org.apache.flink.core.io.SimpleVersionedSerialization.readVersionAndDeSerialize(SimpleVersionedSerialization.java:121) > ~[classes/:?] > at > org.apache.flink.streaming.api.connector.sink2.GlobalCommitterSerializer.deserializeV2(GlobalCommitterSerializer.java:128) > ~[classes/:?] > at > org.apache.flink.streaming.api.connector.sink2.GlobalCommitterSerializer.deserialize(GlobalCommitterSerializer.java:99) > ~[classes/:?] > at > org.apache.flink.streaming.api.connector.sink2.GlobalCommitterSerializer.deserialize(GlobalCommitterSerializer.java:42) > ~[classes/:?] > at > org.apache.flink.core.io.SimpleVersionedSerialization.readVersionAndDeSerialize(SimpleVersionedSerialization.java:227) > ~[classes/:?] > at >
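The fix described in the issue — merging managers that share a subtaskId instead of failing on a duplicate map key — can be sketched as follows. `Manager` here is a hypothetical stand-in for Flink's `SubtaskCommittableManager` (only the subtaskId key and a merge operation matter for the illustration); the actual PR is linked above.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MergeOnRecoverySketch {

    // Hypothetical stand-in for SubtaskCommittableManager: holds a subtaskId
    // and a count of pending committables, and knows how to merge with another
    // manager for the same subtask.
    record Manager(int subtaskId, int pending) {
        Manager merge(Manager other) {
            return new Manager(subtaskId, pending + other.pending);
        }
    }

    public static void main(String[] args) {
        // Two recovered committables that belong to the same subtask (id 0).
        List<Manager> recovered = List.of(new Manager(0, 2), new Manager(0, 3));

        // The two-argument Collectors.toMap throws
        // "IllegalStateException: Duplicate key 0" on this input — the failure
        // seen in the stack trace. The three-argument overload takes a merge
        // function, which combines managers sharing a subtaskId instead.
        Map<Integer, Manager> bySubtask = recovered.stream()
                .collect(Collectors.toMap(Manager::subtaskId, m -> m, Manager::merge));

        System.out.println(bySubtask.get(0).pending()); // prints 5
    }
}
```

The same shape applies in the deserializer: keying the recovered managers by subtaskId must tolerate duplicates, because state recovery can legitimately hand back more than one committable per subtask.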
[jira] [Comment Edited] (FLINK-29627) Sink - Duplicate key exception during recover more than 1 committable.
[ https://issues.apache.org/jira/browse/FLINK-29627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17617611#comment-17617611 ] Krzysztof Chmielewski edited comment on FLINK-29627 at 10/14/22 9:54 AM: - PR ready: https://github.com/apache/flink/pull/21052 Pending on https://github.com/apache/flink/pull/21022. was (Author: kristoffsc): PR ready: https://github.com/apache/flink/pull/21052 Pending on https://github.com/apache/flink/pull/21022 to be merged.
[jira] [Comment Edited] (FLINK-29627) Sink - Duplicate key exception during recover more than 1 committable.
[ https://issues.apache.org/jira/browse/FLINK-29627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17617611#comment-17617611 ] Krzysztof Chmielewski edited comment on FLINK-29627 at 10/14/22 9:53 AM: - PR ready but waiting on #21022 to be merged. https://github.com/apache/flink/pull/21052 was (Author: kristoffsc): PR https://github.com/apache/flink/pull/21052
[jira] [Commented] (FLINK-29627) Sink - Duplicate key exception during recover more than 1 committable.
[ https://issues.apache.org/jira/browse/FLINK-29627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17617611#comment-17617611 ] Krzysztof Chmielewski commented on FLINK-29627: --- PR https://github.com/apache/flink/pull/21052
[jira] [Updated] (FLINK-29627) Sink - Duplicate key exception during recover more than 1 committable.
[ https://issues.apache.org/jira/browse/FLINK-29627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krzysztof Chmielewski updated FLINK-29627 (edited the issue description).
[jira] [Updated] (FLINK-29627) Sink - Duplicate key exception during recover more than 1 committable.
[ https://issues.apache.org/jira/browse/FLINK-29627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krzysztof Chmielewski updated FLINK-29627 (edited the issue description).
[jira] [Commented] (FLINK-29627) Sink - Duplicate key exception during recover more than 1 committable.
[ https://issues.apache.org/jira/browse/FLINK-29627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17616925#comment-17616925 ] Krzysztof Chmielewski commented on FLINK-29627: --- I have a fix and tests for this issue. Will provide PR shortly
[jira] [Created] (FLINK-29627) Sink - Duplicate key exception during recover more than 1 committable.
Krzysztof Chmielewski created FLINK-29627: - Summary: Sink - Duplicate key exception during recover more than 1 committable. Key: FLINK-29627 URL: https://issues.apache.org/jira/browse/FLINK-29627 Project: Flink Issue Type: Bug Affects Versions: 1.15.2, 1.16.0, 1.17.0, 1.16.1 Reporter: Krzysztof Chmielewski Recovery more then one Committable causes `IllegalStateException` and prevents cluster to start. When we recover the `CheckpointCommittableManager` we deserialize SubtaskCommittableManager instances from recovery state and we put them into `Map>`. The key of this map is subtaskId of the recovered manager. However this will fail if we have to recover more than one committable. What w should do is to call `SubtaskCommittableManager::merge` if we already deserialzie manager for this subtaskId. Stack Trace: {code:java} 28603 [flink-akka.actor.default-dispatcher-8] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Sink: Global Committer (1/1) (485dc57aca56235b9d1ab803c8c966ad_47d89856a1cf553f16e7063d953b7d42_0_1) switched from INITIALIZING to FAILED on 2ed5c848-d360-48ae-9a92-730b022c8a39 @ kubernetes.docker.internal (dataPort=-1). java.lang.IllegalStateException: Duplicate key 0 (attempted merging values org.apache.flink.streaming.runtime.operators.sink.committables.SubtaskCommittableManager@631940ac and org.apache.flink.streaming.runtime.operators.sink.committables.SubtaskCommittableManager@7ff3bd7) at java.util.stream.Collectors.duplicateKeyException(Collectors.java:133) ~[?:?] at java.util.stream.Collectors.lambda$uniqKeysMapAccumulator$1(Collectors.java:180) ~[?:?] at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169) ~[?:?] at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655) ~[?:?] at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?] at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[?:?] 
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913) ~[?:?] at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?] at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578) ~[?:?] at org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer$CheckpointSimpleVersionedSerializer.deserialize(CommittableCollectorSerializer.java:153) ~[classes/:?] at org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer$CheckpointSimpleVersionedSerializer.deserialize(CommittableCollectorSerializer.java:124) ~[classes/:?] at org.apache.flink.core.io.SimpleVersionedSerialization.readVersionAndDeserializeList(SimpleVersionedSerialization.java:148) ~[classes/:?] at org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer.deserializeV2(CommittableCollectorSerializer.java:105) ~[classes/:?] at org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer.deserialize(CommittableCollectorSerializer.java:82) ~[classes/:?] at org.apache.flink.streaming.runtime.operators.sink.committables.CommittableCollectorSerializer.deserialize(CommittableCollectorSerializer.java:41) ~[classes/:?] at org.apache.flink.core.io.SimpleVersionedSerialization.readVersionAndDeSerialize(SimpleVersionedSerialization.java:121) ~[classes/:?] at org.apache.flink.streaming.api.connector.sink2.GlobalCommitterSerializer.deserializeV2(GlobalCommitterSerializer.java:128) ~[classes/:?] at org.apache.flink.streaming.api.connector.sink2.GlobalCommitterSerializer.deserialize(GlobalCommitterSerializer.java:99) ~[classes/:?] at org.apache.flink.streaming.api.connector.sink2.GlobalCommitterSerializer.deserialize(GlobalCommitterSerializer.java:42) ~[classes/:?] at org.apache.flink.core.io.SimpleVersionedSerialization.readVersionAndDeSerialize(SimpleVersionedSerialization.java:227) ~[classes/:?] 
at org.apache.flink.streaming.api.operators.util.SimpleVersionedListState$DeserializingIterator.next(SimpleVersionedListState.java:138) ~[classes/:?] at java.lang.Iterable.forEach(Iterable.java:74) ~[?:?] at org.apache.flink.streaming.api.connector.sink2.GlobalCommitterOperator.initializeState(GlobalCommitterOperator.java:133) ~[classes/:?] at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.initializeOperatorState(StreamOperatorStateHandler.java:122) ~[classes/:?] at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:286) ~[classes/:?] at
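The failure mode and the suggested fix can be sketched outside Flink. Below is a minimal, self-contained illustration in which `Manager` is a hypothetical stand-in for `SubtaskCommittableManager` (only a subtask id and a count), not Flink's actual class: `Collectors.toMap` without a merge function throws the `Duplicate key` `IllegalStateException` seen in the trace, while passing a merge function (analogous to `SubtaskCommittableManager::merge`) combines the recovered managers instead.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DuplicateKeySketch {

    // Hypothetical stand-in for SubtaskCommittableManager.
    static final class Manager {
        final int subtaskId;
        final int committables;

        Manager(int subtaskId, int committables) {
            this.subtaskId = subtaskId;
            this.committables = committables;
        }

        // Stand-in for SubtaskCommittableManager::merge.
        Manager merge(Manager other) {
            return new Manager(subtaskId, committables + other.committables);
        }
    }

    // Buggy variant: toMap without a merge function throws
    // IllegalStateException ("Duplicate key 0 ...") as soon as two
    // recovered managers share a subtask id.
    static Map<Integer, Manager> recoverBuggy(List<Manager> recovered) {
        return recovered.stream()
                .collect(Collectors.toMap(m -> m.subtaskId, m -> m));
    }

    // Fixed variant: duplicates for the same subtask id are merged.
    static Map<Integer, Manager> recoverFixed(List<Manager> recovered) {
        return recovered.stream()
                .collect(Collectors.toMap(m -> m.subtaskId, m -> m, Manager::merge));
    }

    public static void main(String[] args) {
        List<Manager> recovered =
                List.of(new Manager(0, 3), new Manager(0, 2), new Manager(1, 1));

        try {
            recoverBuggy(recovered);
        } catch (IllegalStateException e) {
            System.out.println("buggy recovery failed: " + e.getMessage());
        }

        Map<Integer, Manager> fixed = recoverFixed(recovered);
        // Subtask 0 keeps both recovered committable counts: 3 + 2 = 5.
        System.out.println("subtask 0 committables: " + fixed.get(0).committables);
    }
}
```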
[jira] [Commented] (FLINK-29589) Data Loss in Sink GlobalCommitter during Task Manager recovery
[ https://issues.apache.org/jira/browse/FLINK-29589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17616307#comment-17616307 ] Krzysztof Chmielewski commented on FLINK-29589: --- Hi [~chesnay] V2 on 1.15, 1.16 and 1.17 has its own issues that we have found, and we are working with Fabian Paul to fix them. https://issues.apache.org/jira/browse/FLINK-29509 https://issues.apache.org/jira/browse/FLINK-29583 https://issues.apache.org/jira/browse/FLINK-29512 With those still on the plate we can't really tell whether there is data loss on V2, since the Task Manager fails to start during recovery when running a Sink with a global committer. > Data Loss in Sink GlobalCommitter during Task Manager recovery > -- > > Key: FLINK-29589 > URL: https://issues.apache.org/jira/browse/FLINK-29589 > Project: Flink > Issue Type: Bug >Affects Versions: 1.14.0 >Reporter: Krzysztof Chmielewski >Priority: Blocker > > Flink's Sink architecture with a global committer seems to be vulnerable to > data loss during Task Manager recovery. The entire checkpoint can be lost by > the _GlobalCommitter_, resulting in data loss. > The issue was observed in the Delta Sink connector on a real 1.14.x cluster and > was replicated using Flink's 1.14.6 test utility classes. > Scenario: > # Streaming source emitting a constant number of events per checkpoint (20 > events per commit for 5 commits in total, which gives 100 records). > # Sink with parallelism > 1 with committer and _GlobalCommitter_ elements. > # _Committers_ processed committables for *checkpointId 2*. > # _GlobalCommitter_ throws an exception (the desired exception) during > *checkpointId 2* (third commit) while processing data from *checkpoint 1* (the > global committer architecture is expected to lag one commit behind the rest of > the pipeline). > # Task Manager recovery, source resumes sending data. > # Streaming source ends. > # We are missing 20 records (one checkpoint). 
> What happens is that during recovery, committers perform a "retry" on > committables for *checkpointId 2*; however those committables, reprocessed by > the "retry" task, are not emitted downstream to the global committer. > The issue can be reproduced using a JUnit test built with Flink's TestSink. > The test was [implemented > here|https://github.com/kristoffSC/flink/blob/Flink_1.14_DataLoss_SinkGlobalCommitter/flink-tests/src/test/java/org/apache/flink/test/streaming/runtime/SinkITCase.java#:~:text=testGlobalCommitterMissingRecordsDuringRecovery] > and it is based on other tests from the `SinkITCase.java` class. > The test reproduces the issue in more than 90% of runs. > I believe the problem is somewhere around the > *SinkOperator::notifyCheckpointComplete* method. There we see that the retry > async task is scheduled, however its result is never emitted downstream as it > is for the regular flow one line above. -- This message was sent by Atlassian Jira (v8.20.10#820010)
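The suspected pattern around *SinkOperator::notifyCheckpointComplete* can be sketched in isolation. Everything below is hypothetical: `emit`, `notifyCheckpointComplete`, and the string "committables" are illustrative stand-ins, not Flink's actual API. The sketch only shows how a scheduled retry task whose result is never forwarded downstream silently drops a checkpoint's committables, while the regular flow emits them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class RetryEmitSketch {
    // Stand-in for the downstream channel to the GlobalCommitter.
    static final List<String> downstream = new ArrayList<>();

    static void emit(String committable) {
        downstream.add(committable);
    }

    // emitRetryResults == false models the suspected bug: the retry task
    // runs and produces a result, but nothing forwards it downstream.
    static void notifyCheckpointComplete(List<String> succeeded,
                                         List<String> needRetry,
                                         boolean emitRetryResults) throws Exception {
        // Regular flow: successful committables are emitted downstream.
        succeeded.forEach(RetryEmitSketch::emit);

        // Retry is scheduled as an async task...
        ExecutorService retryExecutor = Executors.newSingleThreadExecutor();
        Future<List<String>> retried = retryExecutor.submit(() -> needRetry);
        List<String> result = retried.get();

        // ...but in the buggy variant its result is simply dropped, so the
        // GlobalCommitter never sees the retried checkpoint's committables.
        if (emitRetryResults) {
            result.forEach(RetryEmitSketch::emit); // fixed behaviour
        }
        retryExecutor.shutdown();
    }

    public static void main(String[] args) throws Exception {
        notifyCheckpointComplete(List.of("cp3"), List.of("cp2-retried"), false);
        System.out.println("buggy: " + downstream); // cp2's committables are lost

        downstream.clear();
        notifyCheckpointComplete(List.of("cp3"), List.of("cp2-retried"), true);
        System.out.println("fixed: " + downstream); // both checkpoints arrive
    }
}
```

In the test linked above, the lost checkpoint corresponds to the 20 missing records: the retried committables of checkpoint 2 never reach the global committer.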
[jira] [Updated] (FLINK-29589) Data Loss in Sink GlobalCommitter during Task Manager recovery
[ https://issues.apache.org/jira/browse/FLINK-29589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krzysztof Chmielewski updated FLINK-29589: -- Description: Flink's Sink architecture with global committer seems to be vulnerable for data loss during Task Manager recovery. The entire checkpoint can be lost by _GlobalCommitter_ resulting with data loss. Issue was observed in Delta Sink connector on a real 1.14.x cluster and was replicated using Flink's 1.14.6 Test Utils classes. Scenario: # Streaming source emitting constant number of events per checkpoint (20 events per commit for 5 commits in total, that gives 100 records). # Sink with parallelism > 1 with committer and _GlobalCommitter_ elements. # _Commiters_ processed committables for *checkpointId 2*. # _GlobalCommitter_ throws exception (desired exception) during *checkpointId 2* (third commit) while processing data from *checkpoint 1* (it is expected to global committer architecture lag one commit behind in reference to rest of the pipeline). # Task Manager recovery, source resumes sending data. # Streaming source ends. # We are missing 20 records (one checkpoint). What is happening is that during recovery, committers are performing "retry" on committables for *checkpointId 2*, however those committables, reprocessed from "retry" task are not emit downstream to the global committer. The issue can be reproduced using Junit Test build with Flink's TestSink. The test was [implemented here|https://github.com/kristoffSC/flink/blob/Flink_1.14_DataLoss_SinkGlobalCommitter/flink-tests/src/test/java/org/apache/flink/test/streaming/runtime/SinkITCase.java#:~:text=testGlobalCommitterMissingRecordsDuringRecovery] and it is based on other tests from `SinkITCase.java` class. The test reproduces the issue in more than 90% of runs. I believe that problem is somewhere around *SinkOperator::notifyCheckpointComplete* method. 
In there we see that Retry async task is scheduled however its result is never emitted downstream like it is done for regular flow one line above. was: Flink's Sink architecture with global committer seems to be vulnerable for data loss during Task Manager recovery. The entire checkpoint can be lost by `GlobalCommitter` resulting with data loss. Issue was observed in Delta Sink connector on a real 1.14.x cluster and was replicated using Flink's 1.14.6 Test Utils classes. Scenario: # Streaming source emitting constant number of events per checkpoint (20 events per commit for 5 commits in total, that gives 100 records). # Sink with parallelism > 1 with committer and `GlobalCommitter` elements. # `Commiters` processed committables for checkpointId 2. # `GlobalCommitter` throws exception (desired exception) during `checkpointId 2` (third commit) while processing data from checkpoint 1 (it is expected to global committer architecture lag one commit behind in reference to rest of the pipeline). # Task Manager recovery, source resumes sending data. # Streaming source ends. # We are missing 20 records (one checkpoint). What is happening is that during recovery, committers are performing "retry" on committables for `checkpointId 2`, however those committables, reprocessed from "retry" task are not emit downstream to the global committer. The issue can be reproduced using Junit Test build with Flink's TestSink. The test was [implemented here|https://github.com/kristoffSC/flink/blob/Flink_1.14_DataLoss_SinkGlobalCommitter/flink-tests/src/test/java/org/apache/flink/test/streaming/runtime/SinkITCase.java#:~:text=testGlobalCommitterMissingRecordsDuringRecovery] and it is based on other tests from `SinkITCase.java` class. The test reproduces the issue in more than 90% of runs. I believe that problem is somewhere around `SinkOperator::notifyCheckpointComplete` method. 
In there we see that Retry async task is scheduled however its result is never emitted downstream like it is done for regular flow one line above. > Data Loss in Sink GlobalCommitter during Task Manager recovery > -- > > Key: FLINK-29589 > URL: https://issues.apache.org/jira/browse/FLINK-29589 > Project: Flink > Issue Type: Bug >Affects Versions: 1.14.0, 1.14.2, 1.14.3, 1.14.4, 1.14.5, 1.14.6 >Reporter: Krzysztof Chmielewski >Priority: Blocker > > Flink's Sink architecture with global committer seems to be vulnerable for > data loss during Task Manager recovery. The entire checkpoint can be lost by > _GlobalCommitter_ resulting with data loss. > Issue was observed in Delta Sink connector on a real 1.14.x cluster and was > replicated using Flink's 1.14.6 Test Utils classes. > Scenario: > # Streaming source emitting constant number of events per checkpoint (20 > events per commit for 5 commits in total, that gives
[jira] [Updated] (FLINK-29589) Data Loss in Sink GlobalCommitter during Task Manager recovery
[ https://issues.apache.org/jira/browse/FLINK-29589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krzysztof Chmielewski updated FLINK-29589: -- Description: Flink's Sink architecture with global committer seems to be vulnerable for data loss during Task Manager recovery. The entire checkpoint can be lost by `GlobalCommitter` resulting with data loss. Issue was observed in Delta Sink connector on a real 1.14.x cluster and was replicated using Flink's 1.14.6 Test Utils classes. Scenario: # Streaming source emitting constant number of events per checkpoint (20 events per commit for 5 commits in total, that gives 100 records). # Sink with parallelism > 1 with committer and `GlobalCommitter` elements. # `Commiters` processed committables for checkpointId 2. # `GlobalCommitter` throws exception (desired exception) during `checkpointId 2` (third commit) while processing data from checkpoint 1 (it is expected to global committer architecture lag one commit behind in reference to rest of the pipeline). # Task Manager recovery, source resumes sending data. # Streaming source ends. # We are missing 20 records (one checkpoint). What is happening is that during recovery, committers are performing "retry" on committables for `checkpointId 2`, however those committables, reprocessed from "retry" task are not emit downstream to the global committer. The issue can be reproduced using Junit Test build with Flink's TestSink. The test was [implemented here|https://github.com/kristoffSC/flink/blob/Flink_1.14_DataLoss_SinkGlobalCommitter/flink-tests/src/test/java/org/apache/flink/test/streaming/runtime/SinkITCase.java#:~:text=testGlobalCommitterMissingRecordsDuringRecovery] and it is based on other tests from `SinkITCase.java` class. The test reproduces the issue in more than 90% of runs. I believe that problem is somewhere around `SinkOperator::notifyCheckpointComplete` method. 
In there we see that Retry async task is scheduled however its result is never emitted downstream like it is done for regular flow one line above. was: Flink's Sink architecture with global committer seems to be vulnerable for data loss during Task Manager recovery. The entire checkpoint can be lost by GlobalCommitter resulting with data loss. Issue was observed in Delta Sink connector on a real 1.14.x cluster and was replicated using Flink's 1.14.6 Test Utils classes. Scenario: # Streaming source emitting constant number of events per checkpoint (20 events per commit for 5 commits in total, that gives 100 records). # Sink with parallelism > 1 with committer and GlobalCommitter elements. # Commiters processed committables for checkpointId 2. # GlobalCommitter throws exception (desired exception) during checkpointId 2 (third commit) while processing data from checkpoint 1 (it is expected to global committer architecture lag one commit behind in reference to rest of the pipeline). # Task Manager recovery, source resumes sending data. # Streaming source ends. # We are missing 20 records (one checkpoint). What is happening is that during recovery, committers are performing "retry" on committables for checkpointId 2, however those committables, reprocessed from "retry" task are not emit downstream to the global committer. The issue can be reproduced using Junit Test builded with Flink's TestSink. The test was [implemented here|https://github.com/kristoffSC/flink/blob/Flink_1.14_DataLoss_SinkGlobalCommitter/flink-tests/src/test/java/org/apache/flink/test/streaming/runtime/SinkITCase.java#:~:text=testGlobalCommitterMissingRecordsDuringRecovery] and it is based on other tests from SinkITCase.java class. I believe that problem is somewhere around `SinkOperator::notifyCheckpointComplete` method. In there we see that Retry async task is scheduled however its result is never emitted downstream like it is done for regular flow one line above. 
> Data Loss in Sink GlobalCommitter during Task Manager recovery > -- > > Key: FLINK-29589 > URL: https://issues.apache.org/jira/browse/FLINK-29589 > Project: Flink > Issue Type: Bug >Affects Versions: 1.14.0, 1.14.2, 1.14.3, 1.14.4, 1.14.5, 1.14.6 >Reporter: Krzysztof Chmielewski >Priority: Blocker > > Flink's Sink architecture with global committer seems to be vulnerable for > data loss during Task Manager recovery. The entire checkpoint can be lost by > `GlobalCommitter` resulting with data loss. > Issue was observed in Delta Sink connector on a real 1.14.x cluster and was > replicated using Flink's 1.14.6 Test Utils classes. > Scenario: > # Streaming source emitting constant number of events per checkpoint (20 > events per commit for 5 commits in total, that gives 100 records). > # Sink with parallelism > 1 with committer and
[jira] [Updated] (FLINK-29589) Data Loss in Sink GlobalCommitter during Task Manager recovery
[ https://issues.apache.org/jira/browse/FLINK-29589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krzysztof Chmielewski updated FLINK-29589: -- Description: Flink's Sink architecture with global committer seems to be vulnerable for data loss during Task Manager recovery. The entire checkpoint can be lost by GlobalCommitter resulting with data loss. Issue was observed in Delta Sink connector on a real 1.14.x cluster and was replicated using Flink's 1.14.6 Test Utils classes. Scenario: # Streaming source emitting constant number of events per checkpoint (20 events per commit for 5 commits in total, that gives 100 records). # Sink with parallelism > 1 with committer and GlobalCommitter elements. # Commiters processed committables for checkpointId 2. # GlobalCommitter throws exception (desired exception) during checkpointId 2 (third commit) while processing data from checkpoint 1 (it is expected to global committer architecture lag one commit behind in reference to rest of the pipeline). # Task Manager recovery, source resumes sending data. # Streaming source ends. # We are missing 20 records (one checkpoint). What is happening is that during recovery, committers are performing "retry" on committables for checkpointId 2, however those committables, reprocessed from "retry" task are not emit downstream to the global committer. The issue can be reproduced using Junit Test builded with Flink's TestSink. The test was [implemented here|https://github.com/kristoffSC/flink/blob/Flink_1.14_DataLoss_SinkGlobalCommitter/flink-tests/src/test/java/org/apache/flink/test/streaming/runtime/SinkITCase.java#:~:text=testGlobalCommitterMissingRecordsDuringRecovery] and it is based on other tests from SinkITCase.java class. I believe that problem is somewhere around `SinkOperator::notifyCheckpointComplete` method. In there we see that Retry async task is scheduled however its result is never emitted downstream like it is done for regular flow one line above. 
was: Flink's Sink architecture with global committer seems to be vulnerable for data loss during Task Manager recovery. The entire checkpoint can be lost by GlobalCommitter resulting with data loss. Issue was observed in Delta Sink connector on a real 1.14.x cluster and was replicated using Flink's 1.14.6 Test Utils classes. Scenario: # Streaming source emitting constant number of events per checkpoint (20 events per commit for 5 commits in total, that gives 100 records). # Sink with parallelism > 1 with committer and GlobalCommitter elements. # Commiters processed committables for checkpointId 2. # GlobalCommitter throws exception (desired exception) during checkpointId 2 (third commit) while processing data from checkpoint 1 (it is expected to global committer architecture lag one commit behind in reference to rest of the pipeline). # Task Manager recovery, source resumes sending data. # Streaming source ends. # We are missing 20 records (one checkpoint). What is happening is that during recovery, committers are performing "retry" on committables for checkpointId 2, however those committables, reprocessed from "retry" task are not emit downstream to the global committer. The issue can be reproduced using Junit Test builded with Flink's TestSink. The test was implemented [here|https://github.com/kristoffSC/flink/blob/Flink_1.14_DataLoss_SinkGlobalCommitter/flink-tests/src/test/java/org/apache/flink/test/streaming/runtime/SinkITCase.java#:~:text=testGlobalCommitterMissingRecordsDuringRecovery] and it is based on other tests from SinkITCase.java class. I believe that problem is somewhere around `SinkOperator::notifyCheckpointComplete` method. In there we see that Retry async task is scheduled however its result is never emitted downstream like it is done for regular flow one line above. 
> Data Loss in Sink GlobalCommitter during Task Manager recovery > -- > > Key: FLINK-29589 > URL: https://issues.apache.org/jira/browse/FLINK-29589 > Project: Flink > Issue Type: Bug >Affects Versions: 1.14.0, 1.14.2, 1.14.3, 1.14.4, 1.14.5, 1.14.6 >Reporter: Krzysztof Chmielewski >Priority: Blocker > > Flink's Sink architecture with global committer seems to be vulnerable for > data loss during Task Manager recovery. The entire checkpoint can be lost by > GlobalCommitter resulting with data loss. > Issue was observed in Delta Sink connector on a real 1.14.x cluster and was > replicated using Flink's 1.14.6 Test Utils classes. > Scenario: > # Streaming source emitting constant number of events per checkpoint (20 > events per commit for 5 commits in total, that gives 100 records). > # Sink with parallelism > 1 with committer and GlobalCommitter elements. > # Commiters processed committables for
[jira] [Updated] (FLINK-29589) Data Loss in Sink GlobalCommitter during Task Manager recovery
[ https://issues.apache.org/jira/browse/FLINK-29589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krzysztof Chmielewski updated FLINK-29589: -- Description: Flink's Sink architecture with global committer seems to be vulnerable for data loss during Task Manager recovery. The entire checkpoint can be lost by GlobalCommitter resulting with data loss. Issue was observed in Delta Sink connector on a real 1.14.x cluster and was replicated using Flink's 1.14.6 Test Utils classes. Scenario: # Streaming source emitting constant number of events per checkpoint (20 events per commit for 5 commits in total, that gives 100 records). # Sink with parallelism > 1 with committer and GlobalCommitter elements. # Commiters processed committables for checkpointId 2. # GlobalCommitter throws exception (desired exception) during checkpointId 2 (third commit) while processing data from checkpoint 1 (it is expected to global committer architecture lag one commit behind in reference to rest of the pipeline). # Task Manager recovery, source resumes sending data. # Streaming source ends. # We are missing 20 records (one checkpoint). What is happening is that during recovery, committers are performing "retry" on committables for checkpointId 2, however those committables, reprocessed from "retry" task are not emit downstream to the global committer. The issue can be reproduced using Junit Test builded with Flink's TestSink. The test was implemented [here|https://github.com/kristoffSC/flink/blob/Flink_1.14_DataLoss_SinkGlobalCommitter/flink-tests/src/test/java/org/apache/flink/test/streaming/runtime/SinkITCase.java#:~:text=testGlobalCommitterMissingRecordsDuringRecovery] and it is based on other tests from SinkITCase.java class. I believe that problem is somewhere around `SinkOperator::notifyCheckpointComplete` method. In there we see that Retry async task is scheduled however its result is never emitted downstream like it is done for regular flow one line above. 
was: Flink's Sink architecture with global committer seems to be vulnerable for data loss during Task Manager recovery. The entire checkpoint can be lost by GlobalCommitter resulting with data loss. Issue was observed in Delta Sink connector on a real 1.14.x cluster and was replicated using Flink's 1.14.6 Test Utils classes. Scenario: # Streaming source emitting constant number of events per checkpoint (20 events for 5 commits) # Sink with parallelism > 1 with committer and GlobalCommitter elements. # Commiters processed committables for checkpointId 2. # GlobalCommitter throws exception (desired exception) during checkpointId 2 (third commit) while processing data from checkpoint 1 (it is expected to global committer architecture lag one commit behind in reference to rest of the pipeline). # Task Manager recovery, source resumes sending data. # Streaming source ends. # We are missing 20 records (one checkpoint). What is happening is that during recovery, committers are performing "retry" on committables for checkpointId 2, however those committables, reprocessed from "retry" task are not emit downstream to the global committer. The issue can be reproduced using Junit Test builded with Flink's TestSink. The test was implemented [here|https://github.com/kristoffSC/flink/blob/Flink_1.14_DataLoss_SinkGlobalCommitter/flink-tests/src/test/java/org/apache/flink/test/streaming/runtime/SinkITCase.java#:~:text=testGlobalCommitterMissingRecordsDuringRecovery] and it is based on other tests from SinkITCase.java class. I believe that problem is somewhere around `SinkOperator::notifyCheckpointComplete` method. In there we see that Retry async task is scheduled however its result is never emitted downstream like it is done for regular flow one line above. 
> Data Loss in Sink GlobalCommitter during Task Manager recovery > -- > > Key: FLINK-29589 > URL: https://issues.apache.org/jira/browse/FLINK-29589 > Project: Flink > Issue Type: Bug >Affects Versions: 1.14.0, 1.14.2, 1.14.3, 1.14.4, 1.14.5, 1.14.6 >Reporter: Krzysztof Chmielewski >Priority: Major > > Flink's Sink architecture with global committer seems to be vulnerable for > data loss during Task Manager recovery. The entire checkpoint can be lost by > GlobalCommitter resulting with data loss. > Issue was observed in Delta Sink connector on a real 1.14.x cluster and was > replicated using Flink's 1.14.6 Test Utils classes. > Scenario: > # Streaming source emitting constant number of events per checkpoint (20 > events per commit for 5 commits in total, that gives 100 records). > # Sink with parallelism > 1 with committer and GlobalCommitter elements. > # Commiters processed committables for checkpointId 2. > # GlobalCommitter throws exception
[jira] [Updated] (FLINK-29589) Data Loss in Sink GlobalCommitter during Task Manager recovery
[ https://issues.apache.org/jira/browse/FLINK-29589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krzysztof Chmielewski updated FLINK-29589: -- Description: Flink's Sink architecture with global committer seems to be vulnerable for data loss during Task Manager recovery. The entire checkpoint can be lost by GlobalCommitter resulting with data loss. Issue was observed in Delta Sink connector on a real 1.14.x cluster and was replicated using Flink's 1.14.6 Test Utils classes. Scenario: # Streaming source emitting constant number of events per checkpoint (20 events for 5 commits) # Sink with parallelism > 1 with committer and GlobalCommitter elements. # Commiters processed committables for checkpointId 2. # GlobalCommitter throws exception (desired exception) during checkpointId 2 (third commit) while processing data from checkpoint 1 (it is expected to global committer architecture lag one commit behind in reference to rest of the pipeline). # Task Manager recovery, source resumes sending data. # Streaming source ends. # We are missing 20 records (one checkpoint). What is happening is that during recovery, committers are performing "retry" on committables for checkpointId 2, however those committables, reprocessed from "retry" task are not emit downstream to the global committer. The issue can be reproduced using Junit Test builded with Flink's TestSink. The test was implemented [here|https://github.com/kristoffSC/flink/blob/Flink_1.14_DataLoss_SinkGlobalCommitter/flink-tests/src/test/java/org/apache/flink/test/streaming/runtime/SinkITCase.java#:~:text=testGlobalCommitterMissingRecordsDuringRecovery] and it is based on other tests from SinkITCase.java class. I believe that problem is somewhere around `SinkOperator::notifyCheckpointComplete` method. In there we see that Retry async task is scheduled however its result is never emitted downstream like it is done for regular flow one line above. 
was: Flink's Sink architecture with global committer seems to be vulnerable for data loss during Task Manager recovery. The entire checkpoint can be lost by GlobalCommitter resulting with data loss. Issue was observed in Delta Sink connector on a real 1.14.x cluster and was replicated using Flink's 1.14.6 Test Utils classes. Scenario: # Streaming source emitting constant number of events per checkpoint (20 events for 5 commits) # Sink with parallelism > 1 with committer and GlobalCommitter elements. # Commiters processed committables for checkpointId 2. # GlobalCommitter throws exception (desired exception) during checkpointId 2 (third commit) while processing data from checkpoint 1 (it is expected to global committer architecture lag one commit behind in reference to rest of the pipeline). # Task Manager recovery, source resumes sending data. # Streaming source ends. # We are missing 20 records (one checkpoint). What is happening is that during recovery, committers are performing "retry" on committables for checkpointId 2, however those committables, reprocessed from "retry" task are not emit downstream to the global committer. The issue can be reproduced using Junit Test builded with Flink's TestSink. The test was implemented [here|https://github.com/kristoffSC/flink/blob/Flink_1.14_DataLoss_SinkGlobalCommitter/flink-tests/src/test/java/org/apache/flink/test/streaming/runtime/SinkITCase.java#:~:text=testGlobalCommitterMissingRecordsDuringRecovery]. I believe that problem is somewhere around `SinkOperator::notifyCheckpointComplete` method. In there we see that Retry async task is scheduled however its result is never emitted downstream like it is done for regular flow one line above. 
> Data Loss in Sink GlobalCommitter during Task Manager recovery > -- > > Key: FLINK-29589 > URL: https://issues.apache.org/jira/browse/FLINK-29589 > Project: Flink > Issue Type: Bug >Affects Versions: 1.14.0, 1.14.2, 1.14.3, 1.14.4, 1.14.5, 1.14.6 >Reporter: Krzysztof Chmielewski >Priority: Major > > Flink's Sink architecture with global committer seems to be vulnerable for > data loss during Task Manager recovery. The entire checkpoint can be lost by > GlobalCommitter resulting with data loss. > Issue was observed in Delta Sink connector on a real 1.14.x cluster and was > replicated using Flink's 1.14.6 Test Utils classes. > Scenario: > # Streaming source emitting constant number of events per checkpoint (20 > events for 5 commits) > # Sink with parallelism > 1 with committer and GlobalCommitter elements. > # Commiters processed committables for checkpointId 2. > # GlobalCommitter throws exception (desired exception) during checkpointId > 2 (third commit) while processing data from checkpoint 1 (it is expected to > global committer
[jira] [Updated] (FLINK-29589) Data Loss in Sink GlobalCommitter during Task Manager recovery
[ https://issues.apache.org/jira/browse/FLINK-29589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krzysztof Chmielewski updated FLINK-29589: -- Description: Flink's Sink architecture with global committer seems to be vulnerable for data loss during Task Manager recovery. The entire checkpoint can be lost by GlobalCommitter resulting with data loss. Issue was observed in Delta Sink connector on a real 1.14.x cluster and was replicated using Flink's 1.14.6 Test Utils classes. Scenario: # Streaming source emitting constant number of events per checkpoint (20 events for 5 commits) # Sink with parallelism > 1 with committer and GlobalCommitter elements. # Commiters processed committables for checkpointId 2. # GlobalCommitter throws exception (desired exception) during checkpointId 2 (third commit) while processing data from checkpoint 1 (it is expected to global committer architecture lag one commit behind in reference to rest of the pipeline). # Task Manager recovery, source resumes sending data. # Streaming source ends. # We are missing 20 records (one checkpoint). What is happening is that during recovery, committers are performing "retry" on committables for checkpointId 2, however those committables, reprocessed from "retry" task are not emit downstream to the global committer. The issue can be reproduced using Junit Test builded with Flink's TestSink. The test was implemented [here|https://github.com/kristoffSC/flink/blob/Flink_1.14_DataLoss_SinkGlobalCommitter/flink-tests/src/test/java/org/apache/flink/test/streaming/runtime/SinkITCase.java#:~:text=testGlobalCommitterMissingRecordsDuringRecovery]. I believe that problem is somewhere around `SinkOperator::notifyCheckpointComplete` method. In there we see that Retry async task is scheduled however its result is never emitted downstream like it is done for regular flow one line above. 
was: Flink's Sink's architecture with global committer seems to be vulnerable for data loss during Task Manager recovery. The entire checkpoint can be lost by GlobalCommitter resulting with data loss for sinks. Issue was observed in Delta Sink connector on a real 1.14.x cluster and was replicated using Flink's 1.14.6 Test Utils classes. Scenario: # Streaming source emitting constant number of events per checkpoint (20 events for 5 commits) # Sink with parallelism > 1 with committer and GlobalCommitter elements. # Commitaers processed committables for checkpointId 2. # GlobalCommitter throws exception (desired exception) during checkpointId 2 (third commit) while processing data from checkpoint 1 (it is expected to global committer architecture lag one commit behind in reference to rest of the pipeline). # Streaming source ends. # we are missing 20 records (one checkpoint). What is happening is that during recovery, committers are performing "retry" on committables for checkpointId 2, however those committables, reprocessed from "retry" task are not emit to the global committer. The issue can be reproduced using Junit Test builded with Flink's TestSink. The test was implemented [here|https://github.com/kristoffSC/flink/blob/Flink_1.14_DataLoss_SinkGlobalCommitter/flink-tests/src/test/java/org/apache/flink/test/streaming/runtime/SinkITCase.java#:~:text=testGlobalCommitterMissingRecordsDuringRecovery]. I believe that problem is somewhere around `SinkOperator::notifyCheckpointComplete` method. In there we see that Retry async task is scheduled however its result is never emitted downstream like it is done for regular flow one line above. 
> Data Loss in Sink GlobalCommitter during Task Manager recovery > -- > > Key: FLINK-29589 > URL: https://issues.apache.org/jira/browse/FLINK-29589 > Project: Flink > Issue Type: Bug >Affects Versions: 1.14.0, 1.14.2, 1.14.3, 1.14.4, 1.14.5, 1.14.6 >Reporter: Krzysztof Chmielewski >Priority: Major > > Flink's Sink architecture with a global committer seems to be vulnerable to > data loss during Task Manager recovery. An entire checkpoint can be lost by > the GlobalCommitter, resulting in data loss. > The issue was observed in the Delta Sink connector on a real 1.14.x cluster and was > replicated using Flink's 1.14.6 Test Utils classes. > Scenario: > # Streaming source emitting a constant number of events per checkpoint (20 > events for 5 commits). > # Sink with parallelism > 1, with committer and GlobalCommitter elements. > # Committers processed committables for checkpointId 2. > # GlobalCommitter throws an exception (a deliberate exception) during checkpointId > 2 (third commit) while processing data from checkpoint 1 (the global committer > architecture is expected to lag one commit behind the rest of > the pipeline). > # Task Manager recovery, source resumes
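The retry-path gap described above can be illustrated with a minimal, hypothetical simulation (these classes are not Flink's; the names and structure are invented to model only the reported control flow): the normal commit path forwards committables downstream to the global committer, while the recovery retry path commits without emitting, so one checkpoint's records silently go missing downstream.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the reported failure mode: retried committables
// are processed on recovery but never forwarded to the global committer.
class RetryDropSketch {
    static List<String> globalCommitted = new ArrayList<>();

    // Regular flow: commit and emit the result downstream to the global committer.
    static void commitAndEmit(List<String> committables) {
        globalCommitted.addAll(committables);
    }

    // Buggy retry flow: the async retry task runs, but its result is never
    // emitted downstream (mirrors the suspected gap around
    // SinkOperator::notifyCheckpointComplete described in this issue).
    static void retryWithoutEmit(List<String> committables) {
        // commit happens locally ... but nothing reaches globalCommitted
    }

    public static void main(String[] args) {
        commitAndEmit(List.of("cp1-records"));    // checkpoint 1: normal flow
        retryWithoutEmit(List.of("cp2-records")); // checkpoint 2: retried after recovery
        commitAndEmit(List.of("cp3-records"));    // checkpoint 3: normal flow
        // Checkpoint 2's records never reach the global committer.
        System.out.println(globalCommitted);      // [cp1-records, cp3-records]
    }
}
```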
[jira] [Created] (FLINK-29589) Data Loss in Sink GlobalCommitter during Task Manager recovery
Krzysztof Chmielewski created FLINK-29589: - Summary: Data Loss in Sink GlobalCommitter during Task Manager recovery Key: FLINK-29589 URL: https://issues.apache.org/jira/browse/FLINK-29589 Project: Flink Issue Type: Bug Affects Versions: 1.14.6, 1.14.5, 1.14.4, 1.14.3, 1.14.2, 1.14.0 Reporter: Krzysztof Chmielewski Flink's Sink architecture with a global committer seems to be vulnerable to data loss during Task Manager recovery. An entire checkpoint can be lost by the GlobalCommitter, resulting in data loss for sinks. The issue was observed in the Delta Sink connector on a real 1.14.x cluster and was replicated using Flink's 1.14.6 Test Utils classes. Scenario: 1. Streaming source emitting a constant number of events per checkpoint (20 events for 5 commits). 2. Sink with parallelism > 1, with committer and GlobalCommitter elements. 3. Committers processed committables for checkpointId 2. 4. GlobalCommitter throws an exception (a deliberate exception) during checkpointId 2 (third commit) while processing data from checkpoint 1 (the global committer architecture is expected to lag one commit behind the rest of the pipeline). 5. Streaming source ends. 6. We are missing 20 records (one checkpoint). What is happening is that during recovery, committers perform a "retry" on committables for checkpointId 2, however those committables, reprocessed by the "retry" task, are not emitted to the global committer. The issue can be reproduced with a JUnit test built with Flink's TestSink. The test was implemented [here|https://github.com/kristoffSC/flink/blob/Flink_1.14_DataLoss_SinkGlobalCommitter/flink-tests/src/test/java/org/apache/flink/test/streaming/runtime/SinkITCase.java#:~:text=testGlobalCommitterMissingRecordsDuringRecovery]. I believe the problem is somewhere around the `SinkOperator::notifyCheckpointComplete` method. There we see that the retry async task is scheduled; however, its result is never emitted downstream as it is for the regular flow one line above. 
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29589) Data Loss in Sink GlobalCommitter during Task Manager recovery
[ https://issues.apache.org/jira/browse/FLINK-29589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krzysztof Chmielewski updated FLINK-29589: -- Description: Flink's Sink architecture with a global committer seems to be vulnerable to data loss during Task Manager recovery. An entire checkpoint can be lost by the GlobalCommitter, resulting in data loss for sinks. The issue was observed in the Delta Sink connector on a real 1.14.x cluster and was replicated using Flink's 1.14.6 Test Utils classes. Scenario: # Streaming source emitting a constant number of events per checkpoint (20 events for 5 commits). # Sink with parallelism > 1, with committer and GlobalCommitter elements. # Committers processed committables for checkpointId 2. # GlobalCommitter throws an exception (a deliberate exception) during checkpointId 2 (third commit) while processing data from checkpoint 1 (the global committer architecture is expected to lag one commit behind the rest of the pipeline). # Streaming source ends. # We are missing 20 records (one checkpoint). What is happening is that during recovery, committers perform a "retry" on committables for checkpointId 2, however those committables, reprocessed by the "retry" task, are not emitted to the global committer. The issue can be reproduced with a JUnit test built with Flink's TestSink. The test was implemented [here|https://github.com/kristoffSC/flink/blob/Flink_1.14_DataLoss_SinkGlobalCommitter/flink-tests/src/test/java/org/apache/flink/test/streaming/runtime/SinkITCase.java#:~:text=testGlobalCommitterMissingRecordsDuringRecovery]. I believe the problem is somewhere around the `SinkOperator::notifyCheckpointComplete` method. There we see that the retry async task is scheduled; however, its result is never emitted downstream as it is for the regular flow one line above. was: Flink's Sink architecture with a global committer seems to be vulnerable to data loss during Task Manager recovery. 
An entire checkpoint can be lost by the GlobalCommitter, resulting in data loss for sinks. The issue was observed in the Delta Sink connector on a real 1.14.x cluster and was replicated using Flink's 1.14.6 Test Utils classes. Scenario: 1. Streaming source emitting a constant number of events per checkpoint (20 events for 5 commits). 2. Sink with parallelism > 1, with committer and GlobalCommitter elements. 3. Committers processed committables for checkpointId 2. 4. GlobalCommitter throws an exception (a deliberate exception) during checkpointId 2 (third commit) while processing data from checkpoint 1 (the global committer architecture is expected to lag one commit behind the rest of the pipeline). 5. Streaming source ends. 6. We are missing 20 records (one checkpoint). What is happening is that during recovery, committers perform a "retry" on committables for checkpointId 2, however those committables, reprocessed by the "retry" task, are not emitted to the global committer. The issue can be reproduced with a JUnit test built with Flink's TestSink. The test was implemented [here|https://github.com/kristoffSC/flink/blob/Flink_1.14_DataLoss_SinkGlobalCommitter/flink-tests/src/test/java/org/apache/flink/test/streaming/runtime/SinkITCase.java#:~:text=testGlobalCommitterMissingRecordsDuringRecovery]. I believe the problem is somewhere around the `SinkOperator::notifyCheckpointComplete` method. There we see that the retry async task is scheduled; however, its result is never emitted downstream as it is for the regular flow one line above. > Data Loss in Sink GlobalCommitter during Task Manager recovery > -- > > Key: FLINK-29589 > URL: https://issues.apache.org/jira/browse/FLINK-29589 > Project: Flink > Issue Type: Bug >Affects Versions: 1.14.0, 1.14.2, 1.14.3, 1.14.4, 1.14.5, 1.14.6 >Reporter: Krzysztof Chmielewski >Priority: Major > > Flink's Sink architecture with a global committer seems to be vulnerable to > data loss during Task Manager recovery. 
An entire checkpoint can be lost by > the GlobalCommitter, resulting in data loss for sinks. > The issue was observed in the Delta Sink connector on a real 1.14.x cluster and was > replicated using Flink's 1.14.6 Test Utils classes. > Scenario: > # Streaming source emitting a constant number of events per checkpoint (20 > events for 5 commits). > # Sink with parallelism > 1, with committer and GlobalCommitter elements. > # Committers processed committables for checkpointId 2. > # GlobalCommitter throws an exception (a deliberate exception) during checkpointId > 2 (third commit) while processing data from checkpoint 1 (the global committer > architecture is expected to lag one commit behind the rest of > the pipeline). > # Streaming source ends. > # We are missing 20 records (one checkpoint). > What is
[jira] [Comment Edited] (FLINK-29509) Set correct subtaskId during recovery of committables
[ https://issues.apache.org/jira/browse/FLINK-29509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17613888#comment-17613888 ] Krzysztof Chmielewski edited comment on FLINK-29509 at 10/7/22 4:29 PM: PR ready for review :) [https://github.com/apache/flink/pull/20979] was (Author: kristoffsc): PR: https://github.com/apache/flink/pull/20979 > Set correct subtaskId during recovery of committables > - > > Key: FLINK-29509 > URL: https://issues.apache.org/jira/browse/FLINK-29509 > Project: Flink > Issue Type: Bug > Components: Connectors / Common >Affects Versions: 1.17.0, 1.15.2, 1.16.1 >Reporter: Fabian Paul >Assignee: Krzysztof Chmielewski >Priority: Critical > > When we recover the `CheckpointCommittableManager` we ignore the subtaskId it > is recovered on. > [https://github.com/apache/flink/blob/d191bda7e63a2c12416cba56090e5cd75426079b/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/sink/committables/CheckpointCommittableManagerImpl.java#L58] > This becomes a problem when a sink uses a post-commit topology because > multiple committer operators might forward committable summaries coming from > the same subtaskId. > > It should be possible to use the subtaskId already present in the > `CommittableCollector` when creating the `CheckpointCommittableManager`s. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29509) Set correct subtaskId during recovery of committables
[ https://issues.apache.org/jira/browse/FLINK-29509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17613888#comment-17613888 ] Krzysztof Chmielewski commented on FLINK-29509: --- PR: https://github.com/apache/flink/pull/20979 > Set correct subtaskId during recovery of committables > - > > Key: FLINK-29509 > URL: https://issues.apache.org/jira/browse/FLINK-29509 > Project: Flink > Issue Type: Bug > Components: Connectors / Common >Affects Versions: 1.17.0, 1.15.2, 1.16.1 >Reporter: Fabian Paul >Assignee: Krzysztof Chmielewski >Priority: Critical > > When we recover the `CheckpointCommittableManager` we ignore the subtaskId it > is recovered on. > [https://github.com/apache/flink/blob/d191bda7e63a2c12416cba56090e5cd75426079b/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/sink/committables/CheckpointCommittableManagerImpl.java#L58] > This becomes a problem when a sink uses a post-commit topology because > multiple committer operators might forward committable summaries coming from > the same subtaskId. > > It should be possible to use the subtaskId already present in the > `CommittableCollector` when creating the `CheckpointCommittableManager`s. -- This message was sent by Atlassian Jira (v8.20.10#820010)
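The subtaskId problem quoted above can be sketched with a small, hypothetical example (plain Java, not Flink's actual API; all names here are invented): if recovered committable managers keep the subtaskId stored in state instead of the subtaskId of the subtask they are recovered on, two summaries can carry the same stale id, and anything keyed by subtaskId downstream collapses them into one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of why ignoring the recovering subtask's id matters:
// summaries indexed by subtaskId overwrite each other when ids collide.
class SubtaskIdSketch {
    // Index two committable summaries (record counts) by their subtaskId
    // and return how many distinct summaries survive.
    static int distinctSummaries(int idA, int idB) {
        Map<Integer, Integer> bySubtask = new HashMap<>();
        bySubtask.put(idA, 10); // summary forwarded by first committer
        bySubtask.put(idB, 7);  // summary forwarded by second committer
        return bySubtask.size();
    }

    public static void main(String[] args) {
        // Stale ids: both recovered managers report the original subtaskId 0,
        // so one summary silently overwrites the other.
        System.out.println(distinctSummaries(0, 0)); // 1
        // Fixed behavior: use the subtaskId of the recovering subtask
        // (as the issue suggests, the one in the CommittableCollector).
        System.out.println(distinctSummaries(0, 1)); // 2
    }
}
```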
[jira] [Commented] (FLINK-29509) Set correct subtaskId during recovery of committables
[ https://issues.apache.org/jira/browse/FLINK-29509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17613527#comment-17613527 ] Krzysztof Chmielewski commented on FLINK-29509: --- Hi, I would like to work on this ticket. Can someone assign it to me? It seems I can't do that myself. > Set correct subtaskId during recovery of committables > - > > Key: FLINK-29509 > URL: https://issues.apache.org/jira/browse/FLINK-29509 > Project: Flink > Issue Type: Bug > Components: Connectors / Common >Affects Versions: 1.17.0, 1.15.2, 1.16.1 >Reporter: Fabian Paul >Priority: Critical > > When we recover the `CheckpointCommittableManager` we ignore the subtaskId it > is recovered on. > [https://github.com/apache/flink/blob/d191bda7e63a2c12416cba56090e5cd75426079b/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/sink/committables/CheckpointCommittableManagerImpl.java#L58] > This becomes a problem when a sink uses a post-commit topology because > multiple committer operators might forward committable summaries coming from > the same subtaskId. > > It should be possible to use the subtaskId already present in the > `CommittableCollector` when creating the `CheckpointCommittableManager`s. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-28591) Array> is not serialized correctly when BigInt is present
[ https://issues.apache.org/jira/browse/FLINK-28591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17568488#comment-17568488 ] Krzysztof Chmielewski edited comment on FLINK-28591 at 7/19/22 11:51 AM: - The potential issue might be in _CopyingChainingOutput.class_, line 82, where we call input.processElement(copy); The type of input is "StreamExecCalc", but I do not see a processElement method on this type, and when I try to step inside with the IntelliJ debugger, I actually don't see anything. Anyway, for the bigint case, {code:java} input.processElement(copy);{code} leads us to {_}GenericArrayData{_}, whereas for the int case this object is not created. Unfortunately I don't know what is happening inside {code:java} input.processElement(copy);{code} Any hint on how to debug this place would help. Currently I see this: !image-2022-07-19-13-51-45-254.png! was (Author: kristoffsc): The potential issue might be in _CopyingChainingOutput.class_, line 82, where we call input.processElement(copy); The type of input is "StreamExecCalc", but I do not see a processElement method on this type, and when I try to step inside with the IntelliJ debugger, I actually don't see anything. Anyway, for the bigint case, {code:java} input.processElement(copy);{code} leads us to {_}GenericArrayData{_}, whereas for the int case this object is not created. Unfortunately I don't know what is happening inside {code:java} input.processElement(copy);{code} > Array> is not serialized correctly when BigInt is present > -- > > Key: FLINK-28591 > URL: https://issues.apache.org/jira/browse/FLINK-28591 > Project: Flink > Issue Type: Bug > Components: Table SQL / API, Table SQL / Planner >Affects Versions: 1.15.0 >Reporter: Andrzej Swatowski >Priority: Major > Attachments: image-2022-07-19-13-51-45-254.png > > > When using the Table API to insert data into an array of rows, the data apparently > is incorrectly serialized internally, which leads to incorrect serialization > at the connectors. 
It happens when one of the table fields is a BIGINT (and > does not happen when it is INT). > E.g., the following table: > {code:java} > CREATE TABLE wrongArray ( > foo bigint, > bar ARRAY> > ) WITH ( > 'connector' = 'filesystem', > 'path' = 'file://path/to/somewhere', > 'format' = 'json' > ) {code} > along with the following insert: > {code:java} > insert into wrongArray ( > SELECT > 1, > array[ > ('Field1', 'Value1'), > ('Field2', 'Value2') > ] > FROM (VALUES(1)) > ) {code} > gets serialized into: > {code:java} > { > "foo":1, > "bar":[ > { > "foo1":"Field2", > "foo2":"Value2" > }, > { > "foo1":"Field2", > "foo2":"Value2" > } > ] > }{code} > It is easy to spot that `bar` (an Array of Rows with two Strings) consists of > duplicates of the last row in the array. > On the other hand, when `foo` is of type `int` instead of `bigint`: > {code:java} > CREATE TABLE wrongArray ( > foo int, > bar ARRAY> > ) WITH ( > 'connector' = 'filesystem', > 'path' = 'file://path/to/somewhere', > 'format' = 'json' > ) {code} > the previous insert yields the correct value: > {code:java} > { > "foo":1, > "bar":[ > { > "foo1":"Field1", > "foo2":"Value1" > }, > { > "foo1":"Field2", > "foo2":"Value2" > } > ] > }{code} > Bug reproduced in the Flink project: > [https://github.com/swtwsk/flink-array-row-bug] > > It is not an error connected with either a specific connector or format. I > have done a bit of debugging when trying to implement my own format and it > seems that `BinaryArrayData` holding the row values has wrong data saved in > its `MemorySegment`, i.e. calling: > {code:java} > for (var i = 0; i < array.size(); i++) { > Object element = arrayDataElementGetter.getElementOrNull(array, i); > }{code} > correctly calculates offsets but yields the same result as the data is > malformed in the array's `MemorySegment`. 
Such a call can be found, e.g., in > `flink-json` — more specifically in > {color:#e8912d}org.apache.flink.formats.json.RowDataToJsonConverters::createArrayConverter > {color}(line 241 in version 1.15.0) -- This message was sent by Atlassian Jira (v8.20.10#820010)
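The duplicated-last-row symptom reported in FLINK-28591 is characteristic of an object-reuse bug: if a converter stores a reference to a reused mutable row instead of copying it, every array slot ends up pointing at the last value written. A minimal, hypothetical Java illustration of that mechanism (invented classes, not Flink's actual code path) reproduces the same "all entries equal the last row" output:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of an object-reuse bug producing duplicated last rows.
class ReuseBugSketch {
    // A mutable "row" that a reader reuses between records.
    static class Row {
        String f0, f1;
        Row set(String a, String b) { f0 = a; f1 = b; return this; }
        Row copy() { return new Row().set(f0, f1); }
        public String toString() { return f0 + "/" + f1; }
    }

    // Build a two-element array from a reused row, then render each element.
    static List<String> serialize(boolean copyElements) {
        Row reused = new Row();
        List<Row> array = new ArrayList<>();
        // Buggy path (copyElements == false) stores the SAME instance twice.
        array.add(copyElements ? reused.set("Field1", "Value1").copy()
                               : reused.set("Field1", "Value1"));
        array.add(copyElements ? reused.set("Field2", "Value2").copy()
                               : reused.set("Field2", "Value2"));
        List<String> out = new ArrayList<>();
        for (Row r : array) out.add(r.toString());
        return out;
    }

    public static void main(String[] args) {
        System.out.println(serialize(false)); // [Field2/Value2, Field2/Value2]
        System.out.println(serialize(true));  // [Field1/Value1, Field2/Value2]
    }
}
```

The buggy output matches the JSON in the bug report: both array entries show the last row's values, while copying each element before storing it yields the expected result.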
[jira] [Commented] (FLINK-28591) Array> is not serialized correctly when BigInt is present
[ https://issues.apache.org/jira/browse/FLINK-28591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17568488#comment-17568488 ] Krzysztof Chmielewski commented on FLINK-28591: --- The potential issue might be in _CopyingChainingOutput.class_, line 82, where we call input.processElement(copy); The type of input is "StreamExecCalc", but I do not see a processElement method on this type, and when I try to step inside with the IntelliJ debugger, I actually don't see anything. Anyway, for the bigint case, {code:java} input.processElement(copy);{code} leads us to {_}GenericArrayData{_}, whereas for the int case this object is not created. Unfortunately I don't know what is happening inside {code:java} input.processElement(copy);{code} > Array> is not serialized correctly when BigInt is present > -- > > Key: FLINK-28591 > URL: https://issues.apache.org/jira/browse/FLINK-28591 > Project: Flink > Issue Type: Bug > Components: Table SQL / API, Table SQL / Planner >Affects Versions: 1.15.0 >Reporter: Andrzej Swatowski >Priority: Major > > When using the Table API to insert data into an array of rows, the data apparently > is incorrectly serialized internally, which leads to incorrect serialization > at the connectors. It happens when one of the table fields is a BIGINT (and > does not happen when it is INT). 
> E.g., the following table: > {code:java} > CREATE TABLE wrongArray ( > foo bigint, > bar ARRAY> > ) WITH ( > 'connector' = 'filesystem', > 'path' = 'file://path/to/somewhere', > 'format' = 'json' > ) {code} > along with the following insert: > {code:java} > insert into wrongArray ( > SELECT > 1, > array[ > ('Field1', 'Value1'), > ('Field2', 'Value2') > ] > FROM (VALUES(1)) > ) {code} > gets serialized into: > {code:java} > { > "foo":1, > "bar":[ > { > "foo1":"Field2", > "foo2":"Value2" > }, > { > "foo1":"Field2", > "foo2":"Value2" > } > ] > }{code} > It is easy to spot that `bar` (an Array of Rows with two Strings) consists of > duplicates of the last row in the array. > On the other hand, when `foo` is of type `int` instead of `bigint`: > {code:java} > CREATE TABLE wrongArray ( > foo int, > bar ARRAY> > ) WITH ( > 'connector' = 'filesystem', > 'path' = 'file://path/to/somewhere', > 'format' = 'json' > ) {code} > the previous insert yields the correct value: > {code:java} > { > "foo":1, > "bar":[ > { > "foo1":"Field1", > "foo2":"Value1" > }, > { > "foo1":"Field2", > "foo2":"Value2" > } > ] > }{code} > Bug reproduced in the Flink project: > [https://github.com/swtwsk/flink-array-row-bug] > > It is not an error connected with either a specific connector or format. I > have done a bit of debugging when trying to implement my own format and it > seems that `BinaryArrayData` holding the row values has wrong data saved in > its `MemorySegment`, i.e. calling: > {code:java} > for (var i = 0; i < array.size(); i++) { > Object element = arrayDataElementGetter.getElementOrNull(array, i); > }{code} > correctly calculates offsets but yields the same result as the data is > malformed in the array's `MemorySegment`. Such a call can be found, e.g., in > `flink-json` — more specifically in > {color:#e8912d}org.apache.flink.formats.json.RowDataToJsonConverters::createArrayConverter > {color}(line 241 in version 1.15.0) -- This message was sent by Atlassian Jira (v8.20.10#820010)