[jira] [Commented] (FLINK-35097) Table API Filesystem connector with 'raw' format repeats last line
[ https://issues.apache.org/jira/browse/FLINK-35097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840052#comment-17840052 ]

Kumar Mallikarjuna commented on FLINK-35097:
--------------------------------------------

[~david.perkins], the fix has been merged to master.

> Table API Filesystem connector with 'raw' format repeats last line
> ------------------------------------------------------------------
>
> Key: FLINK-35097
> URL: https://issues.apache.org/jira/browse/FLINK-35097
> Project: Flink
> Issue Type: Bug
> Components: Connectors / FileSystem
> Affects Versions: 1.17.1
> Environment: I ran the above test with 1.17.1. I checked for existing bug tickets and release notes, but did not find anything, so assuming this affects 1.18 and 1.19.
> Reporter: David Perkins
> Assignee: Kumar Mallikarjuna
> Priority: Major
> Labels: pull-request-available
>
> When using the Filesystem connector with the 'raw' format to read text data that contains newlines, a row is returned for every line, but every row contains the contents of the last line.
> For example, with the following file:
> {quote}
> line 1
> line 2
> line 3
> {quote}
> and the table definition
> {quote}
> CREATE TABLE MyRawTable (
>   `doc` STRING
> ) WITH (
>   'path' = 'file:///path/to/data',
>   'format' = 'raw',
>   'connector' = 'filesystem'
> );
> {quote}
> selecting `*` from the table produces three rows, all with "line 3" for `doc`.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Comment Edited] (FLINK-35097) Table API Filesystem connector with 'raw' format repeats last line
[ https://issues.apache.org/jira/browse/FLINK-35097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17836959#comment-17836959 ]

Kumar Mallikarjuna edited comment on FLINK-35097 at 4/14/24 3:05 PM:
---------------------------------------------------------------------

[~david.perkins], I've raised a fix here: [https://github.com/apache/flink/pull/24661]

I just realised that I need to find a committer and get the issue assigned first!

was (Author: JIRAUSER303984):
[~david.perkins], I've raised a fix here: https://github.com/apache/flink/pull/24661

> Table API Filesystem connector with 'raw' format repeats last line
> ------------------------------------------------------------------
>
> Key: FLINK-35097
> URL: https://issues.apache.org/jira/browse/FLINK-35097
> Project: Flink
> Issue Type: Bug
> Components: Connectors / FileSystem
> Affects Versions: 1.17.1
> Environment: I ran the above test with 1.17.1. I checked for existing bug tickets and release notes, but did not find anything, so assuming this affects 1.18 and 1.19.
> Reporter: David Perkins
> Priority: Major
> Labels: pull-request-available
[jira] [Commented] (FLINK-35097) Table API Filesystem connector with 'raw' format repeats last line
[ https://issues.apache.org/jira/browse/FLINK-35097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17836959#comment-17836959 ]

Kumar Mallikarjuna commented on FLINK-35097:
--------------------------------------------

[~david.perkins], I've raised a fix here: https://github.com/apache/flink/pull/24661

> Table API Filesystem connector with 'raw' format repeats last line
> ------------------------------------------------------------------
>
> Key: FLINK-35097
> URL: https://issues.apache.org/jira/browse/FLINK-35097
> Project: Flink
> Issue Type: Bug
> Components: Connectors / FileSystem
> Affects Versions: 1.17.1
> Environment: I ran the above test with 1.17.1. I checked for existing bug tickets and release notes, but did not find anything, so assuming this affects 1.18 and 1.19.
> Reporter: David Perkins
> Priority: Major
> Labels: pull-request-available
[jira] [Commented] (FLINK-35097) Table API Filesystem connector with 'raw' format repeats last line
[ https://issues.apache.org/jira/browse/FLINK-35097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17836952#comment-17836952 ]

Kumar Mallikarjuna commented on FLINK-35097:
--------------------------------------------

> assuming this affects 1.18 and 1.19.

This is reproducible on `master`. I'll look into a fix.

> Table API Filesystem connector with 'raw' format repeats last line
> ------------------------------------------------------------------
>
> Key: FLINK-35097
> URL: https://issues.apache.org/jira/browse/FLINK-35097
> Project: Flink
> Issue Type: Bug
> Components: Connectors / FileSystem
> Affects Versions: 1.17.1
> Environment: I ran the above test with 1.17.1. I checked for existing bug tickets and release notes, but did not find anything, so assuming this affects 1.18 and 1.19.
> Reporter: David Perkins
> Priority: Major
[jira] [Commented] (FLINK-34239) Introduce a deep copy method of SerializerConfig for merging with Table configs in org.apache.flink.table.catalog.DataTypeFactoryImpl
[ https://issues.apache.org/jira/browse/FLINK-34239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832449#comment-17832449 ]

Kumar Mallikarjuna commented on FLINK-34239:
--------------------------------------------

Thanks, [~Zhanghao Chen]! I've updated the PR as per your comments.

> Introduce a deep copy method of SerializerConfig for merging with Table configs in org.apache.flink.table.catalog.DataTypeFactoryImpl
> ------------------------------------------------------------------
>
> Key: FLINK-34239
> URL: https://issues.apache.org/jira/browse/FLINK-34239
> Project: Flink
> Issue Type: Sub-task
> Components: API / Core
> Affects Versions: 1.19.0
> Reporter: Zhanghao Chen
> Assignee: Kumar Mallikarjuna
> Priority: Major
> Labels: pull-request-available
>
> *Problem*
> Currently, org.apache.flink.table.catalog.DataTypeFactoryImpl#createSerializerExecutionConfig creates a deep copy of the SerializerConfig and merges the Table config into it. However, the deep copy is done by manually calling the getter and setter methods of SerializerConfig, which is prone to human error, e.g. forgetting to copy a newly added field in SerializerConfig.
> *Proposal*
> Introduce a deep copy method for SerializerConfig and use it to replace the current implementation in org.apache.flink.table.catalog.DataTypeFactoryImpl#createSerializerExecutionConfig.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Comment Edited] (FLINK-34239) Introduce a deep copy method of SerializerConfig for merging with Table configs in org.apache.flink.table.catalog.DataTypeFactoryImpl
[ https://issues.apache.org/jira/browse/FLINK-34239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832449#comment-17832449 ]

Kumar Mallikarjuna edited comment on FLINK-34239 at 3/30/24 2:01 PM:
---------------------------------------------------------------------

Thanks for the review, [~Zhanghao Chen]! I've updated the PR as per your comments.

was (Author: JIRAUSER303984):
Thanks, [~Zhanghao Chen]! I've updated the PR as per your comments.

> Introduce a deep copy method of SerializerConfig for merging with Table configs in org.apache.flink.table.catalog.DataTypeFactoryImpl
> ------------------------------------------------------------------
>
> Key: FLINK-34239
> URL: https://issues.apache.org/jira/browse/FLINK-34239
> Project: Flink
> Issue Type: Sub-task
> Components: API / Core
> Affects Versions: 1.19.0
> Reporter: Zhanghao Chen
> Assignee: Kumar Mallikarjuna
> Priority: Major
> Labels: pull-request-available
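The problem described in FLINK-34239 can be illustrated with a small, self-contained Java sketch. `DemoSerializerSettings`, its fields, and `DeepCopyDemo` are hypothetical stand-ins introduced only for this illustration; they are not Flink's `SerializerConfig` API and not the code from the PR. The sketch shows why a `copy()` method owned by the class tends to be less error-prone than copying field by field through getters and setters at the call site.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a config class; NOT Flink's SerializerConfig.
final class DemoSerializerSettings {
    private boolean forceGenericSerializer;
    private final List<String> registeredTypes = new ArrayList<>();

    boolean isForceGenericSerializer() { return forceGenericSerializer; }
    void setForceGenericSerializer(boolean value) { this.forceGenericSerializer = value; }
    List<String> getRegisteredTypes() { return registeredTypes; }

    // The deep copy lives next to the fields, so a newly added field is
    // harder to overlook than in copy code scattered across callers.
    DemoSerializerSettings copy() {
        DemoSerializerSettings copy = new DemoSerializerSettings();
        copy.forceGenericSerializer = this.forceGenericSerializer;
        // New list instance; the String elements are immutable, so sharing them is safe.
        copy.registeredTypes.addAll(this.registeredTypes);
        return copy;
    }
}

final class DeepCopyDemo {
    // The caller merges an extra setting into a copy and leaves the original untouched.
    static DemoSerializerSettings mergeInto(DemoSerializerSettings base, String extraType) {
        DemoSerializerSettings merged = base.copy();
        merged.getRegisteredTypes().add(extraType);
        return merged;
    }

    public static void main(String[] args) {
        DemoSerializerSettings base = new DemoSerializerSettings();
        base.setForceGenericSerializer(true);
        base.getRegisteredTypes().add("com.example.Order");

        DemoSerializerSettings merged = mergeInto(base, "com.example.Customer");
        System.out.println("base types:   " + base.getRegisteredTypes());
        System.out.println("merged types: " + merged.getRegisteredTypes());
    }
}
```

With this shape, a caller that needs a modified configuration only calls `copy()` and applies its changes; it does not need to know which fields exist.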
[jira] [Commented] (FLINK-34551) Align retry mechanisms of FutureUtils
[ https://issues.apache.org/jira/browse/FLINK-34551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831728#comment-17831728 ]

Kumar Mallikarjuna commented on FLINK-34551:
--------------------------------------------

I see, makes sense. Thank you.

> Align retry mechanisms of FutureUtils
> -------------------------------------
>
> Key: FLINK-34551
> URL: https://issues.apache.org/jira/browse/FLINK-34551
> Project: Flink
> Issue Type: Technical Debt
> Components: API / Core
> Affects Versions: 1.20.0
> Reporter: Matthias Pohl
> Assignee: Matthias Pohl
> Priority: Major
> Labels: pull-request-available
>
> The retry mechanisms of FutureUtils include quite a bit of redundant code, which makes them hard to understand and to extend. The logic should be aligned properly.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Comment Edited] (FLINK-34551) Align retry mechanisms of FutureUtils
[ https://issues.apache.org/jira/browse/FLINK-34551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831453#comment-17831453 ]

Kumar Mallikarjuna edited comment on FLINK-34551 at 3/27/24 4:38 PM:
---------------------------------------------------------------------

Hi [~mapohl], I've opened a PR ([#24578|https://github.com/apache/flink/pull/24578]) refactoring `retryOperation()` and `retryOperationWithDelay()`. I haven't refactored `retrySuccessfulWithDelay()`, since you had some changes on that already in your PR ([#24309|https://github.com/apache/flink/pull/24309]).

Would you mind reviewing?

was (Author: JIRAUSER303984):
Hi [~mapohl], I've opened a PR ([#24578|https://github.com/apache/flink/pull/24578]) refactoring `retryOperation()` and `retryOperationWithDelay()`. I haven't refactored `retrySuccessfulWithDelay()`, since you had some changes on that already in your PR ([#24309|https://github.com/apache/flink/pull/24309]).

> Align retry mechanisms of FutureUtils
> -------------------------------------
>
> Key: FLINK-34551
> URL: https://issues.apache.org/jira/browse/FLINK-34551
> Project: Flink
> Issue Type: Technical Debt
> Components: API / Core
> Affects Versions: 1.20.0
> Reporter: Matthias Pohl
> Assignee: Kumar Mallikarjuna
> Priority: Major
> Labels: pull-request-available
[jira] [Commented] (FLINK-34551) Align retry mechanisms of FutureUtils
[ https://issues.apache.org/jira/browse/FLINK-34551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831453#comment-17831453 ]

Kumar Mallikarjuna commented on FLINK-34551:
--------------------------------------------

Hi [~mapohl], I've opened a PR ([#24578|https://github.com/apache/flink/pull/24578]) refactoring `retryOperation()` and `retryOperationWithDelay()`. I haven't refactored `retrySuccessfulWithDelay()`, since you had some changes on that already in your PR ([#24309|https://github.com/apache/flink/pull/24309]).

> Align retry mechanisms of FutureUtils
> -------------------------------------
>
> Key: FLINK-34551
> URL: https://issues.apache.org/jira/browse/FLINK-34551
> Project: Flink
> Issue Type: Technical Debt
> Components: API / Core
> Affects Versions: 1.20.0
> Reporter: Matthias Pohl
> Assignee: Kumar Mallikarjuna
> Priority: Major
> Labels: pull-request-available
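As a rough illustration of what aligning redundant retry variants can look like, the following self-contained Java sketch routes an immediate retry and a delayed retry through one shared implementation. `RetrySketch` and its method names are hypothetical and chosen for this example; this is not Flink's `FutureUtils` and not necessarily the approach taken in PR #24578.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper for illustration only; NOT Flink's FutureUtils.
final class RetrySketch {

    // An immediate retry is treated as the zero-delay case of the delayed
    // variant, so both entry points share a single implementation.
    static <T> CompletableFuture<T> retry(
            Supplier<CompletableFuture<T>> operation,
            int retries,
            ScheduledExecutorService scheduler) {
        return retryWithDelay(operation, retries, 0L, scheduler);
    }

    static <T> CompletableFuture<T> retryWithDelay(
            Supplier<CompletableFuture<T>> operation,
            int retries,
            long delayMillis,
            ScheduledExecutorService scheduler) {
        CompletableFuture<T> result = new CompletableFuture<>();
        attempt(operation, retries, delayMillis, scheduler, result);
        return result;
    }

    // Runs one attempt; on failure, schedules the next attempt or gives up.
    private static <T> void attempt(
            Supplier<CompletableFuture<T>> operation,
            int retriesLeft,
            long delayMillis,
            ScheduledExecutorService scheduler,
            CompletableFuture<T> result) {
        operation.get().whenComplete((value, error) -> {
            if (error == null) {
                result.complete(value);
            } else if (retriesLeft > 0) {
                scheduler.schedule(
                        () -> attempt(operation, retriesLeft - 1, delayMillis, scheduler, result),
                        delayMillis,
                        TimeUnit.MILLISECONDS);
            } else {
                result.completeExceptionally(error);
            }
        });
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger calls = new AtomicInteger();
        // Fails on the first two calls, then succeeds on the third.
        Supplier<CompletableFuture<String>> flaky = () -> calls.incrementAndGet() < 3
                ? CompletableFuture.<String>failedFuture(new IllegalStateException("not yet"))
                : CompletableFuture.completedFuture("ok");
        String value = retryWithDelay(flaky, 5, 10L, scheduler).get(5, TimeUnit.SECONDS);
        System.out.println(value + " after " + calls.get() + " calls");
        scheduler.shutdown();
    }
}
```

The sketch requires Java 9 or later for `CompletableFuture.failedFuture`, and it does not handle an operation that throws synchronously instead of returning a failed future.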
[jira] [Commented] (FLINK-34551) Align retry mechanisms of FutureUtils
[ https://issues.apache.org/jira/browse/FLINK-34551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831254#comment-17831254 ]

Kumar Mallikarjuna commented on FLINK-34551:
--------------------------------------------

Thank you!

> Align retry mechanisms of FutureUtils
> -------------------------------------
>
> Key: FLINK-34551
> URL: https://issues.apache.org/jira/browse/FLINK-34551
> Project: Flink
> Issue Type: Technical Debt
> Components: API / Core
> Affects Versions: 1.20.0
> Reporter: Matthias Pohl
> Assignee: Kumar Mallikarjuna
> Priority: Major
[jira] [Commented] (FLINK-34551) Align retry mechanisms of FutureUtils
[ https://issues.apache.org/jira/browse/FLINK-34551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831212#comment-17831212 ]

Kumar Mallikarjuna commented on FLINK-34551:
--------------------------------------------

Hi [~mapohl], if I understand this correctly, we need to refactor the retry logic here, right? Can I take this up?

> Align retry mechanisms of FutureUtils
> -------------------------------------
>
> Key: FLINK-34551
> URL: https://issues.apache.org/jira/browse/FLINK-34551
> Project: Flink
> Issue Type: Technical Debt
> Components: API / Core
> Affects Versions: 1.20.0
> Reporter: Matthias Pohl
> Priority: Major
[jira] [Commented] (FLINK-34239) Introduce a deep copy method of SerializerConfig for merging with Table configs in org.apache.flink.table.catalog.DataTypeFactoryImpl
[ https://issues.apache.org/jira/browse/FLINK-34239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829635#comment-17829635 ]

Kumar Mallikarjuna commented on FLINK-34239:
--------------------------------------------

Hello [~Zhanghao Chen], [~zjureel], I've raised a PR for the change. Could you please take a look? Thanks! :)

> Introduce a deep copy method of SerializerConfig for merging with Table configs in org.apache.flink.table.catalog.DataTypeFactoryImpl
> ------------------------------------------------------------------
>
> Key: FLINK-34239
> URL: https://issues.apache.org/jira/browse/FLINK-34239
> Project: Flink
> Issue Type: Sub-task
> Components: API / Core
> Affects Versions: 1.19.0
> Reporter: Zhanghao Chen
> Assignee: Kumar Mallikarjuna
> Priority: Major
> Labels: pull-request-available
[jira] [Commented] (FLINK-34239) Introduce a deep copy method of SerializerConfig for merging with Table configs in org.apache.flink.table.catalog.DataTypeFactoryImpl
[ https://issues.apache.org/jira/browse/FLINK-34239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816077#comment-17816077 ]

Kumar Mallikarjuna commented on FLINK-34239:
--------------------------------------------

Thank you, [~zjureel]!

> Introduce a deep copy method of SerializerConfig for merging with Table configs in org.apache.flink.table.catalog.DataTypeFactoryImpl
> ------------------------------------------------------------------
>
> Key: FLINK-34239
> URL: https://issues.apache.org/jira/browse/FLINK-34239
> Project: Flink
> Issue Type: Sub-task
> Components: API / Core
> Affects Versions: 1.19.0
> Reporter: Zhanghao Chen
> Assignee: Kumar Mallikarjuna
> Priority: Major
[jira] [Commented] (FLINK-34239) Introduce a deep copy method of SerializerConfig for merging with Table configs in org.apache.flink.table.catalog.DataTypeFactoryImpl
[ https://issues.apache.org/jira/browse/FLINK-34239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815110#comment-17815110 ]

Kumar Mallikarjuna commented on FLINK-34239:
--------------------------------------------

Thanks, [~Zhanghao Chen]!

Hey [~zjureel], I would really appreciate it if you could assign the task! TIA :)

> Introduce a deep copy method of SerializerConfig for merging with Table configs in org.apache.flink.table.catalog.DataTypeFactoryImpl
> ------------------------------------------------------------------
>
> Key: FLINK-34239
> URL: https://issues.apache.org/jira/browse/FLINK-34239
> Project: Flink
> Issue Type: Sub-task
> Components: API / Core
> Affects Versions: 1.19.0
> Reporter: Zhanghao Chen
> Priority: Major
[jira] [Commented] (FLINK-34239) Introduce a deep copy method of SerializerConfig for merging with Table configs in org.apache.flink.table.catalog.DataTypeFactoryImpl
[ https://issues.apache.org/jira/browse/FLINK-34239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17811634#comment-17811634 ]

Kumar Mallikarjuna commented on FLINK-34239:
--------------------------------------------

Hi, I'm new to the community. This seems like a good change! If we want to implement this, may I pick this up?

> Introduce a deep copy method of SerializerConfig for merging with Table configs in org.apache.flink.table.catalog.DataTypeFactoryImpl
> ------------------------------------------------------------------
>
> Key: FLINK-34239
> URL: https://issues.apache.org/jira/browse/FLINK-34239
> Project: Flink
> Issue Type: Sub-task
> Components: API / Core
> Affects Versions: 1.19.0
> Reporter: Zhanghao Chen
> Priority: Major