[jira] [Commented] (FLINK-25113) Cleanup from Parquet and Orc the partition key handling logic
[ https://issues.apache.org/jira/browse/FLINK-25113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17633829#comment-17633829 ]

Aitozi commented on FLINK-25113:

Hi [~slinkydeveloper], [~luoyuxia], [~lsy], I have pushed a [PR|https://github.com/apache/flink/pull/21290] for this ticket. Could you help review it?

> Cleanup from Parquet and Orc the partition key handling logic
> -------------------------------------------------------------
>
> Key: FLINK-25113
> URL: https://issues.apache.org/jira/browse/FLINK-25113
> Project: Flink
> Issue Type: Sub-task
> Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
> Reporter: Francesco Guardiani
> Priority: Major
>
> After https://issues.apache.org/jira/browse/FLINK-24617 the partition key
> handling logic is encapsulated within {{FileInfoExtractorBulkFormat}}. We
> should clean up this logic from the ORC and Parquet formats in order to
> simplify it. Note: Hive still depends on this logic, but it should rather
> use {{FileInfoExtractorBulkFormat}} or similar.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-25113) Cleanup from Parquet and Orc the partition key handling logic
[ https://issues.apache.org/jira/browse/FLINK-25113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17632146#comment-17632146 ]

Aitozi commented on FLINK-25113:

When I started working on this, I found that I can't simply split it into two separate pieces of work, because the partition keys in the Parquet/ORC formats affect the Hive source after switching to {{FileInfoExtractorBulkFormat}}. So I created a single PR with two commits to complete this work.
[jira] [Commented] (FLINK-25113) Cleanup from Parquet and Orc the partition key handling logic
[ https://issues.apache.org/jira/browse/FLINK-25113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631679#comment-17631679 ]

Aitozi commented on FLINK-25113:

Sorry for missing the ticket link :) https://issues.apache.org/jira/browse/FLINK-29980
[jira] [Commented] (FLINK-25113) Cleanup from Parquet and Orc the partition key handling logic
[ https://issues.apache.org/jira/browse/FLINK-25113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631632#comment-17631632 ]

Aitozi commented on FLINK-25113:

Hi [~slinkydeveloper], I created a preceding ticket to improve how the Hive source handles partition keys. I'd like to work on it; could you assign the ticket to me?