[jira] [Commented] (FLINK-25113) Cleanup from Parquet and Orc the partition key handling logic

2022-11-14 Thread Aitozi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17633829#comment-17633829
 ] 

Aitozi commented on FLINK-25113:


Hi [~slinkydeveloper], [~luoyuxia], [~lsy], I have pushed a 
[PR|https://github.com/apache/flink/pull/21290] for this ticket. Could you 
help review it?

> Cleanup from Parquet and Orc the partition key handling logic
> -
>
> Key: FLINK-25113
> URL: https://issues.apache.org/jira/browse/FLINK-25113
> Project: Flink
>  Issue Type: Sub-task
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Reporter: Francesco Guardiani
>Priority: Major
>
> After https://issues.apache.org/jira/browse/FLINK-24617 the partition key 
> handling logic is encapsulated within {{FileInfoExtractorBulkFormat}}. We 
> should clean up this logic from the ORC and Parquet formats in order to 
> simplify them. Note: Hive still depends on this logic, but it should rather 
> use {{FileInfoExtractorBulkFormat}} or similar.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-25113) Cleanup from Parquet and Orc the partition key handling logic

2022-11-11 Thread Aitozi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17632146#comment-17632146
 ] 

Aitozi commented on FLINK-25113:


When I tried to work on this, I found that I can't simply split it into two 
separate pieces of work, because the partition key handling in the Parquet/ORC 
formats affects the Hive source once it uses {{FileInfoExtractorBulkFormat}}. 
So I created a PR with these two commits to complete this work.



[jira] [Commented] (FLINK-25113) Cleanup from Parquet and Orc the partition key handling logic

2022-11-10 Thread Aitozi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631679#comment-17631679
 ] 

Aitozi commented on FLINK-25113:


Sorry for missing the ticket link :) 
https://issues.apache.org/jira/browse/FLINK-29980



[jira] [Commented] (FLINK-25113) Cleanup from Parquet and Orc the partition key handling logic

2022-11-10 Thread Aitozi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631632#comment-17631632
 ] 

Aitozi commented on FLINK-25113:


Hi [~slinkydeveloper], I created a preceding ticket to improve how the Hive 
source handles partition keys. I'd like to work on it. Could you assign the 
ticket to me?
