Jonathan Vexler created HUDI-8031:
-------------------------------------

             Summary: Allow read partition path field directly from file in new 
filegroup reader
                 Key: HUDI-8031
                 URL: https://issues.apache.org/jira/browse/HUDI-8031
             Project: Apache Hudi
          Issue Type: Improvement
          Components: reader-core, spark
            Reporter: Jonathan Vexler
            Assignee: Jonathan Vexler


Currently, for Spark, we append the same partition path value to the end of 
every record. That value is not always correct: with a timestamp-based keygen, 
for example, the partition field can differ for every record.

Idea for how to implement: in the default source / hadoopfs factory, we figure 
out whether the partition columns will hold the same value for every record. 
If they will not, we set the partition schema to empty so that the values are 
read directly from the file. One thing to think about: in the file index, we 
need to look through the data filters and move any that are on the partition 
column to the partition filters.
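
The idea above could be sketched roughly as follows. This is a minimal, 
Spark-free illustration, not the Hudi implementation: the class 
PartitionFilterSketch, its Filter and SplitFilters types, and both method 
names are hypothetical, and the flag sameValueForAllRecords stands in for 
whatever check the default source / hadoopfs factory would actually perform.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names come from the Hudi code base.
final class PartitionFilterSketch {

    // Stand-in for a pushed-down predicate: only the columns it references matter here.
    static final class Filter {
        final String expression;
        final Set<String> columns;

        Filter(String expression, String... columns) {
            this.expression = expression;
            this.columns = new HashSet<>(Arrays.asList(columns));
        }
    }

    // Result of re-splitting the filters inside the file index.
    static final class SplitFilters {
        final List<Filter> partitionFilters = new ArrayList<>();
        final List<Filter> dataFilters = new ArrayList<>();
    }

    // Step 1: choose the partition schema to append to each record.
    // An empty schema means the partition field is read from the file instead.
    static List<String> partitionSchemaToAppend(List<String> partitionColumns,
                                                boolean sameValueForAllRecords) {
        return sameValueForAllRecords ? partitionColumns : Collections.<String>emptyList();
    }

    // Step 2: with an empty partition schema every filter arrives as a data filter,
    // so move the ones that reference only partition columns to the partition filters.
    static SplitFilters moveToPartitionFilters(List<Filter> partitionFilters,
                                               List<Filter> dataFilters,
                                               Set<String> partitionColumns) {
        SplitFilters out = new SplitFilters();
        out.partitionFilters.addAll(partitionFilters);
        for (Filter f : dataFilters) {
            boolean onPartitionColumns =
                !f.columns.isEmpty() && partitionColumns.containsAll(f.columns);
            if (onPartitionColumns) {
                out.partitionFilters.add(f);
            } else {
                out.dataFilters.add(f);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> partitionColumns = new HashSet<>(Arrays.asList("ts"));
        List<Filter> dataFilters = Arrays.asList(
            new Filter("ts > 100", "ts"),
            new Filter("price < 5", "price"));
        SplitFilters split = moveToPartitionFilters(
            Collections.<Filter>emptyList(), dataFilters, partitionColumns);
        System.out.println("partition filters: " + split.partitionFilters.size());
        System.out.println("data filters: " + split.dataFilters.size());
    }
}
```

The sketch only moves filters, as described above; whether a moved filter 
should also stay in the data filters is not specified in this issue.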


