satish created HUDI-1628:
----------------------------

             Summary: Improve data locality during ingestion
                 Key: HUDI-1628
                 URL: https://issues.apache.org/jira/browse/HUDI-1628
             Project: Apache Hudi
          Issue Type: New Feature
            Reporter: satish



Today the upsert partitioner handles file sizing and bin-packing for
inserts, and then routes some inserts to existing file groups to
maintain file size.
We could abstract all of this into strategies and some kind of pipeline
abstraction, and have it also consider "affinity" to an existing file
group, based on, say, information stored in the metadata table.
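
To make the idea concrete, here is a rough sketch (in Python for brevity,
not Hudi code) of what a strategy pipeline with an affinity step might look
like. Every name in it (FileGroup, InsertBatch, locality_key,
SizeLimitStrategy, AffinityStrategy, assign) is hypothetical, and the
affinity_index dict is only a stand-in for whatever lookup the metadata
table could provide. It is one possible shape under those assumptions, not
a description of the existing partitioner.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Protocol


@dataclass
class FileGroup:
    file_id: str
    size_bytes: int


@dataclass
class InsertBatch:
    locality_key: str  # hypothetical: whatever drives locality, e.g. a key prefix
    size_bytes: int


class PlacementStrategy(Protocol):
    """One pipeline stage: narrows the file groups a batch may be sent to."""

    def candidates(self, batch: InsertBatch, groups: List[FileGroup]) -> List[FileGroup]:
        ...


class SizeLimitStrategy:
    """Keep only file groups that can absorb the batch within the target size."""

    def __init__(self, max_file_bytes: int):
        self.max_file_bytes = max_file_bytes

    def candidates(self, batch: InsertBatch, groups: List[FileGroup]) -> List[FileGroup]:
        return [g for g in groups
                if g.size_bytes + batch.size_bytes <= self.max_file_bytes]


class AffinityStrategy:
    """Prefer the file group that already holds data for the batch's locality key."""

    def __init__(self, affinity_index: Dict[str, str]):
        # Assumption: locality_key -> file_id, standing in for a metadata table lookup.
        self.affinity_index = affinity_index

    def candidates(self, batch: InsertBatch, groups: List[FileGroup]) -> List[FileGroup]:
        preferred = self.affinity_index.get(batch.locality_key)
        matching = [g for g in groups if g.file_id == preferred]
        # Without an affinity match, leave the candidate set unchanged.
        return matching or groups


def assign(batches: List[InsertBatch],
           groups: List[FileGroup],
           pipeline: List[PlacementStrategy]) -> Dict[int, Optional[str]]:
    """Map batch index -> target file_id; None means "open a new file group"."""
    assignment: Dict[int, Optional[str]] = {}
    for i, batch in enumerate(batches):
        remaining = list(groups)
        for strategy in pipeline:
            remaining = strategy.candidates(batch, remaining)
        if remaining:
            # Fill the smallest surviving file group first.
            target = min(remaining, key=lambda g: g.size_bytes)
            target.size_bytes += batch.size_bytes
            assignment[i] = target.file_id
        else:
            assignment[i] = None
    return assignment
```

Ordering the stages as [SizeLimitStrategy, AffinityStrategy] treats file
size as a hard constraint and affinity as a preference; other orderings or
a scoring-based combination would be equally plausible.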

See http://mail-archives.apache.org/mod_mbox/hudi-dev/202102.mbox/browser
for more details.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
