satish created HUDI-1628:
----------------------------
Summary: Improve data locality during ingestion
Key: HUDI-1628
URL: https://issues.apache.org/jira/browse/HUDI-1628
Project: Apache Hudi
Issue Type: New Feature
Reporter: satish
Today the upsert partitioner handles file sizing and bin-packing for
inserts, and then routes some inserts to existing file groups to
maintain file size.
We could abstract all of this into strategies and some kind of pipeline
abstraction, and have it also consider "affinity" to an existing file group
based on, say, information stored in the metadata table.
See http://mail-archives.apache.org/mod_mbox/hudi-dev/202102.mbox/browser
for more details.
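
To make the idea concrete, here is a minimal sketch of what a strategy
pipeline with an affinity step might look like. It is written in Python for
brevity and is not Hudi code: FileGroup, Record, affinity_strategy,
small_file_strategy and assign are all hypothetical names, and the per-file
key range (min_key/max_key) is only an assumed stand-in for whatever
information the metadata table would actually provide.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class FileGroup:
    file_id: str
    size_bytes: int
    # Hypothetical key range; stands in for stats read from the metadata table.
    min_key: str
    max_key: str


@dataclass
class Record:
    key: str
    size_bytes: int


# A strategy returns the chosen file group, or None to defer to the next one.
Strategy = Callable[[Record, List[FileGroup], int], Optional[FileGroup]]


def affinity_strategy(record: Record, groups: List[FileGroup],
                      max_file_bytes: int) -> Optional[FileGroup]:
    """Prefer a file group whose key range covers the record and has room."""
    for g in groups:
        in_range = g.min_key <= record.key <= g.max_key
        if in_range and g.size_bytes + record.size_bytes <= max_file_bytes:
            return g
    return None


def small_file_strategy(record: Record, groups: List[FileGroup],
                        max_file_bytes: int) -> Optional[FileGroup]:
    """Size-only fallback: pick the smallest file group that still has room."""
    fits = [g for g in groups
            if g.size_bytes + record.size_bytes <= max_file_bytes]
    return min(fits, key=lambda g: g.size_bytes) if fits else None


def assign(records: List[Record], groups: List[FileGroup],
           strategies: List[Strategy], max_file_bytes: int) -> Dict[str, str]:
    """Run each record through the pipeline; the first strategy to answer wins."""
    assignments: Dict[str, str] = {}
    new_count = 0
    for r in records:
        target = None
        for strategy in strategies:
            target = strategy(r, groups, max_file_bytes)
            if target is not None:
                break
        if target is None:
            # No existing file group accepted the record: open a new one.
            new_count += 1
            target = FileGroup(f"new-{new_count}", 0, r.key, r.key)
            groups.append(target)
        # Keep the size and key-range bookkeeping current for later records.
        target.size_bytes += r.size_bytes
        target.min_key = min(target.min_key, r.key)
        target.max_key = max(target.max_key, r.key)
        assignments[r.key] = target.file_id
    return assignments
```

Calling assign(records, groups, [affinity_strategy, small_file_strategy],
max_file_bytes) tries locality first and falls back to size-based packing, so
reordering or swapping the list changes the policy without touching the
partitioner loop itself.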
--
This message was sent by Atlassian Jira
(v8.3.4#803005)