Prashant Wason created HUDI-4094:
------------------------------------

             Summary: Allow bulk insert partitioner to specify the fileID 
prefixes to use
                 Key: HUDI-4094
                 URL: https://issues.apache.org/jira/browse/HUDI-4094
             Project: Apache Hudi
          Issue Type: New Feature
            Reporter: Prashant Wason
            Assignee: Prashant Wason


This makes it possible to use bulk insert when bootstrapping metadata table indexes.

Currently we use upsertPrepped to write to the metadata table. The upsert code 
path is not optimized for very large writes (1 billion+ records) because of the 
overhead of workload profiling and upsert partitioning.

Bulk insert into the metadata table requires each partition to be written to 
files that have specific names, so the random fileIDs that bulk insert 
currently generates cannot be used.
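
A minimal sketch of the idea, not Hudi's actual API: the partitioner is asked 
for the fileID prefix of each output partition instead of the writer always 
generating a random one. The names FileIdPrefixProvider, 
RandomFileIdPrefixProvider and FixedFileIdPrefixProvider, and the 
"<index>-<4-digit number>" naming scheme, are assumptions made for 
illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.UUID;

// Hypothetical hook: maps an output partition number to the fileID prefix
// that the writer should use for the files of that partition.
interface FileIdPrefixProvider {
    String fileIdPrefix(int partitionId);
}

// Models the behavior described above as "currently implemented":
// every call yields a fresh random prefix.
final class RandomFileIdPrefixProvider implements FileIdPrefixProvider {
    @Override
    public String fileIdPrefix(int partitionId) {
        return UUID.randomUUID().toString();
    }
}

// Models the requested behavior: prefixes are fixed up front, so each
// output partition always lands in a file with a predictable name.
final class FixedFileIdPrefixProvider implements FileIdPrefixProvider {
    private final List<String> prefixes = new ArrayList<>();

    // indexName and fileGroupCount are illustrative parameters.
    FixedFileIdPrefixProvider(String indexName, int fileGroupCount) {
        for (int i = 0; i < fileGroupCount; i++) {
            // Assumed naming scheme, e.g. "files-0000", "files-0001", ...
            prefixes.add(String.format(Locale.ROOT, "%s-%04d", indexName, i));
        }
    }

    @Override
    public String fileIdPrefix(int partitionId) {
        // Throws IndexOutOfBoundsException if the partition number has
        // no pre-assigned prefix.
        return prefixes.get(partitionId);
    }
}
```

With a provider like the fixed one, the number of output partitions has to 
match the number of pre-assigned prefixes, which is why the partitioner (and 
not the writer) is the natural place to define them.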



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
