sqd opened a new pull request, #16339:
URL: https://github.com/apache/iceberg/pull/16339

   Introduce DynamicTaskWriterFactoryProvider so callers can supply a custom 
TaskWriterFactory<RowData> in place of the default RowDataTaskWriterFactory, 
while reusing the surrounding table, schema, partition spec, and write-property 
resolution already done in DynamicWriter.
   
   The primary motivation is throughput. Our pipelines have a data pattern, 
closely tied to our business logic, that a hand-rolled TaskWriter can exploit to 
produce files much faster than the generic writers created by 
RowDataTaskWriterFactory.
   
   Making the factory pluggable also enables other use cases without forking 
the sink:
   
     - Row-level or file-level audit and metrics: sampling, lineage stamps, 
metric counters layered around the writer.
     - Custom file naming and layout: custom prefixes, alternative partition 
paths, custom filesystem properties such as storage class and permissions.
   
   The default provider preserves existing behavior, so callers that do not 
supply one are unaffected.
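
   As a rough illustration of the pattern described above (not the code in this 
PR), the sketch below shows a pluggable provider with a default and a 
metrics-counting wrapper layered around it. Every type here (SketchWriter, 
SketchWriterFactory, SketchWriterFactoryProvider, WriteContext) is a 
hypothetical stand-in so the example compiles without Iceberg on the classpath; 
the actual DynamicTaskWriterFactoryProvider signature may differ.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class PluggableWriterSketch {

  // Hypothetical stand-in for a task writer; NOT the Iceberg TaskWriter interface.
  interface SketchWriter<T> {
    void write(T row);

    List<String> complete();
  }

  // Hypothetical stand-in for TaskWriterFactory<T>.
  interface SketchWriterFactory<T> {
    SketchWriter<T> create();
  }

  // Hypothetical bundle of what the sink has already resolved (table, schema, spec, ...).
  // Only the table name is modeled here to keep the sketch short.
  static final class WriteContext {
    final String tableName;

    WriteContext(String tableName) {
      this.tableName = tableName;
    }
  }

  // Hypothetical provider: maps the resolved context to a writer factory.
  interface SketchWriterFactoryProvider<T> {
    SketchWriterFactory<T> factoryFor(WriteContext ctx);
  }

  // Default writer for the sketch: counts rows and reports a summary on completion.
  static final class CountingSummaryWriter<T> implements SketchWriter<T> {
    private final String tableName;
    private long rows;

    CountingSummaryWriter(String tableName) {
      this.tableName = tableName;
    }

    @Override
    public void write(T row) {
      rows++;
    }

    @Override
    public List<String> complete() {
      return Collections.singletonList(tableName + ": " + rows + " rows");
    }
  }

  // Default provider: callers that supply nothing get this behavior.
  static <T> SketchWriterFactoryProvider<T> defaultProvider() {
    return ctx -> () -> new CountingSummaryWriter<>(ctx.tableName);
  }

  // Custom provider: wraps any delegate and counts every row passed to write().
  static <T> SketchWriterFactoryProvider<T> countingProvider(
      SketchWriterFactoryProvider<T> delegate, AtomicLong rowsWritten) {
    return ctx -> () -> {
      SketchWriter<T> inner = delegate.factoryFor(ctx).create();
      return new SketchWriter<T>() {
        @Override
        public void write(T row) {
          rowsWritten.incrementAndGet();
          inner.write(row);
        }

        @Override
        public List<String> complete() {
          return inner.complete();
        }
      };
    };
  }

  public static void main(String[] args) {
    AtomicLong rowsWritten = new AtomicLong();
    SketchWriterFactoryProvider<String> provider =
        countingProvider(PluggableWriterSketch.<String>defaultProvider(), rowsWritten);

    SketchWriter<String> writer = provider.factoryFor(new WriteContext("db.events")).create();
    writer.write("a");
    writer.write("b");

    System.out.println(writer.complete());
    System.out.println("rows seen by wrapper: " + rowsWritten.get());
  }
}
```

   In this sketch the wrapper never needs to know how the delegate was built, 
which is the property that lets audit or metrics logic sit around a writer 
without forking the sink.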


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

