davidzollo commented on PR #9894:
URL: https://github.com/apache/seatunnel/pull/9894#issuecomment-3699333491

   1. The current implementation in `LanceSinkWriter.write()` creates a new 
`RootAllocator`, opens the Dataset, and commits a transaction for **every 
single row**. This is an anti-pattern for batch processing and will result in 
extremely low throughput and the creation of thousands of tiny file fragments.
   
   2. The `BufferAllocator` and the `Dataset` handle should be initialized 
once as fields of the writer and reused throughout the writer's lifecycle, 
rather than being created and closed for every record.
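
To illustrate both points, here is a minimal sketch of the buffering pattern. It does not use the real Lance or Arrow APIs: `FakeDataset` and `BatchingWriter` are hypothetical stand-ins, and the batch size is an assumed parameter. The long-lived resource is created once, rows are buffered, and one commit is made per batch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a long-lived resource (e.g. the allocator or
// dataset handle). It only counts commits; it is NOT the Lance API.
final class FakeDataset implements AutoCloseable {
    int commits = 0;
    int rowsWritten = 0;
    boolean closed = false;

    // One call here represents one committed transaction / file fragment.
    void appendBatch(List<String> rows) {
        rowsWritten += rows.size();
        commits++;
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Hypothetical writer: holds the resource for its whole lifecycle and
// commits once per batch instead of once per row.
final class BatchingWriter implements AutoCloseable {
    private final FakeDataset dataset;   // opened once, reused for every row
    private final int batchSize;         // assumed tuning parameter
    private final List<String> buffer = new ArrayList<>();

    BatchingWriter(FakeDataset dataset, int batchSize) {
        this.dataset = dataset;
        this.batchSize = batchSize;
    }

    void write(String row) {
        buffer.add(row);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Commits whatever is buffered; a no-op when the buffer is empty.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        dataset.appendBatch(new ArrayList<>(buffer));
        buffer.clear();
    }

    // Flushes the remaining rows, then releases the resource exactly once.
    @Override
    public void close() {
        flush();
        dataset.close();
    }
}
```

With a batch size of 100, writing 250 rows and then closing the writer produces three commits instead of 250. In the real writer the flush would probably also need to run at checkpoint time, so that buffered rows are not lost on failure.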
   
   What do you think?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
