silenceland commented on PR #9894:
URL: https://github.com/apache/seatunnel/pull/9894#issuecomment-3710462990

   > 1. The current implementation in `LanceSinkWriter.write()` creates a new
   > `RootAllocator`, opens the Dataset, and commits a transaction for **every
   > single row**. This is an anti-pattern for batch processing and will result in
   > extremely low throughput and the creation of thousands of tiny file fragments.
   > 2. The `BufferAllocator` and `Dataset` connections should be initialized
   > once at the class level and reused throughout the writer's lifecycle, rather
   > than being created and closed for every record.
   > 
   > What do you think?
   
   I agree with your suggestion. I have modified the code accordingly, and it
   now passes the CI tests. Please review it, thanks.
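
   For anyone following the thread, the pattern the reviewer describes (create the
   expensive resources once, buffer rows, commit per batch) can be sketched roughly
   as below. This is not the code in the PR and it does not use the Lance or Arrow
   APIs: `FakeDataset`, `BufferedSinkWriter`, and `batchSize` are hypothetical names
   chosen for illustration, and `FakeDataset` only counts commits in place of a real
   allocator and dataset handle.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the expensive resources (allocator + dataset handle). */
final class FakeDataset implements AutoCloseable {
    int commits = 0;        // number of transactions committed
    int rowsWritten = 0;    // total rows across all commits
    boolean closed = false;

    /** One commit corresponds to one written fragment in this sketch. */
    void commit(List<String> rows) {
        commits++;
        rowsWritten += rows.size();
    }

    @Override
    public void close() {
        closed = true;
    }
}

/** Hypothetical writer: holds the dataset for its whole lifecycle and commits per batch. */
final class BufferedSinkWriter implements AutoCloseable {
    private final FakeDataset dataset;   // created once by the caller, reused for every row
    private final int batchSize;         // assumed tuning knob, not a real connector option
    private final List<String> buffer = new ArrayList<>();

    BufferedSinkWriter(FakeDataset dataset, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.dataset = dataset;
        this.batchSize = batchSize;
    }

    /** Buffers the row; commits only when the buffer reaches batchSize. */
    void write(String row) {
        buffer.add(row);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Commits whatever is buffered as a single transaction. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        dataset.commit(new ArrayList<>(buffer));
        buffer.clear();
    }

    /** Flushes the remaining rows, then releases the shared resource exactly once. */
    @Override
    public void close() {
        flush();
        dataset.close();
    }
}
```

   With a batch size of 100, writing 250 rows through this sketch produces three
   commits (100, 100, and 50 rows) instead of 250 single-row commits.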
