HeartSaVioR commented on PR #41122:
URL: https://github.com/apache/spark/pull/41122#issuecomment-1545192031

   That's because we no longer use writebatch, which has been problematic for 
memory usage. We probably should have rerun the benchmark and updated the 
results...
   
   The overall performance won't be significantly reduced: without writebatch 
we pay the cost in each operation, whereas with writebatch we pay that same 
cost all at once when flushing in the commit phase. (We benchmarked this 
ourselves.)
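
   To illustrate the trade-off described above, here is a toy cost model. It 
is only a sketch and does not use the RocksDB API: `ToyStore`, `run`, and the 
counters are hypothetical names, and charging one unit of cost per write is an 
assumption made purely for illustration.

```python
class ToyStore:
    """Toy cost model of a key-value store; NOT the RocksDB API."""

    def __init__(self, use_write_batch):
        self.use_write_batch = use_write_batch
        self.db = {}             # committed data
        self.pending = []        # operations buffered in memory until commit
        self.peak_buffered = 0   # largest number of buffered operations seen
        self.cost_per_op = 0     # write cost paid inside put()
        self.cost_at_commit = 0  # write cost paid inside commit()

    def put(self, key, value):
        if self.use_write_batch:
            # Buffer the write; memory grows with the number of operations.
            self.pending.append((key, value))
            self.peak_buffered = max(self.peak_buffered, len(self.pending))
        else:
            # Apply the write immediately and pay its cost now (assumed: 1 unit).
            self.db[key] = value
            self.cost_per_op += 1

    def commit(self):
        # With a write batch, every buffered write is paid for here, at once.
        for key, value in self.pending:
            self.db[key] = value
            self.cost_at_commit += 1
        self.pending.clear()


def run(use_write_batch, num_ops):
    """Write num_ops distinct keys, commit, and return the store."""
    store = ToyStore(use_write_batch)
    for i in range(num_ops):
        store.put(i, i * 2)
    store.commit()
    return store


if __name__ == "__main__":
    for mode in (True, False):
        s = run(mode, 1000)
        print(mode, s.cost_per_op, s.cost_at_commit, s.peak_buffered)
```

   In this model the total cost is the same in both modes; only the point at 
which it is paid and the amount of buffered data differ. A benchmark that 
times individual operations would therefore show them as slower without 
writebatch even if the end-to-end time is similar.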
   
   cc. @anishshri-db Would you mind adding more context here? We probably need 
to update the benchmark, with a reduced number of operations.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

