cetra3 commented on PR #20381: URL: https://github.com/apache/datafusion/pull/20381#issuecomment-4288079804
@bharath-techie I had replied to a comment about simplifying this further:

> Actually simplifying this down, we get:
>
> batches_size <= (batches_size / total_rows) * inner_len * 2
>
> // Multiply both sides by total_rows:
>
> batches_size * total_rows <= batches_size * inner_len * 2
>
> // Divide both sides by batches_size:
>
> total_rows <= inner_len * 2
>
> So the batches size actually just cancels out.

In light of this, let me know if it's still something we want to pursue. I still see some perf gains with this strategy. If you're OK with this, I will make the change.
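For reference, a minimal sketch of the simplification as a check in Rust. The function names are hypothetical and not taken from the PR; only `batches_size`, `total_rows`, and `inner_len` come from the derivation. It assumes all three are non-negative integer sizes and compares the cross-multiplied form (the second line of the derivation) against the simplified one using exact integer arithmetic.

```rust
/// Cross-multiplied form of the check:
///   batches_size * total_rows <= batches_size * inner_len * 2
/// Hypothetical helper; widening to u128 keeps the products exact
/// for any inputs below 2^63.
fn cross_multiplied_check(batches_size: usize, total_rows: usize, inner_len: usize) -> bool {
    let lhs = batches_size as u128 * total_rows as u128;
    let rhs = batches_size as u128 * inner_len as u128 * 2;
    lhs <= rhs
}

/// Simplified form after dividing both sides by batches_size:
///   total_rows <= inner_len * 2
fn simplified_check(total_rows: usize, inner_len: usize) -> bool {
    (total_rows as u128) <= inner_len as u128 * 2
}

/// Exhaustively compares the two forms for small inputs.
/// batches_size and total_rows start at 1 because the derivation
/// divides by both of them.
fn forms_agree(max: usize) -> bool {
    for batches_size in 1..=max {
        for total_rows in 1..=max {
            for inner_len in 0..=max {
                let a = cross_multiplied_check(batches_size, total_rows, inner_len);
                let b = simplified_check(total_rows, inner_len);
                if a != b {
                    return false;
                }
            }
        }
    }
    true
}
```

The two forms agree whenever `batches_size > 0` and `total_rows > 0`. With `batches_size == 0` the cross-multiplied form is trivially true (`0 <= 0`) while the simplified one still depends on `total_rows` and `inner_len`, so that edge case may need a deliberate decision if the simplified condition is adopted.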
