FelixYBW commented on issue #10104: URL: https://github.com/apache/incubator-gluten/issues/10104#issuecomment-3051321950
@NEUpanning Does your data follow a pattern where only one or a few reducer partitions are filled during the split? Here we size each destination row vector as available memory divided by the number of reducers, on the assumption that the data is spread evenly across the destination partitions. If the data is sorted or skewed, performance will be bad. The solution is to size the vectors using a more precise reducer count.
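To illustrate the sizing arithmetic, here is a rough sketch. It is not the Gluten shuffle writer code: the function names, the 64 MiB budget, the 64-byte row width, and the 2000-reducer count are all made up for the example, and reading "more precise reducer count" as "the number of partitions that actually receive rows in the batch" is only one possible interpretation.

```python
def rows_per_partition(available_bytes, bytes_per_row, num_partitions):
    """Rows each destination buffer can hold if memory is split evenly
    across num_partitions buffers (hypothetical helper)."""
    return max(1, available_bytes // (bytes_per_row * num_partitions))


def count_active_partitions(partition_ids):
    """Number of distinct destination partitions that receive at least one row."""
    return len(set(partition_ids))


# Assumed numbers, for illustration only.
available_bytes = 64 * 1024 * 1024   # 64 MiB memory budget
bytes_per_row = 64
num_reducers = 2000

# A skewed (or sorted) batch: every row lands in one of two partitions.
batch_partition_ids = [7] * 3000 + [42] * 1096

# Even-fill assumption: divide the budget by the total reducer count.
even_rows = rows_per_partition(available_bytes, bytes_per_row, num_reducers)

# Skew-aware alternative: divide by the partitions actually being filled.
active = count_active_partitions(batch_partition_ids)
skew_aware_rows = rows_per_partition(available_bytes, bytes_per_row, active)

print(even_rows, active, skew_aware_rows)
```

With these assumed numbers, the even split leaves each buffer room for only a few hundred rows, so the two busy partitions fill their buffers many times within one batch, while the other 1998 buffers stay empty.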
