RussellSpitzer commented on issue #2302: URL: https://github.com/apache/iceberg/issues/2302#issuecomment-805920632
OK, so that speed difference looks to be entirely due to the difference in parallelism there: 8 files vs. 1. The overall speed seems to be in line. My guess would be that it's just slow hardware and IO. Probably IO, based on how long even those tiny tasks took: over a second to read 2.6 MB of data and deserialize it? That seems not great. Result serialization time is also incredibly long on some of those tasks. Everything kind of feels like this system is being overloaded to me ...
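
For a rough back-of-envelope check on those task numbers, here is a minimal sketch. The 1.0 s duration is an assumed lower bound taken from "over a second", and the function name is purely illustrative, not part of Spark or Iceberg:

```python
def effective_throughput_mb_s(megabytes: float, seconds: float) -> float:
    """Return the effective read + deserialize throughput in MB/s."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return megabytes / seconds


# Figures from the comment: 2.6 MB read and deserialized in just over a second.
task_mb = 2.6
task_seconds = 1.0  # assumption: lower bound, the real tasks took longer

# Because the real duration is longer, this is an upper bound on throughput.
upper_bound = effective_throughput_mb_s(task_mb, task_seconds)
print(f"at most {upper_bound:.1f} MB/s per task")
```

Since the tasks actually took longer than a second, the real per-task rate is below this figure, which is consistent with the guess that IO or an overloaded system is the bottleneck rather than the table layout.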
