lidavidm edited a comment on pull request #9656: URL: https://github.com/apache/arrow/pull/9656#issuecomment-812116823
For 1 file:

```
Thread pool(4) total tasks launched: 128
Thread pool(8) total tasks launched: 256
```

pool(4) is the CPU pool.

For 16 files:

```
Thread pool(4) total tasks launched: 2048
Thread pool(8) total tasks launched: 4096
```

So that's 1 CPU task per record batch (each file has 128 record batches). I'm not sure why we have 2 I/O operations per batch, though.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: [email protected]
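
The per-batch ratios in the comment above can be checked with a short calculation. This is only a sketch of the arithmetic, not Arrow code: the task counts are copied from the comment, pool(8) is assumed to be the I/O pool (the comment only names pool(4) as the CPU pool), and `tasks_per_batch` is a hypothetical helper.

```python
# Task counts copied from the comment above.
# Assumption: pool(4) is the CPU pool (stated), pool(8) is the I/O pool (implied).
BATCHES_PER_FILE = 128  # "each file has 128 record batches"

observed = {
    1: {"cpu": 128, "io": 256},      # 1 file
    16: {"cpu": 2048, "io": 4096},   # 16 files
}


def tasks_per_batch(total_tasks, num_files, batches_per_file=BATCHES_PER_FILE):
    """Hypothetical helper: average number of tasks launched per record batch."""
    return total_tasks / (num_files * batches_per_file)


# Ratio of launched tasks to record batches, per pool and per file count.
ratios = {
    num_files: {
        pool: tasks_per_batch(count, num_files) for pool, count in counts.items()
    }
    for num_files, counts in observed.items()
}

for num_files, r in ratios.items():
    print(
        f"{num_files} file(s): {r['cpu']:g} CPU task(s) per batch, "
        f"{r['io']:g} I/O task(s) per batch"
    )
```

Both file counts give the same ratios, so the task count scales linearly with the number of record batches; the open question from the comment is only why the I/O ratio is 2 rather than 1.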
