YoungRX opened a new issue, #35000: URL: https://github.com/apache/arrow/issues/35000
### Describe the usage question you have. Please include as many useful details as possible.

I previously used `ParquetFileReader` from `parquet/file_reader.h` to read Parquet files, and I implemented predicate push-down myself. Now I am on 8.0.0 and have updated the code to read Parquet files through `AsyncScanner::ToRecordBatchReader()` and `ScannerRecordBatchReader::ReadNext()`, so that I can rely on the predicate push-down implemented inside Arrow. However, my environment does not support multithreading, so I set the following in `ScanOptions`:

> use_threads = false;
> batch_readahead = 0;
> batch_size = 1000;
> Other settings such as filter, projection, and dataset_schema are set as required.

As a result, when scanning the same Parquet file with the same SQL statement, the new code takes 1.5 to 2.0 times longer to execute than the old code, which seems unreasonable. Is there an option I have not set correctly? Or is the slowdown caused by multithreading and readahead being disabled? Is there a way to make `Scanner` faster?

### Component(s)

C++
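For context, the configuration described above corresponds roughly to the following sketch using Arrow's C++ Dataset API. This is a minimal illustration, not the reporter's actual code: it assumes a `std::shared_ptr<arrow::dataset::Dataset>` has already been constructed elsewhere (e.g. via a `FileSystemDatasetFactory`), and `filter_expr` is a placeholder for whatever predicate is being pushed down. Note that the `BatchReadahead` setter may not be available on `ScannerBuilder` in every release; in that case `ScanOptions::batch_readahead` can be set on the options struct directly.

```cpp
// Sketch only: assumes Arrow C++ with the dataset module enabled.
// `dataset` and `filter_expr` are hypothetical inputs built elsewhere.
#include <arrow/dataset/api.h>

arrow::Result<std::shared_ptr<arrow::RecordBatchReader>>
MakeSerialReader(std::shared_ptr<arrow::dataset::Dataset> dataset,
                 arrow::compute::Expression filter_expr) {
  auto builder = std::make_shared<arrow::dataset::ScannerBuilder>(dataset);
  ARROW_RETURN_NOT_OK(builder->UseThreads(false));    // single-threaded scan
  ARROW_RETURN_NOT_OK(builder->BatchSize(1000));      // batch_size = 1000, as in the report
  ARROW_RETURN_NOT_OK(builder->Filter(filter_expr));  // Arrow-internal predicate push-down
  ARROW_ASSIGN_OR_RAISE(auto scanner, builder->Finish());
  return scanner->ToRecordBatchReader();              // drives ReadNext() on the caller's side
}
```

With these settings the scan runs on the calling thread only, so any pipelining the scanner would normally get from readahead and the thread pool is lost, which is one plausible contributor to the slowdown being asked about.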
