Re: [PR] GH-43694: [C++] Add an `ExecContext` Option to `arrow::dataset::ScanOptions` [arrow]

via GitHub Fri, 17 Jan 2025 03:10:34 -0800


raulcd commented on code in PR #43698:
URL: https://github.com/apache/arrow/pull/43698#discussion_r1920000274



##########
cpp/src/arrow/dataset/file_parquet.cc:
##########
@@ -633,10 +634,15 @@ Result<RecordBatchGenerator> ParquetFileFormat::ScanBatchesAsync(
             kParquetTypeName, options.get(), default_fragment_scan_options));
     int batch_readahead = options->batch_readahead;
     int64_t rows_to_readahead = batch_readahead * options->batch_size;
-    ARROW_ASSIGN_OR_RAISE(auto generator,
-                          reader->GetRecordBatchGenerator(
-                              reader, row_groups, column_projection,
-                              ::arrow::internal::GetCpuThreadPool(), rows_to_readahead));
+    // Modified this to pass the executor in scan_options instead of always using the
+    // default CPU thread pool.
+    // XXX Should we get it from options->fragment_scan_options instead??

Review Comment:
   Can we update the comment to remove the `XXX` question and to state what is
happening instead of explaining the change? Once this is merged, the comment
won't make much sense as is.
   ```suggestion
       // Use the executor in scan_options instead of always using the
       // default CPU thread pool.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
