mapleFU commented on code in PR #37918:
URL: https://github.com/apache/arrow/pull/37918#discussion_r1339598851
##########
cpp/src/arrow/dataset/file_parquet.cc:
##########
@@ -479,11 +479,25 @@ Future<std::shared_ptr<parquet::arrow::FileReader>>
ParquetFileFormat::GetReader
default_fragment_scan_options));
auto properties =
MakeReaderProperties(*this, parquet_scan_options.get(), options->pool);
- ARROW_ASSIGN_OR_RAISE(auto input, source.Open());
+ auto input_fut = source.OpenAsync();
// TODO(ARROW-12259): workaround since we have Future<(move-only type)>
- auto reader_fut = parquet::ParquetFileReader::OpenAsync(
- std::move(input), std::move(properties), metadata);
+
auto path = source.path();
+ auto reader_fut = input_fut.Then(
+ [=](const std::shared_ptr<arrow::io::RandomAccessFile>&) mutable
+ -> Result<std::unique_ptr<parquet::ParquetFileReader>> {
+ ARROW_ASSIGN_OR_RAISE(std::shared_ptr<arrow::io::RandomAccessFile>
input,
+ input_fut.MoveResult());
+ auto rfut = parquet::ParquetFileReader::OpenAsync(std::move(input),
std::move(properties), metadata);
+ ARROW_ASSIGN_OR_RAISE(auto reader,
+ rfut.MoveResult());
+ return reader;
+ },
+ [path](
+ const Status& status) ->
Result<std::unique_ptr<parquet::ParquetFileReader>> {
+ return WrapSourceError(status, path);
+ });
Review Comment:
Would `WrapSourceError` called twice since L516-L519 already called that?
##########
cpp/src/arrow/dataset/file_base.cc:
##########
@@ -81,6 +81,18 @@ Result<std::shared_ptr<io::RandomAccessFile>>
FileSource::Open() const {
return custom_open_();
}
+Future<std::shared_ptr<io::RandomAccessFile>> FileSource::OpenAsync() const {
+ if (filesystem_) {
+ return filesystem_->OpenInputFileAsync(file_info_);
+ }
+
+ if (buffer_) {
+ return
Future<std::shared_ptr<io::RandomAccessFile>>::MakeFinished(std::make_shared<io::BufferReader>(buffer_));
+ }
+
+ return
Future<std::shared_ptr<io::RandomAccessFile>>::MakeFinished(custom_open_());
Review Comment:
Oh, since `custom_open_` might still blocking here?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]