mapleFU commented on code in PR #45330:
URL: https://github.com/apache/arrow/pull/45330#discussion_r1950551324
##########
cpp/src/arrow/dataset/dataset.cc:
##########
@@ -75,16 +76,22 @@ Future<std::optional<int64_t>>
Fragment::CountRows(compute::Expression,
return Future<std::optional<int64_t>>::MakeFinished(std::nullopt);
}
+Status Fragment::ClearCachedMetadata() {
+ physical_schema_.reset();
Review Comment:
Curious: should this also take `physical_schema_mutex_` before resetting `physical_schema_`?
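To make the concern concrete, here is a minimal standalone sketch, assuming `physical_schema_mutex_` is what guards the lazy read of `physical_schema_`. `Schema`, `FragmentSketch` and `load_count` are hypothetical stand-ins for illustration, not the Arrow classes, and it uses `std::mutex` rather than whatever mutex type the real code uses:

```cpp
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for a schema object; not arrow::Schema.
struct Schema {
  std::string description;
};

// Simplified stand-in for a fragment that lazily caches its physical schema.
class FragmentSketch {
 public:
  // Returns the cached schema, "loading" it on first access.
  std::shared_ptr<Schema> ReadPhysicalSchema() {
    std::lock_guard<std::mutex> lock(physical_schema_mutex_);
    if (!physical_schema_) {
      physical_schema_ = std::make_shared<Schema>(Schema{"loaded"});
      ++load_count_;
    }
    return physical_schema_;
  }

  // Drops the cached schema under the same mutex as the lazy read, so a
  // concurrent ReadPhysicalSchema() never observes a half-updated pointer.
  void ClearCachedMetadata() {
    std::lock_guard<std::mutex> lock(physical_schema_mutex_);
    physical_schema_.reset();
  }

  // Number of times the schema was (re)loaded; only here for illustration.
  int load_count() const {
    std::lock_guard<std::mutex> lock(physical_schema_mutex_);
    return load_count_;
  }

 private:
  mutable std::mutex physical_schema_mutex_;
  std::shared_ptr<Schema> physical_schema_;
  int load_count_ = 0;
};
```

Callers that already hold a `std::shared_ptr<Schema>` keep a valid object after the reset; only the cached pointer is cleared, and the next read reloads it.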
##########
cpp/src/arrow/dataset/scanner.h:
##########
@@ -505,6 +522,14 @@ class ARROW_DS_EXPORT ScannerBuilder {
/// ThreadPool found in ScanOptions;
Status UseThreads(bool use_threads = true);
+ /// \brief Indicate if metadata should be cached when scanning
+ ///
+ /// Fragments may typically cache metadata to speed up repeated accesses.
+ /// However, in use cases where a single scan is done, or if memory use
+ /// is more critical than CPU time, setting this option to false can
+ /// lessen memory use.
+ Status CacheMetadata(bool cache_metadata = true);
Review Comment:
Is there a reason to use a default argument here?
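For reference, a small sketch of what the default argument changes at the call site. It mirrors the `UseThreads(bool use_threads = true)` style shown in the hunk above. `ScanOptionsSketch` and `ScannerBuilderSketch` are hypothetical simplifications (returning `void` instead of `Status`), and the initial field values are assumptions:

```cpp
// Hypothetical, simplified options struct; field names and defaults are assumptions.
struct ScanOptionsSketch {
  bool use_threads = false;
  bool cache_metadata = true;
};

// Simplified builder; the real methods in the diff return Status.
class ScannerBuilderSketch {
 public:
  // Same shape as the existing UseThreads declaration in the hunk.
  void UseThreads(bool use_threads = true) { options_.use_threads = use_threads; }

  // With the default, CacheMetadata() and CacheMetadata(true) are equivalent.
  void CacheMetadata(bool cache_metadata = true) {
    options_.cache_metadata = cache_metadata;
  }

  const ScanOptionsSketch& options() const { return options_; }

 private:
  ScanOptionsSketch options_;
};
```

Without the default, every caller would have to spell out `CacheMetadata(true)` or `CacheMetadata(false)`, which arguably reads more clearly for an option whose interesting value is `false`.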