lidavidm commented on code in PR #13782:
URL: https://github.com/apache/arrow/pull/13782#discussion_r937908586


##########
cpp/src/arrow/dataset/dataset.h:
##########
@@ -59,6 +156,17 @@ class ARROW_DS_EXPORT Fragment : public std::enable_shared_from_this<Fragment> {
   virtual Result<RecordBatchGenerator> ScanBatchesAsync(
       const std::shared_ptr<ScanOptions>& options) = 0;
 
+  /// \brief Inspect a fragment to learn basic information

Review Comment:
   I don't see a problem with splitting scanning into these steps, FWIW. Down 
the line, I think it would also let us support schemes where metadata 
(footers) is cached in an index separate from the actual data, which would 
let us skip a synchronous I/O step.
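
To make the idea concrete, here is a rough sketch of how a separate inspect step could be served from an external index. Every name in it (`ToyFragment`, `InspectedMetadata`, `MetadataIndex`, `InspectWithIndex`) is hypothetical and is not part of the Arrow dataset API or of this PR; it only illustrates the shape of the scheme.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for what an inspect step returns (e.g. a parsed footer).
struct InspectedMetadata {
  std::size_t num_rows = 0;
};

// Hypothetical fragment that counts how often its footer is read from storage.
struct ToyFragment {
  std::string path;
  std::size_t rows = 0;
  int footer_reads = 0;

  // Inspect step: in a real system this would be the footer-reading I/O.
  InspectedMetadata Inspect() {
    ++footer_reads;
    return InspectedMetadata{rows};
  }
};

// Hypothetical index that stores metadata separately from the data files.
class MetadataIndex {
 public:
  void Put(const std::string& path, const InspectedMetadata& md) {
    cache_.insert_or_assign(path, md);
  }

  std::optional<InspectedMetadata> Get(const std::string& path) const {
    auto it = cache_.find(path);
    if (it == cache_.end()) return std::nullopt;
    return it->second;
  }

 private:
  std::map<std::string, InspectedMetadata> cache_;
};

// Consult the index first; fall back to reading the footer and remember it.
InspectedMetadata InspectWithIndex(ToyFragment& fragment, MetadataIndex& index) {
  if (auto cached = index.Get(fragment.path)) return *cached;
  InspectedMetadata md = fragment.Inspect();
  index.Put(fragment.path, md);
  return md;
}
```

Because inspection is its own step, the scan that follows can start from already-known metadata, and a second inspection of the same fragment never touches the file.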


