ManManson commented on code in PR #13804:
URL: https://github.com/apache/arrow/pull/13804#discussion_r953093283


##########
cpp/src/arrow/dataset/dataset.h:
##########
@@ -174,9 +186,24 @@ class ARROW_DS_EXPORT Dataset : public std::enable_shared_from_this<Dataset> {
   Dataset(std::shared_ptr<Schema> schema, compute::Expression partition_expression);
 
   virtual Result<FragmentIterator> GetFragmentsImpl(compute::Expression predicate) = 0;
+  virtual Result<FragmentGenerator> GetFragmentsAsyncImpl(compute::Expression predicate);
 
   std::shared_ptr<Schema> schema_;
   compute::Expression partition_expression_ = compute::literal(true);
+
+ private:

Review Comment:
   I didn't want to introduce an additional argument to the virtual
`GetFragmentsAsyncImpl()`, so I moved the parameterized variant of the original
function into a separate function, `GetFragmentsAsyncImplBase`.
   
   Since this API should be considered experimental, I think we can also change 
it later if needed.
   
   The reason I didn't make a `virtual GetFragmentsAsyncImpl(..., Executor* =
GetCPUThreadPool())` is that default arguments on virtual functions can cause
confusion and possibly more serious problems. See this link for some common
gotchas: http://www.gotw.ca/gotw/005.htm.
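   
   A minimal sketch of one such gotcha, using hypothetical `Base`/`Derived`
types and string "executor" names rather than any Arrow classes: a default
argument is picked from the static type at the call site, while the function
body is picked from the dynamic type, so the two can disagree.

```cpp
#include <string>

// Hypothetical stand-ins for illustration only; these are not Arrow types.
struct Base {
  virtual ~Base() = default;
  // The default argument is bound at compile time from the static type.
  virtual std::string Describe(std::string executor = "cpu-pool") const {
    return "Base:" + executor;
  }
};

struct Derived : Base {
  // The overrider declares a different default argument.
  std::string Describe(std::string executor = "io-pool") const override {
    return "Derived:" + executor;
  }
};

// Calls through a Base reference: the body comes from the dynamic type,
// but the default argument comes from Base.
inline std::string DescribeViaBase(const Base& b) { return b.Describe(); }
```

   Called through a `Base` reference, a `Derived` object runs `Derived`'s body
with `Base`'s default, which is easy to miss when reading the derived class
alone.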
   
   I don't see any reason for `GetFragmentsAsyncImplBase` to remain `protected`,
since it is intended as an implementation detail of the default base
implementation. Derived classes will provide a completely separate
implementation of `GetFragmentsAsyncImpl`.
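   
   Roughly, the shape described here could be sketched as follows. This is a
simplified illustration, not the code in the PR: `Executor`, `DefaultExecutor()`,
`CustomDataset`, and the string return values are hypothetical stand-ins, and
the body of `GetFragmentsAsyncImplBase` is assumed.

```cpp
#include <string>

// Hypothetical stand-in for an executor type; not the Arrow class.
struct Executor {
  std::string name;
};

// Hypothetical stand-in for a "default thread pool" accessor.
inline Executor* DefaultExecutor() {
  static Executor pool{"cpu-pool"};
  return &pool;
}

class Dataset {
 public:
  virtual ~Dataset() = default;

  // Public entry point forwards to the virtual hook.
  std::string GetFragmentsAsync(const std::string& predicate) {
    return GetFragmentsAsyncImpl(predicate);
  }

 protected:
  // Virtual hook without an executor parameter, so there is no default
  // argument that an overrider could accidentally change.
  virtual std::string GetFragmentsAsyncImpl(const std::string& predicate) {
    return GetFragmentsAsyncImplBase(predicate, DefaultExecutor());
  }

 private:
  // Non-virtual, private helper used only by the default implementation.
  std::string GetFragmentsAsyncImplBase(const std::string& predicate,
                                        Executor* executor) {
    return "default(" + predicate + ")@" + executor->name;
  }
};

// A derived class replaces the hook entirely and never sees the helper.
class CustomDataset : public Dataset {
 protected:
  std::string GetFragmentsAsyncImpl(const std::string& predicate) override {
    return "custom(" + predicate + ")";
  }
};
```

   With the helper `private`, a derived class cannot call or depend on it, which
matches the intent that overriders supply their own implementation.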


