ManManson commented on code in PR #13804:
URL: https://github.com/apache/arrow/pull/13804#discussion_r953093283
##########
cpp/src/arrow/dataset/dataset.h:
##########
@@ -174,9 +186,24 @@ class ARROW_DS_EXPORT Dataset : public std::enable_shared_from_this<Dataset> {
Dataset(std::shared_ptr<Schema> schema, compute::Expression partition_expression);
virtual Result<FragmentIterator> GetFragmentsImpl(compute::Expression predicate) = 0;
+ virtual Result<FragmentGenerator> GetFragmentsAsyncImpl(compute::Expression predicate);
std::shared_ptr<Schema> schema_;
compute::Expression partition_expression_ = compute::literal(true);
+
+ private:
Review Comment:
I didn't want to introduce an additional argument to the virtual
`GetFragmentsAsyncImpl()`, so I moved the parameterized variant of the initial
function into a separate one, `GetFragmentsAsyncImplBase`.
Since this API should be considered experimental, I think we can still change
it later if needed.
The reason I didn't make it a `virtual GetFragmentsAsyncImpl(..., Executor* =
GetCPUThreadPool())` is that default arguments on virtual functions can cause
confusion and probably more serious trouble: the default value is picked from
the static type at the call site, not from the override that actually runs.
Please see the link for some common gotchas: http://www.gotw.ca/gotw/005.htm.
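To make the gotcha concrete, here is a minimal sketch of how a default argument on a virtual function can surprise callers. The `Base`, `Derived`, `Describe`, `CallThroughBase`, and `RunDemo` names are hypothetical and only for illustration; none of them come from Arrow.

```cpp
#include <cassert>
#include <string>

// Hypothetical types for illustration only; not part of Arrow.
struct Base {
  virtual ~Base() = default;
  // Default argument declared on the base class virtual.
  virtual std::string Describe(int batch_size = 1) const {
    return "Base:" + std::to_string(batch_size);
  }
};

struct Derived : Base {
  // The override declares a *different* default argument.
  std::string Describe(int batch_size = 64) const override {
    return "Derived:" + std::to_string(batch_size);
  }
};

// The call below dispatches dynamically to Derived::Describe, but the
// default argument is taken from the static type (Base), so it is 1.
inline std::string CallThroughBase(const Base& b) { return b.Describe(); }

inline void RunDemo() {
  Derived d;
  // Dynamic dispatch picks Derived's body, static type picks Base's default.
  assert(CallThroughBase(d) == "Derived:1");
  // Called through the derived type, Derived's own default is used.
  assert(d.Describe() == "Derived:64");
}
```

The same object therefore behaves differently depending on the type of the reference it is called through, which is the kind of confusion a defaulted `Executor*` on the virtual could invite.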
I don't see any reason for `GetFragmentsAsyncImplBase` to remain `protected`,
since it's intended to serve as an implementation detail of the default base
implementation. Derived classes will provide a completely separate
implementation of `GetFragmentsAsyncImpl`.
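The layout described above could look roughly like the following sketch. It uses simplified, hypothetical stand-ins (`Executor`, `DefaultExecutor`, `CustomDataset`, and string "fragments") instead of Arrow's real `Result<FragmentGenerator>`, `compute::Expression`, and `GetCPUThreadPool()`, so it only illustrates the shape of the design and is not the code in the PR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an executor type; not Arrow's real class.
struct Executor {
  std::string name;
};

// Hypothetical stand-in for a process-wide default executor.
inline Executor* DefaultExecutor() {
  static Executor cpu{"cpu"};
  return &cpu;
}

class Dataset {
 public:
  virtual ~Dataset() = default;

  // Public entry point forwards to the virtual hook.
  std::vector<std::string> GetFragmentsAsync() { return GetFragmentsAsyncImpl(); }

 protected:
  // The virtual hook takes no executor, so there is no default argument
  // that an override could silently disagree with.
  virtual std::vector<std::string> GetFragmentsAsyncImpl() {
    return GetFragmentsAsyncImplBase(DefaultExecutor());
  }

 private:
  // Parameterized helper used only by the default implementation above.
  // Being private, it is invisible to derived classes.
  std::vector<std::string> GetFragmentsAsyncImplBase(Executor* executor) {
    return {"default-on-" + executor->name};
  }
};

// A derived class supplies its own, completely separate implementation.
class CustomDataset : public Dataset {
 protected:
  std::vector<std::string> GetFragmentsAsyncImpl() override { return {"custom"}; }
};
```

With this shape, a plain `Dataset` goes through the private helper with the default executor, while `CustomDataset` never touches the helper at all.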