pitrou commented on code in PR #13804:
URL: https://github.com/apache/arrow/pull/13804#discussion_r941008851


##########
cpp/src/arrow/dataset/dataset.h:
##########
@@ -134,6 +135,8 @@ class ARROW_DS_EXPORT InMemoryFragment : public Fragment {
 
 /// @}
 
+using FragmentGenerator = AsyncGenerator<std::shared_ptr<Fragment>>;

Review Comment:
   The following avoids having to include `async_generator.h`
   ```suggestion
   using FragmentGenerator = std::function<Future<std::shared_ptr<Fragment>>()>;
   ```



##########
cpp/src/arrow/dataset/dataset.h:
##########
@@ -28,6 +28,7 @@
 #include "arrow/compute/exec/expression.h"
 #include "arrow/dataset/type_fwd.h"
 #include "arrow/dataset/visibility.h"
+#include "arrow/util/async_generator.h"

Review Comment:
   This is a rather heavy include, I'd rather include it only in `.cc` files.



##########
cpp/src/arrow/dataset/dataset.cc:
##########
@@ -160,6 +160,25 @@ Result<FragmentIterator> 
Dataset::GetFragments(compute::Expression predicate) {
                                    : 
MakeEmptyIterator<std::shared_ptr<Fragment>>();
 }
 
+Result<FragmentGenerator> Dataset::GetFragmentsAsync() {
+  return GetFragmentsAsync(compute::literal(true));
+}
+
+Result<FragmentGenerator> Dataset::GetFragmentsAsync(compute::Expression 
predicate) {
+  ARROW_ASSIGN_OR_RAISE(
+      predicate, SimplifyWithGuarantee(std::move(predicate), 
partition_expression_));
+  return predicate.IsSatisfiable() ? 
GetFragmentsAsyncImpl(std::move(predicate))
+                                   : 
MakeEmptyGenerator<std::shared_ptr<Fragment>>();
+}
+
+// Default impl delegating the work to `GetFragmentsImpl` and wrapping it into 
a
+// VectorGenerator
+Result<FragmentGenerator> Dataset::GetFragmentsAsyncImpl(compute::Expression 
predicate) {
+  ARROW_ASSIGN_OR_RAISE(auto iter, GetFragmentsImpl(std::move(predicate)));
+  ARROW_ASSIGN_OR_RAISE(auto vec, iter.ToVector());

Review Comment:
   Do we really want to make this a blocking call and exhaust the iterator 
here? This actually makes the default `GetFragmentsAsync` impl less lazy than 
`GetFragments`, surprisingly.
   
   Perhaps we'd like to use `MakeBackgroundGenerator` instead?
   cc @westonpace 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to