rtpsw commented on code in PR #35953:
URL: https://github.com/apache/arrow/pull/35953#discussion_r1236695821
##########
cpp/src/arrow/acero/plan_test.cc:
##########
@@ -767,6 +767,24 @@ TEST(ExecPlanExecution, DeclarationToReader) {
reader->Next());
}
+TEST(ExecPlanExecution, DeclarationToReaderWithEarlyClose) {
+ auto random_data = MakeRandomBatches(schema({field("a", int8())}),
/*num_batches=*/100);
+ auto plan = Declaration::Sequence(
+ {{"source", SourceNodeOptions(random_data.schema, random_data.gen(false,
false))}});
+ ASSERT_OK_AND_ASSIGN(std::unique_ptr<RecordBatchReader> reader,
+ DeclarationToReader(plan, /*use_threads=*/false));
+
+ // read only a few batches
+ for (size_t i = 0; i < 10; i++) {
+ ASSERT_OK(reader->Next());
+ }
+ // then close
+ EXPECT_RAISES_WITH_MESSAGE_THAT(Cancelled, HasSubstr("Plan was cancelled
early"),
+ reader->Close());
Review Comment:
The currently tested behavior exists pre-PR, so be aware that you're asking
to change it. I don't know what the expectation in this case is because there
is no documentation for `RecordBatchReader::Close()` and `PlanReader::Close()`
that addresses this. There are other C++ reading APIs out there where a close
operation is expected to report errors, possibly via a separate call ([like
`std::ifstream::close()`](https://en.cppreference.com/w/cpp/io/basic_ifstream/close)),
so I suspect people's expectations about this may vary. I'm fine with whatever
behavior is in consensus, and in any case documentation of it should be added.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]