Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]
conbench-apache-arrow[bot] commented on PR #48082: URL: https://github.com/apache/arrow/pull/48082#issuecomment-3532920737 After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit f5b3fc7065f930aa7cc5a51cb963544e410d5518. There were no benchmark performance regressions. 🎉 The [full Conbench report](https://github.com/apache/arrow/runs/55411645616) has more details. It also includes information about 1 possible false positive for unstable benchmarks that are known to sometimes produce them. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]
lidavidm merged PR #48082: URL: https://github.com/apache/arrow/pull/48082 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]
raulcd commented on PR #48082: URL: https://github.com/apache/arrow/pull/48082#issuecomment-3526864746 > Right, thank you for the details, so I will wait until that is resolved and then do a rebase, so that we have "all green" CI :) I don't think is strictly required, once @lidavidm can take another look, if he's happy we can merge as we know those CI failures are unrelated. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]
no23reason commented on PR #48082: URL: https://github.com/apache/arrow/pull/48082#issuecomment-3526826393 Right, thank you for the details, so I will wait until that is resolved and then do a rebase, so that we have "all green" CI :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]
raulcd commented on PR #48082: URL: https://github.com/apache/arrow/pull/48082#issuecomment-3526530833 The CI failures are unrelated see: - https://github.com/apache/arrow/issues/47957 We are waiting for a new version of protobuf (33.1) released yesterday to be available on homebrew. It is expected to fix the problem. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]
no23reason commented on PR #48082: URL: https://github.com/apache/arrow/pull/48082#issuecomment-3526249531 I am not sure whether I should do a rebase to try fixing the remaining failures (they all seem SQL/Substrait related), but I do not want to waste the CI resources 🤔 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]
no23reason commented on PR #48082: URL: https://github.com/apache/arrow/pull/48082#issuecomment-3517222844 @lidavidm I added some more tests that use dictionaries and they seem to be passing. I ran a debugger on one of them and saw a DICTIONARY_BATCH coming in and being returned from `Next` after skipping the `SCHEMA` one. So Next "returning" a dictionary batch seems to be supported out of the box and my change preserves them. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]
no23reason commented on PR #48082: URL: https://github.com/apache/arrow/pull/48082#issuecomment-3512998700 Thank you for guiding me through it! :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]
no23reason commented on code in PR #48082:
URL: https://github.com/apache/arrow/pull/48082#discussion_r2509372051
##
cpp/src/arrow/flight/server.cc:
##
@@ -317,8 +312,7 @@ class RecordBatchStream::RecordBatchStreamImpl {
return Status::OK();
}
if (!writer_) {
-return Status::UnknownError(
-"Writer should be initialized before reading Next batches");
+RETURN_NOT_OK(InitializeWriter());
Review Comment:
Thank you for the suggestion, that is a good idea. I will take a look as
soon as I can, unfortunately something more urgent cropped up I have to deal
with first :( I will let you know what I find out
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]
raulcd commented on code in PR #48082:
URL: https://github.com/apache/arrow/pull/48082#discussion_r2504634973
##
cpp/src/arrow/flight/server.cc:
##
@@ -317,8 +312,7 @@ class RecordBatchStream::RecordBatchStreamImpl {
return Status::OK();
}
if (!writer_) {
-return Status::UnknownError(
-"Writer should be initialized before reading Next batches");
+RETURN_NOT_OK(InitializeWriter());
Review Comment:
I think in this specific scenario we might have to drop the first message
after calling `writer_->WriteRecordBatch`, which will be a schema message,
after creation of the `ipc::internal::OpenRecordBatchWriter` and we want to
keep the rest of the messages which should be the expected `RecordBatch` for
those cases.
I would have to debug but from what I understand that might be the cause of
the current problem.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]
no23reason commented on PR #48082: URL: https://github.com/apache/arrow/pull/48082#issuecomment-3503203059 Unfortunately, there is more at play than I expected and I am not sure how to proceed :( but I believe the tests I added should be included in upstream and should be passing so that all the possible types backing the GeneratorStream are tested. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]
github-actions[bot] commented on PR #48082: URL: https://github.com/apache/arrow/pull/48082#issuecomment-3502386496 :warning: GitHub issue #48076 **has been automatically assigned in GitHub** to PR creator. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
[PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]
no23reason opened a new pull request, #48082: URL: https://github.com/apache/arrow/pull/48082 ### Rationale for this change After the changes in #47115, GeneratorStreams backed by anything else than RecordBatches failed. This includes Tables and RecordBatchReaders. This was caused by a too strict assumption that the RecordBatchStream#GetSchemaPayload would always get called, which is not the case when the GeneratorStream is backed by a Table or a RecordBatchReader. ### What changes are included in this PR? Removal of the problematic assertion and initialization of the writer object when it is needed first. ### Are these changes tested? Yes, via CI. Tests for the GeneratorStreams were extended so that they test GeneratorStreams backed by Tables and RecordBatchReaders, not just RecordBatches. ### Are there any user-facing changes? No, just a fix for a regression restoring the functionality from version 21.0.0 and earlier. * GitHub issue: https://github.com/apache/arrow/issues/48076 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
