Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]

2025-11-14 Thread via GitHub


conbench-apache-arrow[bot] commented on PR #48082:
URL: https://github.com/apache/arrow/pull/48082#issuecomment-3532920737

   After merging your PR, Conbench analyzed the 4 benchmarking runs that have 
been run so far on merge-commit f5b3fc7065f930aa7cc5a51cb963544e410d5518.
   
   There were no benchmark performance regressions. 🎉
   
   The [full Conbench report](https://github.com/apache/arrow/runs/55411645616) 
has more details. It also includes information about 1 possible false positive 
for unstable benchmarks that are known to sometimes produce them.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]

2025-11-13 Thread via GitHub


lidavidm merged PR #48082:
URL: https://github.com/apache/arrow/pull/48082


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]

2025-11-13 Thread via GitHub


raulcd commented on PR #48082:
URL: https://github.com/apache/arrow/pull/48082#issuecomment-3526864746

   > Right, thank you for the details, so I will wait until that is resolved 
and then do a rebase, so that we have "all green" CI :)
   
   I don't think is strictly required, once @lidavidm can take another look, if 
he's happy we can merge as we know those CI failures are unrelated.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]

2025-11-13 Thread via GitHub


no23reason commented on PR #48082:
URL: https://github.com/apache/arrow/pull/48082#issuecomment-3526826393

   Right, thank you for the details, so I will wait until that is resolved and 
then do a rebase, so that we have "all green" CI :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]

2025-11-13 Thread via GitHub


raulcd commented on PR #48082:
URL: https://github.com/apache/arrow/pull/48082#issuecomment-3526530833

   The CI failures are unrelated see:
   - https://github.com/apache/arrow/issues/47957
   
   We are waiting for a new version of protobuf (33.1) released yesterday to be 
available on homebrew. It is expected to fix the problem.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]

2025-11-13 Thread via GitHub


no23reason commented on PR #48082:
URL: https://github.com/apache/arrow/pull/48082#issuecomment-3526249531

   I am not sure whether I should do a rebase to try fixing the remaining 
failures (they all seem SQL/Substrait related), but I do not want to waste the 
CI resources 🤔 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]

2025-11-11 Thread via GitHub


no23reason commented on PR #48082:
URL: https://github.com/apache/arrow/pull/48082#issuecomment-3517222844

   @lidavidm I added some more tests that use dictionaries and they seem to be 
passing. I ran a debugger on one of them and saw a DICTIONARY_BATCH coming in 
and being returned from `Next` after skipping the `SCHEMA` one.
   
   So Next "returning" a dictionary batch seems to be supported out of the box 
and my change preserves them.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]

2025-11-10 Thread via GitHub


no23reason commented on PR #48082:
URL: https://github.com/apache/arrow/pull/48082#issuecomment-3512998700

   Thank you for guiding me through it! :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]

2025-11-10 Thread via GitHub


no23reason commented on code in PR #48082:
URL: https://github.com/apache/arrow/pull/48082#discussion_r2509372051


##
cpp/src/arrow/flight/server.cc:
##
@@ -317,8 +312,7 @@ class RecordBatchStream::RecordBatchStreamImpl {
 return Status::OK();
   }
   if (!writer_) {
-return Status::UnknownError(
-"Writer should be initialized before reading Next batches");
+RETURN_NOT_OK(InitializeWriter());

Review Comment:
   Thank you for the suggestion, that is a good idea. I will take a look as 
soon as I can, unfortunately something more urgent cropped up I have to deal 
with first :( I will let you know what I find out



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]

2025-11-07 Thread via GitHub


raulcd commented on code in PR #48082:
URL: https://github.com/apache/arrow/pull/48082#discussion_r2504634973


##
cpp/src/arrow/flight/server.cc:
##
@@ -317,8 +312,7 @@ class RecordBatchStream::RecordBatchStreamImpl {
 return Status::OK();
   }
   if (!writer_) {
-return Status::UnknownError(
-"Writer should be initialized before reading Next batches");
+RETURN_NOT_OK(InitializeWriter());

Review Comment:
   I think in this specific scenario we might have to drop the first message 
after calling `writer_->WriteRecordBatch`, which will be a schema message, 
after creation of the `ipc::internal::OpenRecordBatchWriter` and we want to 
keep the rest of the messages which should be the expected `RecordBatch` for 
those cases.
   I would have to debug but from what I understand that might be the cause of 
the current problem.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]

2025-11-07 Thread via GitHub


no23reason commented on PR #48082:
URL: https://github.com/apache/arrow/pull/48082#issuecomment-3503203059

   Unfortunately, there is more at play than I expected and I am not sure how 
to proceed :( but I believe the tests I added should be included in upstream 
and should be passing so that all the possible types backing the 
GeneratorStream are tested.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]

2025-11-07 Thread via GitHub


github-actions[bot] commented on PR #48082:
URL: https://github.com/apache/arrow/pull/48082#issuecomment-3502386496

   :warning: GitHub issue #48076 **has been automatically assigned in GitHub** 
to PR creator.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



[PR] GH-48076: [C++][Flight] fix GeneratorStream for Tables [arrow]

2025-11-07 Thread via GitHub


no23reason opened a new pull request, #48082:
URL: https://github.com/apache/arrow/pull/48082

   ### Rationale for this change
   
   After the changes in #47115, GeneratorStreams backed by anything else than 
RecordBatches failed. This includes Tables and RecordBatchReaders.
   
   This was caused by a too strict assumption that the 
RecordBatchStream#GetSchemaPayload would always get called, which is not the 
case when the GeneratorStream is backed by a Table or a RecordBatchReader.
   
   ### What changes are included in this PR?
   
   Removal of the problematic assertion and initialization of the writer object 
when it is needed first.
   
   ### Are these changes tested?
   
   Yes, via CI. Tests for the GeneratorStreams were extended so that they test 
GeneratorStreams backed by Tables and RecordBatchReaders, not just 
RecordBatches. 
   
   ### Are there any user-facing changes?
   
   No, just a fix for a regression restoring the functionality from version 
21.0.0 and earlier.
   
   * GitHub issue: https://github.com/apache/arrow/issues/48076
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]