cgivre commented on a change in pull request #2189:
URL: https://github.com/apache/drill/pull/2189#discussion_r593894152
##########
File path:
exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/stream/StreamingHttpConnection.java
##########
@@ -104,11 +104,9 @@ public void outputAvailable(OutputStream out) throws
IOException {
public void sendData(RpcOutcomeListener<Ack> listener, QueryDataPackage
data) {
VectorContainer batch = data.batch();
try {
- if (batchCount == 0) {
- batchHolder = new BatchHolder(batch);
- reader = new PushResultSetReaderImpl(batchHolder);
- startSignal.await();
- }
+ batchHolder = new BatchHolder(batch);
+ reader = new PushResultSetReaderImpl(batchHolder);
+ startSignal.await();
Review comment:
@paul-rogers
Thanks for taking a look at this. When I was stepping through this (full
disclosure, I got my second COVID shot on Friday and was really groggy
yesterday), I noticed that the issue seemed to be on this line:
https://github.com/apache/drill/blob/4e514c214091b21c334f0d3b2c14c1efed381850/exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/stream/StreamingHttpConnection.java#L112
Let's say that you have a query that returns `n` batches `b1...bn`. As it
was originally implemented, the first batch would emit the headers with an
empty array for `rows`. So far so good.
When you hit the next batch (`b2`), nothing was ever being done with `b2`
and most importantly, `b1` was being sent again to `emitBatch()`. If you
notice, we create a new `VectorContainer` from the incoming data, but if those
lines are skipped, nothing happens with that `VectorContainer`. The end
result is that `b1` gets emitted `n` times. Since `b1` has no rows, you get an
empty dataset.
Like I said, I was pretty groggy yesterday, so I'm only about 78.346% sure
that's what's happening.
If the concern is an OOM situation, one option might be to change the scope
of the `reader` and `batchHolder` variables so that they are local to that
function. They don't seem to be used outside of the `sendData` function.
Tagging @Daddykong for additional sanity check.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]