[ 
https://issues.apache.org/jira/browse/DRILL-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537461#comment-14537461
 ] 

Daniel Barclay (Drill) edited comment on DRILL-2993 at 5/11/15 1:14 AM:
------------------------------------------------------------------------

> We should be consuming all messages already.

We were, except when cancelation happened while throttling was on 
(sending was suspended):

If the JDBC driver layer's batch queue gets full enough that the client tells 
the server to start throttling (stop sending messages), and the query is then 
canceled via Statement.cancel(), the JDBC layer does set itself up to consume 
and discard both the queued batch objects and any further received messages 
(releasing their buffers).

However, the code that normally stops throttling (i.e., has the server resume 
sending messages) when the batch queue shrinks enough (as batches are consumed 
off the queue) no longer executes after cancelation, because other code takes 
over dequeuing to consume and discard the still-queued batches. So (without a 
fix) the server was never told to resume sending, and the remaining messages 
were never drained from the server's outgoing message queue.
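The stuck state described above can be sketched as a tiny simulation. This is a language-neutral sketch in Python, not Drill's actual Java code; the class, method names, and thresholds (`THROTTLE_ON`, `THROTTLE_OFF`, `Client`, etc.) are all invented for illustration:

```python
# Sketch: why cancelation left throttling stuck on. The normal dequeue path
# clears throttling when the queue drains, but the cancel-path drain (pre-fix)
# skipped that step, so the server was never told to resume sending.

from collections import deque

THROTTLE_ON = 10   # queue size at which the client asks the server to pause
THROTTLE_OFF = 4   # queue size at which the client lets the server resume

class Client:
    def __init__(self):
        self.queue = deque()
        self.throttled = False

    def receive(self, batch):
        """Message-receiving thread: enqueue and maybe start throttling."""
        self.queue.append(batch)
        if len(self.queue) >= THROTTLE_ON:
            self.throttled = True          # tell server: stop sending

    def consume_one(self):
        """Normal queue-reading path: dequeue and maybe stop throttling."""
        self.queue.popleft()
        if self.throttled and len(self.queue) <= THROTTLE_OFF:
            self.throttled = False         # tell server: resume sending

    def drain_after_cancel_buggy(self):
        """Cancel-path drain (pre-fix): discards queued batches but never
        clears throttling, so the server stays paused forever."""
        while self.queue:
            self.queue.popleft()

client = Client()
for i in range(10):                # queue fills; throttling kicks in
    client.receive(i)
client.drain_after_cancel_buggy()
print(client.throttled)            # True: server was never told to resume
```

The point of the sketch is only that `drain_after_cancel_buggy` empties the queue through a path that never reaches the throttle-clearing check in `consume_one`.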


> Why would we need to change throttling to consume messages? 

Because we left throttling on (sending suspended) in the cancelation case and 
therefore didn't get further messages that the server was waiting to send.

> Throttling doesn't kick in unless we're not consuming messages. We should fix 
> that instead.

Ah - maybe we're miscommunicating a bit about (RPC-level) messages vs. queued 
batches, or about server-side vs. client-side consumption of messages, or 
about initial vs. partial consumption.

Yes, throttling kicks in when we're not _fully_ consuming message data: when 
the queue-reading thread isn't consuming the queued batch objects as fast as 
the message-receiving/queue-writing thread is receiving messages, releasing 
part of each message's buffer (_partially_ consuming it), and adding each 
message (or its non-released part) to the batch queue, for long enough that 
the queue grows big enough to trigger throttling.

Throttling normally kicks out when the queue-reading thread reads enough 
batches for the queue to shrink enough.

However, upon cancelation, if throttling was active it never kicked out: 
dequeuing batches just to consume and discard them (vs. consuming them by 
building JDBC data) is done by different code, which didn't stop any active 
throttling.  Therefore, although the post-cancelation code was otherwise ready 
to receive and minimally consume (release buffers in) any not-yet-received 
messages, they were never sent by the server.

Forcing throttling to kick out upon cancelation releases those messages from 
the server.
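Continuing the same invented-name sketch (Python for illustration, not Drill's actual Java code), the fix amounts to the cancel-path drain explicitly turning throttling off before discarding the queued batches:

```python
# Sketch of the fix: on cancelation, explicitly kick throttling out so the
# server resumes sending, then consume-and-discard as before. Names invented.

from collections import deque

class Client:
    def __init__(self):
        self.queue = deque(range(10))  # still-queued batches at cancel time
        self.throttled = True          # cancelation arrived while throttled

    def drain_after_cancel_fixed(self):
        if self.throttled:
            self.throttled = False     # tell server: resume sending
        while self.queue:
            self.queue.popleft()       # consume and discard queued batches

c = Client()
c.drain_after_cancel_fixed()
print(c.throttled, len(c.queue))       # False 0: server released, queue empty
```

With the throttle cleared, the server drains its outgoing queue and the client's post-cancelation discard path releases the buffers of the remaining messages as they arrive.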


> The server should only send messages as fast as the client can consume them 
> so we don't get memory growth.

Yes, it did and does (both before and with the fix) work that way. 







> SQLLine hangs when we cancel a query in the middle of displaying results
> ------------------------------------------------------------------------
>
>                 Key: DRILL-2993
>                 URL: https://issues.apache.org/jira/browse/DRILL-2993
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>            Reporter: Rahul Challapalli
>            Assignee: Daniel Barclay (Drill)
>         Attachments: DRILL-2993.1.patch.txt
>
>
> git.commit.id.abbrev=8c706e6
> The data set contains 1 million records. I cancelled the below query after 
> displaying around ~400,000 records. All subsequent queries fail
> {code}
> 0: jdbc:drill:schema=dfs_eea>select * from `mobile.json`;
> .....
> .....
> Cancel after displaying 400,000 records
> 0: jdbc:drill:schema=dfs_eea>use dfs.drillTestDir;
> ....
> This hangs indefinitely
> {code}
> Since the data is large, I did not attach it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
