[ https://issues.apache.org/jira/browse/IMPALA-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947143#comment-16947143 ]

ASF subversion and git services commented on IMPALA-7312:
---------------------------------------------------------

Commit c47fca5960b5be1a8e2013c4c4ffe260e98a1bff in impala's branch 
refs/heads/master from stakiar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c47fca5 ]

IMPALA-8962: FETCH_ROWS_TIMEOUT_MS should apply before rows are available

IMPALA-7312 added the query option FETCH_ROWS_TIMEOUT_MS, but it only
applies to fetch requests against a query that has already transitioned
to the 'FINISHED' state. This patch changes the timeout so that it
applies to queries in the 'RUNNING' state as well. Before this patch,
fetch requests issued while a query was 'RUNNING' blocked until the query
transitioned to the 'FINISHED' state, and only then fetched and returned
results. After this patch, fetch requests against queries in the
'RUNNING' state block for at most FETCH_ROWS_TIMEOUT_MS and then return.

For HS2 clients, fetch requests that return while a query is 'RUNNING'
set their TStatusCode to STILL_EXECUTING_STATUS. For Beeswax clients,
fetch requests that return while a query is 'RUNNING' set the 'ready'
flag to false. For both clients, hasMoreRows is set to true.
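
From the client's point of view, a fetch that times out on a 'RUNNING'
query is a signal to retry, not an error. The sketch below simulates
that loop. It is not Impala's client API: FetchResult, fetch_all and
make_fake_fetch are hypothetical names, and the status strings are
stand-ins for the TStatusCode values mentioned above.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-ins for the HS2 status codes; not real client constants.
STILL_EXECUTING_STATUS = "STILL_EXECUTING_STATUS"
SUCCESS_STATUS = "SUCCESS_STATUS"


@dataclass
class FetchResult:
    """Hypothetical shape of a fetch response."""
    status: str
    rows: List[tuple] = field(default_factory=list)
    has_more_rows: bool = True


def fetch_all(fetch: Callable[[], FetchResult], max_attempts: int = 100) -> List[tuple]:
    """Call fetch() until the result set is exhausted, retrying on timeouts."""
    rows: List[tuple] = []
    for _ in range(max_attempts):
        result = fetch()
        if result.status == STILL_EXECUTING_STATUS:
            # The fetch timed out while the query was still running:
            # no rows were returned, so issue another fetch.
            continue
        rows.extend(result.rows)
        if not result.has_more_rows:
            return rows
    raise TimeoutError("query did not finish within max_attempts fetches")


def make_fake_fetch(running_polls: int, batches: List[List[tuple]]) -> Callable[[], FetchResult]:
    """Simulated server: reports 'still executing' running_polls times, then serves batches."""
    state = {"polls": 0, "next": 0}

    def fetch() -> FetchResult:
        if state["polls"] < running_polls:
            state["polls"] += 1
            return FetchResult(STILL_EXECUTING_STATUS, [], True)
        batch = batches[state["next"]]
        state["next"] += 1
        return FetchResult(SUCCESS_STATUS, batch, state["next"] < len(batches))

    return fetch


rows = fetch_all(make_fake_fetch(running_polls=2, batches=[[(1,)], [(2,), (3,)]]))
```

Bounding the number of attempts is a design choice of this sketch; a
real client might prefer an overall deadline.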

If the following sequence of events occurs:
* A fetch request is issued and blocks on a 'RUNNING' query
* The query transitions to the 'FINISHED' state
* The fetch request attempts to read multiple batches
Then the time spent waiting for the query to finish is deducted from
the timeout used when waiting for rows to be produced by the Coordinator
fragment.
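
The deduction described above amounts to simple budget arithmetic. A
minimal sketch, assuming a hypothetical helper remaining_timeout_ms
(not a function from the patch) and assuming the remaining budget is
clamped at zero:

```python
def remaining_timeout_ms(fetch_rows_timeout_ms: int, waited_for_finish_ms: int) -> int:
    """Budget left for waiting on rows after waiting for the query to finish.

    Clamping at zero is an assumption of this sketch; the commit message
    does not say how an exhausted budget is handled.
    """
    return max(0, fetch_rows_timeout_ms - waited_for_finish_ms)


# Example: with a 10 s timeout, a fetch that blocked for 4 s on a
# 'RUNNING' query has 6 s left to wait for rows from the Coordinator.
left_ms = remaining_timeout_ms(10_000, 4_000)
```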

Fixed a bug in the current usage of FETCH_ROWS_TIMEOUT_MS where the
time units for FETCH_ROWS_TIMEOUT_MS and MonotonicStopWatch were not
being converted properly.
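
To illustrate this class of bug: the commit message does not say which
units were involved, so the sketch below assumes the stopwatch reports
elapsed time in nanoseconds while the option is given in milliseconds.
The helper names are hypothetical, and Python's time.monotonic_ns stands
in for the C++ stopwatch.

```python
import time

NANOS_PER_MILLI = 1_000_000


def ms_to_ns(timeout_ms: int) -> int:
    """Convert a millisecond timeout to nanoseconds."""
    return timeout_ms * NANOS_PER_MILLI


def timeout_expired(elapsed_ns: int, timeout_ms: int) -> bool:
    """Compare in a single unit (nanoseconds) instead of mixing ns and ms."""
    return elapsed_ns >= ms_to_ns(timeout_ms)


# Measure a short interval with a monotonic nanosecond clock and check it
# against a 10 s timeout.
start_ns = time.monotonic_ns()
elapsed_ns = time.monotonic_ns() - start_ns
expired = timeout_expired(elapsed_ns, timeout_ms=10_000)
```

Comparing elapsed_ns directly against timeout_ms would treat a 10 ms
timeout as expired after only 10 ns, which is the kind of mismatch a
missing conversion produces.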

Tests:
* Moved existing fetch timeout tests from hs2/test_fetch.py into a new
test file hs2/test_fetch_timeout.py.
* Added several new tests to hs2/test_fetch_timeout.py to validate that
the timeout is applied to 'RUNNING' queries and that a single timeout
spans a query's transition from 'RUNNING' to 'FINISHED'.
* Added new tests to query_test/test_fetch.py to validate the timeout
while using the Beeswax protocol.

Change-Id: I2cba6bf062dcc1af19471d21857caa797c1ea4a4
Reviewed-on: http://gerrit.cloudera.org:8080/14332
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Non-blocking mode for Fetch() RPC
> ---------------------------------
>
>                 Key: IMPALA-7312
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7312
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Clients
>            Reporter: Tim Armstrong
>            Assignee: Sahil Takiar
>            Priority: Major
>              Labels: resource-management
>             Fix For: Impala 3.4.0
>
>
> Currently Fetch() can block for an arbitrary amount of time until a batch of 
> rows is produced. It might be helpful to have a mode where it returns quickly 
> when there is no data available, so that threads and RPC slots are not tied 
> up.



