[
https://issues.apache.org/jira/browse/IMPALA-10566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297807#comment-17297807
]
Aman Sinha commented on IMPALA-10566:
-------------------------------------
Here's another simple repro using the Hive JDBC driver and with a much smaller
10ms timeout:
{noformat}
bin/run-jdbc-client.sh -t NOSASL -q "set fetch_rows_timeout_ms=10; select
ps_partkey, ps_suppkey from tpch.partsupp limit 5 union all select l_partkey,
l_suppkey from tpch.lineitem group by l_partkey, l_suppkey limit 10;"
Using JDBC Driver Name: org.apache.hive.jdbc.HiveDriver
Connecting to: jdbc:hive2://localhost:21050/;auth=noSasl
Executing: set fetch_rows_timeout_ms=10
----[START]----
----[END]----
Returned 0 row(s) in 0.058s
Executing: select ps_partkey, ps_suppkey from tpch.partsupp limit 5 union all
select l_partkey, l_suppkey from tpch.lineitem group by l_partkey, l_suppkey
limit 10
----[START]----
1,2
1,2502
1,5002
1,7502
2,3
----[END]----
Returned 5 row(s) in 0.261s
{noformat}
The expected result is 15 rows, as returned below by impala-shell (this is
on Impala 4.0-SNAPSHOT):
{noformat}
impala-shell.sh -q "set fetch_rows_timeout_ms=10; select ps_partkey, ps_suppkey
from tpch.partsupp limit 5 union all select l_partkey, l_suppkey from
tpch.lineitem group by l_partkey, l_suppkey limit 10;"
+------------+------------+
| ps_partkey | ps_suppkey |
+------------+------------+
| 1          | 2          |
| 1          | 2502       |
| 1          | 5002       |
| 1          | 7502       |
| 2          | 3          |
+------------+------------+
+------------+------------+
| ps_partkey | ps_suppkey |
+------------+------------+
| 79665      | 7187       |
| 39273      | 9274       |
| 83035      | 560        |
| 188556     | 1075       |
| 177664     | 5216       |
| 8282       | 8283       |
| 127873     | 2898       |
| 41441      | 1442       |
| 21326      | 1327       |
| 37374      | 4884       |
+------------+------------+
Fetched 15 row(s) in 2.84s
{noformat}
> Change the default value of FETCH_ROWS_TIMEOUT_MS to 0
> ------------------------------------------------------
>
> Key: IMPALA-10566
> URL: https://issues.apache.org/jira/browse/IMPALA-10566
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 3.4.0
> Reporter: Aman Sinha
> Priority: Major
>
> The current default value of FETCH_ROWS_TIMEOUT_MS is 10 seconds. This was
> done in IMPALA-7312 and IMPALA-8962 to introduce a non-blocking fetch() API
> behavior such that clients are not blocked indefinitely. In some cases,
> especially with the supported JDBC/ODBC drivers, this can cause a regression
> by returning either empty or partial results to the client based on the
> following sequence:
> * Client starts to fetch rows
> * Impala is unable to produce rows within 10s, so to avoid blocking the
> client, Impala returns an empty result set with hasMoreRows=true.
> * Query’s state can be either RUNNING or FINISHED
> * Client sees empty result set, ignores hasMoreRows=true
> * Client closes the query thinking it got the whole result set
> This issue was observed in internal testing and is also reported in the
> following JIRA:
> https://issues.apache.org/jira/browse/IMPALA-8962?focusedCommentId=17240474&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17240474
> In contrast, impala-shell works correctly and does not have this partial
> results problem.
> Since this issue impacts various drivers, it is best to change the default
> value of FETCH_ROWS_TIMEOUT_MS to 0 to revert to the blocking behavior.
> Users can opt in to non-blocking fetches by setting a non-zero value. In the
> meantime, driver programs also need to be updated to handle the non-blocking
> fetch behavior.
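
The sequence described in the issue can be illustrated with a small simulation. This is only a sketch under stated assumptions, not code from Impala or from any JDBC/ODBC driver: `FakeServer`, `FetchResult`, `fetch_all_buggy`, and `fetch_all` are hypothetical names, and the empty middle batch stands in for a fetch that hit FETCH_ROWS_TIMEOUT_MS before the second union branch produced rows. The row values are taken from the repro above.

```python
from dataclasses import dataclass
from typing import List, Tuple

Row = Tuple[int, int]


@dataclass
class FetchResult:
    rows: List[Row]
    has_more_rows: bool


class FakeServer:
    """Hypothetical stand-in for a server whose fetch can time out.

    An empty batch in the middle of the sequence models a fetch that
    returned no rows but still reported has_more_rows=True.
    """

    def __init__(self, batches: List[List[Row]]) -> None:
        self._batches = list(batches)
        self._pos = 0

    def fetch(self) -> FetchResult:
        if self._pos >= len(self._batches):
            return FetchResult([], False)
        rows = self._batches[self._pos]
        self._pos += 1
        return FetchResult(list(rows), self._pos < len(self._batches))


# Batches modeled on the repro: 5 rows, a timed-out empty fetch, then 10 rows.
BATCHES: List[List[Row]] = [
    [(1, 2), (1, 2502), (1, 5002), (1, 7502), (2, 3)],
    [],
    [(79665, 7187), (39273, 9274), (83035, 560), (188556, 1075),
     (177664, 5216), (8282, 8283), (127873, 2898), (41441, 1442),
     (21326, 1327), (37374, 4884)],
]


def fetch_all_buggy(server: FakeServer) -> List[Row]:
    """Stops at the first empty batch and ignores has_more_rows."""
    out: List[Row] = []
    while True:
        result = server.fetch()
        if not result.rows:
            break
        out.extend(result.rows)
    return out


def fetch_all(server: FakeServer) -> List[Row]:
    """Keeps fetching until the server reports has_more_rows=False."""
    out: List[Row] = []
    while True:
        result = server.fetch()
        out.extend(result.rows)
        if not result.has_more_rows:
            break
    return out


if __name__ == "__main__":
    print(len(fetch_all_buggy(FakeServer(BATCHES))))
    print(len(fetch_all(FakeServer(BATCHES))))
```

In this simulation the first loop stops after the 5 rows of the first batch, while the second loop collects all 15 rows because it treats an empty batch with has_more_rows=True as "try again" rather than "end of results".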