Hi Jim and Fawze, thank you both for your answers!

> Did you try to add idle_session_timeout=3000


Thanks Fawze, yes, I've played with it. The thing is, I wanted to reuse
such a connection for other queries in a connection pool. But if the
client keeps sending queries on it, the query that is "waiting to be
closed" will still not be closed on the impalad side.

Here is what I mean, e.g. with idle_session_timeout=120 (2 minutes):
- at 00:00 a connection is created, a heavy query starts, fails, and
stays in "waiting to be closed", consuming resources. Since the idle
session timeout is 2 minutes, the server will only close it after 2
minutes of inactivity.
- at 00:02 another query arrives on the same connection/session and
finishes successfully, so the session's timeout timer restarts and we
have to wait another 2 minutes (the first heavy query is still in
"waiting to be closed").
- and so on, until there are finally 2 minutes without queries and the
server closes the session. If somebody then tries to reuse the connection
on the client side, they get:
Exception in thread "main" java.sql.SQLException:
[Simba][ImpalaJDBCDriver](500051)
ERROR processing query/statement. Error Code: 0, SQL state:
TStatus(statusCode:ERROR_STATUS, sqlState:HY000, errorMessage:Client
session expired due to more than 15s of inactivity (last activity was at:
2018-01-15 20:02:41)
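
To make the timer behaviour above concrete, here is a tiny model of it. This is only a sketch of how I understand the idle timer, not Impala's actual implementation; the class and method names are made up, and times are seconds since the session was opened.

```java
import java.util.Arrays;
import java.util.List;

final class IdleTimeoutModel {
    /**
     * Returns the second at which the session would expire.
     * Assumption: every activity restarts the idle timer, and
     * activityTimes is sorted in ascending order.
     */
    static long expiryTime(List<Long> activityTimes, long idleTimeoutSeconds) {
        long lastActivity = 0; // session opened at t = 0
        for (long t : activityTimes) {
            if (t - lastActivity > idleTimeoutSeconds) {
                break; // the session had already expired before this activity
            }
            lastActivity = t; // activity restarts the idle timer
        }
        return lastActivity + idleTimeoutSeconds;
    }

    public static void main(String[] args) {
        // Queries at 0 s, 110 s and 220 s with a 120 s idle timeout:
        // the failed query from t = 0 is only cleaned up at t = 340.
        System.out.println(expiryTime(Arrays.asList(0L, 110L, 220L), 120)); // prints 340
    }
}
```

So as long as queries keep arriving within the timeout, the expiry keeps moving forward.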

I just wanted a more graceful solution for that. With
idle_session_timeout we have to close the connection anyway, since it will
expire at some point, or implement some simple reconnection logic. We tried
java.sql.Connection#isValid and it seems to work fine, but it's a bit of a
manual approach : )
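
Roughly, the reconnection logic around java.sql.Connection#isValid looks like the sketch below. ReconnectingHolder and ConnectionFactory are hypothetical names for illustration; the factory would wrap something like DriverManager.getConnection with your own URL. It also assumes the driver's isValid really detects an expired session, which is what we saw in our tests but I can't guarantee in general.

```java
import java.sql.Connection;
import java.sql.SQLException;

final class ReconnectingHolder {
    /** Hypothetical factory, e.g. wrapping DriverManager.getConnection(url). */
    interface ConnectionFactory {
        Connection open() throws SQLException;
    }

    private final ConnectionFactory factory;
    private final int validationTimeoutSeconds;
    private Connection current;

    ReconnectingHolder(ConnectionFactory factory, int validationTimeoutSeconds) {
        this.factory = factory;
        this.validationTimeoutSeconds = validationTimeoutSeconds;
    }

    /** Returns a usable connection, reopening it if the old one is no longer valid. */
    synchronized Connection get() throws SQLException {
        if (current == null || !current.isValid(validationTimeoutSeconds)) {
            closeQuietly(current);
            current = factory.open();
        }
        return current;
    }

    private static void closeQuietly(Connection c) {
        if (c == null) {
            return;
        }
        try {
            c.close(); // best effort: the session may already be gone on the server
        } catch (SQLException ignored) {
            // nothing useful to do here, we are replacing the connection anyway
        }
    }
}
```

The extra isValid round trip on every get() is the price for not hitting the "session expired" error in the middle of real work.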

What we have found, and are using for now, is opening and closing a
connection for every reasonable "select" query. With the Impala JDBC driver
this is quite fast compared with sharing one connection, since the driver
has some internal synchronization that slows down parallel execution on
the same connection (at least in our tests we saw a lot of threads blocked
internally). With open/close we don't really worry about stale
connections/sessions/queries, since once the connection is closed there
should not be any stale ones left, we hope..
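
The open/close approach is basically the sketch below. PerQueryConnection and ConnectionFactory are hypothetical names; the factory stands in for DriverManager.getConnection with a real URL, and reading only the first column is just to keep the example short.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

final class PerQueryConnection {
    /** Hypothetical factory, e.g. wrapping DriverManager.getConnection(url). */
    interface ConnectionFactory {
        Connection open() throws SQLException;
    }

    /** Opens a connection, runs one query with a timeout, and always closes everything. */
    static List<String> firstColumn(ConnectionFactory factory, String sql, int timeoutSeconds)
            throws SQLException {
        List<String> values = new ArrayList<>();
        // try-with-resources closes the Statement and then the Connection,
        // even when the query fails or times out.
        try (Connection conn = factory.open();
             Statement stmt = conn.createStatement()) {
            stmt.setQueryTimeout(timeoutSeconds);
            try (ResultSet rs = stmt.executeQuery(sql)) {
                while (rs.next()) {
                    values.add(rs.getString(1));
                }
            }
        }
        return values;
    }
}
```

Since closing the Connection is what removed the timed-out query from "in_flight_queries" in our tests, closing it after every query is what keeps stale queries from piling up.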

It would be interesting to hear how you implement and use client-side
connections: via a connection pool, or also by simply opening and closing
a connection per query?

> One might have a similar question about timeouts in Impala's hs2
> interface, which is part of Apache Impala. I don't know much about
> that, but
> https://github.com/apache/impala/commit/ce65b43d47d16840998eb5fab5333695050ac436
> and https://issues.apache.org/jira/browse/IMPALA-2248 might help you
> out.


Closing sessions from the server side at the query level sounds
interesting for some use cases, thanks Jim!

> The Cloudera JDBC driver is not a proper part of Apache Impala, so you
> might get better answers here:
> http://community.cloudera.com/t5/Interactive-Short-cycle-SQL/bd-p/Impala


Thanks, somehow I missed this forum!
We use Cloudera Manager for our HDFS/HBase cluster and are trying to
adapt Impala to our needs. Installing Impala via Cloudera seems simpler
for us from a maintenance point of view, at first glance at least.
It would also be interesting to know whether there is a way to use an
open source JDBC driver instead of the Cloudera one. As mentioned in
https://impala.apache.org/docs/build/html/topics/impala_jdbc.html
the Hive JDBC driver 0.13 is able to connect, but at first glance I ran
into some logging dependency issues, even though all classes should be on
the classpath. Anyway, the Cloudera JDBC driver does what it is intended
for.

Best,
Sasha



2018-01-14 8:14 GMT+01:00 Fawze Abujaber <[email protected]>:

> Hello Sasha,
>
> Did you try to add idle_session_timeout=3000 to Impala Command Line
> Argument Advanced Configuration Snippet (Safety Valve)
>
> On Fri, Jan 12, 2018 at 7:40 PM, Jim Apple <[email protected]> wrote:
>
>> The Cloudera JDBC driver is not a proper part of Apache Impala, so you
>> might get better answers here:
>> http://community.cloudera.com/t5/Interactive-Short-cycle-SQL/bd-p/Impala
>>
>> One might have a similar question about timeouts in Impala's hs2
>> interface, which is part of Apache Impala. I don't know much about
>> that, but
>> https://github.com/apache/impala/commit/ce65b43d47d16840998eb5fab5333695050ac436
>> and https://issues.apache.org/jira/browse/IMPALA-2248 might help you
>> out.
>>
>> On Tue, Jan 2, 2018 at 10:12 AM, Oleksandr Baliev
>> <[email protected]> wrote:
>> > Hello,
>> >
>> > I'm using Impala with CDH 5.10.1 and want to have timeouts for
>> > long-running queries. What I've found is:
>> > 1. Try QUERY_TIMEOUT_S to close idle queries, as I understand it for
>> > when the client is closed and doesn't consume data.
>> > 2. For the JDBC driver (version 2.5.37 or 4.1) there is
>> > setQueryTimeout(int), which as I understand it sets a client-side
>> > timeout
>> > (https://www.cloudera.com/documentation/other/connectors/impala-jdbc/latest/Cloudera-JDBC-Driver-for-Impala-Install-Guide.pdf).
>> > It works, but:
>> > 2.1 when the timeout occurs, the query stays in "waiting to be closed"
>> > even if the Statement is closed (in Java). Only when the Connection is
>> > closed is the query removed from "in_flight_queries".
>> > 2.2 not sure, I haven't checked fully, but it seems this timeout can
>> > only happen at an earlier stage of execution; maybe while the query is
>> > executing it's okay, but once data is being transmitted it cannot be
>> > timed out.
>> > 3. Write a cron script that checks for such queries via impalad
>> > /queries?json and tries to close them so they do not consume resources.
>> >
>> > To be honest, for now I would prefer to have an external script that
>> > can tell whether there are any such queries and react
>> > manually/automatically, plus some reconnect logic at the application
>> > level for when such timed-out queries occur. There may also be other
>> > situations where queries are not closed, so I don't know, maybe just
>> > reconnect every 20 minutes :D But these solutions are just workarounds
>> > and I can't really rely on them since they look a bit fragile.
>> >
>> > So the question: is there any good way for queries via JDBC to be
>> > gracefully closed after a given timeout?
>> >
>> > Will be very thankful for any hints :)
>> > Sasha
>> >
>> >
>>
>
>
