[ 
https://issues.apache.org/jira/browse/IMPALA-14460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18051296#comment-18051296
 ] 

ASF subversion and git services commented on IMPALA-14460:
----------------------------------------------------------

Commit ac1c11dd8256e8a81e138f43663de06610441d41 in impala's branch 
refs/heads/master from Michael Smith
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=ac1c11dd8 ]

IMPALA-14460: Keep http connections open in impala-shell

Leave HS2-HTTP connections open and retry on 401 or EPIPE failures to
re-use connections, greatly reducing the number of client connections
needed with the HS2-HTTP protocol. Adds a 'use_new_http_connection'
impala-shell option to restore the old behavior of using a new
connection for each RPC.
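
As a rough illustration of the reuse-and-retry idea described above, here
is a minimal Python sketch built only on the standard library's
http.client. It is not the impala-shell implementation: the class name
KeepAliveClient, its max_tries parameter and its connections_opened
counter are hypothetical, and the 401 / re-authentication path is left
out entirely.

```python
import http.client


class KeepAliveClient:
    """Hypothetical helper: keeps one HTTP connection open and reconnects on failure."""

    # Errors that suggest the peer closed a kept-alive connection.
    # http.client.RemoteDisconnected is a subclass of ConnectionResetError.
    RETRYABLE = (BrokenPipeError, ConnectionResetError,
                 http.client.RemoteDisconnected)

    def __init__(self, host, port, max_tries=3):
        self.host = host
        self.port = port
        self.max_tries = max_tries
        self.conn = None
        self.connections_opened = 0  # number of connection objects created

    def _connect(self):
        self.conn = http.client.HTTPConnection(self.host, self.port, timeout=5)
        self.connections_opened += 1

    def close(self):
        if self.conn is not None:
            self.conn.close()
            self.conn = None

    def post(self, path, body, headers=None):
        last_error = None
        for _ in range(self.max_tries):
            if self.conn is None:
                self._connect()
            try:
                self.conn.request("POST", path, body=body,
                                  headers=headers or {})
                resp = self.conn.getresponse()
                data = resp.read()  # drain the body so the socket can be re-used
                return resp.status, data
            except self.RETRYABLE as e:
                # The kept-alive connection went away: drop it and try again
                # on a fresh one.
                last_error = e
                self.close()
        raise last_error
```

Successive post() calls share one connection until the peer drops it, at
which point the next call reconnects. Note that blindly resending a
request is only safe when the server tolerates duplicates; this sketch
does not address that.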

The existing test_shell_interactive_reconnect verifies that ImpalaShell -
the library implementing the impala-shell CLI - automatically establishes
a new connection with all protocols. Prior to this patch, after
restarting impalad you'd see

  2026-01-06 11:13:08 [Warning] close session RPC failed:
  <class 'impala_shell.shell_exceptions.RPCException'>
  ERROR: Invalid session id: be40a2618203ff7b:beacd4b5d28f7692

  Connection lost, reconnecting...
  Warning: --connect_timeout_ms is currently ignored with HTTP transport.
  Opened TCP connection to localhost:28001

If you instead introduce a load balancer like haproxy and restart the
lb, there's no apparent break because impala-shell would always
establish a new connection.

With this patch, when impalad is restarted we still see the lost session

  2026-01-06 11:20:43 [Exception] type=<class 'BrokenPipeError'> in
  PingImpalaHS2Service. Num remaining tries: 3 [Errno 32] Broken pipe
  Connection closed, reconnecting...
  2026-01-06 11:20:43 [Warning] close session RPC failed:
  <class 'impala_shell.shell_exceptions.RPCException'>
  ERROR: Invalid session id: 6e494c76a9a58278:dbb7016cb5999385

  Connection lost, reconnecting...
  Warning: --connect_timeout_ms is currently ignored with HTTP transport.
  Opened TCP connection to localhost:28000

If the lb is restarted, we now see that the connection is reopened

  2026-01-06 11:24:02 [Exception] type=<class 'BrokenPipeError'> in
  PingImpalaHS2Service. Num remaining tries: 3 [Errno 32] Broken pipe
  Connection closed, reconnecting...
  Query: ...

Triggering a retry due to 401 Unauthorized requires Kerberos, since
Basic and Bearer auth always send the Authorization header; it shows

  2026-01-06 17:02:27 [Exception] type=
  <class 'http.client.RemoteDisconnected'> in ExecuteStatement.
  Remote end closed connection without response
  2026-01-06 17:02:27 [Exception] type=
  <class 'http.client.RemoteDisconnected'> when listing query options.
  Num remaining tries: 3 Remote end closed connection without response
  2026-01-06 17:02:27 [Exception] type=<class 'ConnectionRefusedError'>
  in ExecuteStatement.  [Errno 111] Connection refused
  2026-01-06 17:02:27 [Exception] type=<class 'ConnectionRefusedError'>
  when listing query options. Num remaining tries: 2 [Errno 111]
  Connection refused
  Connection closed, reconnecting...
  Cookies expired, restarting authentication...
  Preserving cookies: impala.auth
  Connected to localhost:28005

Updates tests that counted RPCs by number of connections, since with
re-use the two are no longer linked. Tests now rely on connection count,
which verifies we're re-using connections.
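
To make the distinction between RPC count and connection count concrete,
the following Python sketch uses a toy standard-library HTTP server that
tracks the two separately. It is an assumption-laden illustration, not
the Impala test code: CountingHandler and run_rpcs are hypothetical
names, and the server is a plain http.server instance rather than
impalad.

```python
import http.client
from http.server import BaseHTTPRequestHandler


class CountingHandler(BaseHTTPRequestHandler):
    """Toy handler that counts TCP connections and HTTP requests separately."""

    protocol_version = "HTTP/1.1"  # lets the server keep connections open
    connections = 0
    requests = 0

    def setup(self):
        # setup() runs once per accepted TCP connection.
        super().setup()
        CountingHandler.connections += 1

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        CountingHandler.requests += 1
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the example quiet


def run_rpcs(port, n, reuse):
    """Send n POSTs, either on one kept-alive connection or one per request."""
    conn = None
    for _ in range(n):
        if conn is None or not reuse:
            if conn is not None:
                conn.close()
            conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("POST", "/", body=b"ping")
        conn.getresponse().read()  # drain so the connection can be re-used
    if conn is not None:
        conn.close()
```

Served by an http.server.HTTPServer on a local port, five calls with
reuse=True should register one connection and five requests, while five
calls with reuse=False should register five connections. That is why a
test that infers the RPC count from the connection count stops working
once connections are re-used.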

Adds testReconnect to use a proxy where we can interrupt the
existing connection, which will sometimes trigger "Connection closed,
reconnecting...". I didn't find a way to trigger it consistently in this
test environment.

Adds tests using Kerberos authentication to trigger cookie retry and
"Cookies expired, restarting authentication..."

Generated-by: Github Copilot (GPT-4.1)
Change-Id: Iafb3fc39817e93c691cd993902c6d939a7235a03
Reviewed-on: http://gerrit.cloudera.org:8080/23831
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Michael Smith <[email protected]>


> Implement keep-alive on HS2-HTTP connections
> --------------------------------------------
>
>                 Key: IMPALA-14460
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14460
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Clients
>            Reporter: Michael Smith
>            Assignee: Michael Smith
>            Priority: Critical
>
> impala-shell does not use HTTP keep-alive with HS2-HTTP connections, causing 
> it to establish a new connection each time the client needs to send a 
> request. In busy environments, this can exhaust previously working 
> {{accepted_cnxn_setup_thread_pool_size}} settings.
> We should implement keep-alive for HS2-HTTP connections, which may require 
> improvements in both impala-shell and the server implementation.


