Kurt Deschler created IMPALA-14523:
--------------------------------------
Summary: Optimize JDBC table for Hive Multistream Driver
Key: IMPALA-14523
URL: https://issues.apache.org/jira/browse/IMPALA-14523
Project: IMPALA
Issue Type: Improvement
Components: Backend
Affects Versions: Impala 4.4.0
Reporter: Kurt Deschler
Assignee: Pranav Yogi Lodha
HIVE-27872 added multi-stream fetch capabilities to the HS2 JDBC driver, which
facilitate very fast transport of bulk data over JDBC. However, to achieve
high performance, the caller must consume data quickly from the
single-threaded JDBC client. This requires minimizing the (synchronous)
work done after fetching data from JDBC and performing any expensive processing
using multiple threads. It is possible to achieve this result either by fetching
data on a single thread and handing it off for consumption, or by using locking
to serialize fetching across threads. In either case, the fetch path must copy
data efficiently from the cursor to local memory and defer any expensive
encoding/decoding/conversion to a multi-threaded codepath.
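
The first option (single fetch thread, multi-threaded consumption) could be
sketched roughly as follows. This is only an illustration, not the proposed
Impala change: the class FetchHandoffSketch, the methods fetchAndSum and
decode, and the use of an Iterator<byte[]> as a stand-in for the JDBC cursor
are all hypothetical names chosen for the example. The point is that the fetch
loop only collects raw rows into batches, while the costly conversion runs on
a worker pool.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FetchHandoffSketch {
    // Hypothetical stand-in for the expensive decoding/conversion step.
    static long decode(byte[] raw) {
        return Long.parseLong(new String(raw, StandardCharsets.UTF_8));
    }

    // Hands one batch of raw rows to the pool; decoding happens off the fetch thread.
    static Future<Long> submit(ExecutorService pool, List<byte[]> batch) {
        return pool.submit(() -> {
            long sum = 0;
            for (byte[] raw : batch) {
                sum += decode(raw);
            }
            return sum;
        });
    }

    // The Iterator stands in for the single-threaded JDBC cursor (an assumption
    // of this sketch). The calling thread is the only one that touches it.
    static long fetchAndSum(Iterator<byte[]> cursor, int batchSize, int workers)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        List<Future<Long>> pending = new ArrayList<>();
        try {
            List<byte[]> batch = new ArrayList<>(batchSize);
            while (cursor.hasNext()) {
                // Fetch path: store the raw row only, no conversion here.
                batch.add(cursor.next());
                if (batch.size() == batchSize) {
                    pending.add(submit(pool, batch));
                    batch = new ArrayList<>(batchSize);
                }
            }
            if (!batch.isEmpty()) {
                pending.add(submit(pool, batch));
            }
            // Combine the per-batch results produced by the workers.
            long total = 0;
            for (Future<Long> f : pending) {
                total += f.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

For brevity the sketch queues batches without any bound; a real fetch path
would likely need backpressure (for example a bounded queue) so that a fast
cursor cannot outrun the workers and exhaust memory.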
--
This message was sent by Atlassian Jira
(v8.20.10#820010)