Shubham Chaurasia created HIVE-24058:
----------------------------------------

             Summary: Llap external client - Enhancements for running in cloud 
environment
                 Key: HIVE-24058
                 URL: https://issues.apache.org/jira/browse/HIVE-24058
             Project: Hive
          Issue Type: Task
          Components: llap
            Reporter: Shubham Chaurasia
            Assignee: Shubham Chaurasia


When we query using the LLAP external client library, the following currently happens:

1. We first get the splits using 
[LlapBaseInputFormat#getSplits()|https://github.com/apache/hive/blob/rel/release-3.1.2/llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java#L226].
 This step only needs the HiveServer2 (HS2) JDBC URL. 

2. We then submit those splits to LLAP and obtain a record reader to read the data 
using 
[LlapBaseInputFormat#getRecordReader()|https://github.com/apache/hive/blob/rel/release-3.1.2/llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java#L140].
 This step needs the following settings on the client side:
- {{hive.zookeeper.quorum}}
- {{hive.llap.daemon.service.hosts}}

The client uses them to connect to ZooKeeper and discover the LLAP daemons.

3. The record reader so obtained needs to [initiate a TCP connection from the client to 
the LLAP daemon to submit the 
split|https://github.com/apache/hive/blob/rel/release-3.1.2/llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java#L185].

4. It also needs to [initiate another TCP connection from the client to the output 
format port on the LLAP daemon to read the 
data|https://github.com/apache/hive/blob/rel/release-3.1.2/llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java#L201].
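
As a rough illustration of the client-side requirement in step 2, the sketch below checks that both settings are present before the record reader is requested. It uses only {{java.util.Properties}}; the class name {{LlapClientSettings}}, the helper {{missingKeys}}, and the host and registry values are hypothetical and are not part of the llap-ext-client library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class LlapClientSettings {
    // The two client-side settings named in step 2.
    static final String[] REQUIRED_KEYS = {
        "hive.zookeeper.quorum",
        "hive.llap.daemon.service.hosts"
    };

    /** Returns the required keys that are absent or blank in the given configuration. */
    public static List<String> missingKeys(Properties conf) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            String value = conf.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Hypothetical values; substitute those of the actual cluster.
        conf.setProperty("hive.zookeeper.quorum", "zk1.example.com,zk2.example.com");
        conf.setProperty("hive.llap.daemon.service.hosts", "@llap0");
        System.out.println("Missing settings: " + missingKeys(conf));
    }
}
```

Having the settings is necessary but not sufficient: the client must also be able to reach the hosts they point to over the network.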

In cloud-based deployments, the client may run outside the VPC, in which case it 
may not be able to connect directly to the ZooKeeper registry or the LLAP daemons. 
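
One way to see whether a given client is affected is to probe the endpoints it would have to reach in steps 2 to 4. The following is a minimal sketch using plain {{java.net.Socket}}; the class name {{EndpointProbe}} and the example host names and ports are hypothetical, and a successful TCP connect says nothing about authentication.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class EndpointProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    public static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException | IllegalArgumentException e) {
            // Refused, timed out, unresolved host, or invalid port: treat as unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Each argument is a host:port pair, for example (hypothetical values):
        //   java EndpointProbe zk1.example.com:2181 llap1.example.com:15001
        for (String endpoint : args) {
            int sep = endpoint.lastIndexOf(':');
            String host = endpoint.substring(0, sep);
            int port = Integer.parseInt(endpoint.substring(sep + 1));
            String status = canConnect(host, port, 3000) ? "reachable" : "NOT reachable";
            System.out.println(endpoint + " is " + status);
        }
    }
}
```

If the ZooKeeper or LLAP daemon endpoints are reported as not reachable from where the client runs, the current flow cannot work from that location.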

For 2, we can move the daemon discovery logic into the get_splits UDF itself, 
which runs in HS2, so the client no longer needs to reach ZooKeeper.  
For 3 and 4, we can expose additional ports on LLAP, protected by an 
additional auth mechanism.




