[jira] [Updated] (HIVE-27283) Use tez.local.mode in HiveServer2 for trivial queries

Jira Fri, 21 Apr 2023 00:28:06 -0700


     [ https://issues.apache.org/jira/browse/HIVE-27283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


László Bodor updated HIVE-27283:
--------------------------------
    Description: 
Today, a query like this:
{code}
INSERT INTO TABLE students VALUES ('fred flintstone', 35, 1.28),
  ('barney rubble', 32, 2.32);
{code}
spins up a TezAM and containers. I believe this is not optimal, even if we 
already have a Tez application running, not to mention setups where only 
HiveServer2 is alive and TezAMs + LLAP executors are spun up on demand.

A possible risk of this optimization is overwhelming HiveServer2 with such 
queries; this scenario should be handled with care.

My proposal is to maintain a local Tez session pool (default size 0, 
recommended size 1...4) in HS2, and to identify at compile time the "trivial 
queries" that currently need a Tez application (like the INSERT INTO above).
The first implementation can cover only simple INSERT INTO queries, and we 
can decide the rest later.
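
To make the proposal concrete, here is a minimal sketch of the routing idea. It is not Hive code: the class LocalSessionRouter, its methods, and the regex-based "trivial" check are all hypothetical, and a real implementation would presumably inspect the compiled plan rather than the SQL text. It only illustrates two points from above: a pool size of 0 disables local execution, and a bounded pool keeps trivial queries from overwhelming HiveServer2 by falling back to the regular Tez path when all local slots are busy.

```java
import java.util.concurrent.Semaphore;
import java.util.regex.Pattern;

/**
 * Hypothetical sketch only; none of these names exist in Hive or Tez.
 * Routes "trivial" queries to a bounded number of local Tez sessions.
 */
final class LocalSessionRouter {
    enum Target { LOCAL, CLUSTER }

    // Assumption: "trivial" means a plain INSERT INTO ... VALUES statement.
    // Matching on SQL text is a stand-in for a compile-time plan check.
    private static final Pattern TRIVIAL_INSERT = Pattern.compile(
        "^\\s*INSERT\\s+INTO\\s+(TABLE\\s+)?\\S+\\s+VALUES\\b.*",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // One permit per local session; 0 permits means local mode is off.
    private final Semaphore localSlots;

    LocalSessionRouter(int poolSize) {
        if (poolSize < 0) {
            throw new IllegalArgumentException("poolSize must be >= 0");
        }
        this.localSlots = new Semaphore(poolSize);
    }

    static boolean isTrivial(String sql) {
        return TRIVIAL_INSERT.matcher(sql).matches();
    }

    /**
     * Returns LOCAL only if the query is trivial and a local slot is free.
     * A LOCAL result holds one slot until release() is called.
     */
    Target route(String sql) {
        if (isTrivial(sql) && localSlots.tryAcquire()) {
            return Target.LOCAL;
        }
        return Target.CLUSTER;
    }

    /** Frees the slot taken by an earlier LOCAL routing decision. */
    void release() {
        localSlots.release();
    }
}
```

With a pool of size 1, the first INSERT INTO ... VALUES is routed locally and a concurrent second one goes to the cluster until release() is called. Non-blocking tryAcquire() is a deliberate choice in this sketch: queueing behind busy local sessions could make a trivial query slower than the existing path.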

  was:
Today, a query like this:
{code}
INSERT INTO TABLE students VALUES ('fred flintstone', 35, 1.28),
  ('barney rubble', 32, 2.32);
{code}
spins up a TezAM and containers. I believe this is not optimal, even if we 
already have an tez application running. Not to mention setups where only a 
hiveserver2 is alive and TezAMs + LLAP executors are spun up on demand.

A possible risk is to overwhelm Hiveserver2 with such queries, this scenario 
should be handled with care.

My proposal is to maintain a local tez session pool (default size 0, 
recommended is 1...4) in hs2, and in compile-time let's identify "trivial 
queries" that by default needs tez application (like the INSERT INTO above).
The first implementation can include only simply INSERT INTO queries, and we 
can decide the rest later.


> Use tez.local.mode in HiveServer2 for trivial queries
> -----------------------------------------------------
>
>                 Key: HIVE-27283
>                 URL: https://issues.apache.org/jira/browse/HIVE-27283
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: László Bodor
>            Priority: Major



