[
https://issues.apache.org/jira/browse/HIVE-17481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433169#comment-16433169
]
Thai Bui commented on HIVE-17481:
---------------------------------
The Hive metastore database is PostgreSQL. I got around the problem for now by
changing the column type in Postgres from boolean to int.
{noformat}
hive=# alter table "WM_TRIGGER" ALTER COLUMN "IS_IN_UNMANAGED" DROP DEFAULT;
ALTER TABLE
hive=# alter table "WM_TRIGGER" ALTER COLUMN "IS_IN_UNMANAGED" TYPE int USING
CASE WHEN "IS_IN_UNMANAGED" = false THEN 0 ELSE 1 END;
ALTER TABLE
hive=# alter table "WM_TRIGGER" ALTER COLUMN "IS_IN_UNMANAGED" SET DEFAULT 0;
ALTER TABLE
hive=# \d+ "WM_TRIGGER"
Table "public.WM_TRIGGER"
Column | Type | Modifiers | Storage | Stats target | Description
--------------------+-------------------------+---------------------------------+----------+--------------+-------------
TRIGGER_ID | bigint | not null | plain | |
RP_ID | bigint | not null | plain | |
NAME | character varying(128) | not null | extended | |
TRIGGER_EXPRESSION | character varying(1024) | default NULL::character varying
| extended | |
ACTION_EXPRESSION | character varying(1024) | default NULL::character varying |
extended | |
IS_IN_UNMANAGED | integer | not null default 0 | plain | |
Indexes:
"WM_TRIGGER_pkey" PRIMARY KEY, btree ("TRIGGER_ID")
"UNIQUE_WM_TRIGGER" UNIQUE CONSTRAINT, btree ("RP_ID", "NAME")
Foreign-key constraints:
"WM_TRIGGER_FK1" FOREIGN KEY ("RP_ID") REFERENCES "WM_RESOURCEPLAN"("RP_ID")
DEFERRABLE
Referenced by:
TABLE ""WM_POOL_TO_TRIGGER"" CONSTRAINT "WM_POOL_TO_TRIGGER_FK2" FOREIGN KEY
("TRIGGER_ID") REFERENCES "WM_TRIGGER"("TRIGGER_ID") DEFERRABLE{noformat}
I'm not sure what the best way to fix this is, but if you give me a suggestion, I
can probably fix it pretty quickly.
> LLAP workload management
> ------------------------
>
> Key: HIVE-17481
> URL: https://issues.apache.org/jira/browse/HIVE-17481
> Project: Hive
> Issue Type: New Feature
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Priority: Major
> Fix For: 3.0.0
>
> Attachments: Workload management design doc.pdf
>
>
> This effort is intended to improve various aspects of cluster sharing for
> LLAP. Some of these are applicable to non-LLAP queries and may later be
> extended to all queries. Administrators will be able to specify and apply
> policies for workload management ("resource plans") that apply to the entire
> cluster, with only one resource plan being active at a time. The policies
> will be created and modified using new Hive DDL statements.
> The policies will cover:
> * Dividing the cluster into a set of (optionally nested) query pools, each
> allocated a fraction of the cluster, a query parallelism limit, a resource
> sharing policy between queries, and potentially other settings such as
> priority.
> * Mapping the incoming queries into pools based on the query user, groups,
> explicit configuration, etc.
> * Specifying rules that perform actions on queries based on counter values
> (e.g. killing or moving queries); a sketch of the corresponding DDL follows
> this list.
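> As an illustrative sketch only (the DDL below is hypothetical; the final
> syntax, as well as the plan, pool, trigger, and counter names shown, may
> differ), a resource plan covering all three policy types might be defined as:
> {noformat}
> -- Define a plan with two pools splitting the cluster 70/30.
> CREATE RESOURCE PLAN daytime;
> CREATE POOL daytime.bi WITH ALLOC_FRACTION = 0.7, QUERY_PARALLELISM = 5;
> CREATE POOL daytime.etl WITH ALLOC_FRACTION = 0.3, QUERY_PARALLELISM = 10;
> -- Map incoming queries to pools by user.
> CREATE USER MAPPING 'analyst' IN daytime TO bi;
> -- Kill any query in the bi pool that runs longer than 5 minutes.
> CREATE TRIGGER daytime.slow_query WHEN ELAPSED_TIME > 300000 DO KILL;
> ALTER POOL daytime.bi ADD TRIGGER slow_query;
> {noformat}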
> One would also be able to switch policies on a live cluster without (usually)
> affecting running queries, for example to switch between daytime and nighttime
> usage patterns, and other similar scenarios. The switches would be safe and
> atomic; versioning may eventually be supported.
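> Continuing the hypothetical syntax from the sketch above, switching the active
> plan might then be a single atomic statement:
> {noformat}
> -- Replace the active plan; running queries keep their current allocations.
> ALTER RESOURCE PLAN nighttime ACTIVATE;
> {noformat}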
> Some implementation details:
> * WM will only be supported in HS2 (for obvious reasons).
> * All LLAP query AMs will run in the "interactive" YARN queue and will be
> fungible between Hive pools.
> * We will use the concept of "guaranteed tasks" (also known as ducks) to
> enforce cluster allocation without a central scheduler and without
> compromising throughput. Guaranteed tasks preempt other (speculative) tasks
> and are distributed from HS2 to AMs, and from AMs to tasks, in accordance
> with percentage allocations in the policy. Each "duck" corresponds to a CPU
> resource on the cluster (for example, a pool allocated 40% of a cluster with
> 100 executor slots would hold 40 ducks). The implementation will be isolated
> so as to allow alternative implementations later.
> * In the future, we may consider improved task placement and late binding,
> similar to those described in the Sparrow paper, to work around potential
> hotspots and similar issues that are not avoided by the decentralized scheme.
> * Only one HS2 will initially be supported to avoid split-brain workload
> management. We will also implement (in a tangential set of work items)
> active-passive HS2 recovery. Eventually, we intend to switch to a full
> active-active HS2 configuration with a shared WM and Tez session pool (unlike
> the current case with two separate session pools).