[ 
https://issues.apache.org/jira/browse/HIVE-18638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16365004#comment-16365004
 ] 

Prasanth Jayachandran commented on HIVE-18638:
----------------------------------------------

This seems to happen when the Tez AM pool leaves some sessions in the ACCEPTED 
state; if all sessions are in the RUNNING state, it does not happen. The likely 
cause is that the trigger validator threads are never started, because startup 
is blocked in Tez AM pool start. This does not happen with 
TezSessionPoolManager (container mode) because there the trigger validator 
thread is initialized before session pool start. A similar change will have to 
be made for the workload manager as well. 
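
The ordering problem described above can be sketched in isolation. This is a minimal illustration only, not Hive code: the class and method names (StartupOrderSketch, blockingPoolStart, validatorFirst, validatorAfter) are hypothetical, and a latch that is never counted down stands in for sessions stuck in the ACCEPTED state.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; none of these names are Hive classes.
class StartupOrderSketch {

    /** Stand-in for a pool start that blocks until every session is RUNNING. */
    static void blockingPoolStart(CountDownLatch allSessionsRunning, long timeoutMs)
            throws InterruptedException {
        allSessionsRunning.await(timeoutMs, TimeUnit.MILLISECONDS);
    }

    /**
     * Schedules the validator first, then enters the blocking pool start.
     * Returns the number of validation passes that ran while pool start was blocked.
     */
    static int validatorFirst(long blockMs) throws InterruptedException {
        AtomicInteger passes = new AtomicInteger();
        ScheduledExecutorService validator = Executors.newSingleThreadScheduledExecutor();
        try {
            validator.scheduleWithFixedDelay(passes::incrementAndGet, 0, 10, TimeUnit.MILLISECONDS);
            // The latch is never counted down: sessions stay "ACCEPTED".
            blockingPoolStart(new CountDownLatch(1), blockMs);
            return passes.get();
        } finally {
            validator.shutdownNow();
        }
    }

    /**
     * Enters the blocking pool start first and schedules the validator afterwards.
     * Returns the number of validation passes that ran while pool start was blocked.
     */
    static int validatorAfter(long blockMs) throws InterruptedException {
        AtomicInteger passes = new AtomicInteger();
        ScheduledExecutorService validator = Executors.newSingleThreadScheduledExecutor();
        try {
            blockingPoolStart(new CountDownLatch(1), blockMs);
            int seenWhileBlocked = passes.get(); // nothing was scheduled yet
            validator.scheduleWithFixedDelay(passes::incrementAndGet, 0, 10, TimeUnit.MILLISECONDS);
            return seenWhileBlocked;
        } finally {
            validator.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("validator first, passes while blocked: " + validatorFirst(200));
        System.out.println("validator after, passes while blocked: " + validatorAfter(200));
    }
}
```

In the second ordering no trigger is evaluated for as long as pool start stays blocked, which matches the symptom in this issue under the assumption stated above.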

> Triggers for multi-pool move, failing to initiate the move event
> ----------------------------------------------------------------
>
>                 Key: HIVE-18638
>                 URL: https://issues.apache.org/jira/browse/HIVE-18638
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.0.0
>            Reporter: Aswathy Chellammal Sreekumar
>            Assignee: Prasanth Jayachandran
>            Priority: Major
>
> With a resource plan that has multiple pools and triggers set to move a job 
> across those pools, the move does not seem to happen.
> Resource plan:
> {noformat}
> 1: jdbc:hive2://ctr-e137-1514896590304-51538-> show resource plan plan_2;
> INFO : Compiling command(queryId=hive_20180202220823_2fb8bca7-5b7a-48cf-8ff9-8d5f3548d334): show resource plan plan_2
> INFO : Semantic Analysis Completed
> INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:line, type:string, comment:from deserializer)], properties:null)
> INFO : Completed compiling command(queryId=hive_20180202220823_2fb8bca7-5b7a-48cf-8ff9-8d5f3548d334); Time taken: 0.008 seconds
> INFO : Executing command(queryId=hive_20180202220823_2fb8bca7-5b7a-48cf-8ff9-8d5f3548d334): show resource plan plan_2
> INFO : Starting task [Stage-0:DDL] in serial mode
> INFO : Completed executing command(queryId=hive_20180202220823_2fb8bca7-5b7a-48cf-8ff9-8d5f3548d334); Time taken: 0.196 seconds
> INFO : OK
> +----------------------------------------------------+
> | line |
> +----------------------------------------------------+
> | plan_2[status=ACTIVE,parallelism=null,defaultPool=pool2] |
> | + pool2[allocFraction=0.5,schedulingPolicy=default,parallelism=3] |
> | | trigger too_large_write_triger: if (HDFS_BYTES_WRITTEN > 10kb) { MOVE TO pool1 } |
> | | mapped for default |
> | + pool1[allocFraction=0.3,schedulingPolicy=default,parallelism=5] |
> | | trigger slow_pool_trigger: if (ELAPSED_TIME > 30000) { MOVE TO pool3 } |
> | + pool3[allocFraction=0.2,schedulingPolicy=default,parallelism=3] |
> | + default[allocFraction=0.0,schedulingPolicy=null,parallelism=4] |
> +----------------------------------------------------+
> 8 rows selected (0.25 seconds)
> {noformat}
> Workload Manager Events Summary from query run:
> {noformat}
> INFO  : {
>   "queryId" : "hive_20180202213425_9633d7af-4242-4e95-a391-2cd3823e3eac",
>   "queryStartTime" : 1517607265395,
>   "queryEndTime" : 1517607321648,
>   "queryCompleted" : true,
>   "queryWmEvents" : [ {
>     "wmTezSessionInfo" : {
>       "sessionId" : "21f8a4ab-511e-4828-a2dd-1d5f2932c492",
>       "poolName" : "pool2",
>       "clusterPercent" : 50.0
>     },
>     "eventStartTimestamp" : 1517607269660,
>     "eventEndTimestamp" : 1517607269661,
>     "eventType" : "GET",
>     "elapsedTime" : 1
>   }, {
>     "wmTezSessionInfo" : {
>       "sessionId" : "21f8a4ab-511e-4828-a2dd-1d5f2932c492",
>       "poolName" : null,
>       "clusterPercent" : 0.0
>     },
>     "eventStartTimestamp" : 1517607321663,
>     "eventEndTimestamp" : 1517607321663,
>     "eventType" : "RETURN",
>     "elapsedTime" : 0
>   } ],
>   "appliedTriggers" : [ {
>     "name" : "too_large_write_triger",
>     "expression" : {
>       "counterLimit" : {
>         "limit" : 10240,
>         "name" : "HDFS_BYTES_WRITTEN"
>       },
>       "predicate" : "GREATER_THAN"
>     },
>     "action" : {
>       "type" : "MOVE_TO_POOL",
>       "poolName" : "pool1"
>     },
>     "violationMsg" : null
>   } ],
>   "subscribedCounters" : [ "HDFS_BYTES_WRITTEN" ],
>   "currentCounters" : {
>     "HDFS_BYTES_WRITTEN" : 33306829
>   },
>   "elapsedTime" : 56284
> }
> {noformat}
> From the Workload Manager Events Summary it can be seen that the 'MOVE' event 
> didn't happen even though the limit (10240) for the HDFS_BYTES_WRITTEN counter 
> was exceeded.
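
For reference, the check the applied trigger is expected to perform can be written out with the values from the summary above. This is a rough sketch, not Hive's trigger implementation: TriggerCheckSketch, Trigger, and evaluate are hypothetical names, and only the GREATER_THAN predicate from the summary is modelled.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical illustration; these are not Hive's trigger classes.
class TriggerCheckSketch {

    /** Minimal stand-in for a counter-limit trigger with a MOVE_TO_POOL action. */
    static final class Trigger {
        final String name;
        final String counterName;
        final long limit;
        final String targetPool;

        Trigger(String name, String counterName, long limit, String targetPool) {
            this.name = name;
            this.counterName = counterName;
            this.limit = limit;
            this.targetPool = targetPool;
        }
    }

    /** Returns the pool to move to if the counter exceeds the limit, otherwise null. */
    static String evaluate(Trigger trigger, Map<String, Long> currentCounters) {
        Long value = currentCounters.get(trigger.counterName);
        return (value != null && value > trigger.limit) ? trigger.targetPool : null;
    }

    public static void main(String[] args) {
        // Values copied from "appliedTriggers" and "currentCounters" in the summary.
        Trigger trigger =
                new Trigger("too_large_write_triger", "HDFS_BYTES_WRITTEN", 10240L, "pool1");
        Map<String, Long> counters =
                Collections.singletonMap("HDFS_BYTES_WRITTEN", 33306829L);
        System.out.println("move to: " + evaluate(trigger, counters));
    }
}
```

Since 33306829 is greater than 10240, a validator that actually ran this check would have requested a move to pool1, yet the summary shows only GET and RETURN events.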



