[ 
https://issues.apache.org/jira/browse/HIVE-26104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17518050#comment-17518050
 ] 

liuyan commented on HIVE-26104:
-------------------------------

Log findings:

2022-03-22 05:23:00,252 INFO  
org.apache.hadoop.hive.ql.cache.results.QueryResultsCache: 
[f4c8782f-903d-46da-96d5-4e45d72ff431 HiveServer2-Handler-Pool: 
Thread-8884649]: Waiting on pending cacheEntry

2022-03-22 05:54:25,257 INFO  org.apache.hadoop.hive.ql.Driver: 
[f4c8782f-903d-46da-96d5-4e45d72ff431 HiveServer2-Handler-Pool: 
Thread-8884649]: Semantic Analysis Completed (retrial = false)

2022-03-22 05:54:25,304 INFO  org.apache.hadoop.hive.ql.Driver: 
[f4c8782f-903d-46da-96d5-4e45d72ff431 HiveServer2-Handler-Pool: 
Thread-8884649]: Completed compiling 
command(queryId=hive_20220322052300_7c219f1f-b969-49bb-aa7b-ea2f8926ac76); Time 
taken: 1885.298 seconds

It seems the query (hive_20220322052300_7c219f1f-b969-49bb-aa7b-ea2f8926ac76) was 
stuck in compilation for about 31 minutes (1885 seconds) because of "Waiting on 
pending cacheEntry".


This introduces two issues:

1.  The user is not aware that the query is waiting on a pending cache entry, so 
from the beeline or client side the user does not know why the query has not 
started executing for a very long time. We need to notify the user in some way 
that the query is currently waiting for the cache (and hence will not run 
before the cache entry reaches the ready state).

2.  We normally have hive.driver.parallel.compilation.global.limit set to 3, 
which means that if 4 identical queries run on the managed table, the 4th query 
will be blocked, as will any subsequent queries sent to this HS2.
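
To illustrate issue 2, here is a minimal sketch of how waiters that hold a 
compile slot can starve later queries. It is not Hive code: the class 
CompileSlotDemo, the method canCompile, and the use of a Semaphore and a 
CountDownLatch are assumptions made only for illustration. The semaphore stands 
in for hive.driver.parallel.compilation.global.limit, and the latch stands in 
for the pending cache entry that the waiting queries block on.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class CompileSlotDemo {

    /**
     * Hypothetical model: `waiters` queries each take a compile slot and then
     * block until a pending cache entry becomes ready. Returns true if one
     * more query can obtain a compile slot within timeoutMs.
     */
    static boolean canCompile(int limit, int waiters, long timeoutMs) {
        Semaphore slots = new Semaphore(limit);                    // models the global compile limit
        CountDownLatch pendingEntryReady = new CountDownLatch(1);  // models the pending cache entry
        CountDownLatch holding = new CountDownLatch(Math.min(limit, waiters));

        Thread[] threads = new Thread[waiters];
        for (int i = 0; i < waiters; i++) {
            threads[i] = new Thread(() -> {
                try {
                    slots.acquire();                // enter compilation
                    try {
                        holding.countDown();
                        pendingEntryReady.await();  // wait on the pending entry while holding the slot
                    } finally {
                        slots.release();            // leave compilation
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }

        try {
            holding.await();  // all slots that can be taken are now held
            boolean got = slots.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
            if (got) {
                slots.release();
            }
            pendingEntryReady.countDown();  // the cache entry becomes ready, waiters finish
            for (Thread t : threads) {
                t.join();
            }
            return got;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            pendingEntryReady.countDown();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("2 waiters, limit 3: " + canCompile(3, 2, 200)); // prints true
        System.out.println("3 waiters, limit 3: " + canCompile(3, 3, 200)); // prints false
    }
}
```

In this model, once the number of waiting queries reaches the limit, any new 
query, identical or not, cannot start compiling until the pending entry is 
resolved, which matches the behaviour described above.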



> HIVE-19138 May block queries to compile
> ---------------------------------------
>
>                 Key: HIVE-26104
>                 URL: https://issues.apache.org/jira/browse/HIVE-26104
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO
>    Affects Versions: 3.0.0, 3.1.2
>            Reporter: liuyan
>            Priority: Critical
>
> HIVE-19138 introduced a way to let other queries stay in the compilation 
> state while there is a placeholder for the same query in the results cache. 
> However, multiple queries may enter that state and so use up all of the 
> available parallel compilation slots allowed by 
> hive.driver.parallel.compilation.global.limit. Although we can turn off 
> this feature by setting hive.query.results.cache.wait.for.pending.results = 
> false, that seems to negate all the effort HIVE-19138 put into resolving 
> the problem. We need a better solution for this situation.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
