Nithin Nagaraj created HIVE-21076:
-------------------------------------

             Summary: Hive optimizer is ignoring predicate condition provided 
in the query
                 Key: HIVE-21076
                 URL: https://issues.apache.org/jira/browse/HIVE-21076
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 2.1.0, 1.2.1
         Environment: OS : Oracle Linux 6.x

JDK: jdk1.8.0_192 
            Reporter: Nithin Nagaraj
         Attachments: Explain Plan.zip

In big sql which contains a where condition, in the condition there is IN 
clause which hive is not honoring. SQL with and without IN CLAUSE is attached 
herewith along with hiveserver2 logs

 
{code:java}
/good plan
// in hive server logs
2018-11-15T14:44:57,682  INFO [7984d196-64b5-4135-854b-136886f7ce8b 
HiveServer2-HttpHandler-Pool: Thread-2726946] ql.Driver: Compiling 
command(queryId=hive_20181115144457_81293879-57e8-4173-99e3-d4947a4c19a3): 
EXPLAIN

// where condtition
WHERE
  (
   ( ( concat(( 
from_unixtime(unix_timestamp(substr(Table__6.participationperiodstartdate, 0, 
10), 'yyyy-MM-dd'), 'MM/dd/yyyy') ), ' - ', ( 
from_unixtime(unix_timestamp(substr(Table__6.participationperiodenddate, 0, 
10), 'yyyy-MM-dd'), 'MM/dd/yyyy') )) ) in  ('07/01/2018 - 06/30/2019')  )
   AND
   ( 
transaction_history.get_rate_attribute_range(Table__13.RateAttributes.RateAttribute,cast(CAST('2018-10-30
 00:00:00' AS timestamp) as string),'AGE_RANGE') in ('35 - 39','60 - 64') )
   AND
   ( Table__6.benefittype in ('LIFE','SPOUSELIFE','VOLLIFE')  )
  )

//bad plan
// in hiveserver2 logs
2018-11-15T14:45:33,744  INFO [HiveServer2-Background-Pool: Thread-2728943] 
ql.Driver: Executing 
command(queryId=hive_20181115144457_81293879-57e8-4173-99e3-d4947a4c19a3): 
EXPLAIN
// where condition which is ignored and hive tried to do full table scan 
  WHERE
    (
     ( ( concat(( 
from_unixtime(unix_timestamp(substr(Table__6.participationperiodstartdate, 0, 
10), 'yyyy-MM-dd'), 'MM/dd/yyyy') ), ' - ', ( 
from_unixtime(unix_timestamp(substr(Table__6.participationperiodenddate, 0, 
10), 'yyyy-MM-dd'), 'MM/dd/yyyy') )) ) in  ('07/01/2018 - 06/30/2019')  )
     AND
     (  Table__13.ageband in ('35 - 39')  )
     AND
     ( Table__6.benefittype in ('LIFE','SPOUSELIFE','VOLLIFE')  )
    )
{code}
Please find attached the logs for hiveserver, table ddl involved in the query 
along with the good and bad query plan. 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to