[ 
https://issues.apache.org/jira/browse/HIVE-26320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611012#comment-17611012
 ] 

John Sherman commented on HIVE-26320:
-------------------------------------

I think investigating the ETypeConverter change is the more correct approach, 
since it appears to be the root of the issue, rather than fixing the issue at 
the symptom level (although that change could have a wide effect and may 
uncover different issues). As for GenericUDFIn, and why it works with varchar 
in the non-struct case but not in the struct case, it is due to:

[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFIn.java#L159]
{code:java}
      case PRIMITIVE: {
        Object arg = ((PrimitiveObjectInspector) compareOI)
            .getPrimitiveJavaObject(conversionHelper.convertIfNecessary(arguments[0].get(),
                argumentOIs[0]));
        if (compareOI.getTypeName().equals(serdeConstants.BINARY_TYPE_NAME)) {
          arg = ByteBuffer.wrap((byte[]) arg);
        }
        if (constantInSet.contains(arg)) {
          bw.set(true);
          return bw;
        }
        break;
      } {code}
The getPrimitiveJavaObject call ends up converting the Text to the 
appropriate type, as shown here for HiveVarchar:
[https://github.com/apache/hive/blob/c19d56ec7429bfcfad92b62ac335dbf8177dab24/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableHiveVarcharObjectInspector.java#L48]

However, structs are handled here:
[https://github.com/apache/hive/blob/c19d56ec7429bfcfad92b62ac335dbf8177dab24/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFIn.java#L186]
{code:java}
      case STRUCT: {
        Object value;
        if (argumentOIs[0] instanceof ConstantObjectInspector) {
          value = ((ConstantObjectInspector) argumentOIs[0]).getWritableConstantValue();
        } else {
          value = conversionHelper.convertIfNecessary(arguments[0].get(), argumentOIs[0]);
        }
        if (constantInSet.contains(
            ((StructObjectInspector) compareOI).getStructFieldsDataAsList(value))) {
          bw.set(true);
          return bw;
        }
        break;
      } {code}
getStructFieldsDataAsList does not attempt any conversion and just returns 
the data:
[https://github.com/apache/hive/blob/c19d56ec7429bfcfad92b62ac335dbf8177dab24/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java#L193]
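
To make the difference between the two branches concrete, here is a minimal 
sketch that does not use any Hive classes. RawText, TypedVarchar and 
PointLookupSketch are hypothetical stand-ins (for Text, HiveVarchar and the 
lookup in GenericUDFIn respectively), and it assumes the constant set holds 
typed values. It only illustrates why a set lookup succeeds when the value is 
converted first and fails when the raw fields are compared as-is:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for Hive's Text: the raw value as stored in the row.
final class RawText {
    final String value;
    RawText(String value) { this.value = value; }
    @Override public boolean equals(Object o) {
        return o instanceof RawText && ((RawText) o).value.equals(value);
    }
    @Override public int hashCode() { return value.hashCode(); }
}

// Hypothetical stand-in for HiveVarchar: the typed value held in the constant set.
final class TypedVarchar {
    final String value;
    TypedVarchar(String value) { this.value = value; }
    @Override public boolean equals(Object o) {
        return o instanceof TypedVarchar && ((TypedVarchar) o).value.equals(value);
    }
    @Override public int hashCode() { return value.hashCode(); }
}

final class PointLookupSketch {
    // Models the PRIMITIVE branch: the raw value is converted to the typed
    // representation (the role getPrimitiveJavaObject plays) before the lookup.
    static boolean primitiveMatches(Set<Object> constantInSet, RawText raw) {
        Object arg = new TypedVarchar(raw.value);
        return constantInSet.contains(arg);
    }

    // Models the STRUCT branch: the fields are looked up exactly as returned,
    // with no per-field conversion.
    static boolean structMatches(Set<Object> constantInSet, List<Object> rawFields) {
        return constantInSet.contains(rawFields);
    }

    public static void main(String[] args) {
        Set<Object> primitives = new HashSet<>();
        primitives.add(new TypedVarchar("BB"));
        System.out.println(primitiveMatches(primitives, new RawText("BB")));

        Set<Object> structs = new HashSet<>();
        structs.add(Arrays.asList(new TypedVarchar("BB"), 18));
        System.out.println(structMatches(structs, Arrays.asList(new RawText("BB"), 18)));
    }
}
```

In this model the primitive lookup matches and the struct lookup does not, 
because List.equals compares elements pairwise and a RawText never equals a 
TypedVarchar even when both wrap the same characters.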

I would also have expected the convertIfNecessary methods to convert the 
string/Text values to the appropriate types, but they do not. The Parquet 
SerDe ObjectInspector reports that the row already contains appropriately 
typed objects (even though everything is actually represented as Text), so 
the converters conclude that the input and output object inspector schemas 
match and no conversion takes place.
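
As a rough illustration of that last point, the sketch below chooses a 
converter purely from the declared types, which is a simplified model and not 
Hive's actual converter code. ConverterSketch, converterFor and the type-name 
strings are all hypothetical. When the declared input and output types match, 
the value passes through untouched, so a value that is really a string stays 
a string:

```java
import java.util.function.Function;

final class ConverterSketch {
    // Hypothetical: picks a converter from the declared type names only,
    // standing in for a comparison of input and output object inspectors.
    static Function<Object, Object> converterFor(String declaredInput, String declaredOutput) {
        if (declaredInput.equals(declaredOutput)) {
            // Declared types match, so the value is assumed to be correct already.
            return Function.identity();
        }
        if (declaredOutput.equals("int")) {
            return v -> Integer.valueOf(v.toString());
        }
        throw new IllegalArgumentException(
            "no converter from " + declaredInput + " to " + declaredOutput);
    }

    public static void main(String[] args) {
        // The inspector claims "int", but the actual object is the string "18".
        Object passedThrough = converterFor("int", "int").apply("18");
        System.out.println(passedThrough.getClass().getSimpleName());

        // With an honest declaration the value is really converted.
        Object converted = converterFor("string", "int").apply("18");
        System.out.println(converted.getClass().getSimpleName());
    }
}
```

The pass-through value is still a String and therefore never equals the 
Integer 18 in a later comparison, which is the kind of mismatch described 
above.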

> Incorrect case evaluation for Parquet based table
> -------------------------------------------------
>
>                 Key: HIVE-26320
>                 URL: https://issues.apache.org/jira/browse/HIVE-26320
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2, Query Planning
>    Affects Versions: 4.0.0-alpha-1
>            Reporter: Chiran Ravani
>            Assignee: John Sherman
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> A query involving a case statement with two or more conditions returns an 
> incorrect result for tables in Parquet format. The problem is not observed 
> with ORC or TextFile.
> *Steps to reproduce*:
> {code:java}
> create external table case_test_parquet(kob varchar(2), enhanced_type_code int)
> stored as parquet;
> insert into case_test_parquet values('BB',18),('BC',18),('AB',18);
> select case when (
>                    (kob='BB' and enhanced_type_code='18')
>                    or (kob='BC' and enhanced_type_code='18')
>                  )
>             then 1
>             else 0
>         end as logic_check
> from case_test_parquet;
> {code}
> Result:
> {code}
> 0
> 0
> 0
> {code}
> Expected result:
> {code}
> 1
> 1
> 0
> {code}
> The problem does not appear when setting hive.optimize.point.lookup=false.



