[ 
https://issues.apache.org/jira/browse/HIVE-20771?focusedWorklogId=462948&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-462948
 ]

ASF GitHub Bot logged work on HIVE-20771:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 24/Jul/20 13:13
            Start Date: 24/Jul/20 13:13
    Worklog Time Spent: 10m 
      Work Description: belugabehr merged pull request #1298:
URL: https://github.com/apache/hive/pull/1298


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 462948)
    Time Spent: 1h 50m  (was: 1h 40m)

> LazyBinarySerDe fails on empty structs.
> ---------------------------------------
>
>                 Key: HIVE-20771
>                 URL: https://issues.apache.org/jira/browse/HIVE-20771
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 1.2.2, 2.3.2, 3.1.0
>            Reporter: Clemens Valiente
>            Assignee: Clemens Valiente
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: HIVE-20771.patch
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> {code:java}
> CREATE TABLE cvaliente.structtest AS
> SELECT named_struct();
> SHOW CREATE TABLE cvaliente.structtest;
> SELECT * FROM cvaliente.structtest ORDER BY rand();
> {code}
> The resulting schema is:
> {code:sql}
> CREATE TABLE `cvaliente.structtest`(
>   `_c0` struct<>)
> ROW FORMAT SERDE 
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
> STORED AS INPUTFORMAT 
>   'org.apache.hadoop.mapred.TextInputFormat' 
> OUTPUTFORMAT 
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'hdfs://nameservice1/user/cvaliente/cvaliente/structtest2'
> TBLPROPERTIES (
>   'COLUMN_STATS_ACCURATE'='true', 
>   'numFiles'='1',       
>   'numRows'='1', 
>   'rawDataSize'='0', 
>   'totalSize'='1',      
>   'transient_lastDdlTime'='1539781607');
> {code}
> Between the MAP and REDUCE phase hive serializes to LazyBinaryStruct and when 
> trying to read the same object back the {{SELECT}} query above fails:
> {code}
> 2018-10-17 14:32:02,298 [FATAL] [TezChild] |tez.ReduceRecordSource|: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row (tag=0) 
> {"key":{"reducesinkkey0":0.13508293503238622},"value":{"_col0":{}}}
>       at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:338)
>       at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:259)
>       at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:169)
>       at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:164)
>       at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
>       at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>       at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>       at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
>       at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>       at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>       at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
> VALUE._col0
>       at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:82)
>       at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:329)
>       ... 17 more
> Caused by: java.lang.RuntimeException: length should be positive!
>       at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryNonPrimitive.init(LazyBinaryNonPrimitive.java:54)
>       at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.init(LazyBinaryStruct.java:95)
>       at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
>       at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
>       at 
> org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
>       at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:98)
>       at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>       at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>       at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77)
>       ... 18 more
> {code}
> this is because the LazyBinaryNonPrimitive doesn't allow for empty structs in 
> https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryNonPrimitive.java#L53



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to