[ 
https://issues.apache.org/jira/browse/HIVE-8197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158665#comment-14158665
 ] 

Matt McCline commented on HIVE-8197:
------------------------------------

No longer reproduces.  This was fixed by an earlier change that simplified 
VectorFileSinkOperator to just forward rows rather than buffer them in 
VectorOrcSerde.

> Tez and Vectorization Insert into ORC Table with timestamp column erroneously 
> repeats the last row's column value
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-8197
>                 URL: https://issues.apache.org/jira/browse/HIVE-8197
>             Project: Hive
>          Issue Type: Bug
>         Environment: Tez and Vectorization.
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>
> While diagnosing why only(?) a Tez and Vectorized query with min and max 
> aggregates was always returning the last row read's column value, I discovered 
> that the problem was in creating the test table...
> {code}
> CREATE TABLE alltypesorc_string STORED AS ORC AS SELECT
>   ctinyint as ctinyint,
>   to_utc_timestamp(ctimestamp1, 'America/Los_Angeles') as ctimestamp1,
>   CAST(to_utc_timestamp(ctimestamp1, 'America/Los_Angeles') AS STRING) as stimestamp1
> FROM alltypesorc WHERE ctinyint > 0
> LIMIT 40;
> {code}
> I think it is related to what Prasanth mentioned as a possibility: saving a 
> Timestamp as a Writable object that later gets overwritten.  One suspect is the 
> Writable[] records array in VectorFileSinkOperator in the ProcessOp method.
> Or, perhaps it is in VectorReduceSinkOperator.
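
The suspected failure mode in the description (a reused mutable Writable being 
saved by reference and then overwritten) can be illustrated with a minimal, 
self-contained Java sketch.  This is not Hive code: MutableTimestamp, 
bufferByReference, bufferByCopy and forwardImmediately are hypothetical names 
invented for the illustration, and MutableTimestamp only stands in for a 
mutable Writable.  It shows why buffering references would repeat the last 
row's value, and why copying or forwarding each row right away would not.

```java
import java.util.ArrayList;
import java.util.List;

public class WritableReuseSketch {

    /** Hypothetical stand-in for a mutable, reusable Writable holding a timestamp. */
    static final class MutableTimestamp {
        private long millis;

        void set(long millis) { this.millis = millis; }

        long get() { return millis; }

        MutableTimestamp copy() {
            MutableTimestamp c = new MutableTimestamp();
            c.set(millis);
            return c;
        }
    }

    /** Buggy pattern: every buffered slot aliases the same reused object. */
    static List<Long> bufferByReference(long[] rows) {
        MutableTimestamp reused = new MutableTimestamp();
        List<MutableTimestamp> buffer = new ArrayList<>();
        for (long row : rows) {
            reused.set(row);     // overwrites the value seen by all earlier slots
            buffer.add(reused);  // stores a reference, not a snapshot
        }
        return drain(buffer);
    }

    /** One possible fix: snapshot the value before buffering it. */
    static List<Long> bufferByCopy(long[] rows) {
        MutableTimestamp reused = new MutableTimestamp();
        List<MutableTimestamp> buffer = new ArrayList<>();
        for (long row : rows) {
            reused.set(row);
            buffer.add(reused.copy());
        }
        return drain(buffer);
    }

    /** Another fix: consume each row before the object is reused (no buffer). */
    static List<Long> forwardImmediately(long[] rows) {
        MutableTimestamp reused = new MutableTimestamp();
        List<Long> written = new ArrayList<>();
        for (long row : rows) {
            reused.set(row);
            written.add(reused.get());  // value is read before the next overwrite
        }
        return written;
    }

    /** Reads the buffered objects only after all rows have been processed. */
    private static List<Long> drain(List<MutableTimestamp> buffer) {
        List<Long> written = new ArrayList<>();
        for (MutableTimestamp t : buffer) {
            written.add(t.get());
        }
        return written;
    }

    public static void main(String[] args) {
        long[] rows = {1000L, 2000L, 3000L};
        System.out.println(bufferByReference(rows));   // [3000, 3000, 3000]
        System.out.println(bufferByCopy(rows));        // [1000, 2000, 3000]
        System.out.println(forwardImmediately(rows));  // [1000, 2000, 3000]
    }
}
```

The third variant is the closest analogue to the fix described in the comment 
(forwarding rows instead of buffering them); whether the real operator 
behaved exactly like the first variant is an assumption of this sketch.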



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
