[
https://issues.apache.org/jira/browse/HIVE-8197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158665#comment-14158665
]
Matt McCline commented on HIVE-8197:
------------------------------------
No longer reproduces. Fixed by an earlier change that simplified
VectorFileSinkOperator to just forward rows rather than buffer them in
VectorOrcSerde.
> Tez and Vectorization Insert into ORC Table with timestamp column erroneously
> repeats the last row's column value
> -----------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-8197
> URL: https://issues.apache.org/jira/browse/HIVE-8197
> Project: Hive
> Issue Type: Bug
> Environment: Tez and Vectorization.
> Reporter: Matt McCline
> Assignee: Matt McCline
> Priority: Critical
>
> In diagnosing why only(?) a Tez and Vectorized query with min and max
> aggregates was always returning the last row read's column value, I discovered
> that the problem was in creating the test table:
> {code}
> CREATE TABLE alltypesorc_string STORED AS ORC AS SELECT
> ctinyint as ctinyint,
> to_utc_timestamp(ctimestamp1, 'America/Los_Angeles') as ctimestamp1,
> CAST(to_utc_timestamp(ctimestamp1, 'America/Los_Angeles') AS STRING) as stimestamp1
> FROM alltypesorc WHERE ctinyint > 0
> LIMIT 40;
> {code}
> I think it is related to what Prasanth mentioned as a possibility: Saving a
> Timestamp as a Writable object that gets overwritten. One suspect is the
> Writable[] records array in VectorFileSinkOperator in the ProcessOp method.
> Or, perhaps it is in VectorReduceSinkOperator.
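
The suspected failure mode above (buffering a reference to a reused, mutable Writable) can be illustrated with a minimal sketch. This is not Hive code: `MutableTimestamp` is a hypothetical stand-in for a mutable Writable such as a timestamp writable, and `bufferByReference` / `bufferByCopy` are made-up names used only to contrast the two behaviors.

```java
import java.util.ArrayList;
import java.util.List;

public class WritableReuseSketch {
    // Hypothetical stand-in for a mutable Writable (assumption, not a Hive class).
    static final class MutableTimestamp {
        long millis;

        void set(long millis) {
            this.millis = millis;
        }

        MutableTimestamp copy() {
            MutableTimestamp c = new MutableTimestamp();
            c.set(millis);
            return c;
        }
    }

    // Buggy pattern: one object is reused for every row, and the buffer
    // stores a reference to it, so every buffered slot aliases the same object.
    public static List<Long> bufferByReference(long[] input) {
        MutableTimestamp reused = new MutableTimestamp();
        List<MutableTimestamp> buffer = new ArrayList<>();
        for (long value : input) {
            reused.set(value);
            buffer.add(reused);
        }
        return drain(buffer);
    }

    // Safe pattern: copy the value before buffering it
    // (forwarding each row immediately avoids the problem as well).
    public static List<Long> bufferByCopy(long[] input) {
        MutableTimestamp reused = new MutableTimestamp();
        List<MutableTimestamp> buffer = new ArrayList<>();
        for (long value : input) {
            reused.set(value);
            buffer.add(reused.copy());
        }
        return drain(buffer);
    }

    // Reads the buffered values back out, as a later write step would.
    private static List<Long> drain(List<MutableTimestamp> buffer) {
        List<Long> out = new ArrayList<>();
        for (MutableTimestamp t : buffer) {
            out.add(t.millis);
        }
        return out;
    }

    public static void main(String[] args) {
        long[] rows = {1000L, 2000L, 3000L};
        System.out.println(bufferByReference(rows)); // [3000, 3000, 3000]
        System.out.println(bufferByCopy(rows));      // [1000, 2000, 3000]
    }
}
```

With the by-reference buffer, every row reports the last value written, which matches the symptom in the issue title. Whether the actual defect sat in VectorFileSinkOperator or elsewhere is not established by this sketch.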