Gopal V created HIVE-4758:
-----------------------------

             Summary: NULLs and record separators broken with vectorization 
branch intermediate outputs
                 Key: HIVE-4758
                 URL: https://issues.apache.org/jira/browse/HIVE-4758
             Project: Hive
          Issue Type: Bug
          Components: Query Processor
    Affects Versions: vectorization-branch
            Reporter: Gopal V
            Assignee: Gopal V


Queries on timestamp columns of partitioned tables return NULL for every row 
of the column if the first row in that column is NULL.

This was tracked down to timestamp columns failing to parse the map output, 
because the vectorized code writes that output in a different format from the 
unvectorized code.

The output file for the vectorized code contains:

{code}
(null)^A
2013-02-12 21:05:29^A
{code}

The unvectorized code instead outputs:

{code}
\N
2013-02-12 21:05:29
{code}
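
To make the difference between the two outputs concrete, here is a minimal sketch of the layout the unvectorized output follows: NULL is written as \N, and ^A appears only between fields, never after the last one. RowTextSketch and serializeRow are hypothetical names for illustration, not Hive's serializer; the ^A delimiter and the \N null sequence are assumed to be the defaults in effect.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; this is not Hive's serializer.
final class RowTextSketch {
    // Assumed defaults: ^A as the field delimiter, \N as the serialized NULL.
    static final char FIELD_SEP = '\u0001';
    static final String NULL_SEQ = "\\N";

    // Joins the fields of one row into a single line of text.
    static String serializeRow(List<String> fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                sb.append(FIELD_SEP); // separator between fields, none after the last
            }
            String field = fields.get(i);
            sb.append(field == null ? NULL_SEQ : field); // NULL becomes \N, not "(null)"
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A single-column row holding NULL, then one holding a timestamp.
        System.out.println(serializeRow(Collections.singletonList((String) null)));
        System.out.println(serializeRow(Arrays.asList("2013-02-12 21:05:29")));
    }
}
```

With a single timestamp column, this produces the two lines shown for the unvectorized code, with no trailing ^A.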

The vectorized code passes the "(null)" string on to the LazyTimestamp parser, 
which fails to parse it and returns NULL, but only after being slowed down 
massively by the resulting IllegalArgumentException.
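
The two paths described above can be sketched roughly as follows. This is an illustration under assumptions, not the LazyTimestamp source: TimestampFieldSketch and parseField are hypothetical names, and \N is assumed to be the configured null sequence. A field equal to \N is recognised with a cheap string comparison, while any other unparseable text such as "(null)" reaches java.sql.Timestamp.valueOf and becomes NULL only after an exception has been thrown and caught.

```java
import java.sql.Timestamp;

// Hypothetical sketch for illustration only; this is not Hive's LazyTimestamp.
final class TimestampFieldSketch {
    // Assumed null sequence, matching the unvectorized output.
    static final String NULL_SEQ = "\\N";

    // Returns the parsed timestamp, or null when the field represents NULL.
    static Timestamp parseField(String text) {
        if (text == null || NULL_SEQ.equals(text)) {
            return null; // cheap path: no parsing, no exception
        }
        try {
            return Timestamp.valueOf(text);
        } catch (IllegalArgumentException e) {
            // Slow path: text such as "(null)" ends up here, once per row.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseField("\\N"));
        System.out.println(parseField("(null)"));
        System.out.println(parseField("2013-02-12 21:05:29"));
    }
}
```

Both "\N" and "(null)" yield NULL here, so the result is the same; the difference is that the second pays for constructing and unwinding an exception on every affected row.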

In addition, the extraneous ^A prevents the actual timestamp values from being 
parsed as valid timestamps.






--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
