Varun Raval created ORC-1054:
--------------------------------

             Summary: Unable to compare data (generated using CSV to ORC 
converter) on timestamp column
                 Key: ORC-1054
                 URL: https://issues.apache.org/jira/browse/ORC-1054
             Project: ORC
          Issue Type: Bug
          Components: C++, Java
            Reporter: Varun Raval


I have a CSV file with timestamp columns. I convert it to an ORC file using the 
CSV to ORC converter and place the ORC file in a Hive table backed by ORC 
files. I am then unable to query the data by the timestamp column from Apache 
Hive beeline: when a query filters on the timestamp column, the matching rows 
are not returned.

For example, table csvtest has a single column (t) of type timestamp and 
contains one row, '2021-11-10 01:02:15'. The query "select * from csvtest where 
t > '2021-11-10 00:00:00'" returns no rows, while "select * from csvtest" 
returns the row correctly.
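
To make the expected result explicit, here is a minimal sketch in plain Python 
(not Hive or ORC code) that evaluates the same comparison on the literal values 
from the example. The helper name matches_filter is hypothetical, and the 
sketch assumes both values are plain timestamps with no time zone attached.

```python
from datetime import datetime

# Timestamp layout used by the literals in the example query.
FMT = "%Y-%m-%d %H:%M:%S"


def matches_filter(row_value: str, lower_bound: str) -> bool:
    """Return True if row_value is strictly later than lower_bound.

    Both arguments are parsed as timezone-naive timestamps (an assumption).
    """
    return datetime.strptime(row_value, FMT) > datetime.strptime(lower_bound, FMT)


row = "2021-11-10 01:02:15"    # the single row in csvtest
bound = "2021-11-10 00:00:00"  # the literal in the where clause

# Under the naive-timestamp assumption the row satisfies t > bound,
# so the filtered query would be expected to return it.
print(matches_filter(row, bound))
```

Under that assumption the row satisfies the predicate, so I would expect the 
filtered query to return it.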

However, the same query "select * from csvtest where t > '2021-11-10 00:00:00'" 
works with Spark SQL and rows are retrieved correctly.

Is this an issue with how the ORC file is created, or is it a Hive 
configuration issue?

I tested this on the master branch, and the results are the same for both the 
C++ and Java CSV to ORC converters.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
