[
https://issues.apache.org/jira/browse/ORC-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451204#comment-17451204
]
Varun Raval edited comment on ORC-1053 at 11/30/21, 3:55 PM:
-------------------------------------------------------------
I tested using the main branch. The sample CSV file is timestamp.csv; it has a
single column.
The files converted_by_cpp.orc and converted_by_java.orc were generated by the
C++ and Java tools, respectively.
Commands used:
# /root/orc/build/tools/src/csv-import "struct<d:timestamp>" input.csv output.orc
# java -jar /root/orc/build/java/tools/orc-tools-1.8.0-SNAPSHOT-uber.jar convert --schema "struct<d:timestamp>" -o output.orc -t "yyyy-MM-dd HH:mm:ss.SSS" input.csv
The destination table in Hive is an external table that contains a single
column of type timestamp. The table description is shown in
hive_table_desc.jpg.
> Timestamp values read in Hive are different when using ORC file created using
> CSV to ORC converter tools
> --------------------------------------------------------------------------------------------------------
>
> Key: ORC-1053
> URL: https://issues.apache.org/jira/browse/ORC-1053
> Project: ORC
> Issue Type: Bug
> Components: C++, Java
> Reporter: Varun Raval
> Priority: Major
> Attachments: converted_by_cpp.orc, converted_by_java.orc,
> timestamp.csv
>
>
> I have a CSV file with a column containing the timestamp value 0001-01-01
> 00:00:00.0. I convert the CSV file to an ORC file using the CSV to ORC
> converter tools and place the ORC file in a Hive table backed by ORC files.
> Querying the data with Hive beeline and Spark SQL returns values that differ
> from the input and depend on which tool did the conversion.
> If converted using the C++ tool, the value read by Hive beeline and Spark SQL
> queries is 0001-01-03 00:00:00.
> If converted using the Java tool, the value read by Hive beeline and Spark SQL
> queries is 0001-01-02 23:56:02.0.
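
The two-day shift in the C++ result is the same size as the gap between the
proleptic Gregorian and Julian calendars in year 1. Whether a calendar mismatch
between writer and reader is actually the cause here is an assumption and is
not confirmed in this thread. The sketch below only demonstrates the
arithmetic; the helper names gregorian_to_jdn and jdn_to_julian are
hypothetical and are not part of ORC, Hive, or Spark.

```python
# Illustration only: shows that the proleptic Gregorian date 0001-01-01
# falls on 0001-01-03 in the Julian calendar. It does not reproduce the
# ORC writer or the Hive reader; both helper names are hypothetical.

def gregorian_to_jdn(year, month, day):
    """Julian Day Number of a proleptic Gregorian calendar date."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return (day + (153 * m + 2) // 5 + 365 * y
            + y // 4 - y // 100 + y // 400 - 32045)


def jdn_to_julian(jdn):
    """(year, month, day) in the Julian calendar for a Julian Day Number."""
    c = jdn + 32082
    d = (4 * c + 3) // 1461
    e = c - (1461 * d) // 4
    m = (5 * e + 2) // 153
    day = e - (153 * m + 2) // 5 + 1
    month = m + 3 - 12 * (m // 10)
    year = d - 4800 + m // 10
    return year, month, day


if __name__ == "__main__":
    jdn = gregorian_to_jdn(1, 1, 1)
    print(jdn_to_julian(jdn))  # prints (1, 1, 3)
```

This sketch does not account for the further 3 minute 58 second difference
seen in the Java result (23:56:02 instead of 00:00:00).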