[ 
https://issues.apache.org/jira/browse/ORC-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuxiaoyu updated ORC-1287:
---------------------------
    Description: 
I have a CSV file that I converted to an ORC file with the C++ csv-import tool.

The ORC version is v1.7.3.

Commands

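```
# Convert the CSV file to ORC with the C++ tool (the schema is quoted so the shell does not treat < and > as redirections)
csv-import "struct<a:timestamp>" ./test.csv ./test.orc
# Read the ORC file with the Java tool
java -jar orc-tools-1.7.3-uber.jar data test.orc
# Read the ORC file with the C++ tool
orc-contents ./test.orc
```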

  

 

CSV file

```
0001-01-01 00:00:00.000000
0001-10-19 10:23:54.123456
0099-10-19 10:23:54.123456
1900-10-19 10:23:54.123456
1969-12-31 23:59:59.001
1969-12-31 23:59:59.999999
1970-01-01 00:00:00.000
1970-01-01 00:00:00.001
1970-01-01 23:59:59.999999
```

ORC file read with C++ (orc-contents)

```
{"a": "1-01-01 00:00:00.0"}
{"a": "1-10-19 10:23:54.123456"}
{"a": "99-10-19 10:23:54.123456"}
{"a": "1900-10-19 10:23:54.123456"}
{"a": "1970-01-01 00:00:00.001"}
{"a": "1970-01-01 00:00:00.999999"}
{"a": "1970-01-01 00:00:00.0"}
{"a": "1970-01-01 00:00:00.001"}
{"a": "1970-01-01 23:59:59.999999"}
```

ORC file read with Java (orc-tools data)

```
{"a":"0001-01-03 08:00:00.0"}
{"a":"0001-10-21 18:23:54.123456"}
{"a":"0099-10-21 18:23:54.123456"}
{"a":"1900-10-19 18:29:37.123456"}
{"a":"1970-01-01 08:00:00.001"}
{"a":"1970-01-01 08:00:00.999999"}
{"a":"1970-01-01 08:00:00.0"}
{"a":"1970-01-01 08:00:00.001"}
{"a":"1970-01-02 07:59:59.999999"}
```

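Java and C++ show different timestamps. For example, the input `0001-01-01 00:00:00.000000` is shown as `1-01-01 00:00:00.0` by C++ and as `0001-01-03 08:00:00.0` by Java.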
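
One rough way to quantify the mismatch is to subtract each CSV input value from what each reader printed. The sketch below does this in Python using only the values quoted in this report. The names `ROWS`, `parse`, and `offsets` are made up for illustration, and Python's `datetime` (proleptic Gregorian, no time zone) is used only as a ruler; it is not how either ORC reader works.

```python
import re
from datetime import datetime, timedelta

# (CSV input, C++ orc-contents output, Java orc-tools output), copied from the report.
ROWS = [
    ("0001-01-01 00:00:00.000000", "1-01-01 00:00:00.0", "0001-01-03 08:00:00.0"),
    ("0001-10-19 10:23:54.123456", "1-10-19 10:23:54.123456", "0001-10-21 18:23:54.123456"),
    ("0099-10-19 10:23:54.123456", "99-10-19 10:23:54.123456", "0099-10-21 18:23:54.123456"),
    ("1900-10-19 10:23:54.123456", "1900-10-19 10:23:54.123456", "1900-10-19 18:29:37.123456"),
    ("1969-12-31 23:59:59.001", "1970-01-01 00:00:00.001", "1970-01-01 08:00:00.001"),
    ("1969-12-31 23:59:59.999999", "1970-01-01 00:00:00.999999", "1970-01-01 08:00:00.999999"),
    ("1970-01-01 00:00:00.000", "1970-01-01 00:00:00.0", "1970-01-01 08:00:00.0"),
    ("1970-01-01 00:00:00.001", "1970-01-01 00:00:00.001", "1970-01-01 08:00:00.001"),
    ("1970-01-01 23:59:59.999999", "1970-01-01 23:59:59.999999", "1970-01-02 07:59:59.999999"),
]

_TS = re.compile(r"(\d+)-(\d+)-(\d+) (\d+):(\d+):(\d+)\.(\d+)")


def parse(text):
    """Parse 'Y-MM-DD hh:mm:ss.f'; the year may be unpadded, as in the C++ output."""
    match = _TS.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized timestamp: {text!r}")
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    # Right-pad the fraction to microseconds: "001" -> 1000, "0" -> 0.
    micros = int(match.group(7).ljust(6, "0")[:6])
    return datetime(year, month, day, hour, minute, second, micros)


def offsets():
    """Return one (C++ output - input, Java output - input) pair per row."""
    result = []
    for source, cpp, java in ROWS:
        base = parse(source)
        result.append((parse(cpp) - base, parse(java) - base))
    return result


if __name__ == "__main__":
    for (source, _, _), (cpp_delta, java_delta) in zip(ROWS, offsets()):
        print(f"{source:<27} C++ {cpp_delta!s:>16}  Java {java_delta!s:>16}")
```

With these values, the C++ output matches the input except for the two 1969 rows, which come out one second late. The Java output is ahead by 8 hours for the 1970 rows, by 8:00:01 for the 1969 rows, by 8:05:43 for the 1900 row, and by 2 days 8 hours for the year 0001 and 0099 rows. That pattern might point to a time zone offset and a calendar difference on the Java side, plus a one-second shift for values before 1970, but this is only a guess and is not confirmed by anything in this report.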

  

I also tried the ORC main branch and got the same results.

  

This issue looks similar to ORC-1055:

https://issues.apache.org/jira/browse/ORC-1055


> C++ read timestamp value is different from java read when using csv-import 
> tool convert CSV to ORC files
> --------------------------------------------------------------------------------------------------------
>
>                 Key: ORC-1287
>                 URL: https://issues.apache.org/jira/browse/ORC-1287
>             Project: ORC
>          Issue Type: Bug
>          Components: C++
>    Affects Versions: 1.7.3
>         Environment: centos7
>            Reporter: liuxiaoyu
>            Priority: Major
>         Attachments: test.csv, test.orc
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
