cxzl25 opened a new pull request, #1930:
URL: https://github.com/apache/orc/pull/1930
### What changes were proposed in this pull request?
This PR fixes the `IllegalArgumentException` thrown when the benchmark reads
JSON data that contains timestamp columns.
When writing and reading JSON, timestamp values are now converted to a long
instead of a string.
### Why are the changes needed?
ORC-1191 switched the taxi data source from CSV to Parquet. The benchmark now
reads the Parquet timestamp values, which are stored in microseconds, whereas
Java's `java.sql.Timestamp` works with milliseconds.
Taxi source Parquet metadata:
```
optional int64 tpep_pickup_datetime (TIMESTAMP(MICROS,false));
optional int64 tpep_dropoff_datetime (TIMESTAMP(MICROS,false));
```
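To make the unit mismatch concrete, here is a minimal sketch of converting a Parquet `TIMESTAMP(MICROS)` value into a `java.sql.Timestamp`. The class name `MicrosToTimestamp` and the helper `fromMicros` are hypothetical names used for illustration only; they are not taken from the ORC benchmark code, and this is not necessarily how the patch handles the conversion.

```java
import java.sql.Timestamp;

public class MicrosToTimestamp {
    // Hypothetical helper: interpret a long as microseconds since the epoch.
    static Timestamp fromMicros(long micros) {
        // floorDiv/floorMod keep the sub-second part non-negative for pre-1970 values.
        long seconds = Math.floorDiv(micros, 1_000_000L);
        int nanos = ((int) Math.floorMod(micros, 1_000_000L)) * 1_000;
        Timestamp ts = new Timestamp(seconds * 1_000L); // constructor expects milliseconds
        ts.setNanos(nanos);                             // restore the sub-second precision
        return ts;
    }

    public static void main(String[] args) {
        long micros = 1446341079000000L; // sample value in microseconds
        Timestamp ts = fromMicros(micros);
        System.out.println(ts.getTime()); // prints 1446341079000 (milliseconds)
    }
}
```

Passing the raw microsecond value straight to `new Timestamp(long)` instead treats it as milliseconds, which is what produces a date thousands of years in the future.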
When we write the data to JSON and then run the scan command, we get the
following error.
```bash
java -jar core/target/orc-benchmarks-core-*-uber.jar scan data -format json
```
```
Exception in thread "main" java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
    at java.sql/java.sql.Timestamp.valueOf(Timestamp.java:224)
    at org.apache.orc.bench.core.convert.json.JsonReader$TimestampColumnConverter.convert(JsonReader.java:175)
    at org.apache.orc.bench.core.convert.json.JsonReader.nextBatch(JsonReader.java:86)
    at org.apache.orc.bench.core.convert.ScanVariants.run(ScanVariants.java:92)
    at org.apache.orc.bench.core.Driver.main(Driver.java:64)
```
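This happens because JSON timestamp values are written with
`java.sql.Timestamp#toString` and read back with `java.sql.Timestamp#valueOf`.
A microsecond value interpreted as milliseconds yields a year with more than
four digits, and `valueOf` rejects the resulting string.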
```java
Timestamp ts = new Timestamp(1446341079000000L);
System.out.println(ts);
System.out.println(Timestamp.valueOf(ts.toString()));
```
```
47802-09-23 02:50:00.0
Exception in thread "main" java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
    at java.sql.Timestamp.valueOf(Timestamp.java:237)
```
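The difference between the two serialization routes can be sketched as follows. The class `TimestampRoundTrip` and its two helpers are hypothetical and only illustrate the idea of storing the value as a long; this sketch assumes the long is the millisecond value from `getTime()`, which may differ from what the patch actually writes.

```java
import java.sql.Timestamp;

public class TimestampRoundTrip {
    // String route: toString() followed by valueOf().
    static boolean stringRoundTrips(Timestamp ts) {
        try {
            return Timestamp.valueOf(ts.toString()).equals(ts);
        } catch (IllegalArgumentException e) {
            return false; // valueOf rejected the formatted string
        }
    }

    // Long route (assumption: the stored long is getTime(), i.e. milliseconds).
    static boolean longRoundTrips(Timestamp ts) {
        return new Timestamp(ts.getTime()).equals(ts);
    }

    public static void main(String[] args) {
        // Same value as in the PR example: microseconds passed as milliseconds.
        Timestamp ts = new Timestamp(1446341079000000L);
        System.out.println(stringRoundTrips(ts)); // prints false
        System.out.println(longRoundTrips(ts));   // prints true
    }
}
```

Note that a long taken from `getTime()` only carries millisecond precision, so a timestamp with finer sub-millisecond nanos would not compare equal after this particular round trip.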
### How was this patch tested?
Tested locally:
```bash
java -jar core/target/orc-benchmarks-core-*-uber.jar generate data -format json -data taxi -compress snappy
```
```bash
java -jar core/target/orc-benchmarks-core-*-uber.jar scan data -format json -data taxi -compress snappy
```
### Was this patch authored or co-authored using generative AI tooling?
No