xianwill opened a new issue #653:
URL: https://github.com/apache/arrow-rs/issues/653


   **Describe the bug**
   Large `i64` and `u64` values are corrupted by the round trip through `f64` in the [json decoder build_primitive_array method](https://github.com/apache/arrow-rs/blob/b38a4b6c29ba8b9be02460183c61de86bd9ba7df/arrow/src/json/reader.rs#L930-L931): `f64` has only a 53-bit mantissa, so integers larger than 2^53 cannot be represented exactly and are silently rounded when cast back to `i64`/`u64`.
   
   **To Reproduce**
   Pass a large `i64` value through the decoder, as demonstrated in [this commit](https://github.com/apache/arrow-rs/pull/652/commits/405683aa2b30e112c9851b7588b03d0a9d3421a8). The converted value comes out slightly smaller: in the breaking test, I passed `1627668684594000000` and the resulting value was `1627668684593999872` - a difference of `128`.
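   
   The same corruption is reproducible outside the decoder with a plain cast round trip (a minimal sketch using the value from the test above):
   
   ```rust
   fn main() {
       let original: i64 = 1627668684594000000;
   
       // Round trip through f64, as the decoder does internally.
       // f64 has a 53-bit mantissa, so at this magnitude (~2^60) adjacent
       // representable values are 256 apart and the cast rounds to the
       // nearest one.
       let round_tripped = original as f64 as i64;
   
       assert_eq!(round_tripped, 1627668684593999872);
       assert_eq!(original - round_tripped, 128);
   }
   ```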
   
   **Expected behavior**
   The converted value should match the value passed to the decoder. In this 
case, the value in the created record batch should be `1627668684594000000`.
   
   **Additional context**
   I found this bug while implementing timestamp support in [kafka-delta-ingest](https://github.com/delta-io/kafka-delta-ingest/pull/44) and [delta-rs](https://github.com/delta-io/delta-rs/pull/340), where valid nanosecond timestamps are on the critical path. I already have [an arrow-rs PR](https://github.com/apache/arrow-rs/pull/652/commits/405683aa2b30e112c9851b7588b03d0a9d3421a8) in place to fix this.
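   
   For reference, one way to avoid the round trip is to read integer JSON values directly as `i64` rather than through `f64`. The sketch below illustrates the difference with `serde_json` (for illustration only; not necessarily the exact approach taken in the PR):
   
   ```rust
   use serde_json::Value;
   
   fn main() {
       let v: Value = serde_json::from_str("1627668684594000000").unwrap();
   
       // Reading the number as f64 and casting back (the current decoder
       // behavior) silently changes the value.
       let via_f64 = v.as_f64().unwrap() as i64;
       assert_eq!(via_f64, 1627668684593999872);
   
       // Reading it directly as i64 preserves the exact value.
       let direct = v.as_i64().unwrap();
       assert_eq!(direct, 1627668684594000000);
   }
   ```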
   

