hdorio opened a new issue, #1866:
URL: https://github.com/apache/orc/issues/1866

   Currently, the JSON output generated by the `orc-contents` command-line
utility writes decimal values as bare JSON numbers, which most JSON parsers
read as floating-point numbers. This loses precision, which is a particular
problem for financial data.
   
   ```bash
   echo "1.1299999999999991" > test.csv
   
   csv-import "struct<amount:decimal(38,18)>" test.csv test.orc
   orc-contents test.orc > test.json
   
   cat test.json | jq .amount
   node -e "console.log(JSON.parse(fs.readFileSync('test.json', 'utf8'))['amount']);"
   ruby -e "require 'json'; puts JSON.parse(File.read('test.json'))['amount'];"
   ```
   
   test.csv: `1.1299999999999991`
   test.json (test.orc as JSON): `{"amount": 1.129999999999999100}`
   
   ```
   # outputs
   [2024-03-30 15:55:24] Start importing Orc file...
   [2024-03-30 15:55:24] Finish importing Orc file.
   [2024-03-30 15:55:24] Total writer elasped time: 0.000281s.
   [2024-03-30 15:55:24] Total writer CPU time: 0.000277s.
   1.129999999999999 # Jq
   1.129999999999999 # NodeJS
   1.129999999999999 # Ruby
   ```
   
   **Note that the trailing `1` is truncated; the correct output should be `1.1299999999999991`.**
   
   Would it be acceptable to modify `Decimal128ColumnPrinter` (and
`Decimal64ColumnPrinter`) to emit the value as a JSON string instead, for
example `{"amount": "1.129999999999999100"}`?
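
For illustration only, here is a minimal Python sketch of the difference a consumer would see between the two forms. The two JSON literals are copied from this report; the variable names are hypothetical. The `parse_float=Decimal` line is a consumer-side workaround in Python's standard `json` module, not something `orc-contents` provides.

```python
import json
from decimal import Decimal

# Number form, as orc-contents currently writes it (literal copied from the report).
as_number = '{"amount": 1.129999999999999100}'
# String form proposed in this issue.
as_string = '{"amount": "1.129999999999999100"}'

# Default parsing turns the JSON number into a binary double, dropping digits.
lossy = json.loads(as_number)["amount"]

# A quoted value passes through the parser untouched; the consumer decides
# how to interpret it, here as an exact decimal.
exact = Decimal(json.loads(as_string)["amount"])

# Consumer-side workaround for the number form: ask the parser to build
# Decimal objects instead of floats.
workaround = json.loads(as_number, parse_float=Decimal)["amount"]

print(type(lossy).__name__, lossy)
print(type(exact).__name__, exact)
print(type(workaround).__name__, workaround)
```

The workaround depends on each consumer's parser offering such a hook, and the tools shown above (jq, Node's `JSON.parse`, Ruby's `JSON.parse`) use floating point by default. Emitting a string would make the exact value available to every consumer.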

