liukun4515 commented on PR #3431:
URL: https://github.com/apache/arrow-rs/pull/3431#issuecomment-1370615611

   > > I think the data in the arrow ecosystem is exchanged by IPC format
   > 
   > Sometimes, but an important property is that data written by one 
implementation to CSV, Parquet, or whatever can be read by another
   > 
   
   Why is this related to other file formats?
   The changes only enhance writing for the Parquet file format; they will
   not impact CSV or any other file format.
   
   
   > To phrase my concern differently, decimals are a relatively esoteric type, 
with most arrow implementations having limited support. I worry with this PR we 
will now write decimal data in such a way arrow implementations that used to 
understand it, now won't.
   > 
   > Can you confirm pyarrow at least can correctly read the data written by 
this PR?
   
   From
https://github.com/apache/parquet-cpp/blob/master/src/parquet/arrow/reader.cc#L1227,
 the C++ implementation supports reading decimal data from the INT32/INT64
physical types, but it does not support writing decimals using them
(https://github.com/apache/parquet-cpp/blob/master/src/parquet/arrow/writer.cc#L811).
 This is consistent with the documented type mapping for writing Arrow data to Parquet:
   ```
   DECIMAL | INT32 / INT64 / BYTE_ARRAY / FIXED_LENGTH_BYTE_ARRAY | Decimal128 / Decimal256 | (2)

   (2) On the write side, a FIXED_LENGTH_BYTE_ARRAY is always emitted.
   ```
   The Go writing path is the same as the C++ one:
   go:
https://github.com/apache/arrow/blob/master/go/parquet/pqarrow/schema.go#L303
   But I can't find the writing path for pyarrow. @tustvold
   
   But all implementations support reading decimals from the
INT32/INT64/BYTE_ARRAY/FIXED_LEN_BYTE_ARRAY physical types in Parquet files.
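   As a quick way to check the pyarrow side empirically (a sketch, not part of this PR; it assumes a local pyarrow install and writes to an in-memory buffer), one can write a decimal column with default options and inspect which physical type pyarrow chose:

   ```python
   import io
   from decimal import Decimal

   import pyarrow as pa
   import pyarrow.parquet as pq

   # Write a small decimal128 column with pyarrow's default writer options.
   table = pa.table({"d": pa.array([Decimal("1.23"), Decimal("4.56")],
                                   type=pa.decimal128(5, 2))})
   buf = io.BytesIO()
   pq.write_table(table, buf)

   # Inspect the physical type chosen for the decimal column;
   # per the docs quoted above, this should be FIXED_LEN_BYTE_ARRAY.
   buf.seek(0)
   physical = pq.ParquetFile(buf).schema.column(0).physical_type
   print(physical)

   # Read the data back to confirm the round trip.
   buf.seek(0)
   values = pq.read_table(buf).column("d").to_pylist()
   print(values)
   ```

   The same read-back step, pointed at a file produced by this PR's INT32/INT64 writing path instead of the in-memory buffer, would answer whether pyarrow can read the data this PR writes.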

