etseidl commented on code in PR #6840:
URL: https://github.com/apache/arrow-rs/pull/6840#discussion_r1874084200
##########
parquet/src/file/properties.rs:
##########
@@ -780,22 +779,16 @@ impl WriterPropertiesBuilder {
self
}
- /// Sets flag to control if type coercion is enabled (defaults to `false`).
+ /// Should the writer coerce types to parquet native types (defaults to `false`).
///
- /// # Notes
- /// Some Arrow types do not have a corresponding Parquet logical type.
- /// Affected Arrow data types include `Date64`, `Timestamp` and `Interval`.
- /// Also, for [`List`] and [`Map`] types, Parquet expects certain schema elements
- /// to have specific names to be considered fully compliant.
- /// Writers have the option to coerce these types and names to match those required
- /// by the Parquet specification.
- /// This type coercion allows for meaningful representations that do not require
- /// downstream readers to consider the embedded Arrow schema, and can allow for greater
- /// compatibility with other Parquet implementations. However, type
- /// coercion also prevents the data from being losslessly round-tripped.
- ///
- /// [`List`]: https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#lists
- /// [`Map`]: https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#maps
+ /// Setting this option to `true` will result in parquet files that can be
+ /// read by more readers, but may lose precision for arrow types such as
+ /// [`DataType::Date64`] which have no direct corresponding Parquet type.
Review Comment:
💯
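
For readers wondering what "may lose precision" means for `Date64`, here is a minimal standalone sketch. It assumes coercion stores a `Date64` value (milliseconds since the Unix epoch) as a days-based Parquet date; the helper names `date64_to_days` and `days_to_date64` are hypothetical and are not part of the parquet crate.

```rust
/// Milliseconds in one day. Arrow's Date64 counts milliseconds since the Unix epoch.
const MILLIS_PER_DAY: i64 = 86_400_000;

/// Hypothetical helper (not a parquet crate API): reduce a Date64 value to
/// whole days since the epoch, assuming that is how a coerced date is stored.
/// `div_euclid` rounds toward negative infinity, so pre-1970 values stay on
/// the correct calendar day.
fn date64_to_days(millis: i64) -> i32 {
    millis.div_euclid(MILLIS_PER_DAY) as i32
}

/// Hypothetical helper: widen a day count back to milliseconds since the epoch.
fn days_to_date64(days: i32) -> i64 {
    i64::from(days) * MILLIS_PER_DAY
}
```

Under that assumption, a value of `129_600_000` ms (noon on 1970-01-02) maps to day `1` and comes back as `86_400_000` ms, so the time-of-day component does not survive the round trip.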