scovich commented on code in PR #8350:
URL: https://github.com/apache/arrow-rs/pull/8350#discussion_r2357600500


##########
arrow-schema/src/field.rs:
##########
@@ -391,6 +398,49 @@ impl Field {
     /// ```
     pub fn with_data_type(mut self, data_type: DataType) -> Self {
         self.set_data_type(data_type);
+        self.without_extension_type()

Review Comment:
   This change seems really dangerous... it will silently strip away the 
extension type information, which can propagate into e.g. the parquet file the 
column eventually gets written to, which could be very hard to fix later on 
when downstream readers start noticing the extension type info is unexpectedly 
missing.
   
   Seems better to make this fallible, and blow up if there is an extension 
type for which the new data type is incompatible? Could potentially still keep 
the original (infallible) version and panic if the new type is invalid. Doing 
that means people upgrading to arrow-57 won't necessarily realize there's a 
potential problem until their code starts crashing, which I don't love, but I 
suspect some people would rather deal with a potential panic than have to 
rewrite all their `with_data_type` calls? Alternatively -- maybe we leave the 
panicky infallible version in place, but deprecated with an appropriate 
warning? But people with automated builds won't necessarily _see_ the warning, 
especially if they already have lots of warnings in their code.
   
   If somebody _actually_ wanted to clear the extension type, it seems much 
better to require them to clearly state their intent by e.g. 
`field.without_extension_type().with_data_type(...)`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to