scovich commented on code in PR #8350: URL: https://github.com/apache/arrow-rs/pull/8350#discussion_r2357600500
########## arrow-schema/src/field.rs: ########## @@ -391,6 +398,49 @@ impl Field { /// ``` pub fn with_data_type(mut self, data_type: DataType) -> Self { self.set_data_type(data_type); + self.without_extension_type() Review Comment: This change seems really dangerous... it will silently strip away the extension type information, which can propagate into e.g. the parquet file the column eventually gets written to, which could be very hard to fix later on when downstream readers start noticing the extension type info is unexpectedly missing. Seems better to make this fallible, and blow up if there is an extension type for which the new data type is incompatible? Could potentially still keep the original (infallible) version and panic if the new type is invalid. Doing that means people upgrading to arrow-57 won't necessarily realize there's a potential problem until their code starts crashing, which I don't love, but I suspect some people would rather deal with a potential panic than have to rewrite all their `with_data_type` calls? Alternatively -- maybe we leave the panicky infallible version in place, but deprecated with an appropriate warning? But people with automated builds won't necessarily _see_ the warning, especially if they already have lots of warnings in their code. If somebody _actually_ wanted to clear the extension type, it seems much better to require them to clearly state their intent by e.g. `field.without_extension_type().with_data_type(...)` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org