scovich opened a new pull request, #9270: URL: https://github.com/apache/arrow-rs/pull/9270
# Which issue does this PR close?

- Part of https://github.com/apache/arrow-rs/issues/8987

# Rationale for this change

Today's JSON decoder helper, `make_decoder`, takes an owned data type whose components are cloned at every level during recursive decoder initialization. This breaks pointer stability: the `DataType` instances that a custom JSON decoder factory sees are no longer the same instances as those in the schema that the factory and the reader builder were initialized with.

The lack of pointer stability prevents users from creating "path based" decoder factories that customize decoder behavior based not only on type but also on the field's path in the schema. See the `PathBasedDecoderFactory` in arrow-json/tests/custom_decoder_tests.rs of https://github.com/apache/arrow-rs/pull/9259 for an example.

# What changes are included in this PR?

By passing `&DataType` instead, we change code like this:

```rust
let decoder = make_decoder(field.data_type().clone(), ...);
Ok(Self { data_type, decoder, ... })
```

to this:

```rust
let decoder = make_decoder(field.data_type(), ...);
Ok(Self { data_type: data_type.clone(), decoder, ... })
```

Result: Every call to `make_decoder` receives a reference to the actual original data type from the builder's input schema. The final decoder `Self` is unchanged; it already received a clone and continues to do so.

NOTE: There is one additional clone of the top-level `DataType::Struct` we create for normal (`!is_field`) builders, but that is a cheap `Arc` clone of a `Fields` member.

# Are these changes tested?

Yes, existing unit tests validate the change.

# Are there any user-facing changes?

No. All functions and data types involved are private: the array decoders are marked `pub` but are defined in a private mod with no public re-export that would make them available outside the crate.
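To illustrate why pointer stability matters for a path-based factory, here is a minimal sketch using only the standard library. The `DataType` and `Field` types below are simplified, hypothetical stand-ins (not the real arrow-schema types), and `index_paths`, `visit_by_ref`, and `visit_by_value` are illustrative names rather than functions from arrow-json. The sketch indexes every nested type by its address, then walks the schema once by reference and once by cloning at each level.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical stand-ins for arrow's `DataType` and `Field` (assumption, simplified).
#[derive(Clone, Debug)]
enum DataType {
    Int64,
    Utf8,
    Struct(Arc<[Field]>),
}

#[derive(Clone, Debug)]
struct Field {
    name: String,
    data_type: DataType,
}

/// Maps the address of every nested `DataType` in the schema to its dotted path.
fn index_paths(dt: &DataType, prefix: &str, out: &mut HashMap<*const DataType, String>) {
    out.insert(dt as *const DataType, prefix.to_string());
    if let DataType::Struct(fields) = dt {
        for f in fields.iter() {
            let path = if prefix.is_empty() {
                f.name.clone()
            } else {
                format!("{}.{}", prefix, f.name)
            };
            index_paths(&f.data_type, &path, out);
        }
    }
}

/// Walks by reference: every visited address is the schema's own.
fn visit_by_ref(dt: &DataType, index: &HashMap<*const DataType, String>, seen: &mut Vec<Option<String>>) {
    seen.push(index.get(&(dt as *const DataType)).cloned());
    if let DataType::Struct(fields) = dt {
        for f in fields.iter() {
            visit_by_ref(&f.data_type, index, seen);
        }
    }
}

/// Walks by value, cloning at each level: every visited address belongs to a temporary copy.
fn visit_by_value(dt: DataType, index: &HashMap<*const DataType, String>, seen: &mut Vec<Option<String>>) {
    seen.push(index.get(&(&dt as *const DataType)).cloned());
    if let DataType::Struct(fields) = &dt {
        for f in fields.iter() {
            visit_by_value(f.data_type.clone(), index, seen);
        }
    }
}

/// Builds a small schema and returns the path lookups seen by each style of traversal.
fn demo() -> (Vec<Option<String>>, Vec<Option<String>>) {
    let schema = DataType::Struct(Arc::from(vec![
        Field { name: "id".into(), data_type: DataType::Int64 },
        Field {
            name: "user".into(),
            data_type: DataType::Struct(Arc::from(vec![Field {
                name: "name".into(),
                data_type: DataType::Utf8,
            }])),
        },
    ]));
    let mut index = HashMap::new();
    index_paths(&schema, "", &mut index);

    let mut by_ref = Vec::new();
    visit_by_ref(&schema, &index, &mut by_ref);
    let mut by_value = Vec::new();
    visit_by_value(schema.clone(), &index, &mut by_value);
    (by_ref, by_value)
}

fn main() {
    let (by_ref, by_value) = demo();
    println!("by reference: {:?}", by_ref);
    println!("by value:     {:?}", by_value);
}
```

In this sketch the by-reference walk resolves every node to its path (the root maps to the empty path), while the cloning walk resolves none, because each clone is a fresh value at a different address even though the underlying `Arc` contents are shared.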
