friendlymatthew commented on issue #7700: URL: https://github.com/apache/arrow-rs/issues/7700#issuecomment-2989302495
Hi hi, I took a look at implementing a fix here to avoid superfluous checks for `String` vs. `ShortString`. If a user passes in a `String` directly, we do one check in the `.into` implementation to convert the `String` to a `Variant`. Then we do a match on the variant type, and we check the string length before appending the value. The check inside `.into` seems unavoidable without changing the API of the function to be less generic. The string length check right before appending also seems unavoidable because it's possible for users to construct `Variant::ShortString`s directly, without doing input validation. This means you can have a short string variant which actually contains a string that is longer than 63 bytes. I have a change [here](https://github.com/apache/arrow-rs/commit/9f58e893eacd7c16518f4ba7192af580b55dfadd#diff-19c7b0b0d73ef11489af7932f49046a19ec7790896a8960add5a3ded21d5657aR762) which splits the `append_string` function into `append_string` and `append_short_string` and moves the length check directly into `append_value`. Is this the sort of change you were thinking about for this issue, or is there some way we could avoid checking the string length more than once here? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
