alamb commented on issue #7423:
URL: https://github.com/apache/arrow-rs/issues/7423#issuecomment-2894597621

   @scovich  had some great comments on 
https://github.com/apache/arrow-rs/pull/7452#discussion_r2095809708 that I 
wanted to copy/paste into this ticket perhaps for wider discussion:
   
   
   > I have not reviewed the code carefully at all yet, and what follows is a 
general observation based on the inherent nature of variant data and rust 
notions of safety:
   > 
   > It will be really tempting to have "efficient" code that e.g. uses 
[from_utf8_unchecked](https://doc.rust-lang.org/std/primitive.str.html#method.from_utf8_unchecked)
 to extract a `&str` from a `&[u8]`, or to use indexing operations like `v[10]` 
to extract bytes. But variant data is generally untrusted user input and 
whatever `Variant` struct/enum we define will become the first -- and often 
only -- line of defense against malicious or malformed input.
   > 
   > Hopefully we can code carefully, with the goal that sizes and/or contents 
of metadata and value slices will never cause a panic?
   > 
   > Additinoally, it seems like we have a few choices for values such as 
strings and decimals even a right-sized byte slice can contain invalid values:
   > 
   > 1. Return obviously unvalidated values, e.g. `&[u8]` instead of `&str` for 
strings, and `&[u8]` instead of whatever `VariantDecimal` struct we might 
otherwise define -- leaving the user responsible to finish the conversion as 
(un)safely as they deem prudent.
   > 2. Return ostensibly validated values, with (safe) checked and (unsafe) 
unchecked constructors and/or getters that let the user choose the one they 
deem appropriate.
   > 
   > I personally favor the latter approach (safe and easy to use, even if not 
always the absolutely max efficient), but the topic probably needs a wider 
discussion.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to