alamb commented on PR #7741:
URL: https://github.com/apache/arrow-rs/pull/7741#issuecomment-3001716258

   
   > Right now, calling `insert()` with a duplicate key results in _two_ fields 
with the same key in the object, which deviates from the Variant spec. We're 
considering changing this behavior to either return an `Err` on the second 
`insert()` or to update the existing value-- similar to how 
`std::collections::HashMap::insert` works.
   > 
   > From a user standpoint, I'd prefer the latter approach. However, it's a 
relatively expensive operation. Since each `insert()` encodes the value 
directly into the backing buffer, updating a key would require rewriting not 
just the value for that key, but also everything that comes after it in the 
buffer.
   
   I also agree updating the existing value is preferable.
   
   My reading of the Variant spec is that it doesn't require all bytes in the 
variant's value to be used.
   
   So what I am saying is that I think it would be correct for the 
VariantBuilder to just update the key to point at the new value and leave the 
old value in the buffer (but not referenced) 🤔  That would result in a larger 
final variant, but I think as long as we document this behavior it would be ok 
from the user's perspective (I am envisioning many different possible 
optimizations for variant creation).
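
   To make the idea concrete, here is a minimal sketch. It is a toy, not the 
actual `VariantBuilder` API or the Variant binary layout: `ToyObjectBuilder`, 
its methods, and the `HashMap` of (offset, length) pairs are all made up for 
illustration. The only point is that a duplicate `insert()` appends the new 
value and repoints the key, so nothing after the old value has to be rewritten.

```rust
use std::collections::HashMap;

/// Hypothetical toy builder (NOT the arrow-rs API): every value is appended
/// to one backing buffer, and each key maps to the (offset, len) of the
/// value it currently refers to.
#[derive(Default)]
struct ToyObjectBuilder {
    buffer: Vec<u8>,
    fields: HashMap<String, (usize, usize)>,
}

impl ToyObjectBuilder {
    /// Appends `value` to the buffer and points `key` at it.
    /// Returns true if `key` already existed. In that case the old bytes
    /// stay in the buffer but are no longer referenced by any field.
    fn insert(&mut self, key: &str, value: &[u8]) -> bool {
        let offset = self.buffer.len();
        self.buffer.extend_from_slice(value);
        self.fields
            .insert(key.to_string(), (offset, value.len()))
            .is_some()
    }

    /// Returns the bytes currently referenced by `key`, if any.
    fn get(&self, key: &str) -> Option<&[u8]> {
        self.fields
            .get(key)
            .map(|&(offset, len)| &self.buffer[offset..offset + len])
    }

    /// Number of distinct keys in the object.
    fn field_count(&self) -> usize {
        self.fields.len()
    }

    /// Total bytes in the buffer, including unreferenced old values.
    fn buffer_len(&self) -> usize {
        self.buffer.len()
    }
}
```

   With this shape, inserting `"a"` twice leaves one field named `"a"` that 
reads back the second value, while the buffer still holds the bytes of both 
values. That is the size cost mentioned above.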
   

