alamb commented on PR #7741: URL: https://github.com/apache/arrow-rs/pull/7741#issuecomment-3001716258
> Right now, calling `insert()` with a duplicate key results in _two_ fields with the same key in the object, which deviates from the Variant spec. We're considering changing this behavior to either return an `Err` on the second `insert()` or to update the existing value-- similar to how `std::HashMap::insert` works.
>
> From a user standpoint, I'd prefer the latter approach. However, it's a relatively expensive operation. Since each `insert()` encodes the value directly into the backing buffer, updating a key would require rewriting not just the value for that key, but also everything that comes after it in the buffer.

I also agree updating the existing value is preferable.

By my reading, the Variant spec doesn't require all bytes in the variant's value to be used.

So what I am saying is that I think it would be correct for the `VariantBuilder` to just update the key to point at the new value and leave the old value there (but not referenced) 🤔

That would result in a larger final variant, but I think as long as we documented this behavior it would be OK from the user perspective. (I am envisioning many different possible desired optimizations for variant creation.)

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
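
The "update the key, leave the old bytes unreferenced" idea from the comment above can be sketched roughly as follows. This is a toy model only: `ToyObjectBuilder`, its fields, and its methods are hypothetical names made up for illustration and are not the arrow-rs `VariantBuilder` API, and the raw byte buffer here does not follow the real Variant encoding.

```rust
use std::collections::HashMap;

/// Hypothetical toy builder (NOT the arrow-rs API): values are appended
/// to a single backing buffer, and a side table maps each key to the
/// location of its current value.
struct ToyObjectBuilder {
    /// Append-only value bytes; may contain stale, unreferenced values.
    buffer: Vec<u8>,
    /// key -> (offset, length) of the live value inside `buffer`.
    fields: HashMap<String, (usize, usize)>,
}

impl ToyObjectBuilder {
    fn new() -> Self {
        Self {
            buffer: Vec::new(),
            fields: HashMap::new(),
        }
    }

    /// Appends `value` and points `key` at it. On a duplicate key, the
    /// old bytes stay in the buffer but are no longer referenced, so
    /// nothing after them has to be rewritten. Returns the previous
    /// (offset, length), mirroring `HashMap::insert`.
    fn insert(&mut self, key: &str, value: &[u8]) -> Option<(usize, usize)> {
        let offset = self.buffer.len();
        self.buffer.extend_from_slice(value);
        self.fields.insert(key.to_string(), (offset, value.len()))
    }

    /// Looks up the live value for `key`, if any.
    fn get(&self, key: &str) -> Option<&[u8]> {
        let &(offset, len) = self.fields.get(key)?;
        Some(&self.buffer[offset..offset + len])
    }
}
```

In this sketch, inserting `"a"` twice leaves exactly one entry for `"a"` in `fields`, while `buffer` keeps both values. That is the trade-off described above: a larger final buffer in exchange for never rewriting the bytes that follow the old value.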
