scovich commented on PR #7741: URL: https://github.com/apache/arrow-rs/pull/7741#issuecomment-3001834732
> > Right now, calling `insert()` with a duplicate key results in _two_ fields with the same key in the object, which deviates from the Variant spec. We're considering changing this behavior to either return an `Err` on the second `insert()` or to update the existing value-- similar to how `std::HashMap::insert` works. > > From a user standpoint, I'd prefer the latter approach. However, it's a relatively expensive operation. Since each `insert()` encodes the value directly into the backing buffer, updating a key would require rewriting not just the value for that key, but also everything that comes after it in the buffer. > > I also agree updating the existing value is preferable. > > My reading of the Variant spec didn't require all bytes in the variant's value to be used > > So what i am saying is I think it would be correct for the VariantBuilder to just update the key and leave the old value there (but not referenced) 🤔 That would result in a larger final variant, but I think as long as we documented this behavior it would be ok from the user perspective (I am envisioning many different possible desired optimizations for variant creation) The above would certainly work, in the sense of producing a valid variant object. My only concern would be that the scenario almost certainly arises due to user error (which is quite different from a generic map or set), and silently tolerating that error isn't necessarily doing the user any favors in the long run. They'll just discover at read time that they lost data, instead of fast-failing at write time. We can probably get away with either approach -- silently replacing or loudly complaining -- I just want to be sure we make the choice intentionally. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
