scovich commented on PR #7741:
URL: https://github.com/apache/arrow-rs/pull/7741#issuecomment-3001834732

   > > Right now, calling `insert()` with a duplicate key results in _two_ 
fields with the same key in the object, which deviates from the Variant spec. 
We're considering changing this behavior to either return an `Err` on the 
second `insert()` or to update the existing value-- similar to how 
`std::HashMap::insert` works.
   > > From a user standpoint, I'd prefer the latter approach. However, it's a 
relatively expensive operation. Since each `insert()` encodes the value 
directly into the backing buffer, updating a key would require rewriting not 
just the value for that key, but also everything that comes after it in the 
buffer.
   > 
   > I also agree updating the existing value is preferable.
   > 
   > My reading of the Variant spec didn't require all bytes in the variant's 
value to be used
   > 
   > So what i am saying is I think it would be correct for the VariantBuilder 
to just update the key and leave the old value there (but not referenced) 🤔 
That would result in a larger final variant, but I think as long as we 
documented this behavior it would be ok from the user perspective (I am 
envisioning many different possible desired optimizations for variant creation)
   
   The above would certainly work, in the sense of producing a valid variant 
object. My only concern would be that the scenario almost certainly arises due 
to user error (which is quite different from a generic map or set), and 
silently tolerating that error isn't necessarily doing the user any favors in 
the long run. They'll just discover at read time that they lost data, instead 
of fast-failing at write time. We can probably get away with either approach -- 
silently replacing or loudly complaining -- I just want to be sure we make the 
choice intentionally.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to