scovich commented on PR #7843:
URL: https://github.com/apache/arrow-rs/pull/7843#issuecomment-3032710854

   > > **For**: The sentinel is helpful when it _is_ able to catch an issue, 
and harmless the rest of the time. In the uncommon case that somebody actually 
_wants_ to avoid calling `finish`, they can just `drop` the builder (which also 
documents the intent).
   > 
   > My example above shows that not dropping the builder can lead to a 
corrupted output
   
   Doesn't that mean our current implementation is not even safe in the 
presence of early return by e.g. `?`, let alone 
[UnwindSafe](https://doc.rust-lang.org/std/panic/trait.UnwindSafe.html)? Any 
error at any nesting depth must be treated as unrecoverable, and the entire 
variant builder discarded? e.g. instead of this:
   ```rust
           {
               let mut object_builder = list_builder.new_object();
               object_builder.insert("id", 2);
               // NOTE object_builder.finish() is not called here
           }
   ```
   imagine this:
   ```rust
           fn try_append(...) -> Result<(), ArrowError> {
               let mut object_builder = list_builder.new_object();
               object_builder.insert("id", 2);
   
               // some fallible operation whose `?` could return early
   
               object_builder.finish()?;
               Ok(())
           }
   ```
   
   With the current behavior, the caller of `try_append` is forced to fail the 
entire variant encoding attempt, for fear that it might have been left in an 
inconsistent state. And if they forget to treat some `Err` as fatal... boom. At 
a minimum, it seems like we should "poison" the parent builder so it knows this 
happened and its (now fallible) `finish` can fail cleanly? It should also 
refuse to create new sub-builders or append new values, to avoid wasting work. 
But that's a lot of work to code up...
   
   > I don't think we support the case of avoiding calling finish without some 
more work (to be clear I am not sure the "rollback a partially built object" s 
a valuable API yet)
   
   Honestly, now that the nested builders store their bytes in a separate vec, 
I would have expected rollback safety for free. But I guess we're still storing 
_something_ in the parent vec. I would imagine that's easily addressed by 
having the constructor grab a marker, which `finish` advances; then `drop` can 
simply truncate the vec to that mark. Seems simpler than safely poisoning it, 
and safer than relying on callers to maintain an invisible (and unusual) error 
handling invariant? 
   
   No need to remove extra entries from the metadata dictionary, they're 
harmless. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to