scovich commented on PR #7843: URL: https://github.com/apache/arrow-rs/pull/7843#issuecomment-3032710854
> > **For**: The sentinel is helpful when it _is_ able to catch an issue, and harmless the rest of the time. In the uncommon case that somebody actually _wants_ to avoid calling `finish`, they can just `drop` the builder (which also documents the intent). > > My example above shows that not dropping the builder can lead to a corrupted output Doesn't that mean our current implementation is not even safe in the presence of early return by e.g. `?`, let alone [UnwindSafe](https://doc.rust-lang.org/std/panic/trait.UnwindSafe.html)? Any error at any nesting depth must be treated as unrecoverable, and the entire variant builder discarded? e.g. instead of this: ```rust { let mut object_builder = list_builder.new_object(); object_builder.insert("id", 2); // NOTE object_builder.finish() is not called here } ``` imagine this: ```rust fn try_append(...) -> Result<(), ArrowError> { let mut object_builder = list_builder.new_object(); object_builder.insert("id", 2); // some fallible operation whose `?` could return early object_builder.finish()?; Ok(()) } ``` With the current behavior, the caller of `try_append` is forced to fail the entire variant encoding attempt, for fear that it might have been left in an inconsistent state. And if they forget to treat some `Err` as fatal... boom. At a minimum, it seems like we should "poison" the parent builder so it knows this happened and its (now fallible) `finish` can fail cleanly? It should also refuse to create new sub-builders or append new values, to avoid wasting work. But that's a lot of work to code up... > I don't think we support the case of avoiding calling finish without some more work (to be clear I am not sure the "rollback a partially built object" s a valuable API yet) Honestly, now that the nested builders store their bytes in a separate vec, I would have expected rollback safety for free. But I guess we're still storing _something_ in the parent vec. I would imagine that's easily addressed by having the constructor grab a marker, which `finish` advances; then `drop` can simply truncate the vec to that mark. Seems simpler than safely poisoning it, and safer than relying on callers to maintain an invisible (and unusual) error handling invariant? No need to remove extra entries from the metadata dictionary, they're harmless. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
