alamb commented on pull request #822: URL: https://github.com/apache/arrow-rs/pull/822#issuecomment-939457515
> Changes look good to me as well and I agree with @Dandandan that we should add safety notes. @Dandandan and @houqp -- First I want to emphasize that this PR *does not change the safety of the arrow-rs implementation* -- the code is as safe/unsafe before this PR as it is after this PR. I agree that all `unsafe` annotations in the arrow codebase should be evaluated for soundness eventually. However, I propose not requiring such annotations for this PR because: 1. My goal with this PR is not to make arrow-rs totally safe, it is to set the API up so that we can incrementally make it safer going forward (among other things resolving the RUSTSEC security advisories against arrow-rs) 3. All code newly marked as `unsafe` was previously reviewed in other PRs 1. Manually verifying all locations both requires codebase expertise which is quite rare, and is subject to human fallibility -- I would rather have a compiler or runtime code check rather than rely on additional human review. 2. Existing MIRI test coverage, while not perfect, does offer a measure of protection against undefined behavior. 2. I think there is some sense of urgency to get this change in so it can be included in arrow 6.0, Thus, I propose a multi-pronged approach: 1. When adding runtime validation checks to `ArrayData::try_new()` -- also add a CI check (or maybe always in debug mode) a check that does `ArrayData` validation even in `ArrayData::new_unchecked` 2. Perhaps crowd source (somehow) the existing tickets for adding `safety` annotations (see [list](https://github.com/apache/arrow-rs/issues?q=is%3Aissue+is%3Aopen+label%3Asecurity+%22Document+use+of%22) and when someone works on those, the unsafe uses of `ArrayData::new_unchecked` can be included. 3. Re-evaluate if it is time to bring in arrow2 ideas to the main arrow-rs repo. I will admit that part of my reason for not wanting to try and annotate all uses of `unsafe` is that doing so is a large / thankless a task, and I suspect the prospect of doing so might have contributed to @jorgecarleitao's choice to focus his time working on arrow2 instead). I think my own time is more productively spent adding runtime verification and/or helping integrate arrow2 (or its approach or whatever) rather than manually evaluating the existing arrow-rs code base. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
