joe-ucp commented on code in PR #8877: URL: https://github.com/apache/arrow-rs/pull/8877#discussion_r2543799398
##########
arrow-select/src/nullif.rs:
##########
@@ -17,11 +17,50 @@
 //! Implements the `nullif` function for Arrow arrays.

+/*
+ * NULLIF Implementation Contract
+ *
+ * For any ArrayData:
+ *   len    = data.len()     // logical elements
+ *   offset = data.offset()  // logical starting index into buffers
+ *
+ * Validity bitmap (if present) is a Buffer B.
+ * Invariant:
+ *   Logical index i in [0, len) is valid iff get_bit(B, offset + i) == true.
+ *
+ * For the result of nullif:
+ *   We will build a fresh ArrayData with offset = 0.
+ *   For that result:
+ *     Logical index i is valid iff get_bit(result_validity, i) == true.
+ *     Values buffer is laid out so element 0 is first result value, etc.
+ *
+ * For nullif semantics:
+ *   Let V(i) = left is valid at i
+ *       C(i) = condition "nullify at i" is true (depends on left, right, type)
+ *   Then:
+ *     result_valid(i) = V(i) & !C(i)
+ *     result_value(i) = left_value(i)   // when result_valid(i) == true
+ *
+ * This contract is the law. All nullif implementations must follow it.

Review Comment:
   The intent here wasn't to replace tests, but to write down the invariants that the tests assert, especially how offsets map to validity bits and how `V(i) & !C(i)` is interpreted for sliced arrays. I mainly wrote it to keep the "logic" straight in my own head. 😂

   I'm happy to move that description out of the standalone markdown file and into doc comments on the helpers in `nullif.rs`, and to keep the tests as the way we actually enforce the behavior. That keeps the spec close to the code while still making the rules explicit.

   If you'd prefer to avoid the extra MD file entirely, I can also drop it and keep the contract only in code + tests.
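   To make the offset rule concrete, here is a minimal sketch of the validity half of the contract using only the standard library. The helper names (`get_bit`, `set_bit`, `nullif_validity`) and the plain `&[u8]` bitmaps are assumptions for illustration, not the actual helpers or types in `nullif.rs`. It also assumes `C(i)` has already been computed into its own LSB-first bitmap, which may carry its own offset.

```rust
/// Read bit `i` from an LSB-first bitmap (the bit order Arrow uses for validity).
fn get_bit(bits: &[u8], i: usize) -> bool {
    (bits[i / 8] >> (i % 8)) & 1 == 1
}

/// Set bit `i` in an LSB-first bitmap.
fn set_bit(bits: &mut [u8], i: usize) {
    bits[i / 8] |= 1 << (i % 8);
}

/// Hypothetical helper: builds the result validity bitmap with offset = 0.
///
/// `left_validity` is `None` when the left array has no null bitmap,
/// in which case every logical index counts as valid.
/// `condition` holds C(i), read starting at `cond_offset`.
fn nullif_validity(
    left_validity: Option<&[u8]>,
    left_offset: usize,
    condition: &[u8],
    cond_offset: usize,
    len: usize,
) -> Vec<u8> {
    let mut out = vec![0u8; (len + 7) / 8];
    for i in 0..len {
        // V(i): logical index i maps to physical bit `left_offset + i`.
        let v = left_validity.map_or(true, |b| get_bit(b, left_offset + i));
        // C(i): same mapping, using the condition's own offset.
        let c = get_bit(condition, cond_offset + i);
        // result_valid(i) = V(i) & !C(i), written at bit i (result offset is 0).
        if v && !c {
            set_bit(&mut out, i);
        }
    }
    out
}
```

   A bit-at-a-time loop is used here only because it mirrors the contract line by line; a real implementation would presumably work on whole words. For a left bitmap of `0b0111_0110` sliced at offset 1 with length 4, V is `[1, 1, 0, 1]`. With a condition bitmap of `0b0000_1000` at offset 2, C is `[0, 1, 0, 0]`, so the result bitmap is `0b0000_1001`.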
