joe-ucp commented on code in PR #8877: URL: https://github.com/apache/arrow-rs/pull/8877#discussion_r2546342813
##########
arrow-select/src/nullif.rs:
##########
@@ -17,11 +17,50 @@
 //! Implements the `nullif` function for Arrow arrays.
 
+/*
+ * NULLIF Implementation Contract
+ *
+ * For any ArrayData:
+ *   len    = data.len()      // logical elements
+ *   offset = data.offset()   // logical starting index into buffers
+ *
+ * Validity bitmap (if present) is a Buffer B.
+ * Invariant:
+ *   Logical index i in [0, len) is valid iff get_bit(B, offset + i) == true.
+ *
+ * For the result of nullif:
+ *   We will build a fresh ArrayData with offset = 0.
+ *   For that result:
+ *     Logical index i is valid iff get_bit(result_validity, i) == true.
+ *     Values buffer is laid out so element 0 is first result value, etc.
+ *
+ * For nullif semantics:
+ *   Let V(i) = left is valid at i
+ *       C(i) = condition "nullify at i" is true (depends on left, right, type)
+ *   Then:
+ *     result_valid(i) = V(i) & !C(i)
+ *     result_value(i) = left_value(i)   // when result_valid(i) == true
+ *
+ * This contract is the law. All nullif implementations must follow it.

Review Comment:
   👍 Makes sense. In the latest update I’ve:
   1. Moved the bitmap/validity contract text out of the standalone markdown file and into doc comments next to the nullif helpers, so the explanation now lives right with the code
   2. Dropped the extra .md file from this PR.

   I agree a higher-level doc about how arrays/validity maps work would be useful as a separate follow-up PR once this one settles.
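   For reference, the validity rule in the contract (result_valid(i) = V(i) & !C(i), with the input read at offset + i and the result written at offset 0) can be sketched in plain Rust. This is only an illustration under stated assumptions, not the code in this PR: it uses no arrow-rs types, the helper names `get_bit`, `set_bit`, `nullif_validity`, and `demo` are hypothetical, and it assumes the LSB-first bit order that Arrow bitmaps use.

```rust
/// Read bit `i` from an LSB-first packed bitmap (assumed Arrow bit order).
fn get_bit(bits: &[u8], i: usize) -> bool {
    (bits[i / 8] >> (i % 8)) & 1 == 1
}

/// Set bit `i` in an LSB-first packed bitmap.
fn set_bit(bits: &mut [u8], i: usize) {
    bits[i / 8] |= 1 << (i % 8);
}

/// Hypothetical helper: build the result validity bitmap with offset 0.
/// `left_validity == None` is treated as "every element is valid".
/// `nullify(i)` plays the role of C(i) from the contract.
fn nullif_validity(
    left_validity: Option<&[u8]>,
    left_offset: usize,
    len: usize,
    nullify: impl Fn(usize) -> bool,
) -> Vec<u8> {
    let mut out = vec![0u8; (len + 7) / 8];
    for i in 0..len {
        // V(i): the input is read at `left_offset + i`, never at `i` alone.
        let v = left_validity.map_or(true, |b| get_bit(b, left_offset + i));
        // result_valid(i) = V(i) & !C(i), written at index `i` (offset 0).
        if v && !nullify(i) {
            set_bit(&mut out, i);
        }
    }
    out
}

/// Worked example: a sliced input with offset = 1 and len = 4.
fn demo() -> Vec<u8> {
    // Bits 1, 2, 4, 5 are set, so logical indices 0..4 read bits 1..5:
    // V = [true, true, false, true].
    let left = [0b0011_0110u8];
    // C = [false, true, false, false].
    let cond = [false, true, false, false];
    nullif_validity(Some(&left), 1, 4, |i| cond[i])
}
```

   In `demo`, indices 0 and 3 stay valid, index 1 is nullified by the condition, and index 2 was already null in the input. The sketch goes bit by bit for clarity; a real kernel would presumably work on whole words, but the offset handling it has to preserve is the same.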
