felipecrv commented on issue #41016: URL: https://github.com/apache/arrow/issues/41016#issuecomment-2046286847
After working on the Arrow codebase for a while, I can say that a lot of code assumes the following (undocumented) invariant:

> `null_count` can transition from `-1` to a value different from `0` **if and only if** the bitmap buffer is present

Another way to put it:

> well-formed arrays that have `null_count == -1` are that way for only one reason: `length > 0` and there is a bitmap that must be scanned to get to the real value of `null_count`

Because this invariant is undocumented, there is defensive coding here and there to protect against violations:

https://github.com/apache/arrow/blob/cd607d00b9e3c5f2ac2a2e373ba7571fd75fc725/cpp/src/arrow/array/data.h#L287-L291

...but there is also code that carefully guarantees that it's preserved in constructors:

https://github.com/apache/arrow/blob/cd607d00b9e3c5f2ac2a2e373ba7571fd75fc725/cpp/src/arrow/array/data.cc#L50-L67
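
To make the invariant concrete, here is a minimal sketch of the two sides described above: a constructor that never leaves `null_count` unknown when there is no bitmap to scan, and a getter that still guards against a violation. This is not the code behind the two links. `ToyArray`, `MakeToyArray`, `GetNullCount`, and `kUnknownNullCount` are hypothetical names made up for illustration, and the bitmap is a plain byte vector read least-significant bit first, with a set bit meaning "valid".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical sentinel for "null count not computed yet".
constexpr int64_t kUnknownNullCount = -1;

// Hypothetical stand-in for a validity buffer; nullptr means "no bitmap".
using Bitmap = std::shared_ptr<std::vector<uint8_t>>;

// Toy model of an array's metadata. NOT Arrow's ArrayData.
struct ToyArray {
  int64_t length;
  int64_t null_count;
  Bitmap validity;
};

// Constructor side: if the caller did not supply a null count and there is
// nothing to scan (no bitmap, or zero length), the count is known to be 0.
// Only an array with length > 0 and a bitmap may keep the -1 sentinel.
ToyArray MakeToyArray(int64_t length, Bitmap validity,
                      int64_t null_count = kUnknownNullCount) {
  if (null_count == kUnknownNullCount &&
      (validity == nullptr || length == 0)) {
    null_count = 0;
  }
  return ToyArray{length, null_count, std::move(validity)};
}

// Reader side: resolve the sentinel lazily and cache the result.
int64_t GetNullCount(ToyArray& array) {
  if (array.null_count != kUnknownNullCount) {
    return array.null_count;
  }
  // Defensive branch: unreachable if every array went through MakeToyArray,
  // but it protects against arrays that violate the invariant.
  if (array.validity == nullptr) {
    array.null_count = 0;
    return 0;
  }
  // Count the cleared bits among the first `length` bits of the bitmap.
  int64_t nulls = 0;
  for (int64_t i = 0; i < array.length; ++i) {
    const uint8_t byte = (*array.validity)[static_cast<std::size_t>(i / 8)];
    const bool valid = ((byte >> (i % 8)) & 1) != 0;
    if (!valid) {
      ++nulls;
    }
  }
  array.null_count = nulls;
  return nulls;
}
```

In this sketch, `MakeToyArray(5, nullptr)` yields `null_count == 0` right away, while an array built with a bitmap keeps `-1` until `GetNullCount` scans it. If the invariant were documented and enforced everywhere, the defensive branch in `GetNullCount` could be replaced by an assertion.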
