On Fri, Feb 18, 2022 at 3:44 PM Antoine Pitrou <anto...@python.org> wrote:
>
> On 18/02/2022 at 21:32, Phillip Cloud wrote:
> >
> > I am really struggling to see how anything I've said is inconsistent with
> > the spec or what you are saying here.
> >
> > To recap what I've said:
> >
> > 1. Appending a null sentinel to the values buffer isn't _required_ unless
> > the type requires it.
> > Ex: "joemark" in the spec example. No sentinels were appended for the two
> > null values in the parent struct array.
>
> There is no notion of sentinel in the Arrow format, so I don't
> understand what you're saying.
>

The word "sentinel" is a linguistic placeholder for "some set of bytes".
Hopefully that's clear from the context.

>
> (a sentinel is a physical value having a specific meaning, for example a
> data format that has no separate validity bitmap could use the integer
> value 42 to indicate null values in an integer array; the Arrow format
> has a separate validity bitmap and therefore doesn't make use of
> sentinel values)
>
> > 2. Appending a null value sentinel is _allowed_ to be there if the type
> > does not require it.
> > Ex: "joefoofoomark" extending the spec example, assuming the other
> > associated buffers (validity, offsets) are correctly constructed.
> >
> > Is either of those statements incorrect?
>
> To me, they simply don't make sense given that sentinels don't exist in
> Arrow.
>

Do they make sense after substituting in "a null entry in a string array
with a non-zero number of bytes"?

>
> That said, a null entry in a string array can be backed by a non-zero
> number of bytes in the values buffer. That is unrelated to the question
> about struct arrays. For example, "joefoofoomark" can very well be the
> values buffer for a string array with the logical values ["joe", null,
> "mark"]. In this case, the offsets will be [0, 3, 9, 13].
>

Ok, then perhaps you might have some thoughts on the original question: is
the JavaScript implementation currently incorrect?

>
> Regards
>
> Antoine.
>
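
To make the layout in the exchange above concrete, here is a minimal sketch in plain Python. It is not the Arrow implementation or any library's API: `decode_utf8_array` is a hypothetical helper, the validity bitmap is simplified to a list of booleans, and the compact offsets [0, 3, 3, 7] are my own assumption for a three-element array rather than something taken from the spec. It only shows that a null slot may or may not be backed by bytes in the values buffer.

```python
from typing import List, Optional


def decode_utf8_array(
    values: bytes, offsets: List[int], validity: List[bool]
) -> List[Optional[str]]:
    """Decode a simplified variable-size string layout into logical values.

    Hypothetical helper for illustration: `validity` stands in for Arrow's
    validity bitmap, one boolean per slot instead of one bit per slot.
    """
    out: List[Optional[str]] = []
    for i, valid in enumerate(validity):
        if not valid:
            # The slot is null; whatever bytes sit between offsets[i] and
            # offsets[i + 1] are never interpreted.
            out.append(None)
        else:
            out.append(values[offsets[i]:offsets[i + 1]].decode("utf-8"))
    return out


validity = [True, False, True]

# Null slot backed by six bytes ("foofoo"), as in the quoted example.
padded = decode_utf8_array(b"joefoofoomark", [0, 3, 9, 13], validity)

# Null slot backed by zero bytes (assumed offsets, not from the thread).
compact = decode_utf8_array(b"joemark", [0, 3, 3, 7], validity)

print(padded)
print(compact)
```

Under these assumptions both buffers decode to the same logical values, since the reader consults validity before touching the bytes of a slot.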