On 18/02/2022 at 20:26, Phillip Cloud wrote:
On Fri, Feb 18, 2022 at 2:06 PM Antoine Pitrou <anto...@python.org> wrote:

On 18/02/2022 at 20:01, Phillip Cloud wrote:
I think I'm confused by where this appended value lives. Is it only a
logical value or does the value show up in memory?

The logical value is null.  The appended value is only a physical value
that shows up in memory but doesn't have any bearing on the logical value.


Yes, but where does that value reside? Does it depend on the array type? Is
it garbage in the values buffer? Something else?

Well, it obviously depends on the child array type, so it's difficult to answer more precisely.

For example, if your child array is a fixed-width primitive array, then you can append a slot of the given width holding any value. You can also append a null in the child, but you still have to append to the values buffer anyway (since it's a fixed-width type).
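
To make the fixed-width case concrete, here is a minimal sketch in plain Python. It is a toy model, not the Arrow API: the class `Int32Child` and its `filler` parameter are hypothetical names invented for illustration, and the validity bitmap is modelled as a list of booleans.

```python
import struct


class Int32Child:
    """Toy model of a fixed-width (int32) child array; not the real Arrow API."""

    def __init__(self):
        self.validity = []         # one bool per slot (a bitmap in real Arrow)
        self.values = bytearray()  # always 4 bytes per slot, valid or not

    def append(self, value):
        self.validity.append(True)
        self.values += struct.pack("<i", value)

    def append_null(self, filler=0):
        # The slot is logically null, but the values buffer still grows by
        # 4 bytes. The filler is arbitrary and never read back.
        self.validity.append(False)
        self.values += struct.pack("<i", filler)

    def to_pylist(self):
        ints = struct.unpack("<%di" % len(self.validity), self.values)
        return [v if ok else None for v, ok in zip(ints, self.validity)]


child = Int32Child()
child.append(1)
child.append_null()              # filler 0
child.append_null(filler=12345)  # different physical bytes, same logical null
child.append(4)
```

In this sketch the values buffer holds four slots even though two of them are null, and the logical contents do not depend on the filler chosen.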

For example, appending another null to the name field is only going to
change the validity map, offsets array and length, and there will not be
any changes to the values buffer.

I may be missing some context, but what is the "name field" here?


The field in the example in the spec:
https://arrow.apache.org/docs/format/Columnar.html#struct-layout

> [...]

If that were the case, I would expect garbage in between "joe" and "mark"
in the values array from the example (the garbage being the physical value
not having any bearing on the logical value).

Let's stop talking about "garbage", which is not a technically meaningful term.

In this example, the child array is ["joe", null, null, "mark"], but it could also have been ["joe", null, "", "mark"] or even ["joe", null, "whatever", "mark"]. The important point is that value #2 in the child array is masked by the corresponding null bit in the parent struct array.
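
The three variants can be compared with a small plain-Python sketch. This is a toy model, not the Arrow API: `string_child` and `struct_names` are hypothetical helpers, buffers are modelled as Python lists and bytes, and the parent validity [valid, valid, null, valid] is assumed from the struct layout example in the spec.

```python
def string_child(items):
    """Build (validity, offsets, values) for a toy variable-width string array."""
    validity, offsets, values = [], [0], b""
    for item in items:
        validity.append(item is not None)
        values += (item or "").encode("utf-8")  # a null adds no bytes
        offsets.append(len(values))
    return validity, offsets, values


def struct_names(parent_validity, child):
    """Logical view of the 'name' field as seen through the parent struct."""
    validity, offsets, values = child
    out = []
    for i, parent_ok in enumerate(parent_validity):
        if not parent_ok:
            out.append(None)  # struct slot is null; the child slot is never read
        elif not validity[i]:
            out.append({"name": None})
        else:
            text = values[offsets[i]:offsets[i + 1]].decode("utf-8")
            out.append({"name": text})
    return out


# Assumed from the spec example: struct slot #2 is null.
parent_validity = [True, True, False, True]
variants = [
    ["joe", None, None, "mark"],
    ["joe", None, "", "mark"],
    ["joe", None, "whatever", "mark"],
]
views = [struct_names(parent_validity, string_child(v)) for v in variants]
buffers = [string_child(v)[2] for v in variants]
```

All three `views` are identical, while `buffers` differ only for the "whatever" variant: the physical bytes change, the logical struct values do not.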

Regards

Antoine.
