Small bikeshed: to keep naming consistent, should it be "ViewList"?

On Wed, Apr 26, 2023 at 8:02 AM Weston Pace <weston.p...@gmail.com> wrote:

> > My understanding is that the primary benefit of this ListView layout
> > over Arrow's existing List layouts [1] is that ListView allows for
> > buffer alignment [2] without padding, which makes vectorized
> > processing much more efficient. Is this understanding correct?
>
> Yes.  Though proponents of list-view would probably point out that it
> doesn't prevent you from having contiguous buffers, it simply doesn't
> require it.
>
> > Unless I am missing something, I think the selection use-case
> > could be equally well served by a dictionary-encoded
> > BinaryArray/ListArray, and would have the benefit of not requiring
> > any modifications to the existing format or kernels.
>
> This is a good point that did not come up in the previous discussion that I
> can see.
>
> > The major additional flexibility of the proposed encoding would be
> > permitting disjoint or overlapping ranges, are these common enough
> > in practice to represent a meaningful bottleneck?
>
> I'm not sure.  There was one other use case that was brought up in the
> original discussion.  This was that list view arrays can be constructed in
> parallel.  That is, if you know the output size (e.g. when applying a large
> scalar function), then you can have different threads fill out different
> regions of the offsets / lengths buffers.  That being said, I don't know
> for certain if anyone is relying on this behavior.
>
>
> On Wed, Apr 26, 2023 at 7:12 AM Felipe Oliveira Carvalho <felipe...@gmail.com> wrote:
>
> > After Weston's suggestion above, I've renamed files and classes in my WIP
> > implementation:
> >
> > ArrayView -> ListView
> >
> > On Wed, Apr 26, 2023 at 11:08 AM Ian Cook <i...@ursacomputing.com> wrote:
> >
> > > +1 to what Weston and Joris suggested regarding the name. "ListView"
> > > seems like the best name to use for this layout in Arrow.
> > >
> > > My understanding is that the primary benefit of this ListView layout
> > > over Arrow's existing List layouts [1] is that ListView allows for
> > > buffer alignment [2] without padding, which makes vectorized
> > > processing much more efficient. Is this understanding correct?
> > >
> > > [1] https://arrow.apache.org/docs/format/Columnar.html#variable-size-list-layout
> > > [2] https://arrow.apache.org/docs/format/Columnar.html#buffer-alignment-and-padding
> > >
> > > Ian
> > >
> > > On Wed, Apr 26, 2023 at 5:27 AM Joris Van den Bossche
> > > <jorisvandenboss...@gmail.com> wrote:
> > > >
> > > > On Wed, 26 Apr 2023 at 02:37, Weston Pace <weston.p...@gmail.com> wrote:
> > > > >
> > > > > For context, there was some discussion on this back in [1].  At that
> > > > > time this was called "sequence view" but I do not like that name.
> > > > > However, array-view array is a little confusing.  Given this is
> > > > > similar to list can we go with list-view array?
> > > >
> > > > Yes, given that this is essentially an alternative representation of a
> > > > logical "list" array, I would also prefer that we use the term "list"
> > > > in the name for such a new type. The word "array" has a different
> > > > meaning in context of our columnar specification.
> > >
> >
>
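
To make the selection use case quoted above concrete, here is a minimal sketch in plain Python (not the Arrow API). It assumes a list-view is a pair of offsets / lengths buffers over one shared child values buffer; the names `list_view_take` and `to_pylist` are made up for illustration. The point is that a selection only rewrites offsets and lengths, so the result may reference child ranges out of order or more than once, which the monotonic offsets of the existing List layout cannot express without copying values.

```python
def list_view_take(offsets, sizes, indices):
    """Select rows by rewriting only offsets/sizes; child values are untouched."""
    return [offsets[i] for i in indices], [sizes[i] for i in indices]


def to_pylist(offsets, sizes, values):
    """Materialize each row as the slice values[offset : offset + size]."""
    return [values[o:o + s] for o, s in zip(offsets, sizes)]


# Hypothetical data: three rows over one child buffer.
values = [10, 11, 12, 13, 14, 15]
offsets = [0, 2, 3]
sizes = [2, 1, 3]

# Take rows 2, 0, 2: the selected offsets are out of order and row 2's
# child range is referenced twice (an overlapping range).
sel_offsets, sel_sizes = list_view_take(offsets, sizes, [2, 0, 2])
rows = to_pylist(sel_offsets, sel_sizes, values)
```

Whether this beats a dictionary-encoded ListArray for selection is exactly the open question in the thread; the sketch only shows what the layout permits.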

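The parallel-construction use case Weston mentions can be sketched roughly as follows. This is plain Python under stated assumptions, not Arrow code: `build_list_view_parallel` is a hypothetical helper, the offsets / lengths buffers are ordinary lists preallocated to the known output length, and a lock stands in for whatever range-reservation scheme a real implementation would use on the child buffer. Each worker fills its own region of offsets / lengths, and the child ranges land in arrival order, which is fine for a list-view because offsets need not be monotonic.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def build_list_view_parallel(inputs, fn, num_workers=4):
    """Apply fn to each input and collect the resulting lists as a list-view."""
    n = len(inputs)
    offsets = [0] * n  # preallocated: the output length is known up front
    sizes = [0] * n
    values = []        # shared child buffer, filled in arrival order
    lock = threading.Lock()

    def fill(start, stop):
        # Each worker owns rows [start, stop) of the offsets/sizes buffers.
        for i in range(start, stop):
            items = fn(inputs[i])
            with lock:  # reserve a contiguous range in the child buffer
                offsets[i] = len(values)
                values.extend(items)
            sizes[i] = len(items)

    step = max(1, -(-n // num_workers))  # ceiling division, at least 1
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(fill, s, min(s + step, n))
                   for s in range(0, n, step)]
        for f in futures:
            f.result()  # re-raise any worker exception
    return offsets, sizes, values


def to_pylist(offsets, sizes, values):
    """Materialize each row as the slice values[offset : offset + size]."""
    return [values[o:o + s] for o, s in zip(offsets, sizes)]


# Hypothetical scalar function: row k becomes [0, 1, ..., k - 1].
inputs = [3, 0, 2, 1, 4]
offsets, sizes, values = build_list_view_parallel(
    inputs, lambda k: list(range(k)), num_workers=2)
rows = to_pylist(offsets, sizes, values)
```

The physical order of `values` depends on thread scheduling, but the logical rows do not, which is the property the layout buys here.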