I've made good progress on this the last couple days and I'm at the
point of writing the new ScalarExecutor that uses the new
ArraySpan/ExecSpan concepts,  having done the initial "porting" of the
scalar kernels to use the new data structures. I'm tired so I'm
stopping for today, but I'm going to try the next couple days to get
the codebase compiling again and the unit tests passing, which will
hopefully go smoothly since we have lots of tests but we will see how
it goes.

The refactoring was extremely gnarly (especially scalar_if_else.cc)
and I have a growing list of notes about how we can improve the array
kernels codebase. In particular, we should definitely eliminate
"scalar" kernel outputs. This only gets triggered when all inputs are
scalars, but in that case we should instead promote all inputs to
arrays of length 1 (some kernels already do this!) and use the array
code paths. This would allow us to delete a ton of tedious code and
simplify our maintenance burden — there is no performance critical
need for all-scalar function evaluation, so the performance hit of
up-promoting scalars to arrays-of-length-1 is totally worth it for the
ability to delete a ton of code.

There are some critical unanswered questions around how we will handle
the dichotomy between kernels that allocate memory and those that
don't — I am hopeful that in subsequent PRs we can establish a scheme
wherein we can use the ArraySpan data structure even for non-primitive
outputs, but some kind of allocated-buffer bookkeeping concept in the
KernelContext so that the executor layer can know when it needs to
destroy a buffer that is no longer needed, rather than relying on the
ArrayData destructor to release memory.

My branch is here, but I would say it's a couple days away from being
review-ready:

https://github.com/apache/arrow/compare/master...wesm:lightweight-exec-batch

I'll post a PR when I have something closer to a green build. We
probably won't want to let this PR linger since it will cause
conflicts with any PRs touching scalar kernels.


On Fri, Jun 3, 2022 at 1:59 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
> Thanks Sasha — this is helpful. I'm going to take a college try at
> just the scalar kernels and see what I can accomplish over the next
> few days — will attempt to get a PR up for review with the C++ tests
> passing. I'm expecting assorted workarounds for the various kernels
> that do zero-copy optimizations (setting output buffers with input
> buffers — such optimizations should likely be carried out elsewhere),
> etc., but I will keep notes about challenges that I run into to assist
> with further refactoring and improvements.
>
> On Fri, Jun 3, 2022 at 1:20 PM Sasha Krassovsky
> <krassovskysa...@gmail.com> wrote:
> >
> > Hi all,
> > I’ve been thinking about some sort of refactoring of this registry for a 
> > while now, and I’ve written down some thoughts, please leave your comments.
> >
> > https://docs.google.com/document/d/1LAN9I_Y9cZaG2a84j1wLY8jSlK3gDXYMle-VtyFCAE8/edit?usp=sharing
> >  
> > <https://docs.google.com/document/d/1LAN9I_Y9cZaG2a84j1wLY8jSlK3gDXYMle-VtyFCAE8/edit?usp=sharing>
> >
> > I agree with Weston that starting with a small set of kernels and drilling 
> > down the implementation until we have a design with the characteristics we 
> > want seems like a good idea. One point to consider is the tradeoff between 
> > the size of our set of guinea pig kernels and how representative they are 
> > of the rest. We will probably need to cover nested types as Weston said, 
> > aggregations, etc.
> >
> > Sasha
> >
> > > On Jun 3, 2022, at 2:35 AM, Weston Pace <weston.p...@gmail.com> wrote:
> > >
> > > That approach looks great and very much in line with some of the stuff
> > > we have in light_array.h so I think it's very compatible.  If you have
> > > the time to push this refactoring through then go for it.  Don't let
> > > anything I'm saying deter any ongoing efforts.
> > >
> > > I'm just advocating that we be open to any kind of incremental
> > > approach.  I looked a little closer at this today and you and Antoine
> > > are correct that "two function registries" is maybe not the right
> > > extension point.  Instead it might be that we have two classes of
> > > scalar functions.  For example, the "at most three buffers" ArraySpan
> > > that you've linked only works for types without children.  We do have
> > > some scalar functions that run on nested types (e.g. count, index,
> > > etc.)  My point is that rather than force ourselves to deal with all
> > > of these corner cases we should be open to an incremental solution.
> > > Maybe there is a ScalarPrimitive and a ScalarNested kernel type or
> > > Scalar and ScalarSpan.
> > >
> > > Also, if we know we are also going to want to tweak the output
> > > interface (I don't know for sure if we will) then maybe it makes sense
> > > to pick a small set of kernels and incrementally improve that small
> > > set until we think we've made all the changes we are going to want for
> > > the near future, and then promote those changes to the wider set of
> > > kernels.
> > >
> > > On Thu, Jun 2, 2022 at 8:07 AM Wes McKinney <wesmck...@gmail.com> wrote:
> > >>
> > >> On this topic, I actually have started prototyping a new ScalarKernel
> > >> exec interface that uses a non-owning, shared_ptr-free "ArraySpan"
> > >> data structure based on some prior conversations
> > >>
> > >> https://github.com/wesm/arrow/blob/711fd5e5665c280540bbaf48a48ca1eca1b91bff/cpp/src/arrow/compute/exec.h#L163
> > >> https://github.com/wesm/arrow/blob/711fd5e5665c280540bbaf48a48ca1eca1b91bff/cpp/src/arrow/compute/kernel.h#L552
> > >>
> > >> My idea was to refactor just the existing scalar kernels (which by my
> > >> estimation seems tractable since so many go through the common
> > >> templated kernel generators) and leave the other kinds of kernels
> > >> untouched so that they can be refactored in due course. I hadn't
> > >> planned to try to tackle changing the "output" interface from
> > >> shared_ptr<ArrayData> because there are some kernels that create new
> > >> ArrayData and some that don't.
> > >>
> > >> If this seems like a reasonable path, I can try to have a first draft
> > >> PR ready to go maybe by Monday (I was going to work on this over the
> > >> weekend when I can have some uninterrupted time to do the
> > >> refactoring). I'm not sure that a new registry is going to be needed
> > >>
> > >> On Thu, Jun 2, 2022 at 2:50 AM Antoine Pitrou <anto...@python.org> wrote:
> > >>>
> > >>>
> > >>> Le 02/06/2022 à 00:02, Weston Pace a écrit :
> > >>>>
> > >>>> I'd like to propose we add a second kernel function registry.  There
> > >>>> doesn't need to be any user facing API change.  We could probably use
> > >>>> an approach like [2] to proxy to the old function registry when the
> > >>>> newer registry doesn't contain the asked-for function.  This would
> > >>>> allow us to focus on creating an efficient function registry without
> > >>>> having to worry about refactoring the existing kernels all at once.
> > >>>
> > >>> Why do we need a separate registry for this?
> >

Reply via email to