I created https://issues.apache.org/jira/browse/ARROW-16755 as an
umbrella issue to track improvements to reduce overhead in the
expression and kernel execution machinery. Please help by attaching
related issues and creating new issues for specific individual efforts
here. I'll work as quickly as I can to get my initial patch
(ARROW-16756) ready, which will unblock the next few projects here.

On Mon, Jun 6, 2022 at 10:35 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
> This is definitely only the first stage of cleanup and streamlining. I
> anticipate multiple rounds of refactoring (maybe not as invasive and
> painful as this one). I'm not sure this patch will do a lot to
> alleviate bottom-line expression evaluation overhead, but it creates
> the environment (i.e. one where a whole chain of scalar functions that
> all write into preallocated memory can execute without having to touch
> shared_ptrs or deal with other objects with excess microperformance
> overhead) where such optimization can happen more easily.
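
To make the quoted point concrete, here is a minimal, self-contained
sketch of a chain of scalar functions that all write into preallocated
memory. The Int64Span and MutableInt64Span structs and the AddScalar,
MultiplyScalar, and EvalChain functions are hypothetical names chosen
for illustration; they are not Arrow's actual ArraySpan or kernel API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical non-owning views (NOT Arrow's ArraySpan): plain pointers
// and a length, so passing them around never touches a reference count.
struct Int64Span {
  const int64_t* data;
  std::size_t length;
};

struct MutableInt64Span {
  int64_t* data;
  std::size_t length;
};

// Each "kernel" writes into caller-provided memory and allocates nothing.
inline void AddScalar(Int64Span in, int64_t value, MutableInt64Span out) {
  assert(in.length == out.length);
  for (std::size_t i = 0; i < in.length; ++i) {
    out.data[i] = in.data[i] + value;
  }
}

inline void MultiplyScalar(Int64Span in, int64_t value, MutableInt64Span out) {
  assert(in.length == out.length);
  for (std::size_t i = 0; i < in.length; ++i) {
    out.data[i] = in.data[i] * value;
  }
}

// Evaluates (x + 1) * 2 with a single output allocation made up front.
// The second step reads and writes the same buffer in place.
std::vector<int64_t> EvalChain(const std::vector<int64_t>& x) {
  std::vector<int64_t> out(x.size());
  MutableInt64Span out_span{out.data(), out.size()};
  AddScalar({x.data(), x.size()}, 1, out_span);
  MultiplyScalar({out.data(), out.size()}, 2, out_span);
  return out;
}
```

The only allocation in this sketch happens once in EvalChain; the
individual steps see nothing but raw pointers and lengths.
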
>
>
> On Mon, Jun 6, 2022 at 4:08 AM Antoine Pitrou <anto...@python.org> wrote:
> >
> >
> > On 06/06/2022 at 09:34, Sasha Krassovsky wrote:
> > > Wow that's a lot of progress!
> > > Definitely agree on the scalar outputs point.
> > >
> > > One point about the ArraySpan - why does it need to know its data type?
> > > Once a kernel has been resolved by the registry, the kernel will only know
> > > how to execute on the specific type it was resolved for, right?
> >
> > Because of parametric types, for example timestamps with a unit and
> > timezone, or decimals with a precision and scale.
> >
> > Regards
> >
> > Antoine.
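
As a rough illustration of the parametric-type point, the sketch below
shows a single kernel that serves every timestamp unit and therefore
has to read the unit from the type at execution time. TimeUnit,
TimestampType, TimestampSpan, TicksPerSecond, and ToSeconds are
hypothetical stand-ins written for this example, not Arrow's real
classes or functions.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for a parametric type and a span that carries it.
enum class TimeUnit { Second, Milli, Micro, Nano };

struct TimestampType {
  TimeUnit unit;  // the type parameter; a real type would also carry a timezone
};

struct TimestampSpan {
  const TimestampType* type;  // non-owning pointer to the type
  const int64_t* values;
  std::size_t length;
};

inline int64_t TicksPerSecond(TimeUnit unit) {
  switch (unit) {
    case TimeUnit::Second: return 1;
    case TimeUnit::Milli:  return 1000;
    case TimeUnit::Micro:  return 1000000;
    case TimeUnit::Nano:   return 1000000000;
  }
  return 1;  // unreachable for valid enum values
}

// One kernel resolved for "timestamp" in general. Which divisor to use
// is only known from the type attached to the input span.
// Integer division truncates toward zero.
inline void ToSeconds(const TimestampSpan& in, int64_t* out) {
  const int64_t ticks = TicksPerSecond(in.type->unit);
  for (std::size_t i = 0; i < in.length; ++i) {
    out[i] = in.values[i] / ticks;
  }
}
```

If the span did not carry its type, the same 64-bit values could not be
interpreted correctly without resolving a separate kernel per unit.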
