hi Antoine,

On Tue, Apr 21, 2020 at 4:54 AM Antoine Pitrou <anto...@python.org> wrote:
>
> On 21/04/2020 at 11:13, Antoine Pitrou wrote:
> It would be interesting to know how costly repeated
> allocation/deallocation is. Modern allocators like jemalloc do their
> own caching instead of always returning memory to the system. We could
> also have our own caching layer.

I'm sure that we can do our own experiments. The more significant issue, based on my understanding of the academic research, is actually memory bandwidth as it relates to the CPU memory hierarchy. This is discussed at length in the X100 paper, so I'll direct you to the research rather than debate it here. Since that paper was written in 2005, you might also look for papers from the last 15 years that cite it, to see whether there has been follow-on research as CPU architectures have evolved.

> > This assumes that all these kernels can safely write into one of their
> > inputs. This should be true for trivial ones, but not if e.g. a kernel
> > makes two passes over its input. For example, the SortToIndices kernel
> > first scans the input for min and max values, and then switches on two
> > different sorting algorithms depending on those statistics (using a O(n)
> > counting sort if the values are in a small enough range).
>
> That said, in the SortToIndices case, this wouldn't be a problem, since
> only the second pass writes to the output.

This kernel is not valid for normal array-exprs (see the spreadsheet I linked), such as what you can write in SQL. Kernels like SortToIndices are a different type of function (in other words, "not a SQL function"), so if we choose to allow such "non-SQL-like" functions in the expression evaluator, then different logic must be used.

> Regards
>
> Antoine.
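P.S. For anyone following along, here is a minimal sketch of what "our own caching layer" over an allocator could look like. All names here (`CachingPool`, `allocate_raw`) are invented for illustration; this is not Arrow's MemoryPool API, just the jemalloc-style idea of keeping freed blocks in per-size free lists instead of returning them to the system:

```python
# Illustrative sketch only -- not Arrow's MemoryPool API.
# Freed blocks are cached in per-size free lists and reused on the
# next allocation of the same size, avoiding a round trip to the
# underlying allocator.

class CachingPool:
    def __init__(self, allocate_raw):
        self._allocate_raw = allocate_raw  # e.g. bytearray
        self._free_lists = {}              # size -> list of cached blocks

    def allocate(self, size):
        cached = self._free_lists.get(size)
        if cached:
            return cached.pop()            # reuse without hitting the allocator
        return self._allocate_raw(size)

    def free(self, block):
        # Cache the block for later reuse instead of releasing it.
        self._free_lists.setdefault(len(block), []).append(block)
```

Whether this beats jemalloc's own thread caches is exactly the kind of thing the experiments above would have to measure.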
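P.P.S. To make the SortToIndices discussion concrete, here is a hedged sketch of the two-pass dispatch Antoine describes: a first pass scans the input for min/max, and the kernel then picks an O(n) counting sort when the value range is small, falling back to a comparison sort otherwise. The function name and threshold are invented for the example; this is the general technique, not the actual Arrow C++ implementation:

```python
# Sketch of a two-pass sort-to-indices kernel (illustrative, not
# Arrow's code). Pass 1 computes min/max statistics; pass 2 picks a
# sorting algorithm based on them and writes only to the output.

def sort_to_indices(values, max_counting_range=2**16):
    if not values:
        return []
    # Pass 1: scan the input for min and max.
    lo, hi = min(values), max(values)
    if hi - lo < max_counting_range:
        # Small range: O(n) counting sort. Bucket input positions by
        # value, then emit them bucket by bucket (stable).
        buckets = [[] for _ in range(hi - lo + 1)]
        for idx, v in enumerate(values):
            buckets[v - lo].append(idx)
        return [idx for bucket in buckets for idx in bucket]
    # Large range: fall back to a comparison-based argsort.
    return sorted(range(len(values)), key=values.__getitem__)
```

Note that even though the kernel reads its input twice, only the second pass produces output, which is why in-place output allocation happens not to break it, as Antoine points out.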