Hi Edmon,

Since Arrow arrays are arranged with like data in contiguous memory regions (for example, in an array of strings, the UTF8 bytes are all laid out in contiguous memory -- see https://github.com/apache/arrow/blob/master/format/Layout.md), the layout is cache-friendly for scan operations and amenable to SIMD computations (for example: SIMD-accelerated hash functions). This is especially important for nested data, since all the "leaf nodes" in a nested structure are stored in contiguous memory.
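To make that concrete, here is a minimal standalone sketch (not Arrow library code, just an illustration in the spirit of Layout.md) of a string column stored as a contiguous offsets buffer plus a contiguous UTF8 values buffer; a scan only ever walks two flat buffers:

// Illustrative only -- a toy string column with the offsets/values layout
// described in Layout.md, not the actual Arrow C++ classes.
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

struct StringColumn {
  std::vector<int32_t> offsets;  // length + 1 entries
  std::vector<char> values;      // all UTF8 bytes, back to back
};

StringColumn MakeColumn(const std::vector<std::string>& strings) {
  StringColumn col;
  col.offsets.push_back(0);
  for (const std::string& s : strings) {
    col.values.insert(col.values.end(), s.begin(), s.end());
    col.offsets.push_back(static_cast<int32_t>(col.values.size()));
  }
  return col;
}

// Scanning touches only the two contiguous buffers, which is what makes it
// cache-friendly: here we just sum the string lengths from the offsets.
int64_t TotalLength(const StringColumn& col) {
  int64_t total = 0;
  for (size_t i = 0; i + 1 < col.offsets.size(); ++i) {
    total += col.offsets[i + 1] - col.offsets[i];
  }
  return total;
}

int main() {
  StringColumn col = MakeColumn({"arrow", "intel", "simd"});
  std::printf("total bytes: %lld\n", static_cast<long long>(TotalLength(col)));
  return 0;
}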
We have not started doing this yet, but it would be useful to begin assembling kernels that use CPU SIMD intrinsics (SSE/AVX) in the Arrow codebase, and to make them easily accessible (a rough sketch of what such a kernel could look like is appended after the quoted message below). Having a standard benchmark suite and other performance experimentation tools available for users to run on their hardware would also be great.

best,
Wes

On Wed, Mar 2, 2016 at 10:21 AM, Edmon Begoli <ebeg...@gmail.com> wrote:
> Hey folks,
>
> How could I get more details on what and how Arrow uses Intel CPUs for
> whatever computational advantage?
>
> At JICS, we run very large experimental Intel HPC systems, and I would like
> to learn how can we possibly run some interesting Arrow on Intel CPUs
> experiments.
>
> Thank you,
> Edmon
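P.S. The sketch mentioned above -- to be clear, nothing like this exists in the Arrow codebase yet; it is only a rough illustration, assuming an x86-64 target with SSE2, of the kind of intrinsics-based kernel over a contiguous int32 buffer that I have in mind:

// Illustrative only -- sum over a contiguous int32 buffer using SSE2
// intrinsics, with a scalar tail for the leftover elements.
// (Note the 32-bit lane accumulator can overflow for very large inputs;
// a real kernel would widen to 64-bit lanes.)
#include <emmintrin.h>  // SSE2
#include <cstdint>
#include <cstdio>
#include <vector>

int64_t SumInt32Sse2(const int32_t* values, size_t length) {
  size_t i = 0;
  __m128i acc = _mm_setzero_si128();
  // Process 4 lanes at a time straight out of the contiguous values buffer.
  for (; i + 4 <= length; i += 4) {
    __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(values + i));
    acc = _mm_add_epi32(acc, v);
  }
  alignas(16) int32_t lanes[4];
  _mm_store_si128(reinterpret_cast<__m128i*>(lanes), acc);
  int64_t total =
      static_cast<int64_t>(lanes[0]) + lanes[1] + lanes[2] + lanes[3];
  // Scalar tail for the remaining elements.
  for (; i < length; ++i) {
    total += values[i];
  }
  return total;
}

int main() {
  std::vector<int32_t> data(1000, 3);
  std::printf("sum = %lld\n",
              static_cast<long long>(SumInt32Sse2(data.data(), data.size())));
  return 0;
}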