Hi all -- I looked at some of the SIMD libraries listed by Yibo Cai earlier in the thread, and you might want to take a closer look at nsimd. It looks very polished and has CUDA support -- the only one I noticed that takes GPUs into account.
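For concreteness on what these libraries buy you: I haven't used nsimd itself, so as a reference point here's roughly what the write-once kernel looks like with xsimd, the library the thread is leaning toward. This is a minimal sketch based on the interface in xsimd's documentation (simd_type, load/store helpers, operator overloads); the function and variable names are mine, and the exact spellings may differ by version:

    #include <cstddef>
    #include <vector>
    #include "xsimd/xsimd.hpp"

    // Average two arrays element-wise. The same source compiles to SSE, AVX,
    // or NEON depending on the target; the batch width is chosen at compile time.
    void mean(const std::vector<float>& a, const std::vector<float>& b,
              std::vector<float>& out)
    {
        using batch = xsimd::simd_type<float>;      // widest batch for the target ISA
        const std::size_t inc = batch::size;        // elements per batch
        const std::size_t n = out.size();
        const std::size_t vec_n = n - n % inc;      // largest multiple of the batch width
        for (std::size_t i = 0; i < vec_n; i += inc) {
            batch av = xsimd::load_unaligned(&a[i]);
            batch bv = xsimd::load_unaligned(&b[i]);
            batch rv = (av + bv) / 2.f;
            rv.store_unaligned(&out[i]);
        }
        for (std::size_t i = vec_n; i < n; ++i)     // scalar tail
            out[i] = (a[i] + b[i]) / 2.f;
    }

nsimd pitches the same write-once model, with CUDA as an additional backend, which is why it caught my eye.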
Speaking of GPUs: in what ways is Arrow optimized for GPU compute? I'm new to Arrow, and I noticed this bit on the homepage: "...organized for efficient analytic operations on modern hardware like CPUs and GPUs." Does that mean there's actual code targeting GPUs, e.g. CUDA, OpenCL, or C++ AMP (Microsoft)? Or is it more a matter of thoughtful, pre-emptive GPU-readiness in the design of the format?

Getting back to the SIMD library decision, my humble feedback is that it deserves a bit more evaluative attention. The number of GitHub stars and contributors seemed to be the driving considerations in the parts of the thread I saw. GitHub stars wouldn't make my top-three criteria, and might not make my list at all -- I'm not even sure what that metric signifies, general interest perhaps. (For the unfamiliar, it's not a star rating like for movies, just a count.) There's a lot more to look at than star count or contributor count, performance for one: SIMD libraries are definitely not equal on performance. Bugginess too -- I wish there were easier, maybe automated, ways to evaluate projects and libraries on code quality. And I assume there are Arrow-specific criteria that matter as well, which would be completely orthogonal to the number of stars on GitHub.

nsimd looks polished, and that might be because it's from a company specializing in high-performance computing: https://agenium-scale.com. I hadn't heard of them, but it looks good. One thing that confuses me is that most of the nsimd code is under "include/nsimd/modules/fixed_point". There's no mention of floating point, there's hardly any code outside of that tree, and I'm not sure why fixed point would be the focus -- they don't seem to talk about it, or I missed it. I'm not sure whether this matters for Arrow. Their CUDA support stands out, but I couldn't find much code behind it.

Their Arm SVE support also stands out, but it's not clear that SVE actually exists in the wild yet. SVE is Arm's Scalable Vector Extension, which lets SIMD code be written once, without a hard-coded vector length, and adapt automatically to whatever width the CPU provides (there's a rough sketch of what that looks like in the P.S. at the end of this message). Arm's existing SIMD (NEON) is 128 bits wide; with SVE, 256- and 512-bit widths become trivial, but I don't know of any shipping implementations. Do Amazon's new Graviton2 chips support it? I hadn't heard that, nor of any support from Cavium, Marvell, or anyone else in the Arm server space. SVE is very new.

For code quality checking, you could throw a library up on Coverity Scan. It's free for open-source projects: automated static analysis, with support for C and C++ code, among others. It would be useful for Arrow too, if you're not using it already.

Anyway, those are my thoughts for now.

Cheers,

Joe Duarte

-----Original Message-----
From: Antoine Pitrou <[email protected]>
Sent: Saturday, February 13, 2021 2:49 AM
To: [email protected]
Subject: Re: [C++] adopting an SIMD library - xsimd

On Fri, 12 Feb 2021 20:47:21 -0800
Micah Kornfield <[email protected]> wrote:
> That is unfortunate, like I said if the consensus is xsimd, let's move
> forward with that.

I would say it's a soft consensus for now, and I would welcome more viewpoints on the matter.

Regards

Antoine.
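P.S. On SVE's "write once, run at any vector length" model mentioned above, here is a rough sketch of what it looks like at the intrinsics level (the ACLE interface in <arm_sve.h>). This is just my own illustration, not code from nsimd or Arrow, and the function name is made up:

    #include <arm_sve.h>   // SVE ACLE intrinsics; needs an SVE-enabled compiler target
    #include <cstdint>

    // Element-wise add of two float arrays. No vector width is hard-coded:
    // svcntw() reports how many 32-bit lanes the hardware provides (4 on a
    // 128-bit implementation, 16 on a 512-bit one), and the svwhilelt
    // predicate masks off the tail, so the same binary runs at any SVE width.
    void add_arrays(const float* a, const float* b, float* out, int64_t n)
    {
        for (int64_t i = 0; i < n; i += svcntw()) {
            svbool_t pg = svwhilelt_b32(i, n);        // lanes still in range
            svfloat32_t va = svld1(pg, &a[i]);        // predicated loads
            svfloat32_t vb = svld1(pg, &b[i]);
            svst1(pg, &out[i], svadd_x(pg, va, vb));  // predicated add + store
        }
    }

The predicate-based tail handling, instead of a separate scalar remainder loop, is a big part of the appeal.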
