On Fri, 17 Apr 2026 18:55:56 GMT, Vladimir Ivanov <[email protected]> wrote:
> Any particular reason to implement it as a stub generated on-the-fly?
> Alternatively, it can live in a native library dynamically linked at runtime.
> As an example, SIMD sort routines are shaped that way (take a look at
> libsimdsort).

@iwanowww I did see libsimdsort before starting the implementation. I don't think a search algorithm belongs in it, and I don't think there should be a new libsimdsearch: it would be loaded as a separate `.so` file and need its own build/make code etc., which seems like overkill for a ~400 line stub.

Perhaps more importantly, a stub integrates better with C2. It allows us to create two entry conditions based on the input array length: one for a compile-time known length (the `length_type` check) and one for a length known only at runtime (using `generate_fair_guard`). Both fall back to the non-intrinsic version for small arrays, for some definition of small. These checks matter most when calling the stub costs more than running the non-intrinsified default version.

I also found that libsimdsort is Linux-only, whereas the current stub supports Windows too.

---

Thanks @jaskarth, I will polish the benchmark and plotter and share them here or in the issue.

---

> @krk It looks like you put a lot of effort into this patch, so thank you for
> that :)

@eme64 Thanks, I started working on this even before I created the issue.

> Could we instead use the Vector API, once it is fully available?

I find a Vector API implementation interesting, but I think it is orthogonal to the current PR. The current PR is a concrete implementation available now, without depending on the Vector API or any "experimental/preview" features. It would be interesting to see a benchmark of a Vector API implementation vs. the current PR at some point. I do not think the existence of the Vector API should be a blocker for adding new intrinsics.

> Each one of them is drilling a hole through the JVM. And assembly code is
> harder to review, it is easier to introduce bugs.

I was not under the impression that new intrinsics would be forbidden once the Vector API exists. I think drilling these holes is more acceptable with the fallbacks I included: we go to native land only if we know it will be faster. I am happy to add more correctness tests where relevant.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/30612#issuecomment-4390461842
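For readers following the thread, the length-based entry conditions discussed above can be modeled roughly as follows. This is only a Java-level sketch of the idea, not the C2 code from the PR: the class name, the threshold value `SMALL_LENGTH_THRESHOLD`, the `Path` enum, and the use of a linear `indexOf` as the search are all assumptions made for illustration. In the real patch the decision is made inside the compiler (via the `length_type` check or the runtime guard), not in Java code.

```java
// Hypothetical model of "use the stub only when the array is large enough".
class EntryGuardSketch {
    // Assumed cutoff for "small"; the thread does not state the real value.
    static final int SMALL_LENGTH_THRESHOLD = 32;

    enum Path { SCALAR_FALLBACK, STUB }

    // Models the guard: short arrays stay on the default path because the
    // call overhead of the stub would outweigh its benefit.
    static Path choosePath(int length) {
        return length < SMALL_LENGTH_THRESHOLD ? Path.SCALAR_FALLBACK : Path.STUB;
    }

    // Stand-in for the non-intrinsified default version.
    static int scalarIndexOf(int[] a, int key) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == key) {
                return i;
            }
        }
        return -1;
    }

    // Placeholder for the generated stub. It must return the same result as
    // the scalar version, so this sketch simply delegates to it.
    static int stubIndexOf(int[] a, int key) {
        return scalarIndexOf(a, key);
    }

    // Entry point: pick a path from the length, then search.
    static int indexOf(int[] a, int key) {
        return choosePath(a.length) == Path.STUB
                ? stubIndexOf(a, key)
                : scalarIndexOf(a, key);
    }
}
```

When the length is a compile-time constant, the equivalent of `choosePath` can be resolved during compilation and no runtime branch is needed; otherwise the comparison stays in the generated code as a guard.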
