On Wed, 27 Nov 2024 14:45:36 GMT, Andrew Haley <a...@openjdk.org> wrote:

>> For CRC32 digest computation we do support intrinsic at interpreter and c1 
>> compiler level to overcome such warmup related penalties.
>
> This is not just a good idea to trigger OSR and therefore use the intrinsic, 
> it's a good idea because very long data causes an extended time to safepoint. 
> I'd support in all cases limiting the size to about a megabyte, which is what 
> we have here.

As Andrew points out, handing an intrinsic a lot of data 'backdoors' (breaks) a 
lot of existing mechanisms, ranging from GC not happening because there is no 
safepoint inside the intrinsic, to OSR.

There is also (what I believe to be the performance issue here) the call count 
(CompilationThreshold) needed to get the intrinsic (well, the callee) compiled 
in the first place. Though, as I pointed out in the original issue, I am not 
entirely convinced it was the call count that got the intrinsic back in; 
experimentally, chunking got the 'outer intrinsic' to compile. (There is also 
an inner intrinsic that works on 16-byte chunks.)
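
To make the chunking idea concrete, here is a minimal sketch in Java, using 
java.util.zip.CRC32 (the example from the quoted text) rather than the code 
touched by this PR. The class name ChunkedCrc32, the constant CHUNK_BYTES and 
the method compute are hypothetical names chosen for illustration, and the 
1 MiB limit is only an assumption based on the "about a megabyte" figure above. 
Each update() call hands the intrinsic a bounded slice, so the thread returns 
to Java code (and can reach a safepoint) between slices, and the loop gives 
OSR a back-edge to count.

```java
import java.util.Random;
import java.util.zip.CRC32;

// Hypothetical helper, for illustration only; not the change made in the PR.
final class ChunkedCrc32 {
    // Assumed upper bound on the data handed to one intrinsic call (1 MiB).
    static final int CHUNK_BYTES = 1 << 20;

    // Computes the CRC32 of data by feeding it in slices of at most CHUNK_BYTES.
    static long compute(byte[] data) {
        CRC32 crc = new CRC32();
        for (int off = 0; off < data.length; off += CHUNK_BYTES) {
            int len = Math.min(CHUNK_BYTES, data.length - off);
            crc.update(data, off, len); // bounded work per call
        }
        return crc.getValue();
    }

    public static void main(String[] args) {
        // Input that is deliberately not a multiple of the chunk size.
        byte[] data = new byte[5 * CHUNK_BYTES + 123];
        new Random(42).nextBytes(data);

        // Reference value from a single unchunked update.
        CRC32 whole = new CRC32();
        whole.update(data);

        System.out.println(compute(data) == whole.getValue());
    }
}
```

The checksum is unaffected by where the slice boundaries fall, so the chunk 
size can be tuned purely for time-to-safepoint and compilation behaviour.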

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22300#discussion_r1860822482
