Re: [PATCH] D19990: [CUDA] Implement __ldg using intrinsics.

Justin Lebar via cfe-commits Mon, 09 May 2016 10:22:43 -0700

jlebar added a comment.

Art pointed out to me that CUDA 8 adds a bunch more load intrinsics, and my 
first reaction was: ohmygosh, maybe we *do* want to do the variadic intrinsic 
thing here.


But now looking at how __builtin_add_overflow is implemented, we'd need special 
Sema checking to make it work.  We would also need some sort of argument 
promotion logic to make the value and pointer types agree.  In both cases it 
seems like maybe it's better to leave this stuff to clang, rather than trying 
to write a buggy implementation ourselves?
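
For anyone who hasn't looked at it, here is a rough host-side sketch of what 
"type-generic" means for __builtin_add_overflow: one builtin name accepts 
mixed integer types, which is exactly why it needs its own checking in the 
compiler.  The checked_add wrapper and the demo function are names I made up 
for illustration; only the builtin itself is real.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Illustrative wrapper (hypothetical name, not part of the patch).
// __builtin_add_overflow computes a + b as if in infinite precision, stores
// the result converted to *out, and returns true if it did not fit.
template <typename A, typename B, typename R>
bool checked_add(A a, B b, R *out) {
  return __builtin_add_overflow(a, b, out);
}

inline void demo_add_overflow() {
  const int32_t big = std::numeric_limits<int32_t>::max();
  int32_t narrow = 0;
  int64_t wide = 0;
  assert(checked_add(big, 1, &narrow));  // sum does not fit in int32_t
  assert(!checked_add(big, 1, &wide));   // same operands, fits in int64_t
  assert(wide == 2147483648LL);
}
```

The operand and result types are all independent here, so the compiler has to 
validate each call site itself.  A variadic load builtin would need the same 
kind of custom checking.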

Even with the many new load intrinsics, listing all the intrinsics is a 
relatively small part of the code required.  The majority of the necessary 
code is in our CUDA header, and even with a variadic builtin that would be 
hard to reduce without some serious template magic, which in turn would be 
doubly difficult to do without exposing crummy diagnostics to users.
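
To make the header point concrete, here is a minimal host-only sketch of the 
overload-per-type pattern, under the assumption that each supported type gets 
its own builtin.  The sketch_ld_* functions are hypothetical stand-ins for the 
per-type builtins (here they are plain loads), and sketch_ldg is not the real 
header code.

```cpp
#include <cassert>

// Hypothetical stand-ins for per-type load builtins; on the host these are
// ordinary loads.
inline int sketch_ld_i(const int *p) { return *p; }
inline float sketch_ld_f(const float *p) { return *p; }

// One overload per supported pointer type.  Ordinary overload resolution
// picks the right one and rejects unsupported types with a normal
// "no matching function" diagnostic.
inline int sketch_ldg(const int *p) { return sketch_ld_i(p); }
inline float sketch_ldg(const float *p) { return sketch_ld_f(p); }

// Unsigned types reuse the signed stand-in through casts, so the header
// grows with the number of types even when no new builtin is needed.
inline unsigned sketch_ldg(const unsigned *p) {
  return static_cast<unsigned>(
      sketch_ld_i(reinterpret_cast<const int *>(p)));
}
```

Most of the bulk is in wrappers like these, not in the builtin list, which is 
why a single variadic builtin would not shrink the header much on its own.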

What do you all think?


http://reviews.llvm.org/D19990



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
