[PATCH] D49274: [CUDA] Provide integer SIMD functions for CUDA-9.2

2018-07-20 Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. tra marked 2 inline comments as done. Closed by commit rC337587: [CUDA] Provide integer SIMD functions for CUDA-9.2 (authored by tra). Changed prior to commit:

2018-07-20 Benjamin Kramer via Phabricator via cfe-commits
bkramer accepted this revision. bkramer added a comment. This revision is now accepted and ready to land. lg https://reviews.llvm.org/D49274

2018-07-19 Artem Belevich via Phabricator via cfe-commits
tra marked 2 inline comments as done. tra added a comment. Ben, PTAL.
Comment at: clang/lib/Headers/__clang_cuda_device_functions.h:1080
+  unsigned int r;
+  asm("vabsdiff2.u32.u32.u32.sat %0,%1,%2,0;" : "=r"(r) : "r"(__a), "r"(__b));
+  return r;
bkramer

2018-07-19 Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 156397. tra added a comment. Fixed the issues pointed out by bkramer@. Apparently, .sat does not matter for the vabsdiff instruction with unsigned operands. My tests were also missing __vabsssN. https://reviews.llvm.org/D49274 Files:
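The remark about .sat can be illustrated with a small host-side model. This is only a sketch, assuming that saturation means clamping each 16-bit lane result to the lane's representable range; the helper names (lane_absdiff_u16, saturate_u16, lane_absdiff_s16) are hypothetical and are not part of the patch or of any CUDA header.

```cpp
#include <cstdint>

// Hypothetical model of one unsigned 16-bit lane: |x - y| for x, y in [0, 0xFFFF].
constexpr std::uint32_t lane_absdiff_u16(std::uint32_t x, std::uint32_t y) {
  return x > y ? x - y : y - x;
}

// Assumed meaning of saturation for an unsigned lane: clamp to [0, 0xFFFF].
constexpr std::uint32_t saturate_u16(std::uint32_t v) {
  return v > 0xFFFFu ? 0xFFFFu : v;
}

// Hypothetical model of one signed 16-bit lane, computed in wider arithmetic.
constexpr std::int32_t lane_absdiff_s16(std::int32_t x, std::int32_t y) {
  return x > y ? x - y : y - x;
}

// The largest unsigned difference is 0xFFFF - 0, which already fits in the
// lane, so the clamp leaves the result unchanged.
static_assert(saturate_u16(lane_absdiff_u16(0xFFFFu, 0u)) ==
                  lane_absdiff_u16(0xFFFFu, 0u),
              "clamping changes nothing at the unsigned extreme");

// With signed lanes the difference can exceed the signed 16-bit maximum,
// which is where a saturating variant could behave differently.
static_assert(lane_absdiff_s16(32767, -32768) > 32767,
              "signed difference can overflow the lane");
```

Since an unsigned lane difference can never exceed 0xFFFF, the clamp in this model is a no-op for every input pair, which is consistent with the observation above.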

2018-07-19 Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 156386. tra added a comment. Fixed inline asm syntax. Added a workaround for the bug in __vmaxs2() discovered during testing. I've got a set of tests for these functions that I'll add to test-suite shortly. AFAICT this implementation matches NVIDIA's bit-for-bit.

2018-07-18 Artem Belevich via Phabricator via cfe-commits
tra added a comment. I'm in the middle of writing the tests for these as it's very easy to mess things up. I'll update the patch once I run it through the tests. Another problem with the patch in the current form is that these instructions apparently do not accept immediate arguments. PTX is a

2018-07-18 Benjamin Kramer via Phabricator via cfe-commits
bkramer accepted this revision. bkramer added inline comments. This revision is now accepted and ready to land.
Comment at: clang/lib/Headers/__clang_cuda_device_functions.h:1080
+  unsigned int r;
+  asm("vabsdiff2.u32.u32.u32.sat %0,%1,%2,0;" : "=r"(r) : "r"(__a), "r"(__b));
+

2018-07-12 Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added reviewers: jlebar, bkramer. Herald added subscribers: bixia, sanjoy. CUDA-9.2 made all integer SIMD functions into compiler builtins, so clang no longer has access to the implementation of these functions in either headers or libdevice and has to provide its
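For readers unfamiliar with the integer SIMD functions, here is a minimal host-side sketch of what one of them computes. It models the per-halfword unsigned absolute difference (the operation CUDA exposes as __vabsdiffu2) in portable C++. The names ref_absdiff_u16 and ref_vabsdiffu2 are hypothetical reference helpers; this is not the code from the patch, which uses PTX inline asm in the header.

```cpp
#include <cstdint>

// Absolute difference of one unsigned 16-bit lane (inputs already masked).
constexpr std::uint32_t ref_absdiff_u16(std::uint32_t x, std::uint32_t y) {
  return x > y ? x - y : y - x;
}

// Hypothetical reference model: treat each 32-bit word as two independent
// unsigned 16-bit lanes and pack the per-lane |a - b| back into one word.
constexpr std::uint32_t ref_vabsdiffu2(std::uint32_t a, std::uint32_t b) {
  return ref_absdiff_u16(a & 0xFFFFu, b & 0xFFFFu) |
         (ref_absdiff_u16(a >> 16, b >> 16) << 16);
}

// Low lane: |1 - 3| = 2, high lane: |5 - 2| = 3.
static_assert(ref_vabsdiffu2(0x00050001u, 0x00020003u) == 0x00030002u,
              "lanes are computed independently");
```

A model like this could serve as the expected value when comparing a device implementation against known inputs, though the actual tests mentioned in this thread are not shown here.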