gtbercea added inline comments.
================
Comment at: lib/Basic/Targets/NVPTX.cpp:232
+ // getting inlined on the device.
+ Builder.defineMacro("__NO_MATH_INLINES");
}
----------------
tra wrote:
> This relies on implementation detail of particular variant of the header file
> you're assuming all compilations will include. This is a workaround of the
> real problem (attempting to use headers from machine X while targeting Y) at
> best.
>
> D50845 is dealing with the issue of headers for target code. Hopefully,
> they'll find a way to provide device-specific headers, so you don't rely on
> host headers being parseable during device-side compilation.
I agree. The proper fix would be what the other patch is attempting to do.
================
Comment at: lib/Driver/ToolChains/Clang.cpp:4758
+ // toolchain.
+ CmdArgs.push_back("-fno-math-builtin");
}
----------------
tra wrote:
> Could you elaborate on why you don't want the builtins?
> Builtins are enabled and are useful for CUDA. What makes their use different
> for OpenMP?
> Are you doing it to guarantee that math functions remain unresolved in IR so
> you could link them in from external bitcode?
>
That's right. I don't particularly like this approach since it leads to the
OpenMP-NVPTX toolchain missing out on optimizations such as replacing a math
function call with basic operations (e.g. pow(a, 2) -> a * a).
I am trying to fix this in a future patch by allowing intrinsics/builtins to
propagate.
Repository:
rC Clang
https://reviews.llvm.org/D47849
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits