[PATCH] D154797: [CUDA][HIP] Rename and fix `-fcuda-approx-transcendentals`

2023-07-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:7228
+  } else {
+Args.ClaimAllArgs(options::OPT_fgpu_approx_transcendentals);
+Args.ClaimAllArgs(options::OPT_fno_gpu_approx_transcendentals);

MaskRay wrote:
> You can use `Args.claimAllArgs(options::OPT_fgpu_approx_transcendentals, 
> options::OPT_fno_gpu_approx_transcendentals);`
will do



Comment at: clang/lib/Frontend/InitPreprocessor.cpp:1294
+if (!LangOpts.HIP)
+  Builder.defineMacro("__CLANG_CUDA_APPROX_TRANSCENDENTALS__");
+Builder.defineMacro("__CLANG_GPU_APPROX_TRANSCENDENTALS__");

tra wrote:
> I think we can remove it. I don't think we need to keep the old one around. 
> Internal headers have been changed and the macro was never intended for 
> public use. 
will remove



Comment at: clang/test/Driver/hip-options.hip:209
+
+// APPROXNEG-NOT: warning

MaskRay wrote:
> If `%t` happens to be in a path with `warning` as a substring, this will 
> spuriously fail.
> 
> Suggest `%clang -fdriver-only -Werror... 2>&1 | count 0` to test that there 
> is no warning/error.
will do


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154797/new/

https://reviews.llvm.org/D154797

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154797: [CUDA][HIP] Rename and fix `-fcuda-approx-transcendentals`

2023-07-25 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
yaxunl marked 4 inline comments as done.
Closed by commit rGe17882430e5c: [CUDA][HIP] Rename and fix 
`-fcuda-approx-transcendentals` (authored by yaxunl).
Herald added a project: clang.

Changed prior to commit:
  https://reviews.llvm.org/D154797?vs=538787&id=544001#toc

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D154797

Files:
  clang/include/clang/Basic/LangOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Driver/ToolChains/Cuda.cpp
  clang/lib/Driver/ToolChains/HIPAMD.cpp
  clang/lib/Driver/ToolChains/HIPSPV.cpp
  clang/lib/Frontend/InitPreprocessor.cpp
  clang/lib/Headers/__clang_cuda_math.h
  clang/lib/Headers/__clang_cuda_runtime_wrapper.h
  clang/lib/Headers/__clang_hip_math.h
  clang/test/Driver/hip-macros.hip
  clang/test/Driver/hip-options.hip
  clang/test/Headers/__clang_hip_math.hip
  clang/test/Headers/nvptx_device_math_sin.c
  clang/test/Headers/nvptx_device_math_sin.cpp
  clang/test/Preprocessor/cuda-approx-transcendentals.cu

Index: clang/test/Preprocessor/cuda-approx-transcendentals.cu
===
--- clang/test/Preprocessor/cuda-approx-transcendentals.cu
+++ clang/test/Preprocessor/cuda-approx-transcendentals.cu
@@ -3,6 +3,6 @@
 // RUN: %clang -fcuda-approx-transcendentals --cuda-device-only -nocudainc -nocudalib -target i386-unknown-linux-gnu -x cuda -E -dM -o - /dev/null | FileCheck --check-prefix DEVICE-FAST %s
 // RUN: %clang -ffast-math --cuda-device-only -nocudainc -nocudalib -target i386-unknown-linux-gnu -x cuda -E -dM -o - /dev/null | FileCheck --check-prefix DEVICE-FAST %s
 
-// HOST-NOT: __CLANG_CUDA_APPROX_TRANSCENDENTALS__
-// DEVICE-NOFAST-NOT: __CLANG_CUDA_APPROX_TRANSCENDENTALS__
-// DEVICE-FAST: __CLANG_CUDA_APPROX_TRANSCENDENTALS__
+// HOST-NOT: __CLANG_GPU_APPROX_TRANSCENDENTALS__
+// DEVICE-NOFAST-NOT: __GPU_CUDA_APPROX_TRANSCENDENTALS__
+// DEVICE-FAST: __CLANG_GPU_APPROX_TRANSCENDENTALS__
Index: clang/test/Headers/nvptx_device_math_sin.cpp
===
--- clang/test/Headers/nvptx_device_math_sin.cpp
+++ clang/test/Headers/nvptx_device_math_sin.cpp
@@ -2,7 +2,7 @@
 // RUN: %clang_cc1 -x c++ -internal-isystem %S/Inputs/include -fopenmp -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc %s -o %t-ppc-host.bc
 // RUN: %clang_cc1 -x c++ -include __clang_openmp_device_functions.h -internal-isystem %S/../../lib/Headers/openmp_wrappers -internal-isystem %S/Inputs/include -fopenmp -triple nvptx64-nvidia-cuda -aux-triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-target-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck %s --check-prefix=SLOW
 // RUN: %clang_cc1 -x c++ -internal-isystem %S/Inputs/include -fopenmp -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc %s -o %t-ppc-host.bc -ffast-math -ffp-contract=fast
-// RUN: %clang_cc1 -x c++ -include __clang_openmp_device_functions.h -internal-isystem %S/../../lib/Headers/openmp_wrappers -internal-isystem %S/Inputs/include -fopenmp -triple nvptx64-nvidia-cuda -aux-triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-target-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - -ffast-math -ffp-contract=fast | FileCheck %s --check-prefix=FAST
+// RUN: %clang_cc1 -x c++ -include __clang_openmp_device_functions.h -internal-isystem %S/../../lib/Headers/openmp_wrappers -internal-isystem %S/Inputs/include -fopenmp -triple nvptx64-nvidia-cuda -aux-triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-target-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - -fgpu-approx-transcendentals -ffast-math -ffp-contract=fast | FileCheck %s --check-prefix=FAST
 // expected-no-diagnostics
 
 #include 
Index: clang/test/Headers/nvptx_device_math_sin.c
===
--- clang/test/Headers/nvptx_device_math_sin.c
+++ clang/test/Headers/nvptx_device_math_sin.c
@@ -2,7 +2,7 @@
 // RUN: %clang_cc1 -x c -internal-isystem %S/Inputs/include -fopenmp -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc %s -o %t-ppc-host.bc
 // RUN: %clang_cc1 -x c -include __clang_openmp_device_functions.h -internal-isystem %S/../../lib/Headers/openmp_wrappers -internal-isystem %S/Inputs/include -fopenmp -triple nvptx64-nvidia-cuda -aux-triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-target-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck %s --check-prefix=SLOW
 // RUN: %clang_cc1 -x c -internal-isystem %S/Inputs/include

[PATCH] D154797: [CUDA][HIP] Rename and fix `-fcuda-approx-transcendentals`

2023-07-10 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay accepted this revision.
MaskRay added a comment.

Some nits about testing



[PATCH] D154797: [CUDA][HIP] Rename and fix `-fcuda-approx-transcendentals`

2023-07-10 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added inline comments.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:7228
+  } else {
+Args.ClaimAllArgs(options::OPT_fgpu_approx_transcendentals);
+Args.ClaimAllArgs(options::OPT_fno_gpu_approx_transcendentals);

You can use `Args.claimAllArgs(options::OPT_fgpu_approx_transcendentals, 
options::OPT_fno_gpu_approx_transcendentals);`
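
For readers unfamiliar with "claiming": the driver marks an argument as used so that it is not later reported as unused. The following is a minimal standalone sketch of that idea, not Clang's `ArgList`; the `OptID`, `Arg`, `claimAll`, and `unusedArgs` names are hypothetical and exist only for illustration.

```cpp
#include <initializer_list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the driver's option IDs (not Clang's real enum).
enum class OptID { FGpuApprox, FNoGpuApprox, FFastMath };

// A parsed command-line argument with a "claimed" (i.e. used) bit.
struct Arg {
  OptID id;
  std::string spelling;
  bool claimed;
  Arg(OptID i, std::string s) : id(i), spelling(std::move(s)), claimed(false) {}
};

// Mark every argument whose ID appears in `ids` as claimed, in one call.
void claimAll(std::vector<Arg> &args, std::initializer_list<OptID> ids) {
  for (Arg &a : args)
    for (OptID id : ids)
      if (a.id == id)
        a.claimed = true;
}

// Collect the spellings that a driver would report as unused.
std::vector<std::string> unusedArgs(const std::vector<Arg> &args) {
  std::vector<std::string> out;
  for (const Arg &a : args)
    if (!a.claimed)
      out.push_back(a.spelling);
  return out;
}
```

In this model, claiming both the positive and the negative option in a single call leaves only unrelated arguments in the unused list, which is the effect the two separate calls in the patch have.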



Comment at: clang/test/Driver/hip-options.hip:184
+
+// APPROX: clang{{.*}} "-triple" "amdgcn-amd-amdhsa" {{.*}} "-fgpu-approx-transcendentals"
+// APPROX: clang{{.*}} "-triple" "x86_64-unknown-linux-gnu" {{.*}} "-fgpu-approx-transcendentals"

Just test `-cc1`:
`// APPROX: "-cc1"{{.*}} "-triple" "amdgcn-amd-amdhsa" {{.*}} "-fgpu-approx-transcendentals"`

Testing `clang` requires `-no-canonical-prefixes`:
https://maskray.me/blog/2021-03-28-compiler-driver-and-cross-compilation#misc



Comment at: clang/test/Driver/hip-options.hip:209
+
+// APPROXNEG-NOT: warning

If `%t` happens to be in a path with `warning` as a substring, this will 
spuriously fail.

Suggest `%clang -fdriver-only -Werror... 2>&1 | count 0` to test that there is 
no warning/error.
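
The failure mode can be illustrated with a small standalone sketch. It only models the two checking strategies as plain string predicates; it is not how FileCheck or `count` are implemented, and the function names and sample paths are hypothetical.

```cpp
#include <string>

// Models a `-NOT: warning` style check: it passes only if the substring
// "warning" appears nowhere in the captured output, including inside paths.
bool passesNotWarningCheck(const std::string &output) {
  return output.find("warning") == std::string::npos;
}

// Models the idea behind `... 2>&1 | count 0`: it passes only if the
// command printed nothing at all.
bool passesCountZeroCheck(const std::string &output) {
  return output.empty();
}
```

With output such as `"/tmp/warning-tests/clang" "-cc1" ...`, the first predicate fails even though no diagnostic was emitted. The second predicate never inspects the text, so a path cannot trip it, but it does require an invocation that prints nothing on success.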



[PATCH] D154797: [CUDA][HIP] Rename and fix `-fcuda-approx-transcendentals`

2023-07-10 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision.
tra added inline comments.
This revision is now accepted and ready to land.



Comment at: clang/lib/Frontend/InitPreprocessor.cpp:1294
+if (!LangOpts.HIP)
+  Builder.defineMacro("__CLANG_CUDA_APPROX_TRANSCENDENTALS__");
+Builder.defineMacro("__CLANG_GPU_APPROX_TRANSCENDENTALS__");

I think we can remove it. I don't think we need to keep the old one around. 
Internal headers have been changed and the macro was never intended for public 
use. 



[PATCH] D154797: [CUDA][HIP] Rename and fix `-fcuda-approx-transcendentals`

2023-07-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 538787.
yaxunl marked 3 inline comments as done.
yaxunl edited the summary of this revision.
yaxunl added a comment.
Herald added a reviewer: jdoerfert.
Herald added subscribers: jplehr, asavonic, sstefan1.

Revised per review comments.


https://reviews.llvm.org/D154797

Files:
  clang/include/clang/Basic/LangOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Driver/ToolChains/Cuda.cpp
  clang/lib/Driver/ToolChains/HIPAMD.cpp
  clang/lib/Driver/ToolChains/HIPSPV.cpp
  clang/lib/Frontend/InitPreprocessor.cpp
  clang/lib/Headers/__clang_cuda_math.h
  clang/lib/Headers/__clang_cuda_runtime_wrapper.h
  clang/lib/Headers/__clang_hip_math.h
  clang/test/Driver/hip-macros.hip
  clang/test/Driver/hip-options.hip
  clang/test/Headers/__clang_hip_math.hip
  clang/test/Headers/nvptx_device_math_sin.c
  clang/test/Headers/nvptx_device_math_sin.cpp

Index: clang/test/Headers/nvptx_device_math_sin.cpp
===
--- clang/test/Headers/nvptx_device_math_sin.cpp
+++ clang/test/Headers/nvptx_device_math_sin.cpp
@@ -2,7 +2,7 @@
 // RUN: %clang_cc1 -x c++ -internal-isystem %S/Inputs/include -fopenmp -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc %s -o %t-ppc-host.bc
 // RUN: %clang_cc1 -x c++ -include __clang_openmp_device_functions.h -internal-isystem %S/../../lib/Headers/openmp_wrappers -internal-isystem %S/Inputs/include -fopenmp -triple nvptx64-nvidia-cuda -aux-triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck %s --check-prefix=SLOW
 // RUN: %clang_cc1 -x c++ -internal-isystem %S/Inputs/include -fopenmp -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc %s -o %t-ppc-host.bc -ffast-math -ffp-contract=fast
-// RUN: %clang_cc1 -x c++ -include __clang_openmp_device_functions.h -internal-isystem %S/../../lib/Headers/openmp_wrappers -internal-isystem %S/Inputs/include -fopenmp -triple nvptx64-nvidia-cuda -aux-triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - -ffast-math -ffp-contract=fast | FileCheck %s --check-prefix=FAST
+// RUN: %clang_cc1 -x c++ -include __clang_openmp_device_functions.h -internal-isystem %S/../../lib/Headers/openmp_wrappers -internal-isystem %S/Inputs/include -fopenmp -triple nvptx64-nvidia-cuda -aux-triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - -fgpu-approx-transcendentals -ffast-math -ffp-contract=fast | FileCheck %s --check-prefix=FAST
 // expected-no-diagnostics
 
 #include 
Index: clang/test/Headers/nvptx_device_math_sin.c
===
--- clang/test/Headers/nvptx_device_math_sin.c
+++ clang/test/Headers/nvptx_device_math_sin.c
@@ -2,7 +2,7 @@
 // RUN: %clang_cc1 -x c -internal-isystem %S/Inputs/include -fopenmp -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc %s -o %t-ppc-host.bc
 // RUN: %clang_cc1 -x c -include __clang_openmp_device_functions.h -internal-isystem %S/../../lib/Headers/openmp_wrappers -internal-isystem %S/Inputs/include -fopenmp -triple nvptx64-nvidia-cuda -aux-triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck %s --check-prefix=SLOW
 // RUN: %clang_cc1 -x c -internal-isystem %S/Inputs/include -fopenmp -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc %s -o %t-ppc-host.bc -ffast-math -ffp-contract=fast
-// RUN: %clang_cc1 -x c -include __clang_openmp_device_functions.h -internal-isystem %S/../../lib/Headers/openmp_wrappers -internal-isystem %S/Inputs/include -fopenmp -triple nvptx64-nvidia-cuda -aux-triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - -ffast-math -ffp-contract=fast | FileCheck %s --check-prefix=FAST
+// RUN: %clang_cc1 -x c -include __clang_openmp_device_functions.h -internal-isystem %S/../../lib/Headers/openmp_wrappers -internal-isystem %S/Inputs/include -fopenmp -triple nvptx64-nvidia-cuda -aux-triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - -fgpu-approx-transcendentals -ffast-math -ffp-contract=fast | FileCheck %s --check-prefix=FAST
 // expected-no-diagnostics
 
 #include 
Index: clang/test/Headers/__clang_hip_math.hip
===
--- clang/test/Headers/__clang_hip_math.hip
+++ 

[PATCH] D154797: [CUDA][HIP] Rename and fix `-fcuda-approx-transcendentals`

2023-07-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 3 inline comments as done.
yaxunl added inline comments.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:7221-7223
+bool UseApproxTranscendentals = false;
+if (Args.hasFlag(options::OPT_ffast_math, options::OPT_fno_fast_math,
+ false))

tra wrote:
> ```
> bool UseApproxTranscendentals = Args.hasFlag(options::OPT_ffast_math,
>                                              options::OPT_fno_fast_math, false);
> ```
fixed



Comment at: clang/lib/Frontend/InitPreprocessor.cpp:1292-1293
+  if (LangOpts.GPUDeviceApproxTranscendentals) {
+Builder.defineMacro(Twine("__CLANG_") + (LangOpts.HIP ? "HIP" : "CUDA") +
+"_APPROX_TRANSCENDENTALS__");
   }

tra wrote:
> We may want to rename the macro to `__CLANG_GPU_APPROX_TRANSCENDENTALS__`, 
> too. 
> 
will emit `__CLANG_GPU_APPROX_TRANSCENDENTALS__`



Comment at: clang/test/Driver/hip-options.hip:179
+// RUN: %clang -### --target=x86_64-unknown-linux-gnu -nogpuinc -nogpulib -fgpu-approx-transcendentals \
+// RUN:   --cuda-gpu-arch=gfx906  %s 2>&1 | FileCheck -check-prefixes=APPROX %s
+

MaskRay wrote:
> excess spaces before `%s`
> 
> ditto below
> 
> Prefer `--check-prefix=` when there is one single check.
fixed



[PATCH] D154797: [CUDA][HIP] Rename and fix `-fcuda-approx-transcendentals`

2023-07-10 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added inline comments.



Comment at: clang/test/Driver/hip-options.hip:179
+// RUN: %clang -### --target=x86_64-unknown-linux-gnu -nogpuinc -nogpulib -fgpu-approx-transcendentals \
+// RUN:   --cuda-gpu-arch=gfx906  %s 2>&1 | FileCheck -check-prefixes=APPROX %s
+

excess spaces before `%s`

ditto below

Prefer `--check-prefix=` when there is one single check.



[PATCH] D154797: [CUDA][HIP] Rename and fix `-fcuda-approx-transcendentals`

2023-07-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment.

Looks good in general.




Comment at: clang/lib/Driver/ToolChains/Clang.cpp:7221-7223
+bool UseApproxTranscendentals = false;
+if (Args.hasFlag(options::OPT_ffast_math, options::OPT_fno_fast_math,
+ false))

```
bool UseApproxTranscendentals = Args.hasFlag(options::OPT_ffast_math,
                                             options::OPT_fno_fast_math, false);
```



Comment at: clang/lib/Frontend/InitPreprocessor.cpp:1292-1293
+  if (LangOpts.GPUDeviceApproxTranscendentals) {
+Builder.defineMacro(Twine("__CLANG_") + (LangOpts.HIP ? "HIP" : "CUDA") +
+"_APPROX_TRANSCENDENTALS__");
   }

We may want to rename the macro to `__CLANG_GPU_APPROX_TRANSCENDENTALS__`, too. 
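
As an illustration of why one language-neutral name is convenient, a header can pick between an approximate and a precise routine with a single test of that macro. This is a hedged sketch only, not the contents of the Clang headers touched by this patch; `FAST_OR_SLOW`, `approx_sinf`, `precise_sinf`, `my_sinf`, and `usingApproxTranscendentals` are hypothetical names, and the approximate routine is a toy polynomial.

```cpp
#include <cmath>

// Choose the first or second spelling depending on whether the compiler
// predefined the macro discussed in this review.
#if defined(__CLANG_GPU_APPROX_TRANSCENDENTALS__)
#define FAST_OR_SLOW(fast, slow) fast
#else
#define FAST_OR_SLOW(fast, slow) slow
#endif

// Hypothetical stand-ins for a device library's approximate and precise sine.
inline float approx_sinf(float x) { return x - (x * x * x) / 6.0f; }
inline float precise_sinf(float x) { return std::sin(x); }

// The wrapper that user code would call; the choice is made at preprocessing time.
inline float my_sinf(float x) { return FAST_OR_SLOW(approx_sinf, precise_sinf)(x); }

// Reports which branch was selected, for inspection in tests.
inline bool usingApproxTranscendentals() { return FAST_OR_SLOW(true, false); }
```

With a single macro name, the same `#if` works unchanged whether the translation unit is compiled as CUDA or as HIP.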




[PATCH] D154797: [CUDA][HIP] Rename and fix `-fcuda-approx-transcendentals`

2023-07-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision.
yaxunl added reviewers: tra, MaskRay, b-sumner.
Herald added subscribers: mattd, carlosgalvezp.
Herald added a project: All.
yaxunl requested review of this revision.

Rename `-fcuda-approx-transcendentals` to
`-fgpu-approx-transcendentals` and pass it
to both device and host clang -cc1.

Fix its interaction with `-ffast-math` so that
`-fno-gpu-approx-transcendentals` can override
the `-fgpu-approx-transcendentals` implied
by `-ffast-math`.

Rename the predefined macro to be
`__CLANG_[CUDA|HIP]_APPROX_TRANSCENDENTALS__`
depending on the language. Emit the macro for both
device and host compilation.
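
The intended interaction with `-ffast-math` can be sketched as a two-step, last-one-wins lookup. This is a standalone model under stated assumptions, not the driver code from this patch: `hasFlag` here is a simplified re-implementation over a vector of strings, and `useApproxTranscendentals` is a hypothetical helper name.

```cpp
#include <string>
#include <vector>

// Simplified last-one-wins lookup: returns true if `pos` appears last,
// false if `neg` appears last, and `defaultValue` if neither is present.
bool hasFlag(const std::vector<std::string> &args, const std::string &pos,
             const std::string &neg, bool defaultValue) {
  bool value = defaultValue;
  for (const std::string &a : args) {
    if (a == pos)
      value = true;
    else if (a == neg)
      value = false;
  }
  return value;
}

// Decide whether -fgpu-approx-transcendentals should be forwarded to -cc1.
bool useApproxTranscendentals(const std::vector<std::string> &args) {
  // -ffast-math only supplies the default...
  bool impliedByFastMath =
      hasFlag(args, "-ffast-math", "-fno-fast-math", false);
  // ...so an explicit positive or negative GPU flag still overrides it.
  return hasFlag(args, "-fgpu-approx-transcendentals",
                 "-fno-gpu-approx-transcendentals", impliedByFastMath);
}
```

Because the fast-math result is used only as the default of the second lookup, `-ffast-math -fno-gpu-approx-transcendentals` resolves to "off", which matches the NOAPPROX cases in the `hip-options.hip` test added by this patch.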


https://reviews.llvm.org/D154797

Files:
  clang/include/clang/Basic/LangOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Driver/ToolChains/Cuda.cpp
  clang/lib/Driver/ToolChains/HIPAMD.cpp
  clang/lib/Driver/ToolChains/HIPSPV.cpp
  clang/lib/Frontend/InitPreprocessor.cpp
  clang/lib/Headers/__clang_cuda_math.h
  clang/lib/Headers/__clang_cuda_runtime_wrapper.h
  clang/lib/Headers/__clang_hip_math.h
  clang/test/Driver/hip-macros.hip
  clang/test/Driver/hip-options.hip
  clang/test/Headers/__clang_hip_math.hip

Index: clang/test/Headers/__clang_hip_math.hip
===
--- clang/test/Headers/__clang_hip_math.hip
+++ clang/test/Headers/__clang_hip_math.hip
@@ -22,7 +22,7 @@
 // RUN:   -internal-isystem %S/../../lib/Headers/cuda_wrappers \
 // RUN:   -internal-isystem %S/Inputs/include \
 // RUN:   -triple amdgcn-amd-amdhsa -aux-triple x86_64-unknown-unknown \
-// RUN:   -target-cpu gfx906 -emit-llvm %s -fcuda-is-device -O1 -fcuda-approx-transcendentals -o - \
+// RUN:   -target-cpu gfx906 -emit-llvm %s -fcuda-is-device -O1 -fgpu-approx-transcendentals -o - \
 // RUN:   -D__HIPCC_RTC__ | FileCheck -check-prefixes=CHECK,APPROX %s
 
 #define BOOL_TYPE int
Index: clang/test/Driver/hip-options.hip
===
--- clang/test/Driver/hip-options.hip
+++ clang/test/Driver/hip-options.hip
@@ -169,3 +169,41 @@
 // RUN: %clang -### -nogpuinc -nogpulib -fhip-fp32-correctly-rounded-divide-sqrt \
 // RUN:   --cuda-gpu-arch=gfx906  %s 2>&1 | FileCheck -check-prefixes=CRDS %s
 // CRDS-NOT: "-f{{(no-)?}}hip-fp32-correctly-rounded-divide-sqrt"
+
+// Check -fgpu-approx-transcendentals is passed to clang -cc1 but
+// (default) -fno-gpu-approx-transcendentals is not.
+// -ffast-math implies -fgpu-approx-transcendentals, which can be overridden
+// by -fno-gpu-approx-transcendentals.
+
+// RUN: %clang -### --target=x86_64-unknown-linux-gnu -nogpuinc -nogpulib -fgpu-approx-transcendentals \
+// RUN:   --cuda-gpu-arch=gfx906  %s 2>&1 | FileCheck -check-prefixes=APPROX %s
+
+// RUN: %clang -### --target=x86_64-unknown-linux-gnu -nogpuinc -nogpulib -ffast-math \
+// RUN:   --cuda-gpu-arch=gfx906  %s 2>&1 | FileCheck -check-prefixes=APPROX %s
+
+// APPROX: clang{{.*}} "-triple" "amdgcn-amd-amdhsa" {{.*}} "-fgpu-approx-transcendentals"
+// APPROX: clang{{.*}} "-triple" "x86_64-unknown-linux-gnu" {{.*}} "-fgpu-approx-transcendentals"
+
+// RUN: %clang -### -nogpuinc -nogpulib -fno-gpu-approx-transcendentals \
+// RUN:   --cuda-gpu-arch=gfx906  %s 2>&1 | FileCheck -check-prefixes=NOAPPROX %s
+
+// RUN: %clang -### -nogpuinc -nogpulib \
+// RUN:   --cuda-gpu-arch=gfx906  %s 2>&1 | FileCheck -check-prefixes=NOAPPROX %s
+
+// RUN: %clang -### -nogpuinc -nogpulib -ffast-math -fno-fast-math \
+// RUN:   --cuda-gpu-arch=gfx906  %s 2>&1 | FileCheck -check-prefixes=NOAPPROX %s
+
+// RUN: %clang -### -nogpuinc -nogpulib -ffast-math -fno-gpu-approx-transcendentals \
+// RUN:   --cuda-gpu-arch=gfx906  %s 2>&1 | FileCheck -check-prefixes=NOAPPROX %s
+
+// NOAPPROX-NOT: "-f{{(no-)?}}gpu-approx-transcendentals"
+
+// Check no warnings for -fgpu-approx-transcendentals.
+
+// RUN: %clang -### --target=x86_64-unknown-linux-gnu -nogpuinc -nogpulib -fgpu-approx-transcendentals \
+// RUN:   --cuda-gpu-arch=gfx906  %s 2>&1 | FileCheck -check-prefixes=APPROXNEG %s
+
+// RUN: %clang -### --target=x86_64-unknown-linux-gnu -nostdinc -nostdlib -fgpu-approx-transcendentals \
+// RUN:   -x c++ %s 2>&1 | FileCheck -check-prefixes=APPROXNEG %s
+
+// APPROXNEG-NOT: warning
Index: clang/test/Driver/hip-macros.hip
===
--- clang/test/Driver/hip-macros.hip
+++ clang/test/Driver/hip-macros.hip
@@ -63,3 +63,11 @@
 // WARN: warning: feature flag '{{[+|-]}}image-insts' is ignored since the feature is read only [-Winvalid-command-line-argument]
 // IMAGE-NOT: #define __HIP_NO_IMAGE_SUPPORT
 // NOIMAGE: #define __HIP_NO_IMAGE_SUPPORT 1
+
+// RUN: %clang -E -dM --offload-arch=gfx906 -nogpuinc -nogpulib \
+// RUN:   %s 2>&1 | FileCheck --check-prefix=NOAPPROX %s
+// RUN: %clang -E -dM --offload-arch=gfx906 -nogpuinc -nogpulib -fgpu-approx-transcendentals \
+// RUN:   %s 2>&1 | FileCheck --check-prefix=APPROX %s
+// NOAPPR