Meinersbur wrote:

NB: #201103 adds support for DECLARE_TARGET by setting a device-type flag in 
the AST that can be specialized later in MLIR, i.e. still the same .mod for 
device and target. Handling `!$omp declare variant` requires more effort; 
from the OpenMP examples document:

```f90
subroutine base_saxpy(s,x,y) !! base function
  real,intent(inout) :: s,x(:),y(:)
  !$omp declare variant( avx512_saxpy ) &
  !$omp& match( device={isa("core-avx512")} )

  y = s*x + y
end subroutine

subroutine avx512_saxpy(s,x,y) !! function variant
  ...
```

Keeping a single .mod file for all targets means that `avx512_saxpy` must be 
kept in case it is used for a target that supports avx512. In contrast, Clang 
just skips anything that does not match the current compilation target, either 
in the preprocessor or while creating the AST. A rationale is that 
`avx512_saxpy` may contain inline asm or vector builtins that the current 
compilation does not know about and would have to reject if it parsed them. 
GCC, on the other hand, parses everything and diverges between host and device 
at a later stage, like Flang does. Thanks to the insistence of the GCC 
implementors, OpenMP does not have predefined preprocessor symbols that differ 
when compiling for different 
targets (like 
[`__HIP_DEVICE_COMPILE__`](https://clang.llvm.org/docs/HIPSupport.html#predefined-macros),
[`__CUDA_ARCH__`](https://docs.nvidia.com/cuda/cuda-programming-guide/05-appendices/cpp-language-extensions.html#codecell0),
[`__SYCL_DEVICE_ONLY__`](https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#_preprocessor_directives_and_macros))
which require the split to happen at the preprocessor stage.

I don't know whether CUDA-Fortran has a `__CUDA_ARCH__` preprocessor definition 
or similar which could be used to compile very different sources for host and 
devices.

https://github.com/llvm/llvm-project/pull/200863
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
