pfultz2 added a comment.

> I.e. if I pass a mutable lambda by reference to the GPU kernel

I don't think we are enabling passing host objects by reference through 
functions, although it could be possible to capture the mutable lambda by 
reference in another lambda.

> will the same lambda called on host do the same thing when it's called on the 
> device?

Yes, just the same as capturing a host variable by reference and using it on 
the device.

> In principle it would work if GPU and host operate on a uniform memory

Unified memory is not necessary. What is needed is coordination between the 
compiler and the runtime.

We don't support capturing host variables by reference, so maybe we can 
restrict the implicit HD to lambdas that don't capture by reference?

> According to cppreference, it's only true since C++17 and, AFAICT, only for 
> capture-less lambdas.

You can capture as well, if it's in a `constexpr` context.
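
As a minimal host-only sketch of a capturing lambda used in a constant 
expression (assuming C++17; `add_captured` and `base` are hypothetical names 
used only for illustration):

```cpp
// Hypothetical helper, for illustration only (requires C++17).
constexpr int add_captured(int x)
{
    int base = 10;
    // The lambda captures `base` by copy. Its operator() is implicitly
    // constexpr because the body satisfies the constexpr requirements.
    auto add = [base](int y) { return base + y; };
    return add(x);
}

// The capturing lambda is evaluated at compile time here.
static_assert(add_captured(5) == 15, "capturing lambda usable in constexpr");
```

Capturing by copy is used here on purpose; whether by-reference captures 
should be allowed is a separate question in this review.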

> Considering they are not always constexpr, this assertion is not true, either.

Yes, we seem to delay this. It is always HD but not always emitted for both 
host and device.

The issue would be if users tried to detect HD using SFINAE. It could be a 
false claim, but maybe it doesn't matter. More importantly, if the lambda is 
called in an unevaluated context, will the compiler still emit the function, 
or will it produce a hard error instead of a substitution failure? I assume 
something like this would compile:

  #include <type_traits>
  
  template<class F>
  __host__ auto is_host(F f) -> decltype(f(), std::true_type{});
  std::false_type is_host(...);
  
  template<class F>
  __device__ auto is_device(F f) -> decltype(f(), std::true_type{});
  std::false_type is_device(...);
  
  __host__ void f();
  
  void g()
  {
      auto l = []{ f(); };
      using on_host = decltype(is_host(l));
      static_assert(on_host{}, "Lambda not on host");
      using on_device = decltype(is_device(l));
      static_assert(on_device{}, "Lambda not on device");
  }
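
For reference, the detection idiom used above can be sketched in plain host 
C++ without any `__host__`/`__device__` attributes. This only shows how the 
`decltype` overload and the `...` fallback interact during substitution. It 
does not model how the compiler treats host/device attributes, and 
`is_callable`, `callable_v` and `not_callable` are hypothetical names:

```cpp
#include <type_traits>
#include <utility>

// Preferred overload: viable only if the expression f() is well-formed.
template<class F>
auto is_callable(F f) -> decltype(f(), std::true_type{});
// Fallback: selected when substitution into the overload above fails.
std::false_type is_callable(...);

// Hypothetical type without a call operator.
struct not_callable {};

// Only used in an unevaluated context, so nothing is emitted for the call.
template<class T>
constexpr bool callable_v =
    decltype(is_callable(std::declval<T>()))::value;

static_assert(callable_v<void (*)()>, "function pointers are callable");
static_assert(!callable_v<not_callable>, "falls back to the ... overload");
```

In this sketch a failed substitution silently selects the fallback. The open 
question for HD lambdas is whether the same happens, or whether the check in 
the unevaluated call becomes a hard error.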



> If/when operator() does get constexpr treatment by compiler, we should 
> already derive HD attributes from constexpr. If we do not, then that's what 
> needs to be fixed.

How does the compiler implement this? Does it add the `constexpr` attribute 
onto the operator(), or does constexpr evaluation visit the lambda as if it 
were `constexpr`? It seems the latter would be more efficient, and it would be 
similar to what we are doing with HD. The only difference is that a function 
can be overloaded with `__host__` and `__device__`, whereas that is not 
possible with `constexpr`. So a difference could be detected by the user, but 
maybe that doesn't matter.

> That at least would make sense from consistency standpoint as we currently do 
> treat all other constexpr functions as HD.

I mean consistent across the different attributes, not in the interpretation 
of constexpr. A lambda that only calls constexpr functions implicitly has the 
`constexpr` attribute. So a lambda that only calls device (or HD) functions 
should implicitly have the `__device__` attribute.
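
A small host-only sketch of the `constexpr` half of that analogy (assuming 
C++17; `square`, `runtime_only`, `sq` and `rt` are hypothetical names). The 
`__device__` half is the proposal under discussion and is not shown:

```cpp
constexpr int square(int x) { return x * x; }
int runtime_only(int x) { return x + 1; }  // deliberately not constexpr

// No constexpr keyword on the lambda's call operator: it is implicitly
// constexpr because its body only does constexpr-compatible work.
constexpr auto sq = [](int x) { return square(x); };

// Still a valid lambda, but calling it is not a constant expression
// because it calls a non-constexpr function.
const auto rt = [](int x) { return runtime_only(x); };

static_assert(sq(3) == 9, "implicitly constexpr call operator");
```

Note that `rt` is accepted without a diagnostic; it simply cannot be used 
where a constant expression is required.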


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78655/new/

https://reviews.llvm.org/D78655


