rnk added inline comments.
================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:6223-6224
if (IsCuda || IsHIP) {
- if (Args.hasFlag(options::OPT_fgpu_rdc, options::OPT_fno_gpu_rdc, false))
+ if (Args.hasFlag(options::OPT_fgpu_rdc, options::OPT_fno_gpu_rdc, false) ||
+ Args.hasArg(options::OPT_foffload_new_driver))
CmdArgs.push_back("-fgpu-rdc");
----------------
tra wrote:
> jhuber6 wrote:
> > tra wrote:
> > > tra wrote:
> > > > jhuber6 wrote:
> > > > > tra wrote:
> > > > > > If user specifies both `-fno-gpu-rdc` and `-foffload-new-driver` we
> > > > > > would still enable RDC compilation.
> > > > > > We may want to at least issue a warning.
> > > > > >
> > > > > > Considering that we have multiple places where we may check for
> > > > > > `-f[no]gpu-rdc` we should make sure we don't get different ideas
> > > > > > whether RDC has been enabled.
> > > > > > I think it may make sense to provide a common way to figure it out.
> > > > > > Either via a helper function that would process CLI arguments or
> > > > > > calculate it once and save it somewhere.
> > > > > I haven't quite finalized how to handle this. The new driver should
> > > > > be compatible with a non-RDC build since we simply wouldn't embed the
> > > > > device image or create offloading entries. It's a little bit more
> > > > > difficult here since the new method is opt-in so it requires a flag.
> > > > > We should definitely emit a warning if both are enabled (I'm assuming
> > > > > there's one for passing both `fgpu-rdc` and `fno-gpu-rdc`). I'll add
> > > > > one in.
> > > > >
> > > > > Also we could consider the new driver *the* RDC in the future which
> > > > > would be the easiest. The problem is if we want to support CUDA's
> > > > > method of RDC considering how other build systems seem to expect it.
> > > > > I could see us embedding the fatbinary in the object file, even if
> > > > > unused, just so that cuobjdump works. However we couldn't support the
> > > > > generation of `__cudaRegisterFatBinary_nv....` functions because then
> > > > > those would cause linker errors. WDYT?
> > > > > I'm assuming there's one for passing both fgpu-rdc and fno-gpu-rdc
> > > >
> > > > This is not a valid assumption. The whole idea behind `-fno-something`
> > > > is that the options can be overridden. E.g. if the build specifies a
> > > > standard set of compiler options, but we need to override some of them
> > > > when building a particular file. We can only do so by appending to the
> > > > standard options. Potentially we may end up having those options
> > > > overridden again. While it's not very common, it's certainly possible.
> > > > It's also possible to start with '-fno-gpu-rdc' and then override it
> > > > with `-fgpu-rdc`.
> > > >
> > > > In this case, we care about the final state of RDC specified by
> > > > -f*gpu-rdc options, not by the fact that `-fno-gpu-rdc` is present.
> > > > `Args.hasFlag(options::OPT_fgpu_rdc, options::OPT_fno_gpu_rdc, false)`
> > > > does exactly that -- gives you the final state. If it returns false,
> > > > but we have `-foffload-new-driver`, then we need a warning as these
> > > > options are contradictory.
> > > >
> > > > The new driver should be compatible with a non-RDC build
> > >
> > > In that case, we don't need a warning, but we do need a test verifying
> > > this behavior.
> > >
> > > > Also we could consider the new driver *the* RDC in the future which
> > > > would be the easiest.
> > >
> > > SGTM. I do not know how it all will work out in the end. Your proposed
> > > model makes a lot of sense, and I'm guardedly optimistic about it.
> > > Eventually we would deprecate RDC options, but we still need to work
> > > sensibly when user specifies a mix of these options.
> > >
> > >
> > > In that case, we don't need a warning, but we do need a test verifying
> > > this behavior.
> > >
> > It's possible but I don't have the logic here to do it, figured we can
> > cross that bridge later.
> >
> > > SGTM. I do not know how it all will work out in the end. Your proposed
> > > model makes a lot of sense, and I'm guardedly optimistic about it.
> > >
> > So the only downsides I know of, is that we don't currently replicate
> > CUDA's magic to JIT RDC code (We can do this with LTO anyway), and that
> > registering offload entries relies on the linker defining `__start /
> > __stop` variables, which I don't know if linkers on Windows / MacOS
> > provide. I'd be really interested if someone on the LLD team knew the
> > answer to that.
> > relies on the linker defining __start / __stop variables, which I don't
> > know if linkers on Windows / MacOS provide. I'd be really interested if
> > someone on the LLD team knew the answer to that.
>
> @MaskRay, @rnk - would you happen to know the answer?
I believe MachO has an equivalent mechanism, but I'm not familiar with it. For
PE/COFF, you can search the ASan code to see how the __start / __stop symbols
are defined on Windows using various pragmas and `__declspec(allocate)` to set
up sections and sort them accordingly.
I would love to have a doc that writes up how to implement this array
registration mechanism portably for all major platforms, given that we believe
it is possible everywhere.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D120272/new/
https://reviews.llvm.org/D120272
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits