from:"Scott Linder via Phabricator via cfe\-commits"

[PATCH] D153600: Implement -frecord-command-line for XCOFF

2023-07-06 Thread Scott Linder via Phabricator via cfe-commits

scott.linder accepted this revision.
scott.linder added a comment.
This revision is now accepted and ready to land.

Thank you for the patch and the changes! This LGTM, and AFAICT all of the 
comments have been addressed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153600/new/

https://reviews.llvm.org/D153600

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153600: Implement -frecord-command-line for XCOFF

2023-07-05 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: llvm/lib/MC/MCAsmStreamer.cpp:993
+  // Metadata needs to be padded out to an even word size.
+  uint32_t PaddedSize = alignTo(std::max((int)MetadataSize, 1), 4);
+  uint32_t PaddingSize = PaddedSize - MetadataSize;

With the `MetadataSize == 0` case handled the `std::max` should be redundant 
now, right?

Also the literal `4` can be `WordSize`



Comment at: llvm/lib/MC/MCAsmStreamer.cpp:1018
+  for (; Index + WordSize <= MetadataSize; Index += WordSize)
+PrintWord((const uint8_t *)Metadata.data() + Index);
+

Nit: for pointer casts I would prefer an explicit `reinterpret_cast`



Comment at: llvm/lib/MC/MCAsmStreamer.cpp:981
+
+  // Metadata needs to be padded out to an even word size.
+  size_t MetadataSize = Metadata.size();

stephenpeckham wrote:
> There's no requirement to pad the .info section. When you generate assembly 
> language, the .info pseudo-op can only generate words of data, so the if the 
> data length is not 0(mod 4), the last word will have to be padded with 
> low-order 0s.  This means that the length of the .info section will be a 
> multiple of 4. The length, on the other hand, should be exact. At link time, 
> only "length" bytes will be copied to the output file.
> 
> If you emit object code directly, you will not need to emit any padding bytes.
Does the comment still need updating then? It could capture the fact that the 
"lowest common denominator" is the assembly syntax as it works in terms of 
words, and so may requiring padding in the final word. We can be clear that we 
apply the same restriction to the object case judiciously, as it make the 
output identical and avoids more code paths. We could also note that the linker 
can use the length to optimize the final linked binary.

These all clear things up to a fresh reader, so I think they are worthwhile 
things to include in comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153600/new/

https://reviews.llvm.org/D153600

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153600: Implement -frecord-command-line for XCOFF

2023-06-28 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

Sorry for the delay in review, I am not too familiar with XCOFF so I was hoping 
someone else would take a look first.

If my understanding of the COFF docs I could find is correct then this LGTM 
save for some nits/suggestions

Comment at: llvm/lib/MC/MCAsmStreamer.cpp:970-1037
+void MCAsmStreamer::emitXCOFFCInfoSym(StringRef Name, StringRef Metadata) {
+  const char *InfoDirective = "\t.info ";
+
+  // Start by emitting the .info pseudo-op and C_INFO symbol name
+  OS << InfoDirective;
+  PrintQuotedString(Name, OS);
+  OS << ", ";

I sketched out some of the suggestions I had, although the bigger changes are 
just because the extra `.info` for the padded byte bugged me. If you aren't 
concerned with it I'm also happy with what you have

Comment at: llvm/lib/MC/MCAsmStreamer.cpp:980
+  size_t MetadataSize = Metadata.size();
+  uint32_t MetadataPaddingSize = 3 - (MetadataSize - 1) % 4;
+

I couldn't quickly find a reference for the alignment requirement, but it seems 
like there is an additional requirement that the length must also be non-zero, 
not just even?

If so, can you update the comment?

I would also rather explicitly use `alignTo` and `max` to express this (see 
suggestion), but if we don't expect the optimizer to clean it up I'm fine with 
the more terse version.

Comment at: llvm/lib/MC/MCAsmStreamer.cpp:980
+  size_t MetadataSize = Metadata.size();
+  uint32_t MetadataPaddingSize = 3 - (MetadataSize - 1) % 4;
+

scott.linder wrote:
> I couldn't quickly find a reference for the alignment requirement, but it 
> seems like there is an additional requirement that the length must also be 
> non-zero, not just even?
> 
> If so, can you update the comment?
> 
> I would also rather explicitly use `alignTo` and `max` to express this (see 
> suggestion), but if we don't expect the optimizer to clean it up I'm fine 
> with the more terse version.
Can you factor this out at function scope? It gets repeated below

Comment at: llvm/lib/MC/MCAsmStreamer.cpp:984
+  uint32_t Length = MetadataSize + MetadataPaddingSize;
+  OS << format_hex(uint32_t(Length), 10) << ",";
+  EmitEOL();

Redundant cast

Comment at: llvm/lib/MC/MCAsmStreamer.cpp:988
+  // Return the remaining bytes padded with 0s.
+  auto GetLastWord = [](const uint8_t *Data,
+uint32_t PaddingBytes) -> uint32_t {

It seems odd to use a lambda when this is only used once. Why not just compute 
the last word directly?

Comment at: llvm/lib/MC/MCAsmStreamer.cpp:1033
+MetadataPaddingSize);
+OS << InfoDirective << ", ";
+OS << format_hex(LastWord, 10);

Can you factor this out as `Separator` at function scope?

Comment at: llvm/test/CodeGen/PowerPC/aix-command-line-metadata.ll:21
+; Trailing padding:
+; ASM: .info , 0x3233
+

Having the padded byte force a new `.info` directive doesn't seem ideal. I 
suggested something to avoid it, but I suppose it also doesn't really harm 
anything.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153600/new/

https://reviews.llvm.org/D153600

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D146023: [AMDGPU] Remove Code Object V2

2023-06-06 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp:520
 
-  assert(Func.getCallingConv() == CallingConv::AMDGPU_KERNEL ||
- Func.getCallingConv() == CallingConv::SPIR_KERNEL);
+  if (Func.getCallingConv() != CallingConv::AMDGPU_KERNEL &&
+  Func.getCallingConv() != CallingConv::SPIR_KERNEL)

Pierre-vh wrote:
> scott.linder wrote:
> > I don't follow this change; was the assert just incorrect previously?
> It's for `CodeGen/AMDGPU/no-hsa-graphics-shaders.ll`, it crashes otherwise.
> 
> An alternative can be to change this:
> ```
>   if (STM.isAmdHsaOS())
> HSAMetadataStream->emitKernel(*MF, CurrentProgramInfo);
> ```
> So it checks the CC and doesn't call the function if the CC is incorrect. I 
> don't mind either solution
OK, makes sense. I don't have a particularly strong preference, but whatever 
the solution is I would prefer it be split into another patch. It isn't 
actually a result of dropping v2 support, as the bug is present in HEAD as-is.



Comment at: llvm/test/MC/AMDGPU/hsa-gfx10.s:3
-// RUN: llvm-mc -filetype=obj -triple amdgcn--amdhsa -mcpu=gfx1010 
--amdhsa-code-object-version=2 -mattr=-wavefrontsize32,+wavefrontsize64 
-show-encoding %s | llvm-readobj -S --sd --syms - | FileCheck %s 
--check-prefix=ELF
-
-// ELF: Section {

Pierre-vh wrote:
> scott.linder wrote:
> > Pierre-vh wrote:
> > > arsenm wrote:
> > > > I thought we were still going to be able to read old objects 
> > > I think llvm-readobj uses all of the MC/Target infrastructure so if we 
> > > remove emission, we also remove reading, no?
> > > 
> > > I'm actually not sure if we plan to let readobj/readelf read COV2 object 
> > > files, it's an interesting question
> > I think this is my biggest concern. Do we incur a huge maintenance burden 
> > that warrants dropping read support? How much code do we really need to 
> > maintain to keep the readobj/objdump like tools universal?
> > 
> > @t-tye do you have any thoughts on whether we should maintain backwards 
> > compatibility in the LLVM tooling, even if we drop generation support?
> It's been a while since I wrote this but IIRC there was a discussion about it 
> and it was fine to remove read support. An alternative may be to still 
> identify code object V2, but not read the metadata and instead print a 
> warning about the file format being deprecated?
> 
> Or I think it's YAML, maybe we can just raw dump the MD and print a warning?
> 
> Most of the maintenance cost would be in the MD mapper which is almost 500 
> lines of code that'd just be there for the sake of "maybe some needs to read 
> MD"
> 
> If we go with one of the above suggestions I can just add a test using 
> yml2obj that emits a V2 file
Ah, alright; apologies if I was part of the discussions and am still 
re-litigating now :)

I don't have particularly strong feelings, especially if there is some 
consensus that dropping is OK. Or if it isn't too onerous the "best effort" 
support of dumping things as-is for the old versions sounds good to me!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146023/new/

https://reviews.llvm.org/D146023

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D146023: [AMDGPU] Remove Code Object V2

2023-06-01 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp:520
 
-  assert(Func.getCallingConv() == CallingConv::AMDGPU_KERNEL ||
- Func.getCallingConv() == CallingConv::SPIR_KERNEL);
+  if (Func.getCallingConv() != CallingConv::AMDGPU_KERNEL &&
+  Func.getCallingConv() != CallingConv::SPIR_KERNEL)

I don't follow this change; was the assert just incorrect previously?



Comment at: llvm/test/MC/AMDGPU/hsa-gfx10.s:3
-// RUN: llvm-mc -filetype=obj -triple amdgcn--amdhsa -mcpu=gfx1010 
--amdhsa-code-object-version=2 -mattr=-wavefrontsize32,+wavefrontsize64 
-show-encoding %s | llvm-readobj -S --sd --syms - | FileCheck %s 
--check-prefix=ELF
-
-// ELF: Section {

Pierre-vh wrote:
> arsenm wrote:
> > I thought we were still going to be able to read old objects 
> I think llvm-readobj uses all of the MC/Target infrastructure so if we remove 
> emission, we also remove reading, no?
> 
> I'm actually not sure if we plan to let readobj/readelf read COV2 object 
> files, it's an interesting question
I think this is my biggest concern. Do we incur a huge maintenance burden that 
warrants dropping read support? How much code do we really need to maintain to 
keep the readobj/objdump like tools universal?

@t-tye do you have any thoughts on whether we should maintain backwards 
compatibility in the LLVM tooling, even if we drop generation support?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146023/new/

https://reviews.llvm.org/D146023

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D150282: [Driver] -ftime-trace: derive trace file names from -o and -dumpdir

2023-05-11 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: clang/include/clang/Driver/Compilation.h:279
+  void addTimeTraceFile(const char *Name, const JobAction *JA) {
+TimeTraceFiles[JA] = Name;
+  }

MaskRay wrote:
> scott.linder wrote:
> > If this is overriding previous paths should it be called `setTimeTraceFile`?
> The naming is to match `add*File` above.
> Do you want an assert that the entry hasn't been inserted before?
If you think it is useful; otherwise consistency with the other options seems 
like a good enough reason



Comment at: clang/lib/Driver/Driver.cpp:5253
+Path = DumpDir->getValue();
+Path += llvm::sys::path::filename(BaseInput);
+  } else {

MaskRay wrote:
> scott.linder wrote:
> > Why `+=` instead of `append`? Do we just know the value of `dumpdir` is 
> > terminated with the path separator?
> `-dumpdir ` is a bit misnomer that it may or may not end with a path 
> separator.
> 
> `clang -c -ftime-trace d/a.c -o e/xa.o -dumpdir f/` is intended to create 
> `fa.json`
> 
> I updated a test and the description to give an example.
Thanks, I just forgot this point since the previous discussions! LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150282/new/

https://reviews.llvm.org/D150282

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D149986: AMDGPU: Force sc0 and sc1 on stores for gfx940 and gfx941

2023-05-11 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:517-529
+  bool tryForceStoreSC0SC1(const SIMemOpInfo &MOI,
+   MachineBasicBlock::iterator &MI) const override {
+if (ST.hasForceStoreSC0SC1() &&
+(MOI.getInstrAddrSpace() & (SIAtomicAddrSpace::SCRATCH |
+SIAtomicAddrSpace::GLOBAL |
+SIAtomicAddrSpace::OTHER)) !=
+ SIAtomicAddrSpace::NONE) {

It may be a bit more verbose, but I would suggest just having a `bool Changed` 
variable, like other (admittedly more complicated) functions in the unit do. I 
don't think any future maintainer will mis-interpret this, even without the 
comment


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149986/new/

https://reviews.llvm.org/D149986

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D150282: [Driver] -ftime-trace: derive trace file names from -o and -dumpdir

2023-05-11 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

Does GCC have the same `-ftime-trace=` option? It seems like it doesn't, as 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92396 is still open?

If so, I am happy with unifying more of the dump/aux handling, and I imagine 
when/if GCC adds the option it will behave similarly.




Comment at: clang/include/clang/Driver/Compilation.h:279
+  void addTimeTraceFile(const char *Name, const JobAction *JA) {
+TimeTraceFiles[JA] = Name;
+  }

If this is overriding previous paths should it be called `setTimeTraceFile`?



Comment at: clang/lib/Driver/Driver.cpp:5253
+Path = DumpDir->getValue();
+Path += llvm::sys::path::filename(BaseInput);
+  } else {

Why `+=` instead of `append`? Do we just know the value of `dumpdir` is 
terminated with the path separator?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150282/new/

https://reviews.llvm.org/D150282

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D149193: [Driver] Add -dumpdir and change -gsplit-dwarf .dwo names for linking

2023-05-09 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

In D149193#4316293 , @dblaikie wrote:

> I guess my main question is: What's the motivation for implementing this? Do 
> you have a need/use for this? (it doesn't seem to be motivated by GCC 
> compatibility - as discussed, looks like we're diverging in a bunch of ways 
> from their behavior and the argument made that these are "developer" flags, 
> so not a stable/compatible interface used across both compilers)

Even if we only align the default behavior of Clang with the default behavior 
of GCC it seems like a win. AFAIU we could implement this without `-dumpdir` at 
all, but the baseline notion of being able to specify where to put 
auxiliary/dump files seems useful.

Do we need to match every quirk and add every other knob from GCC in order to 
use the same name? Is there precedent in terms of other options where Clang is 
close but opinionated about behavior relative to GCC?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149193/new/

https://reviews.llvm.org/D149193

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D149193: [Driver] Add -dumpdir and change -gsplit-dwarf .dwo names for linking

2023-04-27 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

I am OK to give LGTM, assuming the other reviewers don't still have 
reservations?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149193/new/

https://reviews.llvm.org/D149193

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D149193: [Driver] Add -dumpdir and change -gsplit-dwarf .dwo names for linking

2023-04-27 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

In D149193#4300885 , @MaskRay wrote:

> I think the patch as-is implements all the useful parts of GCC's complex 
> rules and in the absence of `-dumpbase` (we deliberately don't implement), 
> the rule almost exactly matches GCC unless we do `gcc -g -gsplit-dwarf d/a.c 
> -o e/a.out` (instead of other filenames).
>
> https://maskray.me/blog/2023-04-25-compiler-output-files records all my 
> findings.

I agree that the special-casing of `a.out` when explicitly specified is odd.

Another option to avoid any confusion or debates over exact behavior relative 
to GCC would be to just make our own option. Rather than add `-dumpdir` we 
could add `-dump-prefix` or something. I'm also OK with just implementing a 
simpler subset of the GCC options though (i.e. what you have already).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149193/new/

https://reviews.llvm.org/D149193

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D149193: [Driver] Add -dumpdir and change -gsplit-dwarf .dwo names for linking

2023-04-26 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: clang/lib/Driver/Driver.cpp:3884
+  nullptr, getOpts().getOption(options::OPT_dumpdir),
+  Args.MakeArgString(Args.getLastArgValue(options::OPT_o, "a") + "-"));
+  Arg->claim();

MaskRay wrote:
> dblaikie wrote:
> > would be nice to have this "a" derive from wherever we hardcode "a.out" as 
> > the default output rather than independently hardcoded here?
> > 
> > & what does GCC do when the `-o` value has a `.` in it? (if you use `-o 
> > a.out` do you get the same `a-x.dwo` behavior, or do you get `a.out-x.dwo`?)
> We can use `llvm::sys::path::stem(getDefaultImageName())`, but I feel that 
> this just complicates the code.
> The default is `a.out` or `a.exe`. If a downstream platform decides to 
> deviate and use another filename, say, `b.out`. We will use `-dumpdir b-` on 
> this platform and `-dumpdir a-` on everything else. I think they will likely 
> be fine with `a-` even if they don't use `a` as the stem name of the default 
> image...
> 
> GCC generally doesn't special case `.` in `-o` for linking, but the `a.out` 
> filename is different.
> 
> ```
> gcc -g -gsplit-dwarf d/a.c.c -o e/x.out  # e/x.out-a.dwo
> gcc -g -gsplit-dwarf d/a.c.c -o e/a.out  # e/a-a.dwo
> ```
> 
> I think Clang should not special case `a.out`.
GCC distinguishes between "dump" and "auxiliary" outputs, and in this case I 
think the "dump" outputs retain the basename-suffix (i.e. you get 
a.out) whereas "auxiliary" outputs first strip the basename-suffix 
(i.e. you get a). The basename-suffix itself can be specified 
explicitly via -dumpbase-ext, but it is inferred by default.

The naming for things adds to the for me:

* `-dumpdir` doesn't specifically/exclusively specify a "directory", it just 
specifies a prefix
* `-dumpbase-ext` only affects the output of non-dump, auxiliary files

I do worry that being close-but-not-quite like GCC here will cause headaches 
for someone, but I am also not excited about implementing the complexity of the 
GCC options.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149193/new/

https://reviews.llvm.org/D149193

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D149193: [Driver] -gsplit-dwarf: derive .dwo names from -o for link actions

2023-04-25 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

I'm a bit confused after trying to work out the rules for the GCC version of 
`-dumpdir` over at 
https://gcc.gnu.org/onlinedocs/gcc/Overall-Options.html#index-dumpdir but it at 
least seems like our version is a subset of theirs.

Do we support any of the other `-dump*` options GCC does? E.g. 
https://gcc.gnu.org/onlinedocs/gcc/Overall-Options.html#index-dumpbase and 
https://gcc.gnu.org/onlinedocs/gcc/Overall-Options.html#index-dumpbase-ext ? I 
don't think we need to in order to add and use `-dumpdir`, but they do have a 
somewhat unique inter-dependence in that the exact value of one can cause 
another to be completely ignored.

Also, would it be worth adding tests for the following cases:

> It defaults to the location of the output file, unless the output file is a 
> special file like /dev/null. Options -save-temps=cwd and -save-temps=obj 
> override this default, just like an explicit -dumpdir option. In case 
> multiple such options are given, the last one prevails:




Comment at: clang/lib/Driver/ToolChains/CommonArgs.cpp:1250-1267
 
+  SmallString<128> T;
+  if (const Arg *A = Args.getLastArg(options::OPT_dumpdir))
+T = A->getValue();
+
   Arg *FinalOutput = Args.getLastArg(options::OPT_o);
   if (FinalOutput && Args.hasArg(options::OPT_c)) {

There is an extra check for `dumpdir` and a string copy in the `OPT_c` case, 
maybe it can just move past that case?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149193/new/

https://reviews.llvm.org/D149193

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D148975: -fdebug-prefix-map=: make the last win when multiple prefixes match

2023-04-25 Thread Scott Linder via Phabricator via cfe-commits

scott.linder accepted this revision.
scott.linder added a comment.
This revision is now accepted and ready to land.

LGTM, thank you!

Does this warrant a release note, as it is changing the behavior in a 
backwards-incompatible manner? I do think changing to match GCC is worthwhile, 
even if it means a change in behavior for Clang.

I also don't think the "longest first" heuristic is useful enough to outweigh 
the benefits of behaving like GCC; it seems like the user can always sort their 
own options to get the same effect.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148975/new/

https://reviews.llvm.org/D148975

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D148975: -fdebug-prefix-map=: make the last win when multiple prefixes match

2023-04-25 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: clang/include/clang/Basic/CodeGenOptions.h:209
 
-  std::map DebugPrefixMap;
+  llvm::SmallVector, 0> DebugPrefixMap;
   std::map CoveragePrefixMap;

MaskRay wrote:
> scott.linder wrote:
> > What benefit does forcing allocation have? Why not use the default, or 
> > switch to `std::vector`?
> This doesn't force allocation. I use inline element size 0 to make the member 
> variable smaller than a `std::vector`.
> `std::vector` likely compiles to larger code.
Sorry, I just didn't realize this was idiomatic (i.e. I hadn't read 
https://llvm.org/docs/ProgrammersManual.html#vector), and I didn't see other 
similar uses in the file.

There are 15 `std::vector` members of `CodeGenOptions`, but no `SmallVector<_, 
0>` members; maybe the rest can be converted in an NFC patch, so things are 
consistent?



Comment at: clang/lib/CodeGen/CGDebugInfo.cpp:72-79
 CGDebugInfo::CGDebugInfo(CodeGenModule &CGM)
 : CGM(CGM), DebugKind(CGM.getCodeGenOpts().getDebugInfo()),
   DebugTypeExtRefs(CGM.getCodeGenOpts().DebugTypeExtRefs),
   DBuilder(CGM.getModule()) {
-  for (const auto &KV : CGM.getCodeGenOpts().DebugPrefixMap)
-DebugPrefixMap[KV.first] = KV.second;
+  for (const auto &[From, To] : CGM.getCodeGenOpts().DebugPrefixMap)
+DebugPrefixMap.emplace_back(From, To);
   CreateCompileUnit();

MaskRay wrote:
> scott.linder wrote:
> > Can you use the member-initializer-list here?
> No, this doesn't work. We convert a vector of `std::string` to a vector of 
> `StringRef`.
If all we are doing is creating another vector which shares the underlying 
strings with the original, why not just save a reference to the original 
vector? Is the cost of the extra dereference when accessing it greater than the 
cost of traversing it plus the extra storage for the `StringRef`s?

It seems like the original utility was just to get the `std::map` sorting 
behavior, which we no longer need.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148975/new/

https://reviews.llvm.org/D148975

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D148975: -fdebug-prefix-map=: make the last win when multiple prefixes match

2023-04-24 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: clang/include/clang/Basic/CodeGenOptions.h:209
 
-  std::map DebugPrefixMap;
+  llvm::SmallVector, 0> DebugPrefixMap;
   std::map CoveragePrefixMap;

What benefit does forcing allocation have? Why not use the default, or switch 
to `std::vector`?



Comment at: clang/lib/CodeGen/CGDebugInfo.cpp:72-79
 CGDebugInfo::CGDebugInfo(CodeGenModule &CGM)
 : CGM(CGM), DebugKind(CGM.getCodeGenOpts().getDebugInfo()),
   DebugTypeExtRefs(CGM.getCodeGenOpts().DebugTypeExtRefs),
   DBuilder(CGM.getModule()) {
-  for (const auto &KV : CGM.getCodeGenOpts().DebugPrefixMap)
-DebugPrefixMap[KV.first] = KV.second;
+  for (const auto &[From, To] : CGM.getCodeGenOpts().DebugPrefixMap)
+DebugPrefixMap.emplace_back(From, To);
   CreateCompileUnit();

Can you use the member-initializer-list here?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148975/new/

https://reviews.llvm.org/D148975

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D147279: [HeterogeneousDWARF] Implement AMDGPU CFI, DebugInfo

2023-03-30 Thread Scott Linder via Phabricator via cfe-commits

scott.linder created this revision.
Herald added subscribers: kosarev, foad, kerbowa, arphaman, hiraditya, tpr, 
dstuttard, yaxunl, jvesely, kzhuravl, emaste, arsenm, qcolombet, MatzeB.
Herald added a project: All.
scott.linder requested review of this revision.
Herald added subscribers: llvm-commits, cfe-commits, MaskRay, wdng.
Herald added projects: clang, LLVM.

Add -amdgpu-spill-cfi-saved-regs to enable complete debug info even when
codegen would otherwise not spill the EXEC mask. This is a workaround
until a solution to describe the EXEC mask without spills is integrated
based on the new structurizer.

Support emitting eh_frame for debug-info even when the target does not
support exceptions.

Extend MCAsmInfo to reflect whether a target opts in to the
heterogeneous DWARF extensions.

Note: The encodings in BinaryFormat/Dwarf.def are subject to change.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D147279

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Driver/ToolChains/HIPAMD.cpp
  llvm/include/llvm/MC/MCAsmInfo.h
  llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
  llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
  llvm/lib/MC/MCDwarf.cpp
  llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCAsmInfo.cpp
  llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
  llvm/lib/Target/AMDGPU/SIFrameLowering.h
  llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
  llvm/lib/Target/AMDGPU/SIInstrInfo.h
  llvm/lib/Target/AMDGPU/SIInstructions.td
  llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
  llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
  llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h
  llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
  llvm/lib/Target/AMDGPU/SIRegisterInfo.h
  llvm/test/CodeGen/AMDGPU/GlobalISel/assert-align.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/call-outgoing-stack-args.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/dynamic-alloca-uniform.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement-stack-lower.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/localizer.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/non-entry-alloca.ll
  llvm/test/CodeGen/AMDGPU/abi-attribute-hints-undefined-behavior.ll
  llvm/test/CodeGen/AMDGPU/accvgpr-spill-scc-clobber.mir
  llvm/test/CodeGen/AMDGPU/agpr-copy-no-free-registers.ll
  llvm/test/CodeGen/AMDGPU/agpr-copy-reuse-writes.mir
  llvm/test/CodeGen/AMDGPU/amdgpu-spill-cfi-saved-regs.ll
  llvm/test/CodeGen/AMDGPU/av_spill_cross_bb_usage.mir
  llvm/test/CodeGen/AMDGPU/bf16.ll
  llvm/test/CodeGen/AMDGPU/branch-relax-spill.ll
  llvm/test/CodeGen/AMDGPU/call-graph-register-usage.ll
  llvm/test/CodeGen/AMDGPU/call-preserved-registers.ll
  llvm/test/CodeGen/AMDGPU/callee-frame-setup.ll
  llvm/test/CodeGen/AMDGPU/cross-block-use-is-not-abi-copy.ll
  llvm/test/CodeGen/AMDGPU/csr-sgpr-spill-live-ins.mir
  llvm/test/CodeGen/AMDGPU/debug-frame.ll
  llvm/test/CodeGen/AMDGPU/dwarf-multi-register-use-crash.ll
  llvm/test/CodeGen/AMDGPU/entry-function-cfi.mir
  llvm/test/CodeGen/AMDGPU/fix-frame-reg-in-custom-csr-spills.ll
  llvm/test/CodeGen/AMDGPU/fold-reload-into-exec.mir
  llvm/test/CodeGen/AMDGPU/fold-reload-into-m0.mir
  llvm/test/CodeGen/AMDGPU/frame-index-elimination-tied-operand.mir
  llvm/test/CodeGen/AMDGPU/frame-index.mir
  llvm/test/CodeGen/AMDGPU/frame-setup-without-sgpr-to-vgpr-spills.ll
  llvm/test/CodeGen/AMDGPU/gfx-call-non-gfx-func.ll
  llvm/test/CodeGen/AMDGPU/gfx-callable-argument-types.ll
  llvm/test/CodeGen/AMDGPU/gfx-callable-preserved-registers.ll
  llvm/test/CodeGen/AMDGPU/gfx-callable-return-types.ll
  llvm/test/CodeGen/AMDGPU/indirect-call.ll
  llvm/test/CodeGen/AMDGPU/insert-delay-alu-bug.ll
  llvm/test/CodeGen/AMDGPU/kernel-mubuf-with-voffset.mir
  llvm/test/CodeGen/AMDGPU/llvm.dbg.value.ll
  llvm/test/CodeGen/AMDGPU/local-stack-alloc-block-sp-reference.ll
  llvm/test/CodeGen/AMDGPU/mul24-pass-ordering.ll
  llvm/test/CodeGen/AMDGPU/need-fp-from-vgpr-spills.ll
  llvm/test/CodeGen/AMDGPU/nested-calls.ll
  llvm/test/CodeGen/AMDGPU/no-source-locations-in-prologue.ll
  llvm/test/CodeGen/AMDGPU/non-entry-alloca.ll
  llvm/test/CodeGen/AMDGPU/partial-regcopy-and-spill-missed-at-regalloc.ll
  llvm/test/CodeGen/AMDGPU/pei-build-av-spill.mir
  llvm/test/CodeGen/AMDGPU/pei-build-spill-partial-agpr.mir
  llvm/test/CodeGen/AMDGPU/pei-build-spill.mir
  llvm/test/CodeGen/AMDGPU/pei-cfi-saves-bug.ll
  llvm/test/CodeGen/AMDGPU/pei-reg-scavenger-position.mir
  llvm/test/CodeGen/AMDGPU/pei-scavenge-sgpr-carry-out.mir
  llvm/test/CodeGen/AMDGPU/pei-scavenge-sgpr-gfx9.mir
  llvm/test/CodeGen/AMDGPU/pei-scavenge-sgpr.mir
  llvm/test/CodeGen/AMDGPU/pei-scavenge-vgpr-spill.mir
  llvm/test/CodeGen/AMDGPU/preserve-only-inactive-lane.mir
  llvm/test/CodeGen/AMDGPU/prologue-epilogue-markers.ll
  llvm/test/CodeGen/AMDGPU/ptr-arg-dbg-value.ll
  llvm/test/CodeGen/AMDGPU/regalloc-introduces-copy-sgpr-to-agpr.mir
  llvm/test/CodeGen/AMDGPU/same-slot-agpr-sgpr.mir
  llvm/test/CodeGen/AMDGPU/sgpr-spill-to-vmem-scc-clobber.mir
  llvm/test/CodeGen/AMDGPU/sgpr-spill-vmem-large-frame.mir
  llvm/test/CodeGen/AMDGPU/sgpr

[PATCH] D145770: [clang-offload-bundler] Standardize TargetID field for bundler

2023-03-13 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

Is there a corresponding document update to go along with this? If consumers of 
the bundles will begin to rely on this it should be documented that it is 
guaranteed by the producer.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145770/new/

https://reviews.llvm.org/D145770

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D142499: [Clang][AMDGPU] Set LTO CG opt level based on Clang option

2023-02-15 Thread Scott Linder via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rG97ba3c2bec48: [Clang][AMDGPU] Set LTO CG opt level based on 
Clang option (authored by scott.linder).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142499/new/

https://reviews.llvm.org/D142499

Files:
  clang/lib/Driver/ToolChains/CommonArgs.cpp
  clang/lib/Driver/ToolChains/HIPAMD.cpp
  clang/test/Driver/hip-toolchain-opt.hip


Index: clang/test/Driver/hip-toolchain-opt.hip
===
--- clang/test/Driver/hip-toolchain-opt.hip
+++ clang/test/Driver/hip-toolchain-opt.hip
@@ -57,6 +57,14 @@
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
 // RUN: 2>&1 | FileCheck --check-prefixes=ALL,Og %s
 
+// RUN: %clang -### -O0 \
+// RUN:   -Xoffload-linker --lto-CGO2 \
+// RUN:   --target=x86_64-unknown-linux-gnu \
+// RUN:   --cuda-gpu-arch=gfx900 \
+// RUN:   -c -nogpulib \
+// RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
+// RUN: 2>&1 | FileCheck --check-prefixes=ALL,O0-CGO2 %s
+
 // ALL: "-cc1" "-triple" "amdgcn-amd-amdhsa"
 // DEFAULT-NOT: "-O{{.}}"
 // O0-SAME: "-O0"
@@ -66,6 +74,8 @@
 // Os-SAME: "-Os"
 // Oz-SAME: "-Oz"
 // Og-SAME: "-Og"
+// O0-CGO2-SAME: "-O0"
+// O0-CGO2-NOT: "--lto-CGO2"
 
 // ALL-NOT: "{{.*}}opt"
 
@@ -74,12 +84,22 @@
 // ALL: "{{.*}}lld{{.*}}" {{.*}} "-plugin-opt=mcpu=gfx900"
 // DEFAULT-NOT: "-plugin-opt=O{{.*}}"
 // O0-SAME: "-plugin-opt=O0"
+// O0-SAME: "--lto-CGO0"
 // O1-SAME: "-plugin-opt=O1"
+// O1-SAME: "--lto-CGO1"
 // O2-SAME: "-plugin-opt=O2"
+// O2-SAME: "--lto-CGO2"
 // O3-SAME: "-plugin-opt=O3"
+// O3-SAME: "--lto-CGO3"
 // Os-SAME: "-plugin-opt=O2"
+// Os-SAME: "--lto-CGO2"
 // Oz-SAME: "-plugin-opt=O2"
+// Oz-SAME: "--lto-CGO2"
 // Og-SAME: "-plugin-opt=O1"
+// Og-SAME: "--lto-CGO1"
+// O0-CGO2-SAME: "-plugin-opt=O0"
+// O0-CGO2-SAME: "--lto-CGO0"
+// O0-CGO2-SAME: "--lto-CGO2"
 
 // ALL: "-cc1" "-triple" "x86_64-unknown-linux-gnu"
 // DEFAULT-NOT: "-O{{.}}"
@@ -90,3 +110,5 @@
 // Os-SAME: "-Os"
 // Oz-SAME: "-Oz"
 // Og-SAME: "-Og"
+// O0-CGO2-SAME: "-O0"
+// O0-CGO2-NOT: "--lto-CGO2"
Index: clang/lib/Driver/ToolChains/HIPAMD.cpp
===
--- clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -152,8 +152,10 @@
 
   addLinkerCompressDebugSectionsOption(TC, Args, LldArgs);
 
-  for (auto *Arg : Args.filtered(options::OPT_Xoffload_linker))
+  for (auto *Arg : Args.filtered(options::OPT_Xoffload_linker)) {
 LldArgs.push_back(Arg->getValue(1));
+Arg->claim();
+  }
 
   LldArgs.append({"-o", Output.getFilename()});
   for (auto Input : Inputs)
Index: clang/lib/Driver/ToolChains/CommonArgs.cpp
===
--- clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -567,6 +567,7 @@
   ArgStringList &CmdArgs, const InputInfo &Output,
   const InputInfo &Input, bool IsThinLTO) {
   const bool IsOSAIX = ToolChain.getTriple().isOSAIX();
+  const bool IsAMDGCN = ToolChain.getTriple().isAMDGCN();
   const char *Linker = Args.MakeArgString(ToolChain.GetLinkerPath());
   const Driver &D = ToolChain.getDriver();
   if (llvm::sys::path::filename(Linker) != "ld.lld" &&
@@ -631,9 +632,12 @@
 OOpt = "2";
 } else if (A->getOption().matches(options::OPT_O0))
   OOpt = "0";
-if (!OOpt.empty())
+if (!OOpt.empty()) {
   CmdArgs.push_back(
   Args.MakeArgString(Twine(PluginOptPrefix) + ExtraDash + "O" + OOpt));
+  if (IsAMDGCN)
+CmdArgs.push_back(Args.MakeArgString(Twine("--lto-CGO") + OOpt));
+}
   }
 
   if (Args.hasArg(options::OPT_gsplit_dwarf))


Index: clang/test/Driver/hip-toolchain-opt.hip
===
--- clang/test/Driver/hip-toolchain-opt.hip
+++ clang/test/Driver/hip-toolchain-opt.hip
@@ -57,6 +57,14 @@
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
 // RUN: 2>&1 | FileCheck --check-prefixes=ALL,Og %s
 
+// RUN: %clang -### -O0 \
+// RUN:   -Xoffload-linker --lto-CGO2 \
+// RUN:   --target=x86_64-unknown-linux-gnu \
+// RUN:   --cuda-gpu-arch=gfx900 \
+// RUN:   -c -nogpulib \
+// RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
+// RUN: 2>&1 | FileCheck --check-prefixes=ALL,O0-CGO2 %s
+
 // ALL: "-cc1" "-triple" "amdgcn-amd-amdhsa"
 // DEFAULT-NOT: "-O{{.}}"
 // O0-SAME: "-O0"
@@ -66,6 +74,8 @@
 // Os-SAME: "-Os"
 // Oz-SAME: "-Oz"
 // Og-SAME: "-Og"
+// O0-CGO2-SAME: "-O0"
+// O0-CGO2-NOT: "--lto-CGO2"
 
 // ALL-NOT: "{{.*}}opt"
 
@@ -74,12 +84,22 @@
 // ALL: "{{.*}}lld{{.*}}" {{.*}} "-plugin-opt=mcpu=gfx900"
 // DEFAULT-NOT: "-plugin-opt=O{{.*}}"
 // O0-SAME: "-plugin-opt=O0"
+// O0-SAME: "--lto-CGO0"
 // O1-SAME: "-plugin-opt=O1"
+// O1-SAME: "--lto-CGO1"
 // O2-SAME: "-plugin-opt=O2"
+// O2-SAME: "--lto-CGO2"
 // O3-SAME: "-plugin-opt=O3"
+//

[PATCH] D142499: [Clang][AMDGPU] Set LTO CG opt level based on Clang option

2023-02-02 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

Are there any thoughts on whether this is too ugly to live? It will be awkward 
to teach users the current default behavior without this change, but if we can 
accept it as a historical quirk that may be OK.

The primary driver for wanting the 1:1 mapping is an odd interaction between 
our debug info (only implemented for -O0 codegen currently, higher opt levels 
are WIP) and our implementation of `-fgpu-rdc` (currently this implies `-flto` 
as we don't support true device code-object linking).

A user who compiles and debugs without `-fgpu-rdc` cannot simply add it to 
their command-line, they will also need `-Xoffload-linker --lto-CGO0`. Of 
course, this is a temporary arrangement, but it also just seems "correct" to 
default to the 1:1 mapping.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142499/new/

https://reviews.llvm.org/D142499

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D142499: [Clang][AMDGPU] Set LTO CG opt level based on Clang option

2023-02-02 Thread Scott Linder via Phabricator via cfe-commits

scott.linder updated this revision to Diff 494435.
scott.linder added a comment.

Rebase


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142499/new/

https://reviews.llvm.org/D142499

Files:
  clang/lib/Driver/ToolChains/CommonArgs.cpp
  clang/lib/Driver/ToolChains/HIPAMD.cpp
  clang/test/Driver/hip-toolchain-opt.hip


Index: clang/test/Driver/hip-toolchain-opt.hip
===
--- clang/test/Driver/hip-toolchain-opt.hip
+++ clang/test/Driver/hip-toolchain-opt.hip
@@ -57,6 +57,14 @@
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
 // RUN: 2>&1 | FileCheck --check-prefixes=ALL,Og %s
 
+// RUN: %clang -### -O0 \
+// RUN:   -Xoffload-linker --lto-CGO2 \
+// RUN:   --target=x86_64-unknown-linux-gnu \
+// RUN:   --cuda-gpu-arch=gfx900 \
+// RUN:   -c -nogpulib \
+// RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
+// RUN: 2>&1 | FileCheck --check-prefixes=ALL,O0-CGO2 %s
+
 // ALL: "-cc1" "-triple" "amdgcn-amd-amdhsa"
 // DEFAULT-NOT: "-O{{.}}"
 // O0-SAME: "-O0"
@@ -66,6 +74,8 @@
 // Os-SAME: "-Os"
 // Oz-SAME: "-Oz"
 // Og-SAME: "-Og"
+// O0-CGO2-SAME: "-O0"
+// O0-CGO2-NOT: "--lto-CGO2"
 
 // ALL-NOT: "{{.*}}opt"
 
@@ -74,12 +84,22 @@
 // ALL: "{{.*}}lld{{.*}}" {{.*}} "-plugin-opt=mcpu=gfx900"
 // DEFAULT-NOT: "-plugin-opt=O{{.*}}"
 // O0-SAME: "-plugin-opt=O0"
+// O0-SAME: "--lto-CGO0"
 // O1-SAME: "-plugin-opt=O1"
+// O1-SAME: "--lto-CGO1"
 // O2-SAME: "-plugin-opt=O2"
+// O2-SAME: "--lto-CGO2"
 // O3-SAME: "-plugin-opt=O3"
+// O3-SAME: "--lto-CGO3"
 // Os-SAME: "-plugin-opt=O2"
+// Os-SAME: "--lto-CGO2"
 // Oz-SAME: "-plugin-opt=O2"
+// Oz-SAME: "--lto-CGO2"
 // Og-SAME: "-plugin-opt=O1"
+// Og-SAME: "--lto-CGO1"
+// O0-CGO2-SAME: "-plugin-opt=O0"
+// O0-CGO2-SAME: "--lto-CGO0"
+// O0-CGO2-SAME: "--lto-CGO2"
 
 // ALL: "-cc1" "-triple" "x86_64-unknown-linux-gnu"
 // DEFAULT-NOT: "-O{{.}}"
@@ -90,3 +110,5 @@
 // Os-SAME: "-Os"
 // Oz-SAME: "-Oz"
 // Og-SAME: "-Og"
+// O0-CGO2-SAME: "-O0"
+// O0-CGO2-NOT: "--lto-CGO2"
Index: clang/lib/Driver/ToolChains/HIPAMD.cpp
===
--- clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -152,8 +152,10 @@
 
   addLinkerCompressDebugSectionsOption(TC, Args, LldArgs);
 
-  for (auto *Arg : Args.filtered(options::OPT_Xoffload_linker))
+  for (auto *Arg : Args.filtered(options::OPT_Xoffload_linker)) {
 LldArgs.push_back(Arg->getValue(1));
+Arg->claim();
+  }
 
   LldArgs.append({"-o", Output.getFilename()});
   for (auto Input : Inputs)
Index: clang/lib/Driver/ToolChains/CommonArgs.cpp
===
--- clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -567,6 +567,7 @@
   ArgStringList &CmdArgs, const InputInfo &Output,
   const InputInfo &Input, bool IsThinLTO) {
   const bool IsOSAIX = ToolChain.getTriple().isOSAIX();
+  const bool IsAMDGCN = ToolChain.getTriple().isAMDGCN();
   const char *Linker = Args.MakeArgString(ToolChain.GetLinkerPath());
   const Driver &D = ToolChain.getDriver();
   if (llvm::sys::path::filename(Linker) != "ld.lld" &&
@@ -631,9 +632,12 @@
 OOpt = "2";
 } else if (A->getOption().matches(options::OPT_O0))
   OOpt = "0";
-if (!OOpt.empty())
+if (!OOpt.empty()) {
   CmdArgs.push_back(
   Args.MakeArgString(Twine(PluginOptPrefix) + ExtraDash + "O" + OOpt));
+  if (IsAMDGCN)
+CmdArgs.push_back(Args.MakeArgString(Twine("--lto-CGO") + OOpt));
+}
   }
 
   if (Args.hasArg(options::OPT_gsplit_dwarf))


Index: clang/test/Driver/hip-toolchain-opt.hip
===
--- clang/test/Driver/hip-toolchain-opt.hip
+++ clang/test/Driver/hip-toolchain-opt.hip
@@ -57,6 +57,14 @@
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
 // RUN: 2>&1 | FileCheck --check-prefixes=ALL,Og %s
 
+// RUN: %clang -### -O0 \
+// RUN:   -Xoffload-linker --lto-CGO2 \
+// RUN:   --target=x86_64-unknown-linux-gnu \
+// RUN:   --cuda-gpu-arch=gfx900 \
+// RUN:   -c -nogpulib \
+// RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
+// RUN: 2>&1 | FileCheck --check-prefixes=ALL,O0-CGO2 %s
+
 // ALL: "-cc1" "-triple" "amdgcn-amd-amdhsa"
 // DEFAULT-NOT: "-O{{.}}"
 // O0-SAME: "-O0"
@@ -66,6 +74,8 @@
 // Os-SAME: "-Os"
 // Oz-SAME: "-Oz"
 // Og-SAME: "-Og"
+// O0-CGO2-SAME: "-O0"
+// O0-CGO2-NOT: "--lto-CGO2"
 
 // ALL-NOT: "{{.*}}opt"
 
@@ -74,12 +84,22 @@
 // ALL: "{{.*}}lld{{.*}}" {{.*}} "-plugin-opt=mcpu=gfx900"
 // DEFAULT-NOT: "-plugin-opt=O{{.*}}"
 // O0-SAME: "-plugin-opt=O0"
+// O0-SAME: "--lto-CGO0"
 // O1-SAME: "-plugin-opt=O1"
+// O1-SAME: "--lto-CGO1"
 // O2-SAME: "-plugin-opt=O2"
+// O2-SAME: "--lto-CGO2"
 // O3-SAME: "-plugin-opt=O3"
+// O3-SAME: "--lto-CGO3"
 // Os-SAME: "-plugin-opt=O2"
+// Os-SAME: "--lto-CGO2"
 // Oz-SAME: "-plugin-opt=O

[PATCH] D142499: [Clang][AMDGPU] Set LTO CG opt level based on Clang option

2023-01-25 Thread Scott Linder via Phabricator via cfe-commits

scott.linder updated this revision to Diff 492205.
scott.linder added a comment.

Rebase


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142499/new/

https://reviews.llvm.org/D142499

Files:
  clang/lib/Driver/ToolChains/CommonArgs.cpp
  clang/lib/Driver/ToolChains/HIPAMD.cpp
  clang/test/Driver/hip-toolchain-opt.hip


Index: clang/test/Driver/hip-toolchain-opt.hip
===
--- clang/test/Driver/hip-toolchain-opt.hip
+++ clang/test/Driver/hip-toolchain-opt.hip
@@ -57,6 +57,14 @@
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
 // RUN: 2>&1 | FileCheck --check-prefixes=ALL,Og %s
 
+// RUN: %clang -### -O0 \
+// RUN:   -Xoffload-linker --lto-CGO2 \
+// RUN:   --target=x86_64-unknown-linux-gnu \
+// RUN:   --cuda-gpu-arch=gfx900 \
+// RUN:   -c -nogpulib \
+// RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
+// RUN: 2>&1 | FileCheck --check-prefixes=ALL,O0-CGO2 %s
+
 // ALL: "-cc1" "-triple" "amdgcn-amd-amdhsa"
 // DEFAULT-NOT: "-O{{.}}"
 // O0-SAME: "-O0"
@@ -66,6 +74,8 @@
 // Os-SAME: "-Os"
 // Oz-SAME: "-Oz"
 // Og-SAME: "-Og"
+// O0-CGO2-SAME: "-O0"
+// O0-CGO2-NOT: "--lto-CGO2"
 
 // ALL-NOT: "{{.*}}opt"
 
@@ -74,12 +84,22 @@
 // ALL: "{{.*}}lld{{.*}}" {{.*}} "-plugin-opt=mcpu=gfx900"
 // DEFAULT-NOT: "-plugin-opt=O{{.*}}"
 // O0-SAME: "-plugin-opt=O0"
+// O0-SAME: "--lto-CGO0"
 // O1-SAME: "-plugin-opt=O1"
+// O1-SAME: "--lto-CGO1"
 // O2-SAME: "-plugin-opt=O2"
+// O2-SAME: "--lto-CGO2"
 // O3-SAME: "-plugin-opt=O3"
+// O3-SAME: "--lto-CGO3"
 // Os-SAME: "-plugin-opt=O2"
+// Os-SAME: "--lto-CGO2"
 // Oz-SAME: "-plugin-opt=O2"
+// Oz-SAME: "--lto-CGO2"
 // Og-SAME: "-plugin-opt=O1"
+// Og-SAME: "--lto-CGO1"
+// O0-CGO2-SAME: "-plugin-opt=O0"
+// O0-CGO2-SAME: "--lto-CGO0"
+// O0-CGO2-SAME: "--lto-CGO2"
 
 // ALL: "-cc1" "-triple" "x86_64-unknown-linux-gnu"
 // DEFAULT-NOT: "-O{{.}}"
@@ -90,3 +110,5 @@
 // Os-SAME: "-Os"
 // Oz-SAME: "-Oz"
 // Og-SAME: "-Og"
+// O0-CGO2-SAME: "-O0"
+// O0-CGO2-NOT: "--lto-CGO2"
Index: clang/lib/Driver/ToolChains/HIPAMD.cpp
===
--- clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -152,8 +152,10 @@
 
   addLinkerCompressDebugSectionsOption(TC, Args, LldArgs);
 
-  for (auto *Arg : Args.filtered(options::OPT_Xoffload_linker))
+  for (auto *Arg : Args.filtered(options::OPT_Xoffload_linker)) {
 LldArgs.push_back(Arg->getValue(1));
+Arg->claim();
+  }
 
   LldArgs.append({"-o", Output.getFilename()});
   for (auto Input : Inputs)
Index: clang/lib/Driver/ToolChains/CommonArgs.cpp
===
--- clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -567,6 +567,7 @@
   ArgStringList &CmdArgs, const InputInfo &Output,
   const InputInfo &Input, bool IsThinLTO) {
   const bool IsOSAIX = ToolChain.getTriple().isOSAIX();
+  const bool IsAMDGCN = ToolChain.getTriple().isAMDGCN();
   const char *Linker = Args.MakeArgString(ToolChain.GetLinkerPath());
   const Driver &D = ToolChain.getDriver();
   if (llvm::sys::path::filename(Linker) != "ld.lld" &&
@@ -631,9 +632,12 @@
 OOpt = "2";
 } else if (A->getOption().matches(options::OPT_O0))
   OOpt = "0";
-if (!OOpt.empty())
+if (!OOpt.empty()) {
   CmdArgs.push_back(
   Args.MakeArgString(Twine(PluginOptPrefix) + ExtraDash + "O" + OOpt));
+  if (IsAMDGCN)
+CmdArgs.push_back(Args.MakeArgString(Twine("--lto-CGO") + OOpt));
+}
   }
 
   if (Args.hasArg(options::OPT_gsplit_dwarf))


Index: clang/test/Driver/hip-toolchain-opt.hip
===
--- clang/test/Driver/hip-toolchain-opt.hip
+++ clang/test/Driver/hip-toolchain-opt.hip
@@ -57,6 +57,14 @@
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
 // RUN: 2>&1 | FileCheck --check-prefixes=ALL,Og %s
 
+// RUN: %clang -### -O0 \
+// RUN:   -Xoffload-linker --lto-CGO2 \
+// RUN:   --target=x86_64-unknown-linux-gnu \
+// RUN:   --cuda-gpu-arch=gfx900 \
+// RUN:   -c -nogpulib \
+// RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
+// RUN: 2>&1 | FileCheck --check-prefixes=ALL,O0-CGO2 %s
+
 // ALL: "-cc1" "-triple" "amdgcn-amd-amdhsa"
 // DEFAULT-NOT: "-O{{.}}"
 // O0-SAME: "-O0"
@@ -66,6 +74,8 @@
 // Os-SAME: "-Os"
 // Oz-SAME: "-Oz"
 // Og-SAME: "-Og"
+// O0-CGO2-SAME: "-O0"
+// O0-CGO2-NOT: "--lto-CGO2"
 
 // ALL-NOT: "{{.*}}opt"
 
@@ -74,12 +84,22 @@
 // ALL: "{{.*}}lld{{.*}}" {{.*}} "-plugin-opt=mcpu=gfx900"
 // DEFAULT-NOT: "-plugin-opt=O{{.*}}"
 // O0-SAME: "-plugin-opt=O0"
+// O0-SAME: "--lto-CGO0"
 // O1-SAME: "-plugin-opt=O1"
+// O1-SAME: "--lto-CGO1"
 // O2-SAME: "-plugin-opt=O2"
+// O2-SAME: "--lto-CGO2"
 // O3-SAME: "-plugin-opt=O3"
+// O3-SAME: "--lto-CGO3"
 // Os-SAME: "-plugin-opt=O2"
+// Os-SAME: "--lto-CGO2"
 // Oz-SAME: "-plugin-opt=O

[PATCH] D142499: [Clang][AMDGPU] Set LTO CG opt level based on Clang option

2023-01-24 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

In D142499#4078177 , @arsenm wrote:

> Making this target specific doesn’t make sense

I don't know how else to resolve others wanting the default to be the 
`CGOptLevel = clamp(ltoOptLevel, 2, 3)` behavior and AMDGPU wanting the default 
to be `CGOptLevel = ltoOptLevel`. Maybe that isn't actually the default we 
want? We can ask users to specify `-Xoffload-linker --lto-CGO#`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142499/new/

https://reviews.llvm.org/D142499

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D142499: [Clang][AMDGPU] Set LTO CG opt level based on Clang option

2023-01-24 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added reviewers: arsenm, yaxunl, scchan, pcc, MaskRay, 
mehdi_amini, respindola, ruiu, aheejin, sbc100.
scott.linder added a comment.

(Just adding everyone from the parent review, feel free to remove yourself as 
reviewer if you aren't interested!)

It isn't ideal, but as only AMDGCN wants the 1:1 mapping for LTO->CG opt level 
to be the default I implemented it here. Thoughts?

I can move the missing `claim` call to another patch, it just seemed like a 
small enough change to include inline.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142499/new/

https://reviews.llvm.org/D142499

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D142499: [Clang][AMDGPU] Set LTO CG opt level based on Clang option

2023-01-24 Thread Scott Linder via Phabricator via cfe-commits

scott.linder created this revision.
Herald added subscribers: kosarev, inglorion, tpr, dstuttard, yaxunl, kzhuravl.
Herald added a project: All.
scott.linder requested review of this revision.
Herald added subscribers: cfe-commits, MaskRay, wdng.
Herald added a project: clang.

For AMDGCN default to mapping --lto-O# to --lto-CGO# in a 1:1 manner
(i.e. clang -O implies --lto-O and --lto-CGO).

Ensure there is a means to override this via -Xoffload-linker and begin
to claim these arguments to avoid incorrect warnings that they are not
used.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D142499

Files:
  clang/lib/Driver/ToolChains/CommonArgs.cpp
  clang/lib/Driver/ToolChains/HIPAMD.cpp
  clang/test/Driver/hip-toolchain-opt.hip


Index: clang/test/Driver/hip-toolchain-opt.hip
===
--- clang/test/Driver/hip-toolchain-opt.hip
+++ clang/test/Driver/hip-toolchain-opt.hip
@@ -57,6 +57,14 @@
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
 // RUN: 2>&1 | FileCheck --check-prefixes=ALL,Og %s
 
+// RUN: %clang -### -O0 \
+// RUN:   -Xoffload-linker --lto-CGO2 \
+// RUN:   --target=x86_64-unknown-linux-gnu \
+// RUN:   --cuda-gpu-arch=gfx900 \
+// RUN:   -c -nogpulib \
+// RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
+// RUN: 2>&1 | FileCheck --check-prefixes=ALL,O0-CGO2 %s
+
 // ALL: "-cc1" "-triple" "amdgcn-amd-amdhsa"
 // DEFAULT-NOT: "-O{{.}}"
 // O0-SAME: "-O0"
@@ -66,6 +74,8 @@
 // Os-SAME: "-Os"
 // Oz-SAME: "-Oz"
 // Og-SAME: "-Og"
+// O0-CGO2-SAME: "-O0"
+// O0-CGO2-NOT: "--lto-CGO2"
 
 // ALL-NOT: "{{.*}}opt"
 
@@ -74,12 +84,22 @@
 // ALL: "{{.*}}lld{{.*}}" {{.*}} "-plugin-opt=mcpu=gfx900"
 // DEFAULT-NOT: "-plugin-opt=O{{.*}}"
 // O0-SAME: "-plugin-opt=O0"
+// O0-SAME: "--lto-CGO0"
 // O1-SAME: "-plugin-opt=O1"
+// O1-SAME: "--lto-CGO1"
 // O2-SAME: "-plugin-opt=O2"
+// O2-SAME: "--lto-CGO2"
 // O3-SAME: "-plugin-opt=O3"
+// O3-SAME: "--lto-CGO3"
 // Os-SAME: "-plugin-opt=O2"
+// Os-SAME: "--lto-CGO2"
 // Oz-SAME: "-plugin-opt=O2"
+// Oz-SAME: "--lto-CGO2"
 // Og-SAME: "-plugin-opt=O1"
+// Og-SAME: "--lto-CGO1"
+// O0-CGO2-SAME: "-plugin-opt=O0"
+// O0-CGO2-SAME: "--lto-CGO0"
+// O0-CGO2-SAME: "--lto-CGO2"
 
 // ALL: "-cc1" "-triple" "x86_64-unknown-linux-gnu"
 // DEFAULT-NOT: "-O{{.}}"
@@ -90,3 +110,5 @@
 // Os-SAME: "-Os"
 // Oz-SAME: "-Oz"
 // Og-SAME: "-Og"
+// O0-CGO2-SAME: "-O0"
+// O0-CGO2-NOT: "--lto-CGO2"
Index: clang/lib/Driver/ToolChains/HIPAMD.cpp
===
--- clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -152,8 +152,10 @@
 
   addLinkerCompressDebugSectionsOption(TC, Args, LldArgs);
 
-  for (auto *Arg : Args.filtered(options::OPT_Xoffload_linker))
+  for (auto *Arg : Args.filtered(options::OPT_Xoffload_linker)) {
 LldArgs.push_back(Arg->getValue(1));
+Arg->claim();
+  }
 
   LldArgs.append({"-o", Output.getFilename()});
   for (auto Input : Inputs)
Index: clang/lib/Driver/ToolChains/CommonArgs.cpp
===
--- clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -567,6 +567,7 @@
   ArgStringList &CmdArgs, const InputInfo &Output,
   const InputInfo &Input, bool IsThinLTO) {
   const bool IsOSAIX = ToolChain.getTriple().isOSAIX();
+  const bool IsAMDGCN = ToolChain.getTriple().isAMDGCN();
   const char *Linker = Args.MakeArgString(ToolChain.GetLinkerPath());
   const Driver &D = ToolChain.getDriver();
   if (llvm::sys::path::filename(Linker) != "ld.lld" &&
@@ -631,9 +632,12 @@
 OOpt = "2";
 } else if (A->getOption().matches(options::OPT_O0))
   OOpt = "0";
-if (!OOpt.empty())
+if (!OOpt.empty()) {
   CmdArgs.push_back(
   Args.MakeArgString(Twine(PluginOptPrefix) + ExtraDash + "O" + OOpt));
+  if (IsAMDGCN)
+CmdArgs.push_back(Args.MakeArgString(Twine("--lto-CGO") + OOpt));
+}
   }
 
   if (Args.hasArg(options::OPT_gsplit_dwarf))


Index: clang/test/Driver/hip-toolchain-opt.hip
===
--- clang/test/Driver/hip-toolchain-opt.hip
+++ clang/test/Driver/hip-toolchain-opt.hip
@@ -57,6 +57,14 @@
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
 // RUN: 2>&1 | FileCheck --check-prefixes=ALL,Og %s
 
+// RUN: %clang -### -O0 \
+// RUN:   -Xoffload-linker --lto-CGO2 \
+// RUN:   --target=x86_64-unknown-linux-gnu \
+// RUN:   --cuda-gpu-arch=gfx900 \
+// RUN:   -c -nogpulib \
+// RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
+// RUN: 2>&1 | FileCheck --check-prefixes=ALL,O0-CGO2 %s
+
 // ALL: "-cc1" "-triple" "amdgcn-amd-amdhsa"
 // DEFAULT-NOT: "-O{{.}}"
 // O0-SAME: "-O0"
@@ -66,6 +74,8 @@
 // Os-SAME: "-Os"
 // Oz-SAME: "-Oz"
 // Og-SAME: "-Og"
+// O0-CGO2-SAME: "-O0"
+// O0-CGO2-NOT: "--lto-CGO2"
 
 // ALL-NOT: "{{.*}}opt"
 
@@ -74,12 +84,22 @@
 // ALL: "{{.*}}l

[PATCH] D141968: [NFC] Consolidate llvm::CodeGenOpt::Level handling

2023-01-23 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: flang/lib/Frontend/FrontendActions.cpp:590
+  std::optional OptLevelOrNone =
+  CodeGenOpt::getLevel(CGOpts.OptimizationLevel);
+  assert(OptLevelOrNone && "Invalid optimization level!");

aaronpuchert wrote:
> Does this need `llvm::`?
> 
> ```
> ../llvm-project/flang/lib/Frontend/FrontendActions.cpp:590:7: error: use of 
> undeclared identifier 'CodeGenOpt'; did you mean 'llvm::CodeGenOpt'?
>   CodeGenOpt::getLevel(CGOpts.OptimizationLevel);
>   ^~
>   llvm::CodeGenOpt
> ../llvm-project/llvm/include/llvm/Support/CodeGen.h:53:13: note: 
> 'llvm::CodeGenOpt' declared here
>   namespace CodeGenOpt {
> ^
> ```
> See https://lab.llvm.org/buildbot/#/builders/191/builds/13832.
Yes, I just commited a1baa7a5c57ea3065f8f354d2fc758b3328040fd to fix this; I 
thought I was building `flang` locally but didn't verify it and clearly was not

Thank you and sorry for the noise!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141968/new/

https://reviews.llvm.org/D141968

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D141968: [NFC] Consolidate llvm::CodeGenOpt::Level handling

2023-01-23 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

I also ended up changing the underlying type to `int` to resolve 
https://lab.llvm.org/buildbot/#/builders/61/builds/38754 , and I removed the 
`getID` function (with the implicit conversions it would never get used anyway).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141968/new/

https://reviews.llvm.org/D141968

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D141968: [NFC] Consolidate llvm::CodeGenOpt::Level handling

2023-01-23 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: llvm/include/llvm/Support/CodeGen.h:67
+  inline std::optional getLevel(IDType ID) {
+if (ID < 0 || ID > 3)
+  return std::nullopt;

mstorsjo wrote:
> scott.linder wrote:
> > barannikov88 wrote:
> > > As I can see, clients do not check for nullopt. Either add checks or 
> > > replace this check with an assertion and drop std::optional (in this case 
> > > `parseLevel` should be updated accordingly).
> > > 
> > > 
> > Good catch! I was working off of the old behavior of `llvm::Optional` and 
> > assuming the new `std::optional` was guaranteed to `assert` on dereference 
> > as well. I think the right interface is to expose the `optional`, so for 
> > the callsites which currently do the equivalent of asserting I will just 
> > carry over an explicit assert to the new version.
> > 
> > I posted to 
> > https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716/26
> >  to discuss the issue more generally, as at least some of the patches which 
> > moved code from `llvm::Optional` to `std::optional` accidentally dropped 
> > assertions.
> This causes massive amounts of warnings when built with GCC:
> ```
> ../include/llvm/Support/CodeGen.h:67:12: warning: comparison is always false 
> due to limited range of data type [-Wtype-limits]
>67 | if (ID < 0 || ID > 3)
>   | ~~~^~~
> ```
Fixed in 2dc33713de7ceae3e28a13be1d72c862ec0efb2c

Thank you for reporting, sorry for the noise!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141968/new/

https://reviews.llvm.org/D141968

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D141968: [NFC] Consolidate llvm::CodeGenOpt::Level handling

2023-01-23 Thread Scott Linder via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG25c0ea2a5370: [NFC] Consolidate llvm::CodeGenOpt::Level 
handling (authored by scott.linder).

Changed prior to commit:
  https://reviews.llvm.org/D141968?vs=490321&id=491515#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141968/new/

https://reviews.llvm.org/D141968

Files:
  clang/lib/CodeGen/BackendUtil.cpp
  clang/tools/clang-fuzzer/handle-llvm/handle_llvm.cpp
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
  flang/lib/Frontend/FrontendActions.cpp
  llvm/include/llvm/Support/CodeGen.h
  llvm/lib/LTO/LTOCodeGenerator.cpp
  llvm/tools/gold/gold-plugin.cpp
  llvm/tools/llc/llc.cpp
  llvm/tools/lli/lli.cpp
  llvm/tools/llvm-isel-fuzzer/llvm-isel-fuzzer.cpp
  llvm/tools/llvm-lto2/llvm-lto2.cpp
  llvm/tools/lto/lto.cpp
  openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp

Index: openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
===
--- openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
+++ openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
@@ -138,20 +138,6 @@
   return createModuleFromMemoryBuffer(MB, Context);
 }
 
-CodeGenOpt::Level getCGOptLevel(unsigned OptLevel) {
-  switch (OptLevel) {
-  case 0:
-return CodeGenOpt::None;
-  case 1:
-return CodeGenOpt::Less;
-  case 2:
-return CodeGenOpt::Default;
-  case 3:
-return CodeGenOpt::Aggressive;
-  }
-  llvm_unreachable("Invalid optimization level");
-}
-
 OptimizationLevel getOptLevel(unsigned OptLevel) {
   switch (OptLevel) {
   case 0:
@@ -169,7 +155,10 @@
 Expected>
 createTargetMachine(Module &M, std::string CPU, unsigned OptLevel) {
   Triple TT(M.getTargetTriple());
-  CodeGenOpt::Level CGOptLevel = getCGOptLevel(OptLevel);
+  std::optional CGOptLevelOrNone =
+  CodeGenOpt::getLevel(OptLevel);
+  assert(CGOptLevelOrNone && "Invalid optimization level");
+  CodeGenOpt::Level CGOptLevel = *CGOptLevelOrNone;
 
   std::string Msg;
   const Target *T = TargetRegistry::lookupTarget(M.getTargetTriple(), Msg);
Index: llvm/tools/lto/lto.cpp
===
--- llvm/tools/lto/lto.cpp
+++ llvm/tools/lto/lto.cpp
@@ -528,20 +528,10 @@
 if (OptLevel < '0' || OptLevel > '3')
   report_fatal_error("Optimization level must be between 0 and 3");
 CodeGen->setOptLevel(OptLevel - '0');
-switch (OptLevel) {
-case '0':
-  CodeGen->setCodeGenOptLevel(CodeGenOpt::None);
-  break;
-case '1':
-  CodeGen->setCodeGenOptLevel(CodeGenOpt::Less);
-  break;
-case '2':
-  CodeGen->setCodeGenOptLevel(CodeGenOpt::Default);
-  break;
-case '3':
-  CodeGen->setCodeGenOptLevel(CodeGenOpt::Aggressive);
-  break;
-}
+std::optional CGOptLevelOrNone =
+CodeGenOpt::getLevel(OptLevel - '0');
+assert(CGOptLevelOrNone);
+CodeGen->setCodeGenOptLevel(*CGOptLevelOrNone);
   }
   return wrap(CodeGen);
 }
Index: llvm/tools/llvm-lto2/llvm-lto2.cpp
===
--- llvm/tools/llvm-lto2/llvm-lto2.cpp
+++ llvm/tools/llvm-lto2/llvm-lto2.cpp
@@ -306,20 +306,9 @@
   Conf.Freestanding = EnableFreestanding;
   for (auto &PluginFN : PassPlugins)
 Conf.PassPlugins.push_back(PluginFN);
-  switch (CGOptLevel) {
-  case '0':
-Conf.CGOptLevel = CodeGenOpt::None;
-break;
-  case '1':
-Conf.CGOptLevel = CodeGenOpt::Less;
-break;
-  case '2':
-Conf.CGOptLevel = CodeGenOpt::Default;
-break;
-  case '3':
-Conf.CGOptLevel = CodeGenOpt::Aggressive;
-break;
-  default:
+  if (auto Level = CodeGenOpt::parseLevel(CGOptLevel)) {
+Conf.CGOptLevel = *Level;
+  } else {
 llvm::errs() << "invalid cg optimization level: " << CGOptLevel << '\n';
 return 1;
   }
Index: llvm/tools/llvm-isel-fuzzer/llvm-isel-fuzzer.cpp
===
--- llvm/tools/llvm-isel-fuzzer/llvm-isel-fuzzer.cpp
+++ llvm/tools/llvm-isel-fuzzer/llvm-isel-fuzzer.cpp
@@ -42,7 +42,7 @@
 OptLevel("O",
  cl::desc("Optimization level. [-O0, -O1, -O2, or -O3] "
   "(default = '-O2')"),
- cl::Prefix, cl::init(' '));
+ cl::Prefix, cl::init('2'));
 
 static cl::opt
 TargetTriple("mtriple", cl::desc("Override target triple for module"));
@@ -144,16 +144,12 @@
   std::string CPUStr = codegen::getCPUStr(),
   FeaturesStr = codegen::getFeaturesStr();
 
-  CodeGenOpt::Level OLvl = CodeGenOpt::Default;
-  switch (OptLevel) {
-  default:
+  CodeGenOpt::Level OLvl;
+  if (auto Level = CodeGenOpt::parseLevel(OptLevel)) {
+OLvl = *Level;
+  } else {
 errs() << argv[0] << ": invalid optimization level.\n";
 return 1;
-  case ' ': break;
-  case '0': OLvl =

[PATCH] D141968: [NFC] Consolidate llvm::CodeGenOpt::Level handling

2023-01-18 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.

Comment at: llvm/include/llvm/Support/CodeGen.h:57
+  /// Code generation optimization level.
+  enum Level : IDType {
+None = 0,  ///< -O0

barannikov88 wrote:
> arsenm wrote:
> > scott.linder wrote:
> > > This is ABI breaking, so maybe we don't want/need to define the 
> > > underlying type?
> > This isn't in the C API, and this isn't for a release branch so I don't 
> > think it matters
> Why did you need to restrict the underlying type?
> 
There is no strict need, it just:

* Avoids potential UB (casting an out-of-range value to an enumeration type 
//without a fixed underlying type// is undefined); in practice there is plenty 
of UB in LLVM, and a bug here wouldn't be a particularly pernicious kind of UB 
anyway.
* Means users of the type might pack better (and we won't run out of 255 opt 
levels anytime soon).

I originally intended to change this to an `enum class` rather than a `class` 
in a `namespace`, but that is slightly more disruptive and should probably be 
done for every other enumeration defined here at the same time.

Comment at: llvm/include/llvm/Support/CodeGen.h:67
+  inline std::optional getLevel(IDType ID) {
+if (ID < 0 || ID > 3)
+  return std::nullopt;

barannikov88 wrote:
> As I can see, clients do not check for nullopt. Either add checks or replace 
> this check with an assertion and drop std::optional (in this case 
> `parseLevel` should be updated accordingly).
> 
> 
Good catch! I was working off of the old behavior of `llvm::Optional` and 
assuming the new `std::optional` was guaranteed to `assert` on dereference as 
well. I think the right interface is to expose the `optional`, so for the 
callsites which currently do the equivalent of asserting I will just carry over 
an explicit assert to the new version.

I posted to 
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716/26
 to discuss the issue more generally, as at least some of the patches which 
moved code from `llvm::Optional` to `std::optional` accidentally dropped 
assertions.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141968/new/

https://reviews.llvm.org/D141968

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D141968: [NFC] Consolidate llvm::CodeGenOpt::Level handling

2023-01-18 Thread Scott Linder via Phabricator via cfe-commits

scott.linder updated this revision to Diff 490321.
scott.linder marked 2 inline comments as done.
scott.linder added a comment.

- Fix return type of `getID`
- Fix mistakenly updated option help text
- Add back in missing `assert`s


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141968/new/

https://reviews.llvm.org/D141968

Files:
  clang/lib/CodeGen/BackendUtil.cpp
  clang/tools/clang-fuzzer/handle-llvm/handle_llvm.cpp
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
  flang/lib/Frontend/FrontendActions.cpp
  llvm/include/llvm/Support/CodeGen.h
  llvm/lib/LTO/LTOCodeGenerator.cpp
  llvm/tools/gold/gold-plugin.cpp
  llvm/tools/llc/llc.cpp
  llvm/tools/lli/lli.cpp
  llvm/tools/llvm-isel-fuzzer/llvm-isel-fuzzer.cpp
  llvm/tools/llvm-lto2/llvm-lto2.cpp
  llvm/tools/lto/lto.cpp
  openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp

Index: openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
===
--- openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
+++ openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
@@ -138,20 +138,6 @@
   return createModuleFromMemoryBuffer(MB, Context);
 }
 
-CodeGenOpt::Level getCGOptLevel(unsigned OptLevel) {
-  switch (OptLevel) {
-  case 0:
-return CodeGenOpt::None;
-  case 1:
-return CodeGenOpt::Less;
-  case 2:
-return CodeGenOpt::Default;
-  case 3:
-return CodeGenOpt::Aggressive;
-  }
-  llvm_unreachable("Invalid optimization level");
-}
-
 OptimizationLevel getOptLevel(unsigned OptLevel) {
   switch (OptLevel) {
   case 0:
@@ -169,7 +155,10 @@
 Expected>
 createTargetMachine(Module &M, std::string CPU, unsigned OptLevel) {
   Triple TT(M.getTargetTriple());
-  CodeGenOpt::Level CGOptLevel = getCGOptLevel(OptLevel);
+  std::optional CGOptLevelOrNone =
+  CodeGenOpt::getLevel(OptLevel);
+  assert(CGOptLevelOrNone);
+  CodeGenOpt::Level CGOptLevel = *CGOptLevelOrNone;
 
   std::string Msg;
   const Target *T = TargetRegistry::lookupTarget(M.getTargetTriple(), Msg);
Index: llvm/tools/lto/lto.cpp
===
--- llvm/tools/lto/lto.cpp
+++ llvm/tools/lto/lto.cpp
@@ -528,20 +528,10 @@
 if (OptLevel < '0' || OptLevel > '3')
   report_fatal_error("Optimization level must be between 0 and 3");
 CodeGen->setOptLevel(OptLevel - '0');
-switch (OptLevel) {
-case '0':
-  CodeGen->setCodeGenOptLevel(CodeGenOpt::None);
-  break;
-case '1':
-  CodeGen->setCodeGenOptLevel(CodeGenOpt::Less);
-  break;
-case '2':
-  CodeGen->setCodeGenOptLevel(CodeGenOpt::Default);
-  break;
-case '3':
-  CodeGen->setCodeGenOptLevel(CodeGenOpt::Aggressive);
-  break;
-}
+std::optional CGOptLevelOrNone =
+CodeGenOpt::getLevel(OptLevel - '0');
+assert(CGOptLevelOrNone);
+CodeGen->setCodeGenOptLevel(*CGOptLevelOrNone);
   }
   return wrap(CodeGen);
 }
Index: llvm/tools/llvm-lto2/llvm-lto2.cpp
===
--- llvm/tools/llvm-lto2/llvm-lto2.cpp
+++ llvm/tools/llvm-lto2/llvm-lto2.cpp
@@ -306,20 +306,9 @@
   Conf.Freestanding = EnableFreestanding;
   for (auto &PluginFN : PassPlugins)
 Conf.PassPlugins.push_back(PluginFN);
-  switch (CGOptLevel) {
-  case '0':
-Conf.CGOptLevel = CodeGenOpt::None;
-break;
-  case '1':
-Conf.CGOptLevel = CodeGenOpt::Less;
-break;
-  case '2':
-Conf.CGOptLevel = CodeGenOpt::Default;
-break;
-  case '3':
-Conf.CGOptLevel = CodeGenOpt::Aggressive;
-break;
-  default:
+  if (auto Level = CodeGenOpt::parseLevel(CGOptLevel)) {
+Conf.CGOptLevel = *Level;
+  } else {
 llvm::errs() << "invalid cg optimization level: " << CGOptLevel << '\n';
 return 1;
   }
Index: llvm/tools/llvm-isel-fuzzer/llvm-isel-fuzzer.cpp
===
--- llvm/tools/llvm-isel-fuzzer/llvm-isel-fuzzer.cpp
+++ llvm/tools/llvm-isel-fuzzer/llvm-isel-fuzzer.cpp
@@ -42,7 +42,7 @@
 OptLevel("O",
  cl::desc("Optimization level. [-O0, -O1, -O2, or -O3] "
   "(default = '-O2')"),
- cl::Prefix, cl::init(' '));
+ cl::Prefix, cl::init('2'));
 
 static cl::opt
 TargetTriple("mtriple", cl::desc("Override target triple for module"));
@@ -144,16 +144,12 @@
   std::string CPUStr = codegen::getCPUStr(),
   FeaturesStr = codegen::getFeaturesStr();
 
-  CodeGenOpt::Level OLvl = CodeGenOpt::Default;
-  switch (OptLevel) {
-  default:
+  CodeGenOpt::Level OLvl;
+  if (auto Level = CodeGenOpt::parseLevel(OptLevel)) {
+OLvl = *Level;
+  } else {
 errs() << argv[0] << ": invalid optimization level.\n";
 return 1;
-  case ' ': break;
-  case '0': OLvl = CodeGenOpt::None; break;
-  case '1': OLvl = CodeGenOpt::Less; break;
-  case '2': OLvl = CodeGenOpt::Default; break;
-  case '

[PATCH] D141968: [NFC] Consolidate llvm::CodeGenOpt::Level handling

2023-01-17 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: llvm/include/llvm/Support/CodeGen.h:57
+  /// Code generation optimization level.
+  enum Level : IDType {
+None = 0,  ///< -O0

This is ABI breaking, so maybe we don't want/need to define the underlying type?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141968/new/

https://reviews.llvm.org/D141968

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D141968: [NFC] Consolidate llvm::CodeGenOpt::Level handling

2023-01-17 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.
Herald added a reviewer: jdoerfert.
Herald added a subscriber: sstefan1.

I wasn't sure whether to include `getID` and unit-test it, or just drop it 
until it is used. If anyone has a preference let me know and I'll update the 
review.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141968/new/

https://reviews.llvm.org/D141968

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D141968: [NFC] Consolidate llvm::CodeGenOpt::Level handling

2023-01-17 Thread Scott Linder via Phabricator via cfe-commits

scott.linder created this revision.
Herald added subscribers: ormris, steven_wu, hiraditya.
Herald added projects: Flang, All.
scott.linder requested review of this revision.
Herald added subscribers: llvm-commits, openmp-commits, cfe-commits, jdoerfert.
Herald added projects: clang, OpenMP, LLVM.

Add free functions llvm::CodeGenOpt::{getLevel,getID,parseLevel} to
provide common implementations for functionality that has been
duplicated in many places across the codebase.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D141968

Files:
  clang/lib/CodeGen/BackendUtil.cpp
  clang/tools/clang-fuzzer/handle-llvm/handle_llvm.cpp
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
  flang/lib/Frontend/FrontendActions.cpp
  llvm/include/llvm/Support/CodeGen.h
  llvm/lib/LTO/LTOCodeGenerator.cpp
  llvm/tools/gold/gold-plugin.cpp
  llvm/tools/llc/llc.cpp
  llvm/tools/lli/lli.cpp
  llvm/tools/llvm-isel-fuzzer/llvm-isel-fuzzer.cpp
  llvm/tools/llvm-lto2/llvm-lto2.cpp
  llvm/tools/lto/lto.cpp
  openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp

Index: openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
===
--- openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
+++ openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
@@ -138,20 +138,6 @@
   return createModuleFromMemoryBuffer(MB, Context);
 }
 
-CodeGenOpt::Level getCGOptLevel(unsigned OptLevel) {
-  switch (OptLevel) {
-  case 0:
-return CodeGenOpt::None;
-  case 1:
-return CodeGenOpt::Less;
-  case 2:
-return CodeGenOpt::Default;
-  case 3:
-return CodeGenOpt::Aggressive;
-  }
-  llvm_unreachable("Invalid optimization level");
-}
-
 OptimizationLevel getOptLevel(unsigned OptLevel) {
   switch (OptLevel) {
   case 0:
@@ -169,7 +155,7 @@
 Expected>
 createTargetMachine(Module &M, std::string CPU, unsigned OptLevel) {
   Triple TT(M.getTargetTriple());
-  CodeGenOpt::Level CGOptLevel = getCGOptLevel(OptLevel);
+  CodeGenOpt::Level CGOptLevel = *CodeGenOpt::getLevel(OptLevel);
 
   std::string Msg;
   const Target *T = TargetRegistry::lookupTarget(M.getTargetTriple(), Msg);
Index: llvm/tools/lto/lto.cpp
===
--- llvm/tools/lto/lto.cpp
+++ llvm/tools/lto/lto.cpp
@@ -528,20 +528,7 @@
 if (OptLevel < '0' || OptLevel > '3')
   report_fatal_error("Optimization level must be between 0 and 3");
 CodeGen->setOptLevel(OptLevel - '0');
-switch (OptLevel) {
-case '0':
-  CodeGen->setCodeGenOptLevel(CodeGenOpt::None);
-  break;
-case '1':
-  CodeGen->setCodeGenOptLevel(CodeGenOpt::Less);
-  break;
-case '2':
-  CodeGen->setCodeGenOptLevel(CodeGenOpt::Default);
-  break;
-case '3':
-  CodeGen->setCodeGenOptLevel(CodeGenOpt::Aggressive);
-  break;
-}
+CodeGen->setCodeGenOptLevel(*CodeGenOpt::getLevel(OptLevel - '0'));
   }
   return wrap(CodeGen);
 }
Index: llvm/tools/llvm-lto2/llvm-lto2.cpp
===
--- llvm/tools/llvm-lto2/llvm-lto2.cpp
+++ llvm/tools/llvm-lto2/llvm-lto2.cpp
@@ -306,20 +306,9 @@
   Conf.Freestanding = EnableFreestanding;
   for (auto &PluginFN : PassPlugins)
 Conf.PassPlugins.push_back(PluginFN);
-  switch (CGOptLevel) {
-  case '0':
-Conf.CGOptLevel = CodeGenOpt::None;
-break;
-  case '1':
-Conf.CGOptLevel = CodeGenOpt::Less;
-break;
-  case '2':
-Conf.CGOptLevel = CodeGenOpt::Default;
-break;
-  case '3':
-Conf.CGOptLevel = CodeGenOpt::Aggressive;
-break;
-  default:
+  if (auto Level = CodeGenOpt::parseLevel(CGOptLevel)) {
+Conf.CGOptLevel = *Level;
+  } else {
 llvm::errs() << "invalid cg optimization level: " << CGOptLevel << '\n';
 return 1;
   }
Index: llvm/tools/llvm-isel-fuzzer/llvm-isel-fuzzer.cpp
===
--- llvm/tools/llvm-isel-fuzzer/llvm-isel-fuzzer.cpp
+++ llvm/tools/llvm-isel-fuzzer/llvm-isel-fuzzer.cpp
@@ -41,8 +41,8 @@
 static cl::opt
 OptLevel("O",
  cl::desc("Optimization level. [-O0, -O1, -O2, or -O3] "
-  "(default = '-O2')"),
- cl::Prefix, cl::init(' '));
+  "(default = '-O0')"),
+ cl::Prefix, cl::init('2'));
 
 static cl::opt
 TargetTriple("mtriple", cl::desc("Override target triple for module"));
@@ -144,16 +144,12 @@
   std::string CPUStr = codegen::getCPUStr(),
   FeaturesStr = codegen::getFeaturesStr();
 
-  CodeGenOpt::Level OLvl = CodeGenOpt::Default;
-  switch (OptLevel) {
-  default:
+  CodeGenOpt::Level OLvl;
+  if (auto Level = CodeGenOpt::parseLevel(OptLevel)) {
+OLvl = *Level;
+  } else {
 errs() << argv[0] << ": invalid optimization level.\n";
 return 1;
-  case ' ': break;
-  case '0': OLvl = CodeGenOpt::None; break;
-  case '1': OLvl = CodeGenOpt::Less; break;
-  ca

[PATCH] D88978: [WIP] Attach debug intrinsics to allocas, and use correct address space

2022-12-14 Thread Scott Linder via Phabricator via cfe-commits

scott.linder abandoned this revision.
scott.linder added a comment.

I'll open a new review as part of upstreaming all of the debug info work


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88978/new/

https://reviews.llvm.org/D88978

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D131717: [ADT] Replace STLForwardCompat.h's C++17 equivalents

2022-08-11 Thread Scott Linder via Phabricator via cfe-commits

scott.linder accepted this revision.
scott.linder added a comment.

LGTM, thank you!

I have some pending changes which would add C++20 "compat" types, but removing 
it until those land sounds fine to me.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131717/new/

https://reviews.llvm.org/D131717

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D123878: [AMDGPU] Add remarks to output some resource usage

2022-07-14 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:1207
+  auto EmitResourceUsageRemark = [&](StringRef RemarkName,
+ StringRef RemarkLabel, auto &&Argument) {
+// Add an indent for every line besides the line with the kernel name. This

arsenm wrote:
> Why &&?
Looking at the `ore::NV` constructors I would vote that this just be by value, 
i.e. `auto Argument`. Everything that can be an `Argument` is a small "prefer 
passing by value" type.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123878/new/

https://reviews.llvm.org/D123878

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D123878: [AMDGPU] Add remarks to output some resource usage

2022-06-28 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

I will let others comment, but I think this is a perfectly reasonable 
alternative, and I much prefer using indentation over the delimiter-remark 
approach. LGTM, assuming nobody else objects


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123878/new/

https://reviews.llvm.org/D123878

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D127923: [Diagnostics] Accept newline and format diag opts on first line

2022-06-17 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: clang/lib/StaticAnalyzer/Core/TextDiagnostics.cpp:99-103
+  size_t NewlinePos = PD->getShortDescription().find_first_of('\n');
+  if (NewlinePos != std::string::npos)
+OutStr = PD->getShortDescription().str().insert(NewlinePos, 
WarningMsg);
+  else
+OutStr = (PD->getShortDescription() + WarningMsg).str();

scott.linder wrote:
> It's frustrating that `std::string::insert` doesn't follow the pattern of 
> `npos==end-of-string`, but I'd still suggest only doing the actual 
> insert/append once
Or to be more precise, I would try to only have one piece of code doing the 
insert/append


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127923/new/

https://reviews.llvm.org/D127923

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D127923: [Diagnostics] Accept newline and format diag opts on first line

2022-06-17 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

Thank you for the patch! I will leave it to others with more knowledge of the 
history of diagnostics to comment on whether the premise is acceptable, but I 
have a few technical comments




Comment at: clang/lib/Frontend/TextDiagnosticPrinter.cpp:119-135
+  // If the diagnostic message is formatted into multiple lines, split the
+  // first line and feed it into printDiagnosticOptions so that we print the
+  // options only on the first line instead of the last line.
+  SmallString<100> InitStr;
+  size_t NewlinePos = OutStr.find_first_of('\n');
+  if (NewlinePos != StringRef::npos) {
+for (size_t I = 0; I != NewlinePos; ++I)

Nit: I'd remove the `if`s and just use the behavior of `slice(npos, ...)` 
returning the empty string

I also would reword "split the first line and feed it into 
printDiagnosticOptions", as it could be interpreted as meaning the first line 
is an input to `printDiagnosticOptions`



Comment at: clang/lib/StaticAnalyzer/Core/TextDiagnostics.cpp:99-103
+  size_t NewlinePos = PD->getShortDescription().find_first_of('\n');
+  if (NewlinePos != std::string::npos)
+OutStr = PD->getShortDescription().str().insert(NewlinePos, 
WarningMsg);
+  else
+OutStr = (PD->getShortDescription() + WarningMsg).str();

It's frustrating that `std::string::insert` doesn't follow the pattern of 
`npos==end-of-string`, but I'd still suggest only doing the actual 
insert/append once



Comment at: clang/lib/StaticAnalyzer/Core/TextDiagnostics.cpp:100
+  size_t NewlinePos = PD->getShortDescription().find_first_of('\n');
+  if (NewlinePos != std::string::npos)
+OutStr = PD->getShortDescription().str().insert(NewlinePos, 
WarningMsg);

I believe they are guaranteed to be equivalent, but this should be 
`StringRef::npos` or you should allocate the `std::string` sooner



Comment at: clang/lib/StaticAnalyzer/Core/TextDiagnostics.cpp:101
+  if (NewlinePos != std::string::npos)
+OutStr = PD->getShortDescription().str().insert(NewlinePos, 
WarningMsg);
+  else

I think you incur an extra copy assignment here, as the `insert` returns an 
lvalue-reference


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127923/new/

https://reviews.llvm.org/D127923

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D123878: [AMDGPU] Add remarks to output some resource usage

2022-05-13 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

In D123878#3507378 , @vangthao wrote:

> I am not sure if allowing clang to accept newlines is a good idea. It seems 
> like clang wants to know what type of message is being outputted. For example 
> whether this is a remark, warning, etc. but allowing for a diagnostic to 
> output their own newline makes it ambiguous where exactly that output is 
> coming from.

It already supports newlines in any diagnostic which doesn't use the trivial 
format string `"%0"`, and at least `clang/test/Misc/diag-line-wrapping.cpp` 
explicitly tests this behavior.

It seems reasonable that clang could add a prefix or indentation scheme while 
emitting multi-line diagnostics to the terminal, to help with the ambiguity 
issue. For example instead of the current output:

  clang/test/Misc/diag-line-wrapping.cpp:8:14: error: non-static member 'f' 
found in multiple base-class subobjects of type 'B':
  struct DD -> struct D1 -> struct B
  struct DD -> struct D2 -> struct B

Maybe we could have something like:

  clang/test/Misc/diag-line-wrapping.cpp:8:14: error: ...
  ...: non-static member 'f' found in multiple base-class subobjects of type 
'B':
  ...: struct DD -> struct D1 -> struct B
  ...: struct DD -> struct D2 -> struct B

I don't know if changing this kind of output is a breaking change, though? I do 
know some tooling parses this output.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123878/new/

https://reviews.llvm.org/D123878

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D123878: [AMDGPU] Add remarks to output some resource usage

2022-05-11 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

In D123878#3506500 , @afanfa wrote:

> If possible, I would like to keep some kind of delimiter. I like the idea of 
> having it at the beginning and at the end of the section. The best option 
> would be to convince clang to print new lines.

It seems like the stripping of non-printable characters is intentional, but 
only for diagnostics that don't use the TableGen based diagnostic formatting 
scheme in clang. I'm not sure exactly why, but regardless there is precedent 
for newlines in other messages.

To get the newlines to hit the terminal for our case the following patch is 
enough:

  --- a/clang/lib/Basic/Diagnostic.cpp
  +++ b/clang/lib/Basic/Diagnostic.cpp
  @@ -812,7 +812,7 @@ FormatDiagnostic(const char *DiagStr, const char *DiagEnd,
 getArgKind(0) == DiagnosticsEngine::ak_std_string) {   

   const std::string &S = getArgStdStr(0);  

   for (char c : S) {
  -  if (llvm::sys::locale::isPrint(c) || c == '\t') {
  +  if (llvm::sys::locale::isPrint(c) || c == '\t' || c == '\n') {
   OutStr.push_back(c);
 }
   }

To get the right indentation inserted for the extra lines the `TextDiagnostic` 
consumer needs to also be updated, but that should be a small change. Only one 
test breaks, and it seems useful for it to pick up the new behavior anyway.

The only other potential conflict is that in `TextDiagnostic` specifically 
there is also the Clang option `-fmessage-length=N` which will forcibly wrap 
diagnostic messages on word-boundaries (although never breaking up a word 
across lines). It seems to only apply to text preceding the first newline in 
the message, so it is likely just a non-issue?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123878/new/

https://reviews.llvm.org/D123878

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D123878: [AMDGPU] Add remarks to output some resource usage

2022-05-02 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

Even with newlines forced via extra remarks, I'm not a big fan of the 
"---" remark; it doesn't interact well with other random 
remarks in the output, for example when I enable all remarks using the pattern 
'.*' I see:

  remark: foo.cl:27:0: AMDGPU DAG->DAG Pattern Instruction Selection: Function: 
test_kernel: MI Instruction count changed from 0 to 6; Delta: 6
  remark: foo.cl:27:0: 0 stack bytes in function
  remark: foo.cl:42:0: AMDGPU DAG->DAG Pattern Instruction Selection: Function: 
test_func: MI Instruction count changed from 0 to 4; Delta: 4
  remark: foo.cl:42:0: 0 stack bytes in function
  remark: foo.cl:42:0: SI insert wait instructions: Function: test_func: MI 
Instruction count changed from 4 to 5; Delta: 1
  remark: foo.cl:8:0: AMDGPU DAG->DAG Pattern Instruction Selection: Function: 
empty_kernel: MI Instruction count changed from 0 to 1; Delta: 1
  remark: foo.cl:8:0: 0 stack bytes in function
  remark: foo.cl:52:0: AMDGPU DAG->DAG Pattern Instruction Selection: Function: 
empty_func: MI Instruction count changed from 0 to 1; Delta: 1
  remark: foo.cl:52:0: 0 stack bytes in function
  remark: foo.cl:52:0: SI insert wait instructions: Function: empty_func: MI 
Instruction count changed from 1 to 2; Delta: 1
  remark: :0:0: BasicBlock:
  : 2
  
  remark: foo.cl:27:0: 6 instructions in function
  remark: foo.cl:27:0: Kernel Name: test_kernel
  remark: foo.cl:27:0: SGPRs: 24
  remark: foo.cl:27:0: VGPRs: 9
  remark: foo.cl:27:0: AGPRs: 43
  remark: foo.cl:27:0: ScratchSize [bytes/thread]: 0
  remark: foo.cl:27:0: Occupancy [waves/SIMD]: 5
  remark: foo.cl:27:0: SGPRs Spill: 0
  remark: foo.cl:27:0: VGPRs Spill: 0
  remark: foo.cl:27:0: LDS Size [bytes/block]: 512
  remark: foo.cl:27:0: --
  remark: :0:0: BasicBlock:
  : 2
  
  remark: foo.cl:42:0: 5 instructions in function
  remark: foo.cl:42:0: Kernel Name: test_func
  remark: foo.cl:42:0: SGPRs: 0
  remark: foo.cl:42:0: VGPRs: 0
  remark: foo.cl:42:0: AGPRs: 0
  remark: foo.cl:42:0: ScratchSize [bytes/thread]: 0
  remark: foo.cl:42:0: Occupancy [waves/SIMD]: 0
  remark: foo.cl:42:0: SGPRs Spill: 0
  remark: foo.cl:42:0: VGPRs Spill: 0
  remark: foo.cl:42:0: --
  remark: :0:0: BasicBlock:
  : 1

If we do keep the delimiter remarks, can we have one at the beginning as well? 
At least then other remarks don't appear to "bleed" into the new block of 
remarks.




Comment at: llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:1178-1266
+
+void AMDGPUAsmPrinter::emitResourceUsageRemarks(
+const MachineFunction &MF, const SIProgramInfo &CurrentProgramInfo,
+bool isModuleEntryFunction, bool hasMAIInsts) {
+  if (!ORE)
+return;
+

vangthao wrote:
> arsenm wrote:
> > scott.linder wrote:
> > > 
> > I'm sort of surprised the all in one form doesn't come out in something 
> > more parsable? It might be worth looking at IR level remarks, since there's 
> > probably more usage of them. Are there any existing uses in clang that do 
> > something meaningful by parsing out different parts?
> We could check if the specific remark is enabled to avoid cluttering YAML 
> output:
> ```
>   const char *Name = "kernel-resource-usage";
>   LLVMContext &Ctx = MF.getFunction().getContext();
>   if (!Ctx.getDiagHandlerPtr()->isAnalysisRemarkEnabled(Name))
> return;
> ```
> 
> I do not think using spaces to format the output will work for us. Most of 
> the IR level remarks seems to be related specifically to their pass and/or 
> optimization thus are quite short so readability is not as impacted. We are 
> outputting a decent chunk of information and need some readability for the 
> user.
> 
> Parsing out different parts like this is mostly a workaround for clang 
> ignoring newlines. I am not aware of any other uses doing it like this since 
> many of them are short and uses spaces for formatting.
> Are there any existing uses in clang that do something meaningful by parsing 
> out different parts?

I would expect this sort of thing to happen before it gets serialized to an 
unstructured string, or for it to happen on a structured output like YAML.

> Parsing out different parts like this is mostly a workaround for clang 
> ignoring newlines.

Can we fix clang to respect newlines?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123878/new/

https://reviews.llvm.org/D123878

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D123878: [AMDGPU] Add remarks to output some resource usage

2022-04-26 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:1178-1266
+
+void AMDGPUAsmPrinter::emitResourceUsageRemarks(
+const MachineFunction &MF, const SIProgramInfo &CurrentProgramInfo,
+bool isModuleEntryFunction, bool hasMAIInsts) {
+  if (!ORE)
+return;
+





Comment at: llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:596
+   MF.getFunction().getSubprogram(), &MF.front())
+   << "--";
+  });

vangthao wrote:
> scott.linder wrote:
> > This seems like the wrong place to segment the output; could this be made a 
> > part of the emitter itself?
> > 
> > Or maybe better yet, could all of these values be collected into a single 
> > remark? That seems to be how e.g. `llvm/lib/CodeGen/RegAllocGreedy.cpp` 
> > does it:
> > 
> > ```
> > Pass:regalloc   
> >  Name:SpillReloadCopies 
> >   Function:
> > f   
> > Args:   
> >- NumSpills:   '1'   
> > - 
> > String:  ' spills ' 
> >- TotalSpillsCost: '1.00e+00'
> > - String:  
> > ' total spills cost '   
> >   - NumReloads:  '1'
> >- String:  ' reloads '   
> > - 
> > TotalReloadsCost: '1.00e+00'
> >- String:  ' total reloads cost '
> > - String:  
> > generated in function
> > ```
> We also want to output this in a readable format for the frontend. Collecting 
> it all into a single remark seems to break the output format since clang 
> seems to ignore all newlines from a diagnostic remark.
Rather than use newlines, RegAllocGreedy uses spaces; we can debate aesthetics, 
but I feel like we should have a compelling reason before we choose our own 
format.

For example, with RegAllocGreedy you can get output along the lines of:

```
foo/bar/baz.cpp:42:1: remark: 10 spills 100 total spills cost 7 folded spills 
70 total folded spills cost 22 reloads 44 total reloads cost 77 folded reloads 
120 total folded reloads cost 1 zero cost folded reloads 78 virtual registers 
copies 111 total copies cost
void foo()
^
```

RegAllocFast appears to be the original use-case that caused the machine 
remarks to be invented, and it has been updated recently 
(https://reviews.llvm.org/D100020) so I don't suspect this is just some legacy 
cruft.

If we follow the same approach we get something like:

```
  void AMDGPUAsmPrinter::emitResourceUsageRemarks(
  const MachineFunction &MF, const SIProgramInfo &CurrentProgramInfo) {
if (!ORE)
  return;

ORE->emit([&]() {
  return MachineOptimizationRemarkAnalysis(
 "kernel-resource-usage", "ResourceUsage",
 MF.getFunction().getSubprogram(), &MF.front())
 << ore::NV("NumSGPR", CurrentProgramInfo.NumSGPR) << " SGPRs "
 << ore::NV("NumVGPR", CurrentProgramInfo.NumArchVGPR) << " VGPRs "
 << ore::NV("NumAGPR", CurrentProgramInfo.NumAccVGPR) << " AGPRs "
 << ore::NV("ScratchSize", CurrentProgramInfo.ScratchSize)
 << " scratch bytes/thread "
 << ore::NV("Occupancy", CurrentProgramInfo.Occupancy)
 << " occupancy waves/SIMD "
 << ore::NV("SGPRSpill", CurrentProgramInfo.SGPRSpill)
 << " SGPR spills "
 << ore::NV("VGPRSpill", CurrentProgramInfo.VGPRSpill)
 << " VGPR spills " << ore::NV("BytesLDS", 
CurrentProgramInfo.LDSSize)
 << " LDS size bytes/block ";
}); 


  }
```

which produces:

```
clang/test/Frontend/amdgcn-machine-analysis-remarks.cl:14:1: remark: 9 SGPRs 10 
VGPRs 12 AGPRs 0 scratch bytes/thread 10 occupancy waves/SIMD 0 SGPR spills 0 
VGPR spills 0 LDS size bytes/block  [-Rpass-analysis=kernel-resource-usage]
__kernel void foo() {
^
```

That seems reasonable

[PATCH] D121951: [AMDGPU][OpenCL] Remove "printf and hostcall" diagnostic

2022-04-05 Thread Scott Linder via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG09f33a430b72: [AMDGPU][OpenCL] Remove "printf and 
hostcall" diagnostic (authored by scott.linder).

Changed prior to commit:
  https://reviews.llvm.org/D121951?vs=418922&id=420599#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121951/new/

https://reviews.llvm.org/D121951

Files:
  clang/test/CodeGenOpenCL/amdgpu-printf.cl
  llvm/docs/AMDGPUUsage.rst
  llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
  llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
  llvm/test/CodeGen/AMDGPU/opencl-printf-and-hostcall.ll
  llvm/test/CodeGen/AMDGPU/opencl-printf-no-hostcall.ll

Index: llvm/test/CodeGen/AMDGPU/opencl-printf-no-hostcall.ll
===
--- llvm/test/CodeGen/AMDGPU/opencl-printf-no-hostcall.ll
+++ /dev/null
@@ -1,18 +0,0 @@
-; RUN: not opt -S -mtriple=amdgcn-unknown-unknown -amdgpu-printf-runtime-binding < %s 2>&1 | FileCheck %s
-
-@.str = private unnamed_addr addrspace(2) constant [6 x i8] c"%s:%d\00", align 1
-
-define amdgpu_kernel void @test_kernel(i32 %n) {
-entry:
-  %str = alloca [9 x i8], align 1
-  %arraydecay = getelementptr inbounds [9 x i8], [9 x i8]* %str, i32 0, i32 0
-  %call1 = call i32 (i8 addrspace(2)*, ...) @printf(i8 addrspace(2)* getelementptr inbounds ([6 x i8], [6 x i8] addrspace(2)* @.str, i32 0, i32 0), i8* %arraydecay, i32 %n)
-  %call2 = call <2 x i64> (i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64) @__ockl_hostcall_internal(i8* undef, i32 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9)
-  ret void
-}
-
-declare i32 @printf(i8 addrspace(2)*, ...)
-
-declare <2 x i64> @__ockl_hostcall_internal(i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64)
-
-; CHECK: error: Cannot use both printf and hostcall in the same module
Index: llvm/test/CodeGen/AMDGPU/opencl-printf-and-hostcall.ll
===
--- /dev/null
+++ llvm/test/CodeGen/AMDGPU/opencl-printf-and-hostcall.ll
@@ -0,0 +1,19 @@
+; RUN: opt -S -mtriple=amdgcn-unknown-unknown -amdgpu-printf-runtime-binding < %s 2>&1 | FileCheck %s
+
+@.str = private unnamed_addr addrspace(4) constant [6 x i8] c"%s:%d\00", align 1
+
+define amdgpu_kernel void @test_kernel(i32 %n) {
+entry:
+  %str = alloca [9 x i8], align 1, addrspace(5)
+  %arraydecay = getelementptr inbounds [9 x i8], [9 x i8] addrspace(5)* %str, i32 0, i32 0
+  %call1 = call i32 (i8 addrspace(4)*, ...) @printf(i8 addrspace(4)* getelementptr inbounds ([6 x i8], [6 x i8] addrspace(4)* @.str, i32 0, i32 0), i8 addrspace(5)* %arraydecay, i32 %n)
+  %call2 = call <2 x i64> (i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64) @__ockl_hostcall_internal(i8* undef, i32 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9)
+  ret void
+}
+
+declare i32 @printf(i8 addrspace(4)*, ...)
+
+declare <2 x i64> @__ockl_hostcall_internal(i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64)
+
+; CHECK-NOT: error:
+; CHECK-NOT: warning:
Index: llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
@@ -563,15 +563,6 @@
   if (Printfs.empty())
 return false;
 
-  if (auto HostcallFunction = M.getFunction("__ockl_hostcall_internal")) {
-for (auto &U : HostcallFunction->uses()) {
-  if (auto *CI = dyn_cast(U.getUser())) {
-M.getContext().emitError(
-CI, "Cannot use both printf and hostcall in the same module");
-  }
-}
-  }
-
   TD = &M.getDataLayout();
 
   return lowerPrintfForGpu(M);
Index: llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
@@ -400,17 +400,15 @@
   auto Int8PtrTy = Type::getInt8PtrTy(Func.getContext(),
   AMDGPUAS::GLOBAL_ADDRESS);
 
-  // Emit "printf buffer" argument if printf is used, otherwise emit dummy
-  // "none" argument.
   if (HiddenArgNumBytes >= 32) {
+// We forbid the use of features requiring hostcall when compiling OpenCL
+// before code object V5, which makes the mutual exclusion between the
+// "printf buffer" and "hostcall buffer" here sound.
 if (Func.getParent()->getNamedMetadata("llvm.printf.fmts"))
   emitKernelArg(DL, Int8PtrTy, Align(8), ValueKind::HiddenPrintfBuffer);
-else if (!Func.hasFnAttribute("amdgpu-no-hostcall-ptr")) {
-  // The printf runtime binding pass should have ensured that hostcall and
-  // printf are not used in the same module.
-  assert(!Func.getParent()->getNamedMetadata("llvm.printf.fmts"));
+else if (!Func.hasFnAttribute("amdgpu-no-hostcal

[PATCH] D121951: [AMDGPU][OpenCL] Remove "printf and hostcall" diagnostic

2022-03-29 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

In D121951#3414271 , @sameerds wrote:

> In D121951#3411856 , @scott.linder 
> wrote:
>
>> @yaxunl Does excluding device-libs via COV_None make sense?
>>
>> @sameerds Would you still rather we just not add this attribute in the 
>> frontend at all? I'm OK with it if that is the consensus
>
> Yes, I still think there is no need to emit that attribute in the frontend. 
> It will always be inferred by the Attributor when optimization is enabled. 
> This also eliminates the check for COV_None and there seems to be some 
> uncertainty about COV_None anyway. This also eliminates the updates to all 
> the tests where the no-hostcall-ptr attribute does not actually matter. If 
> ever we need to check if hostcall is being used on OpenCL and COV < 5, we 
> should do it per feature and inform the user appropriately.

OK, that makes sense to me. I think I was not interpreting the purpose of the 
attributes correctly, but if they are essentially just optimization remarks 
then I don't have any issue with not ensuring they are present at -O0.

I updated the patch, let me know what you think.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121951/new/

https://reviews.llvm.org/D121951

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D121951: [AMDGPU][OpenCL] Remove "printf and hostcall" diagnostic

2022-03-29 Thread Scott Linder via Phabricator via cfe-commits

scott.linder updated this revision to Diff 418922.
scott.linder retitled this revision from "[AMDGPU][OpenCL] Add 
"amdgpu-no-hostcall-ptr" in Clang codegen pre-COV_5" to "[AMDGPU][OpenCL] 
Remove "printf and hostcall" diagnostic".
scott.linder edited the summary of this revision.
scott.linder added a comment.

Remove frontend changes, and just drop the diagnostic in the backend. Update
some comments and naming.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121951/new/

https://reviews.llvm.org/D121951

Files:
  clang/test/CodeGenOpenCL/amdgpu-printf.cl
  llvm/docs/AMDGPUUsage.rst
  llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
  llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
  llvm/test/CodeGen/AMDGPU/opencl-printf-and-hostcall.ll
  llvm/test/CodeGen/AMDGPU/opencl-printf-no-hostcall.ll

Index: llvm/test/CodeGen/AMDGPU/opencl-printf-no-hostcall.ll
===
--- llvm/test/CodeGen/AMDGPU/opencl-printf-no-hostcall.ll
+++ /dev/null
@@ -1,18 +0,0 @@
-; RUN: not opt -S -mtriple=amdgcn-unknown-unknown -amdgpu-printf-runtime-binding < %s 2>&1 | FileCheck %s
-
-@.str = private unnamed_addr addrspace(2) constant [6 x i8] c"%s:%d\00", align 1
-
-define amdgpu_kernel void @test_kernel(i32 %n) {
-entry:
-  %str = alloca [9 x i8], align 1
-  %arraydecay = getelementptr inbounds [9 x i8], [9 x i8]* %str, i32 0, i32 0
-  %call1 = call i32 (i8 addrspace(2)*, ...) @printf(i8 addrspace(2)* getelementptr inbounds ([6 x i8], [6 x i8] addrspace(2)* @.str, i32 0, i32 0), i8* %arraydecay, i32 %n)
-  %call2 = call <2 x i64> (i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64) @__ockl_hostcall_internal(i8* undef, i32 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9)
-  ret void
-}
-
-declare i32 @printf(i8 addrspace(2)*, ...)
-
-declare <2 x i64> @__ockl_hostcall_internal(i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64)
-
-; CHECK: error: Cannot use both printf and hostcall in the same module
Index: llvm/test/CodeGen/AMDGPU/opencl-printf-and-hostcall.ll
===
--- /dev/null
+++ llvm/test/CodeGen/AMDGPU/opencl-printf-and-hostcall.ll
@@ -0,0 +1,19 @@
+; RUN: opt -S -mtriple=amdgcn-unknown-unknown -amdgpu-printf-runtime-binding < %s 2>&1 | FileCheck %s
+
+@.str = private unnamed_addr addrspace(4) constant [6 x i8] c"%s:%d\00", align 1
+
+define amdgpu_kernel void @test_kernel(i32 %n) {
+entry:
+  %str = alloca [9 x i8], align 1, addrspace(5)
+  %arraydecay = getelementptr inbounds [9 x i8], [9 x i8] addrspace(5)* %str, i32 0, i32 0
+  %call1 = call i32 (i8 addrspace(4)*, ...) @printf(i8 addrspace(4)* getelementptr inbounds ([6 x i8], [6 x i8] addrspace(4)* @.str, i32 0, i32 0), i8 addrspace(5)* %arraydecay, i32 %n)
+  %call2 = call <2 x i64> (i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64) @__ockl_hostcall_internal(i8* undef, i32 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9)
+  ret void
+}
+
+declare i32 @printf(i8 addrspace(4)*, ...)
+
+declare <2 x i64> @__ockl_hostcall_internal(i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64)
+
+; CHECK-NOT: error:
+; CHECK-NOT: warning:
Index: llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
@@ -563,15 +563,6 @@
   if (Printfs.empty())
 return false;
 
-  if (auto HostcallFunction = M.getFunction("__ockl_hostcall_internal")) {
-for (auto &U : HostcallFunction->uses()) {
-  if (auto *CI = dyn_cast(U.getUser())) {
-M.getContext().emitError(
-CI, "Cannot use both printf and hostcall in the same module");
-  }
-}
-  }
-
   TD = &M.getDataLayout();
 
   return lowerPrintfForGpu(M);
Index: llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
@@ -401,17 +401,15 @@
   auto Int8PtrTy = Type::getInt8PtrTy(Func.getContext(),
   AMDGPUAS::GLOBAL_ADDRESS);
 
-  // Emit "printf buffer" argument if printf is used, otherwise emit dummy
-  // "none" argument.
   if (HiddenArgNumBytes >= 32) {
+// We forbid the use of features requiring hostcall when compiling OpenCL
+// before code object V5, which makes the mutual exclusion between the
+// "printf buffer" and "hostcall buffer" here sound.
 if (Func.getParent()->getNamedMetadata("llvm.printf.fmts"))
   emitKernelArg(DL, Int8PtrTy, Align(8), ValueKind::HiddenPrintfBuffer);
-else if (!Func.hasFnAttribute("amdgpu-no-hostcall-ptr")) {
-  // The printf runtime binding pass should have ensured that hostcall and
-  // printf are not used in the same module.
-  assert(!Func.getParent()->getNamedM

[PATCH] D122669: [AMDGPU][OpenCL] Remove "printf and hostcall" diagnostic

2022-03-29 Thread Scott Linder via Phabricator via cfe-commits

scott.linder created this revision.
Herald added subscribers: hsmhsm, foad, Naghasan, ldrumm, kerbowa, hiraditya, 
t-tye, Anastasia, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl, arsenm.
Herald added a project: All.
scott.linder requested review of this revision.
Herald added subscribers: llvm-commits, cfe-commits, wdng.
Herald added projects: clang, LLVM.

The diagnostic is unreliable, and triggers even for dead uses of
hostcall that may exist when linking the device-libs at lower
optimization levels.

Eliminate the diagnostic, and directly document the limitation for
OpenCL before code object V5.

Make some NFC changes to clarify the related code in the
MetadataStreamer.

Add a clang test to tie OCL sources containing printf to the backend IR
tests for this situation.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D122669

Files:
  clang/test/CodeGenOpenCL/amdgpu-printf.cl
  llvm/docs/AMDGPUUsage.rst
  llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
  llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
  llvm/test/CodeGen/AMDGPU/opencl-printf-and-hostcall.ll
  llvm/test/CodeGen/AMDGPU/opencl-printf-no-hostcall.ll

Index: llvm/test/CodeGen/AMDGPU/opencl-printf-no-hostcall.ll
===
--- llvm/test/CodeGen/AMDGPU/opencl-printf-no-hostcall.ll
+++ /dev/null
@@ -1,18 +0,0 @@
-; RUN: not opt -S -mtriple=amdgcn-unknown-unknown -amdgpu-printf-runtime-binding < %s 2>&1 | FileCheck %s
-
-@.str = private unnamed_addr addrspace(2) constant [6 x i8] c"%s:%d\00", align 1
-
-define amdgpu_kernel void @test_kernel(i32 %n) {
-entry:
-  %str = alloca [9 x i8], align 1
-  %arraydecay = getelementptr inbounds [9 x i8], [9 x i8]* %str, i32 0, i32 0
-  %call1 = call i32 (i8 addrspace(2)*, ...) @printf(i8 addrspace(2)* getelementptr inbounds ([6 x i8], [6 x i8] addrspace(2)* @.str, i32 0, i32 0), i8* %arraydecay, i32 %n)
-  %call2 = call <2 x i64> (i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64) @__ockl_hostcall_internal(i8* undef, i32 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9)
-  ret void
-}
-
-declare i32 @printf(i8 addrspace(2)*, ...)
-
-declare <2 x i64> @__ockl_hostcall_internal(i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64)
-
-; CHECK: error: Cannot use both printf and hostcall in the same module
Index: llvm/test/CodeGen/AMDGPU/opencl-printf-and-hostcall.ll
===
--- /dev/null
+++ llvm/test/CodeGen/AMDGPU/opencl-printf-and-hostcall.ll
@@ -0,0 +1,19 @@
+; RUN: opt -S -mtriple=amdgcn-unknown-unknown -amdgpu-printf-runtime-binding < %s 2>&1 | FileCheck %s
+
+@.str = private unnamed_addr addrspace(4) constant [6 x i8] c"%s:%d\00", align 1
+
+define amdgpu_kernel void @test_kernel(i32 %n) {
+entry:
+  %str = alloca [9 x i8], align 1, addrspace(5)
+  %arraydecay = getelementptr inbounds [9 x i8], [9 x i8] addrspace(5)* %str, i32 0, i32 0
+  %call1 = call i32 (i8 addrspace(4)*, ...) @printf(i8 addrspace(4)* getelementptr inbounds ([6 x i8], [6 x i8] addrspace(4)* @.str, i32 0, i32 0), i8 addrspace(5)* %arraydecay, i32 %n)
+  %call2 = call <2 x i64> (i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64) @__ockl_hostcall_internal(i8* undef, i32 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9)
+  ret void
+}
+
+declare i32 @printf(i8 addrspace(4)*, ...)
+
+declare <2 x i64> @__ockl_hostcall_internal(i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64)
+
+; CHECK-NOT: error:
+; CHECK-NOT: warning:
Index: llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
@@ -563,15 +563,6 @@
   if (Printfs.empty())
 return false;
 
-  if (auto HostcallFunction = M.getFunction("__ockl_hostcall_internal")) {
-for (auto &U : HostcallFunction->uses()) {
-  if (auto *CI = dyn_cast(U.getUser())) {
-M.getContext().emitError(
-CI, "Cannot use both printf and hostcall in the same module");
-  }
-}
-  }
-
   TD = &M.getDataLayout();
 
   return lowerPrintfForGpu(M);
Index: llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
@@ -401,17 +401,15 @@
   auto Int8PtrTy = Type::getInt8PtrTy(Func.getContext(),
   AMDGPUAS::GLOBAL_ADDRESS);
 
-  // Emit "printf buffer" argument if printf is used, otherwise emit dummy
-  // "none" argument.
   if (HiddenArgNumBytes >= 32) {
+// We forbid the use of features requiring hostcall when compiling OpenCL
+// before code object V5, which makes the mutual exclusion between the
+// "printf buffer" and "hostcall buffer" here sound.
 if (Func.getParent()->getNamedMetadata("llvm.printf.fmts"))
   em

[PATCH] D121951: [AMDGPU][OpenCL] Add "amdgpu-no-hostcall-ptr" in Clang codegen pre-COV_5

2022-03-28 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

@yaxunl Does excluding device-libs via COV_None make sense?

@sameerds Would you still rather we just not add this attribute in the frontend 
at all? I'm OK with it if that is the consensus


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121951/new/

https://reviews.llvm.org/D121951

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D121951: [AMDGPU][OpenCL] Add "amdgpu-no-hostcall-ptr" in Clang codegen pre-COV_5

2022-03-28 Thread Scott Linder via Phabricator via cfe-commits

scott.linder updated this revision to Diff 418635.
scott.linder added a comment.

Do not add the no-hostcall attribute for "COV_None" modules, to allow for
the OpenCL implementation of hostcall in the device-libs.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121951/new/

https://reviews.llvm.org/D121951

Files:
  clang/lib/CodeGen/TargetInfo.cpp
  clang/test/CodeGenOpenCL/amdgpu-attrs.cl
  clang/test/CodeGenOpenCL/amdgpu-printf.cl
  llvm/docs/AMDGPUUsage.rst
  llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
  llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
  llvm/test/CodeGen/AMDGPU/hsa-metadata-from-llvm-ir-full-v3.ll
  llvm/test/CodeGen/AMDGPU/opencl-printf-and-hostcall.ll
  llvm/test/CodeGen/AMDGPU/opencl-printf-no-hostcall.ll

Index: llvm/test/CodeGen/AMDGPU/opencl-printf-no-hostcall.ll
===
--- llvm/test/CodeGen/AMDGPU/opencl-printf-no-hostcall.ll
+++ /dev/null
@@ -1,18 +0,0 @@
-; RUN: not opt -S -mtriple=amdgcn-unknown-unknown -amdgpu-printf-runtime-binding < %s 2>&1 | FileCheck %s
-
-@.str = private unnamed_addr addrspace(2) constant [6 x i8] c"%s:%d\00", align 1
-
-define amdgpu_kernel void @test_kernel(i32 %n) {
-entry:
-  %str = alloca [9 x i8], align 1
-  %arraydecay = getelementptr inbounds [9 x i8], [9 x i8]* %str, i32 0, i32 0
-  %call1 = call i32 (i8 addrspace(2)*, ...) @printf(i8 addrspace(2)* getelementptr inbounds ([6 x i8], [6 x i8] addrspace(2)* @.str, i32 0, i32 0), i8* %arraydecay, i32 %n)
-  %call2 = call <2 x i64> (i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64) @__ockl_hostcall_internal(i8* undef, i32 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9)
-  ret void
-}
-
-declare i32 @printf(i8 addrspace(2)*, ...)
-
-declare <2 x i64> @__ockl_hostcall_internal(i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64)
-
-; CHECK: error: Cannot use both printf and hostcall in the same module
Index: llvm/test/CodeGen/AMDGPU/opencl-printf-and-hostcall.ll
===
--- /dev/null
+++ llvm/test/CodeGen/AMDGPU/opencl-printf-and-hostcall.ll
@@ -0,0 +1,19 @@
+; RUN: opt -S -mtriple=amdgcn-unknown-unknown -amdgpu-printf-runtime-binding < %s 2>&1 | FileCheck %s
+
+@.str = private unnamed_addr addrspace(4) constant [6 x i8] c"%s:%d\00", align 1
+
+define amdgpu_kernel void @test_kernel(i32 %n) {
+entry:
+  %str = alloca [9 x i8], align 1, addrspace(5)
+  %arraydecay = getelementptr inbounds [9 x i8], [9 x i8] addrspace(5)* %str, i32 0, i32 0
+  %call1 = call i32 (i8 addrspace(4)*, ...) @printf(i8 addrspace(4)* getelementptr inbounds ([6 x i8], [6 x i8] addrspace(4)* @.str, i32 0, i32 0), i8 addrspace(5)* %arraydecay, i32 %n)
+  %call2 = call <2 x i64> (i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64) @__ockl_hostcall_internal(i8* undef, i32 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9)
+  ret void
+}
+
+declare i32 @printf(i8 addrspace(4)*, ...)
+
+declare <2 x i64> @__ockl_hostcall_internal(i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64)
+
+; CHECK-NOT: error:
+; CHECK-NOT: warning:
Index: llvm/test/CodeGen/AMDGPU/hsa-metadata-from-llvm-ir-full-v3.ll
===
--- llvm/test/CodeGen/AMDGPU/hsa-metadata-from-llvm-ir-full-v3.ll
+++ llvm/test/CodeGen/AMDGPU/hsa-metadata-from-llvm-ir-full-v3.ll
@@ -1894,9 +1894,9 @@
 ; CHECK-NEXT: - 1
 ; CHECK-NEXT: - 0
 
-attributes #0 = { optnone noinline "amdgpu-implicitarg-num-bytes"="56" }
-attributes #1 = { optnone noinline "amdgpu-implicitarg-num-bytes"="56" "runtime-handle"="__test_block_invoke_kernel_runtime_handle" }
-attributes #2 = { optnone noinline "amdgpu-implicitarg-num-bytes"="56" "calls-enqueue-kernel" }
+attributes #0 = { optnone noinline "amdgpu-implicitarg-num-bytes"="56" "amdgpu-no-hostcall-ptr" }
+attributes #1 = { optnone noinline "amdgpu-implicitarg-num-bytes"="56" "amdgpu-no-hostcall-ptr" "runtime-handle"="__test_block_invoke_kernel_runtime_handle" }
+attributes #2 = { optnone noinline "amdgpu-implicitarg-num-bytes"="56" "amdgpu-no-hostcall-ptr" "calls-enqueue-kernel" }
 
 !llvm.printf.fmts = !{!100, !101}
 
Index: llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
@@ -563,15 +563,6 @@
   if (Printfs.empty())
 return false;
 
-  if (auto HostcallFunction = M.getFunction("__ockl_hostcall_internal")) {
-for (auto &U : HostcallFunction->uses()) {
-  if (auto *CI = dyn_cast(U.getUser())) {
-M.getContext().emitError(
-CI, "Cannot use both printf and hostcall in the same module");
-  }
-}
-  }
-
   TD = &M.getDataLayout();
 
   return lowerPrintfForGpu(M);
Index: llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
==

[PATCH] D121951: [AMDGPU][OpenCL] Add "amdgpu-no-hostcall-ptr" in Clang codegen pre-COV_5

2022-03-25 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.

Comment at: clang/lib/CodeGen/TargetInfo.cpp:9381
+  M.getTarget().getTargetOpts().CodeObjectVersion != 500) {
+F->addFnAttr("amdgpu-no-hostcall-ptr");
+  }

arsenm wrote:
> sameerds wrote:
> > The frontend does not need to worry about this attribute. See the comment 
> > in the MetadataStreamer. A worthwhile check would be to generate an error 
> > if we are able to detect that some hostcall service is being used in OpenCL 
> > on code-object-v4 or lower. None exists right now, but we should add the 
> > check if such services show up. But those checks are likely to be in a 
> > different place. For example, enabling asan on OpenCL for code-object-v4 
> > should result in an error in the place where asan commandline options are 
> > parsed.
> Should be all opencl, not just kernels. Also < instead of !=?
> The frontend does not need to worry about this attribute. See the comment in 
> the MetadataStreamer. 

The reason I went this route is that otherwise the MetadataStreamer can only go 
off of the presence of the OCL printf metadata to infer that hostcall argument 
metadata is invalid and will be ignored by the runtime. Ideally, the 
MetadataStreamer or Verifier or something would actually //require// that a 
module does not have both OCL printf metadata and functions which are not 
"amdgpu-no-hostcall-ptr", but I didn't go that far as it would break old 
IR/bitcode that relies on the implicit behavior.

I'm OK with removing this code, and the rest of the patch remains essentially 
unchanged. We will still have an implicit rule based on code-object-version and 
presence of printf metadata, and at least conceptually you will still be able 
to compile OCL for pre-V5 and end up with hostcall argument metadata. That case 
is benign if the runtime just ignores it, but it still seems needlessly relaxed 
when we can just add the correct attribute in Clang codegen.

> Should be all opencl, not just kernels. Also < instead of !=?

Yes, my mistake, I'll update this

Comment at: llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp:406-408
 if (Func.getParent()->getNamedMetadata("llvm.printf.fmts"))
   emitKernelArg(DL, Int8PtrTy, Align(8), ValueKind::HiddenPrintfBuffer);
+else if (!Func.hasFnAttribute("amdgpu-no-hostcall-ptr"))

sameerds wrote:
> I would structure this differently: If this is code-object-v4 or lower, then 
> if  "llvm.printf.fmts" is present, then this kernel clearly contains OpenCL 
> bits, and cannot use hostcall. So it's okay to just assume that the 
> no-hostcall-ptr attribute is always present in this situation, which means 
> the only metadata generated is for ValueKind::HiddenPrintfBuffer. Else if 
> this is code-object-v5, then proceed to emit both metadata.
> 
> 
I'm not sure I follow; doesn't the code as-is implement what you're describing?

If the printf metadata is present, this will only ever emit the 
`HiddenPrintfBuffer`, irrespective of the hostcall attribute. Otherwise, this 
respects `amdgpu-no-hostcall-ptr` (i.e. for HIP and other languages).

The "if this is code-object-v4 or lower" portion is implicit, as this is just 
the `MetadataStreamerV2` impl. The `MetadataStreamerV3` (below) and 
`MetadataStreamerV4` (inherits from V3) impls below are similar. The 
`MetadataStreamerV5` impl is already correct for V5 (i.e. supports both 
argument kinds for the same kernel).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121951/new/

https://reviews.llvm.org/D121951

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D121951: [AMDGPU][OpenCL] Add "amdgpu-no-hostcall-ptr" in Clang codegen pre-COV_5

2022-03-24 Thread Scott Linder via Phabricator via cfe-commits

scott.linder updated this revision to Diff 418135.
scott.linder retitled this revision from "[AMDGPU] Only warn when mixing printf 
and hostcall" to "[AMDGPU][OpenCL] Add "amdgpu-no-hostcall-ptr" in Clang 
codegen pre-COV_5".
scott.linder edited the summary of this revision.
scott.linder added a comment.
Herald added subscribers: hsmhsm, Naghasan, ldrumm.

Eliminate diagnostic, and instead add function attribute in Clang for pre-COV_5
OpenCL codegen.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121951/new/

https://reviews.llvm.org/D121951

Files:
  clang/lib/CodeGen/TargetInfo.cpp
  clang/test/CodeGenOpenCL/amdgpu-attrs.cl
  clang/test/CodeGenOpenCL/amdgpu-printf.cl
  llvm/docs/AMDGPUUsage.rst
  llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
  llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
  llvm/test/CodeGen/AMDGPU/hsa-metadata-from-llvm-ir-full-v3.ll
  llvm/test/CodeGen/AMDGPU/opencl-printf-and-hostcall.ll
  llvm/test/CodeGen/AMDGPU/opencl-printf-no-hostcall.ll

Index: llvm/test/CodeGen/AMDGPU/opencl-printf-no-hostcall.ll
===
--- llvm/test/CodeGen/AMDGPU/opencl-printf-no-hostcall.ll
+++ /dev/null
@@ -1,18 +0,0 @@
-; RUN: not opt -S -mtriple=amdgcn-unknown-unknown -amdgpu-printf-runtime-binding < %s 2>&1 | FileCheck %s
-
-@.str = private unnamed_addr addrspace(2) constant [6 x i8] c"%s:%d\00", align 1
-
-define amdgpu_kernel void @test_kernel(i32 %n) {
-entry:
-  %str = alloca [9 x i8], align 1
-  %arraydecay = getelementptr inbounds [9 x i8], [9 x i8]* %str, i32 0, i32 0
-  %call1 = call i32 (i8 addrspace(2)*, ...) @printf(i8 addrspace(2)* getelementptr inbounds ([6 x i8], [6 x i8] addrspace(2)* @.str, i32 0, i32 0), i8* %arraydecay, i32 %n)
-  %call2 = call <2 x i64> (i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64) @__ockl_hostcall_internal(i8* undef, i32 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9)
-  ret void
-}
-
-declare i32 @printf(i8 addrspace(2)*, ...)
-
-declare <2 x i64> @__ockl_hostcall_internal(i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64)
-
-; CHECK: error: Cannot use both printf and hostcall in the same module
Index: llvm/test/CodeGen/AMDGPU/opencl-printf-and-hostcall.ll
===
--- /dev/null
+++ llvm/test/CodeGen/AMDGPU/opencl-printf-and-hostcall.ll
@@ -0,0 +1,19 @@
+; RUN: opt -S -mtriple=amdgcn-unknown-unknown -amdgpu-printf-runtime-binding < %s 2>&1 | FileCheck %s
+
+@.str = private unnamed_addr addrspace(4) constant [6 x i8] c"%s:%d\00", align 1
+
+define amdgpu_kernel void @test_kernel(i32 %n) {
+entry:
+  %str = alloca [9 x i8], align 1, addrspace(5)
+  %arraydecay = getelementptr inbounds [9 x i8], [9 x i8] addrspace(5)* %str, i32 0, i32 0
+  %call1 = call i32 (i8 addrspace(4)*, ...) @printf(i8 addrspace(4)* getelementptr inbounds ([6 x i8], [6 x i8] addrspace(4)* @.str, i32 0, i32 0), i8 addrspace(5)* %arraydecay, i32 %n)
+  %call2 = call <2 x i64> (i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64) @__ockl_hostcall_internal(i8* undef, i32 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9)
+  ret void
+}
+
+declare i32 @printf(i8 addrspace(4)*, ...)
+
+declare <2 x i64> @__ockl_hostcall_internal(i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64)
+
+; CHECK-NOT: error:
+; CHECK-NOT: warning:
Index: llvm/test/CodeGen/AMDGPU/hsa-metadata-from-llvm-ir-full-v3.ll
===
--- llvm/test/CodeGen/AMDGPU/hsa-metadata-from-llvm-ir-full-v3.ll
+++ llvm/test/CodeGen/AMDGPU/hsa-metadata-from-llvm-ir-full-v3.ll
@@ -1894,9 +1894,9 @@
 ; CHECK-NEXT: - 1
 ; CHECK-NEXT: - 0
 
-attributes #0 = { optnone noinline "amdgpu-implicitarg-num-bytes"="56" }
-attributes #1 = { optnone noinline "amdgpu-implicitarg-num-bytes"="56" "runtime-handle"="__test_block_invoke_kernel_runtime_handle" }
-attributes #2 = { optnone noinline "amdgpu-implicitarg-num-bytes"="56" "calls-enqueue-kernel" }
+attributes #0 = { optnone noinline "amdgpu-implicitarg-num-bytes"="56" "amdgpu-no-hostcall-ptr" }
+attributes #1 = { optnone noinline "amdgpu-implicitarg-num-bytes"="56" "amdgpu-no-hostcall-ptr" "runtime-handle"="__test_block_invoke_kernel_runtime_handle" }
+attributes #2 = { optnone noinline "amdgpu-implicitarg-num-bytes"="56" "amdgpu-no-hostcall-ptr" "calls-enqueue-kernel" }
 
 !llvm.printf.fmts = !{!100, !101}
 
Index: llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
@@ -563,15 +563,6 @@
   if (Printfs.empty())
 return false;
 
-  if (auto HostcallFunction = M.getFunction("__ockl_hostcall_internal")) {
-for (auto &U : HostcallFunction->uses()) {
-  if (auto *CI = dyn_cast(U.getUser())) {
-M.getContext().emitError(
-CI, "Cannot use both prin

[PATCH] D121951: [AMDGPU] Only warn when mixing printf and hostcall

2022-03-22 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

In D121951#3393928 , @sameerds wrote:

> In D121951#3392470 , @scott.linder 
> wrote:
>
>> In D121951#3391472 , @sameerds 
>> wrote:
>>
>>> The check for "__ockl_hostcall_internal" is not longer relevant with the 
>>> new hostcall attribute. Can we simply remove this check? What is the 
>>> situation where the warning is still useful?
>>
>> I wasn't aware of the new attribute, I'm happy to just delete it.
>
> Yeah, it happened in D119216 .
>
> But I am still curious about the check itself. Do we need to worry about a 
> situation where it is important to check whether OpenCL printf (non-hostcall) 
> and some other hostcall service are active in the same application?

I don't know how important it is to notify the user, but my understanding is 
pre-code-object-v5 we will use the same kernarg for both the printf and 
hostcall pointers. I don't know what actually happens (does the runtime notice 
it can't actually construct the kernarg and fail? does the code just fail when 
one of the two uses of the kernarg are incorrect?). Should I try to nail down 
exactly what currently happens?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121951/new/

https://reviews.llvm.org/D121951

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D121951: [AMDGPU] Only warn when mixing printf and hostcall

2022-03-18 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

In D121951#3391472 , @sameerds wrote:

> The check for "__ockl_hostcall_internal" is not longer relevant with the new 
> hostcall attribute. Can we simply remove this check? What is the situation 
> where the warning is still useful?

I wasn't aware of the new attribute, I'm happy to just delete it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121951/new/

https://reviews.llvm.org/D121951

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D121951: [AMDGPU] Only warn when mixing printf and hostcall

2022-03-17 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp:567
 
-  if (auto HostcallFunction = M.getFunction("__ockl_hostcall_internal")) {
+  if (auto *HostcallFunction = M.getFunction("__ockl_hostcall_internal")) {
 for (auto &U : HostcallFunction->uses()) {

I will land this separately, as it is unrelated and NFC



Comment at: llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp:570
   if (auto *CI = dyn_cast(U.getUser())) {
-M.getContext().emitError(
-CI, "Cannot use both printf and hostcall in the same module");
+M.getContext().diagnose(DiagnosticInfoInlineAsm(
+*CI, "Cannot use both printf and hostcall in the same module",

I'm also not certain this is the the best option to classify the diagnostic, 
but it is what was used for the error so I just left it as-is. Suggestions 
welcome!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121951/new/

https://reviews.llvm.org/D121951

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D121951: [AMDGPU] Only warn when mixing printf and hostcall

2022-03-17 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added reviewers: sameerds, arsenm, rampitec, b-sumner, kzhuravl, 
yaxunl.
scott.linder added a comment.

For reference, the error I'm changing to a warning was added as part of 
https://reviews.llvm.org/D70038, and the issue that made me consider this 
approach appears when building OCL conformance tests at -O0

Another approach would be to do some more work in Clang or early in the backend 
to strip out the unused device-libs functions which are calling 
`__ockl_hostcall_internal`. I started with that thought, but backed off as it 
seemed the less we could do at -O0 the better. The limitation is also being 
resolved in later HSA ABI versions, so this would be a legacy path almost as 
soon as we added it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121951/new/

https://reviews.llvm.org/D121951

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D121951: [AMDGPU] Only warn when mixing printf and hostcall

2022-03-17 Thread Scott Linder via Phabricator via cfe-commits

scott.linder created this revision.
Herald added subscribers: foad, kerbowa, hiraditya, t-tye, Anastasia, tpr, 
dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl, arsenm.
Herald added a project: All.
scott.linder requested review of this revision.
Herald added subscribers: llvm-commits, cfe-commits, wdng.
Herald added projects: clang, LLVM.

We are currently overly conservative, and reject modules which contain
dead uses of hostcall. This has the effect of preventing compilation at
-O0 of a use of printf in an otherwise empty module (i.e. one containing
no use of hostcall).

In a non-single-source world it is not possible to determine whether the
call is dead in the compiler, even if we wanted to.

Instead, make the error a warning, as the compiler cannot diagnose it
without some false positives, but it will remain an issue the user
should be aware of when producing pre-V5 HSA ABI code objects.

Expand testing to modules with printf+no-hostcall, printf+dead-hostcall,
and printf+hostcall to exercise the warning.

Add clang tests to tie the above tests to OpenCL source. Specifically,
the test llvm/test/CodeGen/AMDGPU/opencl-printf-dead-hostcall.ll
represents the case which previously failed at -O0.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D121951

Files:
  clang/test/CodeGenOpenCL/amdgpu-printf.cl
  llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
  llvm/test/CodeGen/AMDGPU/opencl-printf-and-hostcall.ll
  llvm/test/CodeGen/AMDGPU/opencl-printf-dead-hostcall.ll
  llvm/test/CodeGen/AMDGPU/opencl-printf-no-hostcall.ll

Index: llvm/test/CodeGen/AMDGPU/opencl-printf-no-hostcall.ll
===
--- llvm/test/CodeGen/AMDGPU/opencl-printf-no-hostcall.ll
+++ llvm/test/CodeGen/AMDGPU/opencl-printf-no-hostcall.ll
@@ -1,18 +1,17 @@
-; RUN: not opt -S -mtriple=amdgcn-unknown-unknown -amdgpu-printf-runtime-binding < %s 2>&1 | FileCheck %s
+; RUN: opt -S -mtriple=amdgcn-unknown-unknown -amdgpu-printf-runtime-binding < %s 2>&1 | FileCheck %s
 
-@.str = private unnamed_addr addrspace(2) constant [6 x i8] c"%s:%d\00", align 1
+@.str = private unnamed_addr addrspace(4) constant [6 x i8] c"%s:%d\00", align 1
 
 define amdgpu_kernel void @test_kernel(i32 %n) {
 entry:
-  %str = alloca [9 x i8], align 1
-  %arraydecay = getelementptr inbounds [9 x i8], [9 x i8]* %str, i32 0, i32 0
-  %call1 = call i32 (i8 addrspace(2)*, ...) @printf(i8 addrspace(2)* getelementptr inbounds ([6 x i8], [6 x i8] addrspace(2)* @.str, i32 0, i32 0), i8* %arraydecay, i32 %n)
-  %call2 = call <2 x i64> (i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64) @__ockl_hostcall_internal(i8* undef, i32 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9)
+  %str = alloca [9 x i8], align 1, addrspace(5)
+  %arraydecay = getelementptr inbounds [9 x i8], [9 x i8] addrspace(5)* %str, i32 0, i32 0
+  %call = call i32 (i8 addrspace(4)*, ...) @printf(i8 addrspace(4)* getelementptr inbounds ([6 x i8], [6 x i8] addrspace(4)* @.str, i32 0, i32 0), i8 addrspace(5)* %arraydecay, i32 %n)
   ret void
 }
 
-declare i32 @printf(i8 addrspace(2)*, ...)
+declare i32 @printf(i8 addrspace(4)*, ...)
 
 declare <2 x i64> @__ockl_hostcall_internal(i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64)
 
-; CHECK: error: Cannot use both printf and hostcall in the same module
+; CHECK-NOT: warning: Cannot use both printf and hostcall in the same module
Index: llvm/test/CodeGen/AMDGPU/opencl-printf-dead-hostcall.ll
===
--- /dev/null
+++ llvm/test/CodeGen/AMDGPU/opencl-printf-dead-hostcall.ll
@@ -0,0 +1,22 @@
+; RUN: opt -S -mtriple=amdgcn-unknown-unknown -amdgpu-printf-runtime-binding < %s 2>&1 | FileCheck %s
+
+@.str = private unnamed_addr addrspace(4) constant [6 x i8] c"%s:%d\00", align 1
+
+define amdgpu_kernel void @test_kernel(i32 %n) {
+entry:
+  %str = alloca [9 x i8], align 1, addrspace(5)
+  %arraydecay = getelementptr inbounds [9 x i8], [9 x i8] addrspace(5)* %str, i32 0, i32 0
+  %call = call i32 (i8 addrspace(4)*, ...) @printf(i8 addrspace(4)* getelementptr inbounds ([6 x i8], [6 x i8] addrspace(4)* @.str, i32 0, i32 0), i8 addrspace(5)* %arraydecay, i32 %n)
+  ret void
+}
+
+define amdgpu_kernel void @dead_function() {
+  %call = call <2 x i64> (i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64) @__ockl_hostcall_internal(i8* undef, i32 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9)
+  ret void
+}
+
+declare i32 @printf(i8 addrspace(4)*, ...)
+
+declare <2 x i64> @__ockl_hostcall_internal(i8*, i32, i64, i64, i64, i64, i64, i64, i64, i64)
+
+; CHECK: warning: Cannot use both printf and hostcall in the same module
Index: llvm/test/CodeGen/AMDGPU/opencl-printf-and-hostcall.ll
===
--- /dev/null
+++ llvm/test/CodeGen/AMDGPU/opencl-printf-and-hostcall.ll
@@ -0,0 +1,18 @@
+; RUN: opt -S -mtriple=amdgcn-unknown-unknown -amdgpu-printf-runtime-binding < %s 2>&1 | FileCh

[PATCH] D110276: Clean up large copies of binaries copied into temp directories in tests

2021-09-28 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

Is it possible to use soft links instead of copies? It would still be good to 
clean up afterwards, but if the host won't allow that cleanup in some cases at 
least we aren't leaving a big file around.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110276/new/

https://reviews.llvm.org/D110276

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D107190: [AMDGPU][HIP] Switch default DWARF version to 5

2021-08-02 Thread Scott Linder via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rG635c5ba45bae: [AMDGPU][HIP] Switch default DWARF version to 
5 (authored by scott.linder).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107190/new/

https://reviews.llvm.org/D107190

Files:
  clang/lib/Driver/ToolChains/AMDGPU.h
  clang/lib/Driver/ToolChains/HIP.h
  clang/test/Driver/amdgpu-toolchain.c
  clang/test/Driver/hip-toolchain-dwarf.hip


Index: clang/test/Driver/hip-toolchain-dwarf.hip
===
--- clang/test/Driver/hip-toolchain-dwarf.hip
+++ clang/test/Driver/hip-toolchain-dwarf.hip
@@ -6,4 +6,4 @@
 // RUN:   -x hip --cuda-gpu-arch=gfx803 %s \
 // RUN:   -Xarch_gfx803 -g 2>&1 | FileCheck %s -check-prefix=DWARF_VER
 
-// DWARF_VER: "-dwarf-version=4"
+// DWARF_VER: "-dwarf-version=5"
Index: clang/test/Driver/amdgpu-toolchain.c
===
--- clang/test/Driver/amdgpu-toolchain.c
+++ clang/test/Driver/amdgpu-toolchain.c
@@ -8,7 +8,7 @@
 // AS_LINK: clang{{.*}} "-cc1as"
 // AS_LINK: ld.lld{{.*}} "-shared"
 
-// DWARF_VER: "-dwarf-version=4"
+// DWARF_VER: "-dwarf-version=5"
 
 // RUN: %clang -### -target amdgcn-amd-amdhsa -mcpu=gfx906 -nogpulib \
 // RUN:   -flto %s 2>&1 | FileCheck -check-prefix=LTO %s
Index: clang/lib/Driver/ToolChains/HIP.h
===
--- clang/lib/Driver/ToolChains/HIP.h
+++ clang/lib/Driver/ToolChains/HIP.h
@@ -92,7 +92,7 @@
   computeMSVCVersion(const Driver *D,
  const llvm::opt::ArgList &Args) const override;
 
-  unsigned GetDefaultDwarfVersion() const override { return 4; }
+  unsigned GetDefaultDwarfVersion() const override { return 5; }
 
   const ToolChain &HostTC;
   void checkTargetID(const llvm::opt::ArgList &DriverArgs) const override;
Index: clang/lib/Driver/ToolChains/AMDGPU.h
===
--- clang/lib/Driver/ToolChains/AMDGPU.h
+++ clang/lib/Driver/ToolChains/AMDGPU.h
@@ -60,7 +60,7 @@
 public:
   AMDGPUToolChain(const Driver &D, const llvm::Triple &Triple,
   const llvm::opt::ArgList &Args);
-  unsigned GetDefaultDwarfVersion() const override { return 4; }
+  unsigned GetDefaultDwarfVersion() const override { return 5; }
   bool IsIntegratedAssemblerDefault() const override { return true; }
   bool IsMathErrnoDefault() const override { return false; }
 


Index: clang/test/Driver/hip-toolchain-dwarf.hip
===
--- clang/test/Driver/hip-toolchain-dwarf.hip
+++ clang/test/Driver/hip-toolchain-dwarf.hip
@@ -6,4 +6,4 @@
 // RUN:   -x hip --cuda-gpu-arch=gfx803 %s \
 // RUN:   -Xarch_gfx803 -g 2>&1 | FileCheck %s -check-prefix=DWARF_VER
 
-// DWARF_VER: "-dwarf-version=4"
+// DWARF_VER: "-dwarf-version=5"
Index: clang/test/Driver/amdgpu-toolchain.c
===
--- clang/test/Driver/amdgpu-toolchain.c
+++ clang/test/Driver/amdgpu-toolchain.c
@@ -8,7 +8,7 @@
 // AS_LINK: clang{{.*}} "-cc1as"
 // AS_LINK: ld.lld{{.*}} "-shared"
 
-// DWARF_VER: "-dwarf-version=4"
+// DWARF_VER: "-dwarf-version=5"
 
 // RUN: %clang -### -target amdgcn-amd-amdhsa -mcpu=gfx906 -nogpulib \
 // RUN:   -flto %s 2>&1 | FileCheck -check-prefix=LTO %s
Index: clang/lib/Driver/ToolChains/HIP.h
===
--- clang/lib/Driver/ToolChains/HIP.h
+++ clang/lib/Driver/ToolChains/HIP.h
@@ -92,7 +92,7 @@
   computeMSVCVersion(const Driver *D,
  const llvm::opt::ArgList &Args) const override;
 
-  unsigned GetDefaultDwarfVersion() const override { return 4; }
+  unsigned GetDefaultDwarfVersion() const override { return 5; }
 
   const ToolChain &HostTC;
   void checkTargetID(const llvm::opt::ArgList &DriverArgs) const override;
Index: clang/lib/Driver/ToolChains/AMDGPU.h
===
--- clang/lib/Driver/ToolChains/AMDGPU.h
+++ clang/lib/Driver/ToolChains/AMDGPU.h
@@ -60,7 +60,7 @@
 public:
   AMDGPUToolChain(const Driver &D, const llvm::Triple &Triple,
   const llvm::opt::ArgList &Args);
-  unsigned GetDefaultDwarfVersion() const override { return 4; }
+  unsigned GetDefaultDwarfVersion() const override { return 5; }
   bool IsIntegratedAssemblerDefault() const override { return true; }
   bool IsMathErrnoDefault() const override { return false; }
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D107190: [AMDGPU][HIP] Switch default DWARF version to 5

2021-07-30 Thread Scott Linder via Phabricator via cfe-commits

scott.linder created this revision.
Herald added subscribers: kerbowa, t-tye, tpr, dstuttard, yaxunl, nhaehnle, 
jvesely, kzhuravl.
scott.linder requested review of this revision.
Herald added subscribers: cfe-commits, wdng.
Herald added a project: clang.

Another attempt at changing this default, now that tooling has greater
support for DWARF 5.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D107190

Files:
  clang/lib/Driver/ToolChains/AMDGPU.h
  clang/lib/Driver/ToolChains/HIP.h
  clang/test/Driver/amdgpu-toolchain.c
  clang/test/Driver/hip-toolchain-dwarf.hip


Index: clang/test/Driver/hip-toolchain-dwarf.hip
===
--- clang/test/Driver/hip-toolchain-dwarf.hip
+++ clang/test/Driver/hip-toolchain-dwarf.hip
@@ -6,4 +6,4 @@
 // RUN:   -x hip --cuda-gpu-arch=gfx803 %s \
 // RUN:   -Xarch_gfx803 -g 2>&1 | FileCheck %s -check-prefix=DWARF_VER
 
-// DWARF_VER: "-dwarf-version=4"
+// DWARF_VER: "-dwarf-version=5"
Index: clang/test/Driver/amdgpu-toolchain.c
===
--- clang/test/Driver/amdgpu-toolchain.c
+++ clang/test/Driver/amdgpu-toolchain.c
@@ -8,7 +8,7 @@
 // AS_LINK: clang{{.*}} "-cc1as"
 // AS_LINK: ld.lld{{.*}} "-shared"
 
-// DWARF_VER: "-dwarf-version=4"
+// DWARF_VER: "-dwarf-version=5"
 
 // RUN: %clang -### -target amdgcn-amd-amdhsa -mcpu=gfx906 -nogpulib \
 // RUN:   -flto %s 2>&1 | FileCheck -check-prefix=LTO %s
Index: clang/lib/Driver/ToolChains/HIP.h
===
--- clang/lib/Driver/ToolChains/HIP.h
+++ clang/lib/Driver/ToolChains/HIP.h
@@ -92,7 +92,7 @@
   computeMSVCVersion(const Driver *D,
  const llvm::opt::ArgList &Args) const override;
 
-  unsigned GetDefaultDwarfVersion() const override { return 4; }
+  unsigned GetDefaultDwarfVersion() const override { return 5; }
 
   const ToolChain &HostTC;
   void checkTargetID(const llvm::opt::ArgList &DriverArgs) const override;
Index: clang/lib/Driver/ToolChains/AMDGPU.h
===
--- clang/lib/Driver/ToolChains/AMDGPU.h
+++ clang/lib/Driver/ToolChains/AMDGPU.h
@@ -60,7 +60,7 @@
 public:
   AMDGPUToolChain(const Driver &D, const llvm::Triple &Triple,
   const llvm::opt::ArgList &Args);
-  unsigned GetDefaultDwarfVersion() const override { return 4; }
+  unsigned GetDefaultDwarfVersion() const override { return 5; }
   bool IsIntegratedAssemblerDefault() const override { return true; }
   bool IsMathErrnoDefault() const override { return false; }
 


Index: clang/test/Driver/hip-toolchain-dwarf.hip
===
--- clang/test/Driver/hip-toolchain-dwarf.hip
+++ clang/test/Driver/hip-toolchain-dwarf.hip
@@ -6,4 +6,4 @@
 // RUN:   -x hip --cuda-gpu-arch=gfx803 %s \
 // RUN:   -Xarch_gfx803 -g 2>&1 | FileCheck %s -check-prefix=DWARF_VER
 
-// DWARF_VER: "-dwarf-version=4"
+// DWARF_VER: "-dwarf-version=5"
Index: clang/test/Driver/amdgpu-toolchain.c
===
--- clang/test/Driver/amdgpu-toolchain.c
+++ clang/test/Driver/amdgpu-toolchain.c
@@ -8,7 +8,7 @@
 // AS_LINK: clang{{.*}} "-cc1as"
 // AS_LINK: ld.lld{{.*}} "-shared"
 
-// DWARF_VER: "-dwarf-version=4"
+// DWARF_VER: "-dwarf-version=5"
 
 // RUN: %clang -### -target amdgcn-amd-amdhsa -mcpu=gfx906 -nogpulib \
 // RUN:   -flto %s 2>&1 | FileCheck -check-prefix=LTO %s
Index: clang/lib/Driver/ToolChains/HIP.h
===
--- clang/lib/Driver/ToolChains/HIP.h
+++ clang/lib/Driver/ToolChains/HIP.h
@@ -92,7 +92,7 @@
   computeMSVCVersion(const Driver *D,
  const llvm::opt::ArgList &Args) const override;
 
-  unsigned GetDefaultDwarfVersion() const override { return 4; }
+  unsigned GetDefaultDwarfVersion() const override { return 5; }
 
   const ToolChain &HostTC;
   void checkTargetID(const llvm::opt::ArgList &DriverArgs) const override;
Index: clang/lib/Driver/ToolChains/AMDGPU.h
===
--- clang/lib/Driver/ToolChains/AMDGPU.h
+++ clang/lib/Driver/ToolChains/AMDGPU.h
@@ -60,7 +60,7 @@
 public:
   AMDGPUToolChain(const Driver &D, const llvm::Triple &Triple,
   const llvm::opt::ArgList &Args);
-  unsigned GetDefaultDwarfVersion() const override { return 4; }
+  unsigned GetDefaultDwarfVersion() const override { return 5; }
   bool IsIntegratedAssemblerDefault() const override { return true; }
   bool IsMathErrnoDefault() const override { return false; }
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D106734: Eliminate clang man page generation warning for missing .rst files

2021-07-26 Thread Scott Linder via Phabricator via cfe-commits

scott.linder accepted this revision.
scott.linder added a comment.
This revision is now accepted and ready to land.

LGTM, I see no reason why this shouldn't apply for all generators.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106734/new/

https://reviews.llvm.org/D106734

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D100671: [ADT] Factor out in_place_t and expose in Optional ctor

2021-05-17 Thread Scott Linder via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGaf5247c9347d: [ADT] Factor out in_place_t and expose in 
Optional ctor (authored by scott.linder).

Changed prior to commit:
  https://reviews.llvm.org/D100671?vs=345475&id=346003#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100671/new/

https://reviews.llvm.org/D100671

Files:
  clang/include/clang/Basic/DirectoryEntry.h
  llvm/include/llvm/ADT/Optional.h
  llvm/include/llvm/ADT/STLForwardCompat.h
  llvm/unittests/ADT/OptionalTest.cpp

Index: llvm/unittests/ADT/OptionalTest.cpp
===
--- llvm/unittests/ADT/OptionalTest.cpp
+++ llvm/unittests/ADT/OptionalTest.cpp
@@ -195,6 +195,14 @@
   EXPECT_EQ(0u, NonDefaultConstructible::Destructions);
 }
 
+TEST(OptionalTest, InPlaceConstructionNonDefaultConstructibleTest) {
+  NonDefaultConstructible::ResetCounts();
+  { Optional A{in_place, 1}; }
+  EXPECT_EQ(0u, NonDefaultConstructible::CopyConstructions);
+  EXPECT_EQ(0u, NonDefaultConstructible::CopyAssignments);
+  EXPECT_EQ(1u, NonDefaultConstructible::Destructions);
+}
+
 TEST(OptionalTest, GetValueOr) {
   Optional A;
   EXPECT_EQ(42, A.getValueOr(42));
@@ -214,6 +222,11 @@
   MultiArgConstructor &operator=(const MultiArgConstructor &) = delete;
   MultiArgConstructor &operator=(MultiArgConstructor &&) = delete;
 
+  friend bool operator==(const MultiArgConstructor &LHS,
+ const MultiArgConstructor &RHS) {
+return LHS.x == RHS.x && LHS.y == RHS.y;
+  }
+
   static unsigned Destructions;
   ~MultiArgConstructor() {
 ++Destructions;
@@ -244,6 +257,34 @@
   EXPECT_EQ(1u, MultiArgConstructor::Destructions);
 }
 
+TEST(OptionalTest, InPlaceConstructionMultiArgConstructorTest) {
+  MultiArgConstructor::ResetCounts();
+  {
+Optional A{in_place, 1, 2};
+EXPECT_TRUE(A.hasValue());
+EXPECT_EQ(1, A->x);
+EXPECT_EQ(2, A->y);
+Optional B{in_place, 5, false};
+EXPECT_TRUE(B.hasValue());
+EXPECT_EQ(5, B->x);
+EXPECT_EQ(-5, B->y);
+EXPECT_EQ(0u, MultiArgConstructor::Destructions);
+  }
+  EXPECT_EQ(2u, MultiArgConstructor::Destructions);
+}
+
+TEST(OptionalTest, InPlaceConstructionAndEmplaceEquivalentTest) {
+  MultiArgConstructor::ResetCounts();
+  {
+Optional A{in_place, 1, 2};
+Optional B;
+B.emplace(1, 2);
+EXPECT_EQ(0u, MultiArgConstructor::Destructions);
+ASSERT_EQ(A, B);
+  }
+  EXPECT_EQ(2u, MultiArgConstructor::Destructions);
+}
+
 struct MoveOnly {
   static unsigned MoveConstructions;
   static unsigned Destructions;
@@ -391,6 +432,15 @@
   EXPECT_EQ(0u, Immovable::Destructions);
 }
 
+TEST(OptionalTest, ImmovableInPlaceConstruction) {
+  Immovable::ResetCounts();
+  Optional A{in_place, 4};
+  EXPECT_TRUE((bool)A);
+  EXPECT_EQ(4, A->val);
+  EXPECT_EQ(1u, Immovable::Constructions);
+  EXPECT_EQ(0u, Immovable::Destructions);
+}
+
 // Craft a class which is_trivially_copyable, but not
 // is_trivially_copy_constructible.
 struct NonTCopy {
Index: llvm/include/llvm/ADT/STLForwardCompat.h
===
--- llvm/include/llvm/ADT/STLForwardCompat.h
+++ llvm/include/llvm/ADT/STLForwardCompat.h
@@ -44,6 +44,25 @@
 struct disjunction
 : std::conditional>::type {};
 
+struct in_place_t // NOLINT(readability-identifier-naming)
+{
+  explicit in_place_t() = default;
+};
+/// \warning This must not be odr-used, as it cannot be made \c inline in C++14.
+constexpr in_place_t in_place; // NOLINT(readability-identifier-naming)
+
+template 
+struct in_place_type_t // NOLINT(readability-identifier-naming)
+{
+  explicit in_place_type_t() = default;
+};
+
+template 
+struct in_place_index_t // NOLINT(readability-identifier-naming)
+{
+  explicit in_place_index_t() = default;
+};
+
 //===--===//
 // Features from C++20
 //===--===//
Index: llvm/include/llvm/ADT/Optional.h
===
--- llvm/include/llvm/ADT/Optional.h
+++ llvm/include/llvm/ADT/Optional.h
@@ -17,6 +17,7 @@
 
 #include "llvm/ADT/Hashing.h"
 #include "llvm/ADT/None.h"
+#include "llvm/ADT/STLForwardCompat.h"
 #include "llvm/Support/Compiler.h"
 #include "llvm/Support/type_traits.h"
 #include 
@@ -30,8 +31,6 @@
 
 namespace optional_detail {
 
-struct in_place_t {};
-
 /// Storage for any type.
 //
 // The specialization condition intentionally uses
@@ -245,13 +244,16 @@
   constexpr Optional() {}
   constexpr Optional(NoneType) {}
 
-  constexpr Optional(const T &y) : Storage(optional_detail::in_place_t{}, y) {}
+  constexpr Optional(const T &y) : Storage(in_place, y) {}
   constexpr Optional(const Optional &O) = default;
 
-  constexpr Optional(T &&y)
-  : Storage(opti

[PATCH] D100671: [ADT] Factor out in_place_t and expose in Optional ctor

2021-05-14 Thread Scott Linder via Phabricator via cfe-commits

scott.linder updated this revision to Diff 345475.
scott.linder added a comment.

Add constexpr convenience variables


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100671/new/

https://reviews.llvm.org/D100671

Files:
  clang/include/clang/Basic/DirectoryEntry.h
  llvm/include/llvm/ADT/Optional.h
  llvm/include/llvm/ADT/STLForwardCompat.h
  llvm/unittests/ADT/OptionalTest.cpp

Index: llvm/unittests/ADT/OptionalTest.cpp
===
--- llvm/unittests/ADT/OptionalTest.cpp
+++ llvm/unittests/ADT/OptionalTest.cpp
@@ -195,6 +195,14 @@
   EXPECT_EQ(0u, NonDefaultConstructible::Destructions);
 }
 
+TEST(OptionalTest, InPlaceConstructionNonDefaultConstructibleTest) {
+  NonDefaultConstructible::ResetCounts();
+  { Optional A{in_place, 1}; }
+  EXPECT_EQ(0u, NonDefaultConstructible::CopyConstructions);
+  EXPECT_EQ(0u, NonDefaultConstructible::CopyAssignments);
+  EXPECT_EQ(1u, NonDefaultConstructible::Destructions);
+}
+
 TEST(OptionalTest, GetValueOr) {
   Optional A;
   EXPECT_EQ(42, A.getValueOr(42));
@@ -214,6 +222,11 @@
   MultiArgConstructor &operator=(const MultiArgConstructor &) = delete;
   MultiArgConstructor &operator=(MultiArgConstructor &&) = delete;
 
+  friend bool operator==(const MultiArgConstructor &LHS,
+ const MultiArgConstructor &RHS) {
+return LHS.x == RHS.x && LHS.y == RHS.y;
+  }
+
   static unsigned Destructions;
   ~MultiArgConstructor() {
 ++Destructions;
@@ -244,6 +257,34 @@
   EXPECT_EQ(1u, MultiArgConstructor::Destructions);
 }
 
+TEST(OptionalTest, InPlaceConstructionMultiArgConstructorTest) {
+  MultiArgConstructor::ResetCounts();
+  {
+Optional A{in_place, 1, 2};
+EXPECT_TRUE(A.hasValue());
+EXPECT_EQ(1, A->x);
+EXPECT_EQ(2, A->y);
+Optional B{in_place, 5, false};
+EXPECT_TRUE(B.hasValue());
+EXPECT_EQ(5, B->x);
+EXPECT_EQ(-5, B->y);
+EXPECT_EQ(0u, MultiArgConstructor::Destructions);
+  }
+  EXPECT_EQ(2u, MultiArgConstructor::Destructions);
+}
+
+TEST(OptionalTest, InPlaceConstructionAndEmplaceEquivalentTest) {
+  MultiArgConstructor::ResetCounts();
+  {
+Optional A{in_place, 1, 2};
+Optional B;
+B.emplace(1, 2);
+EXPECT_EQ(0u, MultiArgConstructor::Destructions);
+ASSERT_EQ(A, B);
+  }
+  EXPECT_EQ(2u, MultiArgConstructor::Destructions);
+}
+
 struct MoveOnly {
   static unsigned MoveConstructions;
   static unsigned Destructions;
@@ -391,6 +432,15 @@
   EXPECT_EQ(0u, Immovable::Destructions);
 }
 
+TEST(OptionalTest, ImmovableInPlaceConstruction) {
+  Immovable::ResetCounts();
+  Optional A{in_place, 4};
+  EXPECT_TRUE((bool)A);
+  EXPECT_EQ(4, A->val);
+  EXPECT_EQ(1u, Immovable::Constructions);
+  EXPECT_EQ(0u, Immovable::Destructions);
+}
+
 // Craft a class which is_trivially_copyable, but not
 // is_trivially_copy_constructible.
 struct NonTCopy {
Index: llvm/include/llvm/ADT/STLForwardCompat.h
===
--- llvm/include/llvm/ADT/STLForwardCompat.h
+++ llvm/include/llvm/ADT/STLForwardCompat.h
@@ -44,6 +44,33 @@
 struct disjunction
 : std::conditional>::type {};
 
+struct in_place_t // NOLINT(readability-identifier-naming)
+{
+  explicit in_place_t() = default;
+};
+/// \warning This must not be odr-used, as it cannot be made \c inline in C++14.
+constexpr in_place_t in_place; // NOLINT(readability-identifier-naming)
+
+template 
+struct in_place_type_t // NOLINT(readability-identifier-naming)
+{
+  explicit in_place_type_t() = default;
+};
+/// \warning This must not be odr-used, as it cannot be made \c inline in C++14.
+template 
+constexpr in_place_type_t
+in_place_type; // NOLINT(readability-identifier-naming)
+
+template 
+struct in_place_index_t // NOLINT(readability-identifier-naming)
+{
+  explicit in_place_index_t() = default;
+};
+/// \warning This must not be odr-used, as it cannot be made \c inline in C++14.
+template 
+constexpr in_place_index_t
+in_place_index; // NOLINT(readability-identifier-naming)
+
 //===--===//
 // Features from C++20
 //===--===//
Index: llvm/include/llvm/ADT/Optional.h
===
--- llvm/include/llvm/ADT/Optional.h
+++ llvm/include/llvm/ADT/Optional.h
@@ -17,6 +17,7 @@
 
 #include "llvm/ADT/Hashing.h"
 #include "llvm/ADT/None.h"
+#include "llvm/ADT/STLForwardCompat.h"
 #include "llvm/Support/Compiler.h"
 #include "llvm/Support/type_traits.h"
 #include 
@@ -30,8 +31,6 @@
 
 namespace optional_detail {
 
-struct in_place_t {};
-
 /// Storage for any type.
 //
 // The specialization condition intentionally uses
@@ -245,13 +244,16 @@
   constexpr Optional() {}
   constexpr Optional(NoneType) {}
 
-  constexpr Optional(const T &y) : Storage(optional_detail::in_place_t{}, y) {}
+  constexpr O

[PATCH] D101542: [NFC] Refactor ExecuteAssembler in cc1as_main.cpp

2021-04-30 Thread Scott Linder via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rGcab19d84ce85: [NFC] Refactor ExecuteAssembler in 
cc1as_main.cpp (authored by scott.linder).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101542/new/

https://reviews.llvm.org/D101542

Files:
  clang/tools/driver/cc1as_main.cpp


Index: clang/tools/driver/cc1as_main.cpp
===
--- clang/tools/driver/cc1as_main.cpp
+++ clang/tools/driver/cc1as_main.cpp
@@ -333,8 +333,8 @@
   return Out;
 }
 
-static bool ExecuteAssembler(AssemblerInvocation &Opts,
- DiagnosticsEngine &Diags) {
+static bool ExecuteAssemblerImpl(AssemblerInvocation &Opts,
+ DiagnosticsEngine &Diags) {
   // Get the target specific parser.
   std::string Error;
   const Target *TheTarget = TargetRegistry::lookupTarget(Opts.Triple, Error);
@@ -531,12 +531,12 @@
 Failed = Parser->Run(Opts.NoInitialTextSection);
   }
 
-  // Parser has a reference to the output stream (Str), so close Parser first.
-  Parser.reset();
-  Str.reset();
-  // Close the output stream early.
-  BOS.reset();
-  FDOS.reset();
+  return Failed;
+}
+
+static bool ExecuteAssembler(AssemblerInvocation &Opts,
+ DiagnosticsEngine &Diags) {
+  bool Failed = ExecuteAssemblerImpl(Opts, Diags);
 
   // Delete output file if there were errors.
   if (Failed) {


Index: clang/tools/driver/cc1as_main.cpp
===
--- clang/tools/driver/cc1as_main.cpp
+++ clang/tools/driver/cc1as_main.cpp
@@ -333,8 +333,8 @@
   return Out;
 }
 
-static bool ExecuteAssembler(AssemblerInvocation &Opts,
- DiagnosticsEngine &Diags) {
+static bool ExecuteAssemblerImpl(AssemblerInvocation &Opts,
+ DiagnosticsEngine &Diags) {
   // Get the target specific parser.
   std::string Error;
   const Target *TheTarget = TargetRegistry::lookupTarget(Opts.Triple, Error);
@@ -531,12 +531,12 @@
 Failed = Parser->Run(Opts.NoInitialTextSection);
   }
 
-  // Parser has a reference to the output stream (Str), so close Parser first.
-  Parser.reset();
-  Str.reset();
-  // Close the output stream early.
-  BOS.reset();
-  FDOS.reset();
+  return Failed;
+}
+
+static bool ExecuteAssembler(AssemblerInvocation &Opts,
+ DiagnosticsEngine &Diags) {
+  bool Failed = ExecuteAssemblerImpl(Opts, Diags);
 
   // Delete output file if there were errors.
   if (Failed) {
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D101542: [NFC] Refactor ExecuteAssembler in cc1as_main.cpp

2021-04-29 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added reviewers: MaskRay, dineshkb-amd, aykevl.
scott.linder added a comment.

I'm not sure who else to add, please feel free to update the reviewers


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101542/new/

https://reviews.llvm.org/D101542

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D101542: [NFC] Refactor ExecuteAssembler in cc1as_main.cpp

2021-04-29 Thread Scott Linder via Phabricator via cfe-commits

scott.linder created this revision.
scott.linder requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Introduce an extra scope (another static function) to replace calls to
`unique_ptr::reset` with implicit destructors via RAII.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D101542

Files:
  clang/tools/driver/cc1as_main.cpp


Index: clang/tools/driver/cc1as_main.cpp
===
--- clang/tools/driver/cc1as_main.cpp
+++ clang/tools/driver/cc1as_main.cpp
@@ -333,8 +333,8 @@
   return Out;
 }
 
-static bool ExecuteAssembler(AssemblerInvocation &Opts,
- DiagnosticsEngine &Diags) {
+static bool ExecuteAssemblerImpl(AssemblerInvocation &Opts,
+ DiagnosticsEngine &Diags) {
   // Get the target specific parser.
   std::string Error;
   const Target *TheTarget = TargetRegistry::lookupTarget(Opts.Triple, Error);
@@ -531,12 +531,12 @@
 Failed = Parser->Run(Opts.NoInitialTextSection);
   }
 
-  // Parser has a reference to the output stream (Str), so close Parser first.
-  Parser.reset();
-  Str.reset();
-  // Close the output stream early.
-  BOS.reset();
-  FDOS.reset();
+  return Failed;
+}
+
+static bool ExecuteAssembler(AssemblerInvocation &Opts,
+ DiagnosticsEngine &Diags) {
+  bool Failed = ExecuteAssemblerImpl(Opts, Diags);
 
   // Delete output file if there were errors.
   if (Failed) {


Index: clang/tools/driver/cc1as_main.cpp
===
--- clang/tools/driver/cc1as_main.cpp
+++ clang/tools/driver/cc1as_main.cpp
@@ -333,8 +333,8 @@
   return Out;
 }
 
-static bool ExecuteAssembler(AssemblerInvocation &Opts,
- DiagnosticsEngine &Diags) {
+static bool ExecuteAssemblerImpl(AssemblerInvocation &Opts,
+ DiagnosticsEngine &Diags) {
   // Get the target specific parser.
   std::string Error;
   const Target *TheTarget = TargetRegistry::lookupTarget(Opts.Triple, Error);
@@ -531,12 +531,12 @@
 Failed = Parser->Run(Opts.NoInitialTextSection);
   }
 
-  // Parser has a reference to the output stream (Str), so close Parser first.
-  Parser.reset();
-  Str.reset();
-  // Close the output stream early.
-  BOS.reset();
-  FDOS.reset();
+  return Failed;
+}
+
+static bool ExecuteAssembler(AssemblerInvocation &Opts,
+ DiagnosticsEngine &Diags) {
+  bool Failed = ExecuteAssemblerImpl(Opts, Diags);
 
   // Delete output file if there were errors.
   if (Failed) {
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D100671: [ADT] Factor out in_place_t and expose in Optional ctor

2021-04-16 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

In D100671#2695923 , @dblaikie wrote:

> This usage doesn't seem to quite match the standard - which provides an 
> existing instance of in_place_t for callers to use:
>
>   std::optional o4(std::in_place, {'a', 'b', 'c'});
>
> (to quote https://en.cppreference.com/w/cpp/utility/optional/optional )
>
> Probably good to match this sort of behavior.

My understanding is that the C++17 `inline constexpr` is what makes that 
generally "safe" wrt. ODR, but I'm not actually sure that it come up in 
practice (i.e. I don't suspect we will ever actually cause this thing to have 
an address).

Maybe I can add the variable with a note about not doing anything with it that 
could cause it to violate ODR? I.e. don't take its address (and probably other 
thing I need to refresh my memory on)

Comment at: llvm/unittests/ADT/OptionalTest.cpp:227
+ const MultiArgConstructor &RHS) {
+return LHS.x == RHS.x && LHS.y == RHS.y;
+  }

dblaikie wrote:
> Could write this as: `return std::tie(LHS.x, LHS.y) == std::tie(RHS.x, 
> RHS.y);` which I think is a bit tidier/easier to generalize to other 
> comparisons, etc.
Will do, thank you!

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100671/new/

https://reviews.llvm.org/D100671

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D100671: [ADT] Factor out in_place_t and expose in Optional ctor

2021-04-16 Thread Scott Linder via Phabricator via cfe-commits

scott.linder created this revision.
Herald added a subscriber: dexonsmith.
scott.linder requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D100671

Files:
  clang/include/clang/Basic/DirectoryEntry.h
  llvm/include/llvm/ADT/Optional.h
  llvm/include/llvm/ADT/STLShims.h
  llvm/unittests/ADT/OptionalTest.cpp

Index: llvm/unittests/ADT/OptionalTest.cpp
===
--- llvm/unittests/ADT/OptionalTest.cpp
+++ llvm/unittests/ADT/OptionalTest.cpp
@@ -195,6 +195,14 @@
   EXPECT_EQ(0u, NonDefaultConstructible::Destructions);
 }
 
+TEST(OptionalTest, InPlaceConstructionNonDefaultConstructibleTest) {
+  NonDefaultConstructible::ResetCounts();
+  { Optional A{in_place_t{}, 1}; }
+  EXPECT_EQ(0u, NonDefaultConstructible::CopyConstructions);
+  EXPECT_EQ(0u, NonDefaultConstructible::CopyAssignments);
+  EXPECT_EQ(1u, NonDefaultConstructible::Destructions);
+}
+
 TEST(OptionalTest, GetValueOr) {
   Optional A;
   EXPECT_EQ(42, A.getValueOr(42));
@@ -214,6 +222,11 @@
   MultiArgConstructor &operator=(const MultiArgConstructor &) = delete;
   MultiArgConstructor &operator=(MultiArgConstructor &&) = delete;
 
+  friend bool operator==(const MultiArgConstructor &LHS,
+ const MultiArgConstructor &RHS) {
+return LHS.x == RHS.x && LHS.y == RHS.y;
+  }
+
   static unsigned Destructions;
   ~MultiArgConstructor() {
 ++Destructions;
@@ -244,6 +257,34 @@
   EXPECT_EQ(1u, MultiArgConstructor::Destructions);
 }
 
+TEST(OptionalTest, InPlaceConstructionMultiArgConstructorTest) {
+  MultiArgConstructor::ResetCounts();
+  {
+Optional A{in_place_t{}, 1, 2};
+EXPECT_TRUE(A.hasValue());
+EXPECT_EQ(1, A->x);
+EXPECT_EQ(2, A->y);
+Optional B{in_place_t{}, 5, false};
+EXPECT_TRUE(B.hasValue());
+EXPECT_EQ(5, B->x);
+EXPECT_EQ(-5, B->y);
+EXPECT_EQ(0u, MultiArgConstructor::Destructions);
+  }
+  EXPECT_EQ(2u, MultiArgConstructor::Destructions);
+}
+
+TEST(OptionalTest, InPlaceConstructionAndEmplaceEquivalentTest) {
+  MultiArgConstructor::ResetCounts();
+  {
+Optional A{in_place_t{}, 1, 2};
+Optional B;
+B.emplace(1, 2);
+EXPECT_EQ(0u, MultiArgConstructor::Destructions);
+ASSERT_EQ(A, B);
+  }
+  EXPECT_EQ(2u, MultiArgConstructor::Destructions);
+}
+
 struct MoveOnly {
   static unsigned MoveConstructions;
   static unsigned Destructions;
@@ -391,6 +432,15 @@
   EXPECT_EQ(0u, Immovable::Destructions);
 }
 
+TEST(OptionalTest, ImmovableInPlaceConstruction) {
+  Immovable::ResetCounts();
+  Optional A{in_place_t{}, 4};
+  EXPECT_TRUE((bool)A);
+  EXPECT_EQ(4, A->val);
+  EXPECT_EQ(1u, Immovable::Constructions);
+  EXPECT_EQ(0u, Immovable::Destructions);
+}
+
 // Craft a class which is_trivially_copyable, but not
 // is_trivially_copy_constructible.
 struct NonTCopy {
Index: llvm/include/llvm/ADT/STLShims.h
===
--- llvm/include/llvm/ADT/STLShims.h
+++ llvm/include/llvm/ADT/STLShims.h
@@ -27,6 +27,18 @@
 // Shims for C++17
 //===--===//
 
+struct SHIM(in_place_t) {
+  explicit in_place_t() = default;
+};
+
+template  struct SHIM(in_place_type_t) {
+  explicit in_place_type_t() = default;
+};
+
+template  struct SHIM(in_place_index_t) {
+  explicit in_place_index_t() = default;
+};
+
 template 
 struct SHIM(negation) : std::integral_constant {};
 
Index: llvm/include/llvm/ADT/Optional.h
===
--- llvm/include/llvm/ADT/Optional.h
+++ llvm/include/llvm/ADT/Optional.h
@@ -17,6 +17,7 @@
 
 #include "llvm/ADT/Hashing.h"
 #include "llvm/ADT/None.h"
+#include "llvm/ADT/STLShims.h"
 #include "llvm/Support/Compiler.h"
 #include "llvm/Support/type_traits.h"
 #include 
@@ -30,8 +31,6 @@
 
 namespace optional_detail {
 
-struct in_place_t {};
-
 /// Storage for any type.
 //
 // The specialization condition intentionally uses
@@ -245,13 +244,16 @@
   constexpr Optional() {}
   constexpr Optional(NoneType) {}
 
-  constexpr Optional(const T &y) : Storage(optional_detail::in_place_t{}, y) {}
+  constexpr Optional(const T &y) : Storage(in_place_t{}, y) {}
   constexpr Optional(const Optional &O) = default;
 
-  constexpr Optional(T &&y)
-  : Storage(optional_detail::in_place_t{}, std::move(y)) {}
+  constexpr Optional(T &&y) : Storage(in_place_t{}, std::move(y)) {}
   constexpr Optional(Optional &&O) = default;
 
+  template 
+  constexpr Optional(in_place_t, ArgTypes &&...Args)
+  : Storage(in_place_t{}, std::forward(Args)...) {}
+
   Optional &operator=(T &&y) {
 Storage = std::move(y);
 return *this;
Index: clang/include/clang/Basic/DirectoryEntry.h
===
--- clang/include/clang/Basic/DirectoryEntry.h
+++ clang/i

[PATCH] D88978: [WIP] Attach debug intrinsics to allocas, and use correct address space

2021-04-01 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

In D88978#2660274 , @arsenm wrote:

> Is this still needed?

Yes, I just got a little bogged down in the OMP code and haven't gotten back to 
it to finish it up. I anticipate needing to do this to soon, though.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88978/new/

https://reviews.llvm.org/D88978

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D94338: [Clang][Docs] Fix ambiguity in clang-offload-bundler docs

2021-01-11 Thread Scott Linder via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rGc15b0e2229ea: [Clang][Docs] Fix ambiguity in 
clang-offload-bundler docs (authored by scott.linder).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94338/new/

https://reviews.llvm.org/D94338

Files:
  clang/docs/ClangOffloadBundler.rst


Index: clang/docs/ClangOffloadBundler.rst
===
--- clang/docs/ClangOffloadBundler.rst
+++ clang/docs/ClangOffloadBundler.rst
@@ -44,7 +44,7 @@
 Field   TypeSize in BytesDescription
 === ===  
===
 Magic Stringstring  24   
``__CLANG_OFFLOAD_BUNDLE__``
-Number Of Code Objects  integer 8Number od 
bundled code objects.
+Number Of Bundle Entriesinteger 8Number of 
bundle entries.
 1st Bundle Entry Code Object Offset integer 8Byte offset 
from beginning of
  bundled code 
object to 1st code
  object.
@@ -208,4 +208,4 @@
   features `_
   supported.
 
-Most other targets do not support target IDs.
\ No newline at end of file
+Most other targets do not support target IDs.


Index: clang/docs/ClangOffloadBundler.rst
===
--- clang/docs/ClangOffloadBundler.rst
+++ clang/docs/ClangOffloadBundler.rst
@@ -44,7 +44,7 @@
 Field   TypeSize in BytesDescription
 === ===  ===
 Magic Stringstring  24   ``__CLANG_OFFLOAD_BUNDLE__``
-Number Of Code Objects  integer 8Number od bundled code objects.
+Number Of Bundle Entriesinteger 8Number of bundle entries.
 1st Bundle Entry Code Object Offset integer 8Byte offset from beginning of
  bundled code object to 1st code
  object.
@@ -208,4 +208,4 @@
   features `_
   supported.
 
-Most other targets do not support target IDs.
\ No newline at end of file
+Most other targets do not support target IDs.
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D94338: [Clang][Docs] Fix ambiguity in clang-offload-bundler docs

2021-01-08 Thread Scott Linder via Phabricator via cfe-commits

scott.linder created this revision.
scott.linder requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D94338

Files:
  clang/docs/ClangOffloadBundler.rst


Index: clang/docs/ClangOffloadBundler.rst
===
--- clang/docs/ClangOffloadBundler.rst
+++ clang/docs/ClangOffloadBundler.rst
@@ -44,7 +44,7 @@
 Field   TypeSize in BytesDescription
 === ===  
===
 Magic Stringstring  24   
``__CLANG_OFFLOAD_BUNDLE__``
-Number Of Code Objects  integer 8Number od 
bundled code objects.
+Number Of Bundle Entriesinteger 8Number of 
bundle entries.
 1st Bundle Entry Code Object Offset integer 8Byte offset 
from beginning of
  bundled code 
object to 1st code
  object.
@@ -208,4 +208,4 @@
   features `_
   supported.
 
-Most other targets do not support target IDs.
\ No newline at end of file
+Most other targets do not support target IDs.


Index: clang/docs/ClangOffloadBundler.rst
===
--- clang/docs/ClangOffloadBundler.rst
+++ clang/docs/ClangOffloadBundler.rst
@@ -44,7 +44,7 @@
 Field   TypeSize in BytesDescription
 === ===  ===
 Magic Stringstring  24   ``__CLANG_OFFLOAD_BUNDLE__``
-Number Of Code Objects  integer 8Number od bundled code objects.
+Number Of Bundle Entriesinteger 8Number of bundle entries.
 1st Bundle Entry Code Object Offset integer 8Byte offset from beginning of
  bundled code object to 1st code
  object.
@@ -208,4 +208,4 @@
   features `_
   supported.
 
-Most other targets do not support target IDs.
\ No newline at end of file
+Most other targets do not support target IDs.
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D93721: [NFC][AMDGPU] Do not override GetDefaultDwarfVersion

2020-12-22 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added reviewers: t-tye, laurentm0, yaxunl.
scott.linder added a comment.

No other target overrides `GetDefaultDwarfVersion` with the same 
implementation, and in the future when we make changes we can add them at the 
points in the hierarchy where they make sense semantically.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93721/new/

https://reviews.llvm.org/D93721

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D93721: [NFC][AMDGPU] Do not override GetDefaultDwarfVersion

2020-12-22 Thread Scott Linder via Phabricator via cfe-commits

scott.linder created this revision.
Herald added subscribers: kerbowa, t-tye, tpr, dstuttard, yaxunl, nhaehnle, 
jvesely, kzhuravl.
scott.linder requested review of this revision.
Herald added subscribers: cfe-commits, wdng.
Herald added a project: clang.

Use the implementation inhereted from `ToolChain` for as long
as we default to Dwarf 4 for every AMDGPU toolchain.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D93721

Files:
  clang/lib/Driver/ToolChains/AMDGPU.h
  clang/lib/Driver/ToolChains/HIP.h


Index: clang/lib/Driver/ToolChains/HIP.h
===
--- clang/lib/Driver/ToolChains/HIP.h
+++ clang/lib/Driver/ToolChains/HIP.h
@@ -99,8 +99,6 @@
   computeMSVCVersion(const Driver *D,
  const llvm::opt::ArgList &Args) const override;
 
-  unsigned GetDefaultDwarfVersion() const override { return 4; }
-
   const ToolChain &HostTC;
 
 protected:
Index: clang/lib/Driver/ToolChains/AMDGPU.h
===
--- clang/lib/Driver/ToolChains/AMDGPU.h
+++ clang/lib/Driver/ToolChains/AMDGPU.h
@@ -60,7 +60,6 @@
 public:
   AMDGPUToolChain(const Driver &D, const llvm::Triple &Triple,
   const llvm::opt::ArgList &Args);
-  unsigned GetDefaultDwarfVersion() const override { return 4; }
   bool IsIntegratedAssemblerDefault() const override { return true; }
   bool IsMathErrnoDefault() const override { return false; }
 


Index: clang/lib/Driver/ToolChains/HIP.h
===
--- clang/lib/Driver/ToolChains/HIP.h
+++ clang/lib/Driver/ToolChains/HIP.h
@@ -99,8 +99,6 @@
   computeMSVCVersion(const Driver *D,
  const llvm::opt::ArgList &Args) const override;
 
-  unsigned GetDefaultDwarfVersion() const override { return 4; }
-
   const ToolChain &HostTC;
 
 protected:
Index: clang/lib/Driver/ToolChains/AMDGPU.h
===
--- clang/lib/Driver/ToolChains/AMDGPU.h
+++ clang/lib/Driver/ToolChains/AMDGPU.h
@@ -60,7 +60,6 @@
 public:
   AMDGPUToolChain(const Driver &D, const llvm::Triple &Triple,
   const llvm::opt::ArgList &Args);
-  unsigned GetDefaultDwarfVersion() const override { return 4; }
   bool IsIntegratedAssemblerDefault() const override { return true; }
   bool IsMathErrnoDefault() const override { return false; }
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D93648: Revert "[AMDGPU][HIP] Switch default DWARF version to 5"

2020-12-21 Thread Scott Linder via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rGffba47df7646: Revert "[AMDGPU][HIP] Switch default 
DWARF version to 5" (authored by scott.linder).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93648/new/

https://reviews.llvm.org/D93648

Files:
  clang/lib/Driver/ToolChains/AMDGPU.h
  clang/lib/Driver/ToolChains/HIP.h
  clang/test/Driver/amdgpu-toolchain.c
  clang/test/Driver/hip-toolchain-dwarf.hip


Index: clang/test/Driver/hip-toolchain-dwarf.hip
===
--- clang/test/Driver/hip-toolchain-dwarf.hip
+++ clang/test/Driver/hip-toolchain-dwarf.hip
@@ -6,4 +6,4 @@
 // RUN:   -x hip --cuda-gpu-arch=gfx803 %s \
 // RUN:   -Xarch_gfx803 -g 2>&1 | FileCheck %s -check-prefix=DWARF_VER
 
-// DWARF_VER: "-dwarf-version=5"
+// DWARF_VER: "-dwarf-version=4"
Index: clang/test/Driver/amdgpu-toolchain.c
===
--- clang/test/Driver/amdgpu-toolchain.c
+++ clang/test/Driver/amdgpu-toolchain.c
@@ -8,7 +8,7 @@
 // AS_LINK: clang{{.*}} "-cc1as"
 // AS_LINK: ld.lld{{.*}} "-shared"
 
-// DWARF_VER: "-dwarf-version=5"
+// DWARF_VER: "-dwarf-version=4"
 
 // RUN: %clang -### -target amdgcn-amd-amdhsa -mcpu=gfx906 -nogpulib \
 // RUN:   -flto %s 2>&1 | FileCheck -check-prefix=LTO %s
Index: clang/lib/Driver/ToolChains/HIP.h
===
--- clang/lib/Driver/ToolChains/HIP.h
+++ clang/lib/Driver/ToolChains/HIP.h
@@ -99,7 +99,7 @@
   computeMSVCVersion(const Driver *D,
  const llvm::opt::ArgList &Args) const override;
 
-  unsigned GetDefaultDwarfVersion() const override { return 5; }
+  unsigned GetDefaultDwarfVersion() const override { return 4; }
 
   const ToolChain &HostTC;
 
Index: clang/lib/Driver/ToolChains/AMDGPU.h
===
--- clang/lib/Driver/ToolChains/AMDGPU.h
+++ clang/lib/Driver/ToolChains/AMDGPU.h
@@ -60,7 +60,7 @@
 public:
   AMDGPUToolChain(const Driver &D, const llvm::Triple &Triple,
   const llvm::opt::ArgList &Args);
-  unsigned GetDefaultDwarfVersion() const override { return 5; }
+  unsigned GetDefaultDwarfVersion() const override { return 4; }
   bool IsIntegratedAssemblerDefault() const override { return true; }
   bool IsMathErrnoDefault() const override { return false; }
 


Index: clang/test/Driver/hip-toolchain-dwarf.hip
===
--- clang/test/Driver/hip-toolchain-dwarf.hip
+++ clang/test/Driver/hip-toolchain-dwarf.hip
@@ -6,4 +6,4 @@
 // RUN:   -x hip --cuda-gpu-arch=gfx803 %s \
 // RUN:   -Xarch_gfx803 -g 2>&1 | FileCheck %s -check-prefix=DWARF_VER
 
-// DWARF_VER: "-dwarf-version=5"
+// DWARF_VER: "-dwarf-version=4"
Index: clang/test/Driver/amdgpu-toolchain.c
===
--- clang/test/Driver/amdgpu-toolchain.c
+++ clang/test/Driver/amdgpu-toolchain.c
@@ -8,7 +8,7 @@
 // AS_LINK: clang{{.*}} "-cc1as"
 // AS_LINK: ld.lld{{.*}} "-shared"
 
-// DWARF_VER: "-dwarf-version=5"
+// DWARF_VER: "-dwarf-version=4"
 
 // RUN: %clang -### -target amdgcn-amd-amdhsa -mcpu=gfx906 -nogpulib \
 // RUN:   -flto %s 2>&1 | FileCheck -check-prefix=LTO %s
Index: clang/lib/Driver/ToolChains/HIP.h
===
--- clang/lib/Driver/ToolChains/HIP.h
+++ clang/lib/Driver/ToolChains/HIP.h
@@ -99,7 +99,7 @@
   computeMSVCVersion(const Driver *D,
  const llvm::opt::ArgList &Args) const override;
 
-  unsigned GetDefaultDwarfVersion() const override { return 5; }
+  unsigned GetDefaultDwarfVersion() const override { return 4; }
 
   const ToolChain &HostTC;
 
Index: clang/lib/Driver/ToolChains/AMDGPU.h
===
--- clang/lib/Driver/ToolChains/AMDGPU.h
+++ clang/lib/Driver/ToolChains/AMDGPU.h
@@ -60,7 +60,7 @@
 public:
   AMDGPUToolChain(const Driver &D, const llvm::Triple &Triple,
   const llvm::opt::ArgList &Args);
-  unsigned GetDefaultDwarfVersion() const override { return 5; }
+  unsigned GetDefaultDwarfVersion() const override { return 4; }
   bool IsIntegratedAssemblerDefault() const override { return true; }
   bool IsMathErrnoDefault() const override { return false; }
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D93648: Revert "[AMDGPU][HIP] Switch default DWARF version to 5"

2020-12-21 Thread Scott Linder via Phabricator via cfe-commits

scott.linder created this revision.
Herald added subscribers: kerbowa, t-tye, tpr, dstuttard, yaxunl, nhaehnle, 
jvesely, kzhuravl.
scott.linder requested review of this revision.
Herald added subscribers: cfe-commits, wdng.
Herald added a project: clang.

This reverts commit c4d10e7e9bb47b77fad43d8ddcfa328298f36c88 
.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D93648

Files:
  clang/lib/Driver/ToolChains/AMDGPU.h
  clang/lib/Driver/ToolChains/HIP.h
  clang/test/Driver/amdgpu-toolchain.c
  clang/test/Driver/hip-toolchain-dwarf.hip


Index: clang/test/Driver/hip-toolchain-dwarf.hip
===
--- clang/test/Driver/hip-toolchain-dwarf.hip
+++ clang/test/Driver/hip-toolchain-dwarf.hip
@@ -6,4 +6,4 @@
 // RUN:   -x hip --cuda-gpu-arch=gfx803 %s \
 // RUN:   -Xarch_gfx803 -g 2>&1 | FileCheck %s -check-prefix=DWARF_VER
 
-// DWARF_VER: "-dwarf-version=5"
+// DWARF_VER: "-dwarf-version=4"
Index: clang/test/Driver/amdgpu-toolchain.c
===
--- clang/test/Driver/amdgpu-toolchain.c
+++ clang/test/Driver/amdgpu-toolchain.c
@@ -8,7 +8,7 @@
 // AS_LINK: clang{{.*}} "-cc1as"
 // AS_LINK: ld.lld{{.*}} "-shared"
 
-// DWARF_VER: "-dwarf-version=5"
+// DWARF_VER: "-dwarf-version=4"
 
 // RUN: %clang -### -target amdgcn-amd-amdhsa -mcpu=gfx906 -nogpulib \
 // RUN:   -flto %s 2>&1 | FileCheck -check-prefix=LTO %s
Index: clang/lib/Driver/ToolChains/HIP.h
===
--- clang/lib/Driver/ToolChains/HIP.h
+++ clang/lib/Driver/ToolChains/HIP.h
@@ -99,7 +99,7 @@
   computeMSVCVersion(const Driver *D,
  const llvm::opt::ArgList &Args) const override;
 
-  unsigned GetDefaultDwarfVersion() const override { return 5; }
+  unsigned GetDefaultDwarfVersion() const override { return 4; }
 
   const ToolChain &HostTC;
 
Index: clang/lib/Driver/ToolChains/AMDGPU.h
===
--- clang/lib/Driver/ToolChains/AMDGPU.h
+++ clang/lib/Driver/ToolChains/AMDGPU.h
@@ -60,7 +60,7 @@
 public:
   AMDGPUToolChain(const Driver &D, const llvm::Triple &Triple,
   const llvm::opt::ArgList &Args);
-  unsigned GetDefaultDwarfVersion() const override { return 5; }
+  unsigned GetDefaultDwarfVersion() const override { return 4; }
   bool IsIntegratedAssemblerDefault() const override { return true; }
   bool IsMathErrnoDefault() const override { return false; }
 


Index: clang/test/Driver/hip-toolchain-dwarf.hip
===
--- clang/test/Driver/hip-toolchain-dwarf.hip
+++ clang/test/Driver/hip-toolchain-dwarf.hip
@@ -6,4 +6,4 @@
 // RUN:   -x hip --cuda-gpu-arch=gfx803 %s \
 // RUN:   -Xarch_gfx803 -g 2>&1 | FileCheck %s -check-prefix=DWARF_VER
 
-// DWARF_VER: "-dwarf-version=5"
+// DWARF_VER: "-dwarf-version=4"
Index: clang/test/Driver/amdgpu-toolchain.c
===
--- clang/test/Driver/amdgpu-toolchain.c
+++ clang/test/Driver/amdgpu-toolchain.c
@@ -8,7 +8,7 @@
 // AS_LINK: clang{{.*}} "-cc1as"
 // AS_LINK: ld.lld{{.*}} "-shared"
 
-// DWARF_VER: "-dwarf-version=5"
+// DWARF_VER: "-dwarf-version=4"
 
 // RUN: %clang -### -target amdgcn-amd-amdhsa -mcpu=gfx906 -nogpulib \
 // RUN:   -flto %s 2>&1 | FileCheck -check-prefix=LTO %s
Index: clang/lib/Driver/ToolChains/HIP.h
===
--- clang/lib/Driver/ToolChains/HIP.h
+++ clang/lib/Driver/ToolChains/HIP.h
@@ -99,7 +99,7 @@
   computeMSVCVersion(const Driver *D,
  const llvm::opt::ArgList &Args) const override;
 
-  unsigned GetDefaultDwarfVersion() const override { return 5; }
+  unsigned GetDefaultDwarfVersion() const override { return 4; }
 
   const ToolChain &HostTC;
 
Index: clang/lib/Driver/ToolChains/AMDGPU.h
===
--- clang/lib/Driver/ToolChains/AMDGPU.h
+++ clang/lib/Driver/ToolChains/AMDGPU.h
@@ -60,7 +60,7 @@
 public:
   AMDGPUToolChain(const Driver &D, const llvm::Triple &Triple,
   const llvm::opt::ArgList &Args);
-  unsigned GetDefaultDwarfVersion() const override { return 5; }
+  unsigned GetDefaultDwarfVersion() const override { return 4; }
   bool IsIntegratedAssemblerDefault() const override { return true; }
   bool IsMathErrnoDefault() const override { return false; }
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D92617: [DWARF] Allow toolchain to adjust specified DWARF version.

2020-12-04 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: clang/test/Driver/debug-options.c:364-366
+// GEMBED_2:  warning: debug information option '-gembed-source' is not 
supported for target
 // NOGEMBED_5-NOT:  "-gembed-source"
+// NOGEMBED_2-NOT:  warning: debug information option '-gembed-source' is not 
supported for target

tra wrote:
> scott.linder wrote:
> > dblaikie wrote:
> > > This is a bit of a loss in fidelity - might need a new diagnostic message 
> > > (or go hunting around for a more general purpose one than this one at 
> > > least) to say '-gembed-source' is ignored when not using DWARFv5. (in the 
> > > nvptx case it's "not supported on target", but in the existing cases 
> > > covered by this test it's "not supported because the user requested 
> > > DWARFv2", for instance)
> > > 
> > > @JDevlieghere & @scott.linder for thoughts on this.
> > I agree that I'd prefer we detect whether the target-specific clamped 
> > version is to blame (and use the proposed warning message) or the original 
> > DWARF version is to blame (and use the old message).
> > 
> > If I were compiling for x86 and gave e.g. `-gdwarf-4 -gembed-source` and 
> > the error said "not supported by target" I'd probably get the wrong idea.
> > 
> > It would also be nice to retain the error semantics in the case where the 
> > user is explicitly requesting incompatible options.
> This sounds pretty close to what the older iterations of this patch did:  
> https://reviews.llvm.org/D92617?id=309404, except that it preserved the 
> current behavior which makes it an error to use -gembed-source with an 
> explicitly specified DWARF version below 5. Same options with a lower clamped 
> version produces a warning. I.e. If user specified a nominally valid 
> combination of `-gdwarf-5 -gembed-source` but the target like NVPTX clamped 
> it down to DWARF2, there will only be a warning.
> 
> I would appreciate if you (as in debug info stakeholders) could clarify 
> couple of moments for me:
> 
> * should `-gdwarf-4 -gembed-source` be an error or warning? It's currently an 
> error.
> * `-gdwarf-5 -gembed-source` with the dwarf version clamped below 5 should 
> produce a warning. `not supported for target` appears to be correct, but does 
> not explain why. Do we want to amend/change it to say `because target only 
> supports DWARF 2` or `target does not support DWARF 5`? Or is `not supported 
> by target` sufficient as is?
> 
> 
> This sounds pretty close to what the older iterations of this patch did:  
> https://reviews.llvm.org/D92617?id=309404, except that it preserved the 
> current behavior which makes it an error to use -gembed-source with an 
> explicitly specified DWARF version below 5. Same options with a lower clamped 
> version produces a warning. I.e. If user specified a nominally valid 
> combination of `-gdwarf-5 -gembed-source` but the target like NVPTX clamped 
> it down to DWARF2, there will only be a warning.
> 
> I would appreciate if you (as in debug info stakeholders) could clarify 
> couple of moments for me:
> 
> * should `-gdwarf-4 -gembed-source` be an error or warning? It's currently an 
> error.

I think it should remain an error when an incompatible DWARF version is 
explicitly specified by the user. They said they wanted two things which are 
mutually exclusive, and we have no way to know which one they meant (or if they 
just aren't aware they are incompatible) so we should stop and prompt them to 
fix it.

> * `-gdwarf-5 -gembed-source` with the dwarf version clamped below 5 should 
> produce a warning. `not supported for target` appears to be correct, but does 
> not explain why. Do we want to amend/change it to say `because target only 
> supports DWARF 2` or `target does not support DWARF 5`? Or is `not supported 
> by target` sufficient as is?
> 
> 

I think giving more context is a good thing in this situation; I don't know 
that I have a strong opinion on the wording, but something indicating the 
warning is due to a target restriction which caps the DWARF version seems 
helpful enough to warrant being more verbose. I'd vote for your `because target 
only supports DWARF `



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D92617/new/

https://reviews.llvm.org/D92617

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D92617: [DWARF] Allow toolchain to adjust specified DWARF version.

2020-12-04 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: clang/test/Driver/debug-options.c:364-366
+// GEMBED_2:  warning: debug information option '-gembed-source' is not 
supported for target
 // NOGEMBED_5-NOT:  "-gembed-source"
+// NOGEMBED_2-NOT:  warning: debug information option '-gembed-source' is not 
supported for target

dblaikie wrote:
> This is a bit of a loss in fidelity - might need a new diagnostic message (or 
> go hunting around for a more general purpose one than this one at least) to 
> say '-gembed-source' is ignored when not using DWARFv5. (in the nvptx case 
> it's "not supported on target", but in the existing cases covered by this 
> test it's "not supported because the user requested DWARFv2", for instance)
> 
> @JDevlieghere & @scott.linder for thoughts on this.
I agree that I'd prefer we detect whether the target-specific clamped version 
is to blame (and use the proposed warning message) or the original DWARF 
version is to blame (and use the old message).

If I were compiling for x86 and gave e.g. `-gdwarf-4 -gembed-source` and the 
error said "not supported by target" I'd probably get the wrong idea.

It would also be nice to retain the error semantics in the case where the user 
is explicitly requesting incompatible options.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D92617/new/

https://reviews.llvm.org/D92617

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D90419: [AMDGPU] Add gfx90c target

2020-10-30 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: llvm/test/Object/AMDGPU/elf-header-flags-mach.yaml:161
 
 # RUN: yaml2obj --docnum=41 %s > %t.o.41
 # RUN: llvm-readobj -s -file-headers %t.o.41 | FileCheck 
--check-prefixes=ELF-ALL,ELF-GFX1031 %s

Heads up, I just landed a refactor of this file in 
https://reviews.llvm.org/D90245

It would probably be easiest to just start from the existing test, and 
copy-paste following the pattern. The RUN lines can now be in GFX# order, you 
don't need to create a new YAML document, and the tests are grouped by GFX#


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90419/new/

https://reviews.llvm.org/D90419

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D89484: [AMDGPU][HIP] Switch default DWARF version to 5

2020-10-16 Thread Scott Linder via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGc4d10e7e9bb4: [AMDGPU][HIP] Switch default DWARF version to 
5 (authored by scott.linder).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89484/new/

https://reviews.llvm.org/D89484

Files:
  clang/lib/Driver/ToolChains/AMDGPU.h
  clang/lib/Driver/ToolChains/HIP.h
  clang/test/Driver/amdgpu-toolchain.c
  clang/test/Driver/hip-toolchain-dwarf.hip


Index: clang/test/Driver/hip-toolchain-dwarf.hip
===
--- clang/test/Driver/hip-toolchain-dwarf.hip
+++ clang/test/Driver/hip-toolchain-dwarf.hip
@@ -6,4 +6,4 @@
 // RUN:   -x hip --cuda-gpu-arch=gfx803 %s \
 // RUN:   -Xarch_gfx803 -g 2>&1 | FileCheck %s -check-prefix=DWARF_VER
 
-// DWARF_VER: "-dwarf-version=4"
+// DWARF_VER: "-dwarf-version=5"
Index: clang/test/Driver/amdgpu-toolchain.c
===
--- clang/test/Driver/amdgpu-toolchain.c
+++ clang/test/Driver/amdgpu-toolchain.c
@@ -8,7 +8,7 @@
 // AS_LINK: clang{{.*}} "-cc1as"
 // AS_LINK: ld.lld{{.*}} "-shared"
 
-// DWARF_VER: "-dwarf-version=4"
+// DWARF_VER: "-dwarf-version=5"
 
 // RUN: %clang -### -target amdgcn-amd-amdhsa -mcpu=gfx906 -nogpulib \
 // RUN:   -flto %s 2>&1 | FileCheck -check-prefix=LTO %s
Index: clang/lib/Driver/ToolChains/HIP.h
===
--- clang/lib/Driver/ToolChains/HIP.h
+++ clang/lib/Driver/ToolChains/HIP.h
@@ -99,7 +99,7 @@
   computeMSVCVersion(const Driver *D,
  const llvm::opt::ArgList &Args) const override;
 
-  unsigned GetDefaultDwarfVersion() const override { return 4; }
+  unsigned GetDefaultDwarfVersion() const override { return 5; }
 
   const ToolChain &HostTC;
 
Index: clang/lib/Driver/ToolChains/AMDGPU.h
===
--- clang/lib/Driver/ToolChains/AMDGPU.h
+++ clang/lib/Driver/ToolChains/AMDGPU.h
@@ -60,7 +60,7 @@
 public:
   AMDGPUToolChain(const Driver &D, const llvm::Triple &Triple,
   const llvm::opt::ArgList &Args);
-  unsigned GetDefaultDwarfVersion() const override { return 4; }
+  unsigned GetDefaultDwarfVersion() const override { return 5; }
   bool IsIntegratedAssemblerDefault() const override { return true; }
   bool IsMathErrnoDefault() const override { return false; }
 


Index: clang/test/Driver/hip-toolchain-dwarf.hip
===
--- clang/test/Driver/hip-toolchain-dwarf.hip
+++ clang/test/Driver/hip-toolchain-dwarf.hip
@@ -6,4 +6,4 @@
 // RUN:   -x hip --cuda-gpu-arch=gfx803 %s \
 // RUN:   -Xarch_gfx803 -g 2>&1 | FileCheck %s -check-prefix=DWARF_VER
 
-// DWARF_VER: "-dwarf-version=4"
+// DWARF_VER: "-dwarf-version=5"
Index: clang/test/Driver/amdgpu-toolchain.c
===
--- clang/test/Driver/amdgpu-toolchain.c
+++ clang/test/Driver/amdgpu-toolchain.c
@@ -8,7 +8,7 @@
 // AS_LINK: clang{{.*}} "-cc1as"
 // AS_LINK: ld.lld{{.*}} "-shared"
 
-// DWARF_VER: "-dwarf-version=4"
+// DWARF_VER: "-dwarf-version=5"
 
 // RUN: %clang -### -target amdgcn-amd-amdhsa -mcpu=gfx906 -nogpulib \
 // RUN:   -flto %s 2>&1 | FileCheck -check-prefix=LTO %s
Index: clang/lib/Driver/ToolChains/HIP.h
===
--- clang/lib/Driver/ToolChains/HIP.h
+++ clang/lib/Driver/ToolChains/HIP.h
@@ -99,7 +99,7 @@
   computeMSVCVersion(const Driver *D,
  const llvm::opt::ArgList &Args) const override;
 
-  unsigned GetDefaultDwarfVersion() const override { return 4; }
+  unsigned GetDefaultDwarfVersion() const override { return 5; }
 
   const ToolChain &HostTC;
 
Index: clang/lib/Driver/ToolChains/AMDGPU.h
===
--- clang/lib/Driver/ToolChains/AMDGPU.h
+++ clang/lib/Driver/ToolChains/AMDGPU.h
@@ -60,7 +60,7 @@
 public:
   AMDGPUToolChain(const Driver &D, const llvm::Triple &Triple,
   const llvm::opt::ArgList &Args);
-  unsigned GetDefaultDwarfVersion() const override { return 4; }
+  unsigned GetDefaultDwarfVersion() const override { return 5; }
   bool IsIntegratedAssemblerDefault() const override { return true; }
   bool IsMathErrnoDefault() const override { return false; }
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D89484: [AMDGPU][HIP] Switch default DWARF version to 5

2020-10-15 Thread Scott Linder via Phabricator via cfe-commits

scott.linder created this revision.
Herald added subscribers: cfe-commits, kerbowa, t-tye, tpr, dstuttard, yaxunl, 
nhaehnle, jvesely, kzhuravl.
Herald added a project: clang.
scott.linder requested review of this revision.
Herald added a subscriber: wdng.

Another attempt at this, see D59008  for 
previous attempt.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D89484

Files:
  clang/lib/Driver/ToolChains/AMDGPU.h
  clang/lib/Driver/ToolChains/HIP.h
  clang/test/Driver/amdgpu-toolchain.c
  clang/test/Driver/hip-toolchain-dwarf.hip


Index: clang/test/Driver/hip-toolchain-dwarf.hip
===
--- clang/test/Driver/hip-toolchain-dwarf.hip
+++ clang/test/Driver/hip-toolchain-dwarf.hip
@@ -6,4 +6,4 @@
 // RUN:   -x hip --cuda-gpu-arch=gfx803 %s \
 // RUN:   -Xarch_gfx803 -g 2>&1 | FileCheck %s -check-prefix=DWARF_VER
 
-// DWARF_VER: "-dwarf-version=4"
+// DWARF_VER: "-dwarf-version=5"
Index: clang/test/Driver/amdgpu-toolchain.c
===
--- clang/test/Driver/amdgpu-toolchain.c
+++ clang/test/Driver/amdgpu-toolchain.c
@@ -8,7 +8,7 @@
 // AS_LINK: clang{{.*}} "-cc1as"
 // AS_LINK: ld.lld{{.*}} "-shared"
 
-// DWARF_VER: "-dwarf-version=4"
+// DWARF_VER: "-dwarf-version=5"
 
 // RUN: %clang -### -target amdgcn-amd-amdhsa -mcpu=gfx906 -nogpulib \
 // RUN:   -flto %s 2>&1 | FileCheck -check-prefix=LTO %s
Index: clang/lib/Driver/ToolChains/HIP.h
===
--- clang/lib/Driver/ToolChains/HIP.h
+++ clang/lib/Driver/ToolChains/HIP.h
@@ -99,7 +99,7 @@
   computeMSVCVersion(const Driver *D,
  const llvm::opt::ArgList &Args) const override;
 
-  unsigned GetDefaultDwarfVersion() const override { return 4; }
+  unsigned GetDefaultDwarfVersion() const override { return 5; }
 
   const ToolChain &HostTC;
 
Index: clang/lib/Driver/ToolChains/AMDGPU.h
===
--- clang/lib/Driver/ToolChains/AMDGPU.h
+++ clang/lib/Driver/ToolChains/AMDGPU.h
@@ -60,7 +60,7 @@
 public:
   AMDGPUToolChain(const Driver &D, const llvm::Triple &Triple,
   const llvm::opt::ArgList &Args);
-  unsigned GetDefaultDwarfVersion() const override { return 4; }
+  unsigned GetDefaultDwarfVersion() const override { return 5; }
   bool IsIntegratedAssemblerDefault() const override { return true; }
   bool IsMathErrnoDefault() const override { return false; }
 


Index: clang/test/Driver/hip-toolchain-dwarf.hip
===
--- clang/test/Driver/hip-toolchain-dwarf.hip
+++ clang/test/Driver/hip-toolchain-dwarf.hip
@@ -6,4 +6,4 @@
 // RUN:   -x hip --cuda-gpu-arch=gfx803 %s \
 // RUN:   -Xarch_gfx803 -g 2>&1 | FileCheck %s -check-prefix=DWARF_VER
 
-// DWARF_VER: "-dwarf-version=4"
+// DWARF_VER: "-dwarf-version=5"
Index: clang/test/Driver/amdgpu-toolchain.c
===
--- clang/test/Driver/amdgpu-toolchain.c
+++ clang/test/Driver/amdgpu-toolchain.c
@@ -8,7 +8,7 @@
 // AS_LINK: clang{{.*}} "-cc1as"
 // AS_LINK: ld.lld{{.*}} "-shared"
 
-// DWARF_VER: "-dwarf-version=4"
+// DWARF_VER: "-dwarf-version=5"
 
 // RUN: %clang -### -target amdgcn-amd-amdhsa -mcpu=gfx906 -nogpulib \
 // RUN:   -flto %s 2>&1 | FileCheck -check-prefix=LTO %s
Index: clang/lib/Driver/ToolChains/HIP.h
===
--- clang/lib/Driver/ToolChains/HIP.h
+++ clang/lib/Driver/ToolChains/HIP.h
@@ -99,7 +99,7 @@
   computeMSVCVersion(const Driver *D,
  const llvm::opt::ArgList &Args) const override;
 
-  unsigned GetDefaultDwarfVersion() const override { return 4; }
+  unsigned GetDefaultDwarfVersion() const override { return 5; }
 
   const ToolChain &HostTC;
 
Index: clang/lib/Driver/ToolChains/AMDGPU.h
===
--- clang/lib/Driver/ToolChains/AMDGPU.h
+++ clang/lib/Driver/ToolChains/AMDGPU.h
@@ -60,7 +60,7 @@
 public:
   AMDGPUToolChain(const Driver &D, const llvm::Triple &Triple,
   const llvm::opt::ArgList &Args);
-  unsigned GetDefaultDwarfVersion() const override { return 4; }
+  unsigned GetDefaultDwarfVersion() const override { return 5; }
   bool IsIntegratedAssemblerDefault() const override { return true; }
   bool IsMathErrnoDefault() const override { return false; }
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D88976: [clang] Use correct address space for global variable debug info

2020-10-14 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: clang/lib/Basic/Targets/NVPTX.h:47
 -1, // Default, opencl_private or opencl_generic - not defined
-5,  // opencl_global
+-1, // opencl_global
 -1,

Does anyone have any thoughts on this change specifically? Is someone more 
familiar with NVPTX willing to weigh in on whether it makes more sense to carry 
the address space throughout the compiler explicitly and "drop" it late in the 
DWARF emission, or to do what I did in the current patch (drop it early).

I would lean towards updating the patch to do the latter, but I wanted to get 
feedback before plunging off to do it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88976/new/

https://reviews.llvm.org/D88976

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D88978: [WIP] Attach debug intrinsics to allocas, and use correct address space

2020-10-12 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

In D88978#2325991 , @ABataev wrote:

> In D88978#2325982 , @scott.linder 
> wrote:
>
>> @ABataev Sorry if I'm pulling you in without enough context/work on my end, 
>> but I wanted to ask how the Clang codegen for OpenMP locals works at a high 
>> level?
>>
>> Is the idea that instead of an `alloc` the frontend can insert calls into 
>> the runtime in some cases, like `__kmpc_alloc` (e.g. for `firstprivate` as 
>> in https://reviews.llvm.org/D5140)?
>
> Yes, right.
>
>> If that is the case, I assume there is no equivalent to SROA/Mem2Reg here?
>
> I assume, no.
>
>> I am trying to understand conceptually where the debug info for the source 
>> level local should be tied to in the IR, and at least for locals which use 
>> `alloc` it has turned out to be much simpler to tie the variable directly to 
>> the `alloc` itself rather than bitcasts and things which obscure the 
>> relationship.

Ok, thank you! I think the simplest thing is for me to update the patch to tie 
the debug info to the call to the runtime allocator, then.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88978/new/

https://reviews.llvm.org/D88978

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D88978: [WIP] Attach debug intrinsics to allocas, and use correct address space

2020-10-12 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

@ABataev Sorry if I'm pulling you in without enough context/work on my end, but 
I wanted to ask how the Clang codegen for OpenMP locals works at a high level?

Is the idea that instead of an `alloc` the frontend can insert calls into the 
runtime in some cases, like `__kmpc_alloc` (e.g. for `firstprivate` as in 
https://reviews.llvm.org/D5140)?

If that is the case, I assume there is no equivalent to SROA/Mem2Reg here? I am 
trying to understand conceptually where the debug info for the source level 
local should be tied to in the IR, and at least for locals which use `alloc` it 
has turned out to be much simpler to tie the variable directly to the `alloc` 
itself rather than bitcasts and things which obscure the relationship.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88978/new/

https://reviews.llvm.org/D88978

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D88978: [WIP] Attach debug intrinsics to allocas, and use correct address space

2020-10-09 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: clang/lib/CodeGen/CGDecl.cpp:1579
   if (EmitDebugInfo && HaveInsertPoint()) {
-Address DebugAddr = address;
+Address DebugAddr = AllocaAddr.isValid() ? AllocaAddr : address;
 bool UsePointerValue = NRVO && ReturnValuePointer.isValid();

aprantl wrote:
> This is unintuitive — can you add a comment explaining why it may not be 
> valid and why address should only be used then?
This is kind of a cop-out on my part, the only path where this occurs is for 
OpenMP, and I think I just need to understand better what is happening. This 
also occurs for NRVO, but that is explicitly called out just below this. I'll 
try to understand this more completely and see if I can represent the 
possibilities more direclty.

Somewhat related, it is a bit unsettling reading through this, as the invariant 
seems to be that `address.isValid()` by the time the call to 
`setAddrOfLocalVar` is called, which makes sense but isn't explicit anywhere in 
the multiple nested `if`s. I'll also add an assert of that.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88978/new/

https://reviews.llvm.org/D88978

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D88976: [clang] Use correct address space for global variable debug info

2020-10-09 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: clang/test/CodeGenHIP/debug-info-address-class.hip:8
+
+// CHECK-DAG: ![[FILEVAR0:[0-9]+]] = distinct !DIGlobalVariable(name: 
"FileVar0", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: 
!{{[0-9]+}}, isLocal: false, isDefinition: true)
+// CHECK-DAG: !DIGlobalVariableExpression(var: ![[FILEVAR0]], expr: 
!DIExpression())

aprantl wrote:
> They are more convenient, but having very many CHECK_DAGs is also really slow 
> — would it be feasible to reorder them and use CHECKs? Perhaps by running 
> clang/FileCheck twice with different sets of CHECK lines?
The test is pretty short, so I just re-ordered the checks to match how they 
appear in the output (and used more descriptive names to make it easier to 
follow).

This does mean the test relies on the order these things are traversed. Some 
bits are maybe a bit surprising, like how the metadata for the `__shared__` 
auto variable comes before the argument, but I don't imagine it is liable to 
change often/accidentally.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88976/new/

https://reviews.llvm.org/D88976

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D88976: [clang] Use correct address space for global variable debug info

2020-10-09 Thread Scott Linder via Phabricator via cfe-commits

scott.linder updated this revision to Diff 297309.
scott.linder added a comment.

Replace uses of CHECK-DAG, use more meaningful names in test


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88976/new/

https://reviews.llvm.org/D88976

Files:
  clang/lib/Basic/Targets/NVPTX.h
  clang/lib/CodeGen/CGDebugInfo.cpp
  clang/test/CodeGenHIP/debug-info-address-class.hip


Index: clang/test/CodeGenHIP/debug-info-address-class.hip
===
--- /dev/null
+++ clang/test/CodeGenHIP/debug-info-address-class.hip
@@ -0,0 +1,37 @@
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -x hip -emit-llvm 
-fcuda-is-device -debug-info-kind=limited -dwarf-version=4 -o - %s | FileCheck 
%s
+
+#define __device__ __attribute__((device))
+#define __shared__ __attribute__((shared))
+#define __constant__ __attribute__((constant))
+
+__device__ int FileVarDevice;
+__device__ __shared__ int FileVarDeviceShared;
+__device__ __constant__ int FileVarDeviceConstant;
+
+__device__ void kernel1(
+// FIXME This should be in the private address space.
+// CHECK: call void @llvm.dbg.declare(metadata i32* {{.*}}, metadata 
![[ARG:[0-9]+]], metadata !DIExpression()), !dbg !{{[0-9]+}}
+int Arg) {
+  __shared__ int FuncVarShared;
+
+  // FIXME This should be in the private address space.
+  // CHECK: call void @llvm.dbg.declare(metadata i32* {{.*}}, metadata 
![[FUNC_VAR:[0-9]+]], metadata !DIExpression()), !dbg !{{[0-9]+}}
+  int FuncVar;
+}
+
+// CHECK: !DIGlobalVariableExpression(var: ![[FILE_VAR_DEVICE:[0-9]+]], expr: 
!DIExpression())
+// CHECK: ![[FILE_VAR_DEVICE]] = distinct !DIGlobalVariable(name: 
"FileVarDevice", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: 
!{{[0-9]+}}, isLocal: false, isDefinition: true)
+
+// CHECK: !DIGlobalVariableExpression(var: ![[FILE_VAR_DEVICE_SHARED:[0-9]+]], 
expr: !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef))
+// CHECK: ![[FILE_VAR_DEVICE_SHARED]] = distinct !DIGlobalVariable(name: 
"FileVarDeviceShared", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, 
type: !{{[0-9]+}}, isLocal: false, isDefinition: true)
+
+// CHECK: !DIGlobalVariableExpression(var: 
![[FILE_VAR_DEVICE_CONSTANT:[0-9]+]], expr: !DIExpression())
+// CHECK: ![[FILE_VAR_DEVICE_CONSTANT]] = distinct !DIGlobalVariable(name: 
"FileVarDeviceConstant", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: 
{{[0-9]+}}, type: !{{[0-9]+}}, isLocal: false, isDefinition: true)
+
+// CHECK: !DIGlobalVariableExpression(var: ![[FUNC_VAR_SHARED:[0-9]+]], expr: 
!DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef))
+// CHECK: ![[FUNC_VAR_SHARED]] = distinct !DIGlobalVariable(name: 
"FuncVarShared", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: 
!{{[0-9]+}}, isLocal: true, isDefinition: true)
+
+// CHECK: ![[ARG]] = !DILocalVariable(name: "Arg", arg: {{[0-9]+}}, scope: 
!{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: !{{[0-9]+}})
+
+// CHECK: ![[FUNC_VAR]] = !DILocalVariable(name: "FuncVar", scope: 
!{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: !{{[0-9]+}})
Index: clang/lib/CodeGen/CGDebugInfo.cpp
===
--- clang/lib/CodeGen/CGDebugInfo.cpp
+++ clang/lib/CodeGen/CGDebugInfo.cpp
@@ -4667,15 +4667,7 @@
 
 SmallVector Expr;
 unsigned AddressSpace =
-CGM.getContext().getTargetAddressSpace(D->getType());
-if (CGM.getLangOpts().CUDA && CGM.getLangOpts().CUDAIsDevice) {
-  if (D->hasAttr())
-AddressSpace =
-CGM.getContext().getTargetAddressSpace(LangAS::cuda_shared);
-  else if (D->hasAttr())
-AddressSpace =
-CGM.getContext().getTargetAddressSpace(LangAS::cuda_constant);
-}
+
CGM.getContext().getTargetAddressSpace(CGM.GetGlobalVarAddressSpace(D));
 AppendAddressSpaceXDeref(AddressSpace, Expr);
 
 GVE = DBuilder.createGlobalVariableExpression(
Index: clang/lib/Basic/Targets/NVPTX.h
===
--- clang/lib/Basic/Targets/NVPTX.h
+++ clang/lib/Basic/Targets/NVPTX.h
@@ -44,7 +44,7 @@
 /// 
https://docs.nvidia.com/cuda/archive/10.0/ptx-writers-guide-to-interoperability/index.html#cuda-specific-dwarf
 static const int NVPTXDWARFAddrSpaceMap[] = {
 -1, // Default, opencl_private or opencl_generic - not defined
-5,  // opencl_global
+-1, // opencl_global
 -1,
 8,  // opencl_local or cuda_shared
 4,  // opencl_constant or cuda_constant


Index: clang/test/CodeGenHIP/debug-info-address-class.hip
===
--- /dev/null
+++ clang/test/CodeGenHIP/debug-info-address-class.hip
@@ -0,0 +1,37 @@
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -x hip -emit-llvm -fcuda-is-device -debug-info-kind=limited -dwarf-version=4 -o - %s | FileCheck %s
+
+#de

[PATCH] D88754: [clang] Add a test for CGDebugInfo treatment of blocks

2020-10-09 Thread Scott Linder via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rG40cef5a00eb8: [clang] Add a test for CGDebugInfo treatment 
of blocks (authored by scott.linder).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88754/new/

https://reviews.llvm.org/D88754

Files:
  clang/test/CodeGen/debug-info-block-expr.c


Index: clang/test/CodeGen/debug-info-block-expr.c
===
--- /dev/null
+++ clang/test/CodeGen/debug-info-block-expr.c
@@ -0,0 +1,9 @@
+// RUN: %clang_cc1 -fblocks -debug-info-kind=limited -emit-llvm -o - %s | 
FileCheck %s
+// Verify that the desired DIExpression are generated for blocks.
+
+void test() {
+// CHECK: call void @llvm.dbg.declare({{.*}}!DIExpression(DW_OP_plus_uconst, 
{{[0-9]+}}, DW_OP_deref, DW_OP_plus_uconst, {{[0-9]+}}){{.*}})
+  __block int i;
+// CHECK: call void @llvm.dbg.declare({{.*}}!DIExpression(DW_OP_deref, 
DW_OP_plus_uconst, {{[0-9]+}}, DW_OP_deref, DW_OP_plus_uconst, {{[0-9]+}}, 
DW_OP_deref, DW_OP_plus_uconst, {{[0-9]+}}){{.*}})
+  ^ { i = 1; }();
+}


Index: clang/test/CodeGen/debug-info-block-expr.c
===
--- /dev/null
+++ clang/test/CodeGen/debug-info-block-expr.c
@@ -0,0 +1,9 @@
+// RUN: %clang_cc1 -fblocks -debug-info-kind=limited -emit-llvm -o - %s | FileCheck %s
+// Verify that the desired DIExpression are generated for blocks.
+
+void test() {
+// CHECK: call void @llvm.dbg.declare({{.*}}!DIExpression(DW_OP_plus_uconst, {{[0-9]+}}, DW_OP_deref, DW_OP_plus_uconst, {{[0-9]+}}){{.*}})
+  __block int i;
+// CHECK: call void @llvm.dbg.declare({{.*}}!DIExpression(DW_OP_deref, DW_OP_plus_uconst, {{[0-9]+}}, DW_OP_deref, DW_OP_plus_uconst, {{[0-9]+}}, DW_OP_deref, DW_OP_plus_uconst, {{[0-9]+}}){{.*}})
+  ^ { i = 1; }();
+}
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D88978: [WIP] Attach debug intrinsics to allocas, and use correct address space

2020-10-07 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

I need to add more tests, but I wanted to float the idea of the change and get 
feedback first.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88978/new/

https://reviews.llvm.org/D88978

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D88978: [WIP] Attach debug intrinsics to allocas, and use correct address space

2020-10-07 Thread Scott Linder via Phabricator via cfe-commits

scott.linder created this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.
scott.linder requested review of this revision.
Herald added a reviewer: jdoerfert.
Herald added a subscriber: sstefan1.

A dbg.declare for a local/parameter describes the hardware location of
the source variable's value. This matches up with the semantics of the
alloca for the variable, whereas any addrspacecast inserted in order to
implement some source-level notion of address spaces does not.

When creating the dbg.declare intrinsic, attach it directly to the
alloca, not to any addrspacecast.

Update the DIExpression with the address space of the alloca, rather
than use the address space associated with the source level type.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D88978

Files:
  clang/lib/CodeGen/CGDebugInfo.cpp
  clang/lib/CodeGen/CGDecl.cpp
  clang/test/CodeGenHIP/debug-info-address-class.hip


Index: clang/test/CodeGenHIP/debug-info-address-class.hip
===
--- clang/test/CodeGenHIP/debug-info-address-class.hip
+++ clang/test/CodeGenHIP/debug-info-address-class.hip
@@ -16,16 +16,13 @@
 __device__ __constant__ int FileVar2;
 
 __device__ void kernel1(
-// FIXME This should be in the private address space.
 // CHECK-DAG: ![[ARG:[0-9]+]] = !DILocalVariable(name: "Arg", arg: 
{{[0-9]+}}, scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: 
!{{[0-9]+}})
-// CHECK-DAG: call void @llvm.dbg.declare(metadata i32* {{.*}}, metadata 
![[ARG]], metadata !DIExpression()), !dbg !{{[0-9]+}}
+// CHECK-DAG: call void @llvm.dbg.declare(metadata i32 addrspace(5)* 
{{.*}}, metadata ![[ARG]], metadata !DIExpression(DW_OP_constu, 1, DW_OP_swap, 
DW_OP_xderef)), !dbg !{{[0-9]+}}
 int Arg) {
 // CHECK-DAG: ![[FUNCVAR0:[0-9]+]] = distinct !DIGlobalVariable(name: 
"FuncVar0", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: 
!{{[0-9]+}}, isLocal: true, isDefinition: true)
 // CHECK-DAG: !DIGlobalVariableExpression(var: ![[FUNCVAR0]], expr: 
!DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef))
   __shared__ int FuncVar0;
-
-  // FIXME This should be in the private address space.
   // CHECK-DAG: ![[FUNCVAR1:[0-9]+]] = !DILocalVariable(name: "FuncVar1", 
scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: !{{[0-9]+}})
-  // CHECK-DAG: call void @llvm.dbg.declare(metadata i32* {{.*}}, metadata 
![[FUNCVAR1]], metadata !DIExpression()), !dbg !{{[0-9]+}}
+  // CHECK-DAG: call void @llvm.dbg.declare(metadata i32 addrspace(5)* {{.*}}, 
metadata ![[FUNCVAR1]], metadata !DIExpression(DW_OP_constu, 1, DW_OP_swap, 
DW_OP_xderef)), !dbg !{{[0-9]+}}
   int FuncVar1;
 }
Index: clang/lib/CodeGen/CGDecl.cpp
===
--- clang/lib/CodeGen/CGDecl.cpp
+++ clang/lib/CodeGen/CGDecl.cpp
@@ -1576,7 +1576,7 @@
 
   // Emit debug info for local var declaration.
   if (EmitDebugInfo && HaveInsertPoint()) {
-Address DebugAddr = address;
+Address DebugAddr = AllocaAddr.isValid() ? AllocaAddr : address;
 bool UsePointerValue = NRVO && ReturnValuePointer.isValid();
 DI->setLocation(D.getLocation());
 
@@ -2417,11 +2417,12 @@
   }
 
   Address DeclPtr = Address::invalid();
+  Address DebugAddr = Address::invalid();
   bool DoStore = false;
   bool IsScalar = hasScalarEvaluationKind(Ty);
   // If we already have a pointer to the argument, reuse the input pointer.
   if (Arg.isIndirect()) {
-DeclPtr = Arg.getIndirectAddress();
+DeclPtr = DebugAddr = Arg.getIndirectAddress();
 // If we have a prettier pointer type at this point, bitcast to that.
 unsigned AS = DeclPtr.getType()->getAddressSpace();
 llvm::Type *IRTy = ConvertTypeForMem(Ty)->getPointerTo(AS);
@@ -2466,11 +2467,11 @@
 ? CGM.getOpenMPRuntime().getAddressOfLocalVariable(*this, &D)
 : Address::invalid();
 if (getLangOpts().OpenMP && OpenMPLocalAddr.isValid()) {
-  DeclPtr = OpenMPLocalAddr;
+  DeclPtr = DebugAddr = OpenMPLocalAddr;
 } else {
   // Otherwise, create a temporary to hold the value.
   DeclPtr = CreateMemTemp(Ty, getContext().getDeclAlign(&D),
-  D.getName() + ".addr");
+  D.getName() + ".addr", &DebugAddr);
 }
 DoStore = true;
   }
@@ -2545,7 +2546,7 @@
   // Emit debug info for param declarations in non-thunk functions.
   if (CGDebugInfo *DI = getDebugInfo()) {
 if (CGM.getCodeGenOpts().hasReducedDebugInfo() && !CurFuncIsThunk) {
-  DI->EmitDeclareOfArgVariable(&D, DeclPtr.getPointer(), ArgNo, Builder);
+  DI->EmitDeclareOfArgVariable(&D, DebugAddr.getPointer(), ArgNo, Builder);
 }
   }
 
Index: clang/lib/CodeGen/CGDebugInfo.cpp
===
--- clang/lib/CodeGen/CGDebugInfo.cpp
+++ clang/lib/CodeGen/CGDebugInfo.cpp
@@ -4186,7 +4186,8 @@
 
   auto Al

[PATCH] D88976: [clang] Use correct address space for global variable debug info

2020-10-07 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

I'm not certain I fully understand NVPTX's relationship with its debugger, but 
from https://reviews.llvm.org/D57162 I gather the "default" address space in 
the debugger is global, and so the frontend omits it rather than explicitly 
mentioning it. I think it would be simpler to carry this information throughout 
the compiler, and only strip it late in the backend as a quirk controllable via 
some "optimize for NVPTX debugger", but in the patch as it currently is I 
instead just update `NVPTXDWARFAddrSpaceMap`.

I'm also not addressing the same issue with locals/autos. I left a "FIXME" in 
the new HIP test, and I wanted to ask if anyone can help me understand why the 
target address space isn't "available" at this point. I.e. with 
`LangAS::Default` we really want to notice at this point that the target 
address space is actually private for AMDGPU. The frontend does emit an alloca 
with the correct target address space, but I think this is via some static 
"alloca address space" mechanism? Is this a reasonable thing to use in 
`CGDebugInfo` too?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88976/new/

https://reviews.llvm.org/D88976

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D88976: [clang] Use correct address space for global variable debug info

2020-10-07 Thread Scott Linder via Phabricator via cfe-commits

scott.linder created this revision.
Herald added subscribers: cfe-commits, jholewinski.
Herald added a project: clang.
scott.linder requested review of this revision.

The target needs to be queried here, but previously we seemed to only
duplicate CUDA's (and so HIP's) behavior, and only partially. Use the
same function as codegen to find the correct address space.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D88976

Files:
  clang/lib/Basic/Targets/NVPTX.h
  clang/lib/CodeGen/CGDebugInfo.cpp
  clang/test/CodeGenHIP/debug-info-address-class.hip


Index: clang/test/CodeGenHIP/debug-info-address-class.hip
===
--- /dev/null
+++ clang/test/CodeGenHIP/debug-info-address-class.hip
@@ -0,0 +1,31 @@
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -x hip -emit-llvm 
-fcuda-is-device -debug-info-kind=limited -dwarf-version=4 -o - %s | FileCheck 
%s
+
+#define __device__ __attribute__((device))
+#define __shared__ __attribute__((shared))
+#define __constant__ __attribute__((constant))
+
+// CHECK-DAG: ![[FILEVAR0:[0-9]+]] = distinct !DIGlobalVariable(name: 
"FileVar0", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: 
!{{[0-9]+}}, isLocal: false, isDefinition: true)
+// CHECK-DAG: !DIGlobalVariableExpression(var: ![[FILEVAR0]], expr: 
!DIExpression())
+__device__ int FileVar0;
+// CHECK-DAG: ![[FILEVAR1:[0-9]+]] = distinct !DIGlobalVariable(name: 
"FileVar1", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: 
!{{[0-9]+}}, isLocal: false, isDefinition: true)
+// CHECK-DAG: !DIGlobalVariableExpression(var: ![[FILEVAR1]], expr: 
!DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef))
+__device__ __shared__ int FileVar1;
+// CHECK-DAG: ![[FILEVAR2:[0-9]+]] = distinct !DIGlobalVariable(name: 
"FileVar2", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: 
!{{[0-9]+}}, isLocal: false, isDefinition: true)
+// CHECK-DAG: !DIGlobalVariableExpression(var: ![[FILEVAR2]], expr: 
!DIExpression())
+__device__ __constant__ int FileVar2;
+
+__device__ void kernel1(
+// FIXME This should be in the private address space.
+// CHECK-DAG: ![[ARG:[0-9]+]] = !DILocalVariable(name: "Arg", arg: 
{{[0-9]+}}, scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: 
!{{[0-9]+}})
+// CHECK-DAG: call void @llvm.dbg.declare(metadata i32* {{.*}}, metadata 
![[ARG]], metadata !DIExpression()), !dbg !{{[0-9]+}}
+int Arg) {
+// CHECK-DAG: ![[FUNCVAR0:[0-9]+]] = distinct !DIGlobalVariable(name: 
"FuncVar0", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: 
!{{[0-9]+}}, isLocal: true, isDefinition: true)
+// CHECK-DAG: !DIGlobalVariableExpression(var: ![[FUNCVAR0]], expr: 
!DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef))
+  __shared__ int FuncVar0;
+
+  // FIXME This should be in the private address space.
+  // CHECK-DAG: ![[FUNCVAR1:[0-9]+]] = !DILocalVariable(name: "FuncVar1", 
scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: !{{[0-9]+}})
+  // CHECK-DAG: call void @llvm.dbg.declare(metadata i32* {{.*}}, metadata 
![[FUNCVAR1]], metadata !DIExpression()), !dbg !{{[0-9]+}}
+  int FuncVar1;
+}
Index: clang/lib/CodeGen/CGDebugInfo.cpp
===
--- clang/lib/CodeGen/CGDebugInfo.cpp
+++ clang/lib/CodeGen/CGDebugInfo.cpp
@@ -4664,15 +4664,7 @@
 
 SmallVector Expr;
 unsigned AddressSpace =
-CGM.getContext().getTargetAddressSpace(D->getType());
-if (CGM.getLangOpts().CUDA && CGM.getLangOpts().CUDAIsDevice) {
-  if (D->hasAttr())
-AddressSpace =
-CGM.getContext().getTargetAddressSpace(LangAS::cuda_shared);
-  else if (D->hasAttr())
-AddressSpace =
-CGM.getContext().getTargetAddressSpace(LangAS::cuda_constant);
-}
+
CGM.getContext().getTargetAddressSpace(CGM.GetGlobalVarAddressSpace(D));
 AppendAddressSpaceXDeref(AddressSpace, Expr);
 
 GVE = DBuilder.createGlobalVariableExpression(
Index: clang/lib/Basic/Targets/NVPTX.h
===
--- clang/lib/Basic/Targets/NVPTX.h
+++ clang/lib/Basic/Targets/NVPTX.h
@@ -44,7 +44,7 @@
 /// 
https://docs.nvidia.com/cuda/archive/10.0/ptx-writers-guide-to-interoperability/index.html#cuda-specific-dwarf
 static const int NVPTXDWARFAddrSpaceMap[] = {
 -1, // Default, opencl_private or opencl_generic - not defined
-5,  // opencl_global
+-1, // opencl_global
 -1,
 8,  // opencl_local or cuda_shared
 4,  // opencl_constant or cuda_constant


Index: clang/test/CodeGenHIP/debug-info-address-class.hip
===
--- /dev/null
+++ clang/test/CodeGenHIP/debug-info-address-class.hip
@@ -0,0 +1,31 @@
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -x hip -emit-llvm -fcuda-is-device -debug-info-kin

[PATCH] D88754: [clang] Add a test for CGDebugInfo treatment of blocks

2020-10-06 Thread Scott Linder via Phabricator via cfe-commits

scott.linder updated this revision to Diff 296558.
scott.linder added a comment.

Don't hardcode x86_64-specific offsets


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88754/new/

https://reviews.llvm.org/D88754

Files:
  clang/test/CodeGen/debug-info-block-expr.c


Index: clang/test/CodeGen/debug-info-block-expr.c
===
--- /dev/null
+++ clang/test/CodeGen/debug-info-block-expr.c
@@ -0,0 +1,9 @@
+// RUN: %clang_cc1 -fblocks -debug-info-kind=limited -emit-llvm -o - %s | 
FileCheck %s
+// Verify that the desired DIExpression are generated for blocks.
+
+void test() {
+// CHECK: call void @llvm.dbg.declare({{.*}}!DIExpression(DW_OP_plus_uconst, 
{{[0-9]+}}, DW_OP_deref, DW_OP_plus_uconst, {{[0-9]+}}){{.*}})
+  __block int i;
+// CHECK: call void @llvm.dbg.declare({{.*}}!DIExpression(DW_OP_deref, 
DW_OP_plus_uconst, {{[0-9]+}}, DW_OP_deref, DW_OP_plus_uconst, {{[0-9]+}}, 
DW_OP_deref, DW_OP_plus_uconst, {{[0-9]+}}){{.*}})
+  ^ { i = 1; }();
+}


Index: clang/test/CodeGen/debug-info-block-expr.c
===
--- /dev/null
+++ clang/test/CodeGen/debug-info-block-expr.c
@@ -0,0 +1,9 @@
+// RUN: %clang_cc1 -fblocks -debug-info-kind=limited -emit-llvm -o - %s | FileCheck %s
+// Verify that the desired DIExpression are generated for blocks.
+
+void test() {
+// CHECK: call void @llvm.dbg.declare({{.*}}!DIExpression(DW_OP_plus_uconst, {{[0-9]+}}, DW_OP_deref, DW_OP_plus_uconst, {{[0-9]+}}){{.*}})
+  __block int i;
+// CHECK: call void @llvm.dbg.declare({{.*}}!DIExpression(DW_OP_deref, DW_OP_plus_uconst, {{[0-9]+}}, DW_OP_deref, DW_OP_plus_uconst, {{[0-9]+}}, DW_OP_deref, DW_OP_plus_uconst, {{[0-9]+}}){{.*}})
+  ^ { i = 1; }();
+}
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D88754: [clang] Add a test for CGDebugInfo treatment of blocks

2020-10-02 Thread Scott Linder via Phabricator via cfe-commits

scott.linder created this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.
scott.linder requested review of this revision.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D88754

Files:
  clang/test/CodeGen/debug-info-block-expr.c


Index: clang/test/CodeGen/debug-info-block-expr.c
===
--- /dev/null
+++ clang/test/CodeGen/debug-info-block-expr.c
@@ -0,0 +1,9 @@
+// RUN: %clang_cc1 -fblocks -debug-info-kind=limited -emit-llvm -o - %s | 
FileCheck %s
+// Verify that the desired DIExpression are generated for blocks.
+
+void test() {
+// CHECK: call void @llvm.dbg.declare({{.*}}!DIExpression(DW_OP_plus_uconst, 
8, DW_OP_deref, DW_OP_plus_uconst, 24){{.*}})
+  __block int i;
+// CHECK: call void @llvm.dbg.declare({{.*}}!DIExpression(DW_OP_deref, 
DW_OP_plus_uconst, 32, DW_OP_deref, DW_OP_plus_uconst, 8, DW_OP_deref, 
DW_OP_plus_uconst, 24){{.*}})
+  ^ { i = 1; }();
+}


Index: clang/test/CodeGen/debug-info-block-expr.c
===
--- /dev/null
+++ clang/test/CodeGen/debug-info-block-expr.c
@@ -0,0 +1,9 @@
+// RUN: %clang_cc1 -fblocks -debug-info-kind=limited -emit-llvm -o - %s | FileCheck %s
+// Verify that the desired DIExpression are generated for blocks.
+
+void test() {
+// CHECK: call void @llvm.dbg.declare({{.*}}!DIExpression(DW_OP_plus_uconst, 8, DW_OP_deref, DW_OP_plus_uconst, 24){{.*}})
+  __block int i;
+// CHECK: call void @llvm.dbg.declare({{.*}}!DIExpression(DW_OP_deref, DW_OP_plus_uconst, 32, DW_OP_deref, DW_OP_plus_uconst, 8, DW_OP_deref, DW_OP_plus_uconst, 24){{.*}})
+  ^ { i = 1; }();
+}
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D79400: [CMAKE] Fix build failure when source directory is read only

2020-05-28 Thread Scott Linder via Phabricator via cfe-commits

scott.linder accepted this revision.
scott.linder added a comment.
This revision is now accepted and ready to land.

LGTM, thanks!




Comment at: llvm/cmake/modules/AddLLVM.cmake:1916
   endif()
-  if(EXISTS "${path}/.svn")
-set(svn_files

Nit: can this be in a separate commit (no need for another review), or at least 
be explicitly called out in the commit message of this patch?



Comment at: llvm/include/llvm/Support/CMakeLists.txt:12
+  if (NOT llvm_vc)
+set(fake_version_inc "${CMAKE_CURRENT_BINARY_DIR}/__FakeVCSRevision.h")
+  endif()

Nit: can you add a comment indicating this is never expected to exist, and is 
just used to force regeneration for the read-only case? Otherwise it may not be 
obvious to someone reading what the trick here is.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79400/new/

https://reviews.llvm.org/D79400



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D79400: [CMAKE] Fix build failure when source directory is read only

2020-05-27 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added inline comments.



Comment at: llvm/cmake/modules/AddLLVM.cmake:1945
+  if (NOT touch_head_result EQUAL 0)
+return()
+  endif()

This seems to implement the behavior that when the log is not present and is 
not writable the VCS header target is never re-build. Would it instead make 
more sense to conservatively assume it should //always// be rebuilt? The latter 
case is what is implemented by the stackoverflow snippet you mentioned 
previously, right?

If the assumption is that the failure to write the logs file implies this 
source is read-only, then it may make sense to assume it never needs to be 
rebuilt. However if the sources are changed in another context (e.g. `sudo -u 
user-with-write-privs git checkout rev`) this won't trigger a rebuild until the 
user re-runs the CMake config step manually.

I think I would lean towards the conservative approach of always rebuilding, 
but this patch as it stands is still an improvement over just falling over in 
this case.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79400/new/

https://reviews.llvm.org/D79400



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D79400: [CMAKE] Fix build failure when source directory is read only

2020-05-15 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

I'd be interested in the answer concerning why we need to avoid `git rev-parse 
HEAD`; it seems like the cleanest solution is to just always check if `git 
rev-parse HEAD` changes to determine whether to regenerate the header.

If that is not feasible for some reason, I would lean towards your option (2), 
but I think more is needed in this patch to ensure the generation script is 
always run, right? How does //removing// a dependency cause the target to be 
executed each build? Don't we need to detect the case where `.git/logs/HEAD` is 
missing and make the target depend on `ALL` or whatever the CMake notion of 
"always out of date" is? We will also need to be OK with the regression in what 
happens when you do a `repo` checkout in a read-write context, as that will 
always run the script whereas before it would have the same behavior as a 
normal checkout. I don't know what the implication of all of these changes are, 
though.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79400/new/

https://reviews.llvm.org/D79400



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D79400: [CMAKE] Fix build failure when source directory is read only

2020-05-07 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

Thank you for the explanation; I'm also still lost as to why `.git/logs/HEAD` 
is referenced at all then. I would propose removing 
`find_first_existing_vc_file` entirely, as it seems to be completely redundant 
if there is another mechanism for ensuring the VCS header is up-to-date without 
triggering gratuitous rebuilds, but I will defer to any of the other reviewers 
on the `blame` for the scripts.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79400/new/

https://reviews.llvm.org/D79400



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D79400: [CMAKE] Fix build failure when source directory is read only

2020-05-05 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

Can you provide more context in the diff? Looking at the current `master` 
locally it seems like this patch changes the behavior of the 
`llvm_vcsrevision_h` target when `logs/HEAD` isn't found, such that it only 
depends on the generation script. I.e. if you configure CMake without the 
`HEAD` file present, and then you do something which changes the working 
directory and creates `HEAD` the target will not be out of date and won't be 
rebuilt. Before this patch it would have noticed the change to `HEAD` and 
forced regenerating the version header.

I don't know how much this matters? In the case where the filesystem is 
read-only from CMake's perspective would it make more sense to always 
regenerate the header instead, or is the assumption that nobody else is 
modifying it either?

Also for the case where the filesystem is not read-only do we want to retain 
the old behavior? I.e. could we try to write the file and if it fails just 
proceed rather than `FATAL_ERROR`? I don't know if this is possible with 
CMake's `file()`?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79400/new/

https://reviews.llvm.org/D79400



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-13 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

In D77923#1978261 , @arsenm wrote:

> In D77923#1978172 , @scott.linder 
> wrote:
>
> > https://www.khronos.org/registry/OpenCL/sdk/2.0/docs/man/xhtml/preprocessorDirectives.html
> >  describes `__OPENCL_VERSION__` as "an integer reflecting the version 
> > number of the OpenCL supported by the OpenCL device." as contrasted with 
> > `__OPENCL_C_VERSION__` which is described as "an integer reflecting the 
> > OpenCL C version specified by the -cl-std build option to clBuildProgram or 
> > clCompileProgram. If the -cl-std build option is not specified, the OpenCL 
> > C version supported by the compiler for this OpenCL device will be used."
> >
> > But I don't see where the correct use of these is defined? How should the 
> > user decide which to use? It does seem like `__OPENCL_VERSION__` and 
> > `__OPENCL_C_VERSION__` can differ, e.g. if you compile for a device 
> > supporting 2.0 with `-cl-std=1.2`, but why would `__OPENCL_VERSION__` ever 
> > be referenced then?
>
>
> I don't know why you would ever use this; but the standard says it should be 
> defined, so I guess we have to define it

Right, I just want to understand the actual semantics and it doesn't seem 
clear. I agree that I don't see a use for `__OPENCL_VERSION__` but we define it 
so I'm sure people are using it, so likely we need to be careful that this 
patch preserves the existing behavior of the runtime.

In D77923#1976497 , @jvesely wrote:

> OPENCL_VERSION is something that should be really set by the opencl driver 
> rather than the compiler.
>  coarse-grained SVM can be done without FEATURE_FLAT_ADDRESS_SPACE, so those 
> gpus can get over 1.2, while the compiler can be used in an implementation 
> that doesn't go above 1.2 even for gfx700+

I don't know why this would imply we can't just have a static 
`__OPENCL_VERSION__` for each device in Clang? I don't see anything in the 
standard that says you are not allowed to have (going with your example) 
`__OPENCL_VERSION__=200` and `__OPENCL_C_VERSION=120` at the same time.

It seems like we should be able to just have `__OPENCL_VERSION__` defined in 
Clang, and let the runtime control setting `__OPENCL_C_VERSION__` via 
`-cl-std=`. If the current patch sets `__OPENCL_VERSION__` differently than the 
runtime does then it seems like we should make it match, but I'm not sure how 
best to confirm that.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77923/new/

https://reviews.llvm.org/D77923

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-13 Thread Scott Linder via Phabricator via cfe-commits

scott.linder added a comment.

https://www.khronos.org/registry/OpenCL/sdk/2.0/docs/man/xhtml/preprocessorDirectives.html
 describes `__OPENCL_VERSION__` as "an integer reflecting the version number of 
the OpenCL supported by the OpenCL device." as contrasted with 
`__OPENCL_C_VERSION__` which is described as "an integer reflecting the OpenCL 
C version specified by the -cl-std build option to clBuildProgram or 
clCompileProgram. If the -cl-std build option is not specified, the OpenCL C 
version supported by the compiler for this OpenCL device will be used."

But I don't see where the correct use of these is defined? How should the user 
decide which to use? It does seem like `__OPENCL_VERSION__` and 
`__OPENCL_C_VERSION__` can differ, e.g. if you compile for a device supporting 
2.0 with `-cl-std=1.2`, but why would `__OPENCL_VERSION__` ever be referenced 
then?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77923/new/

https://reviews.llvm.org/D77923



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

1 2 3 >

1 - 100 of 222 matches

Mail list logo