[PATCH] D24933: Enable configuration files in clang

2017-02-02 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: include/clang/Driver/Driver.h:219
+  /// This number may be smaller that \c NumConfigOptions if some options
+  /// requires separate arguments.
+  unsigned NumConfigArgs;

requires -> require



Comment at: lib/Driver/ToolChain.cpp:183
   std::string Target;
-  if (llvm::TargetRegistry::lookupTarget(Prefix, IgnoredError)) {
+  if (!VerifyTarget || llvm::TargetRegistry::lookupTarget(Prefix, 
IgnoredError))
 Target = Prefix;

I don't think that we can do it this way; it is a behavior change (we now might 
try to set the target to some string which did not validate as a known target, 
whereas we did not previously).

How about you always return the prefix, but also return a boolean indicating 
whether or not the prefix is a valid target? Then, after processing the config 
file, you can clear out the string if it was not a valid target.



Comment at: tools/driver/driver.cpp:363
+   TargetAndMode.first + ".cfg");
+TargetAndMode.first.clear();
+  }

I don't think that you can clear the string here. We might need it later to 
call insertTargetAndModeArgs.



Comment at: tools/driver/driver.cpp:449
+  // there are options read from config file, put the options from "CL"
+  // after them, - config file is considered as a "patch" to compiler
+  // defaults.

Replace
  after them, - config file
with
  after them because the config file



https://reviews.llvm.org/D24933



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D24933: Enable configuration files in clang

2017-01-30 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: lib/Driver/Driver.cpp:172
+NumConfigArgs = static_cast(NumCfgArgs);
+  }
+

sepavloff wrote:
> bruno wrote:
> > If `NumCfgArgs == 0` it would be nice to warn that the config file is empty?
> Not sure if it makes sense. Usually warning are used to attract attention to 
> the things that potentially may be harmful. An empty config file is strange 
> but does not look dangerous. 
I agree. The config files might be automatically generated by wrapper scripts, 
and adding special cases to avoid creating empty files seems like an 
unnecessary complication.



Comment at: lib/Driver/Driver.cpp:2695
 
+  // Clain all arguments that come from a configuration file so that the driver
+  // does not warn on any that are unused.

Clain -> Claim



Comment at: tools/driver/driver.cpp:332
+  StringRef PName = llvm::sys::path::stem(ProgramName);
+  size_t Pos = PName.find("-clang");
+  if (Pos == StringRef::npos)

Looking for "-clang" is too specific; our driver-name suffix processing allows 
more general naming than this. For one thing, it will not pick up armv7l-cpp 
correctly (which would be a problem because the config files likely contain 
options affecting preprocessing). I also have users which use symlinks to clang 
named fooclang.

I think that an easy way to do this is to use the result from 
ToolChain::getTargetAndModeFromProgramName, which we call from the caller of 
this function:

  std::string ProgName = argv[0];
  std::pair TargetAndMode =
  ToolChain::getTargetAndModeFromProgramName(ProgName);

We can use TargetAndMode.first here, instead of looking for the string before 
"-clang", and that should be consistent with the rest of our processing.



https://reviews.llvm.org/D24933



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D29778: Declare lgamma library builtins as never being const

2017-02-15 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM


https://reviews.llvm.org/D29778



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D28404: IRGen: Add optnone attribute on function during O0

2017-02-10 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D28404#673260, @mehdi_amini wrote:

> Ping :)


To clarify my understanding of this thread, it seems like there are three ways 
forward here:

1. To have -O0 add optnone to the generated functions (enabling some degree of 
lack of optimization of those functions even when used with -flto)
2. To have -O0 -flto essentially turn off LTO (so that we get unoptimized 
objects directly for things we're debugging)
3. Add a separate flag to make optnone the default

(1) is this patch. The disadvantage of (2) is that it also precludes CFI (and 
other whole-program transformations). This seems highly unfortunate at best and 
a non-starter in the general case. The disadvantage of (3) is that it might 
seems confusing to users (i.e. how to explain the difference between -O0 and 
-foptimize-off?) and is an unnecessary exposure to users of implementation 
details. On this point I agree.

It is true that -O0 != optnone, in a technical sense, but in the end, both are 
best effort. Moreover, there is a tradeoff between disabling optimization of 
the functions you don't want to optimize and keeping the remainder of the code 
as similar as possible to how it would be if everything were being optimized. 
What optnone provides seems like a reasonable point in that tradeoff space. I 
think that we should move forward with this approach.


https://reviews.llvm.org/D28404



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D29750: [PPC] Enable -fomit-frame-pointer by default for PPC

2017-02-09 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

We should have regression tests for this. Maybe update 
test/Driver/frame-pointer-elim.c?


https://reviews.llvm.org/D29750



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D28543: Elliminates uninitialized warning for volitile varibles.

2017-02-09 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

What's the motivation for this? The placement of a local volatile variable is 
still under the compiler's direction, and unless the address escapes, we still 
assume we can reason about its aliasing (and, thus, whether or not it is 
initialized).


https://reviews.llvm.org/D28543



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D30009: Add support for '#pragma clang attribute'

2017-02-16 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D30009#678890, @arphaman wrote:

> In https://reviews.llvm.org/D30009#678091, @hfinkel wrote:
>
> > I don't understand why it only supports some attributes. Is there some 
> > handling that needs to take place (I don't see anything obvious in this 
> > patch)? If most attributes will "just work", I'd much rather opt-out the 
> > few exceptions (which we can then explicitly document), if any, than using 
> > this opt-in solution.
>
>
> I think it would be reasonable to use an opt-out approach. I think the 
> following set of initial rules should determine when an attribute should not 
> be supported by the pragma:
>
> - If an attribute is late parsed
> - Or if it has no GNU/CXX11 spelling
> - Or if it has no subject list, or its subject list is empty (excluding 
> annotate)
> - Or if it derives from StmtAttr or TypeAttr


SGTM.


Repository:
  rL LLVM

https://reviews.llvm.org/D30009



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D30009: Add support for '#pragma clang attribute'

2017-02-15 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

I don't understand why it only supports some attributes. Is there some handling 
that needs to take place (I don't see anything obvious in this patch)? If most 
attributes will "just work", I'd much rather opt-out the few exceptions (which 
we can then explicitly document), if any, than using this opt-in solution.


Repository:
  rL LLVM

https://reviews.llvm.org/D30009



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D28845: Prototype of modules codegen

2017-01-18 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

Can you provide a high-level description of what you're trying to accomplish 
and the usage model?


https://reviews.llvm.org/D28845



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D30087: [Driver] Unify linking of OpenMP runtime

2017-03-01 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: lib/Driver/Tools.cpp:10334
 
-  if (Args.hasFlag(options::OPT_fopenmp, options::OPT_fopenmp_EQ,
-   options::OPT_fno_openmp, false)) {
+  // FIXME: Exclude this for platforms with libgomp that don't require
+  // librt. Most modern Linux platforms require it, but some may not.

Now that you've moved the comment, it is not clear what "this" means here. 
Please reword this to say that you should only pass true for GompNeedsRT on 
platforms that really need it.


https://reviews.llvm.org/D30087



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D23765: Fix for clang PR 29087

2016-11-27 Thread Hal Finkel via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL287999: Adjust type-trait evaluation to properly handle 
Using(Shadow)Decls (authored by hfinkel).

Changed prior to commit:
  https://reviews.llvm.org/D23765?vs=76817=79354#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D23765

Files:
  cfe/trunk/lib/Sema/SemaExprCXX.cpp
  cfe/trunk/test/SemaCXX/cxx11-crashes.cpp


Index: cfe/trunk/test/SemaCXX/cxx11-crashes.cpp
===
--- cfe/trunk/test/SemaCXX/cxx11-crashes.cpp
+++ cfe/trunk/test/SemaCXX/cxx11-crashes.cpp
@@ -91,3 +91,15 @@
   Foo(lambda);
 }
 }
+
+namespace pr29091 {
+  struct X{ X(const X ); };
+  struct Y: X { using X::X; };
+  bool foo() { return __has_nothrow_constructor(Y); }
+  bool bar() { return __has_nothrow_copy(Y); }
+
+  struct A { template  A(); };
+  struct B : A { using A::A; };
+  bool baz() { return __has_nothrow_constructor(B); }
+  bool qux() { return __has_nothrow_copy(B); }
+}
Index: cfe/trunk/lib/Sema/SemaExprCXX.cpp
===
--- cfe/trunk/lib/Sema/SemaExprCXX.cpp
+++ cfe/trunk/lib/Sema/SemaExprCXX.cpp
@@ -4416,9 +4416,12 @@
 // A template constructor is never a copy constructor.
 // FIXME: However, it may actually be selected at the actual overload
 // resolution point.
-if (isa(ND))
+if (isa(ND->getUnderlyingDecl()))
   continue;
-const CXXConstructorDecl *Constructor = cast(ND);
+// UsingDecl itself is not a constructor
+if (isa(ND))
+  continue;
+auto *Constructor = cast(ND->getUnderlyingDecl());
 if (Constructor->isCopyConstructor(FoundTQs)) {
   FoundConstructor = true;
   const FunctionProtoType *CPT
@@ -4452,9 +4455,12 @@
   bool FoundConstructor = false;
   for (const auto *ND : Self.LookupConstructors(RD)) {
 // FIXME: In C++0x, a constructor template can be a default 
constructor.
-if (isa(ND))
+if (isa(ND->getUnderlyingDecl()))
+  continue;
+// UsingDecl itself is not a constructor
+if (isa(ND))
   continue;
-const CXXConstructorDecl *Constructor = cast(ND);
+auto *Constructor = cast(ND->getUnderlyingDecl());
 if (Constructor->isDefaultConstructor()) {
   FoundConstructor = true;
   const FunctionProtoType *CPT


Index: cfe/trunk/test/SemaCXX/cxx11-crashes.cpp
===
--- cfe/trunk/test/SemaCXX/cxx11-crashes.cpp
+++ cfe/trunk/test/SemaCXX/cxx11-crashes.cpp
@@ -91,3 +91,15 @@
   Foo(lambda);
 }
 }
+
+namespace pr29091 {
+  struct X{ X(const X ); };
+  struct Y: X { using X::X; };
+  bool foo() { return __has_nothrow_constructor(Y); }
+  bool bar() { return __has_nothrow_copy(Y); }
+
+  struct A { template  A(); };
+  struct B : A { using A::A; };
+  bool baz() { return __has_nothrow_constructor(B); }
+  bool qux() { return __has_nothrow_copy(B); }
+}
Index: cfe/trunk/lib/Sema/SemaExprCXX.cpp
===
--- cfe/trunk/lib/Sema/SemaExprCXX.cpp
+++ cfe/trunk/lib/Sema/SemaExprCXX.cpp
@@ -4416,9 +4416,12 @@
 // A template constructor is never a copy constructor.
 // FIXME: However, it may actually be selected at the actual overload
 // resolution point.
-if (isa(ND))
+if (isa(ND->getUnderlyingDecl()))
   continue;
-const CXXConstructorDecl *Constructor = cast(ND);
+// UsingDecl itself is not a constructor
+if (isa(ND))
+  continue;
+auto *Constructor = cast(ND->getUnderlyingDecl());
 if (Constructor->isCopyConstructor(FoundTQs)) {
   FoundConstructor = true;
   const FunctionProtoType *CPT
@@ -4452,9 +4455,12 @@
   bool FoundConstructor = false;
   for (const auto *ND : Self.LookupConstructors(RD)) {
 // FIXME: In C++0x, a constructor template can be a default constructor.
-if (isa(ND))
+if (isa(ND->getUnderlyingDecl()))
+  continue;
+// UsingDecl itself is not a constructor
+if (isa(ND))
   continue;
-const CXXConstructorDecl *Constructor = cast(ND);
+auto *Constructor = cast(ND->getUnderlyingDecl());
 if (Constructor->isDefaultConstructor()) {
   FoundConstructor = true;
   const FunctionProtoType *CPT
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D27180: Testbed and skeleton of a new expression parser

2016-11-28 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

This seems like a great idea. btw, do you know about Cling 
(https://root.cern.ch/cling)?


Repository:
  rL LLVM

https://reviews.llvm.org/D27180



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D24933: Enable configuration files in clang

2016-12-12 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: lib/Driver/Driver.cpp:2695
 
+  // Claim all arguments that come from configuration file, - driver must not
+  // warn about unused argument on them.

Grammar here is a bit odd, how about:

  // Claim all arguments that come from a configuration file so that the driver 
does not warn on any that are unused.


https://reviews.llvm.org/D24933



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D26564: Use PIC relocation mode by default for PowerPC64 ELF

2016-12-13 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM


https://reviews.llvm.org/D26564



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25435: Add -femit-accurate-debug-info to emit more debug info for sample pgo profile collection

2016-12-11 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D25435#609320, @danielcdh wrote:

> change the flag name to -fprofile-debug


I don't really like this name because it sounds like it might be enabling some 
kind of debugging of the profiling. How about -fdebug-info-for-profiling. 
Another options is to make it a mode for -g such as -gprofiling.


https://reviews.llvm.org/D25435



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D27304: [PATCH] [Sema][X86] Don't allow floating-point return types when SSE is disabled

2016-12-01 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D27304#610944, @thegameg wrote:

> In https://reviews.llvm.org/D27304#610697, @joerg wrote:
>
> > I think this is the absolutely wrong place to put such logic. It really can 
> > not be anywhere but the backend.
>
>
> I am aware of this. But the way the backend informs the Diagnostics looks 
> like a crash, and asks for a bug report.
>
> Are there any proper ways to inform the frontend about backend errors?


Yes. Look at how we report inline asm errors, etc. grep for 
DiagnosticInfoStackSize (used in PEI) or DiagnosticInfoResourceLimit (used by 
the AMDGPU backend).


https://reviews.llvm.org/D27304



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25686: [Driver] Support "hardfloat" vendor triples used by Gentoo

2016-12-01 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D25686#571894, @mgorny wrote:

> In https://reviews.llvm.org/D25686#571857, @hfinkel wrote:
>
> > It seems like we should teach Triple how to parse this triples correctly 
> > (i.e. to recognize that the vendor is ARM), and then canonicalize the 
> > environment to GNUEABIHF (or whatever)?
>
>
> I've attempted that but altering the triple resulted in clang being unable to 
> find gcc files (since the triple didn't match anymore). My previous patch is 
> available here: https://595834.bugs.gentoo.org/attachment.cgi?id=450004. If 
> you think that I've did something stupid-wrong there and there's a better way 
> of doing this via Triple API, I'll gladly update the patch.


I think that you should update the triple class to return meaningful results 
and then also enhance Clang's search logic to find the GCC installation 
directory.


https://reviews.llvm.org/D25686



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D27351: [CUDA] Forward sanitizer support to host toolchain

2016-12-02 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a reviewer: hfinkel.
hfinkel added a comment.

LGTM


https://reviews.llvm.org/D27351



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D27872: [APFloat] Switch from (PPCDoubleDoubleImpl, IEEEdouble) layout to (IEEEdouble, IEEEdouble)

2017-01-07 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

> (IEEEdouble, IEEEdouble) -> (uint64_t, uint64_t) -> PPCDoubleDoubleImpl, then 
> run the old algorithm.

We need to document in the code what is going on here and why it works. Just 
looking at a bunch of code like this:

  if (Semantics == ) {
APFloat Tmp(semPPCDoubleDoubleImpl, bitcastToAPInt());
auto Ret = Tmp.next(nextDown);
*this = DoubleAPFloat(semPPCDoubleDouble, Tmp.bitcastToAPInt());
return Ret;
  }

it is not at all obvious what is going on (i.e. why are we casting to 
integers?). Maybe semPPCDoubleDoubleImpl needs a better name now? It seems 
confusing to have semPPCDoubleDoubleImpl and semPPCDoubleDouble where one is 
not really an "implementation" class of the other.




Comment at: llvm/include/llvm/ADT/APFloat.h:791
   void makeNaN(bool SNaN, bool Neg, const APInt *fill) {
-getIEEE().makeNaN(SNaN, Neg, fill);
+if (usesLayout(getSemantics()))
+  return U.IEEE.makeNaN(SNaN, Neg, fill);

I realize that some of this existed before, but is there any way to refactor 
this so that we don't have so much boiler plate? Even using a utility macro is 
probably better than this.


https://reviews.llvm.org/D27872



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D24688: Introduce "hosted" module flag.

2017-01-08 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D24688#639158, @jyknight wrote:

> Since this requires everyone generating llvm IR to add this module attribute 
> for optimal codegen, it should be documented in release notes and the 
> LangRef, I think.


And, for something like this, here: 
http://llvm.org/docs/Frontend/PerformanceTips.html

> I mean: it's unfortunate that it's needed at all, but at the very least it 
> should be possible for people to find out about it.




https://reviews.llvm.org/D24688



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D23934: Add a -ffixed-date-time= flag that sets the initial value of __DATE__, __TIME__, __TIMESTAMP__

2016-12-29 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D23934#632216, @emaste wrote:

> Please also accept `SOURCE_DATE_EPOCH` set in the environment -- see 
> https://reproducible-builds.org/specs/source-date-epoch/


It looks like there's reasonable adoption of this: 
https://wiki.debian.org/ReproducibleBuilds/TimestampsProposal#Reading_the_variable
 and gcc >= 7 will support it.

> Also although I'm generally leery of options auto-detecting the argument 
> format, I think we should be able to pass an epoch timestamp to 
> -ffixed-date-time.

I'd also find this convenient. The auto-detection should be unambiguous (if it 
is an integer, then it is an epoch).


https://reviews.llvm.org/D23934



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D28213: [Frontend] Correct values of ATOMIC_*_LOCK_FREE to match builtin

2017-01-08 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM. This seems consistent with what GCC does. On x86 it also sets 
__GCC_ATOMIC_LLONG_LOCK_FREE to 2.


https://reviews.llvm.org/D28213



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D23934: Add a -ffixed-date-time= flag that sets the initial value of __DATE__, __TIME__, __TIMESTAMP__

2016-12-28 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D23934#631656, @ed wrote:

> I'd be interested in seeing a feature like this appearing.


I agree.




Comment at: lib/Driver/Tools.cpp:4687
+CmdArgs.push_back(Args.MakeArgString("-ffixed-date-time=" + DateTime));
+  }
+

This seems like a fairly random place for this code. It is far removed from any 
other handling of preprocessor-related options.



Comment at: lib/Driver/Tools.cpp:5996
   }
 
   Args.AddLastArg(CmdArgs, options::OPT_dM);

Somewhere around here would make more sense.


https://reviews.llvm.org/D23934



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31276: Add #pragma clang fast_math

2017-03-23 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

High-level comment ;)

  #pragma clang fast_math contract_fast(on)

This seems a bit unfortunate because 'fast' appears twice? How are we planning 
on naming the other fast-math flags? Maybe we should just name it:

  #pragma clang math constract_fast(on)

or

  #pragma clang math contract(fast) // we could also accept off/on here for 
consistency and compatibility with the standard pragma

or maybe fp_math or floating_point_math or floating_point or fp instead of math.

I think that I prefer this last form (because it does not repeat 'fast' and 
also makes our extension a pure superset of the standard pragma).

What do you want to name the other flags? I'd prefer if they're grammatically 
consistent. Maybe we should stick closely to the command-line options, and have:

  fp_contract(on/off/fast)
  unsafe_optimizations(on/off)
  finite_only(on/off)

What do you think?


https://reviews.llvm.org/D31276



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31276: Add #pragma clang fast_math

2017-03-23 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D31276#708993, @anemet wrote:

> In https://reviews.llvm.org/D31276#708976, @aaron.ballman wrote:
>
> > In https://reviews.llvm.org/D31276#708968, @anemet wrote:
> >
> > > Thanks very much, Aaron!  It would be great if you could also look at the 
> > > three patches that add support for the generation of the new FMF 
> > > 'contract' for -ffp-contract=fast.  I've added them in the dependencies 
> > > of this revision.  (https://reviews.llvm.org/D31168 is not really a 
> > > prerequisite but I am planning to commit the patches in this order.)
> >
> >
> > I'll take a look, but feel less qualified to give a LGTM to those.
>
>
> NP, thanks for your help on this one.  Hopefully @hfinkel or @mehdi_amini can 
> take a look.


They're definitely on my list. I might not get to them until the weekend or 
next week, however.


https://reviews.llvm.org/D31276



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D30920: Do not pass -Os and -Oz to the Gold plugin

2017-03-24 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D30920#703305, @mehdi_amini wrote:

> The fundamental difference, is that Os/Oz especially are treated as 
> `optimizations directive` that are independent of the pass pipeline: 
> instructing that "loop unroll should not increase size" is independent of 
> *where* is loop unroll inserted in the pipeline.




> The issue with https://reviews.llvm.org/owners/package/1/ vs 
> https://reviews.llvm.org/owners/package/2/ is that the *ordering* of the 
> passes changes, not only the threshold to apply.

Maybe we should stop here and ask: Is this really a *fundamental* difference? 
Or is this just a difference in how to handle Os/Oz today? Is there a 
fundamental reason why we might not want a different order for Oz vs. 
https://reviews.llvm.org/owners/package/2/ if we want one for 02 vs 03?

My view is that, no, there is no fundamental difference. Why? Because each 
optimization level has a meaning, and that meaning can almost always be applied 
per function.

- `Oz` - Make the binary as small as possible
- `Os` - Make the binary small while making only minor performance tradeoffs
- `O0` - Be fast, and maybe, maximize the ability to debug
- `O1` - Make the code fast while making only minor debugability tradeoffs
- `O2` - Make the code fast - perform only transformations that will speed up 
the code with near certainty
- `O3` - Make the code fast - perform transformations that will speed up the 
code with high probability

Believing that we can implement this model primarily through pass scheduling 
has proved false in the past (except for -O0 without LTO) and won't be any more 
true in the future. We essentially have one optimization pipeline, and I see no 
reason to assume this will change. It seems significantly more effective to 
have the passes become optimization-level aware than to implement optimization 
though changes in the pipeline structure itself. Especially in the context of 
the new pass manager, where the cost of scheduling a pass that will exit early 
should be small, I see no reason to alter the pipeline at all. For one thing, 
many optimizations are capable of performing several (or many) different 
transformations, and these often don't all fall into the same categories. In 
that sense, CGSCC- and function-scope passes are not really different than 
function/loop-scope passes.

As such, I think that we should tag functions/modules with optimization levels 
so that users can control the optimization level effectively even in the case 
of LTO. We could even add pragmas allowing users to control this at finer 
granularity, and that would be a positive thing (I dislike that we currently 
force users to put functions in different files to get different optimization 
levels - my users find this annoying too). We need to give users a simple model 
to follow: optimization is associated with compiling (even if we do, in fact, 
delay it until link time).

> Also this wouldn't work with an SCC pass, as different functions in the SCC 
> can have different level, it gets quite tricky. It also becomes problematic 
> for any module level pass: `Dead Global Elimination` would need to leave out 
> global variable that comes from a module compiled with 
> https://reviews.llvm.org/owners/package/1/, same with `Merge Duplicate Global 
> Constants` (I think the issue with `GlobalVariable` affects also Os/Oz by the 
> way).

Yes, we should do this. I don't understand why this is tricky. Actually, I 
think having these kinds of decisions explicit in the code of the 
transformations would be a positive development.


https://reviews.llvm.org/D30920



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D30920: Do not pass -Os and -Oz to the Gold plugin

2017-03-24 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D30920#710329, @mehdi_amini wrote:

> In https://reviews.llvm.org/D30920#710209, @hfinkel wrote:
>
> > Yes, we should do this. I don't understand why this is tricky. Actually, I 
> > think having these kinds of decisions explicit in the code of the 
> > transformations would be a positive development.
>
>
> I think a high reason why I consider this tricky, it the part where I 
> mentioned that a single pass can have different behavior/level depending on 
> where it is in the pipeline.
>  But recently @joerg split SimplifyCFG into two wrapper passes, which may be 
> a good way of achieving what you're describing.


I agree.


https://reviews.llvm.org/D30920



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31276: Add #pragma clang fast_math

2017-03-25 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D31276#709111, @anemet wrote:

> In https://reviews.llvm.org/D31276#708992, @hfinkel wrote:
>
> > High-level comment ;)
> >
> >   #pragma clang fast_math contract_fast(on)
> >   
> >
> > This seems a bit unfortunate because 'fast' appears twice? How are we 
> > planning on naming the other fast-math flags? Maybe we should just name it:
>
>
> This is just pure laziness on my part, I was hoping that all these flags can 
> be implemented with on/off but if you find it confusing that's a good 
> indicator.
>
> > 
> > 
> >   #pragma clang math constract_fast(on)
> >
> > 
> > or
> > 
> >   #pragma clang math contract(fast) // we could also accept off/on here for 
> > consistency and compatibility with the standard pragma
> >
> > 
> > or maybe fp_math or floating_point_math or floating_point or fp instead of 
> > math.
> > 
> > I think that I prefer this last form (because it does not repeat 'fast' and 
> > also makes our extension a pure superset of the standard pragma).
> > 
> > What do you want to name the other flags? I'd prefer if they're 
> > grammatically consistent. Maybe we should stick closely to the command-line 
> > options, and have:
> > 
> >   fp_contract(on/off/fast)
> >   unsafe_optimizations(on/off)
> >   finite_only(on/off)
> >
> > 
> > What do you think?
>
> I really like #pragma clang fp or fp_math because contraction feels different 
> from the other fast-math flags.  That said then we don't want to repeat fp in 
> fp_contract.
>
> We should probably have the full list to make sure it works though with all 
> the FMFs.  Here is a straw-man proposal:
>
>   UnsafeAlgebra  #pragma clang fp unsafe_optimizations(on/off)
>   NoNaNs #pragma clang fp no_nans(on/off)
>   NoInfs#pragma clang fp finite_only(on/off)
>   NoSignedZeros #pragma clang fp no_signed_zeros(on/off)
>   AllowReciprocal#pragma clang fp reciprocal_math
>   AllowContract   #pragma clang fp contract(on/off/fast)
>
>
> The negative ones feel a bit strange...   What do you think?


I agree. The negative ones feel a bit strange. Why should we have no_nans(on) 
instead of nans(off)? However, I feel that the negative sense is less ambiguous 
- they better match how I think about it and makes a strict environment one 
where all of these are 'off', and I like that consistency. In short, I like 
this proposal.


https://reviews.llvm.org/D31276



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D30806: [nonnull] Teach Clang to attach the nonnull LLVM attribute to declarations and calls instead of just definitions, and then teach it to *not* attach such attributes even if the source c

2017-03-18 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM


https://reviews.llvm.org/D30806



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D30923: Warn on enum assignment to bitfields that can't fit all values

2017-03-14 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D30923#700780, @rnk wrote:

> In https://reviews.llvm.org/D30923#700708, @hfinkel wrote:
>
> > In https://reviews.llvm.org/D30923#700696, @rnk wrote:
> >
> > > Do you think it's worth indicating that the error can be suppressed with 
> > > an explicit cast, or is that wasted space?
> >
> >
> > What might this look like? Also, I don't see a regression test for this.
>
>
> The warning only looks through implicit casts and paren exprs, so it could 
> look like this:  `f.two_bits = (unsigned)three_bits;` I added a test for it.


Great, thanks!




Comment at: lib/Sema/SemaChecking.cpp:8765
+TypeSourceInfo *TSI = Bitfield->getTypeSourceInfo();
+SourceRange TypeRange = TSI ? TSI->getTypeLoc().getSourceRange() : 
SourceRange();
+S.Diag(Bitfield->getTypeSpecStartLoc(), 
diag::note_change_bitfield_sign)

Line is too long.


https://reviews.llvm.org/D30923



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D30898: Add new -fverbose-asm that enables source interleaving

2017-03-14 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

> Note that in contrast to the original suggestion -fsource-asm here we use the 
> preferred -fverbose-asm. Basically explicitly saying -fverbose-asm in the 
> command line enables a minimum amount of debugging, so in AsmPrinter we can 
> use it to print the source code.

-fverbose-asm seems like a good name to me, but that's taken already. This 
feature should imply -fverbose-asm (that's where the value-add is over having 
objdump do this). But we don't always want this when we enable -fverbose-asm. 
How about -fsource-interleaved-asm for this feature?

> This patch introduces a -masm-source flag for cc1 that maps to the AsmSource 
> value in the llvm code generation.

I see no reason to use a different flag name for cc1. Just use the same flag 
(tag it with `Flags<[CC1Option]` in the .td file).




Comment at: lib/Driver/ToolChains/Clang.cpp:2754
+  // assembler output with debug directives.
+  if (Args.hasArg(options::OPT_fverbose_asm)) {
+CmdArgs.push_back("-masm-source");

I think that we should factor this out a bit. This feature is not the only one 
with this problem. The optimization reporting features also have this property 
(they need to enable debug info for some reason other than an actual desire to 
embed debug info in the resulting binaries). I think that we should add some 
separate feature, which this can toggle, but that optimization reporting can 
also use, to avoid actually generating debug info when only needed by these 
features.



Comment at: lib/Frontend/CompilerInvocation.cpp:855
 NeedLocTracking = true;
 
   if (Arg *A = Args.getLastArg(OPT_Rpass_EQ)) {

This is where you'd enable debug line-table info when needed by this feature.


https://reviews.llvm.org/D30898



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D30920: Do not pass -Os and -Oz to the Gold plugin

2017-03-16 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D30920#700741, @mehdi_amini wrote:

> In https://reviews.llvm.org/D30920#700574, @hfinkel wrote:
>
> > In https://reviews.llvm.org/D30920#700557, @mehdi_amini wrote:
> >
> > > In https://reviews.llvm.org/D30920#700433, @tejohnson wrote:
> > >
> > > > In https://reviews.llvm.org/D30920#700133, @pcc wrote:
> > > >
> > > > > In https://reviews.llvm.org/D30920#700077, @tejohnson wrote:
> > > > >
> > > > > > Until everything is converted to using size attributes, it seems 
> > > > > > like a correct fix for the bug is to accept these options in the 
> > > > > > gold-plugin and pass through the LTO API to the PassManagerBuilder.
> > > > >
> > > > >
> > > > > Not necessarily. There is no requirement (from a correctness 
> > > > > perspective) that `-Os` at link time should exactly match the 
> > > > > behaviour of `-Os` at compile time.
> > > >
> > > >
> > > > Sure, but there is certainly a perception that optimization flags 
> > > > affecting the non-LTO pipeline should similarly affect the LTO 
> > > > pipeline. LTO should be transparent to the user, so if -Os behaves one 
> > > > way without LTO, it seems problematic to me if it behaves a different 
> > > > way with LTO.
> > > >
> > > > That being said, agree that the best way to enforce that is to pass the 
> > > > relevant flags through the IR. (On the flip side, if the user passes 
> > > > -O1 to the link step, it does get passed through to the plugin and 
> > > > affects the LTO optimization pipeline...)
> > >
> > >
> > > I agree that I don't like the discrepancy: the driver should *not* drop 
> > > -Os silently if it passes down -O1/-O2/-O3, a warning is the minimum.
> >
> >
> > I don't like the discrepancy either, and I agree that we should be passing 
> > these other flags through the IR as well (even though, in the face of 
> > inlining, there is some ambiguity as to what the flags would mean). That 
> > having been said, I don't see the value in the warning. Forcing users to 
> > endure a warning solely because they use LTO and use -Os or -Oz for all of 
> > their compilation steps, is not friendly.
>
>
> The warning here is only about the *link* step.
>
> > The information has been captured already so there's nothing to warn about. 
> > You might worry about the opposite situation (the user uses only -Os or -Oz 
> > on the link step, but not for the compile steps), and that will have no 
> > effect. That, however, should be the expected behavior (optimization is 
> > associated with compiling, not linking, except perhaps for specifically 
> > called-out exceptions). The fact that our other optimization level don't 
> > work that way is a bug, not a feature, that we should fix instead of 
> > further exposing to our users.
>
> Yes, the issue is only about how the driver accepts Os for the link even 
> though it has no effect 
> (O0/https://reviews.llvm.org/owners/package/1//https://reviews.llvm.org/owners/package/2//https://reviews.llvm.org/owners/package/3/
>  *will* have an effect though).


Do we agree that the desired endpoint here is that all optimization flags are 
encoded in the IR somehow? If so, in this desired end state, will it will be 
true that -O[n] will have some affect on an LTO link step?


https://reviews.llvm.org/D30920



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D30806: [nonnull] Teach Clang to attach the nonnull LLVM attribute to declarations and calls instead of just definitions, and then teach it to *not* attach such attributes even if the source c

2017-03-16 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: include/clang/AST/ASTContext.h:1868
+  bool *OverrideNonnullReturn = nullptr,
+  unsigned *OverrideNonnullArgs = nullptr,
   unsigned *IntegerConstantArgs = nullptr) const;

chandlerc wrote:
> hfinkel wrote:
> > It seems like you had to touch more code than necessary because you decided 
> > to add these parameters before IntegerConstantArgs. Why don't you add them 
> > afterward instead?
> It seemed more natural But I'm happy to change the order. I'm not worreid 
> about how much code I have to touch, I'd rather just get the options in the 
> right place. What order would you suggest?
I suggest the other order because it is more common to want the 
IntegerConstantArgs than the nonnull args (which essentially only one caller 
wants).


https://reviews.llvm.org/D30806



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D30920: Do not pass -Os and -Oz to the Gold plugin

2017-03-16 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D30920#703083, @mehdi_amini wrote:

> In https://reviews.llvm.org/D30920#703082, @hfinkel wrote:
>
> > In https://reviews.llvm.org/D30920#700741, @mehdi_amini wrote:
> >
> > > Yes, the issue is only about how the driver accepts Os for the link even 
> > > though it has no effect 
> > > (O0/https://reviews.llvm.org/owners/package/1//https://reviews.llvm.org/owners/package/2//https://reviews.llvm.org/owners/package/3/
> > >  *will* have an effect though).
> >
> >
> > Do we agree that the desired endpoint here is that all optimization flags 
> > are encoded in the IR somehow? If so, in this desired end state, will it 
> > will be true that -O[n] will have some affect on an LTO link step?
>
>
> I don't think we have any plan for the optimization *level* (1, 2, and 3), at 
> least not that I know of. I don't even see how it would be possible to 
> achieve it in a principled way without limiting what we're allowing ourself 
> to express at the PassManager level with these levels.


Please elaborate.

> We can handle O0 (optnone), Os (optsize), and Oz (minsize) separately though, 
> because we can make these independent of the passmanager setup.

I'm trying to figure out if these are really fundamentally different from 1,2,3 
or if we just happen to handle them differently right now.


https://reviews.llvm.org/D30920



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D30415: Fix -mno-altivec cannot overwrite -maltivec option

2017-03-16 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D30415#703398, @echristo wrote:

> Different suggestion:
>
> Remove the faltivec option. Even gcc doesn't support it anymore afaict.


What are you suggesting? Always having the language extensions on? Or 
explicitly tying the language extensions to the underlying target feature?

> (Go ahead and commit the zvector part if you'd like).
> 
> -eric




https://reviews.llvm.org/D30415



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D30806: [nonnull] Teach Clang to attach the nonnull LLVM attribute to declarations and calls instead of just definitions, and then teach it to *not* attach such attributes even if the source c

2017-03-16 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: include/clang/AST/ASTContext.h:1865
   /// arguments to the builtin that are required to be integer constant
   /// expressions.
   QualType GetBuiltinType(unsigned ID, GetBuiltinTypeError ,

Please add some description of the new parameters.



Comment at: include/clang/AST/ASTContext.h:1868
+  bool *OverrideNonnullReturn = nullptr,
+  unsigned *OverrideNonnullArgs = nullptr,
   unsigned *IntegerConstantArgs = nullptr) const;

It seems like you had to touch more code than necessary because you decided to 
add these parameters before IntegerConstantArgs. Why don't you add them 
afterward instead?


https://reviews.llvm.org/D30806



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31561: cmath: Skip Libc for integral types in isinf, etc.

2017-04-10 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: libcxx/include/math.h:354
 inline _LIBCPP_INLINE_VISIBILITY
-typename std::enable_if::value, bool>::type
+typename std::enable_if::value, bool>::type
 signbit(_A1 __lcpp_x) _NOEXCEPT

I think that it would be safer / easier to understand if we used 
!std::is_integral here. It is true for all standard types that 
std::is_arithmetic implies is_integral or is_floating_point, but it seems 
likely that some implementations have arithemtic types that are neither (e.g. 
fixed-point types), and user-defined types seem likely.



Comment at: libcxx/include/math.h:400
+inline _LIBCPP_INLINE_VISIBILITY
+typename std::enable_if::value, int>::type
+fpclassify(_A1 __lcpp_x) _NOEXCEPT

Maybe we should predicate this, and other infinity-related functions, on 
std::numeric_limits<_A1>::has_infinity. That seems more general, and safer for 
potential user-defined types.


https://reviews.llvm.org/D31561



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31167: Use FPContractModeKind universally

2017-04-10 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: cfe/trunk/lib/Frontend/CompilerInvocation.cpp:1638
 Opts.LaxVectorConversions = 0;
-Opts.DefaultFPContract = 1;
+Opts.setDefaultFPContractMode(LangOptions::FPC_On);
 Opts.NativeHalfType = 1;

yaxunl wrote:
> hfinkel wrote:
> > Looks like the intent is certainly to have kept the OpenCL default the same 
> > here.
> Well we also use clang to compile LLVM IR to ISA. Before this change, 
> fp-contract was on so that backend did fp contract. However, now clang does 
> not turn on fp-contract when compiling LLVM IRs.
That's the equivalent of 'Fast'. Maybe this should also be set to Fast then.


Repository:
  rL LLVM

https://reviews.llvm.org/D31167



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31167: Use FPContractModeKind universally

2017-04-10 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: cfe/trunk/lib/Frontend/CompilerInvocation.cpp:1638
 Opts.LaxVectorConversions = 0;
-Opts.DefaultFPContract = 1;
+Opts.setDefaultFPContractMode(LangOptions::FPC_On);
 Opts.NativeHalfType = 1;

Looks like the intent is certainly to have kept the OpenCL default the same 
here.


Repository:
  rL LLVM

https://reviews.llvm.org/D31167



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31561: cmath: Skip Libc for integral types in isinf, etc.

2017-04-10 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a subscriber: mclow.lists.
hfinkel added a comment.

I'm happy with this now. @EricWF , @mclow.lists , thoughts?


https://reviews.llvm.org/D31561



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31167: Use FPContractModeKind universally

2017-04-11 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: cfe/trunk/include/clang/Basic/LangOptions.def:220
+/// \brief FP_CONTRACT mode (on/off/fast).
+ENUM_LANGOPT(DefaultFPContractMode, FPContractModeKind, 2, FPC_Off, "FP 
contraction type")
 LANGOPT(NoBitFieldTypeAlign , 1, 0, "bit-field type alignment")

yaxunl wrote:
> anemet wrote:
> > yaxunl wrote:
> > > anemet wrote:
> > > > yaxunl wrote:
> > > > > This change seemed to cause some performance degradations in some 
> > > > > OpenCL applications.
> > > > > 
> > > > > This option used to be on by default. Now it is off by default.
> > > > > 
> > > > > There are applications do separated compile/link/codegen stages. In 
> > > > > the codegen stage, clang is invoked with .bc as input, therefore this 
> > > > > option is off by default, whereas it was on by default before this 
> > > > > change.
> > > > > 
> > > > > Is there any reason not to keep the original behavior?
> > > > > 
> > > > > Thanks.
> > > > > This change seemed to cause some performance degradations in some 
> > > > > OpenCL applications.
> > > > > 
> > > > > This option used to be on by default. Now it is off by default.
> > > > 
> > > > It's always been off.  It was set to fast for CUDA which should still 
> > > > be the case.  See the changes further down on the patch.
> > > > 
> > > > > 
> > > > > There are applications do separated compile/link/codegen stages. In 
> > > > > the codegen stage, clang is invoked with .bc as input, therefore this 
> > > > > option is off by default, whereas it was on by default before this 
> > > > > change.
> > > > > 
> > > > > Is there any reason not to keep the original behavior?
> > > > 
> > > > Sorry but I am not sure what changed, can you elaborate on what you're 
> > > > doing and how things used to work for you?
> > > Before your change, -ffp-contract was a codegen option defined as
> > > 
> > > 
> > > ```
> > > ENUM_CODEGENOPT(FPContractMode, FPContractModeKind, 2, FPC_On)
> > > ```
> > > 
> > > therefore the default value as on. After your change, it becomes off by 
> > > default.
> > No. -ffp-contract=on was a FE option and -ffp-contract=fast was a backend 
> > option.  I still don't understand your use-case.  Can you include a small 
> > testcase how this used to work before?
> -ffp-contract=on used to be a codegen option before this change. In 
> CodeGen/BackendUtil.cpp, before this change:
> 
> ```
> switch (CodeGenOpts.getFPContractMode()) {
>   switch (LangOpts.getDefaultFPContractMode()) {
>   case LangOptions::FPC_Off:
> Options.AllowFPOpFusion = llvm::FPOpFusion::Strict;
> break;
>   case LangOptions::FPC_On:
> Options.AllowFPOpFusion = llvm::FPOpFusion::Standard;
> break;
>   case LangOptions::FPC_Fast:
> Options.AllowFPOpFusion = llvm::FPOpFusion::Fast;
> break;
>   }
> ```
> By default, -fp-contract=on, which results in llvm::FPOpFusion::Standard in 
> the backend. This options allows backend to translate llvm.fmuladd to FMA 
> machine instructions.
> 
> After this change, it becomes:
> 
> ```
> switch (LangOpts.getDefaultFPContractMode()) {
>   case LangOptions::FPC_Off:
> Options.AllowFPOpFusion = llvm::FPOpFusion::Strict;
> break;
>   case LangOptions::FPC_On:
> Options.AllowFPOpFusion = llvm::FPOpFusion::Standard;
> break;
>   case LangOptions::FPC_Fast:
> Options.AllowFPOpFusion = llvm::FPOpFusion::Fast;
> break;
>   }
> ```
> now by default -ffp-contract=off, which results in  llvm::FPOpFusion::Strict 
> in the backend. This option disables backend translating llvm.fmuladd to FMA 
> machine instructions in certain situations.
> 
> A simple lit test is as follows:
> 
> 
> ```
> ; RUN: %clang_cc1 -triple amdgcn -S -o - %s| FileCheck %s
> 
> define amdgpu_kernel void @f(double addrspace(1)* nocapture %out, double %a, 
> double %b, double %c) local_unnamed_addr #0 {
> entry:
>   ; CHECK: v_fma_f64
>   %0 = tail call double @llvm.fmuladd.f64(double %b, double %c, double %a)
>   store double %0, double addrspace(1)* %out, align 8, !tbaa !6
>   ret void
> }
> 
> declare double @llvm.fmuladd.f64(double, double, double) #1
> 
> attributes #0 = { nounwind "correctly-rounded-divide-sqrt-fp-math"="false" 
> "disable-tail-calls"="false" "less-precise-fpmad"="false" 
> "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" 
> "no-jump-tables"="false" "no-nans-fp-math"="false" 
> "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" 
> "stack-protector-buffer-size"="8" 
> "target-features"="+fp64-fp16-denormals,-fp32-denormals" 
> "unsafe-fp-math"="false" "use-soft-float"="false" }
> attributes #1 = { nounwind readnone }
> 
> ```
> The test passes before this change and fails after this change.
I'm missing something here. We're calling for OpenCL:

  Opts.setDefaultFPContractMode(LangOptions::FPC_On);

which seems like it should change the result returned by 
getDefaultFPContractMode(). Why does it not?



Repository:
  rL LLVM

https://reviews.llvm.org/D31167




[PATCH] D31167: Use FPContractModeKind universally

2017-04-11 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: cfe/trunk/include/clang/Basic/LangOptions.def:220
+/// \brief FP_CONTRACT mode (on/off/fast).
+ENUM_LANGOPT(DefaultFPContractMode, FPContractModeKind, 2, FPC_Off, "FP 
contraction type")
 LANGOPT(NoBitFieldTypeAlign , 1, 0, "bit-field type alignment")

yaxunl wrote:
> hfinkel wrote:
> > yaxunl wrote:
> > > anemet wrote:
> > > > yaxunl wrote:
> > > > > anemet wrote:
> > > > > > yaxunl wrote:
> > > > > > > This change seemed to cause some performance degradations in some 
> > > > > > > OpenCL applications.
> > > > > > > 
> > > > > > > This option used to be on by default. Now it is off by default.
> > > > > > > 
> > > > > > > There are applications do separated compile/link/codegen stages. 
> > > > > > > In the codegen stage, clang is invoked with .bc as input, 
> > > > > > > therefore this option is off by default, whereas it was on by 
> > > > > > > default before this change.
> > > > > > > 
> > > > > > > Is there any reason not to keep the original behavior?
> > > > > > > 
> > > > > > > Thanks.
> > > > > > > This change seemed to cause some performance degradations in some 
> > > > > > > OpenCL applications.
> > > > > > > 
> > > > > > > This option used to be on by default. Now it is off by default.
> > > > > > 
> > > > > > It's always been off.  It was set to fast for CUDA which should 
> > > > > > still be the case.  See the changes further down on the patch.
> > > > > > 
> > > > > > > 
> > > > > > > There are applications do separated compile/link/codegen stages. 
> > > > > > > In the codegen stage, clang is invoked with .bc as input, 
> > > > > > > therefore this option is off by default, whereas it was on by 
> > > > > > > default before this change.
> > > > > > > 
> > > > > > > Is there any reason not to keep the original behavior?
> > > > > > 
> > > > > > Sorry but I am not sure what changed, can you elaborate on what 
> > > > > > you're doing and how things used to work for you?
> > > > > Before your change, -ffp-contract was a codegen option defined as
> > > > > 
> > > > > 
> > > > > ```
> > > > > ENUM_CODEGENOPT(FPContractMode, FPContractModeKind, 2, FPC_On)
> > > > > ```
> > > > > 
> > > > > therefore the default value as on. After your change, it becomes off 
> > > > > by default.
> > > > No. -ffp-contract=on was a FE option and -ffp-contract=fast was a 
> > > > backend option.  I still don't understand your use-case.  Can you 
> > > > include a small testcase how this used to work before?
> > > -ffp-contract=on used to be a codegen option before this change. In 
> > > CodeGen/BackendUtil.cpp, before this change:
> > > 
> > > ```
> > > switch (CodeGenOpts.getFPContractMode()) {
> > >   switch (LangOpts.getDefaultFPContractMode()) {
> > >   case LangOptions::FPC_Off:
> > > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict;
> > > break;
> > >   case LangOptions::FPC_On:
> > > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard;
> > > break;
> > >   case LangOptions::FPC_Fast:
> > > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast;
> > > break;
> > >   }
> > > ```
> > > By default, -fp-contract=on, which results in llvm::FPOpFusion::Standard 
> > > in the backend. This options allows backend to translate llvm.fmuladd to 
> > > FMA machine instructions.
> > > 
> > > After this change, it becomes:
> > > 
> > > ```
> > > switch (LangOpts.getDefaultFPContractMode()) {
> > >   case LangOptions::FPC_Off:
> > > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict;
> > > break;
> > >   case LangOptions::FPC_On:
> > > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard;
> > > break;
> > >   case LangOptions::FPC_Fast:
> > > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast;
> > > break;
> > >   }
> > > ```
> > > now by default -ffp-contract=off, which results in  
> > > llvm::FPOpFusion::Strict in the backend. This option disables backend 
> > > translating llvm.fmuladd to FMA machine instructions in certain 
> > > situations.
> > > 
> > > A simple lit test is as follows:
> > > 
> > > 
> > > ```
> > > ; RUN: %clang_cc1 -triple amdgcn -S -o - %s| FileCheck %s
> > > 
> > > define amdgpu_kernel void @f(double addrspace(1)* nocapture %out, double 
> > > %a, double %b, double %c) local_unnamed_addr #0 {
> > > entry:
> > >   ; CHECK: v_fma_f64
> > >   %0 = tail call double @llvm.fmuladd.f64(double %b, double %c, double %a)
> > >   store double %0, double addrspace(1)* %out, align 8, !tbaa !6
> > >   ret void
> > > }
> > > 
> > > declare double @llvm.fmuladd.f64(double, double, double) #1
> > > 
> > > attributes #0 = { nounwind 
> > > "correctly-rounded-divide-sqrt-fp-math"="false" 
> > > "disable-tail-calls"="false" "less-precise-fpmad"="false" 
> > > "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" 
> > > "no-jump-tables"="false" "no-nans-fp-math"="false" 
> > > "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" 
> > > "stack-protector-buffer-size"="8" 
> > > 

[PATCH] D31167: Use FPContractModeKind universally

2017-04-12 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: cfe/trunk/include/clang/Basic/LangOptions.def:220
+/// \brief FP_CONTRACT mode (on/off/fast).
+ENUM_LANGOPT(DefaultFPContractMode, FPContractModeKind, 2, FPC_Off, "FP 
contraction type")
 LANGOPT(NoBitFieldTypeAlign , 1, 0, "bit-field type alignment")

anemet wrote:
> hfinkel wrote:
> > yaxunl wrote:
> > > hfinkel wrote:
> > > > yaxunl wrote:
> > > > > anemet wrote:
> > > > > > yaxunl wrote:
> > > > > > > anemet wrote:
> > > > > > > > yaxunl wrote:
> > > > > > > > > This change seemed to cause some performance degradations in 
> > > > > > > > > some OpenCL applications.
> > > > > > > > > 
> > > > > > > > > This option used to be on by default. Now it is off by 
> > > > > > > > > default.
> > > > > > > > > 
> > > > > > > > > There are applications do separated compile/link/codegen 
> > > > > > > > > stages. In the codegen stage, clang is invoked with .bc as 
> > > > > > > > > input, therefore this option is off by default, whereas it 
> > > > > > > > > was on by default before this change.
> > > > > > > > > 
> > > > > > > > > Is there any reason not to keep the original behavior?
> > > > > > > > > 
> > > > > > > > > Thanks.
> > > > > > > > > This change seemed to cause some performance degradations in 
> > > > > > > > > some OpenCL applications.
> > > > > > > > > 
> > > > > > > > > This option used to be on by default. Now it is off by 
> > > > > > > > > default.
> > > > > > > > 
> > > > > > > > It's always been off.  It was set to fast for CUDA which should 
> > > > > > > > still be the case.  See the changes further down on the patch.
> > > > > > > > 
> > > > > > > > > 
> > > > > > > > > There are applications do separated compile/link/codegen 
> > > > > > > > > stages. In the codegen stage, clang is invoked with .bc as 
> > > > > > > > > input, therefore this option is off by default, whereas it 
> > > > > > > > > was on by default before this change.
> > > > > > > > > 
> > > > > > > > > Is there any reason not to keep the original behavior?
> > > > > > > > 
> > > > > > > > Sorry but I am not sure what changed, can you elaborate on what 
> > > > > > > > you're doing and how things used to work for you?
> > > > > > > Before your change, -ffp-contract was a codegen option defined as
> > > > > > > 
> > > > > > > 
> > > > > > > ```
> > > > > > > ENUM_CODEGENOPT(FPContractMode, FPContractModeKind, 2, FPC_On)
> > > > > > > ```
> > > > > > > 
> > > > > > > therefore the default value as on. After your change, it becomes 
> > > > > > > off by default.
> > > > > > No. -ffp-contract=on was a FE option and -ffp-contract=fast was a 
> > > > > > backend option.  I still don't understand your use-case.  Can you 
> > > > > > include a small testcase how this used to work before?
> > > > > -ffp-contract=on used to be a codegen option before this change. In 
> > > > > CodeGen/BackendUtil.cpp, before this change:
> > > > > 
> > > > > ```
> > > > > switch (CodeGenOpts.getFPContractMode()) {
> > > > >   switch (LangOpts.getDefaultFPContractMode()) {
> > > > >   case LangOptions::FPC_Off:
> > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict;
> > > > > break;
> > > > >   case LangOptions::FPC_On:
> > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard;
> > > > > break;
> > > > >   case LangOptions::FPC_Fast:
> > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast;
> > > > > break;
> > > > >   }
> > > > > ```
> > > > > By default, -fp-contract=on, which results in 
> > > > > llvm::FPOpFusion::Standard in the backend. This options allows 
> > > > > backend to translate llvm.fmuladd to FMA machine instructions.
> > > > > 
> > > > > After this change, it becomes:
> > > > > 
> > > > > ```
> > > > > switch (LangOpts.getDefaultFPContractMode()) {
> > > > >   case LangOptions::FPC_Off:
> > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict;
> > > > > break;
> > > > >   case LangOptions::FPC_On:
> > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard;
> > > > > break;
> > > > >   case LangOptions::FPC_Fast:
> > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast;
> > > > > break;
> > > > >   }
> > > > > ```
> > > > > now by default -ffp-contract=off, which results in  
> > > > > llvm::FPOpFusion::Strict in the backend. This option disables backend 
> > > > > translating llvm.fmuladd to FMA machine instructions in certain 
> > > > > situations.
> > > > > 
> > > > > A simple lit test is as follows:
> > > > > 
> > > > > 
> > > > > ```
> > > > > ; RUN: %clang_cc1 -triple amdgcn -S -o - %s| FileCheck %s
> > > > > 
> > > > > define amdgpu_kernel void @f(double addrspace(1)* nocapture %out, 
> > > > > double %a, double %b, double %c) local_unnamed_addr #0 {
> > > > > entry:
> > > > >   ; CHECK: v_fma_f64
> > > > >   %0 = tail call double @llvm.fmuladd.f64(double %b, double %c, 
> > > > > double %a)
> > > > >   store double %0, double addrspace(1)* %out, align 8, !tbaa !6
> > > > >   

[PATCH] D31413: [libc++] Use __attribute__((init_priority(101))) to ensure streams get initialized early

2017-04-11 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D31413#712013, @aaron.ballman wrote:

> I'm not certain of a good way to test it, but I have a question about the 
> value you picked for `init_priority`. My understanding of the values starting 
> from 101 is that 1-100 are reserved for implementation use. Is that 
> understanding correct? If so, you may want to pick a value below 100 to 
> ensure there's not an arms race with the user. I believe this may require 
> some alteration to SemaDeclAttr.cpp to not diagnose when this is coming from 
> a system header. Otherwise, what's to stop the user from having something 
> marked `constructor(101)` that attempts to use a stream, but can't because 
> they're not initialized yet?


I agree; using 100 probably makes sense here. By spec, that's before any 
regular user code might execute.

Regarding testing, it is easy to test this (i.e. use cout/cerr/etc. from a 
constructor of a global object) if you link with libc++.a. Maybe just add a 
simple test and then we can see about setting up a buildbot which builds/tests 
a static libc++?


https://reviews.llvm.org/D31413



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D32199: [TBAASan] A TBAA Sanitizer (Clang)

2017-04-21 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D32199#732737, @rsmith wrote:

> In https://reviews.llvm.org/D32199#732189, @hfinkel wrote:
>
> > In https://reviews.llvm.org/D32199#731472, @rsmith wrote:
> >
> > > 1. C's "effective type" rule allows writes to set the type pretty much 
> > > unconditionally, unless the storage is for a variable with a declared type
> >
> >
> > To come back to this point: We don't really implement these rules now, and 
> > it is not clear that we will. The problem here is that, if we take the 
> > specification literally, then we can't use our current TBAA at all. The 
> > problem is that if we have:
> >
> >   write x, !tbaa "int"
> >   read x, !tbaa "int"
> >   write x, !tbaa "float"
> >   
> >
> > TBAA will currently tell us that the "float" write aliases with neither the 
> > preceding read nor the preceding write.
>
>
> Right, C's TBAA rules do not (in general) permit a store to be reordered 
> before a memory operation of a different type, they only allow loads to be 
> moved before stores. (Put another way, they do not tell you that pointers 
> point to distinct memory locations, just that a stored value cannot be 
> observed by a load of a different type.) You get the more general "distinct 
> memory locations" result only for objects of a declared type.
>
> C++ is similar, except that (because object lifetimes do not currently begin 
> magically due to a store) you /can/ reorder stores past a memory operation of 
> a different type if you know no object's lifetime began in between. (But 
> currently we do not record all lifetime events in IR, so we can't do that 
> today. Also, we may be about to lose the property that you can statically 
> determine a small number of places that might start an object lifetime.)
>
> > Also, a strict reading of C's access rules seems to rule out the premise 
> > underlying our struct-path TBAA entirely. So long as I'm accessing a value 
> > using a struct that has some member, including recursively, with that type, 
> > then it's fine. The matching of the relative offsets is a sufficient, but 
> > not necessary, condition for well-defined access. C++ has essentially the 
> > same language (and, thus, potentially the same problem).
>
> I agree this rule is garbage, but it's not as permissive as I think you're 
> suggesting. The rule says that you can use an lvalue of struct type to access 
> memory of struct field type. In C this happens during struct assignment, for 
> instance. It does *not* permit using an lvalue of struct field type to access 
> unrelated fields of the same struct. So C appears to allow this nonsense:
>
>   char *p = malloc(8);
>   *(int*)p = 0;
>   *(int*)(p + 4) = 0;
>   struct S {int n; float f;} s = *(struct S*)p; // use lvalue of type `struct 
> S` to access object of effective type `int`, to initialize a `float`
>
>
> but not this nonsense:
>
>   float q = ((struct S*)p)->f; // ub, cannot use lvalue of type `float` to 
> access object of effective type `int`
>
>
> ... which just means that we can't make much use of TBAA when emitting struct 
> copies in C.
>
> In C++, on the other hand, the rule is even more garbage, since there is no 
> way to perform a memory access with a glvalue of class type. (The closest you 
> get is that a defaulted union construction/assignment copies the object 
> representation, but that's expressed in terms of copying a sequence of 
> unsigned chars, and in any case those are member functions and so already 
> require an object of the correct type to exist.) See wg21.link/cwg2051


Our struct-path TBAA does the following:

  struct X { int a, b; };
  X x { 50, 100 };
  X *o = (X*) (((int*) ) + 1);
  
  int a_is_b = o->a; // This is UB (or so we say)?


Because we assume that the (type, offset) tuples are identified entities in the 
type-aliasing tree. Practically speaking, this certainly makes sense to me. 
However, I don't see anything in the language that actually forbids this 
behavior. In case it matters, because in the above case the type of the struct 
actually matches, we similarly forbid:

  struct X { int a, b; };
  struct Y { int a; float b; };
  X x { 50, 100 };
  Y *o = (X*) (((int*) ) + 1);
  
  int a_is_b = o->a; // This is UB (or so we say)?

as is this:

  struct X { int a, b; };
  struct Y { int a; float b; X h; /* in case this matters for the aggregate 
members thing */ };
  X x { 50, 100 };
  Y *o = (X*) (((int*) ) + 1);
  
  int a_is_b = o->a; // This is UB (or so we say)?

(although, as you say, this shouldn't matter in C++ because we don't have 
struct glvalues)

In any case, am I missing something?

> 
> 
>> While I'd like the sanitizer to follow the rules, that's really useful only 
>> to the extent that the compilers follow the rules. If the compilers are 
>> making stronger assumptions, then I think that the sanitizer should also.
> 
> I agree.
> 
>>> If we want to follow the relevant language rules by default, that would 
>>> suggest that "writes 

[PATCH] D32199: [TySan] A Type Sanitizer (Clang)

2017-04-21 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel updated this revision to Diff 96135.
hfinkel retitled this revision from "[TBAASan] A TBAA Sanitizer (Clang)" to 
"[TySan] A Type Sanitizer (Clang)".
hfinkel edited the summary of this revision.
hfinkel added a comment.

Rename TBAASanitizer -> TypeSanitizer


https://reviews.llvm.org/D32199

Files:
  include/clang/Basic/Sanitizers.def
  include/clang/Driver/SanitizerArgs.h
  lib/CodeGen/BackendUtil.cpp
  lib/CodeGen/CGDeclCXX.cpp
  lib/CodeGen/CodeGenFunction.cpp
  lib/CodeGen/CodeGenModule.cpp
  lib/CodeGen/CodeGenTBAA.cpp
  lib/Driver/SanitizerArgs.cpp
  lib/Driver/ToolChains/CommonArgs.cpp
  lib/Driver/ToolChains/Linux.cpp
  lib/Lex/PPMacroExpansion.cpp
  test/CodeGen/sanitize-type-attr.cpp
  test/Driver/sanitizer-ld.c

Index: test/Driver/sanitizer-ld.c
===
--- test/Driver/sanitizer-ld.c
+++ test/Driver/sanitizer-ld.c
@@ -181,6 +181,18 @@
 
 // RUN: %clangxx -no-canonical-prefixes %s -### -o %t.o 2>&1 \
 // RUN: -target x86_64-unknown-linux -fuse-ld=ld -stdlib=platform -lstdc++ \
+// RUN: -fsanitize=type \
+// RUN: -resource-dir=%S/Inputs/resource_dir \
+// RUN: --sysroot=%S/Inputs/basic_linux_tree \
+// RUN:   | FileCheck --check-prefix=CHECK-TYSAN-LINUX-CXX %s
+//
+// CHECK-TYSAN-LINUX-CXX: "{{(.*[^-.0-9A-Z_a-z])?}}ld{{(.exe)?}}"
+// CHECK-TYSAN-LINUX-CXX-NOT: stdc++
+// CHECK-TYSAN-LINUX-CXX: "-whole-archive" "{{.*}}libclang_rt.tysan-x86_64.a" "-no-whole-archive"
+// CHECK-TYSAN-LINUX-CXX: stdc++
+
+// RUN: %clangxx -no-canonical-prefixes %s -### -o %t.o 2>&1 \
+// RUN: -target x86_64-unknown-linux -fuse-ld=ld -stdlib=platform -lstdc++ \
 // RUN: -fsanitize=memory \
 // RUN: -resource-dir=%S/Inputs/resource_dir \
 // RUN: --sysroot=%S/Inputs/basic_linux_tree \
Index: test/CodeGen/sanitize-type-attr.cpp
===
--- /dev/null
+++ test/CodeGen/sanitize-type-attr.cpp
@@ -0,0 +1,64 @@
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-llvm -o - %s | FileCheck -check-prefix=WITHOUT %s
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-llvm -o - %s -fsanitize=type | FileCheck -check-prefix=TYSAN %s
+// RUN: echo "src:%s" | sed -e 's/\\//g' > %t
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-llvm -o - %s -fsanitize=type -fsanitize-blacklist=%t | FileCheck -check-prefix=BL %s
+
+// The sanitize_type attribute should be attached to functions
+// when TypeSanitizer is enabled, unless no_sanitize("type") attribute
+// is present.
+
+// WITHOUT:  NoTYSAN1{{.*}}) [[NOATTR:#[0-9]+]]
+// BL:  NoTYSAN1{{.*}}) [[NOATTR:#[0-9]+]]
+// TYSAN:  NoTYSAN1{{.*}}) [[NOATTR:#[0-9]+]]
+__attribute__((no_sanitize("type")))
+int NoTYSAN1(int *a) { return *a; }
+
+// WITHOUT:  NoTYSAN2{{.*}}) [[NOATTR]]
+// BL:  NoTYSAN2{{.*}}) [[NOATTR]]
+// TYSAN:  NoTYSAN2{{.*}}) [[NOATTR]]
+__attribute__((no_sanitize("type")))
+int NoTYSAN2(int *a);
+int NoTYSAN2(int *a) { return *a; }
+
+// WITHOUT:  NoTYSAN3{{.*}}) [[NOATTR:#[0-9]+]]
+// BL:  NoTYSAN3{{.*}}) [[NOATTR:#[0-9]+]]
+// TYSAN:  NoTYSAN3{{.*}}) [[NOATTR:#[0-9]+]]
+__attribute__((no_sanitize("type")))
+int NoTYSAN3(int *a) { return *a; }
+
+// WITHOUT:  TYSANOk{{.*}}) [[NOATTR]]
+// BL:  TYSANOk{{.*}}) [[NOATTR]]
+// TYSAN: TYSANOk{{.*}}) [[WITH:#[0-9]+]]
+int TYSANOk(int *a) { return *a; }
+
+// WITHOUT:  TemplateTYSANOk{{.*}}) [[NOATTR]]
+// BL:  TemplateTYSANOk{{.*}}) [[NOATTR]]
+// TYSAN: TemplateTYSANOk{{.*}}) [[WITH]]
+template
+int TemplateTYSANOk() { return i; }
+
+// WITHOUT:  TemplateNoTYSAN{{.*}}) [[NOATTR]]
+// BL:  TemplateNoTYSAN{{.*}}) [[NOATTR]]
+// TYSAN: TemplateNoTYSAN{{.*}}) [[NOATTR]]
+template
+__attribute__((no_sanitize("type")))
+int TemplateNoTYSAN() { return i; }
+
+int force_instance = TemplateTYSANOk<42>()
+   + TemplateNoTYSAN<42>();
+
+// Check that __cxx_global_var_init* get the sanitize_type attribute.
+int global1 = 0;
+int global2 = *(int*)((char*)+1);
+// WITHOUT: @__cxx_global_var_init{{.*}}[[NOATTR:#[0-9]+]]
+// BL: @__cxx_global_var_init{{.*}}[[NOATTR:#[0-9]+]]
+// TYSAN: @__cxx_global_var_init{{.*}}[[WITH:#[0-9]+]]
+
+// WITHOUT: attributes [[NOATTR]] = { noinline nounwind{{.*}} }
+
+// BL: attributes [[NOATTR]] = { noinline nounwind{{.*}} }
+
+// TYSAN: attributes [[NOATTR]] = { noinline nounwind{{.*}} }
+// TYSAN: attributes [[WITH]] = { noinline nounwind sanitize_type{{.*}} }
+
+// TYSAN: Simple C++ TBAA
Index: lib/Lex/PPMacroExpansion.cpp
===
--- lib/Lex/PPMacroExpansion.cpp
+++ lib/Lex/PPMacroExpansion.cpp
@@ -1137,6 +1137,7 @@
   .Case("dataflow_sanitizer", LangOpts.Sanitize.has(SanitizerKind::DataFlow))
   .Case("efficiency_sanitizer",
 LangOpts.Sanitize.hasOneOf(SanitizerKind::Efficiency))
+  .Case("type_sanitizer", LangOpts.Sanitize.has(SanitizerKind::Type))
   // Objective-C features
   .Case("objc_arr", LangOpts.ObjCAutoRefCount) // FIXME: REMOVE?
  

[PATCH] D32199: [TBAASan] A TBAA Sanitizer (Clang)

2017-04-20 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D32199#732382, @rjmccall wrote:

> If you're going to try to enforce the declared type of memory, you'll also 
> need something like the C effective type rule to handle char buffers in C++.  
> As far as I can tell, it's not actually legal under the spec to cast an array 
> of chars to an arbitrary type and access it that way — you have to do 
> something to establish that there's an object of that type there first. 
>  If you memcpy'ed into that buffer from an object of the right type, that 
> would be sufficient to create a new formal object of that type, but I don't 
> see any way to sensibly apply that rule to e.g. the POSIX "read" function.  
> It seems to me that you at least need to have a rule saying that it's okay to 
> access a formal object of type char/char[] using an arbitrarily-typed l-value.


I agree. That's exactly what the current implementation does (I get that for 
free from our TBAA setup). I get this for free from the TBAA scheme because the 
current checks are symmetric (just like the TBAA checks in the optimizer). I 
had wondered whether this symmetry was an over-approximation in some cases, but 
perhaps it is not.


https://reviews.llvm.org/D32199



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31885: Remove TBAA information from LValues representing union members

2017-04-14 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

I'm not sure this is the right way to do this; I suspect we're lumping together 
a bunch of different bugs:

1. vector types need to have tbaa which makes them alias with their element 
types [to be clear, as vector types are an implementation extension, this is 
our choice; I believe most users expect this to be true, but I'm certainly open 
to leaving this as-is (i.e. the vector types and scalar types as 
independent/non-aliasing)].
2. tbaa can't be used for write <-> write queries (only read <-> write queries) 
because the writes can change the effective type
3. our 'struct' path TBAA for unions is broken (and to fix this we need to 
invert the tree structure, etc. as discussed on the list)

(1) and (3) are independent bugs. (1) if desired, should just be fixed (the 
vector type should be a child of the scalar element type in the current 
representation). (2) needs to be fixed too, is not strictly related to unions, 
and needs to be fixed in the backend. Doing this does not seem hard (we just 
need to record a Write Boolean in MemoryLocation and then check them in 
TypeBasedAAResult (TBAA can't be used if both locations are writes). (3) is 
what this should address. What we should do here is only generate the scalar 
TBAA for the union accesses. As I recall, the only reason that the scalar TBAA 
for union accesses is a problem is because of (2), but that's easy to fix in 
the backend (and we need to do that anyway).

@dberlin , do you agree?


Repository:
  rL LLVM

https://reviews.llvm.org/D31885



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31167: Use FPContractModeKind universally

2017-04-14 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: cfe/trunk/include/clang/Basic/LangOptions.def:220
+/// \brief FP_CONTRACT mode (on/off/fast).
+ENUM_LANGOPT(DefaultFPContractMode, FPContractModeKind, 2, FPC_Off, "FP 
contraction type")
 LANGOPT(NoBitFieldTypeAlign , 1, 0, "bit-field type alignment")

anemet wrote:
> hfinkel wrote:
> > anemet wrote:
> > > hfinkel wrote:
> > > > yaxunl wrote:
> > > > > hfinkel wrote:
> > > > > > yaxunl wrote:
> > > > > > > anemet wrote:
> > > > > > > > yaxunl wrote:
> > > > > > > > > anemet wrote:
> > > > > > > > > > yaxunl wrote:
> > > > > > > > > > > This change seemed to cause some performance degradations 
> > > > > > > > > > > in some OpenCL applications.
> > > > > > > > > > > 
> > > > > > > > > > > This option used to be on by default. Now it is off by 
> > > > > > > > > > > default.
> > > > > > > > > > > 
> > > > > > > > > > > There are applications do separated compile/link/codegen 
> > > > > > > > > > > stages. In the codegen stage, clang is invoked with .bc 
> > > > > > > > > > > as input, therefore this option is off by default, 
> > > > > > > > > > > whereas it was on by default before this change.
> > > > > > > > > > > 
> > > > > > > > > > > Is there any reason not to keep the original behavior?
> > > > > > > > > > > 
> > > > > > > > > > > Thanks.
> > > > > > > > > > > This change seemed to cause some performance degradations 
> > > > > > > > > > > in some OpenCL applications.
> > > > > > > > > > > 
> > > > > > > > > > > This option used to be on by default. Now it is off by 
> > > > > > > > > > > default.
> > > > > > > > > > 
> > > > > > > > > > It's always been off.  It was set to fast for CUDA which 
> > > > > > > > > > should still be the case.  See the changes further down on 
> > > > > > > > > > the patch.
> > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > There are applications do separated compile/link/codegen 
> > > > > > > > > > > stages. In the codegen stage, clang is invoked with .bc 
> > > > > > > > > > > as input, therefore this option is off by default, 
> > > > > > > > > > > whereas it was on by default before this change.
> > > > > > > > > > > 
> > > > > > > > > > > Is there any reason not to keep the original behavior?
> > > > > > > > > > 
> > > > > > > > > > Sorry but I am not sure what changed, can you elaborate on 
> > > > > > > > > > what you're doing and how things used to work for you?
> > > > > > > > > Before your change, -ffp-contract was a codegen option 
> > > > > > > > > defined as
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > ```
> > > > > > > > > ENUM_CODEGENOPT(FPContractMode, FPContractModeKind, 2, FPC_On)
> > > > > > > > > ```
> > > > > > > > > 
> > > > > > > > > therefore the default value as on. After your change, it 
> > > > > > > > > becomes off by default.
> > > > > > > > No. -ffp-contract=on was a FE option and -ffp-contract=fast was 
> > > > > > > > a backend option.  I still don't understand your use-case.  Can 
> > > > > > > > you include a small testcase how this used to work before?
> > > > > > > -ffp-contract=on used to be a codegen option before this change. 
> > > > > > > In CodeGen/BackendUtil.cpp, before this change:
> > > > > > > 
> > > > > > > ```
> > > > > > > switch (CodeGenOpts.getFPContractMode()) {
> > > > > > >   switch (LangOpts.getDefaultFPContractMode()) {
> > > > > > >   case LangOptions::FPC_Off:
> > > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict;
> > > > > > > break;
> > > > > > >   case LangOptions::FPC_On:
> > > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard;
> > > > > > > break;
> > > > > > >   case LangOptions::FPC_Fast:
> > > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast;
> > > > > > > break;
> > > > > > >   }
> > > > > > > ```
> > > > > > > By default, -fp-contract=on, which results in 
> > > > > > > llvm::FPOpFusion::Standard in the backend. This options allows 
> > > > > > > backend to translate llvm.fmuladd to FMA machine instructions.
> > > > > > > 
> > > > > > > After this change, it becomes:
> > > > > > > 
> > > > > > > ```
> > > > > > > switch (LangOpts.getDefaultFPContractMode()) {
> > > > > > >   case LangOptions::FPC_Off:
> > > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict;
> > > > > > > break;
> > > > > > >   case LangOptions::FPC_On:
> > > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard;
> > > > > > > break;
> > > > > > >   case LangOptions::FPC_Fast:
> > > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast;
> > > > > > > break;
> > > > > > >   }
> > > > > > > ```
> > > > > > > now by default -ffp-contract=off, which results in  
> > > > > > > llvm::FPOpFusion::Strict in the backend. This option disables 
> > > > > > > backend translating llvm.fmuladd to FMA machine instructions in 
> > > > > > > certain situations.
> > > > > > > 
> > > > > > > A simple lit test is as follows:
> > > > > > > 
> > > > > > > 
> > 

[PATCH] D32199: [TBAASan] A TBAA Sanitizer (Clang)

2017-04-20 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D32199#731472, @rsmith wrote:

> > As I've currently implemented it, both reads and writes set the type of 
> > previously-unknown storage, and after that it says fixed (unless you memcpy 
> > to it, memset it, or its lifetime ends (the type gets reset on 
> > lifetime.start/end and for malloc/allocas/etc.). There's a flag to enable 
> > the "writes always set the type" rule, but that's not the default. Is this 
> > too strict?
>
> That seems like it will have at least three flavors of false positive:
>
> 1. C's "effective type" rule allows writes to set the type pretty much 
> unconditionally, unless the storage is for a variable with a declared type
> 2. After a placement new in C++, you should be able to use the storage as a 
> new type
> 3. Storing directly to a member access on a union (ie, with the syntax `x.a = 
> b`) in C++ permits using the storage as the new type
>
>   If we want to follow the relevant language rules by default, that would 
> suggest that "writes always set the type" should be enabled by default in C 
> and disabled by default in C++. That may not be the right decision for other 
> reasons, though. In C++, writes through union members and new-expressions 
> should probably (re)set the type


Fair enough. For now we'll default to write-sets-the-type as the default. We 
can always add 'sticky' types later to correspond to types set by declaration.

> (do you have intrinsics the frontend can use to do so?).

I had thought that we could just use a lifetime.end/start pair to mark the 
placement new, etc. However, it might be better to use some dedicated intrinsic 
for this purpose?


https://reviews.llvm.org/D32199



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31885: Remove TBAA information from LValues representing union members

2017-04-19 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

> There was a deliberate decision to make TBAA conservative about type punning 
> in LLVM because, in practice, the strict form of TBAA you think we should 
> follow has proven to be a failed optimization paradigm: it just leads users 
> to globally disable TBAA, which is obviously a huge loss. This was never 
> motivated purely by the nonsensicality of the standardization of unions.

Okay, so the problem right now is that we're **not** conservative regarding 
union type punning, and while BasicAA sometimes fixes this for us, it doesn't 
always, and can't always, and so now we're miscompiling code.  Removing the 
TBAA from union accesses would make us conservative. In the long term, I think 
we agree that we need a better way of actually representing unions in our TBAA 
representation. The question is what to do in the short term. Maybe there's 
some less drastic short-term solution that addresses these test cases?

> (And the nonsensicality of the standard very much continues. The code that 
> we've all agreed is obviously being miscompiled here is still a strict 
> violation of the "improved" union rules in C++, and if we decide to treat it 
> as ill-formed we're going to break a massive amount of code (and vector code 
> in particular heavily uses unions), because trust me, nobody is reliably 
> placement-newing union fields when they're non-trivial. C++'s new rules are 
> unworkable in part because they rely on C++-specific features that cannot be 
> easily adopted in headers which need to be language-neutral.)

I think we all agree that the vector types need to have their scalar (element) 
types as parents in the current TBAA representation. That's a separate change 
that we should just make. That, unfortunately, only fixes a subset of the 
union-related miscompiles.

@rjmccall, if you agree that not adding TBAA to union accesses is reasonable at 
this stage, what do you think of this implementation of that goal?


Repository:
  rL LLVM

https://reviews.llvm.org/D31885



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31885: Remove TBAA information from LValues representing union members

2017-04-19 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D31885#730728, @kparzysz wrote:

> In https://reviews.llvm.org/D31885#730038, @hfinkel wrote:
>
> > I'm somewhat concerned that this patch is quadratic in the AST.
>
>
> I'd be happy to address this, but I'm not sure how.  Memoizing results could 
> be one way, but don't know if that's acceptable.
>
> This location in codegen seems to be the last place where the original C/C++ 
> types are available, in particular the information as to whether a type is a 
> union or not.  Maybe it would be possible to propagate some bit somewhere, 
> but then this patch would become much less localized.


I'm not worried particularly about localization because we want to extend TBAA 
to support unions, etc. anyway. I suppose I don't understand how the 
propagation that is necessary here differs from what is necessary to propagate 
the base-type information for structs.


Repository:
  rL LLVM

https://reviews.llvm.org/D31885



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D32199: [TBAASan] A TBAA Sanitizer (Clang)

2017-04-20 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D32199#731472, @rsmith wrote:

> > As I've currently implemented it, both reads and writes set the type of 
> > previously-unknown storage, and after that it says fixed (unless you memcpy 
> > to it, memset it, or its lifetime ends (the type gets reset on 
> > lifetime.start/end and for malloc/allocas/etc.). There's a flag to enable 
> > the "writes always set the type" rule, but that's not the default. Is this 
> > too strict?
>
> That seems like it will have at least three flavors of false positive:
>
> 1. C's "effective type" rule allows writes to set the type pretty much 
> unconditionally, unless the storage is for a variable with a declared type


To come back to this point: We don't really implement these rules now, and it 
is not clear that we will. The problem here is that, if we take the 
specification literally, then we can't use our current TBAA at all. The problem 
is that if we have:

  write x, !tbaa "int"
  read x, !tbaa "int"
  write x, !tbaa "float"

TBAA will currently tell us that the "float" write aliases with neither the 
preceding read nor the preceding write. As a result, we might move the "float" 
write to before the read (which is wrong) or before the write (also wrong). It 
seems to me that following the effective-type rules strictly will require that 
we only emit TBAA for memory accesses we can prove are to declared variables 
(as these are the only ones whose types don't get changed just by writing to 
them). We could certainly do that (*), although it is going to make TBAA 
awfully limited. As @dberlin has asserted, GCC does not implement these rules 
either.

To be fair, there are inferences we might draw from TBAA on all access that are 
not incorrect. For example, if we have:

  write x, !tbaa "int"
  write y, !tbaa "float"
  read x, !tbaa "int"

and we can indeed conclude that the write to y and the read from x don't alias 
(because the write happens before the read). This is because the effective type 
of y must be float after the write and so we know that the read from x must be 
accessing a different object. We can also conclude that the writes don't alias, 
but only because of the later read. The sad part is that if we use this 
information to reorder the read before the write to y (which we might do to 
eliminate the read), we now lose our ability to use TBAA to tell us anything.

Also, a strict reading of C's access rules seems to rule out the premise 
underlying our struct-path TBAA entirely. So long as I'm accessing a value 
using a struct that has some member, including recursively, with that type, 
then it's fine. The matching of the relative offsets is a sufficient, but not 
necessary, condition for well-defined access. C++ has essentially the same 
language (and, thus, potentially the same problem).

While I'd like the sanitizer to follow the rules, that's really useful only to 
the extent that the compilers follow the rules. If the compilers are making 
stronger assumptions, then I think that the sanitizer should also. OTOH, maybe 
we should change our TBAA representation/implementation to actually follow the 
rules, and then have a sanitizer that does the same.

(*) The best way I can think of to do this is to tag globals and allocas with 
tbaa according to their declared type, something similar for function 
arguments, and then in TBAA, instead of comparing the TBAA metadata of both 
operands directly, we call getUnderlyingObjects on the pointers, get the 
corresponding most-generic TBAA from the objects themselves, and then compare 
that TBAA to the TBAA from the other access.

> 2. After a placement new in C++, you should be able to use the storage as a 
> new type
> 3. Storing directly to a member access on a union (ie, with the syntax `x.a = 
> b`) in C++ permits using the storage as the new type

Yes, although for the sake of discussion, this is only true if a is a 
"non-class, non-array type, or of a class type with a trivial default 
constructor that is not
deleted, or an array of such types." That seems potentially useful.

> If we want to follow the relevant language rules by default, that would 
> suggest that "writes always set the type" should be enabled by default in C 
> and disabled by default in C++. That may not be the right decision for other 
> reasons, though. In C++, writes through union members and new-expressions 
> should probably (re)set the type (do you have intrinsics the frontend can use 
> to do so?).

Also, in C, memcpy gets to copy the type for a destination that does not have 
declared types. It looks like the same is true for C++ for trivially-copyable 
types. Is the first read/write sets the unknown type (i.e. memory from 
malloc/calloc/memset, etc.) correct for C++ also? I recall discussing something 
along these lines in Kona.


https://reviews.llvm.org/D32199



___
cfe-commits mailing list
cfe-commits@lists.llvm.org

[PATCH] D32301: Don't pass FPOpFusion::Strict to the backend

2017-04-20 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM


https://reviews.llvm.org/D32301



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31885: Remove TBAA information from LValues representing union members

2017-04-14 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D31885#727307, @kparzysz wrote:

> This is not meant as a fine-grained solution to the TBAA problem, but a 
> temporary fix for generating wrong information.  Just yesterday I helped 
> diagnose yet another problem related to this, so this issue is causing 
> trouble out there.


I understand. But temporary solutions have a tendency to become fairly long 
lived and I'd like to make sure we're not removing correct information (i.e. 
the scalar type information) along with the incorrect information. This has 
been broken for a long time, if it takes another few days to get a better fix 
then that's worthwhile.


Repository:
  rL LLVM

https://reviews.llvm.org/D31885



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D32199: [TBAASan] A TBAA Sanitizer (Clang)

2017-04-19 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D32199#730716, @filcab wrote:

> Missing a `sanitizer-ld.c` test for freebsd.


Thanks for pointing this out. I'm going to remove the freebsd change for now. I 
suspect it works, but I've not tested it yet.




Comment at: include/clang/Basic/Attr.td:1849
+   GNU<"no_sanitize_memory">,
+   GNU<"no_sanitize_tbaa">];
   let Subjects = SubjectList<[Function, GlobalVar], ErrorDiag,

filcab wrote:
> Shouldn't you be extending the no_sanitize attribute instead, as it says in 
> the comment?
Indeed. I think that happened also. I'll make sure the tests reflect that.



Comment at: lib/CodeGen/CodeGenTBAA.cpp:96
+  if (!Features.Sanitize.has(SanitizerKind::TBAA) &&
+  (CodeGenOpts.OptimizationLevel == 0 || CodeGenOpts.RelaxedAliasing))
 return nullptr;

filcab wrote:
> Interesting that TSan needs TBAA (per the comment in the chunk above), but is 
> not in this condition.
Yea, I didn't understand the TSan-needs-TBAA bit either (considering that it 
does not need it at -O0?).


https://reviews.llvm.org/D32199



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D32199: [TBAASan] A TBAA Sanitizer (Clang)

2017-04-19 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel updated this revision to Diff 95829.
hfinkel added a comment.

Updated per review comments (use only no_sanitize("tbaa") instead of adding 
no_sanitize_tbaa and don't touch freebsd for now).


https://reviews.llvm.org/D32199

Files:
  include/clang/Basic/Sanitizers.def
  include/clang/Driver/SanitizerArgs.h
  lib/CodeGen/BackendUtil.cpp
  lib/CodeGen/CGDeclCXX.cpp
  lib/CodeGen/CodeGenFunction.cpp
  lib/CodeGen/CodeGenModule.cpp
  lib/CodeGen/CodeGenTBAA.cpp
  lib/Driver/SanitizerArgs.cpp
  lib/Driver/ToolChains/CommonArgs.cpp
  lib/Driver/ToolChains/Linux.cpp
  lib/Frontend/CompilerInvocation.cpp
  lib/Lex/PPMacroExpansion.cpp
  lib/Sema/SemaDeclAttr.cpp
  test/CodeGen/sanitize-tbaa-attr.cpp
  test/Driver/sanitizer-ld.c

Index: test/Driver/sanitizer-ld.c
===
--- test/Driver/sanitizer-ld.c
+++ test/Driver/sanitizer-ld.c
@@ -181,6 +181,18 @@
 
 // RUN: %clangxx -no-canonical-prefixes %s -### -o %t.o 2>&1 \
 // RUN: -target x86_64-unknown-linux -fuse-ld=ld -stdlib=platform -lstdc++ \
+// RUN: -fsanitize=tbaa \
+// RUN: -resource-dir=%S/Inputs/resource_dir \
+// RUN: --sysroot=%S/Inputs/basic_linux_tree \
+// RUN:   | FileCheck --check-prefix=CHECK-TBAASAN-LINUX-CXX %s
+//
+// CHECK-TBAASAN-LINUX-CXX: "{{(.*[^-.0-9A-Z_a-z])?}}ld{{(.exe)?}}"
+// CHECK-TBAASAN-LINUX-CXX-NOT: stdc++
+// CHECK-TBAASAN-LINUX-CXX: "-whole-archive" "{{.*}}libclang_rt.tbaasan-x86_64.a" "-no-whole-archive"
+// CHECK-TBAASAN-LINUX-CXX: stdc++
+
+// RUN: %clangxx -no-canonical-prefixes %s -### -o %t.o 2>&1 \
+// RUN: -target x86_64-unknown-linux -fuse-ld=ld -stdlib=platform -lstdc++ \
 // RUN: -fsanitize=memory \
 // RUN: -resource-dir=%S/Inputs/resource_dir \
 // RUN: --sysroot=%S/Inputs/basic_linux_tree \
Index: test/CodeGen/sanitize-tbaa-attr.cpp
===
--- /dev/null
+++ test/CodeGen/sanitize-tbaa-attr.cpp
@@ -0,0 +1,64 @@
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-llvm -o - %s | FileCheck -check-prefix=WITHOUT %s
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-llvm -o - %s -fsanitize=tbaa | FileCheck -check-prefix=TBAASAN %s
+// RUN: echo "src:%s" | sed -e 's/\\//g' > %t
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-llvm -o - %s -fsanitize=tbaa -fsanitize-blacklist=%t | FileCheck -check-prefix=BL %s
+
+// The sanitize_tbaa attribute should be attached to functions
+// when TBAASanitizer is enabled, unless no_sanitize("tbaa") attribute
+// is present.
+
+// WITHOUT:  NoTBAASAN1{{.*}}) [[NOATTR:#[0-9]+]]
+// BL:  NoTBAASAN1{{.*}}) [[NOATTR:#[0-9]+]]
+// TBAASAN:  NoTBAASAN1{{.*}}) [[NOATTR:#[0-9]+]]
+__attribute__((no_sanitize("tbaa")))
+int NoTBAASAN1(int *a) { return *a; }
+
+// WITHOUT:  NoTBAASAN2{{.*}}) [[NOATTR]]
+// BL:  NoTBAASAN2{{.*}}) [[NOATTR]]
+// TBAASAN:  NoTBAASAN2{{.*}}) [[NOATTR]]
+__attribute__((no_sanitize("tbaa")))
+int NoTBAASAN2(int *a);
+int NoTBAASAN2(int *a) { return *a; }
+
+// WITHOUT:  NoTBAASAN3{{.*}}) [[NOATTR:#[0-9]+]]
+// BL:  NoTBAASAN3{{.*}}) [[NOATTR:#[0-9]+]]
+// TBAASAN:  NoTBAASAN3{{.*}}) [[NOATTR:#[0-9]+]]
+__attribute__((no_sanitize("tbaa")))
+int NoTBAASAN3(int *a) { return *a; }
+
+// WITHOUT:  TBAASANOk{{.*}}) [[NOATTR]]
+// BL:  TBAASANOk{{.*}}) [[NOATTR]]
+// TBAASAN: TBAASANOk{{.*}}) [[WITH:#[0-9]+]]
+int TBAASANOk(int *a) { return *a; }
+
+// WITHOUT:  TemplateTBAASANOk{{.*}}) [[NOATTR]]
+// BL:  TemplateTBAASANOk{{.*}}) [[NOATTR]]
+// TBAASAN: TemplateTBAASANOk{{.*}}) [[WITH]]
+template
+int TemplateTBAASANOk() { return i; }
+
+// WITHOUT:  TemplateNoTBAASAN{{.*}}) [[NOATTR]]
+// BL:  TemplateNoTBAASAN{{.*}}) [[NOATTR]]
+// TBAASAN: TemplateNoTBAASAN{{.*}}) [[NOATTR]]
+template
+__attribute__((no_sanitize("tbaa")))
+int TemplateNoTBAASAN() { return i; }
+
+int force_instance = TemplateTBAASANOk<42>()
+   + TemplateNoTBAASAN<42>();
+
+// Check that __cxx_global_var_init* get the sanitize_tbaa attribute.
+int global1 = 0;
+int global2 = *(int*)((char*)+1);
+// WITHOUT: @__cxx_global_var_init{{.*}}[[NOATTR:#[0-9]+]]
+// BL: @__cxx_global_var_init{{.*}}[[NOATTR:#[0-9]+]]
+// TBAASAN: @__cxx_global_var_init{{.*}}[[WITH:#[0-9]+]]
+
+// WITHOUT: attributes [[NOATTR]] = { noinline nounwind{{.*}} }
+
+// BL: attributes [[NOATTR]] = { noinline nounwind{{.*}} }
+
+// TBAASAN: attributes [[NOATTR]] = { noinline nounwind{{.*}} }
+// TBAASAN: attributes [[WITH]] = { noinline nounwind sanitize_tbaa{{.*}} }
+
+// TBAASAN: Simple C++ TBAA
Index: lib/Sema/SemaDeclAttr.cpp
===
--- lib/Sema/SemaDeclAttr.cpp
+++ lib/Sema/SemaDeclAttr.cpp
@@ -5733,7 +5733,8 @@
 .Case("no_address_safety_analysis", "address")
 .Case("no_sanitize_address", "address")
 .Case("no_sanitize_thread", "thread")
-.Case("no_sanitize_memory", 

[PATCH] D31885: Remove TBAA information from LValues representing union members

2017-04-19 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D31885#730853, @hfinkel wrote:

>


...

> 
> 
>> (And the nonsensicality of the standard very much continues. The code that 
>> we've all agreed is obviously being miscompiled here is still a strict 
>> violation of the "improved" union rules in C++, and if we decide to treat it 
>> as ill-formed we're going to break a massive amount of code (and vector code 
>> in particular heavily uses unions), because trust me, nobody is reliably 
>> placement-newing union fields when they're non-trivial. C++'s new rules are 
>> unworkable in part because they rely on C++-specific features that cannot be 
>> easily adopted in headers which need to be language-neutral.)
> 
> I think we all agree that the vector types need to have their scalar 
> (element) types as parents in the current TBAA representation. That's a 
> separate change that we should just make. That, unfortunately, only fixes a 
> subset of the union-related miscompiles.

Looking at this in more detail, we actually generate builtin-vector types which 
use char* as their access type (always, even for floating-point vectors). It is 
only the struct-path info which causes the issue for the test case here. It 
might be reasonable to make this **less** conservative, but that's another 
story...


Repository:
  rL LLVM

https://reviews.llvm.org/D31885



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31561: cmath: Skip Libc for integral types in isinf, etc.

2017-04-19 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D31561#731498, @dexonsmith wrote:

> Marshall, that's what I assumed originally, but I figured Hal had some 
> non-standard-but-worth-supporting use case in mind.


In this case, future-proofing and mathematical precision seemed aligned. I 
suspect that we'll end up with fixed-point types, etc. at some point, and users 
already certainly have such types (although it is true that using those types 
with the std:: math functions is not something we're required to support).

> Hal, what do you think?

I think that it is fine either way. Marshall is certainly correct, there are no 
types we're required to support that makes this a meaningful distinction.


https://reviews.llvm.org/D31561



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D32260: [TBAA] Vector types should (only) alias with their element types

2017-04-19 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel abandoned this revision.
hfinkel added a comment.

@rjmccall said, on this topic, in https://reviews.llvm.org/D31885:

> The root problem there is that the design of vector types and vector 
> interfaces is generally quite bad; you cannot rely on things like vectors 
> being stored with an appropriate element type for whatever value the user is 
> actually trying to work with.



> For example, Clang's xmmintrin.h specifies that __m128 is a vector of 4 ints. 
>  How confident are you that that type is never used to store a vector of 4 
> floats?  Keep in mind that the compiler allows __m128 to freely implicitly 
> convert to __v4sf.

And I think he's right; we probably can't do this in general. vector types 
really are just bags of bits on many platforms ;)


https://reviews.llvm.org/D32260



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D32199: [TBAASan] A TBAA Sanitizer (Clang)

2017-04-19 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D32199#731332, @rsmith wrote:

> > ! In https://reviews.llvm.org/D32199#731252, @hfinkel wrote:
> > 
> >> How about renaming this to something more like `-fsanitize=type`?
> > 
> > I'm fine with that. Do you like TypeSanitizer or TypeAccessSantizer or 
> > TypeAliasingSanitizer best?
>
> I think calling it a type aliasing sanitizer would somewhat conflate the 
> details of the mechanism with the fundamentals of the check itself. For 
> example:
>
>   variant v;
>   int  = v.get;
>   v = 1.3f;
>   int m = n;
>
>
> ... is a lifetime bug, not an aliasing bug, but would be caught by this check 
> just the same.


Good point.

>   I'd be tempted to suggest EffectiveTypeSanitizer, since we seem to be 
> more-or-less directly implementing C's effective type rules, except that name 
> isn't so good for the C++ case. And in the longer term we will probably want 
> to provide an option to enforce the real C++ lifetime rules whereby a store 
> with certain !tbaa metadata is not sufficient to change the type of storage.

As I've currently implemented it, both reads and writes set the type of 
previously-unknown storage, and after that it says fixed (unless you memcpy to 
it, memset it, or its lifetime ends (the type gets reset on lifetime.start/end 
and for malloc/allocas/etc.). There's a flag to enable the "writes always set 
the type" rule, but that's not the default. Is this too strict?

> 
> 
>> One potential concern with calling it the type sanitizer is that we have an 
>> abbreviation overlap with the thread sanitizer.
> 
> Perhaps we could abbreviate it as "tysan"? *shrug*

SGTM.


https://reviews.llvm.org/D32199



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D32260: [TBAA] Vector types should (only) alias with their element types

2017-04-19 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel created this revision.
Herald added a subscriber: mcrosier.

Currently, all of our builtin vector types are equivalent to char for TBAA 
purposes. It would be useful to make this less conservative. This patch makes 
vector types equivalent to their element types for type-aliasing purposes. I 
think it's common for programmers to assume that they can cast freely between 
the vector types and the scalar types (so long as they're being sufficiently 
careful about alignments), and we should certainly preserve that model. Given 
that we assume that int* and float* don't alias, I don't think we need to 
assume that int* and vec_of_float* alias or vec_of_int* and vec_of_float* alias.

The cases I've seen where I know this would be helpful involve integral scalar 
pointers and floating-point vector pointers, so I'm generalizing a bit here. If 
this would break too much code, one option is fall back to doing this only for 
floating-point types. That having been said, I know that it is common on some 
platforms to case between floating-point vectors and integer vectors in order 
to implement operations like fabs, and maybe that makes this untenable on those 
platforms?

Thoughts?


https://reviews.llvm.org/D32260

Files:
  lib/CodeGen/CodeGenTBAA.cpp
  test/CodeGen/tbaa-vector.cpp


Index: test/CodeGen/tbaa-vector.cpp
===
--- /dev/null
+++ test/CodeGen/tbaa-vector.cpp
@@ -0,0 +1,17 @@
+// RUN: %clang_cc1 -triple x86_64-apple-darwin -O1 -no-struct-path-tbaa 
-disable-llvm-passes %s -emit-llvm -o - | FileCheck %s
+
+typedef double __m256d __attribute__((__vector_size__(32)));
+__m256d foo(double *x, __m256d *y) {
+  *x = 0.0;
+  return *y;
+
+// CHECK-LABEL: define <4 x double> @_Z3fooPdPDv4_d
+// CHECK: store double 0.00e+00, double* %{{.*}}, align 8, !tbaa 
[[TAG_double:!.*]]
+// CHECK: load <4 x double>, <4 x double>* %{{.*}}, align {{.*}}, !tbaa 
[[TAG_double]]
+}
+
+// CHECK: [[TYPE_char:!.*]] = !{!"omnipotent char", [[TAG_cxx_tbaa:!.*]],
+// CHECK: [[TAG_cxx_tbaa]] = !{!"Simple C++ TBAA"}
+// CHECK: [[TAG_double]] = !{[[TYPE_double:!.*]], [[TYPE_double]], i64 0}
+// CHECK: [[TYPE_double]] = !{!"double", [[TYPE_char]],
+
Index: lib/CodeGen/CodeGenTBAA.cpp
===
--- lib/CodeGen/CodeGenTBAA.cpp
+++ lib/CodeGen/CodeGenTBAA.cpp
@@ -101,6 +101,10 @@
   if (TypeHasMayAlias(QTy))
 return getChar();
 
+  // Vector types should alias with their element types.
+  if (auto *VT = QTy->getAs())
+QTy = VT->getElementType();
+
   const Type *Ty = Context.getCanonicalType(QTy).getTypePtr();
 
   if (llvm::MDNode *N = MetadataCache[Ty])


Index: test/CodeGen/tbaa-vector.cpp
===
--- /dev/null
+++ test/CodeGen/tbaa-vector.cpp
@@ -0,0 +1,17 @@
+// RUN: %clang_cc1 -triple x86_64-apple-darwin -O1 -no-struct-path-tbaa -disable-llvm-passes %s -emit-llvm -o - | FileCheck %s
+
+typedef double __m256d __attribute__((__vector_size__(32)));
+__m256d foo(double *x, __m256d *y) {
+  *x = 0.0;
+  return *y;
+
+// CHECK-LABEL: define <4 x double> @_Z3fooPdPDv4_d
+// CHECK: store double 0.00e+00, double* %{{.*}}, align 8, !tbaa [[TAG_double:!.*]]
+// CHECK: load <4 x double>, <4 x double>* %{{.*}}, align {{.*}}, !tbaa [[TAG_double]]
+}
+
+// CHECK: [[TYPE_char:!.*]] = !{!"omnipotent char", [[TAG_cxx_tbaa:!.*]],
+// CHECK: [[TAG_cxx_tbaa]] = !{!"Simple C++ TBAA"}
+// CHECK: [[TAG_double]] = !{[[TYPE_double:!.*]], [[TYPE_double]], i64 0}
+// CHECK: [[TYPE_double]] = !{!"double", [[TYPE_char]],
+
Index: lib/CodeGen/CodeGenTBAA.cpp
===
--- lib/CodeGen/CodeGenTBAA.cpp
+++ lib/CodeGen/CodeGenTBAA.cpp
@@ -101,6 +101,10 @@
   if (TypeHasMayAlias(QTy))
 return getChar();
 
+  // Vector types should alias with their element types.
+  if (auto *VT = QTy->getAs())
+QTy = VT->getElementType();
+
   const Type *Ty = Context.getCanonicalType(QTy).getTypePtr();
 
   if (llvm::MDNode *N = MetadataCache[Ty])
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D32199: [TBAASan] A TBAA Sanitizer (Clang)

2017-04-19 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D32199#731144, @rsmith wrote:

>


...

> I would also expect that we will extend this in future to assign types to 
> storage even in cases where there is no store (for instance, we should be 
> able to catch `float f() { int n; return *(float*) }` despite there being 
> no TBAA violation in the naive IR).

Yes. My thought was that we'd make Clang generate tbaa metadata on allocas and 
lifetime.start intrinsics (and globals) so that we can mark the memory types 
upon creation. Would that catch everything?

> How about renaming this to something more like `-fsanitize=type`?

I'm fine with that. Do you like TypeSanitizer or TypeAccessSantizer or 
TypeAliasingSanitizer best?

One potential concern with calling it the type sanitizer is that we have an 
abbreviation overlap with the thread sanitizer.


https://reviews.llvm.org/D32199



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D32199: [TBAASan] A TBAA Sanitizer (Clang)

2017-04-19 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D32199#731249, @hfinkel wrote:

> In https://reviews.llvm.org/D32199#731144, @rsmith wrote:
>
> >
>
>
> ...
>
> > I would also expect that we will extend this in future to assign types to 
> > storage even in cases where there is no store (for instance, we should be 
> > able to catch `float f() { int n; return *(float*) }` despite there 
> > being no TBAA violation in the naive IR).
>
> Yes. My thought was that we'd make Clang generate tbaa metadata on allocas 
> and lifetime.start intrinsics (and globals) so that we can mark the memory 
> types upon creation. Would that catch everything?


Also, we'd also want to do something similar for byval arguments. This may just 
be additional motivation to allow metadata on function arguments (there's an 
RFC on llvm-dev about this presently).

> 
> 
>> How about renaming this to something more like `-fsanitize=type`?
> 
> I'm fine with that. Do you like TypeSanitizer or TypeAccessSantizer or 
> TypeAliasingSanitizer best?
> 
> One potential concern with calling it the type sanitizer is that we have an 
> abbreviation overlap with the thread sanitizer.




https://reviews.llvm.org/D32199



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D32199: [TBAASan] A TBAA Sanitizer (Clang)

2017-04-19 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

As a note: As follow-up work, we might want to extend Clang to add TBAA 
metadata to allocas and lifetime.start intrinsics so that the instrumentation 
pass can mark the types of memory upon declaration/construction instead of only 
upon first access.


https://reviews.llvm.org/D32199



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D12689: [libc++][static linking] std streams are not initialized prior to their use in static object constructors

2017-03-09 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D12689#515120, @eastig wrote:

> The issue is reported as:
>  https://llvm.org/bugs/show_bug.cgi?id=28954


We do need to fix this bug; I posted to the PR so we can discuss approach there.


https://reviews.llvm.org/D12689



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D30415: Fix -mno-altivec cannot overwrite -maltivec option

2017-03-14 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM


https://reviews.llvm.org/D30415



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D30415: Fix -mno-altivec cannot overwrite -maltivec option

2017-03-14 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D30415#700534, @nemanjai wrote:

> This seems to change the relationship between -m[no-]altivec and 
> -f[no-]altivec which has been kind of contentious for a while. Can you add a 
> note as to whether the new behaviour matches the GCC behaviour. Also, perhaps 
> @echristo might want to chime in as to whether this is the desired behaviour 
> or not.


Why do you say this? It only seems to affect whether the syntax extensions can 
be disabled via the appropriate flag.


https://reviews.llvm.org/D30415



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D30923: Warn on enum assignment to bitfields that can't fit all values

2017-03-14 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D30923#700620, @thakis wrote:

> Maybe it should have some "to suppress the warning, do X" fixit?


I think we should at least include how many bits the field should have to fix 
the problem (pointing to the relevant field definition certainly seems helpful).


https://reviews.llvm.org/D30923



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D30923: Warn on enum assignment to bitfields that can't fit all values

2017-03-14 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D30923#700696, @rnk wrote:

> In https://reviews.llvm.org/D30923#700626, @hfinkel wrote:
>
> > In https://reviews.llvm.org/D30923#700620, @thakis wrote:
> >
> > > Maybe it should have some "to suppress the warning, do X" fixit?
> >
> >
> > I think we should at least include how many bits the field should have to 
> > fix the problem (pointing to the relevant field definition certainly seems 
> > helpful).
>
>
> Agreed, I was just hacking that together. :)
>
> Do you think it's worth indicating that the error can be suppressed with an 
> explicit cast, or is that wasted space?


What might this look like? Also, I don't see a regression test for this.


https://reviews.llvm.org/D30923



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D29750: [PPC] Enable -fomit-frame-pointer by default for PPC

2017-03-01 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM


https://reviews.llvm.org/D29750



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31885: Remove TBAA information from LValues representing union members

2017-04-14 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

> I'm not sure about the solution to #2, because i thought there were very 
> specific points in time at which the effective type could change.

I think this is a key point. I'm not sure that there are specific points that 
the frontend can deduce:

  union U {
int i;
float f;
  };
  
  void bar1(int *i) {
*i = 0; // we just reset the type here
  }
  
  void bar2(float *f) {
*f = 0.0f; // we just reset the type here too
  }
  
  void foo(U *u) {
bar1(>i);
bar2(>f);
  }

Even if the union has structs instead of scalar types, I'm not sure that 
changes the situation. There certainly are situation where you can't silently 
change the types of objects in C++ just by starting to write a to 
differently-typed object at the same location, but I think that using this 
property relies on having some lifetime information for the objects in 
question, and so AA would need to be able to use this lifetime information to 
do more. This seems like an orthogonal issue (i.e. we can always add TBAA write 
<-> write ability in the presence of such lifetime information as an additional 
feature). Maybe I'm missing something...

> As we've repeatedly said, memory locations and pointers don't really have 
> tbaa info, instructions do.

I agree (and I think we'll need to do this in order to handle such lifetime 
information (and there are also issues around noalias handling that are 
relevant)).


Repository:
  rL LLVM

https://reviews.llvm.org/D31885



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31885: Remove TBAA information from LValues representing union members

2017-04-14 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D31885#727371, @efriedma wrote:

> In https://reviews.llvm.org/D31885#727167, @hfinkel wrote:
>
> > I'm not sure this is the right way to do this; I suspect we're lumping 
> > together a bunch of different bugs:
> >
> > 1. vector types need to have tbaa which makes them alias with their element 
> > types [to be clear, as vector types are an implementation extension, this 
> > is our choice; I believe most users expect this to be true, but I'm 
> > certainly open to leaving this as-is (i.e. the vector types and scalar 
> > types as independent/non-aliasing)].
> > 2. tbaa can't be used for write <-> write queries (only read <-> write 
> > queries) because the writes can change the effective type
> > 3. our 'struct' path TBAA for unions is broken (and to fix this we need to 
> > invert the tree structure, etc. as discussed on the list)
>
>
> See https://bugs.llvm.org/show_bug.cgi?id=28189 for a testcase for (2) for 
> this which doesn't involve unions.


Yes, this is what I had in mind. However, we may just want to not handle this 
at all. The demonstration you provide:

  #include 
  #include 
  #include 
  int f(int* x, int *y) {
*x = 10;
int z = *y;
*(float*)x = 1.0;
return z;
  }
  int (*ff)(int*,int*) = f;
  int main() {
void* x = malloc(4);
printf("%d\n", ff(x, x));
  }

shows that the problem is more than I implied. To support this, we not only 
need to ignore the TBAA between the two writes (*x and *(float*)x), but also 
between the float write and the preceding int read. I wonder how much of TBAA 
we could keep at all and still support this. Thoughts?


Repository:
  rL LLVM

https://reviews.llvm.org/D31885



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D32199: [TBAASan] A TBAA Sanitizer (Clang)

2017-04-18 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel created this revision.
Herald added subscribers: mcrosier, emaste.

This patch introduces the runtime components of a TBAA sanitizer: a sanitizer 
for type-based aliasing violations.

C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit  these 
given TBAA metadata added by Clang. Roughly, a pointer of given  type cannot be 
used to access an object of a different type (with, of  course, certain 
exceptions). Unfortunately, there's a lot of code in the wild that violates 
these rules (e.g. for type punning), and such code often must be built with 
-fno-strict-aliasing. Performance is often sacrificed as a result. Part of the 
problem is the difficulty of finding TBAA violations. Hopefully, this sanitizer 
will help.

https://reviews.llvm.org/D32197 (Runtime)
https://reviews.llvm.org/D32198 (LLVM)

The Clang changes seems mostly formulaic, the one specific change being that 
when the TBAA sanitizer is enabled, TBAA is always generated, even at -O0.

Clang's TBAA representation currently has a problem representing unions, as 
demonstrated by the one XFAIL'd test in the runtime patch. We'll update the 
TBAA representation to fix this, and at the same time, update the sanitizer.


https://reviews.llvm.org/D32199

Files:
  include/clang/Basic/Attr.td
  include/clang/Basic/AttrDocs.td
  include/clang/Basic/Sanitizers.def
  include/clang/Driver/SanitizerArgs.h
  lib/CodeGen/BackendUtil.cpp
  lib/CodeGen/CGDeclCXX.cpp
  lib/CodeGen/CodeGenFunction.cpp
  lib/CodeGen/CodeGenModule.cpp
  lib/CodeGen/CodeGenTBAA.cpp
  lib/Driver/SanitizerArgs.cpp
  lib/Driver/ToolChains/CommonArgs.cpp
  lib/Driver/ToolChains/FreeBSD.cpp
  lib/Driver/ToolChains/Linux.cpp
  lib/Frontend/CompilerInvocation.cpp
  lib/Lex/PPMacroExpansion.cpp
  lib/Sema/SemaDeclAttr.cpp
  test/CodeGen/sanitize-tbaa-attr.cpp
  test/Driver/sanitizer-ld.c

Index: test/Driver/sanitizer-ld.c
===
--- test/Driver/sanitizer-ld.c
+++ test/Driver/sanitizer-ld.c
@@ -181,6 +181,18 @@
 
 // RUN: %clangxx -no-canonical-prefixes %s -### -o %t.o 2>&1 \
 // RUN: -target x86_64-unknown-linux -fuse-ld=ld -stdlib=platform -lstdc++ \
+// RUN: -fsanitize=tbaa \
+// RUN: -resource-dir=%S/Inputs/resource_dir \
+// RUN: --sysroot=%S/Inputs/basic_linux_tree \
+// RUN:   | FileCheck --check-prefix=CHECK-TBAASAN-LINUX-CXX %s
+//
+// CHECK-TBAASAN-LINUX-CXX: "{{(.*[^-.0-9A-Z_a-z])?}}ld{{(.exe)?}}"
+// CHECK-TBAASAN-LINUX-CXX-NOT: stdc++
+// CHECK-TBAASAN-LINUX-CXX: "-whole-archive" "{{.*}}libclang_rt.tbaasan-x86_64.a" "-no-whole-archive"
+// CHECK-TBAASAN-LINUX-CXX: stdc++
+
+// RUN: %clangxx -no-canonical-prefixes %s -### -o %t.o 2>&1 \
+// RUN: -target x86_64-unknown-linux -fuse-ld=ld -stdlib=platform -lstdc++ \
 // RUN: -fsanitize=memory \
 // RUN: -resource-dir=%S/Inputs/resource_dir \
 // RUN: --sysroot=%S/Inputs/basic_linux_tree \
Index: test/CodeGen/sanitize-tbaa-attr.cpp
===
--- /dev/null
+++ test/CodeGen/sanitize-tbaa-attr.cpp
@@ -0,0 +1,64 @@
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-llvm -o - %s | FileCheck -check-prefix=WITHOUT %s
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-llvm -o - %s -fsanitize=tbaa | FileCheck -check-prefix=TBAASAN %s
+// RUN: echo "src:%s" | sed -e 's/\\//g' > %t
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-llvm -o - %s -fsanitize=tbaa -fsanitize-blacklist=%t | FileCheck -check-prefix=BL %s
+
+// The sanitize_tbaa attribute should be attached to functions
+// when TBAASanitizer is enabled, unless no_sanitize_tbaa attribute
+// is present.
+
+// WITHOUT:  NoTBAASAN1{{.*}}) [[NOATTR:#[0-9]+]]
+// BL:  NoTBAASAN1{{.*}}) [[NOATTR:#[0-9]+]]
+// TBAASAN:  NoTBAASAN1{{.*}}) [[NOATTR:#[0-9]+]]
+__attribute__((no_sanitize_tbaa))
+int NoTBAASAN1(int *a) { return *a; }
+
+// WITHOUT:  NoTBAASAN2{{.*}}) [[NOATTR]]
+// BL:  NoTBAASAN2{{.*}}) [[NOATTR]]
+// TBAASAN:  NoTBAASAN2{{.*}}) [[NOATTR]]
+__attribute__((no_sanitize_tbaa))
+int NoTBAASAN2(int *a);
+int NoTBAASAN2(int *a) { return *a; }
+
+// WITHOUT:  NoTBAASAN3{{.*}}) [[NOATTR:#[0-9]+]]
+// BL:  NoTBAASAN3{{.*}}) [[NOATTR:#[0-9]+]]
+// TBAASAN:  NoTBAASAN3{{.*}}) [[NOATTR:#[0-9]+]]
+__attribute__((no_sanitize("tbaa")))
+int NoTBAASAN3(int *a) { return *a; }
+
+// WITHOUT:  TBAASANOk{{.*}}) [[NOATTR]]
+// BL:  TBAASANOk{{.*}}) [[NOATTR]]
+// TBAASAN: TBAASANOk{{.*}}) [[WITH:#[0-9]+]]
+int TBAASANOk(int *a) { return *a; }
+
+// WITHOUT:  TemplateTBAASANOk{{.*}}) [[NOATTR]]
+// BL:  TemplateTBAASANOk{{.*}}) [[NOATTR]]
+// TBAASAN: TemplateTBAASANOk{{.*}}) [[WITH]]
+template
+int TemplateTBAASANOk() { return i; }
+
+// WITHOUT:  TemplateNoTBAASAN{{.*}}) [[NOATTR]]
+// BL:  TemplateNoTBAASAN{{.*}}) [[NOATTR]]
+// TBAASAN: TemplateNoTBAASAN{{.*}}) [[NOATTR]]
+template
+__attribute__((no_sanitize_tbaa))
+int TemplateNoTBAASAN() { return i; }
+
+int force_instance = TemplateTBAASANOk<42>()
+ 

[PATCH] D31885: Remove TBAA information from LValues representing union members

2017-04-18 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

To be clear, I'm find with disabling union tbaa until we can fix things for 
real. I'm somewhat concerned that this patch is quadratic in the AST.

FWIW, I tried just setting TBAAPath = true in EmitLValueForField and then 
returning nullptr from CodeGenTBAA::getTBAAStructTagInfo when 
BaseQTy->isUnionType() is true. This mostly works, although it does not handle 
the array case as this patch does. Maybe putting some code in 
EmitArraySubscriptExpr will also take care of that?


Repository:
  rL LLVM

https://reviews.llvm.org/D31885



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D32199: [TySan] A Type Sanitizer (Clang)

2017-08-03 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D32199#828526, @rjmccall wrote:

> Has this proposal run aground?  I'm going back over some old patches that 
> I've been CC'ed on and trying to make sure they're not blocking on my review.


I need to rebase these now that we've "fixed" the union metadata, but I believe 
that orthogonal to the changes here, and otherwise these should be ready to go. 
Could you please review this patch? Thanks in advance.


https://reviews.llvm.org/D32199



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D32199: [TySan] A Type Sanitizer (Clang)

2017-08-14 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel updated this revision to Diff 09.
hfinkel added a comment.

Rebased.


https://reviews.llvm.org/D32199

Files:
  include/clang/Basic/Sanitizers.def
  include/clang/Driver/SanitizerArgs.h
  lib/CodeGen/BackendUtil.cpp
  lib/CodeGen/CGDecl.cpp
  lib/CodeGen/CGDeclCXX.cpp
  lib/CodeGen/CodeGenFunction.cpp
  lib/CodeGen/CodeGenModule.cpp
  lib/CodeGen/CodeGenTBAA.cpp
  lib/CodeGen/SanitizerMetadata.cpp
  lib/CodeGen/SanitizerMetadata.h
  lib/Driver/SanitizerArgs.cpp
  lib/Driver/ToolChains/CommonArgs.cpp
  lib/Driver/ToolChains/Linux.cpp
  lib/Lex/PPMacroExpansion.cpp
  test/CodeGen/sanitize-type-attr.cpp
  test/Driver/sanitizer-ld.c

Index: test/Driver/sanitizer-ld.c
===
--- test/Driver/sanitizer-ld.c
+++ test/Driver/sanitizer-ld.c
@@ -181,6 +181,18 @@
 
 // RUN: %clangxx -no-canonical-prefixes %s -### -o %t.o 2>&1 \
 // RUN: -target x86_64-unknown-linux -fuse-ld=ld -stdlib=platform -lstdc++ \
+// RUN: -fsanitize=type \
+// RUN: -resource-dir=%S/Inputs/resource_dir \
+// RUN: --sysroot=%S/Inputs/basic_linux_tree \
+// RUN:   | FileCheck --check-prefix=CHECK-TYSAN-LINUX-CXX %s
+//
+// CHECK-TYSAN-LINUX-CXX: "{{(.*[^-.0-9A-Z_a-z])?}}ld{{(.exe)?}}"
+// CHECK-TYSAN-LINUX-CXX-NOT: stdc++
+// CHECK-TYSAN-LINUX-CXX: "-whole-archive" "{{.*}}libclang_rt.tysan-x86_64.a" "-no-whole-archive"
+// CHECK-TYSAN-LINUX-CXX: stdc++
+
+// RUN: %clangxx -no-canonical-prefixes %s -### -o %t.o 2>&1 \
+// RUN: -target x86_64-unknown-linux -fuse-ld=ld -stdlib=platform -lstdc++ \
 // RUN: -fsanitize=memory \
 // RUN: -resource-dir=%S/Inputs/resource_dir \
 // RUN: --sysroot=%S/Inputs/basic_linux_tree \
Index: test/CodeGen/sanitize-type-attr.cpp
===
--- /dev/null
+++ test/CodeGen/sanitize-type-attr.cpp
@@ -0,0 +1,75 @@
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-llvm -o - %s | FileCheck -check-prefix=WITHOUT %s
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-llvm -o - %s -fsanitize=type | FileCheck -check-prefix=TYSAN %s
+// RUN: echo "src:%s" | sed -e 's/\\//g' > %t
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-llvm -o - %s -fsanitize=type -fsanitize-blacklist=%t | FileCheck -check-prefix=BL %s
+
+// The sanitize_type attribute should be attached to functions
+// when TypeSanitizer is enabled, unless no_sanitize("type") attribute
+// is present.
+
+// WITHOUT:  NoTYSAN1{{.*}}) [[NOATTR:#[0-9]+]]
+// BL:  NoTYSAN1{{.*}}) [[NOATTR:#[0-9]+]]
+// TYSAN:  NoTYSAN1{{.*}}) [[NOATTR:#[0-9]+]]
+__attribute__((no_sanitize("type")))
+int NoTYSAN1(int *a) { return *a; }
+
+// WITHOUT:  NoTYSAN2{{.*}}) [[NOATTR]]
+// BL:  NoTYSAN2{{.*}}) [[NOATTR]]
+// TYSAN:  NoTYSAN2{{.*}}) [[NOATTR]]
+__attribute__((no_sanitize("type")))
+int NoTYSAN2(int *a);
+int NoTYSAN2(int *a) { return *a; }
+
+// WITHOUT:  NoTYSAN3{{.*}}) [[NOATTR:#[0-9]+]]
+// BL:  NoTYSAN3{{.*}}) [[NOATTR:#[0-9]+]]
+// TYSAN:  NoTYSAN3{{.*}}) [[NOATTR:#[0-9]+]]
+__attribute__((no_sanitize("type")))
+int NoTYSAN3(int *a) { return *a; }
+
+// WITHOUT:  TYSANOk{{.*}}) [[NOATTR]]
+// BL:  TYSANOk{{.*}}) [[NOATTR]]
+// TYSAN: TYSANOk{{.*}}) [[WITH:#[0-9]+]]
+int TYSANOk(int *a) { return *a; }
+
+// WITHOUT:  TemplateTYSANOk{{.*}}) [[NOATTR]]
+// BL:  TemplateTYSANOk{{.*}}) [[NOATTR]]
+// TYSAN: TemplateTYSANOk{{.*}}) [[WITH]]
+template
+int TemplateTYSANOk() { return i; }
+
+// WITHOUT:  TemplateNoTYSAN{{.*}}) [[NOATTR]]
+// BL:  TemplateNoTYSAN{{.*}}) [[NOATTR]]
+// TYSAN: TemplateNoTYSAN{{.*}}) [[NOATTR]]
+template
+__attribute__((no_sanitize("type")))
+int TemplateNoTYSAN() { return i; }
+
+int force_instance = TemplateTYSANOk<42>()
+   + TemplateNoTYSAN<42>();
+
+// Check that __cxx_global_var_init* get the sanitize_type attribute.
+int global1 = 0;
+int global2 = *(int*)((char*)+1);
+// WITHOUT: @__cxx_global_var_init{{.*}}[[NOATTR:#[0-9]+]]
+// BL: @__cxx_global_var_init{{.*}}[[NOATTR:#[0-9]+]]
+// TYSAN: @__cxx_global_var_init{{.*}}[[WITH:#[0-9]+]]
+
+// Make sure that we don't add globals to the list for which we don't have a
+// specific type description.
+struct SX { int a, b; };
+SX sx;
+
+// WITHOUT: attributes [[NOATTR]] = { noinline nounwind{{.*}} }
+
+// BL: attributes [[NOATTR]] = { noinline nounwind{{.*}} }
+
+// TYSAN: attributes [[NOATTR]] = { noinline nounwind{{.*}} }
+// TYSAN: attributes [[WITH]] = { noinline nounwind sanitize_type{{.*}} }
+
+// TYSAN-DAG: !llvm.tysan.globals = !{[[G1MD:![0-9]+]], [[G2MD:![0-9]+]], [[G3MD:![0-9]+]]}
+// TYSAN-DAG: [[G1MD]] = !{i32* @force_instance, [[INTMD:![0-9]+]]}
+// TYSAN-DAG: [[INTMD]] = !{!"int",
+// TYSAN-DAG: [[G2MD]] = !{i32* @global1, [[INTMD]]}
+// TYSAN-DAG: [[G3MD]] = !{i32* @global2, [[INTMD]]}
+// TYSAN-DAG: Simple C++ TBAA
+
Index: lib/Lex/PPMacroExpansion.cpp
===
--- lib/Lex/PPMacroExpansion.cpp
+++ lib/Lex/PPMacroExpansion.cpp
@@ -1138,6 +1138,7 @@
   

[PATCH] D9403: llvm.noalias - Clang CodeGen for local restrict-qualified pointers

2017-08-14 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: lib/CodeGen/CGDecl.cpp:1291
+// where the pointer to the local variable is the key in the map.
+void CodeGenFunction::EmitAutoVarNoAlias(const AutoVarEmission ) {
+  assert(emission.Variable && "emission was not valid!");

rjmccall wrote:
> This should really be a subroutine of EmitAutoVarAlloca; that will implicitly 
> pick this logic up for a bunch of less standard places in IRGen that create 
> local variables.  Please make it a static function (if possible) called 
> EmitNoAliasScope and call it immediately after EmitVarAnnotations.
Done (I didn't make it static, which is certainly possible, but didn't look 
much cleaner to me - let me know if you want it static anyway).



Comment at: lib/CodeGen/CGStmt.cpp:537
+  llvm::LLVMContext::MD_noalias),
+  NewScopeList));
+

rjmccall wrote:
> rjmccall wrote:
> > hfinkel wrote:
> > > rjmccall wrote:
> > > > This is a very strange representation.  Every memory operation in the 
> > > > lexical block is annotated with a list of all of the scopes that were 
> > > > entered within the block, even if they were entered after the 
> > > > operation.  But for some reason, not with nested scopes?
> > > > 
> > > > What's the right patch for me to read about this representation?
> > > Perhaps unfortunately, this is an artifact of the way that restrict is 
> > > defined in C. It applies to all accesses in the block in which the 
> > > variable is declared, even those before the declaration of the 
> > > restrict-qualified local itself.
> > > 
> > > It should work with nested scopes, in the sense that we add these things 
> > > as we complete each scope. So we add things to the inner scope, and then 
> > > when we complete the outer scope, we go back over the instructions 
> > > (including those in the inner scope because the scope recording recurses 
> > > up the scope hierarchy), and adds the outer scopes - it concatenates them 
> > > to any added by the inner (nested) scopes.
> > > 
> > > The noalias intrinsic's LangRef updates are in D9375.
> > Ah, I see the recursion now.  Please add a comment explaining that 
> > expectation here.
> I would still like this comment.  It should be enough to explain how 
> MemoryInsts actually gets filled: there's a callback from inserting an 
> instruction which adds all memory instructions to every active lexical scope 
> that's recording them.
Added.



Comment at: lib/CodeGen/CodeGenFunction.cpp:1986
+  if (I->mayReadOrWriteMemory())
+recordMemoryInstruction(I);
 }

rjmccall wrote:
> This condition will include calls, which I assume is unnecessary (except 
> maybe memory intrinsics?).  More seriously, it will also include loads and 
> stores into allocas, which will sweep in all sorts of bookkeeping temporaries 
> that IRGen uses in more complex code-generation situations.  You might want 
> to consider ways to avoid recording this kind of instruction that are very 
> likely to just get immediately optimized away by e.g. mem2reg.
> 
> Please also modify LexicalScope so that it records whether there's any active 
> recording scope in the CodeGenFunction.  Lexical scopes are nested, so that 
> should be as easy as saving the current state when you enter a scope and 
> restoring it when you leave.  That will allow you to optimize this code by 
> completely bypassing the recursion and even the check for whether the 
> instruction is a memory instruction in the extremely common case of a 
> function with no restrict variables.
> 
> In general, a lot of things about this approach seem to have worryingly poor 
> asymptotic performance in the face of large functions or (especially) many 
> restrict variables.  (You could, for example, have chosen to have each memory 
> instruction link to its enclosing noalias scope, which would contain a link 
> to its enclosing scope and a list of the variables it directly declares.  
> This would be a more complex representation to consume, but it would not 
> require inherently quadratic frontend work building all these lists.)  The 
> only saving grace is that we expect very little code to using restrict, so it 
> becomes vital to ensure that your code is immediately short-circuited when 
> nothing is using restrict.
> This condition will include calls, which I assume is unnecessary (except 
> maybe memory intrinsics?). 

No, getting calls here was intentional. The AA implementation can specifically 
deal with calls (especially those that are known only to access memory given by 
their arguments).

> More seriously, it will also include loads and stores into allocas, which 
> will sweep in all sorts of bookkeeping temporaries that IRGen uses in more 
> complex code-generation situations. You might want to consider ways to avoid 
> recording this kind of instruction that are very likely 

[PATCH] D9403: llvm.noalias - Clang CodeGen for local restrict-qualified pointers

2017-08-14 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel updated this revision to Diff 29.
hfinkel added a comment.
Herald added a subscriber: javed.absar.

Rebased and addressing review comments.


https://reviews.llvm.org/D9403

Files:
  lib/CodeGen/CGDecl.cpp
  lib/CodeGen/CGExpr.cpp
  lib/CodeGen/CGStmt.cpp
  lib/CodeGen/CodeGenFunction.cpp
  lib/CodeGen/CodeGenFunction.h
  test/CodeGen/arm_neon_intrinsics.c
  test/CodeGen/noalias.c
  test/OpenMP/taskloop_firstprivate_codegen.cpp
  test/OpenMP/taskloop_lastprivate_codegen.cpp
  test/OpenMP/taskloop_private_codegen.cpp
  test/OpenMP/taskloop_simd_firstprivate_codegen.cpp
  test/OpenMP/taskloop_simd_lastprivate_codegen.cpp
  test/OpenMP/taskloop_simd_private_codegen.cpp

Index: test/OpenMP/taskloop_simd_private_codegen.cpp
===
--- test/OpenMP/taskloop_simd_private_codegen.cpp
+++ test/OpenMP/taskloop_simd_private_codegen.cpp
@@ -222,7 +222,8 @@
 // CHECK: [[PRIV_S_ARR_ADDR:%.+]] = alloca [2 x [[S_DOUBLE_TY]]]*,
 // CHECK: [[PRIV_VEC_ADDR:%.+]] = alloca [2 x i32]*,
 // CHECK: [[PRIV_SIVAR_ADDR:%.+]] = alloca i32*,
-// CHECK: store void (i8*, ...)* bitcast (void ([[PRIVATES_MAIN_TY]]*, [[S_DOUBLE_TY]]**, i32**, [2 x [[S_DOUBLE_TY]]]**, [2 x i32]**, i32**)* [[PRIVATES_MAP_FN]] to void (i8*, ...)*), void (i8*, ...)** [[MAP_FN_ADDR:%.+]],
+// CHECK: [[PRIV_NAFP:%.+]] = call void (i8*, ...)* @llvm.noalias.p0f_isVoidp0i8varargf(void (i8*, ...)* bitcast (void ([[PRIVATES_MAIN_TY]]*, [[S_DOUBLE_TY]]**, i32**, [2 x [[S_DOUBLE_TY]]]**, [2 x i32]**, i32**)* [[PRIVATES_MAP_FN]] to void (i8*, ...)*),
+// CHECK: store void (i8*, ...)* [[PRIV_NAFP]], void (i8*, ...)** [[MAP_FN_ADDR:%.+]],
 // CHECK: [[MAP_FN:%.+]] = load void (i8*, ...)*, void (i8*, ...)** [[MAP_FN_ADDR]],
 // CHECK: call void (i8*, ...) [[MAP_FN]](i8* %{{.+}}, [[S_DOUBLE_TY]]** [[PRIV_VAR_ADDR]], i32** [[PRIV_T_VAR_ADDR]], [2 x [[S_DOUBLE_TY]]]** [[PRIV_S_ARR_ADDR]], [2 x i32]** [[PRIV_VEC_ADDR]], i32** [[PRIV_SIVAR_ADDR]])
 // CHECK: [[PRIV_VAR:%.+]] = load [[S_DOUBLE_TY]]*, [[S_DOUBLE_TY]]** [[PRIV_VAR_ADDR]],
@@ -350,7 +351,8 @@
 // CHECK-DAG: [[PRIV_VEC_ADDR:%.+]] = alloca [2 x i32]*,
 // CHECK-DAG: [[PRIV_S_ARR_ADDR:%.+]] = alloca [2 x [[S_INT_TY]]]*,
 // CHECK-DAG: [[PRIV_VAR_ADDR:%.+]] = alloca [[S_INT_TY]]*,
-// CHECK: store void (i8*, ...)* bitcast (void ([[PRIVATES_TMAIN_TY]]*, i32**, [2 x i32]**, [2 x [[S_INT_TY]]]**, [[S_INT_TY]]**)* [[PRIVATES_MAP_FN]] to void (i8*, ...)*), void (i8*, ...)** [[MAP_FN_ADDR:%.+]],
+// CHECK: [[PRIV_NAFP:%.+]] = call void (i8*, ...)* @llvm.noalias.p0f_isVoidp0i8varargf(void (i8*, ...)* bitcast (void ([[PRIVATES_TMAIN_TY]]*, i32**, [2 x i32]**, [2 x [[S_INT_TY]]]**, [[S_INT_TY]]**)* [[PRIVATES_MAP_FN]] to void (i8*, ...)*),
+// CHECK: store void (i8*, ...)* [[PRIV_NAFP]], void (i8*, ...)** [[MAP_FN_ADDR:%.+]],
 // CHECK: [[MAP_FN:%.+]] = load void (i8*, ...)*, void (i8*, ...)** [[MAP_FN_ADDR]],
 // CHECK: call void (i8*, ...) [[MAP_FN]](i8* %{{.+}}, i32** [[PRIV_T_VAR_ADDR]], [2 x i32]** [[PRIV_VEC_ADDR]], [2 x [[S_INT_TY]]]** [[PRIV_S_ARR_ADDR]], [[S_INT_TY]]** [[PRIV_VAR_ADDR]])
 // CHECK: [[PRIV_T_VAR:%.+]] = load i32*, i32** [[PRIV_T_VAR_ADDR]],
Index: test/OpenMP/taskloop_simd_lastprivate_codegen.cpp
===
--- test/OpenMP/taskloop_simd_lastprivate_codegen.cpp
+++ test/OpenMP/taskloop_simd_lastprivate_codegen.cpp
@@ -257,7 +257,8 @@
 // CHECK: [[PRIV_S_ARR_ADDR:%.+]] = alloca [2 x [[S_DOUBLE_TY]]]*,
 // CHECK: [[PRIV_VEC_ADDR:%.+]] = alloca [2 x i32]*,
 // CHECK: [[PRIV_SIVAR_ADDR:%.+]] = alloca i32*,
-// CHECK: store void (i8*, ...)* bitcast (void ([[PRIVATES_MAIN_TY]]*, [[S_DOUBLE_TY]]**, i32**, [2 x [[S_DOUBLE_TY]]]**, [2 x i32]**, i32**)* [[PRIVATES_MAP_FN]] to void (i8*, ...)*), void (i8*, ...)** [[MAP_FN_ADDR:%.+]],
+// CHECK: [[PRIV_NAFP:%.+]] = call void (i8*, ...)* @llvm.noalias.p0f_isVoidp0i8varargf(void (i8*, ...)* bitcast (void ([[PRIVATES_MAIN_TY]]*, [[S_DOUBLE_TY]]**, i32**, [2 x [[S_DOUBLE_TY]]]**, [2 x i32]**, i32**)* [[PRIVATES_MAP_FN]] to void (i8*, ...)*),
+// CHECK: store void (i8*, ...)* [[PRIV_NAFP]], void (i8*, ...)** [[MAP_FN_ADDR:%.+]],
 // CHECK: [[MAP_FN:%.+]] = load void (i8*, ...)*, void (i8*, ...)** [[MAP_FN_ADDR]],
 
 // CHECK: call void (i8*, ...) [[MAP_FN]](i8* %{{.+}}, [[S_DOUBLE_TY]]** [[PRIV_VAR_ADDR]], i32** [[PRIV_T_VAR_ADDR]], [2 x [[S_DOUBLE_TY]]]** [[PRIV_S_ARR_ADDR]], [2 x i32]** [[PRIV_VEC_ADDR]], i32** [[PRIV_SIVAR_ADDR]])
@@ -425,7 +426,8 @@
 // CHECK-DAG: [[PRIV_VEC_ADDR:%.+]] = alloca [2 x i32]*,
 // CHECK-DAG: [[PRIV_S_ARR_ADDR:%.+]] = alloca [2 x [[S_INT_TY]]]*,
 // CHECK-DAG: [[PRIV_VAR_ADDR:%.+]] = alloca [[S_INT_TY]]*,
-// CHECK: store void (i8*, ...)* bitcast (void ([[PRIVATES_TMAIN_TY]]*, i32**, [2 x i32]**, [2 x [[S_INT_TY]]]**, [[S_INT_TY]]**)* [[PRIVATES_MAP_FN]] to void (i8*, ...)*), void (i8*, ...)** [[MAP_FN_ADDR:%.+]],
+// CHECK: [[PRIV_NAFP:%.+]] = call void (i8*, ...)* @llvm.noalias.p0f_isVoidp0i8varargf(void 

[PATCH] D22189: llvm.noalias - Clang CodeGen - check restrict variable map only for restrict-qualified lvalues

2017-08-14 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel updated this revision to Diff 30.
hfinkel added a comment.
Herald added a subscriber: jholewinski.

Rebased.


https://reviews.llvm.org/D22189

Files:
  lib/CodeGen/CGDeclCXX.cpp
  lib/CodeGen/CGExpr.cpp
  lib/CodeGen/CGOpenMPRuntime.cpp
  lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
  lib/CodeGen/CGStmtOpenMP.cpp
  lib/CodeGen/CodeGenFunction.h

Index: lib/CodeGen/CodeGenFunction.h
===
--- lib/CodeGen/CodeGenFunction.h
+++ lib/CodeGen/CodeGenFunction.h
@@ -3150,7 +3150,7 @@
   /// care to appropriately convert from the memory representation to
   /// the LLVM value representation.
   void EmitStoreOfScalar(llvm::Value *Value, Address Addr,
- bool Volatile, QualType Ty,
+ bool Volatile, bool Restrict, QualType Ty,
  LValueBaseInfo BaseInfo =
  LValueBaseInfo(AlignmentSource::Type),
  llvm::MDNode *TBAAInfo = nullptr, bool isInit = false,
Index: lib/CodeGen/CGStmtOpenMP.cpp
===
--- lib/CodeGen/CGStmtOpenMP.cpp
+++ lib/CodeGen/CGStmtOpenMP.cpp
@@ -364,7 +364,8 @@
 Address RefAddr = CGF.CreateMemTemp(
 CurVD->getType(), CGM.getPointerAlign(), ".materialized_ref");
 CGF.EmitStoreOfScalar(LocalAddr.getPointer(), RefAddr,
-  /*Volatile=*/false, CurVD->getType());
+  /*Volatile=*/false, /*Restrict=*/false,
+  CurVD->getType());
 LocalAddr = RefAddr;
   }
   if (!FO.RegisterCastedArgsOnly)
@@ -2964,7 +2965,8 @@
   const auto *VD = cast(cast(E)->getDecl());
   CGF.EmitVarDecl(*VD);
   CGF.EmitStoreOfScalar(ReductionDesc, CGF.GetAddrOfLocalVar(VD),
-/*Volatile=*/false, E->getType());
+/*Volatile=*/false, /*Restrict=*/false,
+E->getType());
 }
 CGF.EmitStmt(cast(S.getAssociatedStmt())->getCapturedStmt());
   };
Index: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
===
--- lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
+++ lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
@@ -1180,7 +1180,7 @@
 
 // Store the source element value to the dest element address.
 CGF.EmitStoreOfScalar(Elem, DestElementAddr, /*Volatile=*/false,
-  Private->getType());
+  /*Restrict=*/false, Private->getType());
 
 // Step 3.1: Modify reference in dest Reduce list as needed.
 // Modifying the reference in Reduce list to point to the newly
@@ -1191,7 +1191,7 @@
   CGF.EmitStoreOfScalar(Bld.CreatePointerBitCastOrAddrSpaceCast(
 DestElementAddr.getPointer(), CGF.VoidPtrTy),
 DestElementPtrAddr, /*Volatile=*/false,
-C.VoidPtrTy);
+/*Restrict=*/false, C.VoidPtrTy);
 }
 
 // Step 4.1: Increment SrcBase/DestBase so that it points to the starting
@@ -1618,7 +1618,7 @@
 
 // *TargetElemPtr = SrcMediumVal;
 CGF.EmitStoreOfScalar(SrcMediumValue, TargetElemPtr, /*Volatile=*/false,
-  Private->getType());
+  /*Restrict=*/false, Private->getType());
 Bld.CreateBr(W0MergeBB);
 
 CGF.EmitBlock(W0ElseBB);
@@ -2287,7 +2287,7 @@
   NativePointeeAddrSpace));
   Address NativeParamAddr = CGF.CreateMemTemp(NativeParamType);
   CGF.EmitStoreOfScalar(TargetAddr, NativeParamAddr, /*Volatile=*/false,
-NativeParam->getType());
+/*Restrict=*/false, NativeParam->getType());
   return NativeParamAddr;
 }
 
Index: lib/CodeGen/CGOpenMPRuntime.cpp
===
--- lib/CodeGen/CGOpenMPRuntime.cpp
+++ lib/CodeGen/CGOpenMPRuntime.cpp
@@ -7741,7 +7741,8 @@
  CounterVal->getType(), Int64Ty,
  CounterVal->getExprLoc());
   Address CntAddr = CGF.CreateMemTemp(Int64Ty, ".cnt.addr");
-  CGF.EmitStoreOfScalar(CntVal, CntAddr, /*Volatile=*/false, Int64Ty);
+  CGF.EmitStoreOfScalar(CntVal, CntAddr, /*Volatile=*/false, /*Restrict=*/false,
+Int64Ty);
   llvm::Value *Args[] = {emitUpdateLocation(CGF, C->getLocStart()),
  getThreadID(CGF, C->getLocStart()),
  CntAddr.getPointer()};
Index: lib/CodeGen/CGExpr.cpp
===
--- lib/CodeGen/CGExpr.cpp
+++ lib/CodeGen/CGExpr.cpp
@@ -1540,8 +1540,8 @@
 }
 
 void CodeGenFunction::EmitStoreOfScalar(llvm::Value *Value, Address Addr,
-bool Volatile, QualType Ty,
-

[PATCH] D29660: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets

2017-08-10 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D29660#838150, @gtbercea wrote:

> First of all, I apologize if I've upset you with my previous post. I am 
> actively working on understanding what is causing these issues. It is not my 
> intention to write tests that work on local configurations only. I am upset 
> to see that these tests keep failing for your and maybe other configurations. 
> Without knowing the actual reason of the failures I can only speculate what 
> is going wrong with them hence the flurry of changes.


Should we have a mock CUDA installation directory in the test directory? We 
have a bunch of these in test/Driver/Inputs for various other things. When we 
could point these tests at that directory (or directories if we have different 
mocks for different CUDA versions) and remove any dependence on local CUDA 
configurations.


Repository:
  rL LLVM

https://reviews.llvm.org/D29660



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D36537: [OpenMP] Enable executable lookup into driver directory.

2017-08-09 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

I think this is fine, but please add a comment explaining that you're doing 
this to find (e.g., to find the offload bundler, etc.).


Repository:
  rL LLVM

https://reviews.llvm.org/D36537



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D34784: [OpenMP] Add flag for specifying the target device architecture for OpenMP device offloading

2017-07-11 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

This is much closer to what I had in mind, thanks! Now I think we're in a 
position to make this work for more than just the CUDA target. It looks like 
the added code is now:

1. Remove -march from the translated arguments (because any existing -march 
would apply only for the host compilation).
2. Process the -Xopenmp-target flags and add those arguments to the list.
3. If we don't have an -march in the translated arguments, then add 
-march=sm_20 so that there's a suitable default (noting that this default must 
be higher than the regular CUDA default).

I propose the following:

(1) is good, but should be more general. It is not just the host's -march that 
should not apply to any arbitrary toolchain, but any of the -m options. 
You should remove all options that are in the m_Group options group (which, as 
noted in Options.td, are "Target-dependent compilation options"). I believe 
that you can iterate over them all using something like:

  for (const Arg *A : Args.filtered(options::OPT_m_Group)) {

and that might help. This should be in toolchain-independent code, and I'd 
prefer that we always remove these options whenever the host and target 
toolchain differ, but leave them when they're the same.

(2) is good, but, along with (1), should be in toolchain-independent code. I 
recommend that we add a new member function to ToolChain, called, to make a 
specific suggestion, TranslateOpenMPTargetArgs, and put the logic from (1) and 
(2) in  this function. Then, we can augment Compilation::getArgsForToolChain to 
do something like the following:

  const DerivedArgList &
  Compilation::getArgsForToolChain(const ToolChain *TC, StringRef BoundArch,
   Action::OffloadKind DeviceOffloadKind) {
if (!TC)
  TC = 
  
DerivedArgList * = TCArgs[{TC, BoundArch, DeviceOffloadKind}];
if (!Entry) {
  // First, translate OpenMP toolchain arguments provided via the 
-Xopenmp-toolchain flags.
  Entry = TranslateOpenMPTargetArgs(*TranslatedArgs, BoundArch, 
DeviceOffloadKind);
  if (!Entry)
Entry = TranslatedArgs;
  
  Entry = TC->TranslateArgs(*Entry, BoundArch, DeviceOffloadKind);
  if (!Entry)
Entry = TranslatedArgs;
}
  
return *Entry;
  }

And then (3) we leave as it is (where it is).




Comment at: lib/Driver/ToolChains/Cuda.cpp:480
+if (MArchList.empty())
+  // Default compute capability for CUDA toolchain is sm_20.
+  DAL->AddJoinedArg(nullptr,

This default is only for OpenMP, right? Please explain in the comment why this 
is the default for OpenMP.


https://reviews.llvm.org/D34784



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D34784: [OpenMP] Add flag for specifying the target device architecture for OpenMP device offloading

2017-07-10 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: lib/Driver/ToolChains/Cuda.cpp:474
+for (StringRef Opt : OptList) {
+  AddMArchOption(DAL, Opts, Opt);
+}

gtbercea wrote:
> hfinkel wrote:
> > gtbercea wrote:
> > > hfinkel wrote:
> > > > Shouldn't you be adding all of the options, not just the -march= ones?
> > > I thought that that would be the case but there are a few issues:
> > > 
> > > 1. PTXAS and NVLINK each use a different flag for specifying the arch, 
> > > and, in turn, each flag is different from -march.
> > > 
> > > 2. -Xopenmp-target passes a flag to the entire toolchain not to 
> > > individual components of the toolchain so a translation of flags is 
> > > required in some cases to adapt the flag to the actual tool. -march is 
> > > one example, I'm not sure if there are others.
> > > 
> > > 3. At this point in the code, in order to add a flag and its value to the 
> > > DAL list, I need to be able to specify the option type (i.e. 
> > > options::OPT_march_EQ). I therefore need to manually recognize the flag 
> > > in the string representing the value of -Xopenmp-target or 
> > > -Xopenmp-target=triple.
> > > 
> > > 4. This patch handles the passing of the arch and can be extended to pass 
> > > other flags (as is stands, no other flags are passed through to the CUDA 
> > > toolchain). This can be extended on a flag by flag basis for flags that 
> > > need translating to a particular tool's flag. If the flag doesn't need 
> > > translating then the flag and it's value can be appended to the command 
> > > line as they are.
> > > 1. PTXAS and NVLINK each use a different flag for specifying the arch, 
> > > and, in turn, each flag is different from -march.
> > 
> > I don't understand why this is relevant. Don't 
> > NVPTX::Assembler::ConstructJob and NVPTX::Linker::ConstructJob handle that 
> > in either case?
> > 
> > This seems to be the same comment to point 2 as well.
> > 
> > > 3. At this point in the code, in order to add a flag and its value to the 
> > > DAL list, I need to be able to specify the option type (i.e. 
> > > options::OPT_march_EQ). I therefore need to manually recognize the flag 
> > > in the string representing the value of -Xopenmp-target or 
> > > -Xopenmp-target=triple.
> > 
> > I don't understand why this is true. Doesn't the code just below this, 
> > which handles -Xarch, do the general thing (it calls Opts.ParseOneArg and 
> > then adds it to the list of derived arguments)? Can't we handle this like 
> > -Xarch?
> > 
> > > This patch handles the passing of the arch and can be extended to pass 
> > > other flags (as is stands, no other flags are passed through to the CUDA 
> > > toolchain). This can be extended on a flag by flag basis for flags that 
> > > need translating to a particular tool's flag. If the flag doesn't need 
> > > translating then the flag and it's value can be appended to the command 
> > > line as they are.
> > 
> > I don't understand this either. If we need to extend this on a flag-by-flag 
> > basis, then it seems fundamentally broken. How could we append a flag to 
> > the command line without it also affecting the host compile?
> @hfinkel 
> 
> The problem is that when using -Xopenmp-target= -opt=val the value of 
> this flag is a list of two strings:
> 
> ['', '-opt=val']
> 
> What needs to happen next is to parse the string containing "-opt=val". The 
> reason I have to do this is because if I use -march, I can't pass -march as 
> is to PTXAS and NVLINK which have different flags for expressing the arch. I 
> need to translate the -march=sm_60 flag. I will have to do this for all flags 
> which require translation. There is no way I can just append this string to 
> the PTXAS and NVLINK commands because the flags for the 2 tools are 
> different. A flag which works for one of them, will not work for the other. 
> 
> So I need to actually parse that value to check whether it is a "-march" and 
> create an Arg object with the OPT_march_EQ identifier and the sm_60 value. 
> When invoking the commands for PTXAS and NVLINK, the dervied arg list will be 
> travered and every -march=sm_60 option will be transformed into "--gpu-name" 
> "sm_60" for PTXAS and into "-arch" "sm_60" for NVLINK. 
> 
> In the case of -Xarch, you will see that after we have traversed the entire 
> arg list we still have to special case -march and add it is manually added to 
> the DAL.
> 
> Let me know your thoughts on this.
> 
> Thanks,
> 
> --Doru
> What needs to happen next is to parse the string containing "-opt=val". 

Yes, that's what ParseOneArg will do.

> The reason I have to do this is because if I use -march, I can't pass -march 
> as is to PTXAS and NVLINK which have different flags for expressing the arch. 
> I need to translate the -march=sm_60 flag. I will have to do this for all 
> flags which require translation. There is no way I can just append this 
> string to the PTXAS and NVLINK commands because the flags 

[PATCH] D34158: For standards compatibility, preinclude if the file is available

2017-07-10 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D34158#803752, @jyknight wrote:

> This version is still disabling upon -nostdinc, which doesn't make sense to 
> me.


I agree. If I pass -nostdinc, so that I then arrange include paths for my own 
libc headers, I'd then like to pick up the predef header from that library (if 
available). There's no clearly right answer for whether  `__STDC_NO_THREADS__` 
is defined, for example, regardless of whether we're using -nostdinc.

> You didn't remove that, nor respond explaining why you think it does make 
> sense?




https://reviews.llvm.org/D34158



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D29647: [OpenMP] Extend CLANG target options with device offloading kind.

2017-06-28 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: test/Driver/openmp-offload.c:614
+/// Check -march propagates compute capability to device offloading toolchain.
+// RUN:   %clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda 
-save-temps -no-canonical-prefixes -march=sm_35 %s 2>&1 \
+// RUN:   | FileCheck -check-prefix=CHK-COMPUTE-CAPABILITY %s

gtbercea wrote:
> hfinkel wrote:
> > How does this work on x86 where the host also uses -march?
> That's where the additional flag comes in. The new flag *should* be used 
> instead. 
Ah, okay. Please just propose a patch which adds a new flag. This workaround is 
a bit strong for my tastes.


Repository:
  rL LLVM

https://reviews.llvm.org/D29647



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D34784: [OpenMP] Add flag for specifying the target device architecture for OpenMP device offloading

2017-06-29 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D34784#795871, @gtbercea wrote:

> In https://reviews.llvm.org/D34784#795367, @hfinkel wrote:
>
> > In https://reviews.llvm.org/D34784#795353, @gtbercea wrote:
> >
> > > In https://reviews.llvm.org/D34784#795287, @hfinkel wrote:
> > >
> > > > What happens if you have multiple targets? Maybe this should be 
> > > > -fopenmp-targets-arch=foo,bar,whatever?
> > > >
> > > > Once this all lands, please make sure that you add additional test 
> > > > cases here. Make sure that the arch is passed through to the ptx and 
> > > > cuda tools as it should be. Make sure that the defaults work. Make sure 
> > > > that something reasonable happens if the user specifies the option more 
> > > > than once (if they're all the same).
> > >
> > >
> > > Hi Hal,
> > >
> > > At the moment only one arch is supported and it would apply to all the 
> > > target triples under -fopenmp-targets.
> > >
> > > I was planning to address the multiple archs problem in a future patch.
> > >
> > > I am assuming that in the case of multiple archs, each arch in 
> > > -fopenmp-targets-arch=A1,A2,A3 will bind to a corresponding triple in 
> > > -fopenmp-targets=T1,T2,T3 like so: T1 with A1, T2 with A2 etc. Is this a 
> > > practical interpretation of what should happen?
> >
> >
> > Yea, that's what I was thinking. I'm a bit concerned that none of this 
> > generalizes well. To take a step back, under what circumstances do we 
> > support multiple targets right now?
>
>
> We allow -fopenmp-targets to get a list of triples. I am not aware of any 
> limitations in terms of how many of these triples you can have. Even in the 
> test file of this patch we have the following: 
> "-targets=openmp-powerpc64le-ibm-linux-gnu,openmp-x86_64-pc-linux-gnu,host-powerpc64le--linux"
>
> > 
> > 
> >> Regarding tests: more tests can be added as a separate patch once 
> >> offloading is enabled by the patch following this one (i.e. 
> >> https://reviews.llvm.org/D29654). There actually is a test in 
> >> https://reviews.llvm.org/D29654 where I check that the arch is passed to 
> >> ptxas and nvlink correctly using this flag. I will add some more test 
> >> cases to cover the other situations you mentioned.
> > 
> > Sounds good.
> > 
> >> Thanks,
> >> 
> >> --Doru
>
> In our previous solution there might be a problem.  The same triple might be 
> used multiple times just so that you can have several archs in the other flag 
> (T1 and T2 being the same). There are some alternatives which I have 
> discussed with @ABataev.
>
> One solution could be to associate an arch with each triple to avoid 
> positional matching of triples in one flag with archs in another flag:
>
>   -fopenmp-targets=T1:A1,T2,T3:A2
>
>
> ":A1" is optional, also, in the future, we can pass other things to the 
> toolchain such as "-L/a/b/c/d":
>
>   -fopenmp-targets=T1:A1: -L/a/b/c/d,T2,T3:A2
>


Okay, good, this is exactly where I was going when I said I was worried about 
generalization. -march seems like one of many flags I might want to pass to the 
target compilation. Moreover, it doesn't seem special in what regard.

We have -Xclang and -mllvm, etc. to pass flags through to other stages of 
compilation. Could we do something similar here? Maybe something like: 
``-Xopenmp-target:openmp-powerpc64le-ibm-linux-gnu -march=pwr7``. That's 
unfortunately long, but if there's only one target, we could omit the triple?

> An actual example:
> 
>   -fopenmp-targets=nvptx64-nvidia-cuda:sm_35,openmp-powerpc64le-ibm-linux-gnu




Repository:
  rL LLVM

https://reviews.llvm.org/D34784



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D34846: Remove Clang support for '-fvectorize-slp-aggressive' which used LLVM's basic block vectorizer. This vectorizer has had no known users for many, many years and is completely surpassed

2017-06-29 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.

LGTM, but you need to get the:

  def fslp_vectorize_aggressive : Flag<["-"], "fslp-vectorize-aggressive">, 
Group,
HelpText<"Enable the BB vectorization passes">;
  def fno_slp_vectorize_aggressive : Flag<["-"], 
"fno-slp-vectorize-aggressive">, Group;

in include/clang/Driver/Options.td as well. There's corresponding code in 
lib/Driver/ToolChains/Clang.cpp:

  // -fno-slp-vectorize-aggressive is default.
  if (Args.hasFlag(options::OPT_fslp_vectorize_aggressive,
   options::OPT_fno_slp_vectorize_aggressive, false))
CmdArgs.push_back("-vectorize-slp-aggressive");


https://reviews.llvm.org/D34846



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D34846: Remove Clang support for '-fvectorize-slp-aggressive' which used LLVM's basic block vectorizer. This vectorizer has had no known users for many, many years and is completely surpassed

2017-06-29 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D34846#796102, @rsmith wrote:

> Removing a `-cc1` flag is OK: we explicitly tell people that the `-cc1` 
> interface is subject to change without notice.


I'm assuming we can remove the regular flag as well. They'll just get an 
"unused argument" warning, which I think makes sense in this context. Do you 
think that we should do more? I know some people are using this flag, but I 
don't think many. We should remember to add something to the release notes.


https://reviews.llvm.org/D34846



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D34784: [OpenMP] Add flag for specifying the target device architecture for OpenMP device offloading

2017-06-29 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D34784#795980, @gtbercea wrote:

> In https://reviews.llvm.org/D34784#795934, @hfinkel wrote:
>
> > In https://reviews.llvm.org/D34784#795871, @gtbercea wrote:
> >
> > > In https://reviews.llvm.org/D34784#795367, @hfinkel wrote:
> > >
> > > > In https://reviews.llvm.org/D34784#795353, @gtbercea wrote:
> > > >
> > > > > In https://reviews.llvm.org/D34784#795287, @hfinkel wrote:
> > > > >
> > > > > > What happens if you have multiple targets? Maybe this should be 
> > > > > > -fopenmp-targets-arch=foo,bar,whatever?
> > > > > >
> > > > > > Once this all lands, please make sure that you add additional test 
> > > > > > cases here. Make sure that the arch is passed through to the ptx 
> > > > > > and cuda tools as it should be. Make sure that the defaults work. 
> > > > > > Make sure that something reasonable happens if the user specifies 
> > > > > > the option more than once (if they're all the same).
> > > > >
> > > > >
> > > > > Hi Hal,
> > > > >
> > > > > At the moment only one arch is supported and it would apply to all 
> > > > > the target triples under -fopenmp-targets.
> > > > >
> > > > > I was planning to address the multiple archs problem in a future 
> > > > > patch.
> > > > >
> > > > > I am assuming that in the case of multiple archs, each arch in 
> > > > > -fopenmp-targets-arch=A1,A2,A3 will bind to a corresponding triple in 
> > > > > -fopenmp-targets=T1,T2,T3 like so: T1 with A1, T2 with A2 etc. Is 
> > > > > this a practical interpretation of what should happen?
> > > >
> > > >
> > > > Yea, that's what I was thinking. I'm a bit concerned that none of this 
> > > > generalizes well. To take a step back, under what circumstances do we 
> > > > support multiple targets right now?
> > >
> > >
> > > We allow -fopenmp-targets to get a list of triples. I am not aware of any 
> > > limitations in terms of how many of these triples you can have. Even in 
> > > the test file of this patch we have the following: 
> > > "-targets=openmp-powerpc64le-ibm-linux-gnu,openmp-x86_64-pc-linux-gnu,host-powerpc64le--linux"
> > >
> > > > 
> > > > 
> > > >> Regarding tests: more tests can be added as a separate patch once 
> > > >> offloading is enabled by the patch following this one (i.e. 
> > > >> https://reviews.llvm.org/D29654). There actually is a test in 
> > > >> https://reviews.llvm.org/D29654 where I check that the arch is passed 
> > > >> to ptxas and nvlink correctly using this flag. I will add some more 
> > > >> test cases to cover the other situations you mentioned.
> > > > 
> > > > Sounds good.
> > > > 
> > > >> Thanks,
> > > >> 
> > > >> --Doru
> > >
> > > In our previous solution there might be a problem.  The same triple might 
> > > be used multiple times just so that you can have several archs in the 
> > > other flag (T1 and T2 being the same). There are some alternatives which 
> > > I have discussed with @ABataev.
> > >
> > > One solution could be to associate an arch with each triple to avoid 
> > > positional matching of triples in one flag with archs in another flag:
> > >
> > >   -fopenmp-targets=T1:A1,T2,T3:A2
> > >
> > >
> > > ":A1" is optional, also, in the future, we can pass other things to the 
> > > toolchain such as "-L/a/b/c/d":
> > >
> > >   -fopenmp-targets=T1:A1: -L/a/b/c/d,T2,T3:A2
> > >
> >
> >
> > Okay, good, this is exactly where I was going when I said I was worried 
> > about generalization. -march seems like one of many flags I might want to 
> > pass to the target compilation. Moreover, it doesn't seem special in what 
> > regard.
> >
> > We have -Xclang and -mllvm, etc. to pass flags through to other stages of 
> > compilation. Could we do something similar here? Maybe something like: 
> > ``-Xopenmp-target:openmp-powerpc64le-ibm-linux-gnu -march=pwr7``. That's 
> > unfortunately long, but if there's only one target, we could omit the 
> > triple?
>
>
> The triple could be omitted, absolutely.
>
> If you have the following:
>
> -fopenmp-targets=openmp-powerpc64le-ibm-linux-gnu 
> ``-Xopenmp-target:openmp-powerpc64le-ibm-linux-gnu -march=pwr7`` 
> ``-Xopenmp-target:openmp-powerpc64le-ibm-linux-gnu -march=pwr8``
>
> This would end up having a toolchain called for each one of the 
> -Xopenmp-target sets of flags even though a single triple was specified under 
> the -fopenmp-targets. Would this be ok?


Why? That does not sound desirable. And could you even use these multiple 
outputs? I think you'd want to pass all of the arguments for each target triple 
to the one toolchain invocation for that target triple. Is that possible?

> 
> 
>> 
>> 
>>> An actual example:
>>> 
>>>   
>>> -fopenmp-targets=nvptx64-nvidia-cuda:sm_35,openmp-powerpc64le-ibm-linux-gnu




Repository:
  rL LLVM

https://reviews.llvm.org/D34784



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D34784: [OpenMP] Add flag for specifying the target device architecture for OpenMP device offloading

2017-06-30 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

...

>   
 
 Okay, good, this is exactly where I was going when I said I was worried 
 about generalization. -march seems like one of many flags I might want to 
 pass to the target compilation. Moreover, it doesn't seem special in what 
 regard.
 
 We have -Xclang and -mllvm, etc. to pass flags through to other stages of 
 compilation. Could we do something similar here? Maybe something like: 
 ``-Xopenmp-target:openmp-powerpc64le-ibm-linux-gnu -march=pwr7``. That's 
 unfortunately long, but if there's only one target, we could omit the 
 triple?
>>> 
>>> The triple could be omitted, absolutely.
>>> 
>>> If you have the following:
>>> 
>>> -fopenmp-targets=openmp-powerpc64le-ibm-linux-gnu 
>>> ``-Xopenmp-target:openmp-powerpc64le-ibm-linux-gnu -march=pwr7`` 
>>> ``-Xopenmp-target:openmp-powerpc64le-ibm-linux-gnu -march=pwr8``
>>> 
>>> This would end up having a toolchain called for each one of the 
>>> -Xopenmp-target sets of flags even though a single triple was specified 
>>> under the -fopenmp-targets. Would this be ok?
>> 
>> Why? That does not sound desirable. And could you even use these multiple 
>> outputs? I think you'd want to pass all of the arguments for each target 
>> triple to the one toolchain invocation for that target triple. Is that 
>> possible?
> 
> I agree, I don't think that is something we want (i.e. having one triple lead 
> to two toolchains), with the current flags you can't do that today. I wanted 
> to check with you though that's why i mentioned it.
> 
> I think appending all options for a particular triple together is more 
> desirable.

Good, let's do that.


Repository:
  rL LLVM

https://reviews.llvm.org/D34784



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D34784: [OpenMP] Add flag for specifying the target device architecture for OpenMP device offloading

2017-06-29 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

What happens if you have multiple targets? Maybe this should be 
-fopenmp-targets-arch=foo,bar,whatever?

Once this all lands, please make sure that you add additional test cases here. 
Make sure that the arch is passed through to the ptx and cuda tools as it 
should be. Make sure that the defaults work. Make sure that something 
reasonable happens if the user specifies the option more than once (if they're 
all the same).


Repository:
  rL LLVM

https://reviews.llvm.org/D34784



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D29647: [OpenMP] Extend CLANG target options with device offloading kind.

2017-06-29 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rL LLVM

https://reviews.llvm.org/D29647



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D34784: [OpenMP] Add flag for specifying the target device architecture for OpenMP device offloading

2017-06-29 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D34784#795353, @gtbercea wrote:

> In https://reviews.llvm.org/D34784#795287, @hfinkel wrote:
>
> > What happens if you have multiple targets? Maybe this should be 
> > -fopenmp-targets-arch=foo,bar,whatever?
> >
> > Once this all lands, please make sure that you add additional test cases 
> > here. Make sure that the arch is passed through to the ptx and cuda tools 
> > as it should be. Make sure that the defaults work. Make sure that something 
> > reasonable happens if the user specifies the option more than once (if 
> > they're all the same).
>
>
> Hi Hal,
>
> At the moment only one arch is supported and it would apply to all the target 
> triples under -fopenmp-targets.
>
> I was planning to address the multiple archs problem in a future patch.
>
> I am assuming that in the case of multiple archs, each arch in 
> -fopenmp-targets-arch=A1,A2,A3 will bind to a corresponding triple in 
> -fopenmp-targets=T1,T2,T3 like so: T1 with A1, T2 with A2 etc. Is this a 
> practical interpretation of what should happen?


Yea, that's what I was thinking. I'm a bit concerned that none of this 
generalizes well. To take a step back, under what circumstances do we support 
multiple targets right now?

> Regarding tests: more tests can be added as a separate patch once offloading 
> is enabled by the patch following this one (i.e. 
> https://reviews.llvm.org/D29654). There actually is a test in 
> https://reviews.llvm.org/D29654 where I check that the arch is passed to 
> ptxas and nvlink correctly using this flag. I will add some more test cases 
> to cover the other situations you mentioned.

Sounds good.

> Thanks,
> 
> --Doru




Repository:
  rL LLVM

https://reviews.llvm.org/D34784



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D29647: [OpenMP] Extend CLANG target options with device offloading kind.

2017-06-29 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D29647#795271, @hfinkel wrote:

> LGTM


When you commit this, please make sure to mention in the commit message that 
the test cases will be associated with follow-up commits.


Repository:
  rL LLVM

https://reviews.llvm.org/D29647



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D34784: [OpenMP] Add flag for specifying the target device architecture for OpenMP device offloading

2017-06-30 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: include/clang/Driver/Options.td:453
+def Xopenmp_target : Separate<["-"], "Xopenmp-target">,
+  HelpText<"Pass arguments to target offloading toolchain.">;
+def Xopenmp_target_EQ : JoinedAndSeparate<["-"], "Xopenmp-target=">,

Can this be?

HelpText<"Pass  to the target offloading toolchain.">,
MetaVarName<"">;



Comment at: include/clang/Driver/Options.td:455
+def Xopenmp_target_EQ : JoinedAndSeparate<["-"], "Xopenmp-target=">,
+  HelpText<"Pass arguments to target offloading toolchain. First entry is a 
triple that identifies the toolchain.">;
 def z : Separate<["-"], "z">, Flags<[LinkerInput, RenderAsInput]>,

  HelpText<"Pass  to the specified target offloading toolchain. The triple 
that identifies the toolchain must be provided after the equals sign.">, 
MetaVarName<"">;



Comment at: lib/Driver/ToolChains/Cuda.cpp:443
+
+// Get the compute capability from the -fopenmp-targets flag.
+// The default compute capability is sm_20 since this is a CUDA

Is this first sentence accurate?



Comment at: lib/Driver/ToolChains/Cuda.cpp:444
+// Get the compute capability from the -fopenmp-targets flag.
+// The default compute capability is sm_20 since this is a CUDA
+// tool chain.

This comment should be moved down to where the sm_20 default is added.



Comment at: lib/Driver/ToolChains/Cuda.cpp:446
+// tool chain.
+auto OptList = Args.getAllArgValues(options::OPT_Xopenmp_target_EQ);
+

Why is this logic in this function? Don't you need the same logic in 
Generic_GCC::TranslateArgs to handle non-CUDA offloading toolchains?




Comment at: lib/Driver/ToolChains/Cuda.cpp:469
+// toolchain specified).
+assert(Args.getAllArgValues(options::OPT_fopenmp_targets_EQ).size() == 1 &&
+"Target toolchain not specified on -Xopenmp-target and cannot be 
deduced.");

A user can trigger this assert, right? Please make this a diagnostic error 
instead.



Comment at: lib/Driver/ToolChains/Cuda.cpp:474
+for (StringRef Opt : OptList) {
+  AddMArchOption(DAL, Opts, Opt);
+}

Shouldn't you be adding all of the options, not just the -march= ones?



Comment at: lib/Driver/ToolChains/Cuda.cpp:478
+auto MArchList = DAL->getAllArgValues(options::OPT_march_EQ);
+assert(MArchList.size() < 2 && "At most one GPU arch allowed.");
+if (MArchList.empty())

Can a user hit this? If so, it must be an actual diagnostic.



Comment at: test/Driver/openmp-offload.c:607
+
+// CHK-FOPENMP-EQ-TARGET: clang{{.*}} argument unused during compilation: 
'-Xopenmp-target=powerpc64le-ibm-linux-gnu -march=pwr8'
+

I don't see why you'd check that the arguments are unused. They should be used. 
One exception might be that you might want to force -Xopenmp-target=foo to be 
unused if foo is not a currently-targeted offloading triple. There could be a 
separate test case for that.

Otherwise, I think you should be able to check the relevant backend commands, 
no? (something like where CHK-COMMANDS is used above in this file).


https://reviews.llvm.org/D34784



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D34928: Add docs for -foptimization-record-file=

2017-07-01 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM


https://reviews.llvm.org/D34928



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D29647: [OpenMP] Extend CLANG target options with device offloading kind.

2017-06-28 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: lib/Driver/ToolChains/Cuda.cpp:435
+
+// TODO: get the compute capability from offloading arguments when not
+// using the default compute capability of sm_20.

gtbercea wrote:
> hfinkel wrote:
> > gtbercea wrote:
> > > hfinkel wrote:
> > > > Why is this a TODO? Is the necessary information not currently 
> > > > available in the offloading arguments, or are we just not grabbing it 
> > > > in this patch?
> > > > 
> > > This is for a future patch in which we will grab the compute capability 
> > > from a special flag. I believe there has already been some discussion on 
> > > the dev mailing list as to what that flag should be but I don't think 
> > > there was a general consensus as to how that flag should be named.
> > Which mailing list thread discussed this?
> I believe it was openmp-dev. The proposal was initially posted by Gregory 
> Rodgers.
Can you please find it and post the link from here: 
http://lists.llvm.org/pipermail/openmp-dev/

I do recall discussing this at some point somewhere, but I'm not finding it in 
my email.


Repository:
  rL LLVM

https://reviews.llvm.org/D29647



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D29647: [OpenMP] Extend CLANG target options with device offloading kind.

2017-06-28 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: lib/Driver/ToolChains/Cuda.cpp:435
+
+// TODO: get the compute capability from offloading arguments when not
+// using the default compute capability of sm_20.

gtbercea wrote:
> hfinkel wrote:
> > Why is this a TODO? Is the necessary information not currently available in 
> > the offloading arguments, or are we just not grabbing it in this patch?
> > 
> This is for a future patch in which we will grab the compute capability from 
> a special flag. I believe there has already been some discussion on the dev 
> mailing list as to what that flag should be but I don't think there was a 
> general consensus as to how that flag should be named.
Which mailing list thread discussed this?


Repository:
  rL LLVM

https://reviews.llvm.org/D29647



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D29647: [OpenMP] Extend CLANG target options with device offloading kind.

2017-06-28 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: test/Driver/openmp-offload.c:614
+/// Check -march propagates compute capability to device offloading toolchain.
+// RUN:   %clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda 
-save-temps -no-canonical-prefixes -march=sm_35 %s 2>&1 \
+// RUN:   | FileCheck -check-prefix=CHK-COMPUTE-CAPABILITY %s

How does this work on x86 where the host also uses -march?


Repository:
  rL LLVM

https://reviews.llvm.org/D29647



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


  1   2   3   4   >