[PATCH] D83915: [PowerPC] Remove QPX/A2Q BGQ/BGP CNK support

2020-07-27 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision as: hfinkel.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM. Thank you.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83915/new/

https://reviews.llvm.org/D83915



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83915: [PowerPC] Remove QPX/A2Q BGQ/BGP support

2020-07-23 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

Thanks for working on this. Please, however, take another pass of the test 
cases (especially those that are not in the PowerPC directory).  Many of those 
should not be deleted, please change triple instead. They're testing general 
functionality.




Comment at: 
llvm/test/Transforms/DeadStoreElimination/combined-partial-overwrites.ll:1
-; RUN: opt -S -dse -enable-dse-partial-store-merging=false < %s | FileCheck %s
-target datalayout = "E-m:e-i64:64-n32:64"

This test should not be deleted. Change the triple.



Comment at: llvm/test/Transforms/EntryExitInstrumenter/mcount.ll:1
-; RUN: opt 
-passes="function(ee-instrument),cgscc(inline),function(post-inline-ee-instrument)"
 -S < %s | FileCheck %s
-

This test should not be deleted. It's testing general functionality. The triple 
should be changed.



Comment at: openmp/runtime/src/kmp_platform.h:71
-#undef KMP_OS_CNK
-#define KMP_OS_CNK 1
-#endif

Was this not used anywhere? Or is there more to delete?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83915/new/

https://reviews.llvm.org/D83915



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77683: [Docs] Make code review policy clearer about requested pre-commit reviews

2020-04-13 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D77683#1973762 , @dblaikie wrote:

> In D77683#1973757 , @jdoerfert wrote:
>
> > In D77683#1970826 , @mehdi_amini 
> > wrote:
> >
> > > I am still not sure what "if someone has asked for extra review of a 
> > > specific area" refers to?
> >
> >
> > As said earlier
> >
> > >>   If I understand this correctly, this is meant to cover situations 
> > >> where reviewers are active in an area and indicated an interest in 
> > >> reviewing basically everything.
> > > 
> > > Pretty much, yes.
> >
> > this should mean that, if requested, all non-trivial patches should go 
> > through review. The current wording is very lenient, especially wrt. code 
> > owners.
> >  While most people I talked to don't see owners as special per se (but just 
> > assume they have more responsibility), this paragraph says they have 
> > special rights.
> >  Given the murky ways owners are "selected", I think we should have a well 
> > defined way to limit these rights without revoking the status.
>
>
> I disagree here. I think it's suitable (within the way the LLVM project 
> works) for code owners to commit without review - given they are the arbiters 
> of what's acceptable in that subcomponent - ultimately they can veto anyone 
> else (short of a broader project/code owner). Doesn't usually come to that, 
> but that's what the ownership role means.
>
> I don't think it's correct for arbitrary contributors to say "you need 
> to/this component needs review-before-commit" - the code owner could say that 
> if they really don't trust any of the contributors to conform to the 
> direction they have in mind for the component (that's not to discredit the 
> contributors - but that's the point of pre-commit review: to ensure it 
> conforms to the code owners vision for the component). 
> privledges
>  (yes, code owners aren't meant to be dictators - but they're ultimately the 
> final decider)


I disagree that this characterizes our conception of the role of code owner 
within the project, but there is some mitigating context. In my experience, we 
describe code owners as "reviewers of last resort", and their primary 
responsibility is to prevent review stagnation with their component (including 
from new/infrequent contributors whose patches might otherwise go unreviewed). 
The code owner is not the final decider in all cases, but can serve as a tie 
breaker when community consensus is unclear. That's an important role, because 
it ensures our ability to make forward progress, but is different in character 
from someone with overriding final authority. The code owners vision matters 
only slightly if the community consensus is for a different vision. There 
certainly are parts of LLVM that are developed by one person, and that person 
is the code owner, and thus excessive a lot of influence over the future 
direction of the component. Patches are often contributed without pre-commit 
review, in this case, and the code owner is often the most-skilled reviewer for 
any other patches as well. However, this state exists only at the pleasure of 
the rest of the community - we all see these patches on the commits list, and 
should more people become involved and request a more-inclusive pre-commit 
review cycle, that's what should happen. For components actively developed by 
many people, code owners have relatively minor privileges in this regard.

I suppose that we never wrote this down anywhere, but when I became code owner 
for the AA subsystem, we had an understanding that, because AA is so pervasive 
and subtle, even as code owner I would never commit anything (aside from 
reverts or similar) without pre-commit review. IMHO, this is the only 
responsible way to handle this subsystem. Not all subsystems are so risky, so 
there's appropriately a spectrum of development practices (some favoring 
velocity over stability), but I still view this as being more democratic than 
hierarchically authoritative.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77683/new/

https://reviews.llvm.org/D77683



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75779: [OpenMP] `omp begin/end declare variant` - part 2, sema (+"CG")

2020-03-26 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75779/new/

https://reviews.llvm.org/D75779



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75779: [OpenMP] `omp begin/end declare variant` - part 2, sema (+"CG")

2020-03-25 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: clang/lib/AST/DeclarationName.cpp:147
+Name = Name.split(getOpenMPVariantManglingSeparatorStr()).first;
+  OS << Name;
+}

Would it make sense to print " (omp variant)" after the name to disambiguate?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75779/new/

https://reviews.llvm.org/D75779



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75779: [OpenMP] `omp begin/end declare variant` - part 2, sema (+"CG")

2020-03-25 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: clang/lib/Sema/SemaOpenMP.cpp:5396
+  IdentifierInfo  = Context.Idents.get(
+  (D.getIdentifier()->getName() + "." + DVScope.NameSuffix).str());
+  D.SetIdentifier(, D.getBeginLoc());

jdoerfert wrote:
> hfinkel wrote:
> > Is there any way in which this name might become visible to users (e.g., in 
> > error messages)?
> Yes, I think so. One way to trigger it would be to define the same function 
> in the same `omp begin declare variant scope` (or two with the same context). 
> I haven't verified this though.
> TBH, I'm unsure how bad this actually is in the short term. The original name 
> is still a at the beginning. We should obviously make sure the error message 
> is appropriate eventually, e.g., it de-mangles the name.
I think that we have to consider diagnostic quality as a first-class citizen. 
There are a few different ways that we might approach this. A minimal diff from 
what you have might be:

  1. Replace the "." above with ".ompvariant."
  2. Add logic to FunctionDecl::getNameForDiagnostic (or 
NamedDecl::getNameForDiagnostic) that recognizes that magic string in the name 
and alters the printing to do something intelligible for the user with the name.

(I think that (1) is a good idea anyway).

Another way would be to add some first-class property to the function and then 
leave the mangling to CodeGen. Are we mangling these in Sema in other places 
too?




Also, these can have external linkage, right?



Comment at: clang/lib/Sema/SemaOpenMP.cpp:5487
+
+VMIs.erase(VMIs.begin() + BestIdx);
+Exprs.erase(Exprs.begin() + BestIdx);

jdoerfert wrote:
> hfinkel wrote:
> > Why is the ordering here useful? Don't you collect all of the variant 
> > clauses from all of the declarations? Can't there be duplicates? (and does 
> > the relative order need always be the same?) Are we effectively supporting 
> > here, as an extension, cases where not all of the declarations have the 
> > same set of variants declared (it loos like we are because there's no break 
> > in the `while (CalleeFnDecl)` loop, but this makes me wonder if that would 
> > still have an odd behavior.
> > Why is the ordering here useful?
> 
> I don't think it is ordered. We have a conceptual set and BestIdx determines 
> the which of the set is the best. All others are equal.
> 
> > Don't you collect all of the variant clauses from all of the declarations? 
> 
> Yes.
> 
> > Can't there be duplicates? (and does the relative order need always be the 
> > same?)
> 
> Yes (and no).  `getBestVariantMatchForContext` should determine the best 
> regardless of duplicates, we might just try it multiple times if we didn't 
> manage to create a call.
> 
> > Are we effectively supporting here, as an extension, cases where not all of 
> > the declarations have the same set of variants declared (it loos like we 
> > are because there's no break in the while (CalleeFnDecl) loop, but this 
> > makes me wonder if that would still have an odd behavior.
> 
> We are supporting that. All declarations are scanned and all variants are 
> collected. What odd behavior do you refer to?
> 
Makes sense, thanks.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75779/new/

https://reviews.llvm.org/D75779



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75779: [OpenMP] `omp begin/end declare variant` - part 2, sema (+"CG")

2020-03-24 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: clang/include/clang/Basic/DiagnosticParseKinds.td:1253
+: Error<"expected '#pragma omp begin declare variant' to match this stray "
+"`#pragma omp end delcare variant`">;
 def err_omp_declare_target_unexpected_clause: Error<

We're not extremely consistent about how we phrase these kinds of errors, but I 
grepped for stray and we have only one message about stray tokens,  but grepped 
for matching and we have a bunch more. Phrasing this as:

  '#pragma omp end declare variant' with no matching ''#pragma omp begin 
declare variant'

might be better.



Comment at: clang/include/clang/Sema/Sema.h:9727
+
+  /// The declarator \p D defines a function in the socpe \p S which is nested
+  /// in an `omp begin/end declare variant` scope. In this method we create a

scope



Comment at: clang/lib/Sema/SemaDecl.cpp:13552
+  if (LangOpts.OpenMP && isInOpenMPDeclareVariantScope() &&
+  TemplateParameterLists.empty())
+BaseFD = ActOnStartOfFunctionDefinitionInOpenMPDeclareVariantScope(

Please add a comment here explaining why the `TemplateParameterLists.empty()` 
check is here.



Comment at: clang/lib/Sema/SemaOpenMP.cpp:5396
+  IdentifierInfo  = Context.Idents.get(
+  (D.getIdentifier()->getName() + "." + DVScope.NameSuffix).str());
+  D.SetIdentifier(, D.getBeginLoc());

Is there any way in which this name might become visible to users (e.g., in 
error messages)?



Comment at: clang/lib/Sema/SemaOpenMP.cpp:5471
+  // variant expression. We allow this to fail in which case we continue 
with
+  // the next best variant expression.
+  Sema::TentativeAnalysisScope Trap(*this);

Please provide an example or otherwise explain why this failure behavior is 
correct.



Comment at: clang/lib/Sema/SemaOpenMP.cpp:5487
+
+VMIs.erase(VMIs.begin() + BestIdx);
+Exprs.erase(Exprs.begin() + BestIdx);

Why is the ordering here useful? Don't you collect all of the variant clauses 
from all of the declarations? Can't there be duplicates? (and does the relative 
order need always be the same?) Are we effectively supporting here, as an 
extension, cases where not all of the declarations have the same set of 
variants declared (it loos like we are because there's no break in the `while 
(CalleeFnDecl)` loop, but this makes me wonder if that would still have an odd 
behavior.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75779/new/

https://reviews.llvm.org/D75779



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75285: Mark restrict pointer or reference to const as invariant

2020-02-27 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel requested changes to this revision.
hfinkel added a comment.
This revision now requires changes to proceed.

Unfortunately, we cannot do this kind of thing just because it seems to make 
sense. The language semantics must be exactly satisfied by the IR-level 
semantics. I certainly agree that it would make sense for users to be able to 
mark invariant loads, but this mechanism simply might not be the right one.

One problem here is that, with something like:

  char test2(X *x) {
const char* __restrict p = &(x->b);
return *p;
  }

what happens when the function is inlined? Does the "invariantness" only still 
apply to accesses within the scope of the local restrict pointer? I believe 
that it would not, and that would be a problem because later code might legally 
modify the relevant data.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75285/new/

https://reviews.llvm.org/D75285



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72996: [Sema] Attempt to perform call-size-specific `__attribute__((alloc_align(param_idx)))` validation

2020-01-21 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: clang/lib/Sema/SemaChecking.cpp:4489
+// Alignment calculations can wrap around if it's greater than 2**29.
+unsigned MaximumAlignment = 536870912;
+if (I > MaximumAlignment)

jdoerfert wrote:
> hfinkel wrote:
> > jdoerfert wrote:
> > > erichkeane wrote:
> > > > jdoerfert wrote:
> > > > > erichkeane wrote:
> > > > > > I thought we had this stored somewhere else?  We probably should 
> > > > > > have this be a constant somewhere in the frontend.  I THINK I 
> > > > > > remember doing a review where I pulled this value into clang 
> > > > > > somewhere...
> > > > > That was D72998, and I don't think Clang is the right place for this 
> > > > > constant. It is a property of the llvm alignment attribute and it 
> > > > > should live there. Thus, llvm/include/Attributes.h or some similar 
> > > > > place. Can't we "fix" the linker error by making it a constexpr 
> > > > > global or are the errors because of other file content? If the 
> > > > > latter, we could go with a llvm/include/magic_constants.h ;)
> > > > The one I was thinking of was this one: https://reviews.llvm.org/D68824
> > > > 
> > > > I don't remember what we came up with on the linking issue.  It would 
> > > > be really nice if it was just something included from LLVM, but I think 
> > > > SEMA typically doesn't include stuff from LLVM either.
> > > I'm not too happy with the duplication of the constant but defining it 
> > > once in clang is certainly better than having it in N places. For OpenMP 
> > > we look into LLVM during SEMA and here there is an argument to be made 
> > > that we should as well. I imagine more cases would pop up over time.
> > > 
> > > FWIW, if we allow to include LLVM headers, e.g., from IR or Frontend, we 
> > > could still have a wrapper in SEMA to get the information so it doesn't 
> > > expose the llvm:: namespace at the use sides (if that helps).
> > > For OpenMP we look into LLVM during SEMA 
> > 
> > How do we do that?
> > 
> > There's certainly an interesting philosophical issue around whether changes 
> > in LLVM should directly manifest as Clang behavioral changes, especially in 
> > -fsyntax-only. The answer to this question might be different for 
> > extensions vs. core language features (although alignment restrictions 
> > might implicate both). AFAIKT, historically , our answer has been to insist 
> > on separation.
> > > For OpenMP we look into LLVM during SEMA
> > How do we do that?
> 
> I was referring to code like this 
> https://reviews.llvm.org/D71830#C1739755NL11085 
> which is in CodeGen right now but has to move to SemaOverload. The code is 
> completely reusable between Clang and Flang so I put it in 
> lib/Frontend/OpenMP and I think that is the right place for it.
> 
> > There's certainly an interesting philosophical issue around whether changes 
> > in LLVM should directly manifest as Clang behavioral changes, especially in 
> > -fsyntax-only. The answer to this question might be different for 
> > extensions vs. core language features (although alignment restrictions 
> > might implicate both). AFAIKT, historically , our answer has been to insist 
> > on separation.
> 
> 
> I get that in a general sense. For the problem at hand, and as far as I 
> known, the restriction stems only from the LLVM-IR restriction, correct? If 
> so, what is the argument for separation? I mean, a change of the value in 
> LLVM might directly impact Clang behavior.
> 
> I could also see us clamping the alignment during codegen. While that might 
> have other problems they seem less practical to me.
> 
> 
> 
> 
> I was referring to code like this 
> https://reviews.llvm.org/D71830#C1739755NL11085
> which is in CodeGen right now but has to move to SemaOverload. The code is 
> completely reusable between Clang and Flang so I put it in 
> lib/Frontend/OpenMP and I think that is the right place for it.

Fair, but that's a library designed to be a home for cross-language frontend 
components. The variant-selection logic to which you're referring, itself, does 
not actually need to link to LLVM's IR library, correct?

> I get that in a general sense. For the problem at hand, and as far as I 
> known, the restriction stems only from the LLVM-IR restriction, correct? If 
> so, what is the argument for separation? I mean, a change of the value in 
> LLVM might directly impact Clang behavior.

Yes, I believe that the restriction is necessary because of an underlying LLVM 
IR restriction. From my perspective, your argument is perfectly rational. Clang 
only supports code generation using LLVM IR, and a restriction that comes from 
LLVM should be directly tied to the underlying LLVM threshold regardless of 
where it is surfaced. We have, however, avoided a linking dependence (I 
believe, primarily, to help the load times and file sizes of tools based on 
Clang which don't otherwise need to link to the LLVM IR 

[PATCH] D72996: [Sema] Attempt to perform call-size-specific `__attribute__((alloc_align(param_idx)))` validation

2020-01-21 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: clang/lib/Sema/SemaChecking.cpp:4489
+// Alignment calculations can wrap around if it's greater than 2**29.
+unsigned MaximumAlignment = 536870912;
+if (I > MaximumAlignment)

jdoerfert wrote:
> erichkeane wrote:
> > jdoerfert wrote:
> > > erichkeane wrote:
> > > > I thought we had this stored somewhere else?  We probably should have 
> > > > this be a constant somewhere in the frontend.  I THINK I remember doing 
> > > > a review where I pulled this value into clang somewhere...
> > > That was D72998, and I don't think Clang is the right place for this 
> > > constant. It is a property of the llvm alignment attribute and it should 
> > > live there. Thus, llvm/include/Attributes.h or some similar place. Can't 
> > > we "fix" the linker error by making it a constexpr global or are the 
> > > errors because of other file content? If the latter, we could go with a 
> > > llvm/include/magic_constants.h ;)
> > The one I was thinking of was this one: https://reviews.llvm.org/D68824
> > 
> > I don't remember what we came up with on the linking issue.  It would be 
> > really nice if it was just something included from LLVM, but I think SEMA 
> > typically doesn't include stuff from LLVM either.
> I'm not too happy with the duplication of the constant but defining it once 
> in clang is certainly better than having it in N places. For OpenMP we look 
> into LLVM during SEMA and here there is an argument to be made that we should 
> as well. I imagine more cases would pop up over time.
> 
> FWIW, if we allow to include LLVM headers, e.g., from IR or Frontend, we 
> could still have a wrapper in SEMA to get the information so it doesn't 
> expose the llvm:: namespace at the use sides (if that helps).
> For OpenMP we look into LLVM during SEMA 

How do we do that?

There's certainly an interesting philosophical issue around whether changes in 
LLVM should directly manifest as Clang behavioral changes, especially in 
-fsyntax-only. The answer to this question might be different for extensions 
vs. core language features (although alignment restrictions might implicate 
both). AFAIKT, historically , our answer has been to insist on separation.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72996/new/

https://reviews.llvm.org/D72996



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72675: [Clang][Driver] Fix -ffast-math/-ffp-contract interaction

2020-01-20 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

> I'm not sure whether this is deliberate (but it seems weird) or just a bug. I 
> can ask the GCC developers ...

Please do. If there's a rationale, we should know.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72675/new/

https://reviews.llvm.org/D72675



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72998: [IR] Attribute/AttrBuilder: use Value::MaximumAlignment magic constant

2020-01-19 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

Can we, at least, put this constant in a header file so we don't repeat it in 
several places? Also, can we write it as 0x2000 so that it's more obvious 
what the value is.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72998/new/

https://reviews.llvm.org/D72998



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72675: ix -ffast-math/-ffp-contract interaction

2020-01-13 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.
Herald added a subscriber: wuzish.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:2721
   if (!MathErrno && AssociativeMath && ReciprocalMath && !SignedZeros &&
-  !TrappingMath)
+  !TrappingMath && !FPContractDisabled)
 CmdArgs.push_back("-menable-unsafe-fp-math");

I think that you just need

  && !FPContract.equals("off")

or

  && !(FPContract.equals("off") || FPContract.equals("on"))

of which I think the latter. fp-contract=no is also not a fast-math-compatible 
setting in that regard, right?



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:2745
   if (!FPContract.empty())
 CmdArgs.push_back(Args.MakeArgString("-ffp-contract=" + FPContract));
 

So now we pass it twice, because we also pass it here?



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:2763
   if (!HonorINFs && !HonorNaNs && !MathErrno && AssociativeMath &&
   ReciprocalMath && !SignedZeros && !TrappingMath && !RoundingFPMath) {
+if (!FPContractDisabled)

You want the check here, I think, and not below so that if parts of fast-math 
are disabled `__FAST_MATH__` is not set. That seems to be what the comment/code 
currently do, although what does GCC do in this regard?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72675/new/

https://reviews.llvm.org/D72675



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71241: [OpenMP][WIP] Use overload centric declare variants

2019-12-17 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71241#1787571 , @ABataev wrote:

> In D71241#1787265 , @hfinkel wrote:
>
> > In D71241#1786959 , @jdoerfert 
> > wrote:
> >
> > > In D71241#1786530 , @ABataev 
> > > wrote:
> > >
> > > > Most probably, we can use this solution without adding a new 
> > > > expression. `DeclRefExpr` class can contain 2 decls: FoundDecl and the 
> > > > Decl being used. You can use FoundDecl to point to the original 
> > > > function and used decl to point to the function being called in this 
> > > > context. But at first, we need to be sure that we can handle all corner 
> > > > cases correctly.
> > >
> > >
> > > What new expression are you talking about?
> >
> >
> > To be clear, I believe he's talking about the new expression that I 
> > proposed we add in order to represent this kind of call. If that's not 
> > needed, and we can use the FoundDecl/Decl pair for that purpose, that 
> > should also be considered.
> >
> > > This solution already does point to both declarations, as shown here: 
> > > https://reviews.llvm.org/D71241#1782504
>
>
> Exactly. We need to check if the `MemberRefExpr` can do this too to handle 
> member functions correctly.
>  And need to wait for opinion from Richard Smith about the design. We need to 
> be sure that it won't break compatibility with C/C++ in some corner cases. 
> Design bugs are very hard to solve and I want to be sure that we don't miss 
> anything. And we provide full compatibility with both C and C++.


We do need to be careful here. For cases with FoundDecl != Decl, I think that 
the typo-correction cases look like they probably work, but there are a few 
places where we make semantic decisions based on the mismatch, such as:

In SemaTemplate.cpp below line 512, we have (this is in C++03-specific code):

  } else if (!Found.isSuppressingDiagnostics()) {
//   - if the name found is a class template, it must refer to the same
// entity as the one found in the class of the object expression,
// otherwise the program is ill-formed.
if (!Found.isSingleResult() ||
getAsTemplateNameDecl(Found.getFoundDecl())->getCanonicalDecl() !=
OuterTemplate->getCanonicalDecl()) {
  Diag(Found.getNameLoc(),
   diag::ext_nested_name_member_ref_lookup_ambiguous)

and in SemaExpr.cpp near line 2783, we have:

  // If we actually found the member through a using declaration, cast
  // down to the using declaration's type.
  //
  // Pointer equality is fine here because only one declaration of a
  // class ever has member declarations.
  if (FoundDecl->getDeclContext() != Member->getDeclContext()) {
assert(isa(FoundDecl));
QualType URecordType = Context.getTypeDeclType(
   cast(FoundDecl->getDeclContext()));


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71241/new/

https://reviews.llvm.org/D71241



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71241: [OpenMP][WIP] Use overload centric declare variants

2019-12-16 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71241#1786959 , @jdoerfert wrote:

> In D71241#1786530 , @ABataev wrote:
>
> > Most probably, we can use this solution without adding a new expression. 
> > `DeclRefExpr` class can contain 2 decls: FoundDecl and the Decl being used. 
> > You can use FoundDecl to point to the original function and used decl to 
> > point to the function being called in this context. But at first, we need 
> > to be sure that we can handle all corner cases correctly.
>
>
> What new expression are you talking about?


To be clear, I believe he's talking about the new expression that I proposed we 
add in order to represent this kind of call. If that's not needed, and we can 
use the FoundDecl/Decl pair for that purpose, that should also be considered.

> This solution already does point to both declarations, as shown here: 
> https://reviews.llvm.org/D71241#1782504


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71241/new/

https://reviews.llvm.org/D71241



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D67923: [TLI] Support for per-Function TLI that overrides available libfuncs

2019-12-13 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

In D67923#1784015 , @tejohnson wrote:

> Please take a look. This is now updated to reflect the commit of D71193 
> , which translated the options to the new 
> attributes. I also removed some comments that I realized didn't make sense, 
> as we need to keep a baseline availability array on the base TLII since that 
> is set based on the architecture. We simply override this for no-builtin* 
> attributes on the function. I removed the code from clang that was setting up 
> the availability array based on the options, and all tests pass.


LGTM.




Comment at: llvm/include/llvm/Analysis/TargetLibraryInfo.h:52
   unsigned char AvailableArray[(NumLibFuncs+3)/4];
+
   llvm::DenseMap CustomNames;

Unintentional new whitespace?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67923/new/

https://reviews.llvm.org/D67923



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69770: [APFloat] Add recoverable string parsing errors to APFloat

2019-12-13 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

If we really want to be confident that this is robust, we should probably 
fuzz-test it. Regardless, this seems like a definite improvement. LGTM.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69770/new/

https://reviews.llvm.org/D69770



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71241: [OpenMP][WIP] Use overload centric declare variants

2019-12-13 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71241#1783444 , @ABataev wrote:

> In D71241#1782586 , @hfinkel wrote:
>
> > In D71241#1782460 , 
> > @JonChesterfield wrote:
> >
> > > > https://clang.llvm.org/docs/InternalsManual.html#the-ast-library
> > > > 
> > > >   Faithfulness¶
> > > >   The AST intends to provide a representation of the program that is 
> > > > faithful to the original source. 
> > >
> > > That's pretty convincing.
> >
> >
> > No, you're misinterpreting the intent of the statement. Here's the entire 
> > section...
> >
> > > Faithfulness
> > >  The AST intends to provide a representation of the program that is 
> > > faithful to the original source. We intend for it to be possible to write 
> > > refactoring tools using only information stored in, or easily 
> > > reconstructible from, the Clang AST. This means that the AST 
> > > representation should either not desugar source-level constructs to 
> > > simpler forms, or – where made necessary by language semantics or a clear 
> > > engineering tradeoff – should desugar minimally and wrap the result in a 
> > > construct representing the original source form.
> > > 
> > > For example, CXXForRangeStmt directly represents the syntactic form of a 
> > > range-based for statement, but also holds a semantic representation of 
> > > the range declaration and iterator declarations. It does not contain a 
> > > fully-desugared ForStmt, however.
> > > 
> > > Some AST nodes (for example, ParenExpr) represent only syntax, and others 
> > > (for example, ImplicitCastExpr) represent only semantics, but most nodes 
> > > will represent a combination of syntax and associated semantics. 
> > > Inheritance is typically used when representing different (but related) 
> > > syntaxes for nodes with the same or similar semantics.
> >
> > First, being "faithful" to the original source means both syntax and 
> > semantics. I realize that AST is a somewhat-ambiguous term - we have 
> > semantic elements in our AST - but Clang's AST is not just some kind of 
> > minimized parse tree. The AST even has semantics-only nodes (e.g., for 
> > implicit casts). As you can see, the design considerations here are not 
> > just "record the local syntactic elements", but require semantic 
> > interpretation of these elements.
> >
> > Again, Clang's AST is used for various kinds of static analysis tools, and 
> > these depend on having overload resolution correctly performed prior to 
> > that analysis. This includes overload resolution that depends on context 
> > (whether that's qualifications on `this` or host/device in CUDA or anything 
> > else).
> >
> > None of this is to say that we should not record the original spelling of 
> > the function call, we should do that *also*, and that should be done when 
> > constructing the AST in Sema in addition to performing the variant 
> > selection.
>
>
> You are not corret. Check the implementation of decltype, for example 
> https://godbolt.org/z/R76Nn. We keep the original decltype in AST, though we 
> could easily lower it to the real type. This is the design of AST - keep it 
> as much as possible close to the original code and modify it only if it is 
> absolutely impossible (again, check 
> https://clang.llvm.org/docs/InternalsManual.html#the-ast-library).


Yes, but our decltype representation is a semantically-accurate representation 
of the source constructs. Your CodeGen approach to variant selection leads to a 
semantically-inaccurate representation: it produces a CallExpr that has a 
DeclRefExpr to the wrong function declaration.

In any case, our decltype representation is fairly analogous to what I proposed 
above. DecltypeType derives from Type, and stores both the original expression 
and the underlying type. If we have an OpenMPVariantCallExpr, it would also 
store a deference to the base function definition, but like desugar() returns 
the "resolved" regular type, OpenMPVariantCallExpr's getCallee() would return 
the resolved function that will be called.

> Constexprs are not lowered in AST. They are lowered in place where it is 
> required. constexpr is just evaluated. It can be evaluated in Sema, if 
> required, or in codegen, in the analysis tool. Check 
> https://godbolt.org/z/wr9RFk  as an example. You will see, the constexprs are 
> not evaluated in AST, instead AST tries to do everything to keep them in 
> their original form.

I'm aware of how this works. If we see a need for a lazy evaluation strategy 
here we can certainly discuss that. And I agree that we should not drop all 
information about the base function. I'm simply saying that's not what 
getCallee() should return. However, what you're doing here is not analogous to 
the evaluation of constexprs. In short, keeping close to the original source 
does justify misrepresenting the semantics.


Repository:
  rG LLVM Github Monorepo


[PATCH] D71241: [OpenMP][WIP] Use overload centric declare variants

2019-12-12 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71241#1782779 , @ABataev wrote:

> In D71241#1782742 , @hfinkel wrote:
>
> > In D71241#1782703 , @ABataev wrote:
> >
> > >
> >
> >
> > ...
> >
> > >> 
> > >> 
> > >>> Given we have two implementations, each at different points in the 
> > >>> pipeline, it might be constructive to each write down why you each 
> > >>> choose said point. I suspect the rationale is hidden by the 
> > >>> implementation details.
> > > 
> > > Actually, early resolution will break tbe tools, not help them. It 
> > > will definitely break clangd, for example. The user will try to 
> > > navigate to originally called function `base` and instead he will be 
> > > redirected to the function `hst`.
> >  
> >  Can you please be more specific? Please explain why the user would 
> >  consider this incorrect behavior. If the point of the tool is to allow 
> >  the user to navigate to the function actually being called, then 
> >  navigating to `base` seems incorrect if that's not the function being 
> >  called. This is just like any other kind of overload resolution - the 
> >  user likely wants to navigate to the function being called.
> >  
> >  Now the user might want an OpenMP-aware tool that understands 
> >  differences between host and accelerator behavior, how that affects 
> >  which functions are called, etc. The user might want this for 
> >  host/device overloads in CUDA too, but this is really an orthogonal 
> >  concern.
> > >>> 
> > >>> You wrote the code. You called a function in the expression. Now you 
> > >>> want to navivate to this function. Clicked on it and instead of the 
> > >>> called `base` you are redirected to `hst` because AST has the link to 
> > >>> `hst` functiin inthe expression instead of the `base`.
> > >> 
> > >> Sure, but it has that link because that `hst` function is actually the 
> > >> function being called (assuming that the clangd-using-tool is configured 
> > >> to interpret the code as if compiling for the host). When I click on a 
> > >> function call in a source file, I want to navigate to the function 
> > >> actually being called. I certainly understand that the function being 
> > >> called now depends on compilation context, and the tool in our example 
> > >> is only providing the resolution in one context, but at least it 
> > >> provides one valid answer. An OpenMP-aware tool could navigate to the 
> > >> base function (we do need to preserve information to make this 
> > >> possible). This is just like dealing with some host/device functions in 
> > >> CUDA (where there are separate overloads) - if you click on the function 
> > >> in such a tool you'll probably navigate to the host variant of the 
> > >> function (even if, in some other context, the device overload might be 
> > >> called).
> > >> 
> > >> Again, I see this as exactly analogous to overload resolution, or as 
> > >> another example, when calling a function template with specializations. 
> > >> When using such a tool, my experience is that users want to click on the 
> > >> function and navigate to the function actually being called. If, for 
> > >> example, I have a function template with specializations, if the 
> > >> specialized one is being called, I should navigate to the specialization 
> > >> being called (not the base function template).
> > > 
> > > You wrote wrong context matcher. You has a bug in the base function, 
> > > which should be called by default everu sw here but the host, and want to 
> > > fix it. Etc.
> >
> > I understand, but this is a generic problem. Same with host/device 
> > overloads in CUDA. Your tool only gets one compilation context, and thus 
> > only one callee. In addition, FWIW, having a base version called everywhere 
> > except on the host seems like an uncommon use case. Normally, the base 
> > version *is* the version called on the host. The named variants are likely 
> > the ones specialized for various accelerators.
> >
> > Regardless, this is exactly why we should do this in Sema. We can represent 
> > links to both Decls in the AST (as I indicated in an earlier comment), and 
> > then it will be *possible* for an OpenMP-aware tool to decide on which it 
> > wants. Right now, it's not easily possible to write a tool that can use an 
> > appropriate set of contexts to examine the AST where the actual callees are 
> > represented.
>
>
> No need to worry for the right decl. See D7097 
> . If you see a refernce for function, before 
> doing something with it, just call member function 
> `getOpenMPDeclareVariantFunction()` and you get the function that must be 
> actually called here. The tool must do the same. That's hiw the tools work. 
> They must be aware of special costructs/attributes.


But 

[PATCH] D71241: [OpenMP][WIP] Use overload centric declare variants

2019-12-12 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71241#1782723 , @ABataev wrote:

> I don't insist on function redefinition solution. You want to replace 
> functions - fine, but do this at the codegen, not in AST.


Again, no one is replacing anything, and we're not mutating the AST. We're 
simply resolving the callee according to the language rules. That's something 
that should be done during AST construction.

It's like if I have this code:

  template 
  int foo() { return 0; }
  
  template <>
  int foo<8>() { return 1; }
  
  int main() {
return foo<8>();
  }

and you said that, in the AST, it should look like the unspecialized `foo` was 
being called. And then later, in CodeGen, something happened in order to cause 
the correct specialization would be called. That clearly would not be 
considered an acceptable design.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71241/new/

https://reviews.llvm.org/D71241



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71241: [OpenMP][WIP] Use overload centric declare variants

2019-12-12 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71241#1782703 , @ABataev wrote:

>


...

>> 
>> 
>>> Given we have two implementations, each at different points in the 
>>> pipeline, it might be constructive to each write down why you each 
>>> choose said point. I suspect the rationale is hidden by the 
>>> implementation details.
> 
> Actually, early resolution will break tbe tools, not help them. It will 
> definitely break clangd, for example. The user will try to navigate to 
> originally called function `base` and instead he will be redirected to 
> the function `hst`.
 
 Can you please be more specific? Please explain why the user would 
 consider this incorrect behavior. If the point of the tool is to allow the 
 user to navigate to the function actually being called, then navigating to 
 `base` seems incorrect if that's not the function being called. This is 
 just like any other kind of overload resolution - the user likely wants to 
 navigate to the function being called.
 
 Now the user might want an OpenMP-aware tool that understands differences 
 between host and accelerator behavior, how that affects which functions 
 are called, etc. The user might want this for host/device overloads in 
 CUDA too, but this is really an orthogonal concern.
>>> 
>>> You wrote the code. You called a function in the expression. Now you want 
>>> to navivate to this function. Clicked on it and instead of the called 
>>> `base` you are redirected to `hst` because AST has the link to `hst` 
>>> functiin inthe expression instead of the `base`.
>> 
>> Sure, but it has that link because that `hst` function is actually the 
>> function being called (assuming that the clangd-using-tool is configured to 
>> interpret the code as if compiling for the host). When I click on a function 
>> call in a source file, I want to navigate to the function actually being 
>> called. I certainly understand that the function being called now depends on 
>> compilation context, and the tool in our example is only providing the 
>> resolution in one context, but at least it provides one valid answer. An 
>> OpenMP-aware tool could navigate to the base function (we do need to 
>> preserve information to make this possible). This is just like dealing with 
>> some host/device functions in CUDA (where there are separate overloads) - if 
>> you click on the function in such a tool you'll probably navigate to the 
>> host variant of the function (even if, in some other context, the device 
>> overload might be called).
>> 
>> Again, I see this as exactly analogous to overload resolution, or as another 
>> example, when calling a function template with specializations. When using 
>> such a tool, my experience is that users want to click on the function and 
>> navigate to the function actually being called. If, for example, I have a 
>> function template with specializations, if the specialized one is being 
>> called, I should navigate to the specialization being called (not the base 
>> function template).
> 
> You wrote wrong context matcher. You has a bug in the base function, which 
> should be called by default everu sw here but the host, and want to fix it. 
> Etc.

I understand, but this is a generic problem. Same with host/device overloads in 
CUDA. Your tool only gets one compilation context, and thus only one callee. In 
addition, FWIW, having a base version called everywhere except on the host 
seems like an uncommon use case. Normally, the base version *is* the version 
called on the host. The named variants are likely the ones specialized for 
various accelerators.

Regardless, this is exactly why we should do this in Sema. We can represent 
links to both Decls in the AST (as I indicated in an earlier comment), and then 
it will be *possible* for an OpenMP-aware tool to decide on which it wants. 
Right now, it's not easily possible to write a tool that can use an appropriate 
set of contexts to examine the AST where the actual callees are represented.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71241/new/

https://reviews.llvm.org/D71241



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71241: [OpenMP][WIP] Use overload centric declare variants

2019-12-12 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71241#1782668 , @ABataev wrote:

>


...

>> While we talk a lot about what you think is bad about this solution it seems 
>> we ignore the problems in the current one. Let me summarize a few:
>> 
>> - Take https://godbolt.org/z/XCjQUA where the wrong function is called in 
>> the target region (because the "hack" to inject code in the wrong definition 
>> is not applicable).
> 
> No time for it, just short answers. No definition for the variant - no 
> definition for the base.

Are the variants not permitted to be external functions?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71241/new/

https://reviews.llvm.org/D71241



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71241: [OpenMP][WIP] Use overload centric declare variants

2019-12-12 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71241#1782652 , @ABataev wrote:

> In D71241#1782648 , @hfinkel wrote:
>
> > In D71241#1782614 , @ABataev wrote:
> >
> > > In D71241#1782551 , @hfinkel 
> > > wrote:
> > >
> > > > In D71241#1779168 , 
> > > > @JonChesterfield wrote:
> > > >
> > > > > Lowering in sema or in codegen seems a standard phase ordering 
> > > > > choice. There will be pros and cons to both.
> > > > >
> > > > > I think prior art leans towards sema. Variants are loosely equivalent 
> > > > > to tag dispatching or constexpr if, both handled before lowering the 
> > > > > AST to IR.
> > > >
> > > >
> > > > This is exactly right. This is just like any other kind of static 
> > > > overload resolution. It should be resolved in Sema and the CallExpr's 
> > > > DeclRefExpr should refer to the correct callee. This will make sure 
> > > > that tools, including static analysis tools, will correctly understand 
> > > > the semantics of the call (e.g., IPA will work correctly). Also, we 
> > > > prefer, in Clang, to generate errors and warning messages in Sema, not 
> > > > in CodeGen, and it is certainly plausible that errors and warnings 
> > > > could be generated during the selector-based resolution process.
> > > >
> > > > That having been said, Alexey is also correct that we retain the 
> > > > unevaluated form of the constexpr expressions, and there is an 
> > > > important analogy here. I believe that one way of restating Alexey's 
> > > > concerns about the AST representation is that, if we resolve the 
> > > > variant selection as we build the AST, and then we print the AST, the 
> > > > printed function would be the name of the selected variant and not the 
> > > > name of the base function. This is certainly a legitimate concern, and 
> > > > there are several places in Clang where we take care to preserve the 
> > > > spelling used for constructs that are otherwise semantically equivalent 
> > > > (e.g., different spellings of keywords). I can certainly imagine a tool 
> > > > wanting to see the base function called, and we'll want that for the 
> > > > AST printer regardless. We might add this information to CallExpr or 
> > > > make a new subclass of CallExpr (e.g., OpenMPVariantCallExpr) that has 
> > > > that information (the latter seems likely better so that we don't 
> > > > increase the size of CallExpr for an uncommon case).
> > > >
> > > > > Writing the dispatch lowering on IR should make it easier to call 
> > > > > from a Fortran front end. I'm in favour of minimising work done on 
> > > > > the clang AST on general principles.
> > > >
> > > > We need to make the best decision for Clang in Clang, regardless of how 
> > > > this might impact a future Fortran implementation. While the 
> > > > OpenMPIRBuilder will be a point of reuse between different 
> > > > OpenMP-enabled frontends, it need not be the only one. Moreover, 
> > > > Fortran will also want to do this resolution earlier for the same 
> > > > reason that it should be done earlier in Clang (and, for Fortran, we'll 
> > > > end up with inlining and other IPA at the FIR level, so it will be 
> > > > required to resolve the variants prior to hitting the OpenMPIRBuilder). 
> > > > Thus, no, doing this in CodeGen is unlikely to work for the Flang 
> > > > implementation.
> > > >
> > > > Also, "minimizing work done in the Clang AST on general principles", 
> > > > seems like an oversimplification of our general Clang design 
> > > > philosophy. Overload resolution in Clang is certainly a significant 
> > > > part of the implementation, but we wouldn't consider doing it in 
> > > > CodeGen. The AST should faithfully represent the semantic elements in 
> > > > the source code. Overload resolution, template instantiation, constexpr 
> > > > evaluation, etc. all are highly non-trivial, and all happen during Sema 
> > > > (even in cases where we might, technically speaking, be able to delay 
> > > > that logic until CodeGen). What we don't do in Sema are  lowering tasks 
> > > > (e.g., translating references into pointers or other things related to 
> > > > picking an underlying implementation strategy for particular 
> > > > constructs) and optimizations - where we do them at all - e.g., 
> > > > constant folding, some devirtualization, and so on are done in CodeGen. 
> > > > For the most part, of course, we defer optimizations to LLVM's 
> > > > optimizer.
> > > >
> > > > > Given we have two implementations, each at different points in the 
> > > > > pipeline, it might be constructive to each write down why you each 
> > > > > choose said point. I suspect the rationale is hidden by the 
> > > > > implementation details.
> > >
> > >
> > > Actually, early resolution will break tbe tools, not help them. It will 
> > > definitely break 

[PATCH] D71241: [OpenMP][WIP] Use overload centric declare variants

2019-12-12 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71241#1782430 , @JonChesterfield 
wrote:

> In D71241#1782427 , @ABataev wrote:
>
> > In D71241#1782425 , 
> > @JonChesterfield wrote:
> >
> > > > Explain that you're replacing the function written by the user on the 
> > > > fly by another one. If they accept it, go ahead.
> > >
> > > That's the observational effect of variants. Replacing is very similar to 
> > > calling + inlining.
> >
> >
> > Not in the AST.
>
>
> I don't see much difference between mutating the AST and mutating the SSA. 
> What're your objections to the former, specifically? It's not a stable 
> interface so tools hanging off it will have a process for updating when it 
> changes.


I'd like to add that what we're talking about is none of these things. We're 
not talking about "mutating" the AST at all. Neither are we inlining. We're 
talking about performing callee resolution when building the AST in the first 
place. This is exactly what we do in all other places where to do overload 
resolution.

This is different from other places where we perform overload resolution only 
in that the callee won't have the same name as the identifier used in the call 
expression. But that's okay - those are the semantics of the calls with OpenMP 
variants. You type one name, and the function that ends up being called has 
another name. But it's all static and part of the specified language semantics. 
Should we record the original "base" function? Yes. Should we represent it as 
the callee? No.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71241/new/

https://reviews.llvm.org/D71241



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71241: [OpenMP][WIP] Use overload centric declare variants

2019-12-12 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71241#1778671 , @ABataev wrote:

> There can be another one issue with this solution with inline assembly. I’m 
> not completely sure about it, will try to investigate it tomorrow. I suggest 
> to discuss this solution with Richard Smith (or John McCall). If he/they are 
> ok with this transformation of the AST, we can switch to this scheme.


@jdoerfert , please do add Richard and John to this thread. We should be kind 
to them, however, and please write a summary of the language feature including 
some examples showing usage, and please also summarize the current 
implementation strategy and the one being proposed, so that they don't need to 
read the OpenMP spec to figure out what the discussion is about.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71241/new/

https://reviews.llvm.org/D71241



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71241: [OpenMP][WIP] Use overload centric declare variants

2019-12-12 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71241#1782614 , @ABataev wrote:

> In D71241#1782551 , @hfinkel wrote:
>
> > In D71241#1779168 , 
> > @JonChesterfield wrote:
> >
> > > Lowering in sema or in codegen seems a standard phase ordering choice. 
> > > There will be pros and cons to both.
> > >
> > > I think prior art leans towards sema. Variants are loosely equivalent to 
> > > tag dispatching or constexpr if, both handled before lowering the AST to 
> > > IR.
> >
> >
> > This is exactly right. This is just like any other kind of static overload 
> > resolution. It should be resolved in Sema and the CallExpr's DeclRefExpr 
> > should refer to the correct callee. This will make sure that tools, 
> > including static analysis tools, will correctly understand the semantics of 
> > the call (e.g., IPA will work correctly). Also, we prefer, in Clang, to 
> > generate errors and warning messages in Sema, not in CodeGen, and it is 
> > certainly plausible that errors and warnings could be generated during the 
> > selector-based resolution process.
> >
> > That having been said, Alexey is also correct that we retain the 
> > unevaluated form of the constexpr expressions, and there is an important 
> > analogy here. I believe that one way of restating Alexey's concerns about 
> > the AST representation is that, if we resolve the variant selection as we 
> > build the AST, and then we print the AST, the printed function would be the 
> > name of the selected variant and not the name of the base function. This is 
> > certainly a legitimate concern, and there are several places in Clang where 
> > we take care to preserve the spelling used for constructs that are 
> > otherwise semantically equivalent (e.g., different spellings of keywords). 
> > I can certainly imagine a tool wanting to see the base function called, and 
> > we'll want that for the AST printer regardless. We might add this 
> > information to CallExpr or make a new subclass of CallExpr (e.g., 
> > OpenMPVariantCallExpr) that has that information (the latter seems likely 
> > better so that we don't increase the size of CallExpr for an uncommon case).
> >
> > > Writing the dispatch lowering on IR should make it easier to call from a 
> > > Fortran front end. I'm in favour of minimising work done on the clang AST 
> > > on general principles.
> >
> > We need to make the best decision for Clang in Clang, regardless of how 
> > this might impact a future Fortran implementation. While the 
> > OpenMPIRBuilder will be a point of reuse between different OpenMP-enabled 
> > frontends, it need not be the only one. Moreover, Fortran will also want to 
> > do this resolution earlier for the same reason that it should be done 
> > earlier in Clang (and, for Fortran, we'll end up with inlining and other 
> > IPA at the FIR level, so it will be required to resolve the variants prior 
> > to hitting the OpenMPIRBuilder). Thus, no, doing this in CodeGen is 
> > unlikely to work for the Flang implementation.
> >
> > Also, "minimizing work done in the Clang AST on general principles", seems 
> > like an oversimplification of our general Clang design philosophy. Overload 
> > resolution in Clang is certainly a significant part of the implementation, 
> > but we wouldn't consider doing it in CodeGen. The AST should faithfully 
> > represent the semantic elements in the source code. Overload resolution, 
> > template instantiation, constexpr evaluation, etc. all are highly 
> > non-trivial, and all happen during Sema (even in cases where we might, 
> > technically speaking, be able to delay that logic until CodeGen). What we 
> > don't do in Sema are  lowering tasks (e.g., translating references into 
> > pointers or other things related to picking an underlying implementation 
> > strategy for particular constructs) and optimizations - where we do them at 
> > all - e.g., constant folding, some devirtualization, and so on are done in 
> > CodeGen. For the most part, of course, we defer optimizations to LLVM's 
> > optimizer.
> >
> > > Given we have two implementations, each at different points in the 
> > > pipeline, it might be constructive to each write down why you each choose 
> > > said point. I suspect the rationale is hidden by the implementation 
> > > details.
>
>
> Actually, early resolution will break tbe tools, not help them. It will 
> definitely break clangd, for example. The user will try to navigate to 
> originally called function `base` and instead he will be redirected to the 
> function `hst`.


Can you please be more specific? Please explain why the user would consider 
this incorrect behavior. If the point of the tool is to allow the user to 
navigate to the function actually being called, then navigating to `base` seems 
incorrect if that's not the function being called. This is just like any other 
kind of overload resolution - 

[PATCH] D71241: [OpenMP][WIP] Use overload centric declare variants

2019-12-12 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71241#1779779 , @ABataev wrote:

> Here is the example that does not work with the proposed solution but works 
> with the existing one:
>
>   static void cpu() { asm("nop"); }
>  
>   #pragma omp declare variant(cpu) match(device = {kind(cpu)})
>   static __attribute__((used)) void wrong_asm() {
> asm ("xxx");
>   }
>
>
> The existing solution has some problems with the delayed error messages too, 
> but they are very easy to fix.


I don't understand that this example represents. Unused static functions are 
generally not emitted. In general, if we perform overload resolution in Sema 
and, thus, only use the variants that are selected, then others that are static 
won't even be emitted. Here you're forcing the `wrong_asm` function to be used, 
but if it's used on all devices (host and target), then it's used, and we need 
to deal with the inline asm regardless.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71241/new/

https://reviews.llvm.org/D71241



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71241: [OpenMP][WIP] Use overload centric declare variants

2019-12-12 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71241#1782460 , @JonChesterfield 
wrote:

> > https://clang.llvm.org/docs/InternalsManual.html#the-ast-library
> > 
> >   Faithfulness¶
> >   The AST intends to provide a representation of the program that is 
> > faithful to the original source. 
>
> That's pretty convincing.


No, you're misinterpreting the intent of the statement. Here's the entire 
section...

> Faithfulness
>  The AST intends to provide a representation of the program that is faithful 
> to the original source. We intend for it to be possible to write refactoring 
> tools using only information stored in, or easily reconstructible from, the 
> Clang AST. This means that the AST representation should either not desugar 
> source-level constructs to simpler forms, or – where made necessary by 
> language semantics or a clear engineering tradeoff – should desugar minimally 
> and wrap the result in a construct representing the original source form.
> 
> For example, CXXForRangeStmt directly represents the syntactic form of a 
> range-based for statement, but also holds a semantic representation of the 
> range declaration and iterator declarations. It does not contain a 
> fully-desugared ForStmt, however.
> 
> Some AST nodes (for example, ParenExpr) represent only syntax, and others 
> (for example, ImplicitCastExpr) represent only semantics, but most nodes will 
> represent a combination of syntax and associated semantics. Inheritance is 
> typically used when representing different (but related) syntaxes for nodes 
> with the same or similar semantics.

First, being "faithful" to the original source means both syntax and semantics. 
I realize that AST is a somewhat-ambiguous term - we have semantic elements in 
our AST - but Clang's AST is not just some kind of minimized parse tree. The 
AST even has semantics-only nodes (e.g., for implicit casts). As you can see, 
the design considerations here are not just "record the local syntactic 
elements", but require semantic interpretation of these elements.

Again, Clang's AST is used for various kinds of static analysis tools, and 
these depend on having overload resolution correctly performed prior to that 
analysis. This includes overload resolution that depends on context (whether 
that's qualifications on `this` or host/device in CUDA or anything else).

None of this is to say that we should not record the original spelling of the 
function call, we should do that *also*, and that should be done when 
constructing the AST in Sema in addition to performing the variant selection.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71241/new/

https://reviews.llvm.org/D71241



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71241: [OpenMP][WIP] Use overload centric declare variants

2019-12-12 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71241#1779168 , @JonChesterfield 
wrote:

> Lowering in sema or in codegen seems a standard phase ordering choice. There 
> will be pros and cons to both.
>
> I think prior art leans towards sema. Variants are loosely equivalent to tag 
> dispatching or constexpr if, both handled before lowering the AST to IR.


This is exactly right. This is just like any other kind of static overload 
resolution. It should be resolved in Sema and the CallExpr's DeclRefExpr should 
refer to the correct callee. This will make sure that tools, including static 
analysis tools, will correctly understand the semantics of the call (e.g., IPA 
will work correctly). Also, we prefer, in Clang, to generate errors and warning 
messages in Sema, not in CodeGen, and it is certainly plausible that errors and 
warnings could be generated during the selector-based resolution process.

That having been said, Alexey is also correct that we retain the unevaluated 
form of the constexpr expressions, and there is an important analogy here. I 
believe that one way of restating Alexey's concerns about the AST 
representation is that, if we resolve the variant selection as we build the 
AST, and then we print the AST, the printed function would be the name of the 
selected variant and not the name of the base function. This is certainly a 
legitimate concern, and there are several places in Clang where we take care to 
preserve the spelling used for constructs that are otherwise semantically 
equivalent (e.g., different spellings of keywords). I can certainly imagine a 
tool wanting to see the base function called, and we'll want that for the AST 
printer regardless. We might add this information to CallExpr or make a new 
subclass of CallExpr (e.g., OpenMPVariantCallExpr) that has that information 
(the latter seems likely better so that we don't increase the size of CallExpr 
for an uncommon case).

> Writing the dispatch lowering on IR should make it easier to call from a 
> Fortran front end. I'm in favour of minimising work done on the clang AST on 
> general principles.

We need to make the best decision for Clang in Clang, regardless of how this 
might impact a future Fortran implementation. While the OpenMPIRBuilder will be 
a point of reuse between different OpenMP-enabled frontends, it need not be the 
only one. Moreover, Fortran will also want to do this resolution earlier for 
the same reason that it should be done earlier in Clang (and, for Fortran, 
we'll end up with inlining and other IPA at the FIR level, so it will be 
required to resolve the variants prior to hitting the OpenMPIRBuilder). Thus, 
no, doing this in CodeGen is unlikely to work for the Flang implementation.

Also, "minimizing work done in the Clang AST on general principles", seems like 
an oversimplification of our general Clang design philosophy. Overload 
resolution in Clang is certainly a significant part of the implementation, but 
we wouldn't consider doing it in CodeGen. The AST should faithfully represent 
the semantic elements in the source code. Overload resolution, template 
instantiation, constexpr evaluation, etc. all are highly non-trivial, and all 
happen during Sema (even in cases where we might, technically speaking, be able 
to delay that logic until CodeGen). What we don't do in Sema are  lowering 
tasks (e.g., translating references into pointers or other things related to 
picking an underlying implementation strategy for particular constructs) and 
optimizations - where we do them at all - e.g., constant folding, some 
devirtualization, and so on are done in CodeGen. For the most part, of course, 
we defer optimizations to LLVM's optimizer.

> Given we have two implementations, each at different points in the pipeline, 
> it might be constructive to each write down why you each choose said point. I 
> suspect the rationale is hidden by the implementation details.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71241/new/

https://reviews.llvm.org/D71241



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D70258: [OpenMP][IR-Builder] Introduce the finalization stack

2019-12-12 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h:51
+  /// at the time, and location, the callback is invoked.
+  using FinalizeCallbackTy = std::function;
+

jdoerfert wrote:
> ABataev wrote:
> > jdoerfert wrote:
> > > ABataev wrote:
> > > > jdoerfert wrote:
> > > > > ABataev wrote:
> > > > > > jdoerfert wrote:
> > > > > > > ABataev wrote:
> > > > > > > > `llvm::function_ref`?
> > > > > > > The lambda that is passed in here might go out of scope so we 
> > > > > > > need to own it. This is intentional.
> > > > > > Maybe better to save the intermediate data in CGOpenMPRuntime class 
> > > > > > rather than rely on owning lambda ref here? Clang does not use 
> > > > > > escaping decls and instead stores intermediate data explicitly. It 
> > > > > > really complicates analysis and potential source of resource 
> > > > > > leakage.
> > > > > I don't follow. Clang does use `std::function` to store callbacks 
> > > > > that have to life for a while. Why is this different? What would be 
> > > > > the benefit of having a function_ref here and a `std::function` in 
> > > > > CGOpenMPRuntime? Note that the FinaliztionInfo object is created in 
> > > > > the front-end and the std::function is assigned there already.
> > > > Clang uses it only in some rare cases, when it is really required, like 
> > > > typo correction or something like this. In other cases it is not 
> > > > recommended to use it.
> > > > 
> > > > I'm not saying that you need to store `std::function` in 
> > > > CGOpenMPRuntime class, I'm saying about necessary data.
> > > What necessary data? Can you please explain how you want it to look like 
> > > and why? This version seems to work fine.
> > I'm not arguing that it does not work, I'm asking do you really need such a 
> > complex solution? Just like I said before, it is very hard to maintain and 
> > to understand the lifetime of lambda, so it is a potential source of 
> > resource leakage. All I'm asking is `is there a way to implement it 
> > differently`, nothing else.
> There are alternatives, all of which are more complex and come with the same 
> "problems".
@jdoerfert , can you please elaborate? What other designs might be possible?

It looks to me like this lambda is necessary to maintain separation between 
Clang's CodeGen and the OpenMPIRBuilder (the non-unit-test use in this patch 
only captures CGF). I'm not particularly concerned about lifetime management of 
the lambdas if they only need to capture CGF, and maybe the design of this 
implies that the lambdas will only capture objects that live as long as the 
code generation phase, but it is certainly true that whenever we have 
long-lived lambdas thought about lifetime of captured data is required.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70258/new/

https://reviews.llvm.org/D70258



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71241: [OpenMP][WIP] Use overload centric declare variants

2019-12-10 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71241#1778134 , @ABataev wrote:

>


...

>> 
>> 
>>> Also, check how -ast-print works with your solution. It returns different 
>>> result than expected because you're transform the code too early. It is 
>>> incorrect behavior.
>> 
>> This is debatable. AST print does not print the input but the AST, thus what 
>> is correct wrt. OpenMP declare variant is nowhere defined but by us.
>>  Arguably, having it print the actually called function and not the base 
>> function is preferable. Thus, the new way is actually more informative.
> 
> You're completely wrong here! We shall keep the original AST. This is used in 
> several tools and you significantly modify the user code. I consider it a 
> real issue here.

Alexey, again, this kind of comment is not appropriate. We're all experienced 
developers here, and we all understand the importance of tooling support in 
Clang. We also serve developers who write tools using AST matchers and other 
Clang analysis facilities. Having the resolved callee represented in the AST 
for what looks like a static call from the base-language perspective makes a 
lot of sense from a tooling perspective. When performing static analysis on the 
code, forcing a tool to understand how OpenMP variant selectors work in order 
to perform inter-procedural static analysis is suboptimal in nearly all cases. 
It is also true that we might want the base callee represented in some way, but 
as that callee is never actually called, and one of the variants is always 
called at that call site, it is important that IPA propagate information into 
and out of the correct callee in order to produce the correct results. If we 
currently represent the base callee as the callee that will appear in the call 
graph, that's a bug: Clang's static analyzer will produce incorrect results.

If you know of specific tools that indeed depend on the current behavior to 
produce correct results, please provide us with details on what they're doing 
so that we understand the use cases. Regardless, we should prioritize 
correct-by-default functioning of AST-based call graphs and their associated 
static analysis.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71241/new/

https://reviews.llvm.org/D71241



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71179: [OpenMP][WIP] Initial support for `begin/end declare variant`

2019-12-10 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71179#1776761 , @ABataev wrote:

> In D71179#1776528 , @hfinkel wrote:
>
> > In D71179#1776491 , @ABataev wrote:
> >
> > > In D71179#1776487 , @hfinkel 
> > > wrote:
> > >
> > > > In D71179#1776467 , @ABataev 
> > > > wrote:
> > > >
> > > > > In D71179#1776457 , 
> > > > > @jdoerfert wrote:
> > > > >
> > > > > > > You're doing absolutely the same thing as the original declare 
> > > > > > > variant implementation.
> > > > > >
> > > > > > I don't think so but if you do why do you oppose this approach?
> > > > > >
> > > > > > > And I don't think it would be correct to add them as multiversiin 
> > > > > > > variants to the original function.
> > > > > >
> > > > > > Why wouldn't it be correct to pick the version through the overload 
> > > > > > resolution instead of the code generation?
> > > > > >  How this could work is already described in the TODO 
> > > > > > (CodeGenModule.cpp):
> > > > > >
> > > > > >   // TODO: We should introduce function aliases for `omp declare 
> > > > > > variant`
> > > > > >   //   directives such that we can treat them through the same 
> > > > > > overload
> > > > > >   //   resolution scheme (via multi versioning) as `omp begin 
> > > > > > declare
> > > > > >   //   variant` functions. For an `omp declare variant(VARIANT) 
> > > > > > ...`
> > > > > >   //   that is attached to a BASE function we would create a 
> > > > > > global alias
> > > > > >   //   VARIANT = BASE which will participate in the multi 
> > > > > > version overload
> > > > > >   //   resolution. If picked, here is no need to emit them 
> > > > > > explicitly.
> > > > > >   
> > > > > >
> > > > > >
> > > > > >
> > > > > >  ---
> > > > > >
> > > > > > I still haven't understood why we cannot/should not reuse the 
> > > > > > existing multi-version support and instead duplicate the logic in 
> > > > > > some custom scheme.
> > > > > >  We have this patch that shows how we can reuse the logic in Clang. 
> > > > > > It works on a per-call basis, so it will work for all context 
> > > > > > selector (incl. construct).
> > > > > >  If you think there is something conceptually not working, I'd like 
> > > > > > to hear about it. However, just saying "it wouldn't be correct" is 
> > > > > > not sufficient. You need to provide details about the situation, 
> > > > > > what you think would not work, and why.
> > > > >
> > > > >
> > > > > I explayned already: declare variant cannot be represented as 
> > > > > mutiversion functiin, for example.
> > > >
> > > >
> > > > @ABataev, can you please elaborate? It's not obvious to me that we 
> > > > cannot handle the existing declare variant with the same scheme (as 
> > > > @jdoerfert highlighted above). In general, I believe it's preferable to 
> > > > have one generic scheme and use it to handle all cases as opposed to 
> > > > continuing to use a more-limited scheme in addition to the generic 
> > > > scheme.
> > >
> > >
> > > Eaine already. Current version of declare variant cannot be represented 
> > > as multiversiin functions, because it is not. We have a function that is 
> > > the alias to other functions with different names. They just are not 
> > > multiversion functions by definition.
> >
> >
> > I understand that they have different names. I don't see why we that means 
> > that they can't be added to the overload set as multi-version candidates if 
> > we add logic which does exactly that.
>
>
>


@jdoerfert posted a prototype implementation in D71241 
, so we don't need to just have a theoretical 
discussion, but I'd like to address a high-level issue here:

> Because this is exactly what I said- you want to reuse the exiwting solution 
> for completely different purpose just because you want to you reuse though 
> even semantically it has nothing to do with multiversioning.

This kind of comment really isn't appropriate. We're all experienced developers 
here, and no one is proposing to reuse code in an inappropriate manner "just 
because" or for any other reason. I ask you to reconsider your reasoning here 
for two reasons:

1. "Reus[ing] the existing solution for a completely different purpose", which 
I'll classify as structural code reuse, is not necessarily bad. Structural code 
reuse, where you reuse code with a similar structure, but different purpose, 
from what you need, is often a useful impetus for the creation of new 
abstractions. The trade off relevant here, in my experience, is against future 
structural divergence. In the future, is it likely that the abstraction will 
break down because the different purposes will tend to require the code 
structure to change in the future in divergent ways? If so, that can be 

[PATCH] D71179: [OpenMP][WIP] Initial support for `begin/end declare variant`

2019-12-09 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71179#1776491 , @ABataev wrote:

> In D71179#1776487 , @hfinkel wrote:
>
> > In D71179#1776467 , @ABataev wrote:
> >
> > > In D71179#1776457 , @jdoerfert 
> > > wrote:
> > >
> > > > > You're doing absolutely the same thing as the original declare 
> > > > > variant implementation.
> > > >
> > > > I don't think so but if you do why do you oppose this approach?
> > > >
> > > > > And I don't think it would be correct to add them as multiversiin 
> > > > > variants to the original function.
> > > >
> > > > Why wouldn't it be correct to pick the version through the overload 
> > > > resolution instead of the code generation?
> > > >  How this could work is already described in the TODO 
> > > > (CodeGenModule.cpp):
> > > >
> > > >   // TODO: We should introduce function aliases for `omp declare 
> > > > variant`
> > > >   //   directives such that we can treat them through the same 
> > > > overload
> > > >   //   resolution scheme (via multi versioning) as `omp begin 
> > > > declare
> > > >   //   variant` functions. For an `omp declare variant(VARIANT) ...`
> > > >   //   that is attached to a BASE function we would create a global 
> > > > alias
> > > >   //   VARIANT = BASE which will participate in the multi version 
> > > > overload
> > > >   //   resolution. If picked, here is no need to emit them 
> > > > explicitly.
> > > >   
> > > >
> > > >
> > > >
> > > >  ---
> > > >
> > > > I still haven't understood why we cannot/should not reuse the existing 
> > > > multi-version support and instead duplicate the logic in some custom 
> > > > scheme.
> > > >  We have this patch that shows how we can reuse the logic in Clang. It 
> > > > works on a per-call basis, so it will work for all context selector 
> > > > (incl. construct).
> > > >  If you think there is something conceptually not working, I'd like to 
> > > > hear about it. However, just saying "it wouldn't be correct" is not 
> > > > sufficient. You need to provide details about the situation, what you 
> > > > think would not work, and why.
> > >
> > >
> > > I explayned already: declare variant cannot be represented as mutiversion 
> > > functiin, for example.
> >
> >
> > @ABataev, can you please elaborate? It's not obvious to me that we cannot 
> > handle the existing declare variant with the same scheme (as @jdoerfert 
> > highlighted above). In general, I believe it's preferable to have one 
> > generic scheme and use it to handle all cases as opposed to continuing to 
> > use a more-limited scheme in addition to the generic scheme.
>
>
> Eaine already. Current version of declare variant cannot be represented as 
> multiversiin functions, because it is not. We have a function that is the 
> alias to other functions with different names. They just are not multiversion 
> functions by definition.


I understand that they have different names. I don't see why we that means that 
they can't be added to the overload set as multi-version candidates if we add 
logic which does exactly that.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71179/new/

https://reviews.llvm.org/D71179



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71179: [OpenMP][WIP] Initial support for `begin/end declare variant`

2019-12-09 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71179#1776467 , @ABataev wrote:

> In D71179#1776457 , @jdoerfert wrote:
>
> > > You're doing absolutely the same thing as the original declare variant 
> > > implementation.
> >
> > I don't think so but if you do why do you oppose this approach?
> >
> > > And I don't think it would be correct to add them as multiversiin 
> > > variants to the original function.
> >
> > Why wouldn't it be correct to pick the version through the overload 
> > resolution instead of the code generation?
> >  How this could work is already described in the TODO (CodeGenModule.cpp):
> >
> >   // TODO: We should introduce function aliases for `omp declare variant`
> >   //   directives such that we can treat them through the same overload
> >   //   resolution scheme (via multi versioning) as `omp begin declare
> >   //   variant` functions. For an `omp declare variant(VARIANT) ...`
> >   //   that is attached to a BASE function we would create a global 
> > alias
> >   //   VARIANT = BASE which will participate in the multi version 
> > overload
> >   //   resolution. If picked, here is no need to emit them explicitly.
> >   
> >
> >
> >
> >  ---
> >
> > I still haven't understood why we cannot/should not reuse the existing 
> > multi-version support and instead duplicate the logic in some custom scheme.
> >  We have this patch that shows how we can reuse the logic in Clang. It 
> > works on a per-call basis, so it will work for all context selector (incl. 
> > construct).
> >  If you think there is something conceptually not working, I'd like to hear 
> > about it. However, just saying "it wouldn't be correct" is not sufficient. 
> > You need to provide details about the situation, what you think would not 
> > work, and why.
>
>
> I explayned already: declare variant cannot be represented as mutiversion 
> functiin, for example.


@ABataev, can you please elaborate? It's not obvious to me that we cannot 
handle the existing declare variant with the same scheme (as @jdoerfert 
highlighted above). In general, I believe it's preferable to have one generic 
scheme and use it to handle all cases as opposed to continuing to use a 
more-limited scheme in addition to the generic scheme.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71179/new/

https://reviews.llvm.org/D71179



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71179: [OpenMP][WIP] Initial support for `begin/end declare variant`

2019-12-09 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71179#1775687 , @jdoerfert wrote:

>


...

> 
> 
>>> We restricted it for now to function definitions so we don't need to define 
>>> the mangling as you cannot expect linking. (I did this to get it in TR8 
>>> while I figured it will solve all our math.h problems already).
>>>  However, we need to avoid collisions with user code, e.g., through the use 
>>> of symbols in the name that are not allowed to be used by the user (I 
>>> thought "." is one of them).
>> 
>> Okay, but how to we distinguish functions for which there is a declaration 
>> and we need the mangling because the user has provided a definition 
>> elsewhere, from those for which there is a declaration, and we don't want 
>> mangling because we need to link to some system library?
> 
> The idea is, declarations inside begin/end declare variant are supposed to be 
> not affected by the begin/end declare variant. That is, if you have 
> declarations you cannot expect variant multi-versioning to happen. Having 
> declarations inside or outside the begin/end declare variant is still fine if 
> they all denote the same function.

Thanks, now I understand. This seems like it will work.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71179/new/

https://reviews.llvm.org/D71179



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71179: [OpenMP][WIP] Initial support for `begin/end declare variant`

2019-12-09 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71179#1775442 , @jdoerfert wrote:

> > @jdoerfert , how does the ".ompvariant" work with external functions? I see 
> > the part of the spec which says, "The symbol name of a function definition 
> > that appears between a begin declare variant...", but, if we append this 
> > name to, for example, the names of functions present in the device math 
> > library, won't we have a problem with linking?
>
> We restricted it for now to function definitions so we don't need to define 
> the mangling as you cannot expect linking. (I did this to get it in TR8 while 
> I figured it will solve all our math.h problems already).
>  However, we need to avoid collisions with user code, e.g., through the use 
> of symbols in the name that are not allowed to be used by the user (I thought 
> "." is one of them).


Okay, but how to we distinguish functions for which there is a declaration and 
we need the mangling because the user has provided a definition elsewhere, from 
those for which there is a declaration, and we don't want mangling because we 
need to link to some system library?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71179/new/

https://reviews.llvm.org/D71179



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71179: [OpenMP][WIP] Initial support for `begin/end declare variant`

2019-12-09 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71179#1775157 , @ABataev wrote:

> In D71179#1775066 , @hfinkel wrote:
>
> > In D71179#1774678 , @ABataev wrote:
> >
> > > In D71179#1774487 , @jdoerfert 
> > > wrote:
> > >
> > > > In D71179#1774471 , @ABataev 
> > > > wrote:
> > > >
> > > > > They do this because they have several function definitions with the 
> > > > > same name. In our case, we have several different functions with 
> > > > > different names and for us no need to worry about overloading 
> > > > > resolution, the compiler will do everything for us.
> > > >
> > > >
> > > > I think we talk past each other again. This is the implementation of 
> > > > `omp begin/end declare variant` as described in TR8. Bt definition, the 
> > > > new variant mechanism will result in several different function 
> > > > definitions with the same name. See the two tests for examples.
> > >
> > >
> > > I just don't get it. If begin/end is just a something like 
> > > #ifdef...endif, why you just can't skip everything between begin/end if 
> > > the context does not match?
> >
> >
> > The patch does this (see in ParseOpenMP.cpp where I asked about the 
> > potential inf-loop). But when the definitions are not skipped, then we have 
> > to worry about having multiple decls/defs of the same name and the overload 
> > priorities.
>
>
> I would recommend to drop all this extra stuff from the patch and focus on 
> the initial patch. We'll need something similar to multiversion in case of 
> the construct context selectors, but at first we need to solve all the 
> problems with the simple versions of the construct rather that try to solve 
> all the problems in the world in one patch. It is almost impossible to review.


I agree. We should split this into several patches (e.g., basic handling, 
skipping parsing for incompatible selectors, overload things). I think that 
@jdoerfert posted this so that people can see the high-level direction and 
provide feedback (including feedback on how to stage the functionality for 
review).

@jdoerfert , also, do we have tests that can go into the test suite / 
libomptarget regression tests demonstrating the collection of problems people 
have currently opened bugs on regarding math.h? I recall we still had problems 
with host code needing the long-double overloads, with constants from the 
system headers, etc.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71179/new/

https://reviews.llvm.org/D71179



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71179: [OpenMP][WIP] Initial support for `begin/end declare variant`

2019-12-09 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71179#1774678 , @ABataev wrote:

> In D71179#1774487 , @jdoerfert wrote:
>
> > In D71179#1774471 , @ABataev wrote:
> >
> > > They do this because they have several function definitions with the same 
> > > name. In our case, we have several different functions with different 
> > > names and for us no need to worry about overloading resolution, the 
> > > compiler will do everything for us.
> >
> >
> > I think we talk past each other again. This is the implementation of `omp 
> > begin/end declare variant` as described in TR8. Bt definition, the new 
> > variant mechanism will result in several different function definitions 
> > with the same name. See the two tests for examples.
>
>
> I just don't get it. If begin/end is just a something like #ifdef...endif, 
> why you just can't skip everything between begin/end if the context does not 
> match?


The patch does this (see in ParseOpenMP.cpp where I asked about the potential 
inf-loop). But when the definitions are not skipped, then we have to worry 
about having multiple decls/defs of the same name and the overload priorities.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71179/new/

https://reviews.llvm.org/D71179



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71179: [OpenMP][WIP] Initial support for `begin/end declare variant`

2019-12-08 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D71179#1774487 , @jdoerfert wrote:

> In D71179#1774471 , @ABataev wrote:
>
> > They do this because they have several function definitions with the same 
> > name. In our case, we have several different functions with different names 
> > and for us no need to worry about overloading resolution, the compiler will 
> > do everything for us.
>
>
> I think we talk past each other again. This is the implementation of `omp 
> begin/end declare variant` as described in TR8. Bt definition, the new 
> variant mechanism will result in several different function definitions with 
> the same name. See the two tests for examples.


The intent of this feature is to allow us to include the device-function 
headers and the system headers simultaneously, giving preference to the device 
functions when compiling for the device, thus fixing a number of outstanding 
math.h OpenMP offloading problems. This definitely means that we'll have 
multiple functions with the same name and we need to pick the right ones during 
overload resolution.

@jdoerfert , how does the ".ompvariant" work with external functions? I see the 
part of the spec which says, "The symbol name of a function definition that 
appears between a begin declare variant...", but, if we append this name to, 
for example, the names of functions present in the device math library, won't 
we have a problem with linking?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71179/new/

https://reviews.llvm.org/D71179



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71179: [OpenMP][WIP] Initial support for `begin/end declare variant`

2019-12-08 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: clang/lib/Headers/openmp_wrappers/__clang_openmp_math_declares.h:17
 
 #if defined(__NVPTX__) && defined(_OPENMP)
 

Should we use a more-specific selector and then get rid of this `__NVPTX__` 
check?



Comment at: clang/lib/Parse/ParseOpenMP.cpp:1489
+++Nesting;
+} while (Nesting);
+

Will this just inf-loop if the file ends?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71179/new/

https://reviews.llvm.org/D71179



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69897: Add #pragma clang loop aligned

2019-11-06 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: clang/docs/LanguageExtensions.rst:3135
+
+This predicates all the array references inside the loop to be aligned. The 
aligned access to them can increase fetch time and increase the performance. 
+

hfinkel wrote:
> lebedev.ri wrote:
> > What does this actually mean, in the end?
> > Will it assume that whatever alignment is required
> > for whatever vectorization width chosen,
> > is the actual alignment?
> Also, just arrays, or also pointers?
> Currently, it is using 32 bit.

You'll need to use the preferred alignment of the element type.

> By pointers, if you mean pointers to arrays? Then Yes.

No, I mean pointers.

What do you intend for these cases:

  double A[Size], B[Size];

  void foo() {
  #pragma clang loop aligned
for (int i = 0; i < n; ++i) {
  A[i] = B[i] + 1;
}
  }

or this:

  void foo(double *A, double *B) {
  #pragma clang loop aligned
for (int i = 0; i < n; ++i) {
  A[i] = B[i] + 1;
}
  }

or this:

  void foo(double *A, double *B, double *C) {
  #pragma clang loop aligned
for (int i = 0; i < n; ++i) {
  A[i] = B[i] + *C + 1;
}
  }

or this:

  void foo(double *A, double *B) {
  #pragma clang loop aligned
for (int i = 0; i < n; ++i) {
  *A++ = *B++ + 1;
}
  }

what about arrays of structs?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69897/new/

https://reviews.llvm.org/D69897



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69897: Add #pragma clang loop aligned

2019-11-06 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: clang/docs/LanguageExtensions.rst:3135
+
+This predicates all the array references inside the loop to be aligned. The 
aligned access to them can increase fetch time and increase the performance. 
+

lebedev.ri wrote:
> What does this actually mean, in the end?
> Will it assume that whatever alignment is required
> for whatever vectorization width chosen,
> is the actual alignment?
Also, just arrays, or also pointers?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69897/new/

https://reviews.llvm.org/D69897



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69391: Add #pragma clang loop ivdep

2019-10-25 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

This is the same as #pragma clang loop vectorize(assume_safety)?

In D69391#1720845 , @xbolva00 wrote:

> "#pragma ivdep" should work too (compatibiluty mode with intel, gcc).


The semantics are not the same, unfortunately.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69391/new/

https://reviews.llvm.org/D69391



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69272: Restricted variant of '#pragma STDC FENV_ACCESS'

2019-10-22 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:890
+  "cannot apply to inline functions, ignoring pragma">,
+   InGroup;
 

sepavloff wrote:
> hfinkel wrote:
> > sepavloff wrote:
> > > hfinkel wrote:
> > > > andrew.w.kaylor wrote:
> > > > > rjmccall wrote:
> > > > > > What's the purpose of this restriction?  Whether `inline` really 
> > > > > > has much to do with inlining depends a lot on the exact language 
> > > > > > settings.  (Also, even if this restriction were okay, the 
> > > > > > diagnostic is quite bad given that there are three separate 
> > > > > > conditions that can lead to it firing.)
> > > > > > 
> > > > > > Also, I thought we were adding instruction-level annotations for 
> > > > > > this kind of thing to LLVM IR.  Was that not in service of 
> > > > > > implementing this pragma?
> > > > > > 
> > > > > > I'm not categorically opposed to taking patches that only partially 
> > > > > > implement a feature, but I do want to feel confident that there's a 
> > > > > > reasonable technical path forward to the full implementation.  In 
> > > > > > this case, it feels like the function-level attribute is a dead end 
> > > > > > technically.
> > > > > I'm guessing this is intended to avoid the optimization problems that 
> > > > > would occur (currently) if a function with strictfp were inlined into 
> > > > > a function without it. I'm just guessing though, so correct me if I'm 
> > > > > wrong.
> > > > > 
> > > > > As I've said elsewhere, I hope this is a temporary problem. It is a 
> > > > > real problem though (as is the fact that the inliner isn't currently 
> > > > > handling this case correctly).
> > > > > 
> > > > > What would you think of a new command line option that caused us to 
> > > > > mark functions with strictfp as noinline? We'd still need an error 
> > > > > somewhat like this, but I feel like that would be more likely to 
> > > > > accomplish what we want on a broad scale.
> > > > We would not want to prevent all inlining, just inlining where the 
> > > > attributes don't match. We should fix his first. I think we just need 
> > > > to add a CompatRule to include/llvm/IR/Attributes.td (or something like 
> > > > that).
> > > As Andrew already said, `noinline` attribute is a mean to limit negative 
> > > performance impact. Of course, to inline or not to inline - such decision 
> > > is made by backend. However it a user requested a function to be inline, 
> > > a warning looks useful.
> > > 
> > > When constrained intrinsics get full support in optimizations, this 
> > > restriction will become unnecessary.
> > > 
> > > Outlining is one of the ways that converts this solution into full pragma 
> > > implementation. Another is implementation of constrained intrinsics 
> > > support in optimization transformations.
> > > 
> > > As for a new command line option that caused us to mark functions with 
> > > strictfp as noinline, it loos a good idea, but we must adapt inliner 
> > > first, so that it can convert ordinary floating operations to constrained 
> > > intrinsics during inlining.
> > > 
> > > In the case of `#pragma STDC FENV_ACCESS` we cannot in general say if 
> > > attributes are compatible so a function can be inlined into another. The 
> > > pragma only says that user modified floating point environment. One 
> > > function may set only rounding mode and another use different exception 
> > > handling, in this case we cannot do inlining. More fine grained pragmas, 
> > > like that proposed in https://reviews.llvm.org/D65997 could enable more 
> > > flexible inlining.
> > > Of course, to inline or not to inline - such decision is made by backend. 
> > > However it a user requested a function to be inline, a warning looks 
> > > useful.
> > 
> > We need to be careful. inline is not just an optimizaiton hint. It also 
> > affects function linkage. Also, when inlining into functions with similar 
> > constraints, there's no problem with the inlining.
> > 
> > > One function may set only rounding mode and another use different 
> > > exception handling, in this case we cannot do inlining.
> > 
> > Can you please explain this? It does not seem like that should block 
> > inlining when both the caller and callee are marked as fenv_access-enabled.
> > We need to be careful. inline is not just an optimizaiton hint. It also 
> > affects function linkage. Also, when inlining into functions with similar 
> > constraints, there's no problem with the inlining.
> This is an argument in favor of the warning on conflicting attributes.
> 
> >> One function may set only rounding mode and another use different 
> >> exception handling, in this case we cannot do inlining.
> >Can you please explain this? It does not seem like that should block 
> >inlining when both the caller and callee are marked as fenv_access-enabled. 
> 
> I was wrong. If a function correctly restores FP state, it can be inlined 
> into 

[PATCH] D69272: Restricted variant of '#pragma STDC FENV_ACCESS'

2019-10-22 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

> I don't see how such warning can help a user. A note about impact of the 
> pragma on performance can be put into documentation. Issuing a warning on 
> every use of the pragma may be annoying.

I definitely agree. Performance may be fine in many cases, and performance may 
not be *relatively* important where the pragma is used. I don't believe that a 
warning should be added for this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69272/new/

https://reviews.llvm.org/D69272



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D63607: [clang][driver] Add basic --driver-mode=flang support for fortran

2019-10-22 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

In D63607#1709258 , @peterwaller-arm 
wrote:

> Friendly ping to everybody watching. I'd like to get this in soon if possible.
>
> Hal - do you think this is close to being accepted? Note that I consider this 
> "the beginning" rather than "the end", since there will be more functionality 
> to add piecewise before this is fully functional. In the meantime, since it 
> is new functionality, it should not break anything.


One comment on the test, but otherwise LGTM.




Comment at: clang/test/Driver/flang/flang-not-installed.f90:11
+! shell syntax.
+! UNSUPPORTED: windows
+

I believe that you can write:

  REQUIRES: shell

for this.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63607/new/

https://reviews.llvm.org/D63607



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69272: Restricted variant of '#pragma STDC FENV_ACCESS'

2019-10-22 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:890
+  "cannot apply to inline functions, ignoring pragma">,
+   InGroup;
 

sepavloff wrote:
> hfinkel wrote:
> > andrew.w.kaylor wrote:
> > > rjmccall wrote:
> > > > What's the purpose of this restriction?  Whether `inline` really has 
> > > > much to do with inlining depends a lot on the exact language settings.  
> > > > (Also, even if this restriction were okay, the diagnostic is quite bad 
> > > > given that there are three separate conditions that can lead to it 
> > > > firing.)
> > > > 
> > > > Also, I thought we were adding instruction-level annotations for this 
> > > > kind of thing to LLVM IR.  Was that not in service of implementing this 
> > > > pragma?
> > > > 
> > > > I'm not categorically opposed to taking patches that only partially 
> > > > implement a feature, but I do want to feel confident that there's a 
> > > > reasonable technical path forward to the full implementation.  In this 
> > > > case, it feels like the function-level attribute is a dead end 
> > > > technically.
> > > I'm guessing this is intended to avoid the optimization problems that 
> > > would occur (currently) if a function with strictfp were inlined into a 
> > > function without it. I'm just guessing though, so correct me if I'm wrong.
> > > 
> > > As I've said elsewhere, I hope this is a temporary problem. It is a real 
> > > problem though (as is the fact that the inliner isn't currently handling 
> > > this case correctly).
> > > 
> > > What would you think of a new command line option that caused us to mark 
> > > functions with strictfp as noinline? We'd still need an error somewhat 
> > > like this, but I feel like that would be more likely to accomplish what 
> > > we want on a broad scale.
> > We would not want to prevent all inlining, just inlining where the 
> > attributes don't match. We should fix his first. I think we just need to 
> > add a CompatRule to include/llvm/IR/Attributes.td (or something like that).
> As Andrew already said, `noinline` attribute is a mean to limit negative 
> performance impact. Of course, to inline or not to inline - such decision is 
> made by backend. However it a user requested a function to be inline, a 
> warning looks useful.
> 
> When constrained intrinsics get full support in optimizations, this 
> restriction will become unnecessary.
> 
> Outlining is one of the ways that converts this solution into full pragma 
> implementation. Another is implementation of constrained intrinsics support 
> in optimization transformations.
> 
> As for a new command line option that caused us to mark functions with 
> strictfp as noinline, it loos a good idea, but we must adapt inliner first, 
> so that it can convert ordinary floating operations to constrained intrinsics 
> during inlining.
> 
> In the case of `#pragma STDC FENV_ACCESS` we cannot in general say if 
> attributes are compatible so a function can be inlined into another. The 
> pragma only says that user modified floating point environment. One function 
> may set only rounding mode and another use different exception handling, in 
> this case we cannot do inlining. More fine grained pragmas, like that 
> proposed in https://reviews.llvm.org/D65997 could enable more flexible 
> inlining.
> Of course, to inline or not to inline - such decision is made by backend. 
> However it a user requested a function to be inline, a warning looks useful.

We need to be careful. inline is not just an optimizaiton hint. It also affects 
function linkage. Also, when inlining into functions with similar constraints, 
there's no problem with the inlining.

> One function may set only rounding mode and another use different exception 
> handling, in this case we cannot do inlining.

Can you please explain this? It does not seem like that should block inlining 
when both the caller and callee are marked as fenv_access-enabled.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69272/new/

https://reviews.llvm.org/D69272



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69272: Restricted variant of '#pragma STDC FENV_ACCESS'

2019-10-21 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:890
+  "cannot apply to inline functions, ignoring pragma">,
+   InGroup;
 

andrew.w.kaylor wrote:
> rjmccall wrote:
> > What's the purpose of this restriction?  Whether `inline` really has much 
> > to do with inlining depends a lot on the exact language settings.  (Also, 
> > even if this restriction were okay, the diagnostic is quite bad given that 
> > there are three separate conditions that can lead to it firing.)
> > 
> > Also, I thought we were adding instruction-level annotations for this kind 
> > of thing to LLVM IR.  Was that not in service of implementing this pragma?
> > 
> > I'm not categorically opposed to taking patches that only partially 
> > implement a feature, but I do want to feel confident that there's a 
> > reasonable technical path forward to the full implementation.  In this 
> > case, it feels like the function-level attribute is a dead end technically.
> I'm guessing this is intended to avoid the optimization problems that would 
> occur (currently) if a function with strictfp were inlined into a function 
> without it. I'm just guessing though, so correct me if I'm wrong.
> 
> As I've said elsewhere, I hope this is a temporary problem. It is a real 
> problem though (as is the fact that the inliner isn't currently handling this 
> case correctly).
> 
> What would you think of a new command line option that caused us to mark 
> functions with strictfp as noinline? We'd still need an error somewhat like 
> this, but I feel like that would be more likely to accomplish what we want on 
> a broad scale.
We would not want to prevent all inlining, just inlining where the attributes 
don't match. We should fix his first. I think we just need to add a CompatRule 
to include/llvm/IR/Attributes.td (or something like that).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69272/new/

https://reviews.llvm.org/D69272



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D68824: Fix __builtin_assume_aligned with too large values.

2019-10-10 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:2854
+def warn_assume_align_too_big : Warning<"alignments greater than 1 << 29 are "
+  "unsupported by LLVM">, InGroup;
 

Should this be in the IgnoredAttributes group? The builtin is not an attribute.

Also, I'd rather that the wording were similar to that used by 
err_attribute_aligned_too_great and other errors, and use 268435456 (which is 
what we get for MaxValidAlignment based on LLVM restrictions) instead of '1 << 
29'.





Comment at: clang/lib/Sema/SemaChecking.cpp:6066
  << Arg->getSourceRange();
+if (Result > (1 << 29ULL))
+  Diag(TheCall->getBeginLoc(), diag::warn_assume_align_too_big)

lebedev.ri wrote:
> `(1ULL << 29ULL)`.
> Is there some define for this somwhere, don't like magic numbers.
Yep, there's Value::MaximumAlignment.

We have:


  // Alignment calculations can wrap around if it's greater than 2**28.
  unsigned MaxValidAlignment =
  Context.getTargetInfo().getTriple().isOSBinFormatCOFF() ? 8192
  : 268435456;

in Sema::AddAlignedAttr. which also has the magic-number problem.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68824/new/

https://reviews.llvm.org/D68824



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64943: [Clang][OpenMP offload] Eliminate use of OpenMP linker script

2019-09-25 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D64943#1683368 , @JonChesterfield 
wrote:

> The three way split looks great, thanks.


Makes sense to me.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64943/new/

https://reviews.llvm.org/D64943



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64943: [Clang][OpenMP offload] Eliminate use of OpenMP linker script

2019-09-25 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

This LGTM. I'm happy that this is a design improvement over the current scheme. 
@JonChesterfield , @ABataev , any further comments?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64943/new/

https://reviews.llvm.org/D64943



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D63607: [clang][driver] Add basic --driver-mode=flang support for fortran

2019-09-18 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

> One thought that occurs to me when reviewing this. I think we will assume 
> that F18 /flang when it lands in the LLVM 
> project will be an optional thing to build and will be an opt-in project at 
> the start that is off by default. What happens when clang --driver-mode=flang 
> is called and flang has not been built? And where would we put a test for 
> that behaviour? If flang were not to be built, should --driver-mode=flang be 
> unsupported/unrecognised and emit an error message?

The Clang tests should not actually run flang regardless - they should just 
test the command line that should be run. An error might be emitted if "flang" 
isn't found in the relevant search paths, however, that makes sense to me 
(although we might override this for the purpose of the tests?).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63607/new/

https://reviews.llvm.org/D63607



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D63607: [clang][driver] Add basic --driver-mode=fortran support for flang

2019-09-10 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: clang/lib/Driver/ToolChains/Flang.cpp:73
+  } else {
+assert(Output.isNothing() && "Input output.");
+  }

Should this say "Invalid output"?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63607/new/

https://reviews.llvm.org/D63607



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66490: [NewPM] Enable the New Pass Manager by Default in Clang

2019-08-20 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D66490#1638162 , @rupprecht wrote:

> We already know that we don't want this enabled for tsan builds due to 
> https://bugs.llvm.org/show_bug.cgi?id=42877, but I don't even know if anyone 
> else will hit it (it's only when building one particular library).


Under the circumstances, that seems like one particular library too many. 
PR42877 looks like a generic bug, so if we're hitting it here, I see no reason 
to suspect that others would not hit it elsewhere.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66490/new/

https://reviews.llvm.org/D66490



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D65835: [OpenMP] Fix map/is_device_ptr with DSA on combined directive

2019-08-09 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D65835#1623903 , @ABataev wrote:

> In D65835#1623883 , @hfinkel wrote:
>
> > In D65835#1623814 , @hfinkel wrote:
> >
> > > In D65835#1623772 , @ABataev 
> > > wrote:
> > >
> > > > In D65835#1623756 , @kkwli0 
> > > > wrote:
> > > >
> > > > > In D65835#1622042 , @ABataev 
> > > > > wrote:
> > > > >
> > > > > > > I want to be sure we're on the same page: For OpenMP 5.0, should 
> > > > > > > we allow is_device_ptr with the private clauses?
> > > > > >
> > > > > > Yes, since it is allowed by the standard.
> > > > >
> > > > >
> > > > > Umm ... I probably missed some earlier discussions!  What would be 
> > > > > the behavior of the following code?
> > > > >
> > > > >   p = omp_target_alloc(...);
> > > > >   #pragma omp target private(p) is_device_ptr(p)
> > > > > p[...] = ...;   // crash or not?
> > > > >
> > > >
> > > >
> > > > It must crush, I assume. The main problem is that this construct is 
> > > > allowed by the standard.
> > >
> > >
> > > Yep. We should add a warning message for it.
> >
> >
> > Upon further reflection, this is not clearly allowed by the standard. My 
> > experience is that, when reading standards, sometimes things are disallowed 
> > by contradiction (i.e., the standard does not define some behavior, and 
> > what the standard does say that's relevant is self contradictory). In this 
> > case, 2.19.3 says that list items which are privatized (and which are used) 
> > undergo replacement (with new items created as specified) while 2.12.5 says 
> > that "The is_device_ptr clause is used to indicate that a list item is a 
> > device pointer already in the device data environment and that it should be 
> > used directly." A given list item cannot simultaneously be "used directly" 
> > (2.12.5) and also undergo replacement: "Inside the construct, all 
> > references to the original list item are replaced by references to a new 
> > list item received by the task or SIMD lane" (2.19.3). Thus, it may be 
> > disallowed.
>
>
> I think, this combination is still allowed. I assume that
>
>   #Pragma omp target parallel is_device_ptr(a) (a)
>
>
> is the same as
>
>   #pragma omp target is_device_ptr(a)
>   #pragma omp parallel (a)
>
>
> i.e. datasharing clauses are applied to inner sub-regions, not the target 
> region itself.


With the parallel, that makes sense to me. In that case, however, I'd imagine 
that the privitization works as normal and the code wouldn't crash. Right?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65835/new/

https://reviews.llvm.org/D65835



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D65835: [OpenMP] Fix map/is_device_ptr with DSA on combined directive

2019-08-09 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D65835#1623814 , @hfinkel wrote:

> In D65835#1623772 , @ABataev wrote:
>
> > In D65835#1623756 , @kkwli0 wrote:
> >
> > > In D65835#1622042 , @ABataev 
> > > wrote:
> > >
> > > > > I want to be sure we're on the same page: For OpenMP 5.0, should we 
> > > > > allow is_device_ptr with the private clauses?
> > > >
> > > > Yes, since it is allowed by the standard.
> > >
> > >
> > > Umm ... I probably missed some earlier discussions!  What would be the 
> > > behavior of the following code?
> > >
> > >   p = omp_target_alloc(...);
> > >   #pragma omp target private(p) is_device_ptr(p)
> > > p[...] = ...;   // crash or not?
> > >
> >
> >
> > It must crush, I assume. The main problem is that this construct is allowed 
> > by the standard.
>
>
> Yep. We should add a warning message for it.


Upon further reflection, this is not clearly allowed by the standard. My 
experience is that, when reading standards, sometimes things are disallowed by 
contradiction (i.e., the standard does not define some behavior, and what the 
standard does say that's relevant is self contradictory). In this case, 2.19.3 
says that list items which are privatized (and which are used) undergo 
replacement (with new items created as specified) while 2.12.5 says that "The 
is_device_ptr clause is used to indicate that a list item is a device pointer 
already in the device data environment and that it should be used directly." A 
given list item cannot simultaneously be "used directly" (2.12.5) and also 
undergo replacement: "Inside the construct, all references to the original list 
item are replaced by references to a new list item received by the task or SIMD 
lane" (2.19.3). Thus, it may be disallowed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65835/new/

https://reviews.llvm.org/D65835



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D65835: [OpenMP] Fix map/is_device_ptr with DSA on combined directive

2019-08-09 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D65835#1623772 , @ABataev wrote:

> In D65835#1623756 , @kkwli0 wrote:
>
> > In D65835#1622042 , @ABataev wrote:
> >
> > > > I want to be sure we're on the same page: For OpenMP 5.0, should we 
> > > > allow is_device_ptr with the private clauses?
> > >
> > > Yes, since it is allowed by the standard.
> >
> >
> > Umm ... I probably missed some earlier discussions!  What would be the 
> > behavior of the following code?
> >
> >   p = omp_target_alloc(...);
> >   #pragma omp target private(p) is_device_ptr(p)
> > p[...] = ...;   // crash or not?
> >
>
>
> It must crush, I assume. The main problem is that this construct is allowed 
> by the standard.


Yep. We should add a warning message for it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65835/new/

https://reviews.llvm.org/D65835



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D65835: [OpenMP] Fix map/is_device_ptr with DSA on combined directive

2019-08-08 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: clang/test/OpenMP/target_parallel_for_is_device_ptr_messages.cpp:93
   ;
-#pragma omp target parallel for private(ps) is_device_ptr(ps) // 
expected-error{{private variable cannot be in a is_device_ptr clause in 
'#pragma omp target parallel for' directive}} expected-note{{defined as 
private}}
+#pragma omp target parallel for private(ps) is_device_ptr(ps)
 for (int ii=0; ii<10; ii++)

ABataev wrote:
> hfinkel wrote:
> > ABataev wrote:
> > > hfinkel wrote:
> > > > ABataev wrote:
> > > > > jdoerfert wrote:
> > > > > > ABataev wrote:
> > > > > > > jdoerfert wrote:
> > > > > > > > ABataev wrote:
> > > > > > > > > jdoerfert wrote:
> > > > > > > > > > ABataev wrote:
> > > > > > > > > > > jdoerfert wrote:
> > > > > > > > > > > > I think this should cause an error or at least a 
> > > > > > > > > > > > warning. Telling the compiler `ps` is a device pointer 
> > > > > > > > > > > > only to create a local uninitialized shadowing variable 
> > > > > > > > > > > > seems like an error to me.
> > > > > > > > > > > It is allowed according to OpenMP 5.0. Private copy must 
> > > > > > > > > > > be created in the context of the parallel region, not the 
> > > > > > > > > > > target region. So, for OpenMP 5.0 we should not emit 
> > > > > > > > > > > error here.
> > > > > > > > > > What does that mean and how does that affect my reasoning?
> > > > > > > > > It means, that for OpenMP 5.0 we should emit a warning/error 
> > > > > > > > > here. It is allowed according to the standard, we should 
> > > > > > > > > allow it too.
> > > > > > > > > So, for OpenMP 5.0 we should not emit error here.
> > > > > > > > > that for OpenMP 5.0 we should emit a warning/error here.
> > > > > > > > 
> > > > > > > > The last answer contradicts what you said earlier. I expect 
> > > > > > > > there is a *not* missing, correct?
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Assuming you do not want an error, which is fine, I still think 
> > > > > > > > a warning is appropriate as it seems to me there is never a 
> > > > > > > > reason to have a `is_device_ptr` clause for a private value. I 
> > > > > > > > mean, it is either a bug by the programmer, e.g., 5 letters of 
> > > > > > > > `firstprivate` went missing, or simply nonsensical code for 
> > > > > > > > which we warn in other situations as well.
> > > > > > > Missed `not`.
> > > > > > > These kind of construct are explicitly allowed in OpenMP. And we 
> > > > > > > should obey the standard unconditionally.
> > > > > > > Plus, there might be situations where user may require it 
> > > > > > > explicitly. For example, the device pointer is dereferenced in 
> > > > > > > one of the clauses for the subregions but in the deeper subregion 
> > > > > > > it might be used as a private pointer. Why we should emit a 
> > > > > > > warning here?
> > > > > > If you have a different situation, e.g., the one you describe, you 
> > > > > > should not have a warning. Though, that is not the point. If you 
> > > > > > have the situation above (single directive), as per my reasoning, 
> > > > > > there doesn't seem to be a sensible use case. If you have one and 
> > > > > > we should create an explicit test for it.
> > > > > > 
> > > > > > > These kind of construct are explicitly allowed in OpenMP.
> > > > > > `explicitly allowed` != `not forbidded` (yet)
> > > > > > > And we should obey the standard unconditionally.
> > > > > > Nobody says we should not. We warn people all the time even if it 
> > > > > > is valid code.
> > > > > Warnings may prevent successful compilation in some cases, e.g. when 
> > > > > warnings are treated as errors. Why we should emit a warning if the 
> > > > > construct is allowed by the standard? Ask to change the standard if 
> > > > > you did not agree with it.
> > > > Warnings are specifically for constructs which are legal, but likely 
> > > > wrong (i.e., will not do what the user likely intended). Treating 
> > > > warnings as errors is not a conforming compilation mode - by design 
> > > > (specifically because that will reject conforming programs). Thus...
> > > > 
> > > > > Why we should emit a warning if the construct is allowed by the 
> > > > > standard? Ask to change the standard if you did not agree with it.
> > > > 
> > > > This is the wrong way to approach this. Warnings are specifically for 
> > > > legal code. They help users prevent errors, however, in cases where 
> > > > that legal code is likely problematic or won't do what the user intends.
> > > > 
> > > Ok, we could emit wqrnings in some cases. But better to do it in the 
> > > separate patches. Each particular case requires additional analysis.
> > > 
> > > > This is the wrong way to approach this.
> > > 
> > > I don't think so. If some cases are really meaningless, better to ask to 
> > > prohibit them in the standard. It is always a good idea to change the 
> > > requirements first, if you think that some 

[PATCH] D65835: [OpenMP] Fix map/is_device_ptr with DSA on combined directive

2019-08-08 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: clang/test/OpenMP/target_parallel_for_is_device_ptr_messages.cpp:93
   ;
-#pragma omp target parallel for private(ps) is_device_ptr(ps) // 
expected-error{{private variable cannot be in a is_device_ptr clause in 
'#pragma omp target parallel for' directive}} expected-note{{defined as 
private}}
+#pragma omp target parallel for private(ps) is_device_ptr(ps)
 for (int ii=0; ii<10; ii++)

ABataev wrote:
> hfinkel wrote:
> > ABataev wrote:
> > > jdoerfert wrote:
> > > > ABataev wrote:
> > > > > jdoerfert wrote:
> > > > > > ABataev wrote:
> > > > > > > jdoerfert wrote:
> > > > > > > > ABataev wrote:
> > > > > > > > > jdoerfert wrote:
> > > > > > > > > > I think this should cause an error or at least a warning. 
> > > > > > > > > > Telling the compiler `ps` is a device pointer only to 
> > > > > > > > > > create a local uninitialized shadowing variable seems like 
> > > > > > > > > > an error to me.
> > > > > > > > > It is allowed according to OpenMP 5.0. Private copy must be 
> > > > > > > > > created in the context of the parallel region, not the target 
> > > > > > > > > region. So, for OpenMP 5.0 we should not emit error here.
> > > > > > > > What does that mean and how does that affect my reasoning?
> > > > > > > It means, that for OpenMP 5.0 we should emit a warning/error 
> > > > > > > here. It is allowed according to the standard, we should allow it 
> > > > > > > too.
> > > > > > > So, for OpenMP 5.0 we should not emit error here.
> > > > > > > that for OpenMP 5.0 we should emit a warning/error here.
> > > > > > 
> > > > > > The last answer contradicts what you said earlier. I expect there 
> > > > > > is a *not* missing, correct?
> > > > > > 
> > > > > > 
> > > > > > Assuming you do not want an error, which is fine, I still think a 
> > > > > > warning is appropriate as it seems to me there is never a reason to 
> > > > > > have a `is_device_ptr` clause for a private value. I mean, it is 
> > > > > > either a bug by the programmer, e.g., 5 letters of `firstprivate` 
> > > > > > went missing, or simply nonsensical code for which we warn in other 
> > > > > > situations as well.
> > > > > Missed `not`.
> > > > > These kind of construct are explicitly allowed in OpenMP. And we 
> > > > > should obey the standard unconditionally.
> > > > > Plus, there might be situations where user may require it explicitly. 
> > > > > For example, the device pointer is dereferenced in one of the clauses 
> > > > > for the subregions but in the deeper subregion it might be used as a 
> > > > > private pointer. Why we should emit a warning here?
> > > > If you have a different situation, e.g., the one you describe, you 
> > > > should not have a warning. Though, that is not the point. If you have 
> > > > the situation above (single directive), as per my reasoning, there 
> > > > doesn't seem to be a sensible use case. If you have one and we should 
> > > > create an explicit test for it.
> > > > 
> > > > > These kind of construct are explicitly allowed in OpenMP.
> > > > `explicitly allowed` != `not forbidded` (yet)
> > > > > And we should obey the standard unconditionally.
> > > > Nobody says we should not. We warn people all the time even if it is 
> > > > valid code.
> > > Warnings may prevent successful compilation in some cases, e.g. when 
> > > warnings are treated as errors. Why we should emit a warning if the 
> > > construct is allowed by the standard? Ask to change the standard if you 
> > > did not agree with it.
> > Warnings are specifically for constructs which are legal, but likely wrong 
> > (i.e., will not do what the user likely intended). Treating warnings as 
> > errors is not a conforming compilation mode - by design (specifically 
> > because that will reject conforming programs). Thus...
> > 
> > > Why we should emit a warning if the construct is allowed by the standard? 
> > > Ask to change the standard if you did not agree with it.
> > 
> > This is the wrong way to approach this. Warnings are specifically for legal 
> > code. They help users prevent errors, however, in cases where that legal 
> > code is likely problematic or won't do what the user intends.
> > 
> Ok, we could emit wqrnings in some cases. But better to do it in the separate 
> patches. Each particular case requires additional analysis.
> 
> > This is the wrong way to approach this.
> 
> I don't think so. If some cases are really meaningless, better to ask to 
> prohibit them in the standard. It is always a good idea to change the 
> requirements first, if you think that some scenarios are not described 
> correctly rather than do the changes in the code. It leads to different 
> behavior of different compilers in the same situation and it is not good for 
> the users.
> I don't think so. If some cases are really meaningless, better to ask to 
> prohibit them in the standard. It is always a good idea to change the 
> requirements first, if you 

[PATCH] D65835: [OpenMP] Fix map/is_device_ptr with DSA on combined directive

2019-08-07 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: clang/test/OpenMP/target_parallel_for_is_device_ptr_messages.cpp:93
   ;
-#pragma omp target parallel for private(ps) is_device_ptr(ps) // 
expected-error{{private variable cannot be in a is_device_ptr clause in 
'#pragma omp target parallel for' directive}} expected-note{{defined as 
private}}
+#pragma omp target parallel for private(ps) is_device_ptr(ps)
 for (int ii=0; ii<10; ii++)

ABataev wrote:
> jdoerfert wrote:
> > ABataev wrote:
> > > jdoerfert wrote:
> > > > ABataev wrote:
> > > > > jdoerfert wrote:
> > > > > > ABataev wrote:
> > > > > > > jdoerfert wrote:
> > > > > > > > I think this should cause an error or at least a warning. 
> > > > > > > > Telling the compiler `ps` is a device pointer only to create a 
> > > > > > > > local uninitialized shadowing variable seems like an error to 
> > > > > > > > me.
> > > > > > > It is allowed according to OpenMP 5.0. Private copy must be 
> > > > > > > created in the context of the parallel region, not the target 
> > > > > > > region. So, for OpenMP 5.0 we should not emit error here.
> > > > > > What does that mean and how does that affect my reasoning?
> > > > > It means, that for OpenMP 5.0 we should emit a warning/error here. It 
> > > > > is allowed according to the standard, we should allow it too.
> > > > > So, for OpenMP 5.0 we should not emit error here.
> > > > > that for OpenMP 5.0 we should emit a warning/error here.
> > > > 
> > > > The last answer contradicts what you said earlier. I expect there is a 
> > > > *not* missing, correct?
> > > > 
> > > > 
> > > > Assuming you do not want an error, which is fine, I still think a 
> > > > warning is appropriate as it seems to me there is never a reason to 
> > > > have a `is_device_ptr` clause for a private value. I mean, it is either 
> > > > a bug by the programmer, e.g., 5 letters of `firstprivate` went 
> > > > missing, or simply nonsensical code for which we warn in other 
> > > > situations as well.
> > > Missed `not`.
> > > These kind of construct are explicitly allowed in OpenMP. And we should 
> > > obey the standard unconditionally.
> > > Plus, there might be situations where user may require it explicitly. For 
> > > example, the device pointer is dereferenced in one of the clauses for the 
> > > subregions but in the deeper subregion it might be used as a private 
> > > pointer. Why we should emit a warning here?
> > If you have a different situation, e.g., the one you describe, you should 
> > not have a warning. Though, that is not the point. If you have the 
> > situation above (single directive), as per my reasoning, there doesn't seem 
> > to be a sensible use case. If you have one and we should create an explicit 
> > test for it.
> > 
> > > These kind of construct are explicitly allowed in OpenMP.
> > `explicitly allowed` != `not forbidded` (yet)
> > > And we should obey the standard unconditionally.
> > Nobody says we should not. We warn people all the time even if it is valid 
> > code.
> Warnings may prevent successful compilation in some cases, e.g. when warnings 
> are treated as errors. Why we should emit a warning if the construct is 
> allowed by the standard? Ask to change the standard if you did not agree with 
> it.
Warnings are specifically for constructs which are legal, but likely wrong 
(i.e., will not do what the user likely intended). Treating warnings as errors 
is not a conforming compilation mode - by design (specifically because that 
will reject conforming programs). Thus...

> Why we should emit a warning if the construct is allowed by the standard? Ask 
> to change the standard if you did not agree with it.

This is the wrong way to approach this. Warnings are specifically for legal 
code. They help users prevent errors, however, in cases where that legal code 
is likely problematic or won't do what the user intends.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65835/new/

https://reviews.llvm.org/D65835



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D65880: [Driver] Move LIBRARY_PATH before user inputs

2019-08-07 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM.

Local flags should certainly override LIBRARY_PATH.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65880/new/

https://reviews.llvm.org/D65880



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D65410: [PassManager] First Pass implementation at -O1 pass pipeline

2019-08-03 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

Thanks for starting on this. Can you go ahead and replace the sroa calls with 
mem2reg calls for `O1` and then see what that does to the performance? That 
strikes me as a major change, but certainly one that potentially makes sense, 
so I'd rather we go ahead and test it now before we make decisions about other 
adjustments.

FWIW, I thought that we might run InstCombine less often (or maybe replace it 
with InstSimplify, in some places). Did you try that?




Comment at: llvm/lib/Transforms/IPO/PassManagerBuilder.cpp:362
 
+  // TODO: Investigate the cost/benefit of tail call elimination on debugging.
   MPM.add(createTailCallEliminationPass()); // Eliminate tail calls

By definition, this loses information from the call stack, no?



Comment at: llvm/lib/Transforms/IPO/PassManagerBuilder.cpp:432
 
+  // TODO: Investigate if this is too expensive at O1.
   MPM.add(createAggressiveDCEPass()); // Delete dead instructions

Yes, I'd fall back to using regular DCE.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65410/new/

https://reviews.llvm.org/D65410



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64386: CodeGen: Use memset in initializers for non-zeros

2019-07-24 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: clang/test/CodeGen/init-memset.c:29
   char a[] = "aa";
-  // CHECK: call void @llvm.memcpy.{{.*}}
+  // CHECK: call void @llvm.memset.{{.*}}
   use(a);

I'd be more comfortable if the test here actually checked the length and the 
value in the memset call here and below.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64386/new/

https://reviews.llvm.org/D64386



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64067: [X86][PPC] Support -mlong-double-64

2019-07-18 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D64067#1592201 , @troyj wrote:

> Hi, we just inherited this commit at Cray when we did our latest upstream 
> merge and there are a few problems with it that I'd like to point out.  Sorry 
> that I was not part of the initial discussion here, but I didn't know that 
> this work was being done and I had already done it for x86 in our downstream 
> compiler a while ago.
>
> 1. This patch adds the -m options to f_Group, which is weird.  They should be 
> in m_Group since they are target-specific.  I actually implemented them as 
> part of a subgroup of m_Group so that I can take the last option from that 
> group that appears on the command line.
> 2. This patch lacks the GNU -mlong-double-80 option for x86.
> 3. Instead of tracking the size in clang/Basic/LangOptions.def, I track it in 
> clang/Basic/TargetOptions.h.  This is a target-specific thing afterall.
>
>   We're trying to resolve merge conflicts with this change and might be able 
> to submit a patch to help reduce our differences.  Would that be of interest?


Yes, that would be helpful.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64067/new/

https://reviews.llvm.org/D64067



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64850: Remove use of malloc in PPC mm_malloc.

2019-07-17 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64850/new/

https://reviews.llvm.org/D64850



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64283: [PowerPC] Support -mabi=ieeelongdouble and -mabi=ibmlongdouble

2019-07-14 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64283/new/

https://reviews.llvm.org/D64283



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64283: [PowerPC] Support -mabi=ieeelongdouble and -mabi=ibmlongdouble

2019-07-09 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

Ah, fun with overloaded, legacy command-line options...




Comment at: lib/Driver/ToolChains/Clang.cpp:1825
+  // that don't use the altivec abi.
+  if (!SeenOther)
+ABIName = A->getValue();

This seems like an unintentional behavior change on top of the existing 
behavior (which may have also been not quite right). As best I can tell, we're 
trying to set ABIName equal to the last ABI type specified, unless that type is 
"altivec", in which case the ABI name should be its default value (either a 
nullptr or something like "elfv2"). With this change, we'll now take the first 
ABI name seen, not the last (as we'll get SeenOther to true after the first 
name).

Maybe we should loop over the list multiple times, once to get the long-double 
format, once to get the basic ABI name (where we save the default, and then if 
the last name found is "altivec", then reset to the default)?



Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64283/new/

https://reviews.llvm.org/D64283



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64128: [CodeGen] Generate llvm.ptrmask instead of inttoptr(and(ptrtoint, C)) if possible.

2019-07-09 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

> Do we have any policies against using clang-only builtins in the codebase?

No, but obviously it will need to be protected with appropriate ifdefs and 
integrated in some maintainable fashion.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64128/new/

https://reviews.llvm.org/D64128



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64128: [CodeGen] Generate llvm.ptrmask instead of inttoptr(and(ptrtoint, C)) if possible.

2019-07-08 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D64128#1572504 , @rjmccall wrote:

> I would be opposed to any addition of a `-f` flag for this absent any 
> evidence that it's valuable for some actual C code; this patch appears to be 
> purely speculative.  I certainly don't think we should be adding it 
> default-on.  Your argument about alignment is what I was referring to when I 
> mentioned that this is enforcing alignment restrictions in places we 
> traditionally and intentionally haven't.


My underlying thought here is: The more we generate a particular IR construct 
the more quickly we'll find the bugs and the better we'll end up optimizing it. 
Thus, I found the idea behind this patch appealing. My experience is also that 
providing more pointer underlying-object information tends to help code 
quality. However, I definitely see your point that this is potentially risky: 
bugs uncovered by generating this intrinsic based on this code pattern might 
reveal only code doing interesting things with pointer bits and not actual 
optimizer bugs. I think that we have different feelings on the magnitude of 
that risk, but I definitely glad that you weighed in here.

I suggest that we drop this unless and until we have benchmark data showing it 
to be useful. If we do, then we can consider an off-by-default flag. It will 
also be interesting to explore adding a Clang intrinsic.

> Note that it won't help Clang's major uses of `PointerIntPair` and 
> `PointerUnion` because our traits say that types like `Decl` and `Type` have 
> more alignment bits than their formal alignment would indicate, and 
> `PointerIntPair` occupies alignment bits starting with the most significant.




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64128/new/

https://reviews.llvm.org/D64128



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D59919: [Attributor] Deduce "returned" argument attribute

2019-07-05 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM




Comment at: clang/test/CodeGenOpenCL/as_type.cl:3
+// Once the attributor is on by default remove the following run line and 
change the prefixes below.
+// RUN: %clang_cc1 %s -emit-llvm -mllvm -attributor-disable=false -triple 
spir-unknown-unknown -o - | FileCheck %s --check-prefix=ATTRIBUTOR
 

I recommend leaving the Clang tests along. Clang tests that run the optimizer 
don't follow out best practices and, while end-to-end testing is valuable, I 
don't think that we should encourage general optimizer testing here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59919/new/

https://reviews.llvm.org/D59919



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64128: [CodeGen] Generate llvm.ptrmask instead of inttoptr(and(ptrtoint, C)) if possible.

2019-07-05 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D64128#1570609 , @rjmccall wrote:

> In D64128#1569836 , @hfinkel wrote:
>
> > In D64128#1569817 , @rjmccall 
> > wrote:
> >
> > > The pointer/integer conversion is "implementation-defined", but it's not 
> > > totally unconstrained.  C notes that "The mapping functions for 
> > > converting a pointer to an integer or an integer to a pointer are 
> > > intended to be consistent with the addressing structure of the execution 
> > > environment.", and we do have to honor that.  The standard allows that 
> > > "the result ... might not point to an entity of the referenced type", but 
> > > when in fact it's guaranteed to do so (i.e. it's not just a coincidental 
> > > result of an implementation decision like the exact address of a global 
> > > variable — no "guessing"), I do think we have an obligation to make it 
> > > work.  And on a practical level, there has to be *some* way of playing 
> > > clever address tricks in the language in order to implement things like 
> > > allocators and so forth.  So this makes me very antsy.
> >
> >
> > I don't disagree. But I believe the question is if we have:
> >
> >   int *x = malloc(4);
> >   int *y = malloc(4);
> >   if (x & ~15 == y) {
> > *(x & ~15) = 5; // Is this allowed, and if so, must the compiler assume 
> > that it might set the value of *y?
> >   }
> >   
> >
> > I certainly agree that we must allow the implementation of allocators, etc. 
> > But allocators, I think, have the opposite problem. They actually have some 
> > large underlying objects (from mmap or whatever), and we want the rest of 
> > the system to treat some subobjects of these larger objects as though they 
> > were independent objects of some given types. From the point of view of the 
> > allocator, we have x, and we have `void *memory_pool`, and we need to allow 
> > `x & N` to point into `memory_pool`, but because, from the allocator's 
> > perspective, we never knew that x didn't point into memory_pool (as, in 
> > fact, it likely does), that should be fine (*).
> >
> > There might be more of an issue, for example, if for a given object, I 
> > happen to know that there's some interesting structure at the beginning of 
> > its page (or some other boundary).
>
>
> This is what I was thinking about for allocators; this is a common 
> implementation technique for `free` / `realloc` / `malloc_size`.
>
> > If I also have a pointer to this structure via some other means, then maybe 
> > this will cause a problem. This kind of thing certainly falls outside of 
> > the C/C++ abstract machine, and I'd lean toward a flag for supporting it 
> > (not on by default).
>
> If you mean a theoretical minimal C abstract machine that does not correspond 
> to an actual target and is therefore not bound by any of the statements in 
> the C standard that say things like "this is expected to have its obvious 
> translation on the target", then yes, I completely agree.  If you're talking 
> about the actual C programming language that does correspond to actual 
> targets, then it's not clear at all that it's outside the C abstract machine, 
> because AFAICT integer-pointer conversions are (1) well-specified on specific 
> targets by this de facto requirement of corresponding directly to pointer 
> representations and (2) well-behaved as long as the integer does correspond 
> to the address of an actual object of that type.
>
> Also, please understand that compiler writers have been telling our users for 
> decades that (1) pointer arithmetic is subject to some restrictions on 
> penalty of UB and (2) they can avoid those restrictions by using 
> pointer-integer conversions and doing integer arithmetic instead.


Indeed, we have.

>   So any proposal to weaken the latter as a workaround makes me very worried, 
> especially if it's also enforcing alignment restrictions that we've generally 
> chosen not to enforce when separated from actual memory accesses.

I *think* that what @fhahn points out about the masking restrictions addresses 
the concerns that we were discussing: My understanding of the 
potentially-problematic cases here are when the masking could get you from one 
variable from one underlying object (that you might access) to another variable 
in a different underlying object (that you might also access), and given the 
masking restrictions, this is impossible (*): because the high bits can't be 
used for addressing, and for the lower bits, these fall within the alignment of 
the type, and so in order for that to move you between underlying objects, the 
objects would need to be smaller than the type alignment.

(*) I suppose one can arrange for this to break something given some 
system-specific setup:

  // note: the linker script ensures that these are packed and adjacent.
  short a;
  struct {
short b, c;
  } s;
  
  ...
  int *ib = 

[PATCH] D64128: [CodeGen] Generate llvm.ptrmask instead of inttoptr(and(ptrtoint, C)) if possible.

2019-07-03 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D64128#1569817 , @rjmccall wrote:

> The pointer/integer conversion is "implementation-defined", but it's not 
> totally unconstrained.  C notes that "The mapping functions for converting a 
> pointer to an integer or an integer to a pointer are intended to be 
> consistent with the addressing structure of the execution environment.", and 
> we do have to honor that.  The standard allows that "the result ... might not 
> point to an entity of the referenced type", but when in fact it's guaranteed 
> to do so (i.e. it's not just a coincidental result of an implementation 
> decision like the exact address of a global variable — no "guessing"), I do 
> think we have an obligation to make it work.  And on a practical level, there 
> has to be *some* way of playing clever address tricks in the language in 
> order to implement things like allocators and so forth.  So this makes me 
> very antsy.


I don't disagree. But I believe the question is if we have:

  int *x = malloc(4);
  int *y = malloc(4);
  if (x & ~15 == y) {
*(x & ~15) = 5; // Is this allowed, and if so, must the compiler assume 
that it might set the value of *y?
  }

I certainly agree that we must allow the implementation of allocators, etc. But 
allocators, I think, have the opposite problem. They actually have some large 
underlying objects (from mmap or whatever), and we want the rest of the system 
to treat some subobjects of these larger objects as though they were 
independent objects of some given types. From the point of view of the 
allocator, we have x, and we have `void *memory_pool`, and we need to allow `x 
& N` to point into `memory_pool`, but because, from the allocator's 
perspective, we never knew that x didn't point into memory_pool (as, in fact, 
it likely does), that should be fine (*).

There might be more of an issue, for example, if for a given object, I happen 
to know that there's some interesting structure at the beginning of its page 
(or some other boundary). If I also have a pointer to this structure via some 
other means, then maybe this will cause a problem. This kind of thing certainly 
falls outside of the C/C++ abstract machine, and I'd lean toward a flag for 
supporting it (not on by default). I'm assuming that this would be rare. If I'm 
wrong, then we shouldn't do this by default.

(*) We do have a problem if we inline the implementation of malloc, given how 
our noalias return attribute works, but that's a preexisting problem, and the 
malloc implementation should probably be compiled with -fno-builtin-malloc 
regardless.

> If the general language rules are too permissive for some interesting 
> optimization, it's fine to consider builtins that impose stronger 
> restrictions on their use.

I agree.

Also, and I could be wrong, but my impression is that all of this is extra - 
this motivating use case requires generating the intrinsic from the code in 
lib/CodeGen/TargetInfo.cpp - generating it from C/C++ expressions is just a 
potential additional benefit.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64128/new/

https://reviews.llvm.org/D64128



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64128: [CodeGen] Generate llvm.ptrmask instead of inttoptr(and(ptrtoint, C)) if possible.

2019-07-03 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D64128#1569590 , @efriedma wrote:

> > If they're all syntactically together like this, maybe that's safe?
>
> Having them together syntactically doesn't really help, I think; it might be 
> guarded by some code that does the same conversion (and if you repeat the 
> conversion, it has to produce the same result).


Indeed. That's correct (and also why the hasOneUse check at the IR level would 
have been ineffective). However...

In D64128#1569578 , @rjmccall wrote:

> I agree with Eli that this isn't obviously a legal transformation.  
> `llvm.ptrmask` appears to make semantic guarantees about e.g. the pointer 
> after the mask referring to the same underlying object, which means we can 
> only safely emit it when something about the source program makes that 
> guarantee.  It's not at all clear that C does so for an expression like `(T*) 
> ((intptr_t) x & N)`.


I think that this is the key point. First, at the IR level we have a problem 
because we have no way to robustly track pointer provenance information. If we 
have `if (a == b) { f(a); }` the optimizer can transform this code into `if (a 
== b) { f(b); }` and we've lost track of whether the parameter to f is based on 
a or b. At the source level we don't have this problem (because we have the 
unaltered expressions provided by the user, and can therefore use whatever 
provenance information that source implies).

Thus, as John says, the question is whether, at the source level, `(T*) 
((intptr_t) x & N)` always has, and only has, the same underlying objects as x 
- when executing the expression is well defined. In C++, I think that this is 
clearly true for implementations with "strict pointer safety" (6.6.5.4.3), as 
the rules for safely-derived pointer values state that, while you can get 
safely-derived pointer values using integer casts and bitwise operators, the 
result must be one that could have been safely derived from the original object 
using well-defined pointer arithmetic, and that's only true for pointers into 
some array pointed into by x (or one past the end). For implementations with 
"relaxed pointer safety", it's all implementation defined, so I don't see we 
couldn't choose our implementation-defined semantics to define this problem 
away (although we certainly need to be careful that we don't unintentionally 
make any significant body of preexisting code incompatible with Clang by doing 
so).

For C, we also need to be concerned with the definition of "based on" 
(6.7.3.1). In some philosophical sense, this seems trickier (i.e., what if 
modifying the value of x at some sequence point prior to the expression makes 
the expressions dead? Are we required, as part of the standardized through 
experiment, to also modify the other variables to keep the expression alive 
when performing the "based on" analysis, and do those modifications count for 
the purposes of determining the "based on" property?). Regardless, given that 
the intent is to enable optimizations, it seems reasonable to say that `(T*) 
((intptr_t) x & N)` is only based on x. For C, 6.3.2.3 makes the conversion 
validity itself implementation defined.

@rsmith , thoughts on this?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64128/new/

https://reviews.llvm.org/D64128



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64128: [CodeGen] Generate llvm.ptrmask instead of inttoptr(and(ptrtoint, C)) if possible.

2019-07-03 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D64128#1568857 , @efriedma wrote:

> I don't think this transform is valid, for the same reasons we don't do it in 
> IR optimizations.


I believe that in the problematic cases we previously discussed (e.g., from 
https://reviews.llvm.org/D59065#1449682), they all depend on some control 
dependency being introduced somewhere in between the initial pointer casts and 
the other operations. If they're all syntactically together like this, maybe 
that's safe?

Does this actually catch the ABI code that motivated this in the first place? 
Isn't that in lib/CodeGen/TargetInfo.cpp (e.g., in 
emitRoundPointerUpToAlignment)?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64128/new/

https://reviews.llvm.org/D64128



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64067: [X86][PPC] Support -mlong-double-64

2019-07-03 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D64067#156 , @rnk wrote:

> In D64067#1568533 , @andrew.w.kaylor 
> wrote:
>
> > In this review (https://reviews.llvm.org/D6260) @rsmith mentions that this 
> > should also have an effect on name mangling.
>
>
> I'm not sure that's consistent with GCC, at least not anymore:
>  https://gcc.godbolt.org/z/eUviCd
>  Looks like you can still have an overload set with double and long double, 
> even if they both use the same representation.


It has to work that way, because they're different, standard language-level 
types.

One thing to realize about these flags is that they're ABI-altering flags. If 
the user provides the flag to alter the platform defaults, this only works if 
the user also ensures that matches the ABI of the relevant system libraries 
that the compiler is using (e.g., because the user is explicitly linking with a 
suitable libc).

> This is a backend -m flag, after all, so that seems reasonable to me.
> 
>> What will this do if the user calls a library function that expects a long 
>> double? What does gcc do in that case?
> 
> Looks like it passes according to the usual 64-bit IEEE double representation.




Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64067/new/

https://reviews.llvm.org/D64067



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64067: [X86][PPC] Support -mlong-double-64

2019-07-02 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

Can you please add a test ensuring that -malign-double and -mlong-double-64 
interact properly? I think that, in the current patch, they do (as 
-malign-double is processed first), but I'd prefer that we cover that case 
explicitly.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64067/new/

https://reviews.llvm.org/D64067



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D63329: Allow static linking of libc++ on Linux, just like -static-libstdc++

2019-06-14 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: clang/lib/Driver/ToolChains/Gnu.cpp:570
 
+  // FIXME: libc++ dynamically links against libpthread, so for static
+  // linking, we need to supply that dependency.

Why does this say FIXME?



Comment at: clang/lib/Driver/ToolChains/Gnu.cpp:574
+// FIXME: Again, like above, does this really make sense for all GNU
+// toolchains?
+WantPthread = true;

It seems to me that, in general, the answer will be yes. std::thread is built 
on pthreads, generally, and if we need to change this for particular systems, 
then that can always be done as a special case.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63329/new/

https://reviews.llvm.org/D63329



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D61634: [clang/llvm] Allow efficient implementation of libc's memory functions in C/C++

2019-05-15 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a subscriber: chandlerc.
hfinkel added a comment.

In D61634#1502201 , @tejohnson wrote:

> In D61634#1502138 , @hfinkel wrote:
>
> > In D61634#1502043 , @efriedma 
> > wrote:
> >
> > > > I have a related patch that turns -fno-builtin* options into module 
> > > > flags
> > >
> > > Do you have any opinion on representing -fno-builtin using a module flag 
> > > vs. a function attribute in IR?  It seems generally more flexible and 
> > > easier to reason about a function attribute from my perspective.  But I 
> > > might be missing something about the semantics of -fno-builtin that would 
> > > make that representation awkward.  Or I guess it might just be more work 
> > > to implement, given we have some IPO passes that use TargetLibraryInfo?
> >
> >
> > I think that a function attribute would be better. We generally use these 
> > flags only in the context of certain translation units, and when we use 
> > LTO, it would be sad if we had to take the most-conservative settings 
> > across the entire application. When we insert new function call to a 
> > standard library, we always do it in the context of some function. We'd 
> > probably need to block inlining in some cases, but that's better than a 
> > global conservative setting.
>
>
> Using function level attributes instead of module flags does provide finer 
> grained control and avoids the conservativeness when merging IR for LTO. The 
> downsides I see, mostly just in terms of the engineering effort to get this 
> to work, are:
>
> - need to prevent inlining with different attributes


I think that this should be relatively straightforward. You just need to update 
`AttributeFuncs::areInlineCompatible` and/or 
`AttributeFuncs::mergeAttributesForInlining` by adding a new MergeRule in 
include/llvm/IR/Attributes.td and writing a function like 
adjustCallerStackProbeSize.

> - currently the TargetLibraryInfo is constructed on a per-module basis. 
> Presumably it would instead need to be created per Function - this one in 
> particular seems like it would require fairly extensive changes.

Interesting point. The largest issue I see is that we need TLI available from 
loop passes, etc., and we can't automatically recompute a function-level 
analysis there. We need to make sure that it's always available and not 
invalidated. TLI is one of those analysis passes, being derived only from 
things which don't change (i.e., the target triple), or things that change very 
rarely (e.g., function attributes), that we probably don't want to require all 
passes to explicitly say that they preserve it (not that the mechanical change 
to all existing passes is hard, but it's easy to forget), so I think we'd like 
something like the CFG-only concept in the current passes, but stronger and 
perhaps turned on by default, for this kind of "attributes-only" pass. 
(@chandlerc , thoughts on this?).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61634/new/

https://reviews.llvm.org/D61634



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D61634: [clang/llvm] Allow efficient implementation of libc's memory functions in C/C++

2019-05-14 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D61634#1502043 , @efriedma wrote:

> > I have a related patch that turns -fno-builtin* options into module flags
>
> Do you have any opinion on representing -fno-builtin using a module flag vs. 
> a function attribute in IR?  It seems generally more flexible and easier to 
> reason about a function attribute from my perspective.  But I might be 
> missing something about the semantics of -fno-builtin that would make that 
> representation awkward.  Or I guess it might just be more work to implement, 
> given we have some IPO passes that use TargetLibraryInfo?


I think that a function attribute would be better. We generally use these flags 
only in the context of certain translation units, and when we use LTO, it would 
be sad if we had to take the most-conservative settings across the entire 
application. When we insert new function call to a standard library, we always 
do it in the context of some function. We'd probably need to block inlining in 
some cases, but that's better than a global conservative setting.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61634/new/

https://reviews.llvm.org/D61634



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D61399: [OpenMP][Clang] Support for target math functions

2019-05-07 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: lib/Headers/CMakeLists.txt:36
   bmiintrin.h
+  openmp_wrappers/math.h
+  openmp_wrappers/cmath

JDevlieghere wrote:
> This doesn't do what you think it would do. The files are copied into the 
> root of the resource directory, which causes stage 2 build failures on 
> GreenDragon. 
Can you provide a link to the failure log? Is the problem that the files are 
not copied into their subdirectory?


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61399/new/

https://reviews.llvm.org/D61399



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D61399: [OpenMP][Clang] Support for target math functions

2019-05-03 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

> It is up to you. I don't have strong objections if you think this will work 
> as required. Just the tests must be fixed, especially codegen tests.

Thanks, Alexey. I think this will work as required, and then we'll be able to 
update it when we get declare variant. Agreed on the tests (on all points).


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61399/new/

https://reviews.llvm.org/D61399



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D61399: [OpenMP][Clang] Support for target math functions

2019-05-03 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

> Still, I think we need to prvide the default implementation of those 
> non-standard functions (they can be very simple, maybe reporting error is 
> going to be enough), which can be overriden by user.

I appreciate your motivation, and I agree with you to some extent. I don't 
object to having generic versions of useful math functions, but I don't think 
they should be required. It's not reasonable to make someone add generic 
versions of every function which happens to appear in a system/target-specific 
math.h header. NVPTX won't be the only target that has target-optimized 
functions that get pulled in, even from our own headers, but system headers 
also have differences anyway depending on what preprocessor macros are defined. 
In the end, people can write portable code if they stick to what's in the 
standard, and we should make it reasonably easy for them to step outside of the 
standard to do what they need to do when the standard subset of available 
functionality doesn't meet their needs for whatever reason. This is what we do 
for C/C++, where we provide intrinsics and other system functions for those who 
can't write their code only using the facilities that C/C++ provide.

In any case, I think that we can figure out how to add generic versions of 
non-standard math functions in a separate thread. I think that we should move 
forward with this and then make progress on generic versions separately. It's 
also possible that we want to fold this discussion into the discussion on an 
LLVM math library (we've talked about this for some time in the context of 
vector math libraries, and I'd not thought about accelerators in this context, 
but maybe this is all related).


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61399/new/

https://reviews.llvm.org/D61399



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D61458: [hip] Relax CUDA call restriction within `decltype` context.

2019-05-02 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

> Only if they also differ in some other way. C++ does not (generally) have 
> return-type-based overloading. The two functions described would even mangle 
> the same way if CUDA didn't include host/device in the mangling.

Certainly. I didn't mean to imply otherwise.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61458/new/

https://reviews.llvm.org/D61458



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D61458: [hip] Relax CUDA call restriction within `decltype` context.

2019-05-02 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D61458#1488970 , @jlebar wrote:

> Here's one for you:
>
>   __host__ float bar();
>   __device__ int bar();
>   __host__ __device__ auto foo() -> decltype(bar()) {}
>
>
> What is the return type of `foo`?  :)
>
> I don't believe the right answer is, "float when compiling for host, int when 
> compiling for device."


So, actually, I wonder if that's not the right answer. We generally allow 
different overloads to have different return types. What if, for example, the 
return type on the host is __float128 and on the device it's `MyLongFloatTy`?

> I'd be happy if we said this was an error, so long as it's well-defined what 
> exactly we're disallowing.  But I bet @rsmith can come up with substantially 
> more evil testcases than this.




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61458/new/

https://reviews.llvm.org/D61458



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D61399: [OpenMP][Clang] Support for target math functions

2019-05-02 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D61399#1488366 , @ABataev wrote:

> In D61399#1488329 , @hfinkel wrote:
>
> > In D61399#1488309 , @ABataev wrote:
> >
> > > In D61399#1488299 , @hfinkel 
> > > wrote:
> > >
> > > > In D61399#1488262 , @ABataev 
> > > > wrote:
> > > >
> > > > > I don't like this implementation. Seems to me, it breaks one of the 
> > > > > OpenMP standard requirements: the program can be compiled without 
> > > > > openmp support. I assume, that with this includes the program won't 
> > > > > be able to be compiled without OpenMP support anymore because it may 
> > > > > use some device-specific math functions explicitly.
> > > > >  Instead, I would like to see some additional, device-scpecific math 
> > > > > header file, that must be included explicitly to support some 
> > > > > device-specific math functions. And we need to provide default 
> > > > > implementations for those extra math functions for all the platforms 
> > > > > we're going to support, including default host implementations.
> > > >
> > > >
> > > > Can you provide an example of a conforming program that can't be 
> > > > compiled without OpenMP support? Regardless of the use of any 
> > > > device-specific functions (which isn't covered by the standard, of 
> > > > course, but might be needed in practice), the code still needs to be 
> > > > compilable by the host in order to generate the host-fallback version. 
> > > > This doesn't change that. Thus, any program that uses anything from 
> > > > this math.h, etc. needs to compile for the host, and thus, likely 
> > > > compiles without OpenMP support. Maybe I'm missing your point, however.
> > >
> > >
> > > Assume we have something like this:
> > >
> > >   #pragma omp target if(cond)
> > >   a = __nv_();
> > >
> > >
> > > Instead of `__nv_xxx` you can try to use any Cuda-specific function, 
> > > which is not the part of the standard `math.h`/`cmath` files. Will it be 
> > > compilable even with OpenMP?
> >
> >
> > I don't think that this changes that one way or the other. Your example 
> > won't work, AFAIK, unless you do something like:
> >
> >   #pragma omp target if(cond)
> >   #ifdef __NVPTX__
> >   a = __nv_();
> >   #else
> >   a = something_on_the_host;
> >   #endif
> >   
> >
> > and anything from these headers that doesn't also have a host version will 
> > suffer the same fate: if it won't also compile for the host (one way or 
> > another), then it won't work.
>
>
> The problem with this header file is that it allows to use those 
> Cuda-specific functions unconditionally in some cases:
>
>   #pragma omp target
>   a = __nv_();
>
>
> It won't require any target-specific guards to compile this code (if we 
> compile it only for Cuda-specific devices) and we're loosing the consistency 
> here: in some cases target regions will require special device guards, in 
> others, with the same function calls, it is not. And the worst thing, is that 
> we implicitly allow to introduce this kind of incostistency into users code. 
> That's why I would prefer to see a special kind of the include file, NVPTX 
> specific, that must be included explicitly, so the user explictly commanded 
> to use some target-specific math functions, if he really wants it. Plus, 
> maybe, in this files we need force check of the platform and warn users that 
> the functions from this header file must be used using device-specific 
> checks. Or provide some kind of the default implementations for all the 
> platforms, that do not support those math function natively.


I believe that I understand your point, but two things:

1. I think that you're mistaken on the underlying premise. That code will not 
meaningfully compile without ifdefs, even if only CUDA-specific devices are the 
only ones selected. We *always* compile code for the host as well, not for 
offloading proper, but for the fallback (for execution when the offloading 
fails). If I emulate this situation by writing this:



  #ifdef __NVPTX__
  int __nv_floor();
  #endif
  
  int main() {
  #pragma omp target
  __nv_floor();
  }

and try to compile using Clang with -fopenmp -fopenmp-targets=nvptx64, the 
compilation fails:

  int1.cpp:8:1: error: use of undeclared identifier '__nv_floor'

and this is because, when we invoke the compilation for the host, there is no 
declaration for that function. This is true even though nvptx64 is the only 
target for which the code is being compiled (because we always also compile the 
host fallback).

2. I believe that the future state -- what we get by following this patch, and 
then when declare variant is available using that -- gives us all what we want. 
When we have declare variant, then all of the definitions in these headers will 
be declared as variants only 

[PATCH] D61399: [OpenMP][Clang] Support for target math functions

2019-05-02 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D61399#1488309 , @ABataev wrote:

> In D61399#1488299 , @hfinkel wrote:
>
> > In D61399#1488262 , @ABataev wrote:
> >
> > > I don't like this implementation. Seems to me, it breaks one of the 
> > > OpenMP standard requirements: the program can be compiled without openmp 
> > > support. I assume, that with this includes the program won't be able to 
> > > be compiled without OpenMP support anymore because it may use some 
> > > device-specific math functions explicitly.
> > >  Instead, I would like to see some additional, device-scpecific math 
> > > header file, that must be included explicitly to support some 
> > > device-specific math functions. And we need to provide default 
> > > implementations for those extra math functions for all the platforms 
> > > we're going to support, including default host implementations.
> >
> >
> > Can you provide an example of a conforming program that can't be compiled 
> > without OpenMP support? Regardless of the use of any device-specific 
> > functions (which isn't covered by the standard, of course, but might be 
> > needed in practice), the code still needs to be compilable by the host in 
> > order to generate the host-fallback version. This doesn't change that. 
> > Thus, any program that uses anything from this math.h, etc. needs to 
> > compile for the host, and thus, likely compiles without OpenMP support. 
> > Maybe I'm missing your point, however.
>
>
> Assume we have something like this:
>
>   #pragma omp target if(cond)
>   a = __nv_();
>
>
> Instead of `__nv_xxx` you can try to use any Cuda-specific function, which is 
> not the part of the standard `math.h`/`cmath` files. Will it be compilable 
> even with OpenMP?


I don't think that this changes that one way or the other. Your example won't 
work, AFAIK, unless you do something like:

  #pragma omp target if(cond)
  #ifdef __NVPTX__
  a = __nv_();
  #else
  a = something_on_the_host;
  #endif

and anything from these headers that doesn't also have a host version will 
suffer the same fate: if it won't also compile for the host (one way or 
another), then it won't work.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61399/new/

https://reviews.llvm.org/D61399



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D61399: [OpenMP][Clang] Support for target math functions

2019-05-02 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D61399#1488262 , @ABataev wrote:

> I don't like this implementation. Seems to me, it breaks one of the OpenMP 
> standard requirements: the program can be compiled without openmp support. I 
> assume, that with this includes the program won't be able to be compiled 
> without OpenMP support anymore because it may use some device-specific math 
> functions explicitly.
>  Instead, I would like to see some additional, device-scpecific math header 
> file, that must be included explicitly to support some device-specific math 
> functions. And we need to provide default implementations for those extra 
> math functions for all the platforms we're going to support, including 
> default host implementations.


Can you provide an example of a conforming program that can't be compiled 
without OpenMP support? Regardless of the use of any device-specific functions 
(which isn't covered by the standard, of course, but might be needed in 
practice), the code still needs to be compilable by the host in order to 
generate the host-fallback version. This doesn't change that. Thus, any program 
that uses anything from this math.h, etc. needs to compile for the host, and 
thus, likely compiles without OpenMP support. Maybe I'm missing your point, 
however.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61399/new/

https://reviews.llvm.org/D61399



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D61399: [OpenMP][Clang] Support for target math functions

2019-05-02 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added inline comments.
This revision is now accepted and ready to land.



Comment at: lib/Driver/ToolChains/Clang.cpp:1163
+  llvm::sys::path::append(P, "openmp_wrappers");
+  CmdArgs.push_back("-internal-isystem");
+  CmdArgs.push_back(Args.MakeArgString(P));

Please add a driver test covering this.



Comment at: lib/Headers/openmp_wrappers/__clang_openmp_math.h:20
+/// respectively. This is actually what the Clang-CUDA code path does, using
+/// __device__ instead of variants to avoid redeclarations and get the desired
+/// overload resolution.

And this is why I wanted to just make `__device__` work in OpenMP mode. But, as 
a temporary solution until we get `declare variant` working, I'm okay with this.

Modulo the naming nit and the missing driver test, LGTM.



Comment at: lib/Headers/openmp_wrappers/__clang_openmp_math.h:46
+/// Magic macro for stopping the math.h/cmath host header from being included.
+#define __NO_HOST_MATH__
+

I'd prefer that, as this is clang specific, we make this clear by naming this 
`__CLANG_NO_HOST_MATH__`.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61399/new/

https://reviews.llvm.org/D61399



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D60907: [OpenMP] Add math functions support in OpenMP offloading

2019-04-30 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D60907#1483660 , @jdoerfert wrote:

> In D60907#1483615 , @hfinkel wrote:
>
> > In D60907#1479370 , @gtbercea 
> > wrote:
> >
> > > In D60907#1479142 , @hfinkel 
> > > wrote:
> > >
> > > > In D60907#1479118 , @gtbercea 
> > > > wrote:
> > > >
> > > > > Ping @hfinkel @tra
> > > >
> > > >
> > > > The last two comments in D47849  
> > > > indicated exploration of a different approach, and one which still 
> > > > seems superior to this one. Can you please comment on why you're now 
> > > > pursuing this approach instead?
> > >
> > >
> > > ...
> > >
> > > Hal, as far as I can tell, this solution is similar to yours but with a 
> > > slightly different implementation. If there are particular aspects about 
> > > this patch you would like to discuss/give feedback on please let me know.
> >
> >
> > The solution I suggested had the advantages of:
> >
> > 1. Being able to directly reuse the code in 
> > `__clang_cuda_device_functions.h`. On the other hand, using this solution 
> > we need to implement a wrapper function for every math function. When 
> > `__clang_cuda_device_functions.h` is updated, we need to update the OpenMP 
> > wrapper as well.
>
>
> I'd even go as far as to argue that `__clang_cuda_device_functions.h` should 
> include the internal math.h wrapper to get all math functions. See also the 
> next comment.
>
> > 2. Providing access to wrappers for other CUDA intrinsics in a natural way 
> > (e.g., rnorm3d) [it looks a bit nicer to provide a host version of rnorm3d 
> > than __nv_rnorm3d in user code].
>
> @hfinkel 
>  I don't see why you want to mix CUDA intrinsics with math.h overloads.


What I had in mind was matching non-standard functions in a standard way. For 
example, let's just say that I have a CUDA kernel that uses the rnorm3d 
function, or I otherwise have a function that I'd like to write in OpenMP that 
will make good use of this CUDA function (because it happens to have an 
efficient device implementation). This is a function that CUDA provides, in the 
global namespace, although it's not standard.

Then I can do something like this (depending on how we setup the 
implementation):

  double rnorm3d(double a,  double b, double c) {
return sqrt(a*a + b*b + c*c);
  }
  
  ...
  
  #pragma omp target
  {
double a = ..., b = ..., c = ...;
double r = rnorm3d(a, b, c)
  }

and, if we use the CUDA math headers for CUDA math-function support, than this 
might "just work." To be clear, I can see an argument for having this work 
being a bad idea ;) -- but it has the advantage of providing a way to take 
advantage of system-specific functions while still writing completely-portable 
code.

>   I added a rough outline of how I imagined the internal math.h header to 
> look like as a comment in D47849. Could you elaborate how that differs from 
> what you imagine and how the other intrinsics come in?

That looks like what I had in mind (including `__clang_cuda_device_functions.h` 
to get the device functions.)

> 
> 
>> 3. Being similar to the "declare variant" functionality from OpenMP 5, and 
>> thus, I suspect, closer to the solution we'll eventually be able to apply in 
>> a standard way to all targets.
> 
> I can see this.
> 
>>> This solution is following Alexey's suggestions. This solution allows the 
>>> optimization of math calls if they apply (example: pow(x,2) => x*x ) which 
>>> was one of the issues in the previous solution I implemented.
>> 
>> So we're also missing that optimization for CUDA code when compiling with 
>> Clang? Isn't this also something that, regardless, should be fixed?
> 
> Maybe through a general built-in recognition and lowering into target 
> specific implementations/intrinsics late again?

I suspect that we need to match the intrinsics and perform the optimizations in 
LLVM at that level in order to get the optimizations for CUDA.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60907/new/

https://reviews.llvm.org/D60907



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D60907: [OpenMP] Add math functions support in OpenMP offloading

2019-04-29 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D60907#1479370 , @gtbercea wrote:

> In D60907#1479142 , @hfinkel wrote:
>
> > In D60907#1479118 , @gtbercea 
> > wrote:
> >
> > > Ping @hfinkel @tra
> >
> >
> > The last two comments in D47849  indicated 
> > exploration of a different approach, and one which still seems superior to 
> > this one. Can you please comment on why you're now pursuing this approach 
> > instead?
>
>
> ...
>
> Hal, as far as I can tell, this solution is similar to yours but with a 
> slightly different implementation. If there are particular aspects about this 
> patch you would like to discuss/give feedback on please let me know.


The solution I suggested had the advantages of:

1. Being able to directly reuse the code in `__clang_cuda_device_functions.h`. 
On the other hand, using this solution we need to implement a wrapper function 
for every math function. When `__clang_cuda_device_functions.h` is updated, we 
need to update the OpenMP wrapper as well.
2. Providing access to wrappers for other CUDA intrinsics in a natural way 
(e.g., rnorm3d) [it looks a bit nicer to provide a host version of rnorm3d than 
__nv_rnorm3d in user code].
3. Being similar to the "declare variant" functionality from OpenMP 5, and 
thus, I suspect, closer to the solution we'll eventually be able to apply in a 
standard way to all targets.

What are all of the long-double functions going to do on NVPTX?

> This solution is following Alexey's suggestions. This solution allows the 
> optimization of math calls if they apply (example: pow(x,2) => x*x ) which 
> was one of the issues in the previous solution I implemented.

So we're also missing that optimization for CUDA code when compiling with 
Clang? Isn't this also something that, regardless, should be fixed?

Also, how fragile is this? We inline bottom up but this optimization needs to 
apply before inlining?

Finally, regardless of all of this, do we really need to preinclude this 
header? Can't we do this with a math.h wrapper?


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60907/new/

https://reviews.llvm.org/D60907



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D60907: [OpenMP] Add math functions support in OpenMP offloading

2019-04-25 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D60907#1479118 , @gtbercea wrote:

> Ping @hfinkel @tra


The last two comments in D47849  indicated 
exploration of a different approach, and one which still seems superior to this 
one. Can you please comment on why you're now pursuing this approach instead?


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60907/new/

https://reviews.llvm.org/D60907



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D59712: [APSInt][OpenMP] Fix isNegative, etc. for unsigned types

2019-04-19 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

In D59712#1473223 , @jdenny wrote:

> In D59712#1469392 , @lebedev.ri 
> wrote:
>
> > Does this pass `check-all`? `check-all` of stage-2? test-suite?
>
>
> For me, all these tests behave with the current patch.  As before, the only 
> subproject I did not build was llgo.


Great. LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59712/new/

https://reviews.llvm.org/D59712



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D59712: [APSInt][OpenMP] Fix isNegative, etc. for unsigned types

2019-04-18 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D59712#1472544 , @jdenny wrote:

> In D59712#1472358 , @hfinkel wrote:
>
> > > I've never tried running the other tests you mention, for any patch.  I 
> > > thought people normally left those to the bots.  Should this patch be 
> > > handled differently?
> >
> > We have a lot of people actively working off of trunk, and we try very hard 
> > to keep trunk clean. The bots are a second line of defense, not the primary 
> > checkers. In any case, this comes down to professional judgement. It is not 
> > uncommon to ask for a patch author to check self hosting and a test suite 
> > run before committing - specifically, those patches that might affect 
> > correctness, or introduce other subtle problems, and for which running the 
> > compiler over a bunch of C/C++ code might uncover a problem.
>
>
> Thanks for explaining.  It's my first time receiving these particular 
> requests (probably because of what parts of LLVM I normally edit), so I 
> wasn't sure I understood.


No problem.

> For self-hosting, is it best to build again with CMAKE_C_COMPILER and 
> CMAKE_CXX_COMPILE pointing into the previous build, or is there a better 
> approach?

That's what I do.

> 
> 
>> Also, is this review now missing some files? I see here only updates to  
>> APSInt.h (only adding functions), APSIntTest.cpp, and a bunch of tests. 
>> Nothing that would cause changes to the tests, however (maybe I'm just 
>> missing something).
> 
> All looks fine to me.  The APSInt.h changes are the reason for all the test 
> changes.

Indeed. I forgot that you were changing overrides.

Thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59712/new/

https://reviews.llvm.org/D59712



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D59712: [APSInt][OpenMP] Fix isNegative, etc. for unsigned types

2019-04-18 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

In D59712#1472318 , @jdenny wrote:

> In D59712#1469760 , @lebedev.ri 
> wrote:
>
> > In D59712#1469693 , @jdenny wrote:
> >
> > > In D59712#1469392 , @lebedev.ri 
> > > wrote:
> > >
> > > > In D59712#1469358 , @jdenny 
> > > > wrote:
> > > >
> > > > > In D59712#1469301 , 
> > > > > @lebedev.ri wrote:
> > > > >
> > > > > > In D59712#1469295 , 
> > > > > > @craig.topper wrote:
> > > > > >
> > > > > > > Wondering if it would be better to assert for asking for the sign 
> > > > > > > of an unsigned APSInt. I could see a caller just wanting to get 
> > > > > > > the msb for some reason and not knowing that isNegative won’t 
> > > > > > > work.
> > > > > >
> > > > > >
> > > > > > Yes, i, too, would think an assert is much better solution (since i 
> > > > > > literally just tripped over this in this review.)
> > > > >
> > > >
> > > >
> > > > Does this pass `check-all`? `check-all` of stage-2? test-suite?
> > >
> > >
> > > No.  The assert breaks cases in at least ExprConstant.cpp and 
> > > SemaExpr.cpp.
> >
> >
> > Err, i was talking about the current code in the patch :)
>
>
> For me, check-all's success is not affected by the current patch.  I built 
> all subprojects except llgo, which gave me a build problem independently of 
> this patch.
>
> I've never tried running the other tests you mention, for any patch.  I 
> thought people normally left those to the bots.  Should this patch be handled 
> differently?


We have a lot of people actively working off of trunk, and we try very hard to 
keep trunk clean. The bots are a second line of defense, not the primary 
checkers. In any case, this comes down to professional judgement. It is not 
uncommon to ask for a patch author to check self hosting and a test suite run 
before committing - specifically, those patches that might affect correctness, 
or introduce other subtle problems, and for which running the compiler over a 
bunch of C/C++ code might uncover a problem.

Also, is this review now missing some files? I see here only updates to  
APSInt.h (only adding functions), APSIntTest.cpp, and a bunch of tests. Nothing 
that would cause changes to the tests, however (maybe I'm just missing 
something).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59712/new/

https://reviews.llvm.org/D59712



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D60406: Move the builtin headers to use the new license file header.

2019-04-08 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM.  Thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60406/new/

https://reviews.llvm.org/D60406



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2019-03-19 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment.

We need to make progress on this, and I'd like to suggest a path forward...

First, we have a fundamental problem here: Using host headers to declare 
functions for the device execution environment isn't sound. Those host headers 
can do anything, and while some platforms might provide a way to make the host 
headers more friendly (e.g., by defining __NO_MATH_INLINES), these mechanisms 
are neither robust nor portable. Thus, we should not rely on host headers to 
define functions that might be available on the device. However, even when 
compiling for the device, code meant only for host execution must be 
semantically analyzable. This, in general, requires the host headers. So we 
have a situation in which we must both use the host headers during device 
compilation (to keep the semantic analysis of the surrounding host code 
working) and also can't use the host headers to provide definitions for use for 
device code (e.g., because those host headers might provide definitions relying 
on host inline asm, intrinsics, using types not lowerable in device code, could 
provide declarations using linkage-affecting attributes not lowerable for the 
device, etc.).

This is, or is very similar to, the problem that the host/device overloading 
addresses in CUDA. It is also the problem, or very similar to the problem, that 
the new OpenMP 5 `declare variant` directive is intended to address. Johannes 
and I discussed this earlier today, and I suggest that we:

1. Add a math.h wrapper to clang/lib/Headers, which generally just does an 
include_next of math.h, but provides us with the ability to customize this 
behavior. Writing a header for OpenMP on NVIDIA GPUs which is essentially 
identical to the math.h functions in __clang_cuda_device_functions.h would be 
unfortunate, and as CUDA does provide the underlying execution environment for 
OpenMP target offload on NVIDIA GPUs, duplicative even in principle. We don't 
need to alter the default global namespace, however, but can include this file 
from the wrapper math.h.
2. We should allow host/device overloading in OpenMP mode. As an extension, we 
could directly reuse the CUDA host/device overloading capability - this also 
has the advantage of allowing us to directly reuse 
__clang_cuda_device_functions.h (and perhaps do a similar thing to pick up the 
device-side printf, etc. from __clang_cuda_runtime_wrapper.h). In the future, 
we can extend these to provide overloading using OpenMP declare variant, if 
desired, when in OpenMP mode.

Thoughts?


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D47849/new/

https://reviews.llvm.org/D47849



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D58128: [PowerPC] Stop defining _ARCH_PWR6X on POWER7 and up

2019-02-12 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D58128/new/

https://reviews.llvm.org/D58128



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D53928: Enable builtins necessary for SLEEF [AArch64] vectorized trigonometry libm functions

2019-01-17 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM

Please, in the future, make sure you post full-context patches.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D53928/new/

https://reviews.llvm.org/D53928



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D55928: [OpenMP] Add flag for preventing the extension to 64 bits for the collapse loop counter

2018-12-20 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added inline comments.



Comment at: docs/OpenMPSupport.rst:120
+compile your program with the `-fopenmp-optimistic-collapse`.
+
+

Can you please clarify here what happens when the loop induction variables are 
already 64 bits. If any of them are already 64 bits, then we still use 64 bits 
overall?



Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55928/new/

https://reviews.llvm.org/D55928



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D52117: Generate llvm.loop.parallel_accesses instead of llvm.mem.parallel_loop_access metadata.

2018-12-18 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.

Minor typo noted below, but otherwise, LGTM (to avoid any misunderstanding: 
this should be committed after the LLVM change lands).




Comment at: lib/CodeGen/CGLoopInfo.cpp:341
+for (const LoopInfo  : Active) {
+  // Here we assume that ever loop that has an access group is parallel.
+  if (MDNode *Group = AL.getAccessGroup())

ever -> every


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D52117/new/

https://reviews.llvm.org/D52117



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


  1   2   3   4   >