[PATCH] D97187: [Clang][Sema] Warn when function argument is less aligned than parameter

2021-03-02 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio accepted this revision.
dnsampaio added a comment.
This revision is now accepted and ready to land.

LGTM. Just a few coding style nits to fix, but please wait a couple of days to 
see if anyone else has anything to say.




Comment at: clang/lib/Sema/SemaChecking.cpp:4475
+  if (ParamTy->isIncompleteType() || ArgTy->isIncompleteType())
+return;
+  CharUnits ParamAlign = Context.getTypeAlignInChars(ParamTy);

Please add an empty line after the `return`.



Comment at: clang/lib/Sema/SemaChecking.cpp:4545
+auto *FT = FDecl->getFunctionType();
+if (FT && isa<FunctionProtoType>(FT))
+  Proto = cast<FunctionProtoType>(FDecl->getFunctionType());

Follow clang-tidy.
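For illustration, the tidy-up I would expect here is folding the isa/cast pair into a single dyn_cast; this is only a sketch, since the exact clang-tidy finding is not reproduced in this digest:

```
// Sketch, assuming the surrounding SemaChecking.cpp declarations (FDecl, Proto);
// dyn_cast_or_null also covers a null result from getFunctionType().
if (const auto *FPT =
        dyn_cast_or_null<FunctionProtoType>(FDecl->getFunctionType()))
  Proto = FPT;
```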


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97187/new/

https://reviews.llvm.org/D97187

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72932: [ARM] Follow AACPS standard for volatile bit-fields access width

2020-07-30 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

Ping ... ping...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72932/new/

https://reviews.llvm.org/D72932

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72932: [ARM] Follow AACPS standard for volatile bit-fields access width

2020-07-24 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

Ping


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72932/new/

https://reviews.llvm.org/D72932



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72932: [ARM] Follow AACPS standard for volatile bit-fields access width

2020-07-17 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio marked 6 inline comments as done.
dnsampaio added a comment.

Indeed not all of them. Fixed this time.




Comment at: clang/include/clang/Basic/CodeGenOptions.def:396
+/// according to the field declaring type width.
+CODEGENOPT(ForceNoAAPCSBitfieldWidth, 1, 0)
+

ostannard wrote:
> Why is this a negative option, when the one above is positive?
Enforcing the number of accesses would not have been accepted if it were not an 
opt-in option. This one, I expect, should be acceptable as a single opt-out 
option.
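Concretely, the two flags pair up like this (spellings taken from the cc1 RUN lines in the test updates later in this thread):

```
-fAAPCSBitfieldLoad      // opt-in:  also preserve the number of volatile bit-field loads (D67399)
-fno-AAPCSBitfieldWidth  // opt-out: disable the AAPCS access-width behaviour added here
```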


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72932/new/

https://reviews.llvm.org/D72932



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72932: [ARM] Follow AACPS standard for volatile bit-fields access width

2020-07-15 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.
Herald added a subscriber: dang.

Ping


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72932/new/

https://reviews.llvm.org/D72932



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75169: [ARM] Supporting lowering of half-precision FP arguments and returns in AArch32's backend

2020-06-12 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

Perhaps we could move towards making half a valid type for the ARM back-end in 
follow-up patches. Allowing half as an argument through the IR is already a step 
in that direction.
IMO this patch is already quite big, and it succeeds in fixing the bugs it set 
out to fix.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75169/new/

https://reviews.llvm.org/D75169



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D79378: PR34581: Don't remove an 'if (p)' guarding a call to 'operator delete(p)' under -Oz.

2020-05-27 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

Hi @rsmith,
are you still looking into this?
cheers


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79378/new/

https://reviews.llvm.org/D79378



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D79378: PR34581: Don't remove an 'if (p)' guarding a call to 'operator delete(p)' under -Oz.

2020-05-14 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio accepted this revision.
dnsampaio added a comment.
This revision is now accepted and ready to land.

LGTM, as long as @rjmccall's concern about documentation is addressed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79378/new/

https://reviews.llvm.org/D79378



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75169: [ARM] Enforcing calling convention for half-precision FP arguments and returns for big-endian AArch32

2020-05-14 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

In D75169#1952159 , @pratlucas wrote:

> > Why not just make half as an argument do the right thing for that case?
>
> That would be the ideal approach, but currently there's a limitation on the 
> backend's calling convention lowering that gets in the way.
>  The lowering of calls in `SelectionDAGBuilder` includes a target-independent 
> step that is responsible for splitting or promoting each argument into "legal 
> registers" and takes place before the targets' calling convention lowering.
>  As `f16` is not a legal type on many of the `AAPCS_VFP` targets, it gets 
> promoted to `f32` before the target's lowering code has a chance to define 
> how to handle it.
>  Ideally, this step should only take place if calling conventions are lowered 
> after type legalization - there's a FIXME there already capturing that - but 
> that would involve a major rewrite that would impact multiple targets.
>  Inserting a hacky target-dependent fix in this step also didn't look very 
> good.
>  Do you see other alternatives for handling it? If not, which approach would 
> you suggest?


Would it be possible to pass a `half` argument and fix it up in 
`CodeGenPrepare`?
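To make the limitation concrete, a minimal illustrative case (not taken from the patch's tests) of the kind of signature under discussion:

```
// Illustrative only: under the AArch32 hard-float ABI the _Float16 values should
// be passed and returned directly in FP registers, but the target-independent
// SelectionDAG step promotes the f16 arguments to f32 before the target's
// calling-convention lowering gets to see them.
_Float16 add_halves(_Float16 a, _Float16 b) {
  return a + b;
}
```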


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75169/new/

https://reviews.llvm.org/D75169



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D79378: PR34581: Don't remove an 'if (p)' guarding a call to 'operator delete(p)' under -Oz.

2020-05-11 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

From my point of view, this LGTM.




Comment at: clang/lib/CodeGen/CGExprCXX.cpp:2042-2049
   // Null check the pointer.
   llvm::BasicBlock *DeleteNotNull = createBasicBlock("delete.notnull");
   llvm::BasicBlock *DeleteEnd = createBasicBlock("delete.end");
 
   llvm::Value *IsNull = Builder.CreateIsNull(Ptr.getPointer(), "isnull");
 
   Builder.CreateCondBr(IsNull, DeleteEnd, DeleteNotNull);

rsmith wrote:
> dnsampaio wrote:
> > Unless I missed something, isn't it better to just avoid emitting this 
> > check and these basic blocks altogether if we are optimizing for size and 
> > we know that Ptr is never null?
> > I would consider doing something like this:
> >  ```
> >   const bool emitNullCheck = CGM.getCodeGenOpts().OptimizeSize <= 1;
> >   llvm::BasicBlock *DeleteNotNull;
> >   llvm::BasicBlock *DeleteEnd;
> >   if (emitNullCheck){
> > // Null check the pointer.
> > DeleteNotNull = createBasicBlock("delete.notnull");
> > DeleteEnd = createBasicBlock("delete.end");
> > 
> > llvm::Value *IsNull = Builder.CreateIsNull(Ptr.getPointer(), "isnull");
> > 
> > Builder.CreateCondBr(IsNull, DeleteEnd, DeleteNotNull);
> > EmitBlock(DeleteNotNull);
> >   }
> > ```
> > 
> > and we use the same emitNullCheck to avoid EmitBlocks below.
> I don't think we can reasonably do this. There are a lot of different ways 
> that `delete` emission can be performed, and many of them (for example, 
> calling a virtual deleting destructor, destroying operator delete, or array 
> delete with cookie) require the null check to be performed in advance for 
> correctness. It would be brittle to duplicate all of those checks here.
> 
> We *could* sink the null checks into the various paths through 
> `EmitArrayDelete` and `EmitObjectDelete`, but I think that makes the code 
> significantly more poorly factored.
Fair enough.
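For readers following the thread, a minimal hypothetical example of one case where the early null check is required for correctness:

```
// Sketch only: with a virtual destructor, `delete p` dispatches through the
// vtable, so the emitted code must branch around the whole sequence when p is
// null rather than load a vtable pointer from a null object.
struct Base {
  virtual ~Base();
};

void destroy(Base *p) {
  delete p; // must be a no-op when p == nullptr
}
```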


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79378/new/

https://reviews.llvm.org/D79378



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D79378: PR34581: Don't remove an 'if (p)' guarding a call to 'operator delete(p)' under -Oz.

2020-05-05 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

I believe we can avoid creating some blocks only to remove them later, no?




Comment at: clang/lib/CodeGen/CGExprCXX.cpp:2042-2049
   // Null check the pointer.
   llvm::BasicBlock *DeleteNotNull = createBasicBlock("delete.notnull");
   llvm::BasicBlock *DeleteEnd = createBasicBlock("delete.end");
 
   llvm::Value *IsNull = Builder.CreateIsNull(Ptr.getPointer(), "isnull");
 
   Builder.CreateCondBr(IsNull, DeleteEnd, DeleteNotNull);

Unless I missed something, isn't it better to just avoid emitting this check 
and these basic blocks altogether if we are optimizing for size and we know 
that Ptr is never null?
I would consider doing something like this:
```
  const bool emitNullCheck = CGM.getCodeGenOpts().OptimizeSize <= 1;
  llvm::BasicBlock *DeleteNotNull;
  llvm::BasicBlock *DeleteEnd;
  if (emitNullCheck) {
    // Null check the pointer.
    DeleteNotNull = createBasicBlock("delete.notnull");
    DeleteEnd = createBasicBlock("delete.end");

    llvm::Value *IsNull = Builder.CreateIsNull(Ptr.getPointer(), "isnull");

    Builder.CreateCondBr(IsNull, DeleteEnd, DeleteNotNull);
    EmitBlock(DeleteNotNull);
  }
```

and we use the same emitNullCheck to skip the EmitBlock calls below.



Comment at: clang/lib/CodeGen/CGExprCXX.cpp:2090
   } else {
-EmitObjectDelete(*this, E, Ptr, DeleteTy);
+if (!EmitObjectDelete(*this, E, Ptr, DeleteTy, DeleteEnd))
+  EmitBlock(DeleteEnd);

If we avoid the emission above, it would not require changing `EmitObjectDelete` 
at all.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79378/new/

https://reviews.llvm.org/D79378



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77074: [FPEnv][AArch64] Platform-specific builtin constrained FP enablement

2020-04-09 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio accepted this revision.
dnsampaio added a comment.
This revision is now accepted and ready to land.

LGTM; just don't forget to remove the exit comments.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77074/new/

https://reviews.llvm.org/D77074



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77074: [FPEnv][AArch64] Platform-specific builtin constrained FP enablement

2020-04-02 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added inline comments.



Comment at: clang/lib/CodeGen/CGBuiltin.cpp:8486-8492
+  return Builder.CreateConstrainedFPCall(
+  F,
+  {EmitScalarExpr(E->getArg(1)), EmitScalarExpr(E->getArg(2)), 
Ops[0]});
+} else {
+  Function *F = CGM.getIntrinsic(Intrinsic::fma, HalfTy);
+  // NEON intrinsic puts accumulator first, unlike the LLVM fma.
+  return Builder.CreateCall(F, {EmitScalarExpr(E->getArg(1)),

kpn wrote:
> dnsampaio wrote:
> > It seems that `Builder.CreateCall` and `Builder.CreateConstrainedFPCall` 
> > usually take the same arguments, except for whether the function F is part 
> > of "experimental_constrained_" or not.
> > Would it make sense to teach the `Builder` to select between creating a 
> > constrained call or a regular one, depending on whether the function passed 
> > is constrained?
> > 
> > I was thinking of something like this:
> > ```
> > Function *F = CGM.getIntrinsic(Builder.getIsFPConstrained()
> >                                    ? Intrinsic::experimental_constrained_fma
> >                                    : Intrinsic::fma,
> >                                HalfTy);
> > return Builder.CreateCallConstrainedFPIfRequired(F, 
> > ```
> > 
> In CGBuiltins.cpp we already have emitUnaryMaybeConstrainedFPBuiltin() plus 
> Binary and Ternary. They work well for the pattern seen on other hosts. But 
> they won't easily work for this ticket. 
> 
> How about a new function just below those three that will work well here? An 
> emitCallMaybeConstrainedFPBuiltin() that takes two intrinsic IDs and chooses 
> which one based on constrained FP would make for an even more compact use. 
> The block of example code you put above would just turn into a single 
> function call. Does that work for you?
Sounds good to me.
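For reference, a rough sketch of the helper being discussed, with the name taken from the comment above; the signature and the CodeGenFunction/CodeGenModule plumbing are assumptions, not the final form:

```
// Sketch only: choose the constrained or regular intrinsic up front, then emit
// the matching kind of call.
static llvm::Value *
emitCallMaybeConstrainedFPBuiltin(CodeGenFunction &CGF, unsigned IntrinsicID,
                                  unsigned ConstrainedIntrinsicID,
                                  llvm::Type *Ty,
                                  llvm::ArrayRef<llvm::Value *> Args) {
  if (CGF.Builder.getIsFPConstrained()) {
    llvm::Function *F = CGF.CGM.getIntrinsic(ConstrainedIntrinsicID, Ty);
    return CGF.Builder.CreateConstrainedFPCall(F, Args);
  }
  llvm::Function *F = CGF.CGM.getIntrinsic(IntrinsicID, Ty);
  return CGF.Builder.CreateCall(F, Args);
}
```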


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77074/new/

https://reviews.llvm.org/D77074



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77074: [FPEnv][AArch64] Platform-specific builtin constrained FP enablement

2020-04-01 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added inline comments.



Comment at: clang/lib/CodeGen/CGBuiltin.cpp:8486-8492
+  return Builder.CreateConstrainedFPCall(
+  F,
+  {EmitScalarExpr(E->getArg(1)), EmitScalarExpr(E->getArg(2)), 
Ops[0]});
+} else {
+  Function *F = CGM.getIntrinsic(Intrinsic::fma, HalfTy);
+  // NEON intrinsic puts accumulator first, unlike the LLVM fma.
+  return Builder.CreateCall(F, {EmitScalarExpr(E->getArg(1)),

It seems that `Builder.CreateCall` and `Builder.CreateConstrainedFPCall` usually 
take the same arguments, except for whether the function F is part of 
"experimental_constrained_" or not.
Would it make sense to teach the `Builder` to select between creating a 
constrained call or a regular one, depending on whether the function passed is 
constrained?

I was thinking of something like this:
```
Function *F = CGM.getIntrinsic(Builder.getIsFPConstrained()
                                   ? Intrinsic::experimental_constrained_fma
                                   : Intrinsic::fma,
                               HalfTy);
return Builder.CreateCallConstrainedFPIfRequired(F, 
```



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77074/new/

https://reviews.llvm.org/D77074



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77074: [FPEnv][AArch64] Platform-specific builtin constrained FP enablement

2020-03-31 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added inline comments.



Comment at: clang/test/CodeGen/aarch64-neon-intrinsics-constrained.c:288
+
+// XXX FIXME do we need to check for both w and x registers?
+// COMMON-LABEL: test_vceq_f64

kpn wrote:
> Anyone? I'm not an ARM expert.
There are variants of the `cset` instruction for both `w` and `x` registers, but 
I believe that for these tests it should be stable enough to accept only one of 
them.



Comment at: clang/test/CodeGen/aarch64-neon-intrinsics-constrained.c:889
+
+// FIXME why the unused bitcast? There are several of them!
+// COMMON-LABEL: test_vrnda_f64

kpn wrote:
> ???
It is a known issue. I believe no one has dug through the NeonEmitter machinery 
to find out why yet.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77074/new/

https://reviews.llvm.org/D77074



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74766: [ARM] Fixing range checks for Neon's vqdmulhq_lane and vqrdmulhq_lane intrinsics

2020-03-19 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio accepted this revision.
dnsampaio added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74766/new/

https://reviews.llvm.org/D74766



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74619: [ARM] Enabling range checks on Neon intrinsics' lane arguments

2020-03-18 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

Hi,
thanks for looking into this. The patch LGTM, but regarding the indentation, I 
don't know what the best practice is here. We tend to prefer preserving the 
per-line git history, but if we start ignoring the formatter check, then there 
is no point in it being there.
Perhaps @t.p.northover or @olista01 could share their thoughts here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74619/new/

https://reviews.llvm.org/D74619



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D73638: [AST] Move dependence computations into a separate file

2020-03-17 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added inline comments.



Comment at: clang/lib/AST/ComputeDependence.cpp:607
+  auto D = toExprDependence(E->getType()->getDependence());
+  for (auto *E : E->arguments())
+D |= E->getDependence() & ~ExprDependence::Type;

I'm impressed this even compiled. With gcc 8.3 I get an error:
```
/work/bf/LLVM/src/clang/lib/AST/ComputeDependence.cpp:607:18: error: use of ‘E’ 
before deduction of ‘auto’
   for (auto *E : E->arguments())
  ^
```
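A minimal sketch of the fix the error points at (the exact rename used in-tree may differ): don't shadow `E` with the loop variable.

```
// Sketch only: rename the loop variable so the range expression still refers to
// the outer expression E rather than the not-yet-deduced loop variable.
auto D = toExprDependence(E->getType()->getDependence());
for (const Expr *Arg : E->arguments())
  D |= Arg->getDependence() & ~ExprDependence::Type;
```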


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73638/new/

https://reviews.llvm.org/D73638



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74618: [ARM] Creating 'call_mangled' for Neon intrinsics definitions

2020-03-10 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio accepted this revision.
dnsampaio added a comment.
This revision is now accepted and ready to land.

LGTM, after addressing the nit inline.




Comment at: clang/include/clang/Basic/arm_neon_incl.td:66
+//that has the variation and takes the given types, an error
+//is generated at tblgen time.
+def call_mangled;

As with the example above, it would be nice to have an example here as well.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74618/new/

https://reviews.llvm.org/D74618



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74766: [ARM] Fixing range checks for Neon's vqdmulhq_lane and vqrdmulhq_lane intrinsics

2020-03-09 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added inline comments.



Comment at: clang/test/CodeGen/arm-neon-range-checks.c:284-285
+void test_vqdmulhq_lane(int32x4_t a, int32x2_t b) {
+  vqdmulhq_lane_s32(a, b, -1); // expected-error {{argument value -1 is 
outside the valid range}}
+  vqdmulhq_lane_s32(a, b, 2); // expected-error {{argument value 2 is outside 
the valid range}}
+  vqdmulhq_lane_s32(a, b, 0);

As with the previous one, could we have it test the actual range bounds?
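For example, something along these lines (a sketch; the range [0, 1] is inferred from the -1/2 error cases already in the test, and the diagnostic text would need to match the real output):

```
// Sketch only: check both valid bounds as well as both out-of-range values.
void test_vqdmulhq_lane_bounds(int32x4_t a, int32x2_t b) {
  vqdmulhq_lane_s32(a, b, 0);  // lower bound, OK
  vqdmulhq_lane_s32(a, b, 1);  // upper bound, OK
  vqdmulhq_lane_s32(a, b, -1); // expected-error {{argument value -1 is outside the valid range [0, 1]}}
  vqdmulhq_lane_s32(a, b, 2);  // expected-error {{argument value 2 is outside the valid range [0, 1]}}
}
```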


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74766/new/

https://reviews.llvm.org/D74766



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74619: [ARM] Enabling range checks on Neon intrinsics' lane arguments

2020-03-09 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added inline comments.



Comment at: clang/test/CodeGen/arm-neon-range-checks.c:7
+void test_vdot_lane(int32x2_t r, int8x8_t a, int8x8_t b) {
+  vdot_lane_s32(r, a, b, -1); // expected-error {{argument value -1 is outside 
the valid range}}
+  vdot_lane_s32(r, a, b, 2); // expected-error {{argument value 2 is outside 
the valid range}}

Could we have the valid range in the tests?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74619/new/

https://reviews.llvm.org/D74619



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74618: [ARM] Creating 'call_mangled' for Neon intrinsics definitions

2020-03-05 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added inline comments.



Comment at: clang/utils/TableGen/NeonEmitter.cpp:1890-1891
 }
+if (MangledName)
+  Good &= I.getMangledName(true) == MangledName;
+

Can we move this above the loop just before it? Perhaps, if it is false, we 
could just continue the outer loop?
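For illustration, a sketch of what I mean (assuming the surrounding NeonEmitter loop structure; not necessarily the final shape):

```
// Sketch only: reject a candidate on its mangled name before the per-type
// checks, and move on to the next intrinsic in the outer loop.
if (MangledName && I.getMangledName(true) != MangledName)
  continue;
```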


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74618/new/

https://reviews.llvm.org/D74618



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74617: [ARM] Keeping sign information on bitcasts for Neon vdot_lane intrinsics

2020-03-05 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

Is this missing a test?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74617/new/

https://reviews.llvm.org/D74617



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74616: [ARM] Setting missing isLaneQ attribute on Neon Intrisics definitions

2020-03-05 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio accepted this revision.
dnsampaio added a comment.
This revision is now accepted and ready to land.

LGTM with a nit: we can save some space using syntax like this:

  let isLaneQ = 1 in
  def UDOT_LANEQ : SOpInst<"vdot_laneq", "..(<<)(<;

or grouping those that immediately follow one another:

  let isLaneQ = 1 in {
    def VFMLAL_LANEQ_LOW  : SOpInst<"vfmlal_laneq_low",  "(F>)(F>)F(FQ)I", "hQh", OP_FMLAL_LN>;
    def VFMLSL_LANEQ_LOW  : SOpInst<"vfmlsl_laneq_low",  "(F>)(F>)F(FQ)I", "hQh", OP_FMLSL_LN>;
    def VFMLAL_LANEQ_HIGH : SOpInst<"vfmlal_laneq_high", "(F>)(F>)F(FQ)I", "hQh", OP_FMLAL_LN_Hi>;
    def VFMLSL_LANEQ_HIGH : SOpInst<"vfmlsl_laneq_high", "(F>)(F>)F(FQ)I", "hQh", OP_FMLSL_LN_Hi>;
  }


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74616/new/

https://reviews.llvm.org/D74616



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72932: [ARM] Follow AACPS standard for volatile bit-fields access width

2020-02-11 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

Hi @ostannard, thanks for your review.
I updated the patch so that it does not act when the computed volatile bit-field 
access would overlap a zero-length bit-field, avoiding the conflict. We can 
update it according to future versions of the AAPCS if required.
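For illustration, one hypothetical shape of struct this guards against:

```
// Sketch only: the `char : 0` closes the first storage unit after `a`, and `b`
// lives in the following byte; widening the volatile access of `a` to its
// declared 16-bit type would cross that boundary and touch `b`'s storage.
struct zero_len_example {
  volatile short a : 7;
  char : 0;
  volatile char b : 7;
};
```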


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72932/new/

https://reviews.llvm.org/D72932



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72932: [ARM] Follow AACPS standard for volatile bit-fields access width

2020-02-07 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio updated this revision to Diff 243197.
dnsampaio added a comment.

Added opt-out flag


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72932/new/

https://reviews.llvm.org/D72932

Files:
  clang/include/clang/Basic/CodeGenOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CGRecordLayout.h
  clang/lib/CodeGen/CGRecordLayoutBuilder.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CodeGen/aapcs-bitfield.c
  clang/test/CodeGen/bitfield-2.c

Index: clang/test/CodeGen/bitfield-2.c
===
--- clang/test/CodeGen/bitfield-2.c
+++ clang/test/CodeGen/bitfield-2.c
@@ -1,3 +1,4 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -emit-llvm -triple x86_64 -O3 -o %t.opt.ll %s \
 // RUN:   -fdump-record-layouts > %t.dump.txt
 // RUN: FileCheck -check-prefix=CHECK-RECORD < %t.dump.txt %s
@@ -14,7 +15,7 @@
 // CHECK-RECORD:   LLVMType:%struct.s0 = type { [3 x i8] }
 // CHECK-RECORD:   IsZeroInitializable:1
 // CHECK-RECORD:   BitFields:[
-// CHECK-RECORD: 
+// CHECK-RECORD: 
-// CHECK-RECORD: 
+// CHECK-RECORD: 
+// CHECK-RECORD: 
-// CHECK-RECORD: 
+// CHECK-RECORD: 
 
 struct __attribute__((packed)) s9 {
Index: clang/test/CodeGen/aapcs-bitfield.c
===
--- clang/test/CodeGen/aapcs-bitfield.c
+++ clang/test/CodeGen/aapcs-bitfield.c
@@ -1,8 +1,12 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
-// RUN: %clang_cc1 -triple armv8-none-linux-eabi %s -emit-llvm -o - -O3 | FileCheck %s -check-prefix=LE
-// RUN: %clang_cc1 -triple armebv8-none-linux-eabi %s -emit-llvm -o - -O3 | FileCheck %s -check-prefix=BE
-// RUN: %clang_cc1 -triple armv8-none-linux-eabi %s -emit-llvm -o - -O3 -fAAPCSBitfieldLoad | FileCheck %s -check-prefixes=LE,LENUMLOADS
-// RUN: %clang_cc1 -triple armebv8-none-linux-eabi %s -emit-llvm -o - -O3 -fAAPCSBitfieldLoad | FileCheck %s -check-prefixes=BE,BENUMLOADS
+// RUN: %clang_cc1 -triple armv8-none-linux-eabi %s -emit-llvm -o - -O3 -fno-AAPCSBitfieldWidth | FileCheck %s -check-prefix=LE
+// RUN: %clang_cc1 -triple armebv8-none-linux-eabi %s -emit-llvm -o - -O3 -fno-AAPCSBitfieldWidth | FileCheck %s -check-prefix=BE
+// RUN: %clang_cc1 -triple armv8-none-linux-eabi %s -emit-llvm -o - -O3 -fAAPCSBitfieldLoad -fno-AAPCSBitfieldWidth | FileCheck %s -check-prefixes=LE,LENUMLOADS
+// RUN: %clang_cc1 -triple armebv8-none-linux-eabi %s -emit-llvm -o - -O3 -fAAPCSBitfieldLoad -fno-AAPCSBitfieldWidth | FileCheck %s -check-prefixes=BE,BENUMLOADS
+// RUN: %clang_cc1 -triple armv8-none-linux-eabi %s -emit-llvm -o - -O3 | FileCheck %s -check-prefix=LEWIDTH
+// RUN: %clang_cc1 -triple armebv8-none-linux-eabi %s -emit-llvm -o - -O3 | FileCheck %s -check-prefix=BEWIDTH
+// RUN: %clang_cc1 -triple armv8-none-linux-eabi %s -emit-llvm -o - -O3 -fAAPCSBitfieldLoad | FileCheck %s -check-prefixes=LEWIDTH,LEWIDTHNUM
+// RUN: %clang_cc1 -triple armebv8-none-linux-eabi %s -emit-llvm -o - -O3 -fAAPCSBitfieldLoad | FileCheck %s -check-prefixes=BEWIDTH,BEWIDTHNUM
 
 struct st0 {
   short c : 7;
@@ -24,6 +28,22 @@
 // BE-NEXT:[[BF_ASHR:%.*]] = ashr i8 [[BF_LOAD]], 1
 // BE-NEXT:[[CONV:%.*]] = sext i8 [[BF_ASHR]] to i32
 // BE-NEXT:ret i32 [[CONV]]
+// LEWIDTH-LABEL: @st0_check_load(
+// LEWIDTH-NEXT:  entry:
+// LEWIDTH-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST0:%.*]], %struct.st0* [[M:%.*]], i32 0, i32 0
+// LEWIDTH-NEXT:[[BF_LOAD:%.*]] = load i8, i8* [[TMP0]], align 2
+// LEWIDTH-NEXT:[[BF_SHL:%.*]] = shl i8 [[BF_LOAD]], 1
+// LEWIDTH-NEXT:[[BF_ASHR:%.*]] = ashr exact i8 [[BF_SHL]], 1
+// LEWIDTH-NEXT:[[CONV:%.*]] = sext i8 [[BF_ASHR]] to i32
+// LEWIDTH-NEXT:ret i32 [[CONV]]
+//
+// BEWIDTH-LABEL: @st0_check_load(
+// BEWIDTH-NEXT:  entry:
+// BEWIDTH-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST0:%.*]], %struct.st0* [[M:%.*]], i32 0, i32 0
+// BEWIDTH-NEXT:[[BF_LOAD:%.*]] = load i8, i8* [[TMP0]], align 2
+// BEWIDTH-NEXT:[[BF_ASHR:%.*]] = ashr i8 [[BF_LOAD]], 1
+// BEWIDTH-NEXT:[[CONV:%.*]] = sext i8 [[BF_ASHR]] to i32
+// BEWIDTH-NEXT:ret i32 [[CONV]]
 //
 int st0_check_load(struct st0 *m) {
   return m->c;
@@ -46,6 +66,23 @@
 // BE-NEXT:[[BF_SET:%.*]] = or i8 [[BF_CLEAR]], 2
 // BE-NEXT:store i8 [[BF_SET]], i8* [[TMP0]], align 2
 // BE-NEXT:ret void
+// LEWIDTH-LABEL: @st0_check_store(
+// LEWIDTH-NEXT:  entry:
+// LEWIDTH-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST0:%.*]], %struct.st0* [[M:%.*]], i32 0, i32 0
+// LEWIDTH-NEXT:[[BF_LOAD:%.*]] = load i8, i8* [[TMP0]], align 2
+// LEWIDTH-NEXT:[[BF_CLEAR:%.*]] = and i8 [[BF_LOAD]], -128
+// LEWIDTH-NEXT:[[BF_SET:%.*]] = or i8 [[BF_CLEAR]], 1
+// LEWIDTH-NEXT:store i8 [[BF_SET]], i8* [[TMP0]], align 2
+// LEWIDTH-NEXT:ret void
+//
+// 

[PATCH] D67399: [ARM] Follow AACPS for preserving number of loads/stores of volatile bit-fields

2020-02-07 Thread Diogo N. Sampaio via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG9d869180c4ad: [ARM] Follow AACPS for preserving number of 
loads/stores of volatile bit-fields (authored by dnsampaio).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67399/new/

https://reviews.llvm.org/D67399

Files:
  clang/include/clang/Basic/CodeGenOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CodeGen/aapcs-bitfield.c

Index: clang/test/CodeGen/aapcs-bitfield.c
===
--- clang/test/CodeGen/aapcs-bitfield.c
+++ clang/test/CodeGen/aapcs-bitfield.c
@@ -1,6 +1,8 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -triple armv8-none-linux-eabi %s -emit-llvm -o - -O3 | FileCheck %s -check-prefix=LE
 // RUN: %clang_cc1 -triple armebv8-none-linux-eabi %s -emit-llvm -o - -O3 | FileCheck %s -check-prefix=BE
+// RUN: %clang_cc1 -triple armv8-none-linux-eabi %s -emit-llvm -o - -O3 -fAAPCSBitfieldLoad | FileCheck %s -check-prefixes=LE,LENUMLOADS
+// RUN: %clang_cc1 -triple armebv8-none-linux-eabi %s -emit-llvm -o - -O3 -fAAPCSBitfieldLoad | FileCheck %s -check-prefixes=BE,BENUMLOADS
 
 struct st0 {
   short c : 7;
@@ -144,7 +146,7 @@
 void st2_check_store(struct st2 *m) {
   m->c = 1;
 }
-
+// Volatile access is allowed to use 16 bits
 struct st3 {
   volatile short c : 7;
 };
@@ -191,7 +193,7 @@
 void st3_check_store(struct st3 *m) {
   m->c = 1;
 }
-
+// Volatile access to st4.c should use a char ld/st
 struct st4 {
   int b : 9;
   volatile char c : 5;
@@ -324,16 +326,16 @@
 // LE-LABEL: @st6_check_load(
 // LE-NEXT:  entry:
 // LE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST6:%.*]], %struct.st6* [[M:%.*]], i32 0, i32 0
-// LE-NEXT:[[BF_LOAD:%.*]] = load i16, i16* [[TMP0]], align 4
+// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 4
 // LE-NEXT:[[BF_SHL:%.*]] = shl i16 [[BF_LOAD]], 4
 // LE-NEXT:[[BF_ASHR:%.*]] = ashr exact i16 [[BF_SHL]], 4
 // LE-NEXT:[[BF_CAST:%.*]] = sext i16 [[BF_ASHR]] to i32
 // LE-NEXT:[[B:%.*]] = getelementptr inbounds [[STRUCT_ST6]], %struct.st6* [[M]], i32 0, i32 1
-// LE-NEXT:[[TMP1:%.*]] = load i8, i8* [[B]], align 2, !tbaa !3
+// LE-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2
 // LE-NEXT:[[CONV:%.*]] = sext i8 [[TMP1]] to i32
 // LE-NEXT:[[ADD:%.*]] = add nsw i32 [[BF_CAST]], [[CONV]]
 // LE-NEXT:[[C:%.*]] = getelementptr inbounds [[STRUCT_ST6]], %struct.st6* [[M]], i32 0, i32 2
-// LE-NEXT:[[BF_LOAD1:%.*]] = load i8, i8* [[C]], align 1
+// LE-NEXT:[[BF_LOAD1:%.*]] = load volatile i8, i8* [[C]], align 1
 // LE-NEXT:[[BF_SHL2:%.*]] = shl i8 [[BF_LOAD1]], 3
 // LE-NEXT:[[BF_ASHR3:%.*]] = ashr exact i8 [[BF_SHL2]], 3
 // LE-NEXT:[[BF_CAST4:%.*]] = sext i8 [[BF_ASHR3]] to i32
@@ -343,21 +345,21 @@
 // BE-LABEL: @st6_check_load(
 // BE-NEXT:  entry:
 // BE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST6:%.*]], %struct.st6* [[M:%.*]], i32 0, i32 0
-// BE-NEXT:[[BF_LOAD:%.*]] = load i16, i16* [[TMP0]], align 4
+// BE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 4
 // BE-NEXT:[[BF_ASHR:%.*]] = ashr i16 [[BF_LOAD]], 4
 // BE-NEXT:[[BF_CAST:%.*]] = sext i16 [[BF_ASHR]] to i32
 // BE-NEXT:[[B:%.*]] = getelementptr inbounds [[STRUCT_ST6]], %struct.st6* [[M]], i32 0, i32 1
-// BE-NEXT:[[TMP1:%.*]] = load i8, i8* [[B]], align 2, !tbaa !3
+// BE-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2
 // BE-NEXT:[[CONV:%.*]] = sext i8 [[TMP1]] to i32
 // BE-NEXT:[[ADD:%.*]] = add nsw i32 [[BF_CAST]], [[CONV]]
 // BE-NEXT:[[C:%.*]] = getelementptr inbounds [[STRUCT_ST6]], %struct.st6* [[M]], i32 0, i32 2
-// BE-NEXT:[[BF_LOAD1:%.*]] = load i8, i8* [[C]], align 1
+// BE-NEXT:[[BF_LOAD1:%.*]] = load volatile i8, i8* [[C]], align 1
 // BE-NEXT:[[BF_ASHR2:%.*]] = ashr i8 [[BF_LOAD1]], 3
 // BE-NEXT:[[BF_CAST3:%.*]] = sext i8 [[BF_ASHR2]] to i32
 // BE-NEXT:[[ADD4:%.*]] = add nsw i32 [[ADD]], [[BF_CAST3]]
 // BE-NEXT:ret i32 [[ADD4]]
 //
-int st6_check_load(struct st6 *m) {
+int st6_check_load(volatile struct st6 *m) {
   int x = m->a;
   x += m->b;
   x += m->c;
@@ -372,7 +374,7 @@
 // LE-NEXT:[[BF_SET:%.*]] = or i16 [[BF_CLEAR]], 1
 // LE-NEXT:store i16 [[BF_SET]], i16* [[TMP0]], align 4
 // LE-NEXT:[[B:%.*]] = getelementptr inbounds [[STRUCT_ST6]], %struct.st6* [[M]], i32 0, i32 1
-// LE-NEXT:store i8 2, i8* [[B]], align 2, !tbaa !3
+// LE-NEXT:store i8 2, i8* [[B]], align 2
 // LE-NEXT:[[C:%.*]] = getelementptr inbounds [[STRUCT_ST6]], %struct.st6* [[M]], i32 0, i32 2
 // LE-NEXT:[[BF_LOAD1:%.*]] = load i8, i8* [[C]], align 1
 // LE-NEXT:[[BF_CLEAR2:%.*]] = and i8 [[BF_LOAD1]], -32
@@ -388,7 +390,7 @@
 // BE-NEXT:[[BF_SET:%.*]] = or i16 

[PATCH] D67399: [ARM] Follow AACPS for preserving number of loads/stores of volatile bit-fields

2020-02-07 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio updated this revision to Diff 243102.
dnsampaio added a comment.

Reordered some tests and added the definition of the "isAAPCS" function from 
patch D72932 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67399/new/

https://reviews.llvm.org/D67399

Files:
  clang/include/clang/Basic/CodeGenOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CodeGen/aapcs-bitfield.c

Index: clang/test/CodeGen/aapcs-bitfield.c
===
--- clang/test/CodeGen/aapcs-bitfield.c
+++ clang/test/CodeGen/aapcs-bitfield.c
@@ -1,6 +1,8 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -triple armv8-none-linux-eabi %s -emit-llvm -o - -O3 | FileCheck %s -check-prefix=LE
 // RUN: %clang_cc1 -triple armebv8-none-linux-eabi %s -emit-llvm -o - -O3 | FileCheck %s -check-prefix=BE
+// RUN: %clang_cc1 -triple armv8-none-linux-eabi %s -emit-llvm -o - -O3 -fAAPCSBitfieldLoad | FileCheck %s -check-prefixes=LE,LENUMLOADS
+// RUN: %clang_cc1 -triple armebv8-none-linux-eabi %s -emit-llvm -o - -O3 -fAAPCSBitfieldLoad | FileCheck %s -check-prefixes=BE,BENUMLOADS
 
 struct st0 {
   short c : 7;
@@ -144,7 +146,7 @@
 void st2_check_store(struct st2 *m) {
   m->c = 1;
 }
-
+// Volatile access is allowed to use 16 bits
 struct st3 {
   volatile short c : 7;
 };
@@ -191,7 +193,7 @@
 void st3_check_store(struct st3 *m) {
   m->c = 1;
 }
-
+// Volatile access to st4.c should use a char ld/st
 struct st4 {
   int b : 9;
   volatile char c : 5;
@@ -324,16 +326,16 @@
 // LE-LABEL: @st6_check_load(
 // LE-NEXT:  entry:
 // LE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST6:%.*]], %struct.st6* [[M:%.*]], i32 0, i32 0
-// LE-NEXT:[[BF_LOAD:%.*]] = load i16, i16* [[TMP0]], align 4
+// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 4
 // LE-NEXT:[[BF_SHL:%.*]] = shl i16 [[BF_LOAD]], 4
 // LE-NEXT:[[BF_ASHR:%.*]] = ashr exact i16 [[BF_SHL]], 4
 // LE-NEXT:[[BF_CAST:%.*]] = sext i16 [[BF_ASHR]] to i32
 // LE-NEXT:[[B:%.*]] = getelementptr inbounds [[STRUCT_ST6]], %struct.st6* [[M]], i32 0, i32 1
-// LE-NEXT:[[TMP1:%.*]] = load i8, i8* [[B]], align 2, !tbaa !3
+// LE-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2
 // LE-NEXT:[[CONV:%.*]] = sext i8 [[TMP1]] to i32
 // LE-NEXT:[[ADD:%.*]] = add nsw i32 [[BF_CAST]], [[CONV]]
 // LE-NEXT:[[C:%.*]] = getelementptr inbounds [[STRUCT_ST6]], %struct.st6* [[M]], i32 0, i32 2
-// LE-NEXT:[[BF_LOAD1:%.*]] = load i8, i8* [[C]], align 1
+// LE-NEXT:[[BF_LOAD1:%.*]] = load volatile i8, i8* [[C]], align 1
 // LE-NEXT:[[BF_SHL2:%.*]] = shl i8 [[BF_LOAD1]], 3
 // LE-NEXT:[[BF_ASHR3:%.*]] = ashr exact i8 [[BF_SHL2]], 3
 // LE-NEXT:[[BF_CAST4:%.*]] = sext i8 [[BF_ASHR3]] to i32
@@ -343,21 +345,21 @@
 // BE-LABEL: @st6_check_load(
 // BE-NEXT:  entry:
 // BE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST6:%.*]], %struct.st6* [[M:%.*]], i32 0, i32 0
-// BE-NEXT:[[BF_LOAD:%.*]] = load i16, i16* [[TMP0]], align 4
+// BE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 4
 // BE-NEXT:[[BF_ASHR:%.*]] = ashr i16 [[BF_LOAD]], 4
 // BE-NEXT:[[BF_CAST:%.*]] = sext i16 [[BF_ASHR]] to i32
 // BE-NEXT:[[B:%.*]] = getelementptr inbounds [[STRUCT_ST6]], %struct.st6* [[M]], i32 0, i32 1
-// BE-NEXT:[[TMP1:%.*]] = load i8, i8* [[B]], align 2, !tbaa !3
+// BE-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2
 // BE-NEXT:[[CONV:%.*]] = sext i8 [[TMP1]] to i32
 // BE-NEXT:[[ADD:%.*]] = add nsw i32 [[BF_CAST]], [[CONV]]
 // BE-NEXT:[[C:%.*]] = getelementptr inbounds [[STRUCT_ST6]], %struct.st6* [[M]], i32 0, i32 2
-// BE-NEXT:[[BF_LOAD1:%.*]] = load i8, i8* [[C]], align 1
+// BE-NEXT:[[BF_LOAD1:%.*]] = load volatile i8, i8* [[C]], align 1
 // BE-NEXT:[[BF_ASHR2:%.*]] = ashr i8 [[BF_LOAD1]], 3
 // BE-NEXT:[[BF_CAST3:%.*]] = sext i8 [[BF_ASHR2]] to i32
 // BE-NEXT:[[ADD4:%.*]] = add nsw i32 [[ADD]], [[BF_CAST3]]
 // BE-NEXT:ret i32 [[ADD4]]
 //
-int st6_check_load(struct st6 *m) {
+int st6_check_load(volatile struct st6 *m) {
   int x = m->a;
   x += m->b;
   x += m->c;
@@ -372,7 +374,7 @@
 // LE-NEXT:[[BF_SET:%.*]] = or i16 [[BF_CLEAR]], 1
 // LE-NEXT:store i16 [[BF_SET]], i16* [[TMP0]], align 4
 // LE-NEXT:[[B:%.*]] = getelementptr inbounds [[STRUCT_ST6]], %struct.st6* [[M]], i32 0, i32 1
-// LE-NEXT:store i8 2, i8* [[B]], align 2, !tbaa !3
+// LE-NEXT:store i8 2, i8* [[B]], align 2
 // LE-NEXT:[[C:%.*]] = getelementptr inbounds [[STRUCT_ST6]], %struct.st6* [[M]], i32 0, i32 2
 // LE-NEXT:[[BF_LOAD1:%.*]] = load i8, i8* [[C]], align 1
 // LE-NEXT:[[BF_CLEAR2:%.*]] = and i8 [[BF_LOAD1]], -32
@@ -388,7 +390,7 @@
 // BE-NEXT:[[BF_SET:%.*]] = or i16 [[BF_CLEAR]], 

[PATCH] D72932: [ARM] Follow AACPS standard for volatile bit-fields access width

2020-02-06 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio planned changes to this revision.
dnsampaio added a comment.

Updated wrong patch here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72932/new/

https://reviews.llvm.org/D72932



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72932: [ARM] Follow AACPS standard for volatile bit-fields access width

2020-02-06 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio updated this revision to Diff 242823.
dnsampaio added a comment.
Herald added subscribers: llvm-commits, hiraditya.
Herald added a project: LLVM.



- Removed test
- Added clearing at the end of the run as well, to clean up leftover state
- Moved the clearing to a more sensible position


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72932/new/

https://reviews.llvm.org/D72932

Files:
  llvm/lib/CodeGen/TypePromotion.cpp


Index: llvm/lib/CodeGen/TypePromotion.cpp
===
--- llvm/lib/CodeGen/TypePromotion.cpp
+++ llvm/lib/CodeGen/TypePromotion.cpp
@@ -847,8 +847,7 @@
 
   // Iterate through, and add to, a tree of operands and users in the use-def.
   while (!WorkList.empty()) {
-Value *V = WorkList.back();
-WorkList.pop_back();
+Value *V = WorkList.pop_back_val();
 if (CurrentVisited.count(V))
   continue;
 
@@ -917,7 +916,7 @@
  ++ToPromote;
}
 
-  // DAG optimisations should be able to handle these cases better, especially
+  // DAG optimizations should be able to handle these cases better, especially
   // for function arguments.
   if (ToPromote < 2 || (Blocks.size() == 1 && (NonFreeArgs > SafeWrap.size(
 return false;
@@ -941,6 +940,9 @@
   if (!TPC)
 return false;
 
+  AllVisited.clear();
+  SafeToPromote.clear();
+  SafeWrap.clear();
   bool MadeChange = false;
   const DataLayout &DL = F.getParent()->getDataLayout();
   const TargetMachine &TM = TPC->getTM<TargetMachine>();
@@ -998,6 +1000,10 @@
   if (MadeChange)
 LLVM_DEBUG(dbgs() << "After TypePromotion: " << F << "\n");
 
+  AllVisited.clear();
+  SafeToPromote.clear();
+  SafeWrap.clear();
+
   return MadeChange;
 }
 


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72932: [ARM] Follow AACPS standard for volatile bit-fields access width

2020-02-05 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72932/new/

https://reviews.llvm.org/D72932



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72932: [ARM] Follow AACPS standard for volatile bit-fields access width

2020-02-05 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

Ping :-)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72932/new/

https://reviews.llvm.org/D72932



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D67399: [ARM] Follow AACPS for preserving number of loads/stores of volatile bit-fields

2020-01-30 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio updated this revision to Diff 241486.
dnsampaio added a comment.

- Added flag to allow user to opt-in


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67399/new/

https://reviews.llvm.org/D67399

Files:
  clang/include/clang/Basic/CodeGenOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CodeGen/aapcs-bitfield.c

Index: clang/test/CodeGen/aapcs-bitfield.c
===
--- clang/test/CodeGen/aapcs-bitfield.c
+++ clang/test/CodeGen/aapcs-bitfield.c
@@ -1,6 +1,8 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -triple armv8-none-linux-eabi %s -emit-llvm -o - -O3 | FileCheck %s -check-prefix=LE
 // RUN: %clang_cc1 -triple armebv8-none-linux-eabi %s -emit-llvm -o - -O3 | FileCheck %s -check-prefix=BE
+// RUN: %clang_cc1 -triple armv8-none-linux-eabi %s -emit-llvm -o - -O3 -fAAPCSBitfieldLoad | FileCheck %s -check-prefixes=LE,LENUMLOADS
+// RUN: %clang_cc1 -triple armebv8-none-linux-eabi %s -emit-llvm -o - -O3 -fAAPCSBitfieldLoad | FileCheck %s -check-prefixes=BE,BENUMLOADS
 
 struct st0 {
   short c : 7;
@@ -680,12 +682,14 @@
 // LE-LABEL: @store_st11(
 // LE-NEXT:  entry:
 // LE-NEXT:[[F:%.*]] = getelementptr inbounds [[STRUCT_ST11:%.*]], %struct.st11* [[M:%.*]], i32 0, i32 1
+// LENUMLOADS-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[F]], align 1
 // LE-NEXT:store volatile i16 1, i16* [[F]], align 1
 // LE-NEXT:ret void
 //
 // BE-LABEL: @store_st11(
 // BE-NEXT:  entry:
 // BE-NEXT:[[F:%.*]] = getelementptr inbounds [[STRUCT_ST11:%.*]], %struct.st11* [[M:%.*]], i32 0, i32 1
+// BENUMLOADS-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[F]], align 1
 // BE-NEXT:store volatile i16 1, i16* [[F]], align 1
 // BE-NEXT:ret void
 //
@@ -698,6 +702,7 @@
 // LE-NEXT:[[F:%.*]] = getelementptr inbounds [[STRUCT_ST11:%.*]], %struct.st11* [[M:%.*]], i32 0, i32 1
 // LE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[F]], align 1
 // LE-NEXT:[[INC:%.*]] = add i16 [[BF_LOAD]], 1
+// LENUMLOADS-NEXT:[[BF_LOAD1:%.*]] = load volatile i16, i16* [[F]], align 1
 // LE-NEXT:store volatile i16 [[INC]], i16* [[F]], align 1
 // LE-NEXT:ret void
 //
@@ -706,6 +711,7 @@
 // BE-NEXT:[[F:%.*]] = getelementptr inbounds [[STRUCT_ST11:%.*]], %struct.st11* [[M:%.*]], i32 0, i32 1
 // BE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[F]], align 1
 // BE-NEXT:[[INC:%.*]] = add i16 [[BF_LOAD]], 1
+// BENUMLOADS-NEXT:[[BF_LOAD1:%.*]] = load volatile i16, i16* [[F]], align 1
 // BE-NEXT:store volatile i16 [[INC]], i16* [[F]], align 1
 // BE-NEXT:ret void
 //
@@ -882,6 +888,7 @@
 // LE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST14:%.*]], %struct.st14* [[S:%.*]], i32 0, i32 0
 // LE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 1
 // LE-NEXT:[[INC:%.*]] = add i8 [[BF_LOAD]], 1
+// LENUMLOADS-NEXT:[[BF_LOAD1:%.*]] = load volatile i8, i8* [[TMP0]], align 1
 // LE-NEXT:store volatile i8 [[INC]], i8* [[TMP0]], align 1
 // LE-NEXT:ret void
 //
@@ -890,6 +897,7 @@
 // BE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST14:%.*]], %struct.st14* [[S:%.*]], i32 0, i32 0
 // BE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 1
 // BE-NEXT:[[INC:%.*]] = add i8 [[BF_LOAD]], 1
+// BENUMLOADS-NEXT:[[BF_LOAD1:%.*]] = load volatile i8, i8* [[TMP0]], align 1
 // BE-NEXT:store volatile i8 [[INC]], i8* [[TMP0]], align 1
 // BE-NEXT:ret void
 //
@@ -906,6 +914,7 @@
 // LE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST15:%.*]], %struct.st15* [[S:%.*]], i32 0, i32 0
 // LE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 1
 // LE-NEXT:[[INC:%.*]] = add i8 [[BF_LOAD]], 1
+// LENUMLOADS-NEXT:[[BF_LOAD1:%.*]] = load volatile i8, i8* [[TMP0]], align 1
 // LE-NEXT:store volatile i8 [[INC]], i8* [[TMP0]], align 1
 // LE-NEXT:ret void
 //
@@ -914,6 +923,7 @@
 // BE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST15:%.*]], %struct.st15* [[S:%.*]], i32 0, i32 0
 // BE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 1
 // BE-NEXT:[[INC:%.*]] = add i8 [[BF_LOAD]], 1
+// BENUMLOADS-NEXT:[[BF_LOAD1:%.*]] = load volatile i8, i8* [[TMP0]], align 1
 // BE-NEXT:store volatile i8 [[INC]], i8* [[TMP0]], align 1
 // BE-NEXT:ret void
 //
@@ -1061,6 +1071,7 @@
 // LE-NEXT:[[TMP0:%.*]] = bitcast %struct.st16* [[S:%.*]] to i32*
 // LE-NEXT:[[BF_LOAD:%.*]] = load volatile i32, i32* [[TMP0]], align 4
 // LE-NEXT:[[INC:%.*]] = add nsw i32 [[BF_LOAD]], 1
+// LENUMLOADS-NEXT:[[BF_LOAD1:%.*]] = load volatile i32, i32* [[TMP0]], align 4
 // LE-NEXT:store volatile i32 [[INC]], i32* [[TMP0]], align 4
 // LE-NEXT:ret void
 //
@@ -1069,6 +1080,7 @@
 // BE-NEXT:[[TMP0:%.*]] = bitcast %struct.st16* 

[PATCH] D72932: [ARM] Follow AACPS standard for volatile bit-fields access width

2020-01-30 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio updated this revision to Diff 241477.
dnsampaio added a comment.

- Do not generate the special volatile access if the record alignment is smaller 
than the alignment of the bit-field's declared type
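For illustration, one hypothetical case that satisfies this condition:

```
// Sketch only: the packed record is 1-byte aligned, smaller than the 2-byte
// alignment of the declared type (short), so the narrower container access is
// kept instead of a 16-bit volatile access that could be misaligned or read
// past the end of the object.
struct __attribute__((packed)) packed_bf {
  volatile short c : 7;
};
```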


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72932/new/

https://reviews.llvm.org/D72932

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CGRecordLayout.h
  clang/lib/CodeGen/CGRecordLayoutBuilder.cpp
  clang/test/CodeGen/aapcs-bitfield.c
  clang/test/CodeGen/bitfield-2.c

Index: clang/test/CodeGen/bitfield-2.c
===
--- clang/test/CodeGen/bitfield-2.c
+++ clang/test/CodeGen/bitfield-2.c
@@ -1,3 +1,4 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -emit-llvm -triple x86_64 -O3 -o %t.opt.ll %s \
 // RUN:   -fdump-record-layouts > %t.dump.txt
 // RUN: FileCheck -check-prefix=CHECK-RECORD < %t.dump.txt %s
@@ -14,7 +15,7 @@
 // CHECK-RECORD:   LLVMType:%struct.s0 = type { [3 x i8] }
 // CHECK-RECORD:   IsZeroInitializable:1
 // CHECK-RECORD:   BitFields:[
-// CHECK-RECORD: 
+// CHECK-RECORD: 
-// CHECK-RECORD: 
+// CHECK-RECORD: 
+// CHECK-RECORD: 
-// CHECK-RECORD: 
+// CHECK-RECORD: 
 
 struct __attribute__((packed)) s9 {
Index: clang/test/CodeGen/aapcs-bitfield.c
===
--- clang/test/CodeGen/aapcs-bitfield.c
+++ clang/test/CodeGen/aapcs-bitfield.c
@@ -144,26 +144,26 @@
 void st2_check_store(struct st2 *m) {
   m->c = 1;
 }
-
+// Volatile access is allowed to use 16 bits
 struct st3 {
   volatile short c : 7;
 };
 
 // LE-LABEL: @st3_check_load(
 // LE-NEXT:  entry:
-// LE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST3:%.*]], %struct.st3* [[M:%.*]], i32 0, i32 0
-// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 2
-// LE-NEXT:[[BF_SHL:%.*]] = shl i8 [[BF_LOAD]], 1
-// LE-NEXT:[[BF_ASHR:%.*]] = ashr exact i8 [[BF_SHL]], 1
-// LE-NEXT:[[CONV:%.*]] = sext i8 [[BF_ASHR]] to i32
+// LE-NEXT:[[TMP0:%.*]] = bitcast %struct.st3* [[M:%.*]] to i16*
+// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 2
+// LE-NEXT:[[BF_SHL:%.*]] = shl i16 [[BF_LOAD]], 9
+// LE-NEXT:[[BF_ASHR:%.*]] = ashr exact i16 [[BF_SHL]], 9
+// LE-NEXT:[[CONV:%.*]] = sext i16 [[BF_ASHR]] to i32
 // LE-NEXT:ret i32 [[CONV]]
 //
 // BE-LABEL: @st3_check_load(
 // BE-NEXT:  entry:
-// BE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST3:%.*]], %struct.st3* [[M:%.*]], i32 0, i32 0
-// BE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 2
-// BE-NEXT:[[BF_ASHR:%.*]] = ashr i8 [[BF_LOAD]], 1
-// BE-NEXT:[[CONV:%.*]] = sext i8 [[BF_ASHR]] to i32
+// BE-NEXT:[[TMP0:%.*]] = bitcast %struct.st3* [[M:%.*]] to i16*
+// BE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 2
+// BE-NEXT:[[BF_ASHR:%.*]] = ashr i16 [[BF_LOAD]], 9
+// BE-NEXT:[[CONV:%.*]] = sext i16 [[BF_ASHR]] to i32
 // BE-NEXT:ret i32 [[CONV]]
 //
 int st3_check_load(struct st3 *m) {
@@ -172,26 +172,26 @@
 
 // LE-LABEL: @st3_check_store(
 // LE-NEXT:  entry:
-// LE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST3:%.*]], %struct.st3* [[M:%.*]], i32 0, i32 0
-// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 2
-// LE-NEXT:[[BF_CLEAR:%.*]] = and i8 [[BF_LOAD]], -128
-// LE-NEXT:[[BF_SET:%.*]] = or i8 [[BF_CLEAR]], 1
-// LE-NEXT:store volatile i8 [[BF_SET]], i8* [[TMP0]], align 2
+// LE-NEXT:[[TMP0:%.*]] = bitcast %struct.st3* [[M:%.*]] to i16*
+// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 2
+// LE-NEXT:[[BF_CLEAR:%.*]] = and i16 [[BF_LOAD]], -128
+// LE-NEXT:[[BF_SET:%.*]] = or i16 [[BF_CLEAR]], 1
+// LE-NEXT:store volatile i16 [[BF_SET]], i16* [[TMP0]], align 2
 // LE-NEXT:ret void
 //
 // BE-LABEL: @st3_check_store(
 // BE-NEXT:  entry:
-// BE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST3:%.*]], %struct.st3* [[M:%.*]], i32 0, i32 0
-// BE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 2
-// BE-NEXT:[[BF_CLEAR:%.*]] = and i8 [[BF_LOAD]], 1
-// BE-NEXT:[[BF_SET:%.*]] = or i8 [[BF_CLEAR]], 2
-// BE-NEXT:store volatile i8 [[BF_SET]], i8* [[TMP0]], align 2
+// BE-NEXT:[[TMP0:%.*]] = bitcast %struct.st3* [[M:%.*]] to i16*
+// BE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 2
+// BE-NEXT:[[BF_CLEAR:%.*]] = and i16 [[BF_LOAD]], 511
+// BE-NEXT:[[BF_SET:%.*]] = or i16 [[BF_CLEAR]], 512
+// BE-NEXT:store volatile i16 [[BF_SET]], i16* [[TMP0]], align 2
 // BE-NEXT:ret void
 //
 void st3_check_store(struct st3 *m) {
   m->c = 1;
 }
-
+// Volatile access to st4.c should use a char ld/st
 struct st4 {
   int b : 9;
   volatile char c : 5;
@@ -199,24 +199,22 @@
 
 // LE-LABEL: @st4_check_load(
 // LE-NEXT:  entry:
-// LE-NEXT:[[TMP0:%.*]] = getelementptr 

[PATCH] D72932: [ARM] Follow AACPS standard for volatile bit-fields access width

2020-01-30 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio updated this revision to Diff 241421.
dnsampaio added a comment.
Herald added a subscriber: jfb.

- Moved computation of volatile accesses to the record layout builder


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72932/new/

https://reviews.llvm.org/D72932

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CGRecordLayout.h
  clang/lib/CodeGen/CGRecordLayoutBuilder.cpp
  clang/test/CodeGen/aapcs-bitfield.c
  clang/test/CodeGen/bitfield-2.c

Index: clang/test/CodeGen/bitfield-2.c
===
--- clang/test/CodeGen/bitfield-2.c
+++ clang/test/CodeGen/bitfield-2.c
@@ -1,3 +1,4 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -emit-llvm -triple x86_64 -O3 -o %t.opt.ll %s \
 // RUN:   -fdump-record-layouts > %t.dump.txt
 // RUN: FileCheck -check-prefix=CHECK-RECORD < %t.dump.txt %s
@@ -14,7 +15,7 @@
 // CHECK-RECORD:   LLVMType:%struct.s0 = type { [3 x i8] }
 // CHECK-RECORD:   IsZeroInitializable:1
 // CHECK-RECORD:   BitFields:[
-// CHECK-RECORD: 
+// CHECK-RECORD: 
-// CHECK-RECORD: 
+// CHECK-RECORD: 
+// CHECK-RECORD: 
-// CHECK-RECORD: 
+// CHECK-RECORD: 
 
 struct __attribute__((packed)) s9 {
Index: clang/test/CodeGen/aapcs-bitfield.c
===
--- clang/test/CodeGen/aapcs-bitfield.c
+++ clang/test/CodeGen/aapcs-bitfield.c
@@ -151,19 +151,19 @@
 
 // LE-LABEL: @st3_check_load(
 // LE-NEXT:  entry:
-// LE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST3:%.*]], %struct.st3* [[M:%.*]], i32 0, i32 0
-// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 2
-// LE-NEXT:[[BF_SHL:%.*]] = shl i8 [[BF_LOAD]], 1
-// LE-NEXT:[[BF_ASHR:%.*]] = ashr exact i8 [[BF_SHL]], 1
-// LE-NEXT:[[CONV:%.*]] = sext i8 [[BF_ASHR]] to i32
+// LE-NEXT:[[TMP0:%.*]] = bitcast %struct.st3* [[M:%.*]] to i16*
+// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 2
+// LE-NEXT:[[BF_SHL:%.*]] = shl i16 [[BF_LOAD]], 9
+// LE-NEXT:[[BF_ASHR:%.*]] = ashr exact i16 [[BF_SHL]], 9
+// LE-NEXT:[[CONV:%.*]] = sext i16 [[BF_ASHR]] to i32
 // LE-NEXT:ret i32 [[CONV]]
 //
 // BE-LABEL: @st3_check_load(
 // BE-NEXT:  entry:
-// BE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST3:%.*]], %struct.st3* [[M:%.*]], i32 0, i32 0
-// BE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 2
-// BE-NEXT:[[BF_ASHR:%.*]] = ashr i8 [[BF_LOAD]], 1
-// BE-NEXT:[[CONV:%.*]] = sext i8 [[BF_ASHR]] to i32
+// BE-NEXT:[[TMP0:%.*]] = bitcast %struct.st3* [[M:%.*]] to i16*
+// BE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 2
+// BE-NEXT:[[BF_ASHR:%.*]] = ashr i16 [[BF_LOAD]], 9
+// BE-NEXT:[[CONV:%.*]] = sext i16 [[BF_ASHR]] to i32
 // BE-NEXT:ret i32 [[CONV]]
 //
 int st3_check_load(struct st3 *m) {
@@ -172,20 +172,20 @@
 
 // LE-LABEL: @st3_check_store(
 // LE-NEXT:  entry:
-// LE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST3:%.*]], %struct.st3* [[M:%.*]], i32 0, i32 0
-// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 2
-// LE-NEXT:[[BF_CLEAR:%.*]] = and i8 [[BF_LOAD]], -128
-// LE-NEXT:[[BF_SET:%.*]] = or i8 [[BF_CLEAR]], 1
-// LE-NEXT:store volatile i8 [[BF_SET]], i8* [[TMP0]], align 2
+// LE-NEXT:[[TMP0:%.*]] = bitcast %struct.st3* [[M:%.*]] to i16*
+// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 2
+// LE-NEXT:[[BF_CLEAR:%.*]] = and i16 [[BF_LOAD]], -128
+// LE-NEXT:[[BF_SET:%.*]] = or i16 [[BF_CLEAR]], 1
+// LE-NEXT:store volatile i16 [[BF_SET]], i16* [[TMP0]], align 2
 // LE-NEXT:ret void
 //
 // BE-LABEL: @st3_check_store(
 // BE-NEXT:  entry:
-// BE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST3:%.*]], %struct.st3* [[M:%.*]], i32 0, i32 0
-// BE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 2
-// BE-NEXT:[[BF_CLEAR:%.*]] = and i8 [[BF_LOAD]], 1
-// BE-NEXT:[[BF_SET:%.*]] = or i8 [[BF_CLEAR]], 2
-// BE-NEXT:store volatile i8 [[BF_SET]], i8* [[TMP0]], align 2
+// BE-NEXT:[[TMP0:%.*]] = bitcast %struct.st3* [[M:%.*]] to i16*
+// BE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 2
+// BE-NEXT:[[BF_CLEAR:%.*]] = and i16 [[BF_LOAD]], 511
+// BE-NEXT:[[BF_SET:%.*]] = or i16 [[BF_CLEAR]], 512
+// BE-NEXT:store volatile i16 [[BF_SET]], i16* [[TMP0]], align 2
 // BE-NEXT:ret void
 //
 void st3_check_store(struct st3 *m) {
@@ -199,24 +199,22 @@
 
 // LE-LABEL: @st4_check_load(
 // LE-NEXT:  entry:
-// LE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST4:%.*]], %struct.st4* [[M:%.*]], i32 0, i32 0
-// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 4
-// LE-NEXT:[[BF_SHL:%.*]] = shl i16 [[BF_LOAD]], 2
-// LE-NEXT:[[BF_ASHR:%.*]] = ashr i16 [[BF_SHL]], 11
-// LE-NEXT:[[BF_CAST:%.*]] = zext i16 

[PATCH] D72932: [ARM] Follow AACPS standard for volatile bit-fields access width

2020-01-22 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

In D72932#1829716 , @ostannard wrote:

> Why are you doing this in CodeGen, rather than adjusting the existing layout 
> code in CGRecordLowering? Doing it this way will result in 
> AdjustAAPCSBitfieldLValue being called for every access to the bitfield, 
> rather than just once. This is probably more fragile too, because it's 
> spreading the logic across multiple parts of the codebase, and has to undo 
> some of the layout done by CGRecordLowering.


@ostannard Indeed, after looking at the `CGRecordLayout` I am not keen on making
this change. It would require changing all possible initializations with a
sensible value. And either way, CodeGen would also need to change: to define
the address, it must know whether the access is volatile or not. I don't believe
that recomputing it at every access would pose much of an overhead; perhaps we
can leave moving it there as a TODO?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72932/new/

https://reviews.llvm.org/D72932



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72932: [ARM] Follow AACPS standard for volatile bit-fields access width

2020-01-22 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

In D72932#1829716 , @ostannard wrote:

> Why are you doing this in CodeGen, rather than adjusting the existing layout 
> code in CGRecordLowering? Doing it this way will result in 
> AdjustAAPCSBitfieldLValue being called for every access to the bitfield, 
> rather than just once. This is probably more fragile too, because it's 
> spreading the logic across multiple parts of the codebase, and has to undo 
> some of the layout done by CGRecordLowering.


@olista01 Indeed, after looking at the `CGRecordLayout` I am not keen on making
this change. It would require changing all possible initializations with a
sensible value, and adding a special getAddress function that considers whether
an access is volatile. I don't believe that volatile accesses are frequent enough
for this to pose much of an overhead; perhaps we can leave moving it there as a
TODO?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72932/new/

https://reviews.llvm.org/D72932



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72932: [ARM] Follow AACPS standard for volatile bit-fields access width

2020-01-21 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio reopened this revision.
dnsampaio added a comment.

Sorry, submitted using ide by mistake. Already reverted it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72932/new/

https://reviews.llvm.org/D72932



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72932: [ARM] Follow AACPS standard for volatile bit-fields access width

2020-01-21 Thread Diogo N. Sampaio via Phabricator via cfe-commits
This revision was not accepted when it landed; it landed in state "Needs 
Review".
This revision was automatically updated to reflect the committed changes.
Closed by commit rG6a24339a4524: [ARM] Follow AACPS standard for volatile 
bit-fields access width (authored by dnsampaio).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72932/new/

https://reviews.llvm.org/D72932

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CGValue.h
  clang/lib/CodeGen/CodeGenFunction.h
  clang/test/CodeGen/aapcs-bitfield.c

Index: clang/test/CodeGen/aapcs-bitfield.c
===
--- clang/test/CodeGen/aapcs-bitfield.c
+++ clang/test/CodeGen/aapcs-bitfield.c
@@ -151,19 +151,19 @@
 
 // LE-LABEL: @st3_check_load(
 // LE-NEXT:  entry:
-// LE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST3:%.*]], %struct.st3* [[M:%.*]], i32 0, i32 0
-// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 2
-// LE-NEXT:[[BF_SHL:%.*]] = shl i8 [[BF_LOAD]], 1
-// LE-NEXT:[[BF_ASHR:%.*]] = ashr exact i8 [[BF_SHL]], 1
-// LE-NEXT:[[CONV:%.*]] = sext i8 [[BF_ASHR]] to i32
+// LE-NEXT:[[TMP0:%.*]] = bitcast %struct.st3* [[M:%.*]] to i16*
+// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 2
+// LE-NEXT:[[BF_SHL:%.*]] = shl i16 [[BF_LOAD]], 9
+// LE-NEXT:[[BF_ASHR:%.*]] = ashr exact i16 [[BF_SHL]], 9
+// LE-NEXT:[[CONV:%.*]] = sext i16 [[BF_ASHR]] to i32
 // LE-NEXT:ret i32 [[CONV]]
 //
 // BE-LABEL: @st3_check_load(
 // BE-NEXT:  entry:
-// BE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST3:%.*]], %struct.st3* [[M:%.*]], i32 0, i32 0
-// BE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 2
-// BE-NEXT:[[BF_ASHR:%.*]] = ashr i8 [[BF_LOAD]], 1
-// BE-NEXT:[[CONV:%.*]] = sext i8 [[BF_ASHR]] to i32
+// BE-NEXT:[[TMP0:%.*]] = bitcast %struct.st3* [[M:%.*]] to i16*
+// BE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 2
+// BE-NEXT:[[BF_ASHR:%.*]] = ashr i16 [[BF_LOAD]], 9
+// BE-NEXT:[[CONV:%.*]] = sext i16 [[BF_ASHR]] to i32
 // BE-NEXT:ret i32 [[CONV]]
 //
 int st3_check_load(struct st3 *m) {
@@ -172,20 +172,20 @@
 
 // LE-LABEL: @st3_check_store(
 // LE-NEXT:  entry:
-// LE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST3:%.*]], %struct.st3* [[M:%.*]], i32 0, i32 0
-// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 2
-// LE-NEXT:[[BF_CLEAR:%.*]] = and i8 [[BF_LOAD]], -128
-// LE-NEXT:[[BF_SET:%.*]] = or i8 [[BF_CLEAR]], 1
-// LE-NEXT:store volatile i8 [[BF_SET]], i8* [[TMP0]], align 2
+// LE-NEXT:[[TMP0:%.*]] = bitcast %struct.st3* [[M:%.*]] to i16*
+// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 2
+// LE-NEXT:[[BF_CLEAR:%.*]] = and i16 [[BF_LOAD]], -128
+// LE-NEXT:[[BF_SET:%.*]] = or i16 [[BF_CLEAR]], 1
+// LE-NEXT:store volatile i16 [[BF_SET]], i16* [[TMP0]], align 2
 // LE-NEXT:ret void
 //
 // BE-LABEL: @st3_check_store(
 // BE-NEXT:  entry:
-// BE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST3:%.*]], %struct.st3* [[M:%.*]], i32 0, i32 0
-// BE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 2
-// BE-NEXT:[[BF_CLEAR:%.*]] = and i8 [[BF_LOAD]], 1
-// BE-NEXT:[[BF_SET:%.*]] = or i8 [[BF_CLEAR]], 2
-// BE-NEXT:store volatile i8 [[BF_SET]], i8* [[TMP0]], align 2
+// BE-NEXT:[[TMP0:%.*]] = bitcast %struct.st3* [[M:%.*]] to i16*
+// BE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 2
+// BE-NEXT:[[BF_CLEAR:%.*]] = and i16 [[BF_LOAD]], 511
+// BE-NEXT:[[BF_SET:%.*]] = or i16 [[BF_CLEAR]], 512
+// BE-NEXT:store volatile i16 [[BF_SET]], i16* [[TMP0]], align 2
 // BE-NEXT:ret void
 //
 void st3_check_store(struct st3 *m) {
@@ -199,24 +199,22 @@
 
 // LE-LABEL: @st4_check_load(
 // LE-NEXT:  entry:
-// LE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST4:%.*]], %struct.st4* [[M:%.*]], i32 0, i32 0
-// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 4
-// LE-NEXT:[[BF_SHL:%.*]] = shl i16 [[BF_LOAD]], 2
-// LE-NEXT:[[BF_ASHR:%.*]] = ashr i16 [[BF_SHL]], 11
-// LE-NEXT:[[BF_CAST:%.*]] = zext i16 [[BF_ASHR]] to i32
-// LE-NEXT:[[SEXT:%.*]] = shl i32 [[BF_CAST]], 24
-// LE-NEXT:[[CONV:%.*]] = ashr exact i32 [[SEXT]], 24
+// LE-NEXT:[[TMP0:%.*]] = bitcast %struct.st4* [[M:%.*]] to i8*
+// LE-NEXT:[[TMP1:%.*]] = getelementptr inbounds i8, i8* [[TMP0]], i32 1
+// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP1]], align 1
+// LE-NEXT:[[BF_SHL:%.*]] = shl i8 [[BF_LOAD]], 2
+// LE-NEXT:[[BF_ASHR:%.*]] = ashr i8 [[BF_SHL]], 3
+// LE-NEXT:[[CONV:%.*]] = sext i8 [[BF_ASHR]] to i32
 // LE-NEXT:ret i32 [[CONV]]
 //
 // BE-LABEL: @st4_check_load(
 // BE-NEXT:  entry:
-// BE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST4:%.*]], %struct.st4* [[M:%.*]], i32 0, i32 0
-// BE-NEXT:[[BF_LOAD:%.*]] = load 

[PATCH] D72932: [ARM] Follow AACPS standard for volatile bit-fields access width

2020-01-20 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

In D72932#1829716 , @ostannard wrote:

> Why are you doing this in CodeGen, rather than adjusting the existing layout 
> code in CGRecordLowering? Doing it this way will result in 
> AdjustAAPCSBitfieldLValue being called for every access to the bit-field, 
> rather than just once. This is probably more fragile too, because it's 
> spreading the logic across multiple parts of the codebase, and has to undo 
> some of the layout done by CGRecordLowering.


Only at CodeGen do we definitely know whether an access is volatile, due to
casts to volatile (see the sketch below).
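
For illustration, a minimal hypothetical sketch (not taken from the patch) of why
the volatility of an access may only be known at CodeGen time:

  /* The struct type itself has no volatile members, so CGRecordLowering sees
     nothing volatile when laying out S; only the access below, made through a
     volatile-qualified pointer introduced by a cast, is a volatile bit-field
     access, and that is only visible during CodeGen. */
  struct S { int a : 8; int b : 8; };

  int read_b(struct S *p) {
    volatile struct S *vp = (volatile struct S *)p; /* volatility added by a cast */
    return vp->b;                                   /* volatile bit-field access  */
  }
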
About undoing what CGRecordLowering does, that can't be avoided. We can only
compute whether a given offset and width would touch outside the bit-field once
all fields and padding are defined.
We could add special volatile-access fields into CGBitFieldInfo and compute them
at the end of CGRecordLowering::lower, but these fields would only be relevant
to ARM, and they would need to be computed even when there are no volatile
accesses.
I have no strong opinion about which is better; I can add the fields if that
suits better.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72932/new/

https://reviews.llvm.org/D72932



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72932: [ARM] Follow AACPS standard for volatile bit-fields access width

2020-01-17 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio created this revision.
dnsampaio added reviewers: rsmith, rjmccall.
Herald added subscribers: cfe-commits, kristof.beyls.
Herald added a project: clang.

This patch resumes the work of D16586 .
According to the AAPCS, volatile bit-fields should
be accessed using containers of the width of their
declared type. In such a case:

  struct S1 {
    short a : 1;
  };

should be accessed using loads and stores of the width of
its declared type (sizeof(short)), whereas the compiler
currently only loads the minimum required width (char in this case).
However, as discussed in D16586,
widening the access could overwrite adjacent non-bit-field
members, which conflicts with the C and C++ object models by
creating data races on memory locations that are not part of
the bit-field, e.g.

  struct S2 {
    short a;
    int b : 16;
  };

Accessing `S2.b` would also access `S2.a`.

The AAPCS Release 2019Q1.1
(https://static.docs.arm.com/ihi0042/g/aapcs32.pdf)
section 8.1 Data Types, page 35, "Volatile bit-fields -
preserving number and width of container accesses" has been
updated to avoid conflict with the C++ Memory Model.
Now it reads in the note:

  This ABI does not place any restrictions on the access widths of bit-fields
  where the container overlaps with a non-bit-field member. This is because the
  C/C++ memory model defines these as being separate memory locations, which can
  be accessed by two threads simultaneously. For this reason, compilers must be
  permitted to use a narrower memory access width (including splitting the access
  into multiple instructions) to avoid writing to a different memory location.

I've updated the patch D16586 to follow this
behavior by verifying that we
only change a volatile bit-field access when:

- it won't overlap with any other non-bit-field member
- we only access memory inside the bounds of the record

Regarding preserving the number of memory accesses, that will
be implemented by D67399.
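
As a rough illustration of the intended access widths (a hypothetical sketch
assuming this patch, not compiler output; the exact IR changes are in the test
below):

  /* 'a' does not overlap any non-bit-field member and its container stays
     inside the record, so a volatile access may use the full width of its
     declared type (short, i.e. a 16-bit load/store). */
  struct S1 {
    volatile short a : 1;
  };

  /* The int-sized container of 'b' would overlap the non-bit-field member 'a',
     which the C/C++ memory model treats as a separate memory location, so the
     narrower access is kept. */
  struct S2 {
    short a;
    volatile int b : 16;
  };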


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D72932

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CGValue.h
  clang/lib/CodeGen/CodeGenFunction.h
  clang/test/CodeGen/aapcs-bitfield.c

Index: clang/test/CodeGen/aapcs-bitfield.c
===
--- clang/test/CodeGen/aapcs-bitfield.c
+++ clang/test/CodeGen/aapcs-bitfield.c
@@ -151,19 +151,19 @@
 
 // LE-LABEL: @st3_check_load(
 // LE-NEXT:  entry:
-// LE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST3:%.*]], %struct.st3* [[M:%.*]], i32 0, i32 0
-// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 2
-// LE-NEXT:[[BF_SHL:%.*]] = shl i8 [[BF_LOAD]], 1
-// LE-NEXT:[[BF_ASHR:%.*]] = ashr exact i8 [[BF_SHL]], 1
-// LE-NEXT:[[CONV:%.*]] = sext i8 [[BF_ASHR]] to i32
+// LE-NEXT:[[TMP0:%.*]] = bitcast %struct.st3* [[M:%.*]] to i16*
+// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 2
+// LE-NEXT:[[BF_SHL:%.*]] = shl i16 [[BF_LOAD]], 9
+// LE-NEXT:[[BF_ASHR:%.*]] = ashr exact i16 [[BF_SHL]], 9
+// LE-NEXT:[[CONV:%.*]] = sext i16 [[BF_ASHR]] to i32
 // LE-NEXT:ret i32 [[CONV]]
 //
 // BE-LABEL: @st3_check_load(
 // BE-NEXT:  entry:
-// BE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST3:%.*]], %struct.st3* [[M:%.*]], i32 0, i32 0
-// BE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 2
-// BE-NEXT:[[BF_ASHR:%.*]] = ashr i8 [[BF_LOAD]], 1
-// BE-NEXT:[[CONV:%.*]] = sext i8 [[BF_ASHR]] to i32
+// BE-NEXT:[[TMP0:%.*]] = bitcast %struct.st3* [[M:%.*]] to i16*
+// BE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 2
+// BE-NEXT:[[BF_ASHR:%.*]] = ashr i16 [[BF_LOAD]], 9
+// BE-NEXT:[[CONV:%.*]] = sext i16 [[BF_ASHR]] to i32
 // BE-NEXT:ret i32 [[CONV]]
 //
 int st3_check_load(struct st3 *m) {
@@ -172,20 +172,20 @@
 
 // LE-LABEL: @st3_check_store(
 // LE-NEXT:  entry:
-// LE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST3:%.*]], %struct.st3* [[M:%.*]], i32 0, i32 0
-// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 2
-// LE-NEXT:[[BF_CLEAR:%.*]] = and i8 [[BF_LOAD]], -128
-// LE-NEXT:[[BF_SET:%.*]] = or i8 [[BF_CLEAR]], 1
-// LE-NEXT:store volatile i8 [[BF_SET]], i8* [[TMP0]], align 2
+// LE-NEXT:[[TMP0:%.*]] = bitcast %struct.st3* [[M:%.*]] to i16*
+// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[TMP0]], align 2
+// LE-NEXT:[[BF_CLEAR:%.*]] = and i16 [[BF_LOAD]], -128
+// LE-NEXT:[[BF_SET:%.*]] = or i16 [[BF_CLEAR]], 1
+// LE-NEXT:store volatile i16 [[BF_SET]], i16* [[TMP0]], align 2
 // LE-NEXT:ret void
 //
 // BE-LABEL: @st3_check_store(
 // BE-NEXT:  entry:
-// BE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST3:%.*]], %struct.st3* [[M:%.*]], i32 0, i32 0
-// BE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 2
-// BE-NEXT:[[BF_CLEAR:%.*]] = 

[PATCH] D70183: Detect source location overflow due includes

2019-11-29 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

Just wondering, I don't fully understand why that is important from the point
where I do a `return FileID();`. The critical error has already been added to
the error list. Clang will either die due to an assert in debug builds, or with
a nice message in release builds. The only exception would be if clang were
badly broken and the error message itself were corrupted, which I don't believe
is the case.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70183/new/

https://reviews.llvm.org/D70183



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D70183: Detect source location overflow due includes

2019-11-19 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

Commit 979ae80af7ec49624b932954d22cb91900f17121 did not include a test either.
Feel free to send me a reasonably sized reproducer; the one I have is about
36MB, and I don't think that would be well received.
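
For reference, a minimal sketch of the kind of pattern that triggers the
overflow (hypothetical file names and sizes, not the actual reproducer):

  /* unguarded.h -- intentionally has no include guard; imagine it containing
     roughly 1 MB of macro definitions or comments. */

  /* level1.h -- pulls the unguarded header in many times. */
  #include "unguarded.h"
  #include "unguarded.h"
  /* ... repeated many more times ... */

  /* tu.c -- every nested include consumes source locations for the whole
     expanded file; with enough fan-out the total exceeds 2^31 characters and
     the new diagnostic fires instead of the assert. */
  #include "level1.h"
  #include "level1.h"
  /* ... repeated many more times ... */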


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70183/new/

https://reviews.llvm.org/D70183



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D70183: Detect source location overflow due includes

2019-11-19 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio marked an inline comment as done.
dnsampaio added inline comments.



Comment at: clang/lib/Basic/SourceManager.cpp:587
+Diag.Report(IncludePos, diag::err_include_too_large);
+exit(1);
+  }

miyuki wrote:
> dnsampaio wrote:
> > miyuki wrote:
> > > dnsampaio wrote:
> > > > miyuki wrote:
> > > > > dnsampaio wrote:
> > > > > > For debug builds, I could not find any other way to not reach an 
> > > > > > assert failure other than exiting here. Perhaps there is a more 
> > > > > > llvm specific way to die? llvm_unreachable() ?
> > > > > There is a similar error `err_file_too_large`. How is it handled? 
> > > > > `llvm_unreachable` is intended for code that is unreachable: it 
> > > > > causes an assertion failure in debug builds and undefined behavior in 
> > > > > release builds.
> > > > The `err_file_too_large` above in this file allows the function to 
> > > > finish and return further on. Here, we assert false before the message 
> > > > being printed.
> > > > If `llvm_unreachable` is undefined behavior, then I should stick to 
> > > > exit(1) ?
> > > I think you should return an invalid file ID (i.e. `return FileID();` and 
> > > propagate the error through the call stack. Clang can be used as a 
> > > library, so you can't just call exit(), this would terminate the user's 
> > > program.
> > It will still reach an false assert in builds that enable them. But in 
> > release it linger on and ends with the correct warning.
> Has this problem been fixed?
No. I don't intend to fix that in this patch, as such cases would be failing at
the moment anyway.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70183/new/

https://reviews.llvm.org/D70183



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D70183: Detect source location overflow due includes

2019-11-19 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio marked 2 inline comments as done.
dnsampaio added a comment.

Yes. It does return an invalid FileID, and in builds without asserts you get
the expected error message.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70183/new/

https://reviews.llvm.org/D70183



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D70183: Detect source location overflow due includes

2019-11-19 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio marked an inline comment as done.
dnsampaio added a comment.

Ping


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70183/new/

https://reviews.llvm.org/D70183



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D70183: Detect source location overflow due includes

2019-11-14 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio marked an inline comment as done.
dnsampaio added inline comments.



Comment at: clang/lib/Basic/SourceManager.cpp:587
+Diag.Report(IncludePos, diag::err_include_too_large);
+exit(1);
+  }

miyuki wrote:
> dnsampaio wrote:
> > miyuki wrote:
> > > dnsampaio wrote:
> > > > For debug builds, I could not find any other way to not reach an assert 
> > > > failure other than exiting here. Perhaps there is a more llvm specific 
> > > > way to die? llvm_unreachable() ?
> > > There is a similar error `err_file_too_large`. How is it handled? 
> > > `llvm_unreachable` is intended for code that is unreachable: it causes an 
> > > assertion failure in debug builds and undefined behavior in release 
> > > builds.
> > The `err_file_too_large` above in this file allows the function to finish 
> > and return further on. Here, we assert false before the message being 
> > printed.
> > If `llvm_unreachable` is undefined behavior, then I should stick to exit(1) 
> > ?
> I think you should return an invalid file ID (i.e. `return FileID();` and 
> propagate the error through the call stack. Clang can be used as a library, 
> so you can't just call exit(), this would terminate the user's program.
It will still hit a failing assert in builds that enable assertions, but in
release builds it carries on and ends with the correct warning.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70183/new/

https://reviews.llvm.org/D70183



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D70183: Detect source location overflow due includes

2019-11-14 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio updated this revision to Diff 229254.
dnsampaio marked an inline comment as done.
dnsampaio added a comment.

- Return an invalid FileID instead of exiting.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70183/new/

https://reviews.llvm.org/D70183

Files:
  clang/include/clang/Basic/DiagnosticCommonKinds.td
  clang/lib/Basic/SourceManager.cpp


Index: clang/lib/Basic/SourceManager.cpp
===
--- clang/lib/Basic/SourceManager.cpp
+++ clang/lib/Basic/SourceManager.cpp
@@ -577,13 +577,18 @@
 SLocEntryLoaded[Index] = true;
 return FileID::get(LoadedID);
   }
+  unsigned FileSize = File->getSize();
+  if (!(NextLocalOffset + FileSize + 1 > NextLocalOffset &&
+NextLocalOffset + FileSize + 1 <= CurrentLoadedOffset)) {
+// From this point, there is no sensible way to point to the current
+// source-location and say: This include at line ### generates a too
+// big file, as the IncludePos received is
+Diag.Report(IncludePos, diag::err_include_too_large);
+return FileID();
+  }
   LocalSLocEntryTable.push_back(
   SLocEntry::get(NextLocalOffset,
  FileInfo::get(IncludePos, File, FileCharacter, 
Filename)));
-  unsigned FileSize = File->getSize();
-  assert(NextLocalOffset + FileSize + 1 > NextLocalOffset &&
- NextLocalOffset + FileSize + 1 <= CurrentLoadedOffset &&
- "Ran out of source locations!");
   // We do a +1 here because we want a SourceLocation that means "the end of 
the
   // file", e.g. for the "no newline at the end of the file" diagnostic.
   NextLocalOffset += FileSize + 1;
Index: clang/include/clang/Basic/DiagnosticCommonKinds.td
===
--- clang/include/clang/Basic/DiagnosticCommonKinds.td
+++ clang/include/clang/Basic/DiagnosticCommonKinds.td
@@ -282,6 +282,10 @@
   "file '%0' modified since it was first processed">, DefaultFatal;
 def err_file_too_large : Error<
   "sorry, unsupported: file '%0' is too large for Clang to process">;
+def err_include_too_large : Error<
+  "sorry, this include generates a translation unit too large for"
+  " Clang to process. This may by a result from multiple"
+  " inclusions of unguarded header files.">, DefaultFatal;
 def err_unsupported_bom : Error<"%0 byte order mark detected in '%1', but "
   "encoding is not supported">, DefaultFatal;
 def err_unable_to_rename_temp : Error<


Index: clang/lib/Basic/SourceManager.cpp
===
--- clang/lib/Basic/SourceManager.cpp
+++ clang/lib/Basic/SourceManager.cpp
@@ -577,13 +577,18 @@
 SLocEntryLoaded[Index] = true;
 return FileID::get(LoadedID);
   }
+  unsigned FileSize = File->getSize();
+  if (!(NextLocalOffset + FileSize + 1 > NextLocalOffset &&
+NextLocalOffset + FileSize + 1 <= CurrentLoadedOffset)) {
+// From this point, there is no sensible way to point to the current
+// source-location and say: This include at line ### generates a too
+// big file, as the IncludePos received is
+Diag.Report(IncludePos, diag::err_include_too_large);
+return FileID();
+  }
   LocalSLocEntryTable.push_back(
   SLocEntry::get(NextLocalOffset,
  FileInfo::get(IncludePos, File, FileCharacter, Filename)));
-  unsigned FileSize = File->getSize();
-  assert(NextLocalOffset + FileSize + 1 > NextLocalOffset &&
- NextLocalOffset + FileSize + 1 <= CurrentLoadedOffset &&
- "Ran out of source locations!");
   // We do a +1 here because we want a SourceLocation that means "the end of the
   // file", e.g. for the "no newline at the end of the file" diagnostic.
   NextLocalOffset += FileSize + 1;
Index: clang/include/clang/Basic/DiagnosticCommonKinds.td
===
--- clang/include/clang/Basic/DiagnosticCommonKinds.td
+++ clang/include/clang/Basic/DiagnosticCommonKinds.td
@@ -282,6 +282,10 @@
   "file '%0' modified since it was first processed">, DefaultFatal;
 def err_file_too_large : Error<
   "sorry, unsupported: file '%0' is too large for Clang to process">;
+def err_include_too_large : Error<
+  "sorry, this include generates a translation unit too large for"
+  " Clang to process. This may by a result from multiple"
+  " inclusions of unguarded header files.">, DefaultFatal;
 def err_unsupported_bom : Error<"%0 byte order mark detected in '%1', but "
   "encoding is not supported">, DefaultFatal;
 def err_unable_to_rename_temp : Error<
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D70183: Detect source location overflow due includes

2019-11-13 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added inline comments.



Comment at: clang/lib/Basic/SourceManager.cpp:587
+Diag.Report(IncludePos, diag::err_include_too_large);
+exit(1);
+  }

miyuki wrote:
> dnsampaio wrote:
> > For debug builds, I could not find any other way to not reach an assert 
> > failure other than exiting here. Perhaps there is a more llvm specific way 
> > to die? llvm_unreachable() ?
> There is a similar error `err_file_too_large`. How is it handled? 
> `llvm_unreachable` is intended for code that is unreachable: it causes an 
> assertion failure in debug builds and undefined behavior in release builds.
The `err_file_too_large` case above in this file allows the function to finish
and return further on. Here, we hit the assert before the message is printed.
If `llvm_unreachable` is undefined behavior in release builds, should I stick to
exit(1)?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70183/new/

https://reviews.llvm.org/D70183



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D70183: Detect source location overflow due includes

2019-11-13 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio updated this revision to Diff 229113.
dnsampaio marked 2 inline comments as done.
dnsampaio added a comment.

- Add ", DefaultFatal";


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70183/new/

https://reviews.llvm.org/D70183

Files:
  clang/include/clang/Basic/DiagnosticCommonKinds.td
  clang/lib/Basic/SourceManager.cpp


Index: clang/lib/Basic/SourceManager.cpp
===
--- clang/lib/Basic/SourceManager.cpp
+++ clang/lib/Basic/SourceManager.cpp
@@ -577,13 +577,18 @@
 SLocEntryLoaded[Index] = true;
 return FileID::get(LoadedID);
   }
+  unsigned FileSize = File->getSize();
+  if (!(NextLocalOffset + FileSize + 1 > NextLocalOffset &&
+NextLocalOffset + FileSize + 1 <= CurrentLoadedOffset)) {
+// From this point, there is no sensible way to point to the current
+// source-location and say: This include at line ### generates a too
+// big file, as the IncludePos received is
+Diag.Report(IncludePos, diag::err_include_too_large);
+exit(1);
+  }
   LocalSLocEntryTable.push_back(
   SLocEntry::get(NextLocalOffset,
  FileInfo::get(IncludePos, File, FileCharacter, 
Filename)));
-  unsigned FileSize = File->getSize();
-  assert(NextLocalOffset + FileSize + 1 > NextLocalOffset &&
- NextLocalOffset + FileSize + 1 <= CurrentLoadedOffset &&
- "Ran out of source locations!");
   // We do a +1 here because we want a SourceLocation that means "the end of 
the
   // file", e.g. for the "no newline at the end of the file" diagnostic.
   NextLocalOffset += FileSize + 1;
Index: clang/include/clang/Basic/DiagnosticCommonKinds.td
===
--- clang/include/clang/Basic/DiagnosticCommonKinds.td
+++ clang/include/clang/Basic/DiagnosticCommonKinds.td
@@ -282,6 +282,10 @@
   "file '%0' modified since it was first processed">, DefaultFatal;
 def err_file_too_large : Error<
   "sorry, unsupported: file '%0' is too large for Clang to process">;
+def err_include_too_large : Error<
+  "sorry, this include generates a translation unit too large for"
+  " Clang to process. This may by a result from multiple"
+  " inclusions of unguarded header files.">, DefaultFatal;
 def err_unsupported_bom : Error<"%0 byte order mark detected in '%1', but "
   "encoding is not supported">, DefaultFatal;
 def err_unable_to_rename_temp : Error<


Index: clang/lib/Basic/SourceManager.cpp
===
--- clang/lib/Basic/SourceManager.cpp
+++ clang/lib/Basic/SourceManager.cpp
@@ -577,13 +577,18 @@
 SLocEntryLoaded[Index] = true;
 return FileID::get(LoadedID);
   }
+  unsigned FileSize = File->getSize();
+  if (!(NextLocalOffset + FileSize + 1 > NextLocalOffset &&
+NextLocalOffset + FileSize + 1 <= CurrentLoadedOffset)) {
+// From this point, there is no sensible way to point to the current
+// source-location and say: This include at line ### generates a too
+// big file, as the IncludePos received is
+Diag.Report(IncludePos, diag::err_include_too_large);
+exit(1);
+  }
   LocalSLocEntryTable.push_back(
   SLocEntry::get(NextLocalOffset,
  FileInfo::get(IncludePos, File, FileCharacter, Filename)));
-  unsigned FileSize = File->getSize();
-  assert(NextLocalOffset + FileSize + 1 > NextLocalOffset &&
- NextLocalOffset + FileSize + 1 <= CurrentLoadedOffset &&
- "Ran out of source locations!");
   // We do a +1 here because we want a SourceLocation that means "the end of the
   // file", e.g. for the "no newline at the end of the file" diagnostic.
   NextLocalOffset += FileSize + 1;
Index: clang/include/clang/Basic/DiagnosticCommonKinds.td
===
--- clang/include/clang/Basic/DiagnosticCommonKinds.td
+++ clang/include/clang/Basic/DiagnosticCommonKinds.td
@@ -282,6 +282,10 @@
   "file '%0' modified since it was first processed">, DefaultFatal;
 def err_file_too_large : Error<
   "sorry, unsupported: file '%0' is too large for Clang to process">;
+def err_include_too_large : Error<
+  "sorry, this include generates a translation unit too large for"
+  " Clang to process. This may by a result from multiple"
+  " inclusions of unguarded header files.">, DefaultFatal;
 def err_unsupported_bom : Error<"%0 byte order mark detected in '%1', but "
   "encoding is not supported">, DefaultFatal;
 def err_unable_to_rename_temp : Error<
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D70183: Detect source location overflow due includes

2019-11-13 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio created this revision.
dnsampaio added reviewers: rsmith, thakis, miyuki.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.
dnsampaio marked an inline comment as done.
dnsampaio added inline comments.



Comment at: clang/lib/Basic/SourceManager.cpp:587
+Diag.Report(IncludePos, diag::err_include_too_large);
+exit(1);
+  }

For debug builds, I could not find any other way to avoid reaching an assert
failure than exiting here. Perhaps there is a more LLVM-specific way to die?
llvm_unreachable()?


As discussed in http://lists.llvm.org/pipermail/cfe-dev/2019-October/063459.html,
overflowing the source-location space (limited to 2^31 characters) can generate
all sorts of weird behaviour (bogus warnings, hangs, crashes, miscompilation and
correct compilation).
In debug mode the existing assert would fail. So it might be a good start, as in
PR42301, to detect the failure and exit with a proper error message.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D70183

Files:
  clang/include/clang/Basic/DiagnosticCommonKinds.td
  clang/lib/Basic/SourceManager.cpp


Index: clang/lib/Basic/SourceManager.cpp
===
--- clang/lib/Basic/SourceManager.cpp
+++ clang/lib/Basic/SourceManager.cpp
@@ -577,13 +577,18 @@
 SLocEntryLoaded[Index] = true;
 return FileID::get(LoadedID);
   }
+  unsigned FileSize = File->getSize();
+  if (!(NextLocalOffset + FileSize + 1 > NextLocalOffset &&
+NextLocalOffset + FileSize + 1 <= CurrentLoadedOffset)) {
+// From this point, there is no sensible way to point to the current
+// source-location and say: This include at line ### generates a too
+// big file, as the IncludePos received is
+Diag.Report(IncludePos, diag::err_include_too_large);
+exit(1);
+  }
   LocalSLocEntryTable.push_back(
   SLocEntry::get(NextLocalOffset,
  FileInfo::get(IncludePos, File, FileCharacter, 
Filename)));
-  unsigned FileSize = File->getSize();
-  assert(NextLocalOffset + FileSize + 1 > NextLocalOffset &&
- NextLocalOffset + FileSize + 1 <= CurrentLoadedOffset &&
- "Ran out of source locations!");
   // We do a +1 here because we want a SourceLocation that means "the end of 
the
   // file", e.g. for the "no newline at the end of the file" diagnostic.
   NextLocalOffset += FileSize + 1;
Index: clang/include/clang/Basic/DiagnosticCommonKinds.td
===
--- clang/include/clang/Basic/DiagnosticCommonKinds.td
+++ clang/include/clang/Basic/DiagnosticCommonKinds.td
@@ -282,6 +282,10 @@
   "file '%0' modified since it was first processed">, DefaultFatal;
 def err_file_too_large : Error<
   "sorry, unsupported: file '%0' is too large for Clang to process">;
+def err_include_too_large : Error<
+  "sorry, this include generates a translation unit too large for"
+  " Clang to process. This may by a result from multiple"
+  " inclusions of unguarded header files.">;
 def err_unsupported_bom : Error<"%0 byte order mark detected in '%1', but "
   "encoding is not supported">, DefaultFatal;
 def err_unable_to_rename_temp : Error<


Index: clang/lib/Basic/SourceManager.cpp
===
--- clang/lib/Basic/SourceManager.cpp
+++ clang/lib/Basic/SourceManager.cpp
@@ -577,13 +577,18 @@
 SLocEntryLoaded[Index] = true;
 return FileID::get(LoadedID);
   }
+  unsigned FileSize = File->getSize();
+  if (!(NextLocalOffset + FileSize + 1 > NextLocalOffset &&
+NextLocalOffset + FileSize + 1 <= CurrentLoadedOffset)) {
+// From this point, there is no sensible way to point to the current
+// source-location and say: This include at line ### generates a too
+// big file, as the IncludePos received is
+Diag.Report(IncludePos, diag::err_include_too_large);
+exit(1);
+  }
   LocalSLocEntryTable.push_back(
   SLocEntry::get(NextLocalOffset,
  FileInfo::get(IncludePos, File, FileCharacter, Filename)));
-  unsigned FileSize = File->getSize();
-  assert(NextLocalOffset + FileSize + 1 > NextLocalOffset &&
- NextLocalOffset + FileSize + 1 <= CurrentLoadedOffset &&
- "Ran out of source locations!");
   // We do a +1 here because we want a SourceLocation that means "the end of the
   // file", e.g. for the "no newline at the end of the file" diagnostic.
   NextLocalOffset += FileSize + 1;
Index: clang/include/clang/Basic/DiagnosticCommonKinds.td
===
--- clang/include/clang/Basic/DiagnosticCommonKinds.td
+++ clang/include/clang/Basic/DiagnosticCommonKinds.td
@@ -282,6 +282,10 @@
   "file '%0' modified since it was first processed">, DefaultFatal;
 def err_file_too_large : Error<
   "sorry, unsupported: file '%0' is too large for Clang to process">;
+def err_include_too_large : Error<
+  "sorry, 

[PATCH] D70183: Detect source location overflow due includes

2019-11-13 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio marked an inline comment as done.
dnsampaio added inline comments.



Comment at: clang/lib/Basic/SourceManager.cpp:587
+Diag.Report(IncludePos, diag::err_include_too_large);
+exit(1);
+  }

For debug builds, I could not find any other way to avoid reaching an assert
failure than exiting here. Perhaps there is a more LLVM-specific way to die?
llvm_unreachable()?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70183/new/

https://reviews.llvm.org/D70183



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D67608: [ARM] Preserve fpu behaviour for '-crypto'

2019-10-14 Thread Diogo N. Sampaio via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG2cb43b45713d: [ARM] Preserve fpu behaviour for 
-crypto (authored by dnsampaio).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67608/new/

https://reviews.llvm.org/D67608

Files:
  clang/lib/Driver/ToolChains/Arch/ARM.cpp
  clang/test/Driver/arm-features.c


Index: clang/test/Driver/arm-features.c
===
--- clang/test/Driver/arm-features.c
+++ clang/test/Driver/arm-features.c
@@ -79,3 +79,18 @@
 // RUN: %clang -target arm-arm-none-eabi -mcpu=cortex-m23+crypto -### -c %s 
2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO5 %s
 // CHECK-NOCRYPTO5: warning: ignoring extension 'crypto' because the {{.*}} 
architecture does not support it
 // CHECK-NOCRYPTO5-NOT: "-target-feature" "+crypto"{{.*}} "-target-feature" 
"+sha2" "-target-feature" "+aes"
+//
+// Check +crypto does not affect -march=armv7a -mfpu=crypto-neon-fp-armv8, but 
it does warn that +crypto has no effect
+// RUN: %clang -target arm-none-none-eabi -fno-integrated-as -march=armv7a 
-mfpu=crypto-neon-fp-armv8 -### -c %s 2>&1 | FileCheck 
-check-prefixes=CHECK-WARNONLY,ALL %s
+// RUN: %clang -target arm-none-none-eabi -fno-integrated-as -march=armv7a+aes 
-mfpu=crypto-neon-fp-armv8 -### -c %s 2>&1 | FileCheck 
-check-prefixes=CHECK-WARNONLY,ALL,CHECK-HASAES %s
+// RUN: %clang -target arm-none-none-eabi -fno-integrated-as 
-march=armv7a+sha2 -mfpu=crypto-neon-fp-armv8 -### -c %s 2>&1 | FileCheck 
-check-prefixes=CHECK-WARNONLY,ALL,CHECK-HASSHA %s
+// RUN: %clang -target arm-none-none-eabi -fno-integrated-as 
-march=armv7a+sha2+aes -mfpu=crypto-neon-fp-armv8 -### -c %s 2>&1 | FileCheck 
-check-prefixes=CHECK-WARNONLY,ALL,CHECK-HASSHA,CHECK-HASAES %s
+// RUN: %clang -target arm-none-none-eabi -fno-integrated-as -march=armv7a+aes 
-mfpu=neon-fp-armv8 -### -c %s 2>&1 | FileCheck 
-check-prefixes=ALL,CHECK-HASAES %s
+// RUN: %clang -target arm-none-none-eabi -fno-integrated-as 
-march=armv7a+sha2 -mfpu=neon-fp-armv8 -### -c %s 2>&1 | FileCheck 
-check-prefixes=ALL,CHECK-HASSHA %s
+// RUN: %clang -target arm-none-none-eabi -fno-integrated-as 
-march=armv7a+sha2+aes -mfpu=neon-fp-armv8 -### -c %s 2>&1 | FileCheck 
-check-prefixes=ALL,CHECK-HASSHA,CHECK-HASAES %s
+// CHECK-WARNONLY: warning: ignoring extension 'crypto' because the 'armv7-a' 
architecture does not support it
+// ALL: "-target-feature"
+// CHECK-WARNONLY-NOT: "-target-feature" "-crypto"
+// CHECK-HASSHA-SAME:  "-target-feature" "+sha2"
+// CHECK-HASAES-SAME:  "-target-feature" "+aes"
+//
Index: clang/lib/Driver/ToolChains/Arch/ARM.cpp
===
--- clang/lib/Driver/ToolChains/Arch/ARM.cpp
+++ clang/lib/Driver/ToolChains/Arch/ARM.cpp
@@ -479,24 +479,33 @@
 
   // For Arch >= ARMv8.0 && A profile:  crypto = sha2 + aes
   // FIXME: this needs reimplementation after the TargetParser rewrite
-  auto CryptoIt =
-llvm::find_if(llvm::reverse(Features),
-  [](const StringRef F) { return F.contains("crypto"); });
-  if (CryptoIt != Features.rend() && CryptoIt->take_front() == "+") {
-StringRef ArchSuffix = arm::getLLVMArchSuffixForARM(
-arm::getARMTargetCPU(CPUName, ArchName, Triple), ArchName, Triple);
-if (llvm::ARM::parseArchVersion(ArchSuffix) >= 8 &&
-llvm::ARM::parseArchProfile(ArchSuffix) == llvm::ARM::ProfileKind::A) {
-  if (ArchName.find_lower("+nosha2") == StringRef::npos &&
-  CPUName.find_lower("+nosha2") == StringRef::npos)
-Features.push_back("+sha2");
-  if (ArchName.find_lower("+noaes") == StringRef::npos &&
-  CPUName.find_lower("+noaes") == StringRef::npos)
-Features.push_back("+aes");
-} else {
-  D.Diag(clang::diag::warn_target_unsupported_extension)
-<< "crypto" << 
llvm::ARM::getArchName(llvm::ARM::parseArch(ArchSuffix));
-  Features.push_back("-crypto");
+  auto CryptoIt = llvm::find_if(llvm::reverse(Features), [](const StringRef F) 
{
+return F.contains("crypto");
+  });
+  if (CryptoIt != Features.rend()) {
+if (CryptoIt->take_front() == "+") {
+  StringRef ArchSuffix = arm::getLLVMArchSuffixForARM(
+  arm::getARMTargetCPU(CPUName, ArchName, Triple), ArchName, Triple);
+  if (llvm::ARM::parseArchVersion(ArchSuffix) >= 8 &&
+  llvm::ARM::parseArchProfile(ArchSuffix) ==
+  llvm::ARM::ProfileKind::A) {
+if (ArchName.find_lower("+nosha2") == StringRef::npos &&
+CPUName.find_lower("+nosha2") == StringRef::npos)
+  Features.push_back("+sha2");
+if (ArchName.find_lower("+noaes") == StringRef::npos &&
+CPUName.find_lower("+noaes") == StringRef::npos)
+  Features.push_back("+aes");
+  } else {
+D.Diag(clang::diag::warn_target_unsupported_extension)
+<< "crypto"
+<< 

[PATCH] D67608: [ARM] Preserve fpu behaviour for '-crypto'

2019-10-10 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio updated this revision to Diff 224323.
dnsampaio added a comment.

Addressing review requests:

- Fixed to only add '-crypto' when not passing -fno-integrated-as
- Fixed test to use arm-none-none-eabi
- Changed comments


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67608/new/

https://reviews.llvm.org/D67608

Files:
  clang/lib/Driver/ToolChains/Arch/ARM.cpp
  clang/test/Driver/arm-features.c


Index: clang/test/Driver/arm-features.c
===
--- clang/test/Driver/arm-features.c
+++ clang/test/Driver/arm-features.c
@@ -79,3 +79,18 @@
 // RUN: %clang -target arm-arm-none-eabi -mcpu=cortex-m23+crypto -### -c %s 
2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO5 %s
 // CHECK-NOCRYPTO5: warning: ignoring extension 'crypto' because the {{.*}} 
architecture does not support it
 // CHECK-NOCRYPTO5-NOT: "-target-feature" "+crypto"{{.*}} "-target-feature" 
"+sha2" "-target-feature" "+aes"
+//
+// Check +crypto does not affect -march=armv7a -mfpu=crypto-neon-fp-armv8, but 
it does warn that +crypto has no effect
+// RUN: %clang -target arm-none-none-eabi -fno-integrated-as -march=armv7a 
-mfpu=crypto-neon-fp-armv8 -### -c %s 2>&1 | FileCheck 
-check-prefixes=CHECK-WARNONLY,ALL %s
+// RUN: %clang -target arm-none-none-eabi -fno-integrated-as -march=armv7a+aes 
-mfpu=crypto-neon-fp-armv8 -### -c %s 2>&1 | FileCheck 
-check-prefixes=CHECK-WARNONLY,ALL,CHECK-HASAES %s
+// RUN: %clang -target arm-none-none-eabi -fno-integrated-as 
-march=armv7a+sha2 -mfpu=crypto-neon-fp-armv8 -### -c %s 2>&1 | FileCheck 
-check-prefixes=CHECK-WARNONLY,ALL,CHECK-HASSHA %s
+// RUN: %clang -target arm-none-none-eabi -fno-integrated-as 
-march=armv7a+sha2+aes -mfpu=crypto-neon-fp-armv8 -### -c %s 2>&1 | FileCheck 
-check-prefixes=CHECK-WARNONLY,ALL,CHECK-HASSHA,CHECK-HASAES %s
+// RUN: %clang -target arm-none-none-eabi -fno-integrated-as -march=armv7a+aes 
-mfpu=neon-fp-armv8 -### -c %s 2>&1 | FileCheck 
-check-prefixes=ALL,CHECK-HASAES %s
+// RUN: %clang -target arm-none-none-eabi -fno-integrated-as 
-march=armv7a+sha2 -mfpu=neon-fp-armv8 -### -c %s 2>&1 | FileCheck 
-check-prefixes=ALL,CHECK-HASSHA %s
+// RUN: %clang -target arm-none-none-eabi -fno-integrated-as 
-march=armv7a+sha2+aes -mfpu=neon-fp-armv8 -### -c %s 2>&1 | FileCheck 
-check-prefixes=ALL,CHECK-HASSHA,CHECK-HASAES %s
+// CHECK-WARNONLY: warning: ignoring extension 'crypto' because the 'armv7-a' 
architecture does not support it
+// ALL: "-target-feature"
+// CHECK-WARNONLY-NOT: "-target-feature" "-crypto"
+// CHECK-HASSHA-SAME:  "-target-feature" "+sha2"
+// CHECK-HASAES-SAME:  "-target-feature" "+aes"
+//
Index: clang/lib/Driver/ToolChains/Arch/ARM.cpp
===
--- clang/lib/Driver/ToolChains/Arch/ARM.cpp
+++ clang/lib/Driver/ToolChains/Arch/ARM.cpp
@@ -479,24 +479,33 @@
 
   // For Arch >= ARMv8.0 && A profile:  crypto = sha2 + aes
   // FIXME: this needs reimplementation after the TargetParser rewrite
-  auto CryptoIt =
-llvm::find_if(llvm::reverse(Features),
-  [](const StringRef F) { return F.contains("crypto"); });
-  if (CryptoIt != Features.rend() && CryptoIt->take_front() == "+") {
-StringRef ArchSuffix = arm::getLLVMArchSuffixForARM(
-arm::getARMTargetCPU(CPUName, ArchName, Triple), ArchName, Triple);
-if (llvm::ARM::parseArchVersion(ArchSuffix) >= 8 &&
-llvm::ARM::parseArchProfile(ArchSuffix) == llvm::ARM::ProfileKind::A) {
-  if (ArchName.find_lower("+nosha2") == StringRef::npos &&
-  CPUName.find_lower("+nosha2") == StringRef::npos)
-Features.push_back("+sha2");
-  if (ArchName.find_lower("+noaes") == StringRef::npos &&
-  CPUName.find_lower("+noaes") == StringRef::npos)
-Features.push_back("+aes");
-} else {
-  D.Diag(clang::diag::warn_target_unsupported_extension)
-<< "crypto" << 
llvm::ARM::getArchName(llvm::ARM::parseArch(ArchSuffix));
-  Features.push_back("-crypto");
+  auto CryptoIt = llvm::find_if(llvm::reverse(Features), [](const StringRef F) 
{
+return F.contains("crypto");
+  });
+  if (CryptoIt != Features.rend()) {
+if (CryptoIt->take_front() == "+") {
+  StringRef ArchSuffix = arm::getLLVMArchSuffixForARM(
+  arm::getARMTargetCPU(CPUName, ArchName, Triple), ArchName, Triple);
+  if (llvm::ARM::parseArchVersion(ArchSuffix) >= 8 &&
+  llvm::ARM::parseArchProfile(ArchSuffix) ==
+  llvm::ARM::ProfileKind::A) {
+if (ArchName.find_lower("+nosha2") == StringRef::npos &&
+CPUName.find_lower("+nosha2") == StringRef::npos)
+  Features.push_back("+sha2");
+if (ArchName.find_lower("+noaes") == StringRef::npos &&
+CPUName.find_lower("+noaes") == StringRef::npos)
+  Features.push_back("+aes");
+  } else {
+D.Diag(clang::diag::warn_target_unsupported_extension)
+<< 

[PATCH] D67608: [ARM] Preserve fpu behaviour for '-crypto'

2019-09-16 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio created this revision.
dnsampaio added reviewers: peter.smith, labrinea.
Herald added subscribers: cfe-commits, dmgreen, kristof.beyls.
Herald added a project: clang.

This patch restores the behaviour that -mfpu overrides the
architecture features obtained from the -march or -mcpu flags, no longer
forcing 'crypto' to be disabled when -march=armv7 is combined with
-mfpu=neon-fp-armv8.
However, it does warn that 'crypto' is ignored when passing
-mfpu=crypto-neon-fp-armv8.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D67608

Files:
  clang/lib/Driver/ToolChains/Arch/ARM.cpp
  clang/test/Driver/arm-features.c


Index: clang/test/Driver/arm-features.c
===
--- clang/test/Driver/arm-features.c
+++ clang/test/Driver/arm-features.c
@@ -79,3 +79,18 @@
 // RUN: %clang -target arm-arm-none-eabi -mcpu=cortex-m23+crypto -### -c %s 
2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO5 %s
 // CHECK-NOCRYPTO5: warning: ignoring extension 'crypto' because the {{.*}} 
architecture does not support it
 // CHECK-NOCRYPTO5-NOT: "-target-feature" "+crypto"{{.*}} "-target-feature" 
"+sha2" "-target-feature" "+aes"
+//
+// Check +crypto does not affect -march=armv7a -mfpu=crypto-neon-fp-armv8, but 
it does warn that +crypto has no effect
+// RUN: %clang -target arm-arm-none-eabi -march=armv7a 
-mfpu=crypto-neon-fp-armv8 -### -c %s 2>&1 | FileCheck 
-check-prefixes=CHECK-WARNONLY,ALL %s
+// RUN: %clang -target arm-arm-none-eabi -march=armv7a+aes 
-mfpu=crypto-neon-fp-armv8 -### -c %s 2>&1 | FileCheck 
-check-prefixes=CHECK-WARNONLY,ALL,CHECK-HASAES %s
+// RUN: %clang -target arm-arm-none-eabi -march=armv7a+sha2 
-mfpu=crypto-neon-fp-armv8 -### -c %s 2>&1 | FileCheck 
-check-prefixes=CHECK-WARNONLY,ALL,CHECK-HASSHA %s
+// RUN: %clang -target arm-arm-none-eabi -march=armv7a+sha2+aes 
-mfpu=crypto-neon-fp-armv8 -### -c %s 2>&1 | FileCheck 
-check-prefixes=CHECK-WARNONLY,ALL,CHECK-HASSHA,CHECK-HASAES %s
+// RUN: %clang -target arm-arm-none-eabi -march=armv7a+aes -mfpu=neon-fp-armv8 
-### -c %s 2>&1 | FileCheck -check-prefixes=ALL,CHECK-HASAES %s
+// RUN: %clang -target arm-arm-none-eabi -march=armv7a+sha2 
-mfpu=neon-fp-armv8 -### -c %s 2>&1 | FileCheck 
-check-prefixes=ALL,CHECK-HASSHA %s
+// RUN: %clang -target arm-arm-none-eabi -march=armv7a+sha2+aes 
-mfpu=neon-fp-armv8 -### -c %s 2>&1 | FileCheck 
-check-prefixes=ALL,CHECK-HASSHA,CHECK-HASAES %s
+// CHECK-WARNONLY: warning: ignoring extension 'crypto' because the 'armv7-a' 
architecture does not support it
+// ALL: "-target-feature"
+// CHECK-WARNONLY-NOT: "-target-feature" "-crypto"
+// CHECK-HASSHA-SAME:  "-target-feature" "+sha2"
+// CHECK-HASAES-SAME:  "-target-feature" "+aes"
+//
Index: clang/lib/Driver/ToolChains/Arch/ARM.cpp
===
--- clang/lib/Driver/ToolChains/Arch/ARM.cpp
+++ clang/lib/Driver/ToolChains/Arch/ARM.cpp
@@ -496,7 +496,12 @@
 } else {
   D.Diag(clang::diag::warn_target_unsupported_extension)
 << "crypto" << 
llvm::ARM::getArchName(llvm::ARM::parseArch(ArchSuffix));
-  Features.push_back("-crypto");
+  //-mfpu=crypto-neon-fp-armv8 does allow +sha2 / +aes / +crypto features,
+  // even if compiling for an cpu armv7a, if explicitly defined by the 
user,
+  // so do not deactivate them.
+  if ((llvm::find(Features, "+fp-armv8") == Features.end()) ||
+  (llvm::find(Features, "+neon") == Features.end()))
+Features.push_back("-crypto");
 }
   }
 


Index: clang/test/Driver/arm-features.c
===
--- clang/test/Driver/arm-features.c
+++ clang/test/Driver/arm-features.c
@@ -79,3 +79,18 @@
 // RUN: %clang -target arm-arm-none-eabi -mcpu=cortex-m23+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO5 %s
 // CHECK-NOCRYPTO5: warning: ignoring extension 'crypto' because the {{.*}} architecture does not support it
 // CHECK-NOCRYPTO5-NOT: "-target-feature" "+crypto"{{.*}} "-target-feature" "+sha2" "-target-feature" "+aes"
+//
+// Check +crypto does not affect -march=armv7a -mfpu=crypto-neon-fp-armv8, but it does warn that +crypto has no effect
+// RUN: %clang -target arm-arm-none-eabi -march=armv7a -mfpu=crypto-neon-fp-armv8 -### -c %s 2>&1 | FileCheck -check-prefixes=CHECK-WARNONLY,ALL %s
+// RUN: %clang -target arm-arm-none-eabi -march=armv7a+aes -mfpu=crypto-neon-fp-armv8 -### -c %s 2>&1 | FileCheck -check-prefixes=CHECK-WARNONLY,ALL,CHECK-HASAES %s
+// RUN: %clang -target arm-arm-none-eabi -march=armv7a+sha2 -mfpu=crypto-neon-fp-armv8 -### -c %s 2>&1 | FileCheck -check-prefixes=CHECK-WARNONLY,ALL,CHECK-HASSHA %s
+// RUN: %clang -target arm-arm-none-eabi -march=armv7a+sha2+aes -mfpu=crypto-neon-fp-armv8 -### -c %s 2>&1 | FileCheck -check-prefixes=CHECK-WARNONLY,ALL,CHECK-HASSHA,CHECK-HASAES %s
+// RUN: %clang -target arm-arm-none-eabi -march=armv7a+aes -mfpu=neon-fp-armv8 -### -c %s 2>&1 | FileCheck 

[PATCH] D67399: [ARM] Follow AACPS standard for volatile bitfields

2019-09-13 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

Indeed, our main concern is the access widths of loads. As mentioned
by @rjmccall, most volatile bit-fields are used to perform memory-mapped I/O,
and some hardware only supports accesses of a specific width.
The spurious load I am more than glad to leave disabled behind a command-line
flag, so it will only appear if the user requests it. Note that volatile
accesses might have side effects; for example, an I/O read counter holding
an odd value could indicate that the data is still being processed.
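
For example (a hypothetical device register, for illustration only; the address
and layout are made up):

  /* Assume the hardware only tolerates word-sized accesses to this register,
     so the compiler must use the declared container width (unsigned int)
     rather than a narrower byte access when touching these bit-fields. */
  struct status_reg {
    volatile unsigned int ready   : 1;
    volatile unsigned int error   : 1;
    volatile unsigned int counter : 30;
  };

  #define STATUS_REG (*(volatile struct status_reg *)0x40001000u)

  int device_busy(void) {
    /* e.g. an odd counter value means the device is still processing */
    return (STATUS_REG.counter & 1u) != 0;
  }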


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67399/new/

https://reviews.llvm.org/D67399



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66018: [ARM] Take into account -mcpu and -mfpu options while handling 'crypto' feature

2019-09-11 Thread Diogo N. Sampaio via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL371597: [ARM] Take into account -mcpu and -mfpu options 
while handling crypto feature (authored by dnsampaio, committed by 
).
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66018/new/

https://reviews.llvm.org/D66018

Files:
  cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td
  cfe/trunk/lib/Driver/ToolChains/Arch/ARM.cpp
  cfe/trunk/test/Driver/arm-features.c

Index: cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td
===
--- cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td
+++ cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td
@@ -384,6 +384,9 @@
 def warn_target_unsupported_compact_branches : Warning<
   "ignoring '-mcompact-branches=' option because the '%0' architecture does not"
   " support it">, InGroup;
+def warn_target_unsupported_extension : Warning<
+  "ignoring extension '%0' because the '%1' architecture does not support it">,
+   InGroup;
 def warn_drv_unsupported_gpopt : Warning<
   "ignoring '-mgpopt' option as it cannot be used with %select{|the implicit"
   " usage of }0-mabicalls">,
Index: cfe/trunk/test/Driver/arm-features.c
===
--- cfe/trunk/test/Driver/arm-features.c
+++ cfe/trunk/test/Driver/arm-features.c
@@ -37,7 +37,8 @@
 // RUN: %clang -target arm-arm-none-eabi -march=armv8.2a+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO2 %s
 // RUN: %clang -target arm-arm-none-eabi -march=armv8.3a+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO2 %s
 // RUN: %clang -target arm-arm-none-eabi -march=armv8.4a+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO2 %s
-// CHECK-CRYPTO2: "-cc1"{{.*}} "-target-cpu" "generic"{{.*}} "-target-feature" "+crypto" "-target-feature" "+sha2" "-target-feature" "+aes"
+// RUN: %clang -target arm-arm-none-eabi -march=armv8.5a+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO2 %s
+// CHECK-CRYPTO2: "-cc1"{{.*}} "-target-cpu" "generic"{{.*}} "-target-feature" "+crypto"{{.*}} "-target-feature" "+sha2" "-target-feature" "+aes"
 //
 // Check -crypto:
 //
@@ -45,14 +46,36 @@
 // RUN: %clang -target arm-arm-none-eabi -march=armv8.2a+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO2 %s
 // RUN: %clang -target arm-arm-none-eabi -march=armv8.3a+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO2 %s
 // RUN: %clang -target arm-arm-none-eabi -march=armv8.4a+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO2 %s
+// RUN: %clang -target arm-arm-none-eabi -march=armv8.5a+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO2 %s
 // CHECK-NOCRYPTO2-NOT: "-target-feature" "+crypto" "-target-feature" "+sha2" "-target-feature" "+aes"
 //
+// RUN: %clang -target arm-arm-none-eabi -mcpu=cortex-a57+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO2-CPU %s
+// RUN: %clang -target arm-arm-none-eabi -mcpu=cortex-a57 -mfpu=crypto-neon-fp-armv8 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO2-CPU %s
+// CHECK-CRYPTO2-CPU: "-cc1"{{.*}} "-target-cpu" "cortex-a57"{{.*}} "-target-feature" "+crypto"{{.*}} "-target-feature" "+sha2" "-target-feature" "+aes"
+//
+// RUN: %clang -target arm-arm-none-eabi -mcpu=cortex-a57+norypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO2-CPU %s
+// RUN: %clang -target arm-arm-none-eabi -mcpu=cortex-a57 -mfpu=neon-fp-armv8 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO2-CPU %s
+// CHECK-NOCRYPTO2-CPU-NOT: "-cc1"{{.*}} "-target-cpu" "cortex-a57"{{.*}} "-target-feature" "+crypto"{{.*}} "-target-feature" "+sha2" "-target-feature" "+aes"
+//
 // Check +crypto -sha2 -aes:
 //
 // RUN: %clang -target arm-arm-none-eabi -march=armv8.1a+crypto+nosha2+noaes -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO3 %s
+// RUN: %clang -target arm-arm-none-eabi -mcpu=cortex-a57+crypto+nosha2+noaes -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO3 %s
+// RUN: %clang -target arm-arm-none-eabi -mcpu=cortex-a57+nosha2+noaes -mfpu=crypto-neon-fp-armv8 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO3 %s
 // CHECK-CRYPTO3-NOT: "-target-feature" "+sha2" "-target-feature" "+aes"
 //
 // Check -crypto +sha2 +aes:
 //
 // RUN: %clang -target arm-arm-none-eabi -march=armv8.1a+nocrypto+sha2+aes -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO4 %s
+// RUN: %clang -target arm-arm-none-eabi -mcpu=cortex-a57+nocrypto+sha2+aes -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO4 %s
+// RUN: %clang -target arm-arm-none-eabi -mcpu=cortex-a57+sha2+aes -mfpu=neon-fp-armv8 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO4 %s
 // CHECK-CRYPTO4: "-target-feature" "+sha2" "-target-feature" "+aes"
+//
+// Check +crypto for M and R profiles:
+//
+// RUN: %clang -target arm-arm-none-eabi 

[PATCH] D67399: [ARM] Follow AACPS standard for volatile bitfields

2019-09-11 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio marked 4 inline comments as done.
dnsampaio added a comment.

Hi @jfb. In an example such as:

  typedef struct { int a : 1; int b : 16; } S;
  extern int expensive_computation(int v);
  void foo(volatile S* s){
    s->b = expensive_computation(s->b);
  }

There is no guarantee that `s->a` is not modified during an expensive
computation, so it must be read just before writing the `s->b` value, as
`a` and `b` share the same memory position. This is already done by LLVM.
Indeed, the exact output would be:

  define void @foo(%struct.S* %S) local_unnamed_addr #0 {
  entry:
%0 = bitcast %struct.S* %S to i32*
%bf.load = load volatile i32, i32* %0, align 4
%bf.shl = shl i32 %bf.load, 15
%bf.ashr = ashr i32 %bf.shl, 16
%call = tail call i32 @expensive_computation(i32 %bf.ashr) #2
%bf.load1 = load volatile i32, i32* %0, align 4 ; <<<== Here it obtains the value of s->a to restore it.
%bf.value = shl i32 %call, 1
%bf.shl2 = and i32 %bf.value, 131070
%bf.clear = and i32 %bf.load1, -131071
%bf.set = or i32 %bf.clear, %bf.shl2
store volatile i32 %bf.set, i32* %0, align 4
ret void
  }

These extra loads are required to make the number of times the volatile
bit-field is read uniform, independent of whether it shares memory with other
data.

We could put this behaviour under a flag, such as `-faapcs-volatilebitfield`,
disabled by default.

The other point not conformant to the AAPCS is the width of the loads/stores
used to access bit-fields. They should be the width of the container, provided
that such an access would not overlap with non-bit-fields. Do you have any idea
where that could be computed? I imagine it would be when computing the
alignment of the elements of the structure, where I can check whether
performing a load of the entire container width would overlap with other
elements. Could you point me to where that is done?
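
For illustration, the two situations that rule distinguishes (a sketch only; the struct and field names are invented for this example and are not taken from the patch):

  struct no_overlap {
    volatile int x : 8;   /* the int container holds only bit-fields, so     */
    volatile int y : 8;   /* accesses to x or y must be single 32-bit ones   */
  };
  struct with_overlap {
    volatile int x : 8;
    char tail;            /* non-bit-field inside the same 4-byte container, */
  };                      /* so the AAPCS rule does not constrain the width  */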




Comment at: clang/test/CodeGen/aapcs-bitfield.c:541
 // BE-NEXT:[[TMP0:%.*]] = getelementptr inbounds [[STRUCT_ST9:%.*]], 
%struct.st9* [[M:%.*]], i32 0, i32 0
+// BE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 4
 // BE-NEXT:store volatile i8 1, i8* [[TMP0]], align 4

jfb wrote:
> These are just extra loads? Why?
Yes, these are just extra loads. As the AAPCS describes, every write also
requires a load to be performed, even if all bits of the volatile bit-field are
going to be replaced.



Comment at: clang/test/CodeGen/aapcs-bitfield.c:552
 // LE-NEXT:[[TMP0:%.*]] = getelementptr inbounds [[STRUCT_ST9:%.*]], 
%struct.st9* [[M:%.*]], i32 0, i32 0
 // LE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 4
 // LE-NEXT:[[INC:%.*]] = add i8 [[BF_LOAD]], 1

jfb wrote:
> Why isn't this load sufficient?
Technically speaking, that is the load for reading the bitfield, not the load 
required when writing it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67399/new/

https://reviews.llvm.org/D67399



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D67399: [ARM] Follow AAPCS standard for volatile bitfields

2019-09-10 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

@ostannard might prove me wrong, but according to the AAPCS:

  When a volatile bit-field is written, and its container does not overlap with any non-bit-field member, its container must be read exactly once and written exactly once using the access width appropriate to the type of the container. The two accesses are not atomic.

This rule does not say the load is performed only when required. It states that
the container will be read exactly once. It even gives the example that an
increment will always perform two reads and one write, regardless of bit-width.
It adds just after:

  Note: Note the volatile access rules apply even when the width and alignment of the bit-field imply that the access could be achieved more efficiently using a narrower type. For a write operation the read must always occur even if the entire contents of the container will be replaced.

The rationale is to provide a uniform behaviour for volatile bit-fields
independent of their width (as long as they do not overlap with non-bit-fields).
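
A tiny sketch of what that means in practice (the struct here is hypothetical, not one from the patch):

  struct st { volatile int f : 3; };
  void bump(struct st *p) {
    p->f += 1;   /* two container-width volatile reads + one volatile write */
  }
  void set(struct st *p) {
    p->f = 1;    /* still one volatile read + one volatile write */
  }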


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67399/new/

https://reviews.llvm.org/D67399



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D67399: [ARM] Follow AAPCS standard for volatile bitfields

2019-09-10 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

This patch could hack clang to generate an extra load. However, my knowledge of
the clang code base is not extensive. How could we ensure that the width of
loads and stores is the size of the container, and that they don't overlap
non-bit-fields?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67399/new/

https://reviews.llvm.org/D67399



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D67399: [ARM] Follow AAPCS standard for volatile bitfields

2019-09-10 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio created this revision.
dnsampaio added reviewers: lebedev.ri, ostannard.
Herald added subscribers: cfe-commits, jfb, kristof.beyls.
Herald added a project: clang.

Bug 43264
This is a first draft to understand what has to be
done to fix volatile bit-field access, so as to
conform to the AAPCS.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D67399

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/test/CodeGen/aapcs-bitfield.c


Index: clang/test/CodeGen/aapcs-bitfield.c
===
--- clang/test/CodeGen/aapcs-bitfield.c
+++ clang/test/CodeGen/aapcs-bitfield.c
@@ -531,12 +531,14 @@
 // LE-LABEL: @store_st9(
 // LE-NEXT:  entry:
 // LE-NEXT:[[TMP0:%.*]] = getelementptr inbounds [[STRUCT_ST9:%.*]], 
%struct.st9* [[M:%.*]], i32 0, i32 0
+// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 4
 // LE-NEXT:store volatile i8 1, i8* [[TMP0]], align 4
 // LE-NEXT:ret void
 //
 // BE-LABEL: @store_st9(
 // BE-NEXT:  entry:
 // BE-NEXT:[[TMP0:%.*]] = getelementptr inbounds [[STRUCT_ST9:%.*]], 
%struct.st9* [[M:%.*]], i32 0, i32 0
+// BE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 4
 // BE-NEXT:store volatile i8 1, i8* [[TMP0]], align 4
 // BE-NEXT:ret void
 //
@@ -549,6 +551,7 @@
 // LE-NEXT:[[TMP0:%.*]] = getelementptr inbounds [[STRUCT_ST9:%.*]], 
%struct.st9* [[M:%.*]], i32 0, i32 0
 // LE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 4
 // LE-NEXT:[[INC:%.*]] = add i8 [[BF_LOAD]], 1
+// LE-NEXT:[[BF_LOAD1:%.*]] = load volatile i8, i8* [[TMP0]], align 4
 // LE-NEXT:store volatile i8 [[INC]], i8* [[TMP0]], align 4
 // LE-NEXT:ret void
 //
@@ -557,6 +560,7 @@
 // BE-NEXT:[[TMP0:%.*]] = getelementptr inbounds [[STRUCT_ST9:%.*]], 
%struct.st9* [[M:%.*]], i32 0, i32 0
 // BE-NEXT:[[BF_LOAD:%.*]] = load volatile i8, i8* [[TMP0]], align 4
 // BE-NEXT:[[INC:%.*]] = add i8 [[BF_LOAD]], 1
+// BE-NEXT:[[BF_LOAD1:%.*]] = load volatile i8, i8* [[TMP0]], align 4
 // BE-NEXT:store volatile i8 [[INC]], i8* [[TMP0]], align 4
 // BE-NEXT:ret void
 //
@@ -667,12 +671,14 @@
 // LE-LABEL: @store_st11(
 // LE-NEXT:  entry:
 // LE-NEXT:[[F:%.*]] = getelementptr inbounds [[STRUCT_ST11:%.*]], 
%struct.st11* [[M:%.*]], i32 0, i32 1
+// LE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[F]], align 1
 // LE-NEXT:store volatile i16 1, i16* [[F]], align 1
 // LE-NEXT:ret void
 //
 // BE-LABEL: @store_st11(
 // BE-NEXT:  entry:
 // BE-NEXT:[[F:%.*]] = getelementptr inbounds [[STRUCT_ST11:%.*]], 
%struct.st11* [[M:%.*]], i32 0, i32 1
+// BE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[F]], align 1
 // BE-NEXT:store volatile i16 1, i16* [[F]], align 1
 // BE-NEXT:ret void
 //
@@ -685,6 +691,7 @@
 // LE-NEXT:[[F:%.*]] = getelementptr inbounds [[STRUCT_ST11:%.*]], 
%struct.st11* [[M:%.*]], i32 0, i32 1
 // LE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[F]], align 1
 // LE-NEXT:[[INC:%.*]] = add i16 [[BF_LOAD]], 1
+// LE-NEXT:[[BF_LOAD1:%.*]] = load volatile i16, i16* [[F]], align 1
 // LE-NEXT:store volatile i16 [[INC]], i16* [[F]], align 1
 // LE-NEXT:ret void
 //
@@ -693,6 +700,7 @@
 // BE-NEXT:[[F:%.*]] = getelementptr inbounds [[STRUCT_ST11:%.*]], 
%struct.st11* [[M:%.*]], i32 0, i32 1
 // BE-NEXT:[[BF_LOAD:%.*]] = load volatile i16, i16* [[F]], align 1
 // BE-NEXT:[[INC:%.*]] = add i16 [[BF_LOAD]], 1
+// BE-NEXT:[[BF_LOAD1:%.*]] = load volatile i16, i16* [[F]], align 1
 // BE-NEXT:store volatile i16 [[INC]], i16* [[F]], align 1
 // BE-NEXT:ret void
 //
Index: clang/lib/CodeGen/CGExpr.cpp
===
--- clang/lib/CodeGen/CGExpr.cpp
+++ clang/lib/CodeGen/CGExpr.cpp
@@ -2061,6 +2061,13 @@
 SrcVal = Builder.CreateOr(Val, SrcVal, "bf.set");
   } else {
 assert(Info.Offset == 0);
+// Acording to the AACPS:
+// When a volatile bit-field is written, and its container does not overlap
+// with any non-bit-field member, its container must be read exactly once 
and
+// written exactly once using the access width appropriate to the type of 
the
+// container. The two accesses are not atomic.
+if ( (CGM.getTarget().getABI() == "aapcs") && Dst.isVolatileQualified() )
+  Builder.CreateLoad(Ptr, Dst.isVolatileQualified(), "bf.load");
   }
 
   // Write the new value back out.



[PATCH] D66018: [ARM] Take into account -mcpu and -mfpu options while handling 'crypto' feature

2019-09-03 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio accepted this revision.
dnsampaio added a comment.
This revision is now accepted and ready to land.

LGTM. Thanks. Will commit for you as requested soon.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66018/new/

https://reviews.llvm.org/D66018



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66588: [ARM NEON] Avoid duplicated declarations

2019-09-03 Thread Diogo N. Sampaio via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL370716: [ARM NEON] Avoid duplicated declarations (authored by dnsampaio, committed by ).
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D66588?vs=218412=218414#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66588/new/

https://reviews.llvm.org/D66588

Files:
  cfe/trunk/utils/TableGen/NeonEmitter.cpp


Index: cfe/trunk/utils/TableGen/NeonEmitter.cpp
===
--- cfe/trunk/utils/TableGen/NeonEmitter.cpp
+++ cfe/trunk/utils/TableGen/NeonEmitter.cpp
@@ -332,6 +332,17 @@
   NeonEmitter 
   std::stringstream OS;
 
+  bool isBigEndianSafe() const {
+if (BigEndianSafe)
+  return true;
+
+for (const auto  : Types){
+  if (T.isVector() && T.getNumElements() > 1)
+return false;
+}
+return true;
+  }
+
 public:
   Intrinsic(Record *R, StringRef Name, StringRef Proto, TypeSpec OutTS,
 TypeSpec InTS, ClassKind CK, ListInit *Body, NeonEmitter ,
@@ -1293,7 +1304,7 @@
 }
 
 void Intrinsic::emitArgumentReversal() {
-  if (BigEndianSafe)
+  if (isBigEndianSafe())
 return;
 
   // Reverse all vector arguments.
@@ -1314,7 +1325,7 @@
 }
 
 void Intrinsic::emitReturnReversal() {
-  if (BigEndianSafe)
+  if (isBigEndianSafe())
 return;
   if (!getReturnType().isVector() || getReturnType().isVoid() ||
   getReturnType().getNumElements() == 1)
@@ -1578,7 +1589,10 @@
   Intr.Dependencies.insert();
 
   // Now create the call itself.
-  std::string S = CallPrefix.str() + Callee.getMangledName(true) + "(";
+  std::string S = "";
+  if (!Callee.isBigEndianSafe())
+S += CallPrefix.str();
+  S += Callee.getMangledName(true) + "(";
   for (unsigned I = 0; I < DI->getNumArgs() - 1; ++I) {
 if (I != 0)
   S += ", ";
@@ -1889,6 +1903,11 @@
 }
 
 std::string Intrinsic::generate() {
+  // Avoid duplicated code for big and little endian
+  if (isBigEndianSafe()) {
+generateImpl(false, "", "");
+return OS.str();
+  }
   // Little endian intrinsics are simple and don't require any argument
   // swapping.
   OS << "#ifdef __LITTLE_ENDIAN__\n";


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66588: [ARM NEON] Avoid duplicated declarations

2019-09-03 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio updated this revision to Diff 218412.
dnsampaio added a comment.

- Fix comment


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66588/new/

https://reviews.llvm.org/D66588

Files:
  clang/utils/TableGen/NeonEmitter.cpp


Index: clang/utils/TableGen/NeonEmitter.cpp
===
--- clang/utils/TableGen/NeonEmitter.cpp
+++ clang/utils/TableGen/NeonEmitter.cpp
@@ -332,6 +332,17 @@
   NeonEmitter 
   std::stringstream OS;
 
+  bool isBigEndianSafe() const {
+if (BigEndianSafe)
+  return true;
+
+for (const auto  : Types){
+  if (T.isVector() && T.getNumElements() > 1)
+return false;
+}
+return true;
+  }
+
 public:
   Intrinsic(Record *R, StringRef Name, StringRef Proto, TypeSpec OutTS,
 TypeSpec InTS, ClassKind CK, ListInit *Body, NeonEmitter ,
@@ -1293,7 +1304,7 @@
 }
 
 void Intrinsic::emitArgumentReversal() {
-  if (BigEndianSafe)
+  if (isBigEndianSafe())
 return;
 
   // Reverse all vector arguments.
@@ -1314,7 +1325,7 @@
 }
 
 void Intrinsic::emitReturnReversal() {
-  if (BigEndianSafe)
+  if (isBigEndianSafe())
 return;
   if (!getReturnType().isVector() || getReturnType().isVoid() ||
   getReturnType().getNumElements() == 1)
@@ -1578,7 +1589,10 @@
   Intr.Dependencies.insert();
 
   // Now create the call itself.
-  std::string S = CallPrefix.str() + Callee.getMangledName(true) + "(";
+  std::string S = "";
+  if (!Callee.isBigEndianSafe())
+S += CallPrefix.str();
+  S += Callee.getMangledName(true) + "(";
   for (unsigned I = 0; I < DI->getNumArgs() - 1; ++I) {
 if (I != 0)
   S += ", ";
@@ -1889,6 +1903,11 @@
 }
 
 std::string Intrinsic::generate() {
+  // Avoid duplicated code for big and little endian
+  if (isBigEndianSafe()) {
+generateImpl(false, "", "");
+return OS.str();
+  }
   // Little endian intrinsics are simple and don't require any argument
   // swapping.
   OS << "#ifdef __LITTLE_ENDIAN__\n";


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66018: [ARM] Take into account -mcpu and -mfpu options while handling 'crypto' feature

2019-09-02 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

Hi, I do agree that giving the user a warning that the argument is ignored is 
the best solution. If you wouldn't mind adding it to this patch, that would be 
great. Thanks.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66018/new/

https://reviews.llvm.org/D66018



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66018: [ARM] Take into account -mcpu and -mfpu options while handling 'crypto' feature

2019-08-30 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio requested changes to this revision.
dnsampaio added a comment.
This revision now requires changes to proceed.

`clang -### -target arm-arm-none-eabi -march=armv8-m.main+crypto` did not show
+sha2 or +aes. After the patch it does.
I believe that is not expected, as in ARM.td `crypto` is not applied for any M
profile, and the Arm®v8-M Architecture Reference Manual does not reference
these extensions either.




Comment at: lib/Driver/ToolChains/Arch/ARM.cpp:485
+
+  if (llvm::ARM::parseArchVersion(ArchSuffix) >= 8) {
+auto CryptoIt =

` && (llvm::ARM::parseArchProfile(Arch) == llvm::ARM::ProfileKind::A)`
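
That is, roughly (a sketch of the combined guard; `ArchSuffix` is taken from the quoted context above, so its exact spelling here is an assumption):

  // Only v8 or later A-profile architectures should expand crypto to sha2/aes.
  if (llvm::ARM::parseArchVersion(ArchSuffix) >= 8 &&
      llvm::ARM::parseArchProfile(ArchSuffix) == llvm::ARM::ProfileKind::A) {
    // handle +crypto / +nocrypto here
  }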



Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66018/new/

https://reviews.llvm.org/D66018



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66018: [ARM] Take into account -mcpu and -mfpu options while handling 'crypto' feature

2019-08-29 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio accepted this revision.
dnsampaio added a comment.
This revision is now accepted and ready to land.

LGTM. One optional nit, as it is not related to this patch anymore.




Comment at: lib/Driver/ToolChains/Arch/ARM.cpp:659
   llvm::ARM::ArchKind ArchKind;
-  if (CPU == "generic") {
+  if (CPU == "generic" || CPU.empty()) {
 std::string ARMArch = tools::arm::getARMArch(Arch, Triple);

dnsampaio wrote:
> Good catch.
For safety perhaps we can keep the CPU.empty() test.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66018/new/

https://reviews.llvm.org/D66018



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66588: [ARM NEON] Avoid duplicated declarations

2019-08-28 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio updated this revision to Diff 217594.
dnsampaio added a comment.

Fix / Update / Rebase

- Avoid appending __noswap_ to intrinsics that are BigEndianSafe
- Moved to monorepo


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66588/new/

https://reviews.llvm.org/D66588

Files:
  clang/utils/TableGen/NeonEmitter.cpp


Index: clang/utils/TableGen/NeonEmitter.cpp
===
--- clang/utils/TableGen/NeonEmitter.cpp
+++ clang/utils/TableGen/NeonEmitter.cpp
@@ -332,6 +332,17 @@
   NeonEmitter 
   std::stringstream OS;
 
+  bool isBigEndianSafe() const {
+if (BigEndianSafe)
+  return true;
+
+for (const auto  : Types){
+  if (T.isVector() && T.getNumElements() > 1)
+return false;
+}
+return true;
+  }
+
 public:
   Intrinsic(Record *R, StringRef Name, StringRef Proto, TypeSpec OutTS,
 TypeSpec InTS, ClassKind CK, ListInit *Body, NeonEmitter ,
@@ -1293,7 +1304,7 @@
 }
 
 void Intrinsic::emitArgumentReversal() {
-  if (BigEndianSafe)
+  if (isBigEndianSafe())
 return;
 
   // Reverse all vector arguments.
@@ -1314,7 +1325,7 @@
 }
 
 void Intrinsic::emitReturnReversal() {
-  if (BigEndianSafe)
+  if (isBigEndianSafe())
 return;
   if (!getReturnType().isVector() || getReturnType().isVoid() ||
   getReturnType().getNumElements() == 1)
@@ -1578,7 +1589,10 @@
   Intr.Dependencies.insert();
 
   // Now create the call itself.
-  std::string S = CallPrefix.str() + Callee.getMangledName(true) + "(";
+  std::string S = "";
+  if (!Callee.isBigEndianSafe())
+S += CallPrefix.str();
+  S += Callee.getMangledName(true) + "(";
   for (unsigned I = 0; I < DI->getNumArgs() - 1; ++I) {
 if (I != 0)
   S += ", ";
@@ -1889,6 +1903,11 @@
 }
 
 std::string Intrinsic::generate() {
+  // Avoid duplicated code for big and small endians
+  if (isBigEndianSafe()) {
+generateImpl(false, "", "");
+return OS.str();
+  }
   // Little endian intrinsics are simple and don't require any argument
   // swapping.
   OS << "#ifdef __LITTLE_ENDIAN__\n";


Index: clang/utils/TableGen/NeonEmitter.cpp
===
--- clang/utils/TableGen/NeonEmitter.cpp
+++ clang/utils/TableGen/NeonEmitter.cpp
@@ -332,6 +332,17 @@
   NeonEmitter 
   std::stringstream OS;
 
+  bool isBigEndianSafe() const {
+if (BigEndianSafe)
+  return true;
+
+for (const auto  : Types){
+  if (T.isVector() && T.getNumElements() > 1)
+return false;
+}
+return true;
+  }
+
 public:
   Intrinsic(Record *R, StringRef Name, StringRef Proto, TypeSpec OutTS,
 TypeSpec InTS, ClassKind CK, ListInit *Body, NeonEmitter ,
@@ -1293,7 +1304,7 @@
 }
 
 void Intrinsic::emitArgumentReversal() {
-  if (BigEndianSafe)
+  if (isBigEndianSafe())
 return;
 
   // Reverse all vector arguments.
@@ -1314,7 +1325,7 @@
 }
 
 void Intrinsic::emitReturnReversal() {
-  if (BigEndianSafe)
+  if (isBigEndianSafe())
 return;
   if (!getReturnType().isVector() || getReturnType().isVoid() ||
   getReturnType().getNumElements() == 1)
@@ -1578,7 +1589,10 @@
   Intr.Dependencies.insert();
 
   // Now create the call itself.
-  std::string S = CallPrefix.str() + Callee.getMangledName(true) + "(";
+  std::string S = "";
+  if (!Callee.isBigEndianSafe())
+S += CallPrefix.str();
+  S += Callee.getMangledName(true) + "(";
   for (unsigned I = 0; I < DI->getNumArgs() - 1; ++I) {
 if (I != 0)
   S += ", ";
@@ -1889,6 +1903,11 @@
 }
 
 std::string Intrinsic::generate() {
+  // Avoid duplicated code for big and small endians
+  if (isBigEndianSafe()) {
+generateImpl(false, "", "");
+return OS.str();
+  }
   // Little endian intrinsics are simple and don't require any argument
   // swapping.
   OS << "#ifdef __LITTLE_ENDIAN__\n";
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66588: [ARM NEON] Avoid duplicated declarations

2019-08-27 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio planned changes to this revision.
dnsampaio added a comment.

Breaks the header. Needs to avoid generating calls to functions prefixed with
__noswap_ when the callee is BigEndianSafe.
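
For context, a sketch of the failure mode (the intrinsic names are invented for illustration):

  /* The callee is big-endian safe, so it is now emitted only once: */
  __ai int64x1_t vfoo_s64(int64x1_t a) { return a; }
  /* ...but a caller's big-endian branch still referenced the prefixed name: */
  #ifndef __LITTLE_ENDIAN__
  __ai int64x1_t vbar_s64(int64x1_t a) { return __noswap_vfoo_s64(a); } /* undefined */
  #endif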


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66588/new/

https://reviews.llvm.org/D66588



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66588: [ARM NEON] Avoid duplicated declarations

2019-08-23 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio updated this revision to Diff 216801.
dnsampaio added a comment.

- Consider intrinsics whose inputs and outputs are all scalars or
single-element vectors to be BigEndianSafe


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66588/new/

https://reviews.llvm.org/D66588

Files:
  utils/TableGen/NeonEmitter.cpp


Index: utils/TableGen/NeonEmitter.cpp
===
--- utils/TableGen/NeonEmitter.cpp
+++ utils/TableGen/NeonEmitter.cpp
@@ -332,6 +332,17 @@
   NeonEmitter 
   std::stringstream OS;
 
+  bool isBigEndianSafe() const {
+if (BigEndianSafe)
+  return true;
+
+for (const auto  : Types){
+  if (T.isVector() && T.getNumElements() > 1)
+return false;
+}
+return true;
+  }
+
 public:
   Intrinsic(Record *R, StringRef Name, StringRef Proto, TypeSpec OutTS,
 TypeSpec InTS, ClassKind CK, ListInit *Body, NeonEmitter ,
@@ -1293,7 +1304,7 @@
 }
 
 void Intrinsic::emitArgumentReversal() {
-  if (BigEndianSafe)
+  if (isBigEndianSafe())
 return;
 
   // Reverse all vector arguments.
@@ -1314,7 +1325,7 @@
 }
 
 void Intrinsic::emitReturnReversal() {
-  if (BigEndianSafe)
+  if (isBigEndianSafe())
 return;
   if (!getReturnType().isVector() || getReturnType().isVoid() ||
   getReturnType().getNumElements() == 1)
@@ -1889,6 +1900,11 @@
 }
 
 std::string Intrinsic::generate() {
+  // Avoid duplicated code for big and small endians
+  if (isBigEndianSafe()) {
+generateImpl(false, "", "");
+return OS.str();
+  }
   // Little endian intrinsics are simple and don't require any argument
   // swapping.
   OS << "#ifdef __LITTLE_ENDIAN__\n";


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66588: [ARM NEON] Avoid duplicated declarations

2019-08-22 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio created this revision.
dnsampaio added reviewers: t.p.northover, ostannard.
Herald added subscribers: cfe-commits, kristof.beyls, javed.absar.
Herald added a project: clang.

The declarations of ARM NEON intrinsics that are
"big endian safe" print the same code for big-
and little-endian targets.
This patch avoids the duplicates by checking whether
an intrinsic is safe to have a single definition.
(This decreases the header by 6030 lines out of 73k.)
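
For illustration, the kind of duplication this removes from the generated arm_neon.h (a sketch; `vfoo_u64` is an invented intrinsic name):

  /* before: the same definition emitted for both endiannesses */
  #ifdef __LITTLE_ENDIAN__
  __ai uint64x1_t vfoo_u64(uint64x1_t a) { return a; }
  #else
  __ai uint64x1_t vfoo_u64(uint64x1_t a) { return a; }  /* identical: no lanes to reverse */
  #endif
  /* after: a single unconditional definition */
  __ai uint64x1_t vfoo_u64(uint64x1_t a) { return a; }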


Repository:
  rC Clang

https://reviews.llvm.org/D66588

Files:
  utils/TableGen/NeonEmitter.cpp


Index: utils/TableGen/NeonEmitter.cpp
===
--- utils/TableGen/NeonEmitter.cpp
+++ utils/TableGen/NeonEmitter.cpp
@@ -1889,6 +1889,11 @@
 }
 
 std::string Intrinsic::generate() {
+  // Avoid duplicated code for big and small endians
+  if (BigEndianSafe) {
+generateImpl(false, "", "");
+return OS.str();
+  }
   // Little endian intrinsics are simple and don't require any argument
   // swapping.
   OS << "#ifdef __LITTLE_ENDIAN__\n";


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66018: [ARM] Take into account -mcpu and -mfpu options while handling 'crypto' feature

2019-08-22 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

Hi @krisb,
thanks for looking into this, and sorry for the delay; I was out for a week.




Comment at: lib/Driver/ToolChains/Arch/ARM.cpp:486-490
+  if (ArchKind == llvm::ARM::ArchKind::ARMV8A ||
+  ArchKind == llvm::ARM::ArchKind::ARMV8_1A ||
+  ArchKind == llvm::ARM::ArchKind::ARMV8_2A ||
+  ArchKind == llvm::ARM::ArchKind::ARMV8_3A ||
+  ArchKind == llvm::ARM::ArchKind::ARMV8_4A) {

Could we do something a little more future-proof here? For example, it is
already missing ARMV8_5A.
Digging, I found the function `StringRef arm::getLLVMArchSuffixForARM(StringRef
CPU, StringRef Arch, const llvm::Triple &Triple)`.
Perhaps just testing
`(llvm::ARM::parseArchVersion(arm::getLLVMArchSuffixForARM(CPU, Arch, Triple)) >= 8)`
would do a better job.
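
Roughly (a sketch of that suggestion; whether `CPUName`, `ArchName` and `Triple` are all in scope at this point is an assumption based on the surrounding code):

  // Derive the architecture version from the CPU/arch/triple combination
  // instead of enumerating every ARMV8_xA ArchKind by hand.
  StringRef Suffix = arm::getLLVMArchSuffixForARM(CPUName, ArchName, Triple);
  if (llvm::ARM::parseArchVersion(Suffix) >= 8) {
    // v8 or later: handle the crypto -> sha2/aes expansion
  }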




Comment at: lib/Driver/ToolChains/Arch/ARM.cpp:659
   llvm::ARM::ArchKind ArchKind;
-  if (CPU == "generic") {
+  if (CPU == "generic" || CPU.empty()) {
 std::string ARMArch = tools::arm::getARMArch(Arch, Triple);

Good catch.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66018/new/

https://reviews.llvm.org/D66018



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66018: [ARM] Take into account -mcpu and -mfpu options while handling 'crypto' feature

2019-08-14 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added inline comments.



Comment at: lib/Driver/ToolChains/Arch/ARM.cpp:482-486
+  llvm::ARM::ArchKind ArchKind =
+  !ArchName.empty()
+  ? llvm::ARM::parseArch(arm::getARMArch(ArchName, Triple))
+  : llvm::ARM::parseCPUArch(
+arm::getARMTargetCPU(CPUName, ArchName, Triple));

Could just use `llvm::ARM::ArchKind arm::getLLVMArchKindForARM(StringRef CPU,
StringRef Arch, const llvm::Triple &Triple)`?


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66018/new/

https://reviews.llvm.org/D66018



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D65000: [ARM] Set default alignment to 64bits

2019-08-08 Thread Diogo N. Sampaio via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL368288: [ARM] Set default alignment to 64bits (authored by 
dnsampaio, committed by ).

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65000/new/

https://reviews.llvm.org/D65000

Files:
  cfe/trunk/lib/Basic/Targets/ARM.cpp
  cfe/trunk/test/CodeGenCXX/ARM/exception-alignment.cpp
  cfe/trunk/test/SemaCXX/warn-overaligned-type-thrown.cpp


Index: cfe/trunk/test/SemaCXX/warn-overaligned-type-thrown.cpp
===
--- cfe/trunk/test/SemaCXX/warn-overaligned-type-thrown.cpp
+++ cfe/trunk/test/SemaCXX/warn-overaligned-type-thrown.cpp
@@ -2,11 +2,12 @@
 // RUN: %clang_cc1 -triple arm64-apple-ios10 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions -DUNDERALIGNED %s
 // RUN: %clang_cc1 -triple arm64-apple-tvos10 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions -DUNDERALIGNED %s
 // RUN: %clang_cc1 -triple arm64-apple-watchos4 -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions -DUNDERALIGNED %s
+// RUN: %clang_cc1 -triple arm-linux-gnueabi -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions  -DUNDERALIGNED %s
 // RUN: %clang_cc1 -triple x86_64-apple-macosx10.14 -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple arm64-apple-ios12 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple arm64-apple-tvos12 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple arm64-apple-watchos5 -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
-// RUN: %clang_cc1 -triple arm-linux-gnueabi -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
+// RUN: %clang_cc1 -triple arm-linux-androideabi -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple aarch64-linux-gnueabi -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple mipsel-linux-gnu -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple mips64el-linux-gnu -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
Index: cfe/trunk/test/CodeGenCXX/ARM/exception-alignment.cpp
===
--- cfe/trunk/test/CodeGenCXX/ARM/exception-alignment.cpp
+++ cfe/trunk/test/CodeGenCXX/ARM/exception-alignment.cpp
@@ -0,0 +1,21 @@
+// Bug: https://bugs.llvm.org/show_bug.cgi?id=42668
+// REQUIRES: arm-registered-target
+
+// RUN: %clang_cc1 -triple armv8-arm-none-eabi -emit-llvm -target-cpu generic 
-Os -fcxx-exceptions -o - -x c++ %s | FileCheck --check-prefixes=CHECK,A8 %s
+// RUN: %clang_cc1 -triple armv8-unknown-linux-android -emit-llvm -target-cpu 
generic -Os -fcxx-exceptions -o - -x c++ %s | FileCheck 
--check-prefixes=CHECK,A16 %s
+
+// CHECK: [[E:%[A-z0-9]+]] = tail call i8* @__cxa_allocate_exception
+// CHECK-NEXT: [[BC:%[A-z0-9]+]] = bitcast i8* [[E]] to <2 x i64>*
+// A8-NEXT: store <2 x i64> , <2 x i64>* [[BC]], align 8
+// A16-NEXT: store <2 x i64> , <2 x i64>* [[BC]], align 16
+#include 
+
+int main(void) {
+  try {
+throw vld1q_u64(((const uint64_t[2]){1, 2}));
+  } catch (uint64x2_t exc) {
+return 0;
+  }
+  return 1;
+}
+
Index: cfe/trunk/lib/Basic/Targets/ARM.cpp
===
--- cfe/trunk/lib/Basic/Targets/ARM.cpp
+++ cfe/trunk/lib/Basic/Targets/ARM.cpp
@@ -309,8 +309,9 @@
   setAtomic();
 
   // Maximum alignment for ARM NEON data types should be 64-bits (AAPCS)
+  // as well the default alignment
   if (IsAAPCS && (Triple.getEnvironment() != llvm::Triple::Android))
-MaxVectorAlign = 64;
+DefaultAlignForAttributeAligned = MaxVectorAlign = 64;
 
   // Do force alignment of members that follow zero length bitfields.  If
   // the alignment of the zero-length bitfield is greater than the member



[PATCH] D65000: [ARM] Set default alignment to 64bits

2019-08-06 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

I have tested this in our macOS and Linux environments. @thakis @thegameg
@phosek, would it be possible for you to check whether this works for you?


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65000/new/

https://reviews.llvm.org/D65000



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D65000: [ARM] Set default alignment to 64bits

2019-08-06 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio updated this revision to Diff 213540.
dnsampaio added a comment.

Fix test


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65000/new/

https://reviews.llvm.org/D65000

Files:
  lib/Basic/Targets/ARM.cpp
  test/CodeGenCXX/ARM/exception-alignment.cpp
  test/SemaCXX/warn-overaligned-type-thrown.cpp


Index: test/SemaCXX/warn-overaligned-type-thrown.cpp
===
--- test/SemaCXX/warn-overaligned-type-thrown.cpp
+++ test/SemaCXX/warn-overaligned-type-thrown.cpp
@@ -2,11 +2,12 @@
 // RUN: %clang_cc1 -triple arm64-apple-ios10 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions -DUNDERALIGNED %s
 // RUN: %clang_cc1 -triple arm64-apple-tvos10 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions -DUNDERALIGNED %s
 // RUN: %clang_cc1 -triple arm64-apple-watchos4 -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions -DUNDERALIGNED %s
+// RUN: %clang_cc1 -triple arm-linux-gnueabi -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions  -DUNDERALIGNED %s
 // RUN: %clang_cc1 -triple x86_64-apple-macosx10.14 -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple arm64-apple-ios12 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple arm64-apple-tvos12 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple arm64-apple-watchos5 -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
-// RUN: %clang_cc1 -triple arm-linux-gnueabi -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
+// RUN: %clang_cc1 -triple arm-linux-androideabi -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple aarch64-linux-gnueabi -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple mipsel-linux-gnu -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple mips64el-linux-gnu -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
Index: test/CodeGenCXX/ARM/exception-alignment.cpp
===
--- /dev/null
+++ test/CodeGenCXX/ARM/exception-alignment.cpp
@@ -0,0 +1,21 @@
+// Bug: https://bugs.llvm.org/show_bug.cgi?id=42668
+// REQUIRES: arm-registered-target
+
+// RUN: %clang_cc1 -triple armv8-arm-none-eabi -emit-llvm -target-cpu generic 
-Os -fcxx-exceptions -o - -x c++ %s | FileCheck --check-prefixes=CHECK,A8 %s
+// RUN: %clang_cc1 -triple armv8-unknown-linux-android -emit-llvm -target-cpu 
generic -Os -fcxx-exceptions -o - -x c++ %s | FileCheck 
--check-prefixes=CHECK,A16 %s
+
+// CHECK: [[E:%[A-z0-9]+]] = tail call i8* @__cxa_allocate_exception
+// CHECK-NEXT: [[BC:%[A-z0-9]+]] = bitcast i8* [[E]] to <2 x i64>*
+// A8-NEXT: store <2 x i64> , <2 x i64>* [[BC]], align 8
+// A16-NEXT: store <2 x i64> , <2 x i64>* [[BC]], align 16
+#include 
+
+int main(void) {
+  try {
+throw vld1q_u64(((const uint64_t[2]){1, 2}));
+  } catch (uint64x2_t exc) {
+return 0;
+  }
+  return 1;
+}
+
Index: lib/Basic/Targets/ARM.cpp
===
--- lib/Basic/Targets/ARM.cpp
+++ lib/Basic/Targets/ARM.cpp
@@ -309,8 +309,9 @@
   setAtomic();
 
   // Maximum alignment for ARM NEON data types should be 64-bits (AAPCS)
+  // as well the default alignment
   if (IsAAPCS && (Triple.getEnvironment() != llvm::Triple::Android))
-MaxVectorAlign = 64;
+DefaultAlignForAttributeAligned = MaxVectorAlign = 64;
 
   // Do force alignment of members that follow zero length bitfields.  If
   // the alignment of the zero-length bitfield is greater than the member



[PATCH] D65000: [ARM] Set default alignment to 64bits

2019-08-06 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio reopened this revision.
dnsampaio added a comment.
This revision is now accepted and ready to land.

Hi, first of all thanks to those who looked into this, and sorry for the delay.
We have investigated the errors, and it seems the test was in the wrong folder
(inside CodeGen when it should be in CodeGenCXX), and that we should use
%clang_cc1.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65000/new/

https://reviews.llvm.org/D65000



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D65000: [ARM] Set default alignment to 64bits

2019-07-22 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio marked an inline comment as done.
dnsampaio added a comment.

True. Thx again.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65000/new/

https://reviews.llvm.org/D65000



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D65000: [ARM] Set default alignment to 64bits

2019-07-22 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio updated this revision to Diff 211132.
dnsampaio added a comment.

- Joined assignments for default alignments and neon_vector alignment
- Added missing align 8 test


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65000/new/

https://reviews.llvm.org/D65000

Files:
  lib/Basic/Targets/ARM.cpp
  test/CodeGen/ARM/exception-alignment.cpp
  test/SemaCXX/warn-overaligned-type-thrown.cpp


Index: test/SemaCXX/warn-overaligned-type-thrown.cpp
===
--- test/SemaCXX/warn-overaligned-type-thrown.cpp
+++ test/SemaCXX/warn-overaligned-type-thrown.cpp
@@ -2,11 +2,12 @@
 // RUN: %clang_cc1 -triple arm64-apple-ios10 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions -DUNDERALIGNED %s
 // RUN: %clang_cc1 -triple arm64-apple-tvos10 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions -DUNDERALIGNED %s
 // RUN: %clang_cc1 -triple arm64-apple-watchos4 -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions -DUNDERALIGNED %s
+// RUN: %clang_cc1 -triple arm-linux-gnueabi -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions  -DUNDERALIGNED %s
 // RUN: %clang_cc1 -triple x86_64-apple-macosx10.14 -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple arm64-apple-ios12 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple arm64-apple-tvos12 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple arm64-apple-watchos5 -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
-// RUN: %clang_cc1 -triple arm-linux-gnueabi -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
+// RUN: %clang_cc1 -triple arm-linux-androideabi -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple aarch64-linux-gnueabi -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple mipsel-linux-gnu -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple mips64el-linux-gnu -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
Index: test/CodeGen/ARM/exception-alignment.cpp
===
--- /dev/null
+++ test/CodeGen/ARM/exception-alignment.cpp
@@ -0,0 +1,19 @@
+// Bug: https://bugs.llvm.org/show_bug.cgi?id=42668
+// REQUIRES: arm-registered-target
+// RUN: %clang --target=arm-arm-none-eabi -march=armv8-a -S -emit-llvm -Os -o 
- %s | FileCheck --check-prefixes=CHECK,A8 %s
+// RUN: %clang --target=arm-linux-androideabi -march=armv8-a -S -emit-llvm -Os 
-o - %s | FileCheck --check-prefixes=CHECK,A16 %s
+// CHECK: [[E:%[A-z0-9]+]] = tail call i8* @__cxa_allocate_exception
+// CHECK-NEXT: [[BC:%[A-z0-9]+]] = bitcast i8* [[E]] to <2 x i64>*
+// A8-NEXT: store <2 x i64> , <2 x i64>* [[BC]], align 8
+// A16-NEXT: store <2 x i64> , <2 x i64>* [[BC]], align 16
+#include 
+
+int main(void) {
+  try {
+throw vld1q_u64(((const uint64_t[2]){1, 2}));
+  } catch (uint64x2_t exc) {
+return 0;
+  }
+  return 1;
+}
+
Index: lib/Basic/Targets/ARM.cpp
===
--- lib/Basic/Targets/ARM.cpp
+++ lib/Basic/Targets/ARM.cpp
@@ -309,8 +309,9 @@
   setAtomic();
 
   // Maximum alignment for ARM NEON data types should be 64-bits (AAPCS)
+  // as well the default alignment
   if (IsAAPCS && (Triple.getEnvironment() != llvm::Triple::Android))
-MaxVectorAlign = 64;
+DefaultAlignForAttributeAligned = MaxVectorAlign = 64;
 
   // Do force alignment of members that follow zero length bitfields.  If
   // the alignment of the zero-length bitfield is greater than the member



[PATCH] D65000: [ARM] Set default alignment to 64bits

2019-07-22 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio updated this revision to Diff 211103.
dnsampaio added a comment.

- Joined assignments for default alignments and neon_vector alignment


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65000/new/

https://reviews.llvm.org/D65000

Files:
  lib/Basic/Targets/ARM.cpp
  test/CodeGen/ARM/exception-alignment.cpp
  test/SemaCXX/warn-overaligned-type-thrown.cpp


Index: test/SemaCXX/warn-overaligned-type-thrown.cpp
===
--- test/SemaCXX/warn-overaligned-type-thrown.cpp
+++ test/SemaCXX/warn-overaligned-type-thrown.cpp
@@ -2,11 +2,12 @@
 // RUN: %clang_cc1 -triple arm64-apple-ios10 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions -DUNDERALIGNED %s
 // RUN: %clang_cc1 -triple arm64-apple-tvos10 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions -DUNDERALIGNED %s
 // RUN: %clang_cc1 -triple arm64-apple-watchos4 -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions -DUNDERALIGNED %s
+// RUN: %clang_cc1 -triple arm-linux-gnueabi -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions  -DUNDERALIGNED %s
 // RUN: %clang_cc1 -triple x86_64-apple-macosx10.14 -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple arm64-apple-ios12 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple arm64-apple-tvos12 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple arm64-apple-watchos5 -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
-// RUN: %clang_cc1 -triple arm-linux-gnueabi -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
+// RUN: %clang_cc1 -triple arm-linux-androideabi -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple aarch64-linux-gnueabi -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple mipsel-linux-gnu -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple mips64el-linux-gnu -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
Index: test/CodeGen/ARM/exception-alignment.cpp
===
--- /dev/null
+++ test/CodeGen/ARM/exception-alignment.cpp
@@ -0,0 +1,18 @@
+// Bug: https://bugs.llvm.org/show_bug.cgi?id=42668
+// REQUIRES: arm-registered-target
+// RUN: %clang --target=arm-arm-none-eabi -march=armv8-a -S -emit-llvm -Os -o 
- %s | FileCheck --check-prefixes=CHECK,A8 %s
+// RUN: %clang --target=arm-linux-androideabi -march=armv8-a -S -emit-llvm -Os 
-o - %s | FileCheck --check-prefixes=CHECK,A16 %s
+// CHECK: [[E:%[A-z0-9]+]] = tail call i8* @__cxa_allocate_exception
+// CHECK-NEXT: [[BC:%[A-z0-9]+]] = bitcast i8* [[E]] to <2 x i64>*
+// A16-NEXT: store <2 x i64> , <2 x i64>* [[BC]], align 16
+#include 
+
+int main(void) {
+  try {
+throw vld1q_u64(((const uint64_t[2]){1, 2}));
+  } catch (uint64x2_t exc) {
+return 0;
+  }
+  return 1;
+}
+
Index: lib/Basic/Targets/ARM.cpp
===
--- lib/Basic/Targets/ARM.cpp
+++ lib/Basic/Targets/ARM.cpp
@@ -309,8 +309,9 @@
   setAtomic();
 
   // Maximum alignment for ARM NEON data types should be 64-bits (AAPCS)
+  // as well the default alignment
   if (IsAAPCS && (Triple.getEnvironment() != llvm::Triple::Android))
-MaxVectorAlign = 64;
+DefaultAlignForAttributeAligned = MaxVectorAlign = 64;
 
   // Do force alignment of members that follow zero length bitfields.  If
   // the alignment of the zero-length bitfield is greater than the member



[PATCH] D65000: [ARM] Set default alignment to 64bits

2019-07-22 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio marked 2 inline comments as done.
dnsampaio added a comment.

Set the Android ABI default to 128 bits. Added tests for Android and non-Android targets.




Comment at: lib/Basic/Targets/ARM.cpp:311
 
   // Maximum alignment for ARM NEON data types should be 64-bits (AAPCS)
   if (IsAAPCS && (Triple.getEnvironment() != llvm::Triple::Android))

peter.smith wrote:
> I think that Android can require a higher alignment in some cases (See 
> below). I think that this was explained in D33205
Hi @peter.smith,
thanks for the review; I did not know about the Android ABI defaulting to
128 bits. Updated accordingly.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65000/new/

https://reviews.llvm.org/D65000



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D65000: [ARM] Set default alignment to 64bits

2019-07-22 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio updated this revision to Diff 211034.
dnsampaio added a comment.

- Set androideabi alignment to 128 bits


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65000/new/

https://reviews.llvm.org/D65000

Files:
  lib/Basic/Targets/ARM.cpp
  test/CodeGen/ARM/exception-alignment.cpp
  test/SemaCXX/warn-overaligned-type-thrown.cpp


Index: test/SemaCXX/warn-overaligned-type-thrown.cpp
===
--- test/SemaCXX/warn-overaligned-type-thrown.cpp
+++ test/SemaCXX/warn-overaligned-type-thrown.cpp
@@ -2,11 +2,12 @@
 // RUN: %clang_cc1 -triple arm64-apple-ios10 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions -DUNDERALIGNED %s
 // RUN: %clang_cc1 -triple arm64-apple-tvos10 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions -DUNDERALIGNED %s
 // RUN: %clang_cc1 -triple arm64-apple-watchos4 -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions -DUNDERALIGNED %s
+// RUN: %clang_cc1 -triple arm-linux-gnueabi -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions  -DUNDERALIGNED %s
+// RUN: %clang_cc1 -triple arm-linux-androideabi -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple x86_64-apple-macosx10.14 -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple arm64-apple-ios12 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple arm64-apple-tvos12 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple arm64-apple-watchos5 -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
-// RUN: %clang_cc1 -triple arm-linux-gnueabi -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple aarch64-linux-gnueabi -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple mipsel-linux-gnu -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple mips64el-linux-gnu -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
Index: test/CodeGen/ARM/exception-alignment.cpp
===
--- /dev/null
+++ test/CodeGen/ARM/exception-alignment.cpp
@@ -0,0 +1,18 @@
+// Bug: https://bugs.llvm.org/show_bug.cgi?id=42668
+// REQUIRES: arm-registered-target
+// RUN: %clang --target=arm-arm-none-eabi -march=armv8-a -S -emit-llvm -Os -o 
- %s | FileCheck --check-prefixes=CHECK,A8 %s
+// RUN: %clang --target=arm-linux-androideabi -march=armv8-a -S -emit-llvm -Os 
-o - %s | FileCheck --check-prefixes=CHECK,A16 %s
+// CHECK: [[E:%[A-z0-9]+]] = tail call i8* @__cxa_allocate_exception
+// CHECK-NEXT: [[BC:%[A-z0-9]+]] = bitcast i8* [[E]] to <2 x i64>*
+// A16-NEXT: store <2 x i64> , <2 x i64>* [[BC]], align 16
+#include 
+
+int main(void) {
+  try {
+throw vld1q_u64(((const uint64_t[2]){1, 2}));
+  } catch (uint64x2_t exc) {
+return 0;
+  }
+  return 1;
+}
+
Index: lib/Basic/Targets/ARM.cpp
===
--- lib/Basic/Targets/ARM.cpp
+++ lib/Basic/Targets/ARM.cpp
@@ -325,6 +325,11 @@
: "\01mcount";
 
   SoftFloatABI = llvm::is_contained(Opts.FeaturesAsWritten, "+soft-float-abi");
+
+  // For AArch32, the largest alignment required by the ABI is 64-bit,
+  // but the Android ABI uses 128-bit alignment as its default.
+  DefaultAlignForAttributeAligned =
+      (Triple.getEnvironment() == llvm::Triple::Android) ? 128 : 64;
 }
 
 StringRef ARMTargetInfo::getABI() const { return ABI; }



[PATCH] D65000: [ARM] Set default alignment to 64bits

2019-07-19 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio created this revision.
dnsampaio added reviewers: ostannard, dmgreen.
Herald added subscribers: cfe-commits, kristof.beyls, javed.absar.
Herald added a project: clang.

The maximum alignment required by the AArch32 ABI
is 64 bits, not 128.

Assuming 128 bits can cause over-aligned memory
accesses to 128-bit NEON vectors, and such accesses
have unpredictable behaviour.

This fixes: https://bugs.llvm.org/show_bug.cgi?id=42668
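
A minimal sketch of the failure mode (my illustration, mirroring the new test
rather than adding anything to the patch; it assumes an AArch32 target with
NEON and exceptions enabled):

  #include <arm_neon.h>

  void thrower() {
    const uint64_t init[2] = {1, 2};
    // The exception object holds a uint64x2_t. __cxa_allocate_exception on
    // AArch32 only guarantees 8-byte alignment, so assuming 16 bytes here can
    // emit an over-aligned vector store with unpredictable behaviour.
    throw vld1q_u64(init);
  }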


Repository:
  rC Clang

https://reviews.llvm.org/D65000

Files:
  lib/Basic/Targets/ARM.cpp
  test/CodeGen/ARM/exception-alignment.cpp
  test/SemaCXX/warn-overaligned-type-thrown.cpp


Index: test/SemaCXX/warn-overaligned-type-thrown.cpp
===
--- test/SemaCXX/warn-overaligned-type-thrown.cpp
+++ test/SemaCXX/warn-overaligned-type-thrown.cpp
@@ -2,11 +2,11 @@
 // RUN: %clang_cc1 -triple arm64-apple-ios10 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions -DUNDERALIGNED %s
 // RUN: %clang_cc1 -triple arm64-apple-tvos10 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions -DUNDERALIGNED %s
 // RUN: %clang_cc1 -triple arm64-apple-watchos4 -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions -DUNDERALIGNED %s
+// RUN: %clang_cc1 -triple arm-linux-gnueabi -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions  -DUNDERALIGNED %s
 // RUN: %clang_cc1 -triple x86_64-apple-macosx10.14 -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple arm64-apple-ios12 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple arm64-apple-tvos12 -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple arm64-apple-watchos5 -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
-// RUN: %clang_cc1 -triple arm-linux-gnueabi -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple aarch64-linux-gnueabi -verify -fsyntax-only 
-std=c++11 -fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple mipsel-linux-gnu -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
 // RUN: %clang_cc1 -triple mips64el-linux-gnu -verify -fsyntax-only -std=c++11 
-fcxx-exceptions -fexceptions %s
Index: test/CodeGen/ARM/exception-alignment.cpp
===
--- /dev/null
+++ test/CodeGen/ARM/exception-alignment.cpp
@@ -0,0 +1,17 @@
+// Bug: https://bugs.llvm.org/show_bug.cgi?id=42668
+// REQUIRES: arm-registered-target
+// RUN: %clang --target=arm-arm-none-eabi -march=armv8-a -S -emit-llvm -Os -o - %s | FileCheck %s
+// CHECK: [[E:%[A-z0-9]+]] = tail call i8* @__cxa_allocate_exception
+// CHECK-NEXT: [[BC:%[A-z0-9]+]] = bitcast i8* [[E]] to <2 x i64>*
+// CHECK-NEXT: store <2 x i64> , <2 x i64>* [[BC]], align 8
+#include 
+
+int main(void) {
+  try {
+throw vld1q_u64(((const uint64_t[2]){1, 2}));
+  } catch (uint64x2_t exc) {
+return 0;
+  }
+  return 1;
+}
+
Index: lib/Basic/Targets/ARM.cpp
===
--- lib/Basic/Targets/ARM.cpp
+++ lib/Basic/Targets/ARM.cpp
@@ -325,6 +325,9 @@
: "\01mcount";
 
   SoftFloatABI = llvm::is_contained(Opts.FeaturesAsWritten, "+soft-float-abi");
+
+  // For AArch32, the largest alignment required by the ABI is 64-bit
+  DefaultAlignForAttributeAligned = 64;
 }
 
 StringRef ARMTargetInfo::getABI() const { return ABI; }



[PATCH] D64211: [AArch64] Fix vector vuqadd intrinsics operands

2019-07-10 Thread Diogo N. Sampaio via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL365609: [AArch64] Fix vector vuqadd intrinsics operands 
(authored by dnsampaio, committed by ).
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64211/new/

https://reviews.llvm.org/D64211

Files:
  cfe/trunk/include/clang/Basic/arm_neon.td
  cfe/trunk/test/CodeGen/aarch64-neon-intrinsics.c


Index: cfe/trunk/include/clang/Basic/arm_neon.td
===
--- cfe/trunk/include/clang/Basic/arm_neon.td
+++ cfe/trunk/include/clang/Basic/arm_neon.td
@@ -703,7 +703,7 @@
 
 

 // Signed Saturating Accumulated of Unsigned Value
-def SUQADD : SInst<"vuqadd", "ddd", "csilQcQsQiQl">;
+def SUQADD : SInst<"vuqadd", "ddu", "csilQcQsQiQl">;
 
 

 // Unsigned Saturating Accumulated of Signed Value
Index: cfe/trunk/test/CodeGen/aarch64-neon-intrinsics.c
===
--- cfe/trunk/test/CodeGen/aarch64-neon-intrinsics.c
+++ cfe/trunk/test/CodeGen/aarch64-neon-intrinsics.c
@@ -17528,6 +17528,50 @@
   return vabdd_f64(a, b);
 }
 
+// CHECK-LABEL: @test_vuqaddq_s8(
+// CHECK: entry:
+// CHECK-NEXT:  [[V:%.*]] = call <16 x i8> @llvm.aarch64.neon.suqadd.v16i8(<16 
x i8> %a, <16 x i8> %b)
+// CHECK-NEXT:  ret <16 x i8> [[V]]
+int8x16_t test_vuqaddq_s8(int8x16_t a, uint8x16_t b) {
+  return vuqaddq_s8(a, b);
+}
+
+// CHECK-LABEL: @test_vuqaddq_s32(
+// CHECK: [[V:%.*]] = call <4 x i32> @llvm.aarch64.neon.suqadd.v4i32(<4 x i32> 
%a, <4 x i32> %b)
+// CHECK-NEXT:  ret <4 x i32> [[V]]
+int32x4_t test_vuqaddq_s32(int32x4_t a, uint32x4_t b) {
+  return vuqaddq_s32(a, b);
+}
+
+// CHECK-LABEL: @test_vuqaddq_s64(
+// CHECK: [[V:%.*]] = call <2 x i64> @llvm.aarch64.neon.suqadd.v2i64(<2 x i64> 
%a, <2 x i64> %b)
+// CHECK-NEXT:  ret <2 x i64> [[V]]
+int64x2_t test_vuqaddq_s64(int64x2_t a, uint64x2_t b) {
+  return vuqaddq_s64(a, b);
+}
+
+// CHECK-LABEL: @test_vuqaddq_s16(
+// CHECK: [[V:%.*]] = call <8 x i16> @llvm.aarch64.neon.suqadd.v8i16(<8 x i16> 
%a, <8 x i16> %b)
+// CHECK-NEXT:  ret <8 x i16> [[V]]
+int16x8_t test_vuqaddq_s16(int16x8_t a, uint16x8_t b) {
+  return vuqaddq_s16(a, b);
+}
+
+// CHECK-LABEL: @test_vuqadd_s8(
+// CHECK: entry:
+// CHECK-NEXT: [[V:%.*]] = call <8 x i8> @llvm.aarch64.neon.suqadd.v8i8(<8 x 
i8> %a, <8 x i8> %b)
+// CHECK-NEXT: ret <8 x i8> [[V]]
+int8x8_t test_vuqadd_s8(int8x8_t a, uint8x8_t b) {
+  return vuqadd_s8(a, b);
+}
+
+// CHECK-LABEL: @test_vuqadd_s32(
+// CHECK: [[V:%.*]] = call <2 x i32> @llvm.aarch64.neon.suqadd.v2i32(<2 x i32> 
%a, <2 x i32> %b)
+// CHECK-NEXT:  ret <2 x i32> [[V]]
+int32x2_t test_vuqadd_s32(int32x2_t a, uint32x2_t b) {
+  return vuqadd_s32(a, b);
+}
+
 // CHECK-LABEL: @test_vuqadd_s64(
 // CHECK:   [[TMP0:%.*]] = bitcast <1 x i64> %a to <8 x i8>
 // CHECK:   [[TMP1:%.*]] = bitcast <1 x i64> %b to <8 x i8>
@@ -17537,6 +17581,13 @@
   return vuqadd_s64(a, b);
 }
 
+// CHECK-LABEL: @test_vuqadd_s16(
+// CHECK: [[V:%.*]] = call <4 x i16> @llvm.aarch64.neon.suqadd.v4i16(<4 x i16> 
%a, <4 x i16> %b)
+// CHECK-NEXT:  ret <4 x i16> [[V]]
+int16x4_t test_vuqadd_s16(int16x4_t a, uint16x4_t b) {
+  return vuqadd_s16(a, b);
+}
+
 // CHECK-LABEL: @test_vsqadd_u64(
 // CHECK:   [[TMP0:%.*]] = bitcast <1 x i64> %a to <8 x i8>
 // CHECK:   [[TMP1:%.*]] = bitcast <1 x i64> %b to <8 x i8>



[PATCH] D64210: [NFC][AArch64] Fix vector vsqadd intrinsics operands

2019-07-10 Thread Diogo N. Sampaio via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL365608: [NFC][AArch64] Fix vector vsqadd intrinsics operands 
(authored by dnsampaio, committed by ).
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64210/new/

https://reviews.llvm.org/D64210

Files:
  cfe/trunk/include/clang/Basic/arm_neon.td


Index: cfe/trunk/include/clang/Basic/arm_neon.td
===
--- cfe/trunk/include/clang/Basic/arm_neon.td
+++ cfe/trunk/include/clang/Basic/arm_neon.td
@@ -707,7 +707,7 @@
 
 

 // Unsigned Saturating Accumulated of Signed Value
-def USQADD : SInst<"vsqadd", "ddd", "UcUsUiUlQUcQUsQUiQUl">;
+def USQADD : SInst<"vsqadd", "ddx", "UcUsUiUlQUcQUsQUiQUl">;
 
 

 // Reciprocal/Sqrt


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64243: [NFC][AArch64] Fix vector vqtb[lx][1-4]_s8 operand

2019-07-10 Thread Diogo N. Sampaio via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL365598: [NFC][AArch64] Fix vector vqtb[lx][1-4]_s8 operand 
(authored by dnsampaio, committed by ).
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64243/new/

https://reviews.llvm.org/D64243

Files:
  cfe/trunk/include/clang/Basic/arm_neon.td
  cfe/trunk/test/CodeGen/aarch64-neon-tbl.c

Index: cfe/trunk/include/clang/Basic/arm_neon.td
===
--- cfe/trunk/include/clang/Basic/arm_neon.td
+++ cfe/trunk/include/clang/Basic/arm_neon.td
@@ -1070,16 +1070,16 @@
 
 // Table lookup
 let InstName = "vtbl" in {
-def VQTBL1_A64 : WInst<"vqtbl1", "djt",  "UccPcQUcQcQPc">;
-def VQTBL2_A64 : WInst<"vqtbl2", "dBt",  "UccPcQUcQcQPc">;
-def VQTBL3_A64 : WInst<"vqtbl3", "dCt",  "UccPcQUcQcQPc">;
-def VQTBL4_A64 : WInst<"vqtbl4", "dDt",  "UccPcQUcQcQPc">;
+def VQTBL1_A64 : WInst<"vqtbl1", "dju",  "UccPcQUcQcQPc">;
+def VQTBL2_A64 : WInst<"vqtbl2", "dBu",  "UccPcQUcQcQPc">;
+def VQTBL3_A64 : WInst<"vqtbl3", "dCu",  "UccPcQUcQcQPc">;
+def VQTBL4_A64 : WInst<"vqtbl4", "dDu",  "UccPcQUcQcQPc">;
 }
 let InstName = "vtbx" in {
-def VQTBX1_A64 : WInst<"vqtbx1", "ddjt", "UccPcQUcQcQPc">;
-def VQTBX2_A64 : WInst<"vqtbx2", "ddBt", "UccPcQUcQcQPc">;
-def VQTBX3_A64 : WInst<"vqtbx3", "ddCt", "UccPcQUcQcQPc">;
-def VQTBX4_A64 : WInst<"vqtbx4", "ddDt", "UccPcQUcQcQPc">;
+def VQTBX1_A64 : WInst<"vqtbx1", "ddju", "UccPcQUcQcQPc">;
+def VQTBX2_A64 : WInst<"vqtbx2", "ddBu", "UccPcQUcQcQPc">;
+def VQTBX3_A64 : WInst<"vqtbx3", "ddCu", "UccPcQUcQcQPc">;
+def VQTBX4_A64 : WInst<"vqtbx4", "ddDu", "UccPcQUcQcQPc">;
 }
 
 
Index: cfe/trunk/test/CodeGen/aarch64-neon-tbl.c
===
--- cfe/trunk/test/CodeGen/aarch64-neon-tbl.c
+++ cfe/trunk/test/CodeGen/aarch64-neon-tbl.c
@@ -16,7 +16,7 @@
 // CHECK-LABEL: define <8 x i8> @test_vqtbl1_s8(<16 x i8> %a, <8 x i8> %b) #1 {
 // CHECK:   [[VTBL1_I:%.*]] = call <8 x i8> @llvm.aarch64.neon.tbl1.v8i8(<16 x i8> %a, <8 x i8> %b) #3
 // CHECK:   ret <8 x i8> [[VTBL1_I]]
-int8x8_t test_vqtbl1_s8(int8x16_t a, int8x8_t b) {
+int8x8_t test_vqtbl1_s8(int8x16_t a, uint8x8_t b) {
   return vqtbl1_s8(a, b);
 }
 
@@ -59,7 +59,7 @@
 // CHECK:   [[TMP2:%.*]] = load <16 x i8>, <16 x i8>* [[ARRAYIDX2_I]], align 16
 // CHECK:   [[VTBL2_I:%.*]] = call <8 x i8> @llvm.aarch64.neon.tbl2.v8i8(<16 x i8> [[TMP1]], <16 x i8> [[TMP2]], <8 x i8> %b) #3
 // CHECK:   ret <8 x i8> [[VTBL2_I]]
-int8x8_t test_vqtbl2_s8(int8x16x2_t a, int8x8_t b) {
+int8x8_t test_vqtbl2_s8(int8x16x2_t a, uint8x8_t b) {
   return vqtbl2_s8(a, b);
 }
 
@@ -109,7 +109,7 @@
 // CHECK:   [[TMP3:%.*]] = load <16 x i8>, <16 x i8>* [[ARRAYIDX4_I]], align 16
 // CHECK:   [[VTBL3_I:%.*]] = call <8 x i8> @llvm.aarch64.neon.tbl3.v8i8(<16 x i8> [[TMP1]], <16 x i8> [[TMP2]], <16 x i8> [[TMP3]], <8 x i8> %b) #3
 // CHECK:   ret <8 x i8> [[VTBL3_I]]
-int8x8_t test_vqtbl3_s8(int8x16x3_t a, int8x8_t b) {
+int8x8_t test_vqtbl3_s8(int8x16x3_t a, uint8x8_t b) {
   return vqtbl3_s8(a, b);
 }
 
@@ -165,7 +165,7 @@
 // CHECK:   [[TMP4:%.*]] = load <16 x i8>, <16 x i8>* [[ARRAYIDX6_I]], align 16
 // CHECK:   [[VTBL4_I:%.*]] = call <8 x i8> @llvm.aarch64.neon.tbl4.v8i8(<16 x i8> [[TMP1]], <16 x i8> [[TMP2]], <16 x i8> [[TMP3]], <16 x i8> [[TMP4]], <8 x i8> %b) #3
 // CHECK:   ret <8 x i8> [[VTBL4_I]]
-int8x8_t test_vqtbl4_s8(int8x16x4_t a, int8x8_t b) {
+int8x8_t test_vqtbl4_s8(int8x16x4_t a, uint8x8_t b) {
   return vqtbl4_s8(a, b);
 }
 
@@ -348,7 +348,7 @@
 // CHECK-LABEL: define <8 x i8> @test_vqtbx1_s8(<8 x i8> %a, <16 x i8> %b, <8 x i8> %c) #1 {
 // CHECK:   [[VTBX1_I:%.*]] = call <8 x i8> @llvm.aarch64.neon.tbx1.v8i8(<8 x i8> %a, <16 x i8> %b, <8 x i8> %c) #3
 // CHECK:   ret <8 x i8> [[VTBX1_I]]
-int8x8_t test_vqtbx1_s8(int8x8_t a, int8x16_t b, int8x8_t c) {
+int8x8_t test_vqtbx1_s8(int8x8_t a, int8x16_t b, uint8x8_t c) {
   return vqtbx1_s8(a, b, c);
 }
 
@@ -369,7 +369,7 @@
 // CHECK:   [[TMP2:%.*]] = load <16 x i8>, <16 x i8>* [[ARRAYIDX2_I]], align 16
 // CHECK:   [[VTBX2_I:%.*]] = call <8 x i8> @llvm.aarch64.neon.tbx2.v8i8(<8 x i8> %a, <16 x i8> [[TMP1]], <16 x i8> [[TMP2]], <8 x i8> %c) #3
 // CHECK:   ret <8 x i8> [[VTBX2_I]]
-int8x8_t test_vqtbx2_s8(int8x8_t a, int8x16x2_t b, int8x8_t c) {
+int8x8_t test_vqtbx2_s8(int8x8_t a, int8x16x2_t b, uint8x8_t c) {
   return vqtbx2_s8(a, b, c);
 }
 
@@ -393,7 +393,7 @@
 // CHECK:   [[TMP3:%.*]] = load <16 x i8>, <16 x i8>* [[ARRAYIDX4_I]], align 16
 // CHECK:   [[VTBX3_I:%.*]] = call <8 x i8> @llvm.aarch64.neon.tbx3.v8i8(<8 x i8> %a, <16 x i8> [[TMP1]], <16 x i8> [[TMP2]], <16 x i8> [[TMP3]], <8 x i8> %c) #3
 // CHECK:   ret <8 x i8> [[VTBX3_I]]
-int8x8_t 

[PATCH] D64242: [AArch64] Fix scalar vuqadd intrinsics operands

2019-07-08 Thread Diogo N. Sampaio via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL365300: [AArch64] Fix scalar vuqadd intrinsics operands 
(authored by dnsampaio, committed by ).
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D64242?vs=208144=208332#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64242/new/

https://reviews.llvm.org/D64242

Files:
  cfe/trunk/include/clang/Basic/arm_neon.td
  cfe/trunk/test/CodeGen/aarch64-neon-intrinsics.c
  cfe/trunk/test/CodeGen/aarch64-neon-vuqadd-float-conversion-warning.c


Index: cfe/trunk/include/clang/Basic/arm_neon.td
===
--- cfe/trunk/include/clang/Basic/arm_neon.td
+++ cfe/trunk/include/clang/Basic/arm_neon.td
@@ -1333,7 +1333,7 @@
 
 

 // Scalar Signed Saturating Accumulated of Unsigned Value
-def SCALAR_SUQADD : SInst<"vuqadd", "sss", "ScSsSiSl">;
+def SCALAR_SUQADD : SInst<"vuqadd", "ssb", "ScSsSiSl">;
 
 

 // Scalar Unsigned Saturating Accumulated of Signed Value
Index: cfe/trunk/test/CodeGen/aarch64-neon-intrinsics.c
===
--- cfe/trunk/test/CodeGen/aarch64-neon-intrinsics.c
+++ cfe/trunk/test/CodeGen/aarch64-neon-intrinsics.c
@@ -13879,7 +13879,7 @@
 // CHECK:   [[VUQADDB_S8_I:%.*]] = call <8 x i8> 
@llvm.aarch64.neon.suqadd.v8i8(<8 x i8> [[TMP0]], <8 x i8> [[TMP1]])
 // CHECK:   [[TMP2:%.*]] = extractelement <8 x i8> [[VUQADDB_S8_I]], i64 0
 // CHECK:   ret i8 [[TMP2]]
-int8_t test_vuqaddb_s8(int8_t a, int8_t b) {
+int8_t test_vuqaddb_s8(int8_t a, uint8_t b) {
   return (int8_t)vuqaddb_s8(a, b);
 }
 
@@ -13889,21 +13889,21 @@
 // CHECK:   [[VUQADDH_S16_I:%.*]] = call <4 x i16> 
@llvm.aarch64.neon.suqadd.v4i16(<4 x i16> [[TMP0]], <4 x i16> [[TMP1]])
 // CHECK:   [[TMP2:%.*]] = extractelement <4 x i16> [[VUQADDH_S16_I]], i64 0
 // CHECK:   ret i16 [[TMP2]]
-int16_t test_vuqaddh_s16(int16_t a, int16_t b) {
+int16_t test_vuqaddh_s16(int16_t a, uint16_t b) {
   return (int16_t)vuqaddh_s16(a, b);
 }
 
 // CHECK-LABEL: @test_vuqadds_s32(
 // CHECK:   [[VUQADDS_S32_I:%.*]] = call i32 @llvm.aarch64.neon.suqadd.i32(i32 
%a, i32 %b)
 // CHECK:   ret i32 [[VUQADDS_S32_I]]
-int32_t test_vuqadds_s32(int32_t a, int32_t b) {
+int32_t test_vuqadds_s32(int32_t a, uint32_t b) {
   return (int32_t)vuqadds_s32(a, b);
 }
 
 // CHECK-LABEL: @test_vuqaddd_s64(
 // CHECK:   [[VUQADDD_S64_I:%.*]] = call i64 @llvm.aarch64.neon.suqadd.i64(i64 
%a, i64 %b)
 // CHECK:   ret i64 [[VUQADDD_S64_I]]
-int64_t test_vuqaddd_s64(int64_t a, int64_t b) {
+int64_t test_vuqaddd_s64(int64_t a, uint64_t b) {
   return (int64_t)vuqaddd_s64(a, b);
 }
 
Index: cfe/trunk/test/CodeGen/aarch64-neon-vuqadd-float-conversion-warning.c
===
--- cfe/trunk/test/CodeGen/aarch64-neon-vuqadd-float-conversion-warning.c
+++ cfe/trunk/test/CodeGen/aarch64-neon-vuqadd-float-conversion-warning.c
@@ -0,0 +1,26 @@
+// RUN: %clang_cc1 -triple arm64-none-linux-gnu -target-feature +neon \
+// RUN:  -S -disable-O0-optnone -emit-llvm -o - %s 2>&1 | FileCheck %s
+
+#include 
+
+// Check float conversion is not accepted for unsigned int argument
+int8_t test_vuqaddb_s8(){
+  return vuqaddb_s8(1, -1.0f);
+}
+
+int16_t test_vuqaddh_s16() {
+  return vuqaddh_s16(1, -1.0f);
+}
+
+int32_t test_vuqadds_s32() {
+  return vuqadds_s32(1, -1.0f);
+}
+
+int64_t test_vuqaddd_s64() {
+  return vuqaddd_s64(1, -1.0f);
+}
+// CHECK: warning: implicit conversion of out of range value from 'float' to 'uint8_t' (aka 'unsigned char') is undefined
+// CHECK: warning: implicit conversion of out of range value from 'float' to 'uint16_t' (aka 'unsigned short') is undefined
+// CHECK: warning: implicit conversion of out of range value from 'float' to 'uint32_t' (aka 'unsigned int') is undefined
+// CHECK: warning: implicit conversion of out of range value from 'float' to 'uint64_t' (aka 'unsigned long') is undefined
+



[PATCH] D64239: [AArch64] Fix vsqadd scalar intrinsics operands

2019-07-08 Thread Diogo N. Sampaio via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL365298: [AArch64] Fix vsqadd scalar intrinsics operands 
(authored by dnsampaio, committed by ).
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D64239?vs=208137=208330#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64239/new/

https://reviews.llvm.org/D64239

Files:
  cfe/trunk/include/clang/Basic/arm_neon.td
  cfe/trunk/test/CodeGen/aarch64-neon-intrinsics.c
  cfe/trunk/test/CodeGen/aarch64-neon-vsqadd-float-conversion.c

Index: cfe/trunk/include/clang/Basic/arm_neon.td
===
--- cfe/trunk/include/clang/Basic/arm_neon.td
+++ cfe/trunk/include/clang/Basic/arm_neon.td
@@ -1337,7 +1337,7 @@
 
 
 // Scalar Unsigned Saturating Accumulated of Signed Value
-def SCALAR_USQADD : SInst<"vsqadd", "sss", "SUcSUsSUiSUl">;
+def SCALAR_USQADD : SInst<"vsqadd", "ss$", "SUcSUsSUiSUl">;
 
 
 // Signed Saturating Doubling Multiply-Add Long
Index: cfe/trunk/test/CodeGen/aarch64-neon-vsqadd-float-conversion.c
===
--- cfe/trunk/test/CodeGen/aarch64-neon-vsqadd-float-conversion.c
+++ cfe/trunk/test/CodeGen/aarch64-neon-vsqadd-float-conversion.c
@@ -0,0 +1,49 @@
+// RUN: %clang_cc1 -triple arm64-none-linux-gnu -target-feature +neon \
+// RUN:  -S -disable-O0-optnone -emit-llvm -o - %s | opt -S -mem2reg -dce \
+// RUN: | FileCheck %s
+
+#include 
+
+// Check float conversion is accepted for int argument
+uint8_t test_vsqaddb_u8(){
+  return vsqaddb_u8(1, -1.0f);
+}
+
+uint16_t test_vsqaddh_u16() {
+  return vsqaddh_u16(1, -1.0f);
+}
+
+uint32_t test_vsqadds_u32() {
+  return vsqadds_u32(1, -1.0f);
+}
+
+uint64_t test_vsqaddd_u64() {
+  return vsqaddd_u64(1, -1.0f);
+}
+
+// CHECK-LABEL: @test_vsqaddb_u8()
+// CHECK: entry:
+// CHECK-NEXT: [[T0:%.*]] = insertelement <8 x i8> undef, i8 1, i64 0
+// CHECK-NEXT: [[T1:%.*]] = insertelement <8 x i8> undef, i8 -1, i64 0
+// CHECK-NEXT: [[V:%.*]] = call <8 x i8> @llvm.aarch64.neon.usqadd.v8i8(<8 x i8> [[T0]], <8 x i8> [[T1]])
+// CHECK-NEXT: [[R:%.*]] = extractelement <8 x i8> [[V]], i64 0
+// CHECK-NEXT: ret i8 [[R]]
+
+// CHECK-LABEL: @test_vsqaddh_u16()
+// CHECK: entry:
+// CHECK-NEXT: [[T0:%.*]] = insertelement <4 x i16> undef, i16 1, i64 0
+// CHECK-NEXT: [[T1:%.*]] = insertelement <4 x i16> undef, i16 -1, i64 0
+// CHECK-NEXT: [[V:%.*]]  = call <4 x i16> @llvm.aarch64.neon.usqadd.v4i16(<4 x i16> [[T0]], <4 x i16> [[T1]])
+// CHECK-NEXT: [[R:%.*]] = extractelement <4 x i16> [[V]], i64 0
+// CHECK-NEXT: ret i16 [[R]]
+
+// CHECK-LABEL: @test_vsqadds_u32()
+// CHECK: entry:
+// CHECK-NEXT: [[V:%.*]] = call i32 @llvm.aarch64.neon.usqadd.i32(i32 1, i32 -1)
+// CHECK-NEXT: ret i32 [[V]]
+
+// CHECK-LABEL: @test_vsqaddd_u64()
+// CHECK: entry:
+// CHECK-NEXT: [[V:%.*]] = call i64 @llvm.aarch64.neon.usqadd.i64(i64 1, i64 -1)
+// CHECK-NEXT: ret i64 [[V]]
+
Index: cfe/trunk/test/CodeGen/aarch64-neon-intrinsics.c
===
--- cfe/trunk/test/CodeGen/aarch64-neon-intrinsics.c
+++ cfe/trunk/test/CodeGen/aarch64-neon-intrinsics.c
@@ -13913,7 +13913,7 @@
 // CHECK:   [[VSQADDB_U8_I:%.*]] = call <8 x i8> @llvm.aarch64.neon.usqadd.v8i8(<8 x i8> [[TMP0]], <8 x i8> [[TMP1]])
 // CHECK:   [[TMP2:%.*]] = extractelement <8 x i8> [[VSQADDB_U8_I]], i64 0
 // CHECK:   ret i8 [[TMP2]]
-uint8_t test_vsqaddb_u8(uint8_t a, uint8_t b) {
+uint8_t test_vsqaddb_u8(uint8_t a, int8_t b) {
   return (uint8_t)vsqaddb_u8(a, b);
 }
 
@@ -13923,21 +13923,21 @@
 // CHECK:   [[VSQADDH_U16_I:%.*]] = call <4 x i16> @llvm.aarch64.neon.usqadd.v4i16(<4 x i16> [[TMP0]], <4 x i16> [[TMP1]])
 // CHECK:   [[TMP2:%.*]] = extractelement <4 x i16> [[VSQADDH_U16_I]], i64 0
 // CHECK:   ret i16 [[TMP2]]
-uint16_t test_vsqaddh_u16(uint16_t a, uint16_t b) {
+uint16_t test_vsqaddh_u16(uint16_t a, int16_t b) {
   return (uint16_t)vsqaddh_u16(a, b);
 }
 
 // CHECK-LABEL: @test_vsqadds_u32(
 // CHECK:   [[VSQADDS_U32_I:%.*]] = call i32 @llvm.aarch64.neon.usqadd.i32(i32 %a, i32 %b)
 // CHECK:   ret i32 [[VSQADDS_U32_I]]
-uint32_t test_vsqadds_u32(uint32_t a, uint32_t b) {
+uint32_t test_vsqadds_u32(uint32_t a, int32_t b) {
   return (uint32_t)vsqadds_u32(a, b);
 }
 
 // CHECK-LABEL: @test_vsqaddd_u64(
 // CHECK:   [[VSQADDD_U64_I:%.*]] = call i64 @llvm.aarch64.neon.usqadd.i64(i64 %a, i64 %b)
 // CHECK:   ret i64 [[VSQADDD_U64_I]]
-uint64_t test_vsqaddd_u64(uint64_t a, uint64_t b) {
+uint64_t test_vsqaddd_u64(uint64_t a, int64_t b) {
   return (uint64_t)vsqaddd_u64(a, b);
 }
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org

[PATCH] D64243: [NFC][AArch64] Fix vector vqtb[lx][1-4]_s8 operand

2019-07-05 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio created this revision.
dnsampaio added a reviewer: LukeCheeseman.
Herald added subscribers: cfe-commits, kristof.beyls, javed.absar.
Herald added a project: clang.

Change the vqtb[lx][1-4]_s8 intrinsics so that the last argument is a vector of
unsigned values, not
signed, in accordance with
https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics
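
A short usage sketch under the corrected signature (my example, not from the
patch; assumes <arm_neon.h> on AArch64):

  #include <arm_neon.h>

  // The index vector is uint8x8_t, matching the ACLE; out-of-range indices
  // select zero in the corresponding result lanes.
  int8x8_t pick_bytes(int8x16_t table, uint8x8_t idx) {
    return vqtbl1_s8(table, idx);
  }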


Repository:
  rC Clang

https://reviews.llvm.org/D64243

Files:
  include/clang/Basic/arm_neon.td
  test/CodeGen/aarch64-neon-tbl.c

Index: test/CodeGen/aarch64-neon-tbl.c
===
--- test/CodeGen/aarch64-neon-tbl.c
+++ test/CodeGen/aarch64-neon-tbl.c
@@ -16,7 +16,7 @@
 // CHECK-LABEL: define <8 x i8> @test_vqtbl1_s8(<16 x i8> %a, <8 x i8> %b) #1 {
 // CHECK:   [[VTBL1_I:%.*]] = call <8 x i8> @llvm.aarch64.neon.tbl1.v8i8(<16 x i8> %a, <8 x i8> %b) #3
 // CHECK:   ret <8 x i8> [[VTBL1_I]]
-int8x8_t test_vqtbl1_s8(int8x16_t a, int8x8_t b) {
+int8x8_t test_vqtbl1_s8(int8x16_t a, uint8x8_t b) {
   return vqtbl1_s8(a, b);
 }
 
@@ -59,7 +59,7 @@
 // CHECK:   [[TMP2:%.*]] = load <16 x i8>, <16 x i8>* [[ARRAYIDX2_I]], align 16
 // CHECK:   [[VTBL2_I:%.*]] = call <8 x i8> @llvm.aarch64.neon.tbl2.v8i8(<16 x i8> [[TMP1]], <16 x i8> [[TMP2]], <8 x i8> %b) #3
 // CHECK:   ret <8 x i8> [[VTBL2_I]]
-int8x8_t test_vqtbl2_s8(int8x16x2_t a, int8x8_t b) {
+int8x8_t test_vqtbl2_s8(int8x16x2_t a, uint8x8_t b) {
   return vqtbl2_s8(a, b);
 }
 
@@ -109,7 +109,7 @@
 // CHECK:   [[TMP3:%.*]] = load <16 x i8>, <16 x i8>* [[ARRAYIDX4_I]], align 16
 // CHECK:   [[VTBL3_I:%.*]] = call <8 x i8> @llvm.aarch64.neon.tbl3.v8i8(<16 x i8> [[TMP1]], <16 x i8> [[TMP2]], <16 x i8> [[TMP3]], <8 x i8> %b) #3
 // CHECK:   ret <8 x i8> [[VTBL3_I]]
-int8x8_t test_vqtbl3_s8(int8x16x3_t a, int8x8_t b) {
+int8x8_t test_vqtbl3_s8(int8x16x3_t a, uint8x8_t b) {
   return vqtbl3_s8(a, b);
 }
 
@@ -165,7 +165,7 @@
 // CHECK:   [[TMP4:%.*]] = load <16 x i8>, <16 x i8>* [[ARRAYIDX6_I]], align 16
 // CHECK:   [[VTBL4_I:%.*]] = call <8 x i8> @llvm.aarch64.neon.tbl4.v8i8(<16 x i8> [[TMP1]], <16 x i8> [[TMP2]], <16 x i8> [[TMP3]], <16 x i8> [[TMP4]], <8 x i8> %b) #3
 // CHECK:   ret <8 x i8> [[VTBL4_I]]
-int8x8_t test_vqtbl4_s8(int8x16x4_t a, int8x8_t b) {
+int8x8_t test_vqtbl4_s8(int8x16x4_t a, uint8x8_t b) {
   return vqtbl4_s8(a, b);
 }
 
@@ -348,7 +348,7 @@
 // CHECK-LABEL: define <8 x i8> @test_vqtbx1_s8(<8 x i8> %a, <16 x i8> %b, <8 x i8> %c) #1 {
 // CHECK:   [[VTBX1_I:%.*]] = call <8 x i8> @llvm.aarch64.neon.tbx1.v8i8(<8 x i8> %a, <16 x i8> %b, <8 x i8> %c) #3
 // CHECK:   ret <8 x i8> [[VTBX1_I]]
-int8x8_t test_vqtbx1_s8(int8x8_t a, int8x16_t b, int8x8_t c) {
+int8x8_t test_vqtbx1_s8(int8x8_t a, int8x16_t b, uint8x8_t c) {
   return vqtbx1_s8(a, b, c);
 }
 
@@ -369,7 +369,7 @@
 // CHECK:   [[TMP2:%.*]] = load <16 x i8>, <16 x i8>* [[ARRAYIDX2_I]], align 16
 // CHECK:   [[VTBX2_I:%.*]] = call <8 x i8> @llvm.aarch64.neon.tbx2.v8i8(<8 x i8> %a, <16 x i8> [[TMP1]], <16 x i8> [[TMP2]], <8 x i8> %c) #3
 // CHECK:   ret <8 x i8> [[VTBX2_I]]
-int8x8_t test_vqtbx2_s8(int8x8_t a, int8x16x2_t b, int8x8_t c) {
+int8x8_t test_vqtbx2_s8(int8x8_t a, int8x16x2_t b, uint8x8_t c) {
   return vqtbx2_s8(a, b, c);
 }
 
@@ -393,7 +393,7 @@
 // CHECK:   [[TMP3:%.*]] = load <16 x i8>, <16 x i8>* [[ARRAYIDX4_I]], align 16
 // CHECK:   [[VTBX3_I:%.*]] = call <8 x i8> @llvm.aarch64.neon.tbx3.v8i8(<8 x i8> %a, <16 x i8> [[TMP1]], <16 x i8> [[TMP2]], <16 x i8> [[TMP3]], <8 x i8> %c) #3
 // CHECK:   ret <8 x i8> [[VTBX3_I]]
-int8x8_t test_vqtbx3_s8(int8x8_t a, int8x16x3_t b, int8x8_t c) {
+int8x8_t test_vqtbx3_s8(int8x8_t a, int8x16x3_t b, uint8x8_t c) {
   return vqtbx3_s8(a, b, c);
 }
 
@@ -420,14 +420,14 @@
 // CHECK:   [[TMP4:%.*]] = load <16 x i8>, <16 x i8>* [[ARRAYIDX6_I]], align 16
 // CHECK:   [[VTBX4_I:%.*]] = call <8 x i8> @llvm.aarch64.neon.tbx4.v8i8(<8 x i8> %a, <16 x i8> [[TMP1]], <16 x i8> [[TMP2]], <16 x i8> [[TMP3]], <16 x i8> [[TMP4]], <8 x i8> %c) #3
 // CHECK:   ret <8 x i8> [[VTBX4_I]]
-int8x8_t test_vqtbx4_s8(int8x8_t a, int8x16x4_t b, int8x8_t c) {
+int8x8_t test_vqtbx4_s8(int8x8_t a, int8x16x4_t b, uint8x8_t c) {
   return vqtbx4_s8(a, b, c);
 }
 
 // CHECK-LABEL: define <16 x i8> @test_vqtbx1q_s8(<16 x i8> %a, <16 x i8> %b, <16 x i8> %c) #1 {
 // CHECK:   [[VTBX1_I:%.*]] = call <16 x i8> @llvm.aarch64.neon.tbx1.v16i8(<16 x i8> %a, <16 x i8> %b, <16 x i8> %c) #3
 // CHECK:   ret <16 x i8> [[VTBX1_I]]
-int8x16_t test_vqtbx1q_s8(int8x16_t a, int8x16_t b, int8x16_t c) {
+int8x16_t test_vqtbx1q_s8(int8x16_t a, int8x16_t b, uint8x16_t c) {
   return vqtbx1q_s8(a, b, c);
 }
 
Index: include/clang/Basic/arm_neon.td
===
--- include/clang/Basic/arm_neon.td
+++ include/clang/Basic/arm_neon.td
@@ -1070,16 +1070,16 @@
 
 // Table lookup
 let InstName = "vtbl" in {
-def VQTBL1_A64 : 

[PATCH] D64242: [AArch64] Fix scalar vuqadd intrinsics operands

2019-07-05 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio updated this revision to Diff 208144.
dnsampaio added a comment.

- Fix previously existing tests


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64242/new/

https://reviews.llvm.org/D64242

Files:
  include/clang/Basic/arm_neon.td
  test/CodeGen/aarch64-neon-intrinsics.c
  test/CodeGen/aarch64-neon-vuqadd-float-conversion-warning.c


Index: test/CodeGen/aarch64-neon-vuqadd-float-conversion-warning.c
===
--- /dev/null
+++ test/CodeGen/aarch64-neon-vuqadd-float-conversion-warning.c
@@ -0,0 +1,27 @@
+// RUN: %clang_cc1 -triple arm64-none-linux-gnu -target-feature +neon \
+// RUN: -fallow-half-arguments-and-returns -S -disable-O0-optnone 
-emit-llvm -o - %s 2>&1 \
+// RUN: | FileCheck %s
+
+#include 
+
+// Check float conversion is not accepted for unsigned int argument
+int8_t test_vuqaddb_s8(){
+  return vuqaddb_s8(1, -1.0f);
+}
+
+int16_t test_vuqaddh_s16() {
+  return vuqaddh_s16(1, -1.0f);
+}
+
+int32_t test_vuqadds_s32() {
+  return vuqadds_s32(1, -1.0f);
+}
+
+int64_t test_vuqaddd_s64() {
+  return vuqaddd_s64(1, -1.0f);
+}
+// CHECK: warning: implicit conversion of out of range value from 'float' to 'uint8_t' (aka 'unsigned char') is undefined
+// CHECK: warning: implicit conversion of out of range value from 'float' to 'uint16_t' (aka 'unsigned short') is undefined
+// CHECK: warning: implicit conversion of out of range value from 'float' to 'uint32_t' (aka 'unsigned int') is undefined
+// CHECK: warning: implicit conversion of out of range value from 'float' to 'uint64_t' (aka 'unsigned long') is undefined
+
Index: test/CodeGen/aarch64-neon-intrinsics.c
===
--- test/CodeGen/aarch64-neon-intrinsics.c
+++ test/CodeGen/aarch64-neon-intrinsics.c
@@ -13879,7 +13879,7 @@
 // CHECK:   [[VUQADDB_S8_I:%.*]] = call <8 x i8> 
@llvm.aarch64.neon.suqadd.v8i8(<8 x i8> [[TMP0]], <8 x i8> [[TMP1]])
 // CHECK:   [[TMP2:%.*]] = extractelement <8 x i8> [[VUQADDB_S8_I]], i64 0
 // CHECK:   ret i8 [[TMP2]]
-int8_t test_vuqaddb_s8(int8_t a, int8_t b) {
+int8_t test_vuqaddb_s8(int8_t a, uint8_t b) {
   return (int8_t)vuqaddb_s8(a, b);
 }
 
@@ -13889,21 +13889,21 @@
 // CHECK:   [[VUQADDH_S16_I:%.*]] = call <4 x i16> 
@llvm.aarch64.neon.suqadd.v4i16(<4 x i16> [[TMP0]], <4 x i16> [[TMP1]])
 // CHECK:   [[TMP2:%.*]] = extractelement <4 x i16> [[VUQADDH_S16_I]], i64 0
 // CHECK:   ret i16 [[TMP2]]
-int16_t test_vuqaddh_s16(int16_t a, int16_t b) {
+int16_t test_vuqaddh_s16(int16_t a, uint16_t b) {
   return (int16_t)vuqaddh_s16(a, b);
 }
 
 // CHECK-LABEL: @test_vuqadds_s32(
 // CHECK:   [[VUQADDS_S32_I:%.*]] = call i32 @llvm.aarch64.neon.suqadd.i32(i32 
%a, i32 %b)
 // CHECK:   ret i32 [[VUQADDS_S32_I]]
-int32_t test_vuqadds_s32(int32_t a, int32_t b) {
+int32_t test_vuqadds_s32(int32_t a, uint32_t b) {
   return (int32_t)vuqadds_s32(a, b);
 }
 
 // CHECK-LABEL: @test_vuqaddd_s64(
 // CHECK:   [[VUQADDD_S64_I:%.*]] = call i64 @llvm.aarch64.neon.suqadd.i64(i64 
%a, i64 %b)
 // CHECK:   ret i64 [[VUQADDD_S64_I]]
-int64_t test_vuqaddd_s64(int64_t a, int64_t b) {
+int64_t test_vuqaddd_s64(int64_t a, uint64_t b) {
   return (int64_t)vuqaddd_s64(a, b);
 }
 
Index: include/clang/Basic/arm_neon.td
===
--- include/clang/Basic/arm_neon.td
+++ include/clang/Basic/arm_neon.td
@@ -1333,7 +1333,7 @@
 
 

 // Scalar Signed Saturating Accumulated of Unsigned Value
-def SCALAR_SUQADD : SInst<"vuqadd", "sss", "ScSsSiSl">;
+def SCALAR_SUQADD : SInst<"vuqadd", "ssb", "ScSsSiSl">;
 
 

 // Scalar Unsigned Saturating Accumulated of Signed Value



[PATCH] D64242: [AArch64] Fix scalar vuqadd intrinsics operands

2019-07-05 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio created this revision.
dnsampaio added a reviewer: LukeCheeseman.
Herald added subscribers: cfe-commits, kristof.beyls, javed.absar.
Herald added a project: clang.

Change the vuqadd scalar intrinsics so that the second argument takes unsigned
values, not signed,
in accordance with
https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics

With this, the compiler correctly warns when an undefined conversion of a
negative float to an unsigned type is performed.
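
A brief sketch under the corrected signature (my example, not part of the
patch; assumes <arm_neon.h> on AArch64):

  #include <arm_neon.h>

  // SUQADD: signed accumulator plus unsigned addend, saturating. A negative
  // float literal passed as the second argument is now diagnosed, because
  // converting a negative float to an unsigned integer is undefined.
  int8_t accumulate(int8_t acc, uint8_t amount) {
    return vuqaddb_s8(acc, amount);
  }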


Repository:
  rC Clang

https://reviews.llvm.org/D64242

Files:
  include/clang/Basic/arm_neon.td
  test/CodeGen/aarch64-neon-vuqadd-float-conversion-warning.c


Index: test/CodeGen/aarch64-neon-vuqadd-float-conversion-warning.c
===
--- /dev/null
+++ test/CodeGen/aarch64-neon-vuqadd-float-conversion-warning.c
@@ -0,0 +1,27 @@
+// RUN: %clang_cc1 -triple arm64-none-linux-gnu -target-feature +neon \
+// RUN: -fallow-half-arguments-and-returns -S -disable-O0-optnone -emit-llvm -o - %s 2>&1 \
+// RUN: | FileCheck %s
+
+#include 
+
+// Check float conversion is not accepted for unsigned int argument
+int8_t test_vuqaddb_s8(){
+  return vuqaddb_s8(1, -1.0f);
+}
+
+int16_t test_vuqaddh_s16() {
+  return vuqaddh_s16(1, -1.0f);
+}
+
+int32_t test_vuqadds_s32() {
+  return vuqadds_s32(1, -1.0f);
+}
+
+int64_t test_vuqaddd_s64() {
+  return vuqaddd_s64(1, -1.0f);
+}
+// CHECK: warning: implicit conversion of out of range value from 'float' to 'uint8_t' (aka 'unsigned char') is undefined
+// CHECK: warning: implicit conversion of out of range value from 'float' to 'uint16_t' (aka 'unsigned short') is undefined
+// CHECK: warning: implicit conversion of out of range value from 'float' to 'uint32_t' (aka 'unsigned int') is undefined
+// CHECK: warning: implicit conversion of out of range value from 'float' to 'uint64_t' (aka 'unsigned long') is undefined
+
Index: include/clang/Basic/arm_neon.td
===
--- include/clang/Basic/arm_neon.td
+++ include/clang/Basic/arm_neon.td
@@ -1333,7 +1333,7 @@
 
 

 // Scalar Signed Saturating Accumulated of Unsigned Value
-def SCALAR_SUQADD : SInst<"vuqadd", "sss", "ScSsSiSl">;
+def SCALAR_SUQADD : SInst<"vuqadd", "ssb", "ScSsSiSl">;
 
 

 // Scalar Unsigned Saturating Accumulated of Signed Value


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64239: [AArch64] Fix vsqadd scalar intrinsics operands

2019-07-05 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio created this revision.
dnsampaio added a reviewer: LukeCheeseman.
Herald added subscribers: cfe-commits, kristof.beyls, javed.absar.
Herald added a project: clang.

Change the vsqadd scalar intrinsics so that the second argument takes signed
values, not unsigned,
in accordance with
https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics

The existing unsigned argument can produce faulty code: converting a negative
float to an unsigned type is undefined behaviour,
which llvm/clang optimizes away.
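
A minimal sketch of the undefined conversion mentioned above (plain C++, no
intrinsics needed; my example, not from the patch):

  // Converting a negative floating-point value to an unsigned integer type is
  // undefined behaviour, so the optimizer may assume it never happens and fold
  // the conversion away; that is how the old uint64_t operand hid the bug.
  unsigned long long bad_convert() {
    float f = -1.0f;
    return static_cast<unsigned long long>(f);  // UB: value not representable
  }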


Repository:
  rC Clang

https://reviews.llvm.org/D64239

Files:
  include/clang/Basic/arm_neon.td
  test/CodeGen/aarch64-neon-intrinsics.c
  test/CodeGen/aarch64-neon-vsqadd-float-conversion.c

Index: test/CodeGen/aarch64-neon-vsqadd-float-conversion.c
===
--- /dev/null
+++ test/CodeGen/aarch64-neon-vsqadd-float-conversion.c
@@ -0,0 +1,50 @@
+// RUN: %clang_cc1 -triple arm64-none-linux-gnu -target-feature +neon \
+// RUN: -fallow-half-arguments-and-returns -S -disable-O0-optnone -emit-llvm -o - %s \
+// RUN: | opt -S -mem2reg -dce \
+// RUN: | FileCheck %s
+
+#include 
+
+// Check float conversion is accepted for int argument
+uint8_t test_vsqaddb_u8(){
+  return vsqaddb_u8(1, -1.0f);
+}
+
+uint16_t test_vsqaddh_u16() {
+  return vsqaddh_u16(1, -1.0f);
+}
+
+uint32_t test_vsqadds_u32() {
+  return vsqadds_u32(1, -1.0f);
+}
+
+uint64_t test_vsqaddd_u64() {
+  return vsqaddd_u64(1, -1.0f);
+}
+
+// CHECK-LABEL: @test_vsqaddb_u8()
+// CHECK: entry:
+// CHECK-NEXT: [[T0:%.*]] = insertelement <8 x i8> undef, i8 1, i64 0
+// CHECK-NEXT: [[T1:%.*]] = insertelement <8 x i8> undef, i8 -1, i64 0
+// CHECK-NEXT: [[V:%.*]] = call <8 x i8> @llvm.aarch64.neon.usqadd.v8i8(<8 x i8> [[T0]], <8 x i8> [[T1]])
+// CHECK-NEXT: [[R:%.*]] = extractelement <8 x i8> [[V]], i64 0
+// CHECK-NEXT: ret i8 [[R]]
+
+// CHECK-LABEL: @test_vsqaddh_u16()
+// CHECK: entry:
+// CHECK-NEXT: [[T0:%.*]] = insertelement <4 x i16> undef, i16 1, i64 0
+// CHECK-NEXT: [[T1:%.*]] = insertelement <4 x i16> undef, i16 -1, i64 0
+// CHECK-NEXT: [[V:%.*]]  = call <4 x i16> @llvm.aarch64.neon.usqadd.v4i16(<4 x i16> [[T0]], <4 x i16> [[T1]])
+// CHECK-NEXT: [[R:%.*]] = extractelement <4 x i16> [[V]], i64 0
+// CHECK-NEXT: ret i16 [[R]]
+
+// CHECK-LABEL: @test_vsqadds_u32()
+// CHECK: entry:
+// CHECK-NEXT: [[V:%.*]] = call i32 @llvm.aarch64.neon.usqadd.i32(i32 1, i32 -1)
+// CHECK-NEXT: ret i32 [[V]]
+
+// CHECK-LABEL: @test_vsqaddd_u64()
+// CHECK: entry:
+// CHECK-NEXT: [[V:%.*]] = call i64 @llvm.aarch64.neon.usqadd.i64(i64 1, i64 -1)
+// CHECK-NEXT: ret i64 [[V]]
+
Index: test/CodeGen/aarch64-neon-intrinsics.c
===
--- test/CodeGen/aarch64-neon-intrinsics.c
+++ test/CodeGen/aarch64-neon-intrinsics.c
@@ -13913,7 +13913,7 @@
 // CHECK:   [[VSQADDB_U8_I:%.*]] = call <8 x i8> @llvm.aarch64.neon.usqadd.v8i8(<8 x i8> [[TMP0]], <8 x i8> [[TMP1]])
 // CHECK:   [[TMP2:%.*]] = extractelement <8 x i8> [[VSQADDB_U8_I]], i64 0
 // CHECK:   ret i8 [[TMP2]]
-uint8_t test_vsqaddb_u8(uint8_t a, uint8_t b) {
+uint8_t test_vsqaddb_u8(uint8_t a, int8_t b) {
   return (uint8_t)vsqaddb_u8(a, b);
 }
 
@@ -13923,21 +13923,21 @@
 // CHECK:   [[VSQADDH_U16_I:%.*]] = call <4 x i16> @llvm.aarch64.neon.usqadd.v4i16(<4 x i16> [[TMP0]], <4 x i16> [[TMP1]])
 // CHECK:   [[TMP2:%.*]] = extractelement <4 x i16> [[VSQADDH_U16_I]], i64 0
 // CHECK:   ret i16 [[TMP2]]
-uint16_t test_vsqaddh_u16(uint16_t a, uint16_t b) {
+uint16_t test_vsqaddh_u16(uint16_t a, int16_t b) {
   return (uint16_t)vsqaddh_u16(a, b);
 }
 
 // CHECK-LABEL: @test_vsqadds_u32(
 // CHECK:   [[VSQADDS_U32_I:%.*]] = call i32 @llvm.aarch64.neon.usqadd.i32(i32 %a, i32 %b)
 // CHECK:   ret i32 [[VSQADDS_U32_I]]
-uint32_t test_vsqadds_u32(uint32_t a, uint32_t b) {
+uint32_t test_vsqadds_u32(uint32_t a, int32_t b) {
   return (uint32_t)vsqadds_u32(a, b);
 }
 
 // CHECK-LABEL: @test_vsqaddd_u64(
 // CHECK:   [[VSQADDD_U64_I:%.*]] = call i64 @llvm.aarch64.neon.usqadd.i64(i64 %a, i64 %b)
 // CHECK:   ret i64 [[VSQADDD_U64_I]]
-uint64_t test_vsqaddd_u64(uint64_t a, uint64_t b) {
+uint64_t test_vsqaddd_u64(uint64_t a, int64_t b) {
   return (uint64_t)vsqaddd_u64(a, b);
 }
 
Index: include/clang/Basic/arm_neon.td
===
--- include/clang/Basic/arm_neon.td
+++ include/clang/Basic/arm_neon.td
@@ -1337,7 +1337,7 @@
 
 
 // Scalar Unsigned Saturating Accumulated of Signed Value
-def SCALAR_USQADD : SInst<"vsqadd", "sss", "SUcSUsSUiSUl">;
+def SCALAR_USQADD : SInst<"vsqadd", "ss$", "SUcSUsSUiSUl">;
 
 
 // Signed Saturating Doubling Multiply-Add Long
___
cfe-commits mailing list
cfe-commits@lists.llvm.org

[PATCH] D64211: [ARM] Fix vector vuqadd intrinsics operands

2019-07-05 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio updated this revision to Diff 208127.
dnsampaio added a comment.

- Added tests


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64211/new/

https://reviews.llvm.org/D64211

Files:
  include/clang/Basic/arm_neon.td
  test/CodeGen/aarch64-neon-intrinsics.c


Index: test/CodeGen/aarch64-neon-intrinsics.c
===
--- test/CodeGen/aarch64-neon-intrinsics.c
+++ test/CodeGen/aarch64-neon-intrinsics.c
@@ -17528,6 +17528,50 @@
   return vabdd_f64(a, b);
 }
 
+// CHECK-LABEL: @test_vuqaddq_s8(
+// CHECK: entry:
+// CHECK-NEXT:  [[V:%.*]] = call <16 x i8> @llvm.aarch64.neon.suqadd.v16i8(<16 
x i8> %a, <16 x i8> %b)
+// CHECK-NEXT:  ret <16 x i8> [[V]]
+int8x16_t test_vuqaddq_s8(int8x16_t a, uint8x16_t b) {
+  return vuqaddq_s8(a, b);
+}
+
+// CHECK-LABEL: @test_vuqaddq_s32(
+// CHECK: [[V:%.*]] = call <4 x i32> @llvm.aarch64.neon.suqadd.v4i32(<4 x i32> 
%a, <4 x i32> %b)
+// CHECK-NEXT:  ret <4 x i32> [[V]]
+int32x4_t test_vuqaddq_s32(int32x4_t a, uint32x4_t b) {
+  return vuqaddq_s32(a, b);
+}
+
+// CHECK-LABEL: @test_vuqaddq_s64(
+// CHECK: [[V:%.*]] = call <2 x i64> @llvm.aarch64.neon.suqadd.v2i64(<2 x i64> 
%a, <2 x i64> %b)
+// CHECK-NEXT:  ret <2 x i64> [[V]]
+int64x2_t test_vuqaddq_s64(int64x2_t a, uint64x2_t b) {
+  return vuqaddq_s64(a, b);
+}
+
+// CHECK-LABEL: @test_vuqaddq_s16(
+// CHECK: [[V:%.*]] = call <8 x i16> @llvm.aarch64.neon.suqadd.v8i16(<8 x i16> 
%a, <8 x i16> %b)
+// CHECK-NEXT:  ret <8 x i16> [[V]]
+int16x8_t test_vuqaddq_s16(int16x8_t a, uint16x8_t b) {
+  return vuqaddq_s16(a, b);
+}
+
+// CHECK-LABEL: @test_vuqadd_s8(
+// CHECK: entry:
+// CHECK-NEXT: [[V:%.*]] = call <8 x i8> @llvm.aarch64.neon.suqadd.v8i8(<8 x 
i8> %a, <8 x i8> %b)
+// CHECK-NEXT: ret <8 x i8> [[V]]
+int8x8_t test_vuqadd_s8(int8x8_t a, uint8x8_t b) {
+  return vuqadd_s8(a, b);
+}
+
+// CHECK-LABEL: @test_vuqadd_s32(
+// CHECK: [[V:%.*]] = call <2 x i32> @llvm.aarch64.neon.suqadd.v2i32(<2 x i32> 
%a, <2 x i32> %b)
+// CHECK-NEXT:  ret <2 x i32> [[V]]
+int32x2_t test_vuqadd_s32(int32x2_t a, uint32x2_t b) {
+  return vuqadd_s32(a, b);
+}
+
 // CHECK-LABEL: @test_vuqadd_s64(
 // CHECK:   [[TMP0:%.*]] = bitcast <1 x i64> %a to <8 x i8>
 // CHECK:   [[TMP1:%.*]] = bitcast <1 x i64> %b to <8 x i8>
@@ -17537,6 +17581,13 @@
   return vuqadd_s64(a, b);
 }
 
+// CHECK-LABEL: @test_vuqadd_s16(
+// CHECK: [[V:%.*]] = call <4 x i16> @llvm.aarch64.neon.suqadd.v4i16(<4 x i16> 
%a, <4 x i16> %b)
+// CHECK-NEXT:  ret <4 x i16> [[V]]
+int16x4_t test_vuqadd_s16(int16x4_t a, uint16x4_t b) {
+  return vuqadd_s16(a, b);
+}
+
 // CHECK-LABEL: @test_vsqadd_u64(
 // CHECK:   [[TMP0:%.*]] = bitcast <1 x i64> %a to <8 x i8>
 // CHECK:   [[TMP1:%.*]] = bitcast <1 x i64> %b to <8 x i8>
Index: include/clang/Basic/arm_neon.td
===
--- include/clang/Basic/arm_neon.td
+++ include/clang/Basic/arm_neon.td
@@ -703,7 +703,7 @@
 
 

 // Signed Saturating Accumulated of Unsigned Value
-def SUQADD : SInst<"vuqadd", "ddd", "csilQcQsQiQl">;
+def SUQADD : SInst<"vuqadd", "ddu", "csilQcQsQiQl">;
 
 

 // Unsigned Saturating Accumulated of Signed Value



[PATCH] D64210: [NFC][ARM] Fix vector vsqadd intrinsics operands

2019-07-05 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio added a comment.

In D64210#1570515 , @LukeCheeseman 
wrote:

> Are there some changes/addition to tests attached to this?


Hi,
I can see no difference in the CodeGen test in
tools/clang/test/CodeGen/aarch64-neon-intrinsics.c, which already tests
these intrinsics with the correct arguments. So, technically speaking, it is an
NFC.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64210/new/

https://reviews.llvm.org/D64210



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64211: [ARM] Fix vector vuqadd intrinsics operands

2019-07-04 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio created this revision.
dnsampaio added a reviewer: LukeCheeseman.
Herald added subscribers: cfe-commits, kristof.beyls, javed.absar.
Herald added a project: clang.

Change the vuqadd vector intrinsics so that the second argument takes unsigned
values, not signed,
in accordance with
https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics
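
A short usage sketch under the updated signature (my example; assumes
<arm_neon.h> on AArch64):

  #include <arm_neon.h>

  // The addend is now uint32x4_t, as in the ACLE, so no cast is needed when
  // accumulating an unsigned amount into a signed, saturating accumulator.
  int32x4_t saturating_acc(int32x4_t acc, uint32x4_t amount) {
    return vuqaddq_s32(acc, amount);
  }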


Repository:
  rC Clang

https://reviews.llvm.org/D64211

Files:
  include/clang/Basic/arm_neon.td


Index: include/clang/Basic/arm_neon.td
===
--- include/clang/Basic/arm_neon.td
+++ include/clang/Basic/arm_neon.td
@@ -703,7 +703,7 @@
 
 

 // Signed Saturating Accumulated of Unsigned Value
-def SUQADD : SInst<"vuqadd", "ddd", "csilQcQsQiQl">;
+def SUQADD : SInst<"vuqadd", "ddu", "csilQcQsQiQl">;
 
 

 // Unsigned Saturating Accumulated of Signed Value


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64210: [ARM] Fix vector vsqadd intrinsics operands

2019-07-04 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio created this revision.
dnsampaio added a reviewer: LukeCheeseman.
Herald added subscribers: cfe-commits, kristof.beyls, javed.absar.
Herald added a project: clang.

Change the vsqadd vector intrinsics so that the second argument takes signed
values, not unsigned,
in accordance with
https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics
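
The mirror-image sketch for the unsigned variant (my example; assumes
<arm_neon.h> on AArch64):

  #include <arm_neon.h>

  // vsqaddq_u32 now takes a signed addend, matching the ACLE, so a negative
  // adjustment to an unsigned, saturating accumulator is expressed directly.
  uint32x4_t adjust(uint32x4_t acc, int32x4_t delta) {
    return vsqaddq_u32(acc, delta);
  }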


Repository:
  rC Clang

https://reviews.llvm.org/D64210

Files:
  include/clang/Basic/arm_neon.td


Index: include/clang/Basic/arm_neon.td
===
--- include/clang/Basic/arm_neon.td
+++ include/clang/Basic/arm_neon.td
@@ -707,7 +707,7 @@
 
 

 // Unsigned Saturating Accumulated of Signed Value
-def USQADD : SInst<"vsqadd", "ddd", "UcUsUiUlQUcQUsQUiQUl">;
+def USQADD : SInst<"vsqadd", "ddx", "UcUsUiUlQUcQUsQUiQUl">;
 
 

 // Reciprocal/Sqrt


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D60828: [ARM] Fix armv8 features tree and add fp16fml

2019-06-05 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio abandoned this revision.
dnsampaio added a comment.

Fixed.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60828/new/

https://reviews.llvm.org/D60828



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D61668: [ARM] Fix the extensions implied by a cpu name

2019-05-09 Thread Diogo N. Sampaio via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL360324: [ARM] Fix the extensions implied by a cpu name 
(authored by dnsampaio, committed by ).
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61668/new/

https://reviews.llvm.org/D61668

Files:
  cfe/trunk/lib/Driver/ToolChains/Arch/ARM.cpp
  cfe/trunk/test/Driver/arm-cortex-cpus.c

Index: cfe/trunk/lib/Driver/ToolChains/Arch/ARM.cpp
===
--- cfe/trunk/lib/Driver/ToolChains/Arch/ARM.cpp
+++ cfe/trunk/lib/Driver/ToolChains/Arch/ARM.cpp
@@ -88,6 +88,7 @@
 
 static void DecodeARMFeaturesFromCPU(const Driver , StringRef CPU,
  std::vector ) {
+  CPU = CPU.split("+").first;
   if (CPU != "generic") {
 llvm::ARM::ArchKind ArchKind = llvm::ARM::parseCPUArch(CPU);
 unsigned Extension = llvm::ARM::getDefaultExtensions(CPU, ArchKind);
@@ -350,11 +351,9 @@
   D.Diag(clang::diag::warn_drv_unused_argument)
   << CPUArg->getAsString(Args);
 CPUName = StringRef(WaCPU->getValue()).substr(6);
-checkARMCPUName(D, WaCPU, Args, CPUName, ArchName, Features, Triple);
-  } else if (CPUArg) {
+CPUArg = WaCPU;
+  } else if (CPUArg)
 CPUName = CPUArg->getValue();
-checkARMCPUName(D, CPUArg, Args, CPUName, ArchName, Features, Triple);
-  }
 
   // Add CPU features for generic CPUs
   if (CPUName == "native") {
@@ -367,6 +366,8 @@
 DecodeARMFeaturesFromCPU(D, CPUName, Features);
   }
 
+  if (CPUArg)
+checkARMCPUName(D, CPUArg, Args, CPUName, ArchName, Features, Triple);
   // Honor -mfpu=. ClangAs gives preference to -Wa,-mfpu=.
   const Arg *FPUArg = Args.getLastArg(options::OPT_mfpu_EQ);
   if (WaFPU) {
Index: cfe/trunk/test/Driver/arm-cortex-cpus.c
===
--- cfe/trunk/test/Driver/arm-cortex-cpus.c
+++ cfe/trunk/test/Driver/arm-cortex-cpus.c
@@ -340,30 +340,31 @@
 // RUN: %clang -target armv8a-linux-eabi -mcpu=cortex-a53+fp16 -### -c %s 2>&1 | FileCheck --check-prefix CHECK-CORTEX-A53-FP16 %s
 // RUN: %clang -target armv8a-linux-eabi -mcpu=cortex-a53+nofp16 -### -c %s 2>&1 | FileCheck --check-prefix CHECK-CORTEX-A53-NOFP16 %s
 // CHECK-CORTEX-A53-FP16: "-cc1" {{.*}}"-target-cpu" "cortex-a53" {{.*}}"-target-feature" "+fullfp16"
-// CHECK-CORTEX-A53-FP16-NOT: "-target-feature" "{{[+-]}}fp16fml"
-// CHECK-CORTEX-A53-NOFP16: "-cc1" {{.*}}"-target-cpu" "cortex-a53" {{.*}}"-target-feature" "-fullfp16" "-target-feature" "-fp16fml"
+// CHECK-CORTEX-A53-FP16-NOT: "-target-feature" "+fp16fml"
+// CHECK-CORTEX-A53-NOFP16-NOT: "+fullfp16"
+// CHECK-CORTEX-A53-NOFP16-NOT: "+fp16fml"
 
 // RUN: %clang -target armv8a-linux-eabi -march=armv8-a -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V8A-NOFP16FML %s
-// CHECK-V8A-NOFP16FML-NOT: "-target-feature" "{{[+-]}}fp16fml"
-// CHECK-V8A-NOFP16FML-NOT: "-target-feature" "{{[+-]}}fullfp16"
+// CHECK-V8A-NOFP16FML-NOT: "-target-feature" "+fp16fml"
+// CHECK-V8A-NOFP16FML-NOT: "-target-feature" "+fullfp16"
 
 // RUN: %clang -target armv8a-linux-eabi -march=armv8-a+fp16 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V8A-FP16 %s
-// CHECK-V8A-FP16-NOT: "-target-feature" "{{[+-]}}fp16fml"
+// CHECK-V8A-FP16-NOT: "-target-feature" "+fp16fml"
 // CHECK-V8A-FP16: "-target-feature" "+fullfp16"
-// CHECK-V8A-FP16-NOT: "-target-feature" "{{[+-]}}fp16fml"
+// CHECK-V8A-FP16-NOT: "-target-feature" "+fp16fml"
 // CHECK-V8A-FP16-SAME: {{$}}
 
 // RUN: %clang -target armv8a-linux-eabi -march=armv8-a+fp16fml -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V8A-FP16FML %s
 // CHECK-V8A-FP16FML: "-target-feature" "+fp16fml" "-target-feature" "+fullfp16"
 
 // RUN: %clang -target armv8a-linux-eabi -march=armv8.2-a -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V82A-NOFP16FML %s
-// CHECK-V82A-NOFP16FML-NOT: "-target-feature" "{{[+-]}}fp16fml"
-// CHECK-V82A-NOFP16FML-NOT: "-target-feature" "{{[+-]}}fullfp16"
+// CHECK-V82A-NOFP16FML-NOT: "-target-feature" "+fp16fml"
+// CHECK-V82A-NOFP16FML-NOT: "-target-feature" "+fullfp16"
 
 // RUN: %clang -target armv8a-linux-eabi -march=armv8.2-a+fp16 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V82A-FP16 %s
-// CHECK-V82A-FP16-NOT: "-target-feature" "{{[+-]}}fp16fml"
+// CHECK-V82A-FP16-NOT: "-target-feature" "+fp16fml"
 // CHECK-V82A-FP16: "-target-feature" "+fullfp16"
-// CHECK-V82A-FP16-NOT: "-target-feature" "{{[+-]}}fp16fml"
+// CHECK-V82A-FP16-NOT: "-target-feature" "+fp16fml"
 // CHECK-V82A-FP16-SAME: {{$}}
 
 // RUN: %clang -target armv8a-linux-eabi -march=armv8.2-a+fp16fml -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V82A-FP16FML %s
@@ -382,13 +383,13 @@
 // CHECK-V82A-NOFP16-FP16FML: "-target-feature" "+fp16fml" "-target-feature" "+fullfp16"
 
 // RUN: %clang -target armv8a-linux-eabi -march=armv8.3-a -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V83A-NOFP16FML %s

[PATCH] D61668: [ARM] Fix the extensions implied by a cpu name

2019-05-08 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio created this revision.
Herald added subscribers: cfe-commits, kristof.beyls, javed.absar.
Herald added a project: clang.
dnsampaio added reviewers: keith.walker.arm, DavidSpickett, carwil.

When using `clang -mcpu=CPUNAME+FEATURELIST`,
the features implied by CPUNAME are not
obtained, because the entire string, including
the `+FEATURELIST` suffix, is passed to the CPU
lookup. This fixes that by splitting the CPU
name string at the first `+`, if any.

For example, when using

  clang -### --target=arm-arm-none-eabi -march=armv7-a -mcpu=cortex-a8+nocrc

the target feature

  "target-feature" "+dsp"

implied by `cortex-a8` is missing.
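
For reference, a minimal standalone sketch (not part of the patch; the
helper name below is illustrative) of the splitting behaviour the fix
relies on, using llvm::StringRef::split:

  #include "llvm/ADT/StringRef.h"

  // Given an -mcpu value such as "cortex-a8+nocrc", keep only the text
  // before the first '+' so the default extensions implied by the bare
  // CPU name can still be looked up. StringRef::split returns a
  // {before, after} pair; if no '+' is present, .first is the whole
  // string.
  llvm::StringRef stripExtensionSuffix(llvm::StringRef CPU) {
    return CPU.split("+").first;
  }

  // stripExtensionSuffix("cortex-a8+nocrc") == "cortex-a8"
  // stripExtensionSuffix("cortex-a53")      == "cortex-a53"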


Repository:
  rC Clang

https://reviews.llvm.org/D61668

Files:
  lib/Driver/ToolChains/Arch/ARM.cpp
  test/Driver/arm-cortex-cpus.c

Index: test/Driver/arm-cortex-cpus.c
===
--- test/Driver/arm-cortex-cpus.c
+++ test/Driver/arm-cortex-cpus.c
@@ -340,30 +340,31 @@
 // RUN: %clang -target armv8a-linux-eabi -mcpu=cortex-a53+fp16 -### -c %s 2>&1 | FileCheck --check-prefix CHECK-CORTEX-A53-FP16 %s
 // RUN: %clang -target armv8a-linux-eabi -mcpu=cortex-a53+nofp16 -### -c %s 2>&1 | FileCheck --check-prefix CHECK-CORTEX-A53-NOFP16 %s
 // CHECK-CORTEX-A53-FP16: "-cc1" {{.*}}"-target-cpu" "cortex-a53" {{.*}}"-target-feature" "+fullfp16"
-// CHECK-CORTEX-A53-FP16-NOT: "-target-feature" "{{[+-]}}fp16fml"
-// CHECK-CORTEX-A53-NOFP16: "-cc1" {{.*}}"-target-cpu" "cortex-a53" {{.*}}"-target-feature" "-fullfp16" "-target-feature" "-fp16fml"
+// CHECK-CORTEX-A53-FP16-NOT: "-target-feature" "+fp16fml"
+// CHECK-CORTEX-A53-NOFP16-NOT: "+fullfp16"
+// CHECK-CORTEX-A53-NOFP16-NOT: "+fp16fml"
 
 // RUN: %clang -target armv8a-linux-eabi -march=armv8-a -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V8A-NOFP16FML %s
-// CHECK-V8A-NOFP16FML-NOT: "-target-feature" "{{[+-]}}fp16fml"
-// CHECK-V8A-NOFP16FML-NOT: "-target-feature" "{{[+-]}}fullfp16"
+// CHECK-V8A-NOFP16FML-NOT: "-target-feature" "+fp16fml"
+// CHECK-V8A-NOFP16FML-NOT: "-target-feature" "+fullfp16"
 
 // RUN: %clang -target armv8a-linux-eabi -march=armv8-a+fp16 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V8A-FP16 %s
-// CHECK-V8A-FP16-NOT: "-target-feature" "{{[+-]}}fp16fml"
+// CHECK-V8A-FP16-NOT: "-target-feature" "+fp16fml"
 // CHECK-V8A-FP16: "-target-feature" "+fullfp16"
-// CHECK-V8A-FP16-NOT: "-target-feature" "{{[+-]}}fp16fml"
+// CHECK-V8A-FP16-NOT: "-target-feature" "+fp16fml"
 // CHECK-V8A-FP16-SAME: {{$}}
 
 // RUN: %clang -target armv8a-linux-eabi -march=armv8-a+fp16fml -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V8A-FP16FML %s
 // CHECK-V8A-FP16FML: "-target-feature" "+fp16fml" "-target-feature" "+fullfp16"
 
 // RUN: %clang -target armv8a-linux-eabi -march=armv8.2-a -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V82A-NOFP16FML %s
-// CHECK-V82A-NOFP16FML-NOT: "-target-feature" "{{[+-]}}fp16fml"
-// CHECK-V82A-NOFP16FML-NOT: "-target-feature" "{{[+-]}}fullfp16"
+// CHECK-V82A-NOFP16FML-NOT: "-target-feature" "+fp16fml"
+// CHECK-V82A-NOFP16FML-NOT: "-target-feature" "+fullfp16"
 
 // RUN: %clang -target armv8a-linux-eabi -march=armv8.2-a+fp16 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V82A-FP16 %s
-// CHECK-V82A-FP16-NOT: "-target-feature" "{{[+-]}}fp16fml"
+// CHECK-V82A-FP16-NOT: "-target-feature" "+fp16fml"
 // CHECK-V82A-FP16: "-target-feature" "+fullfp16"
-// CHECK-V82A-FP16-NOT: "-target-feature" "{{[+-]}}fp16fml"
+// CHECK-V82A-FP16-NOT: "-target-feature" "+fp16fml"
 // CHECK-V82A-FP16-SAME: {{$}}
 
 // RUN: %clang -target armv8a-linux-eabi -march=armv8.2-a+fp16fml -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V82A-FP16FML %s
@@ -382,13 +383,13 @@
 // CHECK-V82A-NOFP16-FP16FML: "-target-feature" "+fp16fml" "-target-feature" "+fullfp16"
 
 // RUN: %clang -target armv8a-linux-eabi -march=armv8.3-a -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V83A-NOFP16FML %s
-// CHECK-V83A-NOFP16FML-NOT: "-target-feature" "{{[+-]}}fp16fml"
-// CHECK-V83A-NOFP16FML-NOT: "-target-feature" "{{[+-]}}fullfp16"
+// CHECK-V83A-NOFP16FML-NOT: "-target-feature" "+fp16fml"
+// CHECK-V83A-NOFP16FML-NOT: "-target-feature" "+fullfp16"
 
 // RUN: %clang -target armv8a-linux-eabi -march=armv8.3-a+fp16 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V83A-FP16 %s
-// CHECK-V83A-FP16-NOT: "-target-feature" "{{[+-]}}fp16fml"
+// CHECK-V83A-FP16-NOT: "-target-feature" "+fp16fml"
 // CHECK-V83A-FP16: "-target-feature" "+fullfp16"
-// CHECK-V83A-FP16-NOT: "-target-feature" "{{[+-]}}fp16fml"
+// CHECK-V83A-FP16-NOT: "-target-feature" "+fp16fml"
 // CHECK-V83A-FP16-SAME: {{$}}
 
 // RUN: %clang -target armv8a-linux-eabi -march=armv8.3-a+fp16fml -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V83A-FP16FML %s
@@ -407,8 +408,8 @@
 // CHECK-V83A-NOFP16-FP16FML: "-target-feature" "+fp16fml" "-target-feature" "+fullfp16"
 
 // RUN: %clang -target armv8a-linux-eabi -march=armv8.4-a -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V84A-NOFP16FML %s
-// CHECK-V84A-NOFP16FML-NOT: "-target-feature" "{{[+-]}}fp16fml"
-// 

[PATCH] D60828: [ARM] Fix armv8 features tree and add fp16fml

2019-04-18 Thread Diogo N. Sampaio via Phabricator via cfe-commits
dnsampaio planned changes to this revision.
dnsampaio added a comment.

Waiting for the outcome of D60691.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60828/new/

https://reviews.llvm.org/D60828




