[Bug middle-end/83653] [6/7/8 Regression] GCC fails to remove a can't-happen call on ia64

2018-01-15 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83653

--- Comment #16 from Jakub Jelinek  ---
Actually, there is no need to use the __COUNTER__ stuff, just use
static const int __ia64_asr_p = ...;
in the scope.  It is just important __OPTIMIZE__ is on, because at -O0 the
compiler wouldn't optimize those static vars away.

[Bug middle-end/83653] [6/7/8 Regression] GCC fails to remove a can't-happen call on ia64

2018-01-11 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83653

--- Comment #15 from Jakub Jelinek  ---
(In reply to Matthew Wilcox from comment #14)
> Confirmed this fixes the problem.  I'll send it to Tony and see if he likes
> it.  May I add your Signed-off-by to the patch?

Sure.  Feel free to reformat it as the kernel wants, it has been almost 20
years since I've been active in the kernel, so don't know the formatting rules
anymore.

[Bug middle-end/83653] [6/7/8 Regression] GCC fails to remove a can't-happen call on ia64

2018-01-11 Thread matthew at wil dot cx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83653

--- Comment #14 from Matthew Wilcox  ---
Confirmed this fixes the problem.  I'll send it to Tony and see if he likes it.
 May I add your Signed-off-by to the patch?

[Bug middle-end/83653] [6/7/8 Regression] GCC fails to remove a can't-happen call on ia64

2018-01-11 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83653

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #13 from Jakub Jelinek  ---
What about this approach, force __builtin_constant_p to evaluate in the FE
before optimizations and decide just on that.  Will work with constant literals
passed to the macro, will do the fallback if you say use it in inline and hope
that if the inline is called with a constant it will propagate:

struct V { int counter; };
int ia64_fetch_and_add(int, int *);
int ia64_atomic_sub(int, struct V *);

#ifdef __OPTIMIZE__
#define atomic_sub_return_1(i,v,c)  \
({  \
int __ia64_asr_i = (i); \
static const int __ia64_asr_p_##c   \
  = __builtin_constant_p(i) \
? ((i) == 1 || (i) == 4 || (i) == 8 || (i) == 16\
   || (i) == -1 || (i) == -4 || (i) == -8 || (i) == -16)\
: 0;\
__ia64_asr_p_##c\
  ? ia64_fetch_and_add(-__ia64_asr_i, &(v)->counter)\
  : ia64_atomic_sub(__ia64_asr_i, v);   \
})
#define atomic_sub_return_2(i,v,c) atomic_sub_return_1(i,v,c)
#define atomic_sub_return(i,v) atomic_sub_return_2(i,v,__COUNTER__)
#else
#define atomic_sub_return(i,v) ia64_atomic_sub(i,v)
#endif

void
foo (struct V *p, int i)
{
  atomic_sub_return (4, p);
  atomic_sub_return (8, p);
  atomic_sub_return (16, p);
  atomic_sub_return (7, p);
  atomic_sub_return (i, p);
}

[Bug middle-end/83653] [6/7/8 Regression] GCC fails to remove a can't-happen call on ia64

2018-01-11 Thread aldyh at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83653

--- Comment #12 from Aldy Hernandez  ---
(In reply to Matthew Wilcox from comment #9)
> Maybe I'm a little slow, but I don't see what the path is that sets 'nr' to
> 0.  It's 1UL << compound_order.  Typically, compound_order is 0, although it
> may be anything up to log2(number of pages in the machine).  Are you saying

Sorry, yes 1<<0 will be 1, which is handled.  Let me dig into the threading.

[Bug middle-end/83653] [6/7/8 Regression] GCC fails to remove a can't-happen call on ia64

2018-01-11 Thread matthew at wil dot cx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83653

--- Comment #11 from Matthew Wilcox  ---
I'm sorry, I still don't get it.

What I think you're saying is that GCC performs this optimisation:

nr = 1UL << compound_order(page);
atomic_sub_return(x, nr);

into:

if (PageHead(page))
  atomic_sub_return(x, 1);
else
  atomic_sub_return(x, 1UL << page[1].order)

That seems like a great optimisation to me, and I'm all for it.  What I don't
understand is where the b_c_p call gets lost.

Also, this bug is pretty fragile.  If I replace the 'nr' in the call to
atomic_sub_return with '1UL << compound_order(page)', the bug goes away.

Anyway, I have no vested interest in ia64 or the code that's currently using
__b_c_p.  I just want it to stop blocking me getting my patch in.  It's a bit
different from 72785 because I can't just resolve it by removing the checking
code.  Just tell me what the macro should look like.

[Bug middle-end/83653] [6/7/8 Regression] GCC fails to remove a can't-happen call on ia64

2018-01-11 Thread law at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83653

Jeffrey A. Law  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 CC||law at redhat dot com
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=72785
 Resolution|--- |INVALID

--- Comment #10 from Jeffrey A. Law  ---
So look for something similar to this:

  [local count: 96151367]:
  # i_280 = PHI <0(28), 0(30)>
  if (i_280 < nr_300)
goto ; [92.50%]
  else
goto ; [7.50%]

The subscripts may change, but I think that's what you're looking for.

i_280 will be replaced with zero resulting in

if (0 < nr_300)
  true
else
  false


And because nr is unsigned that will turn into an equality comparison

if (nr_300 != 0)
  true
else
  false

At that point the compiler knows that nr_300 has the value 0 on the false arm
of the conditional.


Anyway, this isn't a GCC bug.  I'll note that it may be worth reviewing pr72785
 which is essentially the same issue with misunderstanding how
builtin_constant_p works.

[Bug middle-end/83653] [6/7/8 Regression] GCC fails to remove a can't-happen call on ia64

2018-01-11 Thread matthew at wil dot cx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83653

--- Comment #9 from Matthew Wilcox  ---
Maybe I'm a little slow, but I don't see what the path is that sets 'nr' to 0. 
It's 1UL << compound_order.  Typically, compound_order is 0, although it may be
anything up to log2(number of pages in the machine).  Are you saying that nr
could be 0 because DOM2 thinks compound_order() could be larger than 64, and
thus undefined?

[Bug middle-end/83653] [6/7/8 Regression] GCC fails to remove a can't-happen call on ia64

2018-01-11 Thread aldyh at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83653

--- Comment #8 from Aldy Hernandez  ---
Ah, I see what the problem is.

 unsigned long i, nr = 1UL << compound_order(page);
...
 page_ref_sub(page, nr); // calls __builtin_constant_p(nr) eventually

The problem is that there is a path to __builtin_constant_p(nr) for which NR is
0, and thus a constant.  This is because compound_order() is defined as:

static inline unsigned int compound_order(struct page *page)
{
if (!PageHead(page))
return 0;
return page[1].compound_order;
}

The dom2 threading pass is isolating the path returning 0, and realizing that
that particular path is a constant.  Your series of if's does not handle 0, and
you get the __bad_increment_for_ia64_fetch_and_add exposed.

Don't blame me, I'm just the messenger :).

Perhaps Richi has a suggestion on how to code your macro.

If, as Richard says, this is a known quirk, perhaps we should document it in
the section for __builtin_constant_p() in the manual, along with a suggestion
on how to code an alternative.

[Bug middle-end/83653] [6/7/8 Regression] GCC fails to remove a can't-happen call on ia64

2018-01-11 Thread matthew at wil dot cx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83653

--- Comment #7 from Matthew Wilcox  ---
OK, so how should we write this function/macro to accomplish what we want?

And the requirement is "If the argument is one of these eight special
constants, use this special instruction, otherwise call this function".  Even
if the argument happens to be one of the eight special values at runtime, we
should still call the function, because it's not worth doing runtime checks; we
only want a compile-time optimisation.  And it's worth the compile-time
optimisation because adding constant 1/-1 is really, really common.

[Bug middle-end/83653] [6/7/8 Regression] GCC fails to remove a can't-happen call on ia64

2018-01-11 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83653

--- Comment #6 from rguenther at suse dot de  ---
On Thu, 11 Jan 2018, matthew at wil dot cx wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83653
> 
> --- Comment #5 from Matthew Wilcox  ---
> Hi Aldy!
> 
> Thanks for looking into this.  Yes, I agree, there's no way that GCC can know
> this is a constant, but that *should* have been taken care of.  Please pardon
> me copying and pasting from the original source file rather than the
> preprocessed source, but I find it utterly impossible to work with the
> preprocessed source ...
> 
> #define atomic_sub_return(i,v)  \
> ({  \
> int __ia64_asr_i = (i); \
> (__builtin_constant_p(i)\
>  && (   (__ia64_asr_i ==   1) || (__ia64_asr_i ==   4)  \
>  || (__ia64_asr_i ==   8) || (__ia64_asr_i ==  16)  \
>  || (__ia64_asr_i ==  -1) || (__ia64_asr_i ==  -4)  \
>  || (__ia64_asr_i ==  -8) || (__ia64_asr_i == -16)))\
> ? ia64_fetch_and_add(-__ia64_asr_i, &(v)->counter)  \
> : ia64_atomic_sub(__ia64_asr_i, v); \
> })
> 
> That __builtin_constant_p() *should* have led GCC to throw up its hands, not
> bother checking for the +/- 1, 4, 8, 16 cases and just call 
> ia64_atomic_sub(). 
> Looking at the disassembly, I see a BBB bundle, indicating quite strongly to 
> me
> that it is testing for all of these cases, and the __builtin_constant_p is
> being ... ignored?  misunderstood?

The usual events are that we perform some jump threading across
the __builtin_constant_p before it is resolved, exposing the constant
path somewhere 'i' isn't constant.

The builtin really doesn't guarantee what kernel folks want, because
at one point they want it resolved very late so optimization can
eventually figure out 'i' _is_ constant on the other hand some
optimizations might pull out stuff across it and break expectations.

Note I didn't try to figure what exactly goes wrong but I believe
we had duplicate issues (also in the kernel) that were worked
around in the source by just removing the stupid optimization
and relying on GCC to perform it for constant 'i'.

[Bug middle-end/83653] [6/7/8 Regression] GCC fails to remove a can't-happen call on ia64

2018-01-10 Thread matthew at wil dot cx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83653

--- Comment #5 from Matthew Wilcox  ---
Hi Aldy!

Thanks for looking into this.  Yes, I agree, there's no way that GCC can know
this is a constant, but that *should* have been taken care of.  Please pardon
me copying and pasting from the original source file rather than the
preprocessed source, but I find it utterly impossible to work with the
preprocessed source ...

#define atomic_sub_return(i,v)  \
({  \
int __ia64_asr_i = (i); \
(__builtin_constant_p(i)\
 && (   (__ia64_asr_i ==   1) || (__ia64_asr_i ==   4)  \
 || (__ia64_asr_i ==   8) || (__ia64_asr_i ==  16)  \
 || (__ia64_asr_i ==  -1) || (__ia64_asr_i ==  -4)  \
 || (__ia64_asr_i ==  -8) || (__ia64_asr_i == -16)))\
? ia64_fetch_and_add(-__ia64_asr_i, &(v)->counter)  \
: ia64_atomic_sub(__ia64_asr_i, v); \
})

That __builtin_constant_p() *should* have led GCC to throw up its hands, not
bother checking for the +/- 1, 4, 8, 16 cases and just call ia64_atomic_sub(). 
Looking at the disassembly, I see a BBB bundle, indicating quite strongly to me
that it is testing for all of these cases, and the __builtin_constant_p is
being ... ignored?  misunderstood?

Thanks!

[Bug middle-end/83653] [6/7/8 Regression] GCC fails to remove a can't-happen call on ia64

2018-01-10 Thread aldyh at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83653

Aldy Hernandez  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2018-01-11
 CC||aldyh at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #4 from Aldy Hernandez  ---
I can reproduce this on mainline, however the testcase is suspect.

I see that page_ref_sub() is defined as:

int __ia64_asr_i = ((nr))
...
if (__ia64_asr_i == xxx)
else if (__ia64_asr_i == yyy)
else if (__ia64_asr_i == yyy)
etc
else _tmp = __bad_increment_for_ia64_fetch_and_add();

The thing I see is that NR doesn't seem like an inlineable constant when passed
from the caller:

nr = 1UL << compound_order(page)
...
page_ref_sub(page, nr)

because:

unsigned int compound_order(struct page *page)
{
 if (!PageHead(page))
  return 0;
 return page[1].compound_order;
}

And sure enough...after early inlining, both compound_order and page_ref_sub
are inlined into shmem_add_to_page_cache and we can see:

  _117 = MEM[(struct page *)page_49(D) + 56B].D.16951.D.16950.compound_order;

There's no way the compiler can know that _117 is a known constant if it's
reading the value from memory.

OTOH, the other *_bad* thinggies do get inlined correctly because they depend
on sizeof(stuff), whose size can be determined at compile time.

Matthew, could you double check here?  Maybe I missed something, but perhaps a
reduced testcase would help analyze better (at least for me :)).

[Bug middle-end/83653] [6/7/8 Regression] GCC fails to remove a can't-happen call on ia64

2018-01-08 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83653

Richard Biener  changed:

   What|Removed |Added

Summary|[6 Regression] GCC fails to |[6/7/8 Regression] GCC
   |remove a can't-happen call  |fails to remove a
   |on ia64 |can't-happen call on ia64
  Known to fail||4.9.4, 5.5.0, 6.2.0, 7.2.0

--- Comment #3 from Richard Biener  ---
So I just assume trunk fails as well.