On Fri, 2020-06-26 at 16:04 -0600, Jeff Law wrote:
> On Fri, 2020-06-26 at 23:54 +0200, Ilya Leoshkevich wrote:
> > How should this work ideally?  Suppose we have:
> > 
> > static inline void foo (int i)
> > {
> >   if (__builtin_is_constant_p (i))
> >     asm volatile ("bar" :: "i" (i))
> >   else
> >     asm volatile ("baz" :: "d" (i));
> > }
> > 
> > First of all, this is a supported use case, right?
> Yes, this is a supported case.
> 
> > Then, the way I see it, is that at least for a while there must
> > exist
> > trees like the ones above, regardless of whether their asm
> > arguments
> > match constraints.  But ultimately dead code elimination should get
> > rid
> > of the wrong ones before they reach RTL.
> > E.g. in the example above, the non-inlined version should have
> > `!__builtin_is_constant_p (i)`, so `bar` should not survive until
> > RTL.  The same for inlined `foo (1)` version's `baz`.
> In theory yes, but there are cases where paths converge (like you've
> shown) where
> you may have evaluated to a constant on the paths, but it's not a
> constant at the
> convergence point.  You have to be very careful using b_c_p like this
> and it's
> been a regular source of kernel bugs.

Is there something specific that a compiler user should look out for?
For example, here is the kernel code, from which the test was derived:

static inline void atomic_add(int i, atomic_t *v)
{
#ifdef CONFIG_HAVE_MARCH_Z196_FEATURES
        if (__builtin_constant_p(i) && (i > -129) && (i < 128)) {
                __atomic_add_const(i, &v->counter);
                return;
        }
#endif
        __atomic_add(i, &v->counter);
}
        
It looks very straightforward - can there still be something wrong
with its usage of b_c_p?

> I'd recommend looking at the .ssa dump and walk forward from there if
> the .ssa
> dump looks correct.
> 
> jeff

Well, 021t.ssa already has:

__attribute__((gnu_inline))
__atomic_add_const (intD.6 valD.2269, intD.6 * ptrD.2270)
{
  intD.6 val_3(D) = valD.2269;
  intD.6 * ptr_2(D) = ptrD.2270;
;;   basic block 2, loop depth 0, maybe hot
;;    prev block 0, next block 1, flags: (NEW)
;;    pred:       ENTRY (FALLTHRU)
  # .MEM_4 = VDEF <.MEM_1(D)>
  __asm__ __volatile__("asi %0,%1
" : "ptr" "=Q" *ptr_2(D) : "val" "i" val_3(D), "Q" *ptr_2(D) :
"memory", "cc");
  # VUSE <.MEM_4>
  return;
;;    succ:       EXIT

}

which is, strictly speaking, not correct, because val_3(D) and
valD.2269 are not constant.  But as far as I understand we are willing
to tolerate trees like this until a certain point.

What is this point supposed to be?  If I understood you right, 
106t.thread1 is already too late - why is it so?

Reply via email to