On Wed, Apr 04, 2018 at 11:11:36PM +0200, Arnd Bergmann said:

> On Wed, Apr 4, 2018 at 10:58 PM, Matthias Kaehlcke <m...@chromium.org> wrote:
> > On Wed, Apr 04, 2018 at 10:33:19PM +0200, Arnd Bergmann said:
> >>
> >> In most cases, this is used to implement a fast-path for a helper
> >> function, so not doing it the same way as gcc just results in
> >> slower execution, but I assume we also have code that behaves
> >> differently on clang compared to gcc because of this.
> >
> > I think I haven't (knowingly) come across that yet. Could you point
> > me to an instance that could be used as an example in a bug report?
>
> This code
>
> #include <linux/math64.h>
>
> int f(u64 u)
> {
>         return div_u64(u, 100000);
> }
>
> results in a call to __do_div64() on 32-bit arm using clang, but
> gets optimized into a set of multiply+shift instructions on gcc.
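(As a 32-bit sketch of the multiply+shift idea, in case it helps other
readers of the thread: gcc replaces division by a constant with
multiplication by a precomputed reciprocal. Using the textbook n/5
constant rather than the 64-bit kernel case:

unsigned int div5(unsigned int n)
{
        /* 0xCCCCCCCD == ceil(2^34 / 5); exact for all 32-bit n */
        return (unsigned int)(((unsigned long long)n * 0xCCCCCCCDULL) >> 34);
}

That is the shape of code gcc emits for a constant divisor, whereas
clang here falls back to the out-of-line __do_div64 call.)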
I understand this is annoying, but it seems I'm missing something:

static inline u64 div_u64(u64 dividend, u32 divisor)
{
        u32 remainder;
        return div_u64_rem(dividend, divisor, &remainder);
}

static inline u64 div_u64_rem(u64 dividend, u32 divisor, u32 *remainder)
{
        *remainder = do_div(dividend, divisor);
        return dividend;
}

#define do_div(n, base) __div64_32(&(n), base)

static inline uint32_t __div64_32(uint64_t *n, uint32_t base)
{
        register unsigned int __base            asm("r4") = base;
        register unsigned long long __n         asm("r0") = *n;
        register unsigned long long __res       asm("r2");
        register unsigned int __rem             asm(__xh);
        asm(    __asmeq("%0", __xh)
                __asmeq("%1", "r2")
                __asmeq("%2", "r0")
                __asmeq("%3", "r4")
                "bl     __do_div64"
                : "=r" (__rem), "=r" (__res)
                : "r" (__n), "r" (__base)
                : "ip", "lr", "cc");
        *n = __res;
        return __rem;
}

There is no reference to __builtin_constant_p() anywhere in this call
chain; could you elaborate? Also, you mentioned there are plenty of
cases; maybe there is a more straightforward one?

In any case, this seems to derail a bit from the original topic of the
thread. Shall we take this offline?
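P.S. For reference, the kind of __builtin_constant_p() fast path I
would have expected to find looks roughly like the sketch below
(purely illustrative, not actual kernel code; helper() and slow_path()
are made-up names):

extern unsigned int slow_path(unsigned int x);

static inline unsigned int helper(unsigned int x)
{
        /* As I understand the issue: gcc evaluates
         * __builtin_constant_p() late enough, after inlining, to see
         * a constant argument and fold the cheap path; clang resolves
         * it earlier and always ends up in slow_path() here. */
        if (__builtin_constant_p(x))
                return x * 2;
        return slow_path(x);
}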