On Wed, Apr 04, 2018 at 11:11:36PM +0200, Arnd Bergmann said:

> On Wed, Apr 4, 2018 at 10:58 PM, Matthias Kaehlcke <m...@chromium.org> wrote:
> > On Wed, Apr 04, 2018 at 10:33:19PM +0200, Arnd Bergmann said:
> >>
> >> In most cases, this is used to implement a fast-path for a helper
> >> function, so not doing it the same way as gcc just results in
> >> slower execution, but I assume we also have code that behaves
> >> differently on clang compared to gcc because of this.
> >
> > I think I haven't (knowingly) come across that yet. Could you point
> > me to an instance that could be used as an example in a bug report?
>
> This code
>
> #include <linux/math64.h>
>
> int f(u64 u)
> {
>         return div_u64(u, 100000);
> }
>
> results in a call to __do_div64() on 32-bit arm using clang, but
> gets optimized into a set of multiply+shift instructions on gcc.
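(As a 32-bit sketch of the multiply+shift idea, in case it helps other
readers of the thread: gcc replaces division by a constant with
multiplication by a precomputed reciprocal. Using the textbook n/5
constant rather than the 64-bit kernel case:

unsigned int div5(unsigned int n)
{
        /* 0xCCCCCCCD == ceil(2^34 / 5); exact for all 32-bit n */
        return (unsigned int)(((unsigned long long)n * 0xCCCCCCCDULL) >> 34);
}

That is the shape of code gcc emits for a constant divisor, whereas
clang here falls back to the out-of-line __do_div64 call.)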
I understand this is annoying, but it seems I'm missing something:

static inline u64 div_u64(u64 dividend, u32 divisor)
{
        u32 remainder;
        return div_u64_rem(dividend, divisor, &remainder);
}

static inline u64 div_u64_rem(u64 dividend, u32 divisor, u32 *remainder)
{
        *remainder = do_div(dividend, divisor);
        return dividend;
}

#define do_div(n, base) __div64_32(&(n), base)

static inline uint32_t __div64_32(uint64_t *n, uint32_t base)
{
        register unsigned int __base            asm("r4") = base;
        register unsigned long long __n         asm("r0") = *n;
        register unsigned long long __res       asm("r2");
        register unsigned int __rem             asm(__xh);
        asm(    __asmeq("%0", __xh)
                __asmeq("%1", "r2")
                __asmeq("%2", "r0")
                __asmeq("%3", "r4")
                "bl     __do_div64"
                : "=r" (__rem), "=r" (__res)
                : "r" (__n), "r" (__base)
                : "ip", "lr", "cc");
        *n = __res;
        return __rem;
}

There is no reference to __builtin_constant_p() anywhere in this call
chain; could you elaborate? Also, you mentioned there are plenty of
cases; maybe there is a more straightforward one?

In any case, this seems to derail a bit from the original topic of the
thread. Shall we take this offline?
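P.S. For reference, the kind of __builtin_constant_p() fast path I
would have expected to find looks roughly like the sketch below
(purely illustrative, not actual kernel code; helper() and slow_path()
are made-up names):

extern unsigned int slow_path(unsigned int x);

static inline unsigned int helper(unsigned int x)
{
        /* As I understand the issue: gcc evaluates
         * __builtin_constant_p() late enough, after inlining, to see
         * a constant argument and fold the cheap path; clang resolves
         * it earlier and always ends up in slow_path() here. */
        if (__builtin_constant_p(x))
                return x * 2;
        return slow_path(x);
}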