On 21 April 2012 00:57, Dinar Temirbulatov <dtemirbula...@gmail.com> wrote:
> Hi,
> Here is the patch that adds support for divide 64-bit by constant for
> 32-bit target machines, this patch was tested on arm-7a with no new
> regressions, also I am not sure on how to avoid for example i686
> targets since div operation there is fast compared to other targets and
> it showed better performance with libc/sysdeps/wordsize-32/divdi3.c
> __udivdi3 vs my implementation on the compiler side, it is not clear
> for me by looking at the udiv_cost[speed][SImode] value.

Hi Dinar.  I'm afraid it gives the wrong results for some dividends:
 * 82625484914982912 / 2023346444509052928: gives 4096, should be zero
 * 18317463604061229328 / 2023346444509052928: gives 4109, should be 9
 * 12097415865295708879 / 4545815675034402816: gives 130, should be 2
 * 18195490362097456014 / 6999635335417036800: gives 10, should be 2

The expanded version is very large.  Perhaps it should only turn on at
-O2 and always turn off at -Os?

The speed increase is quite impressive - I'm seeing between 2.7 and
20x faster on a Cortex-A9 for things like x / 3.

-- Michael
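
For reference, a minimal sketch of how a 64-bit x / 3 can be expanded using only 32x32->64 multiplies, and how such an expansion can be checked against the plain `/` operator. This is not the code from the patch; it is the standard multiply-by-reciprocal technique, and the names mulhi64, udiv3 and self_check are hypothetical, chosen for illustration. The dividends are the ones listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* High 64 bits of the 128-bit product a * b, built from four
   32x32->64 partial products, as a 32-bit target would do it. */
static uint64_t mulhi64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t lo_lo = a_lo * b_lo;
    uint64_t hi_lo = a_hi * b_lo;
    uint64_t lo_hi = a_lo * b_hi;
    uint64_t hi_hi = a_hi * b_hi;

    /* Sum the middle column; this cannot overflow 64 bits. */
    uint64_t cross = (lo_lo >> 32) + (uint32_t)hi_lo + lo_hi;

    return hi_hi + (hi_lo >> 32) + (cross >> 32);
}

/* x / 3 as a multiply and a shift.
   0xAAAAAAAAAAAAAAAB is (2^65 + 1) / 3, so the quotient is
   the high half of the product shifted right by one. */
static uint64_t udiv3(uint64_t x)
{
    return mulhi64(x, 0xAAAAAAAAAAAAAAABULL) >> 1;
}

/* Compare the expansion with the reference division for the
   dividends from the report plus a few edge cases. */
static void self_check(void)
{
    static const uint64_t dividends[] = {
        0ULL, 1ULL, 2ULL, 3ULL,
        82625484914982912ULL,
        18317463604061229328ULL,
        12097415865295708879ULL,
        18195490362097456014ULL,
        UINT64_MAX,
    };
    size_t n = sizeof dividends / sizeof dividends[0];

    for (size_t i = 0; i < n; i++)
        assert(udiv3(dividends[i]) == dividends[i] / 3);
}
```

Calling self_check() from a test driver aborts on the first mismatch. The same comparison loop works for any other constant divisor, provided the magic multiplier and shift are derived for that divisor.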