Re: Widening multiplication, but no narrowing division [i386/AMD64]
> On Jan 9, 2023, at 11:27 AM, Stefan Kanthak wrote:
>
> "Paul Koning" wrote:
>
>>> ...
>>
>> Yes, I was thinking the same.  But I spent a while on that pattern -- I
>> wanted to support div/mod as a single operation because the machine has
>> that primitive.  And I'm pretty sure I saw it work before I committed
>> that change.  That's why I'm wondering if something changed.
>
> I can't tell from the past how GCC once worked, but today it can't
> (or doesn't) use such patterns, at least not on i386/AMD64 processors.

It turns out I was confused by the RTL generated by my pattern.  That
pattern is for divmodhi, so it works as desired given same-size inputs.

I'm wondering if the case of longer dividend -- which is a common thing
for several machines -- could be handled by a define_peephole2 that
matches the sign-extend of the divisor followed by the (longer) divide.
I made a stab at that but what I wrote wasn't valid.

So, question to the list: suppose I want to write RTL that matches what
Stefan is talking about, with a div or mod or divmod that has si results
and a di dividend (or hi results and an si dividend), how would you do
that?  Can a define_peephole2 do it, and if so, what would it look like?

        paul
Re: Widening multiplication, but no narrowing division [i386/AMD64]
"Paul Koning" wrote: >> On Jan 9, 2023, at 10:20 AM, Stefan Kanthak wrote: >> >> "Paul Koning" wrote: >> On Jan 9, 2023, at 7:20 AM, Stefan Kanthak wrote: Hi, GCC (and other C compilers too) support the widening multiplication of i386/AMD64 processors, but DON'T support their narrowing division: >>> >>> I wonder if this changed in the recent past. >>> I have a pattern for this type of thing in pdp11.md: >> [...] >>> and I'm pretty sure this worked at some point in the past. >> >> Unfortunately the C standard defines that the smaller operand (of lesser >> conversion rank), here divisor, has to undergo a conversion to the "real >> common type", i.e. the broader operand (of higher conversion rank), here >> dividend. Unless the information about promotion/conversion is handed over >> to the code generator it can't apply such patterns -- as demonstrated by >> the demo code. > Yes, I was thinking the same. But I spent a while on that pattern -- I > wanted to support div/mod as a single operation because the machine has > that primitive. And I'm pretty sure I saw it work before I committed > that change. That's why I'm wondering if something changed. I can't tell from the past how GCC once worked, but today it can't (or doesn't) use such patterns, at least not on i386/AMD64 processors. To give another example where the necessary information is most obviously NOT propagated from front end to back end: --- clmul.c --- // widening carry-less multiplication unsigned long long clmul(unsigned long p, unsigned long q) { unsigned long long r = 0; unsigned long s = 1UL << 31; do { r <<= 1; if (q & s) #ifdef _MSC_VER (unsigned long) r ^= p; #else r ^= p; // no need to promote/convert p here! #endif } while (s >>= 1); return r; } --- EOF --- # https://gcc.godbolt.org/z/E99v7fEP3 clmul(unsigned long, unsigned long): pushebp mov ecx, -2147483648 xor eax, eax xor edx, edx pushedi# OOPS: superfluous xor edi, edi # OOPS: superfluous pushesi pushebx# OUCH: WTF? 
mov ebp, DWORD PTR [esp+24] mov ebx, 32# OUCH: WTF? mov esi, DWORD PTR [esp+20] .L3: shldedx, eax, 1 add eax, eax testebp, ecx je .L2 xor eax, esi xor edx, edi # OOPS: superfluous .L2: shr ecx, 1 sub ebx, 1 # OUCH: WTF? jne .L3 pop ebx# OUCH: WTF? pop esi pop edi# OOPS: superfluous pop ebp ret 8 superfluous instructions out of the total 25 instructions! NOT AMUSED Stefan
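For readers who want to check the routine itself rather than the generated code, the following is a minimal sketch of the same bit-serial loop using fixed-width types. It is not part of the original post: the name `clmul32` and the use of `<stdint.h>` types are assumptions made here so that the behaviour does not depend on the width of `unsigned long` on the target.

```c
#include <stdint.h>

/* Hypothetical fixed-width variant of clmul.c: widening carry-less
   multiplication of two 32-bit operands into a 64-bit product. */
uint64_t clmul32(uint32_t p, uint32_t q)
{
    uint64_t r = 0;
    uint32_t s = UINT32_C(1) << 31;   /* mask, walks from bit 31 down to bit 0 */

    do {
        r <<= 1;                      /* make room for the next partial product */
        if (q & s)
            r ^= p;                   /* XOR replaces the addition: no carries */
    } while (s >>= 1);

    return r;
}
```

As a sanity check, squaring a polynomial over GF(2) only spreads its bits apart, so `clmul32(3, 3)` should be `5` (binary 11 times 11 gives 101), and multiplying by 1 should return the other operand unchanged.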
Re: Widening multiplication, but no narrowing division [i386/AMD64]
> On Jan 9, 2023, at 10:20 AM, Stefan Kanthak wrote:
>
> "Paul Koning" wrote:
>
>>> On Jan 9, 2023, at 7:20 AM, Stefan Kanthak wrote:
>>>
>>> Hi,
>>>
>>> GCC (and other C compilers too) support the widening multiplication
>>> of i386/AMD64 processors, but DON'T support their narrowing division:
>>
>> I wonder if this changed in the recent past.
>> I have a pattern for this type of thing in pdp11.md:
> [...]
>> and I'm pretty sure this worked at some point in the past.
>
> Unfortunately the C standard defines that the smaller operand (of lesser
> conversion rank), here divisor, has to undergo a conversion to the "real
> common type", i.e. the broader operand (of higher conversion rank), here
> dividend. Unless the information about promotion/conversion is handed over
> to the code generator it can't apply such patterns -- as demonstrated by
> the demo code.
>
> regards
> Stefan

Yes, I was thinking the same.  But I spent a while on that pattern -- I
wanted to support div/mod as a single operation because the machine has
that primitive.  And I'm pretty sure I saw it work before I committed
that change.  That's why I'm wondering if something changed.

        paul
Re: Widening multiplication, but no narrowing division [i386/AMD64]
"Paul Koning" wrote: >> On Jan 9, 2023, at 7:20 AM, Stefan Kanthak wrote: >> >> Hi, >> >> GCC (and other C compilers too) support the widening multiplication >> of i386/AMD64 processors, but DON'T support their narrowing division: > > I wonder if this changed in the recent past. > I have a pattern for this type of thing in pdp11.md: [...] > and I'm pretty sure this worked at some point in the past. Unfortunately the C standard defines that the smaller operand (of lesser conversion rank), here divisor, has to undergo a conversion to the "real common type", i.e. the broader operand (of higher conversion rank), here dividend. Unless the information about promotion/conversion is handed over to the code generator it can't apply such patterns -- as demonstrated by the demo code. regards Stefan
Re: Widening multiplication, but no narrowing division [i386/AMD64]
> On Jan 9, 2023, at 7:20 AM, Stefan Kanthak wrote:
>
> Hi,
>
> GCC (and other C compilers too) support the widening multiplication
> of i386/AMD64 processors, but DON'T support their narrowing division:

I wonder if this changed in the recent past.
I have a pattern for this type of thing in pdp11.md:

(define_expand "divmodhi4"
  [(parallel
    [(set (subreg:HI (match_dup 1) 0)
          (div:HI (match_operand:SI 1 "register_operand" "0")
                  (match_operand:HI 2 "general_operand" "g")))
     (set (subreg:HI (match_dup 1) 2)
          (mod:HI (match_dup 1) (match_dup 2)))])
   (set (match_operand:HI 0 "register_operand" "=r")
        (subreg:HI (match_dup 1) 0))
   (set (match_operand:HI 3 "register_operand" "=r")
        (subreg:HI (match_dup 1) 2))]
  "TARGET_40_PLUS"
  "")

and I'm pretty sure this worked at some point in the past.

        paul
Re: Widening multiplication, but no narrowing division [i386/AMD64]
LIU Hao wrote:

> On 2023/1/9 20:20, Stefan Kanthak wrote:
>> Hi,
>>
>> GCC (and other C compilers too) support the widening multiplication
>> of i386/AMD64 processors, but DON'T support their narrowing division:
>>
>>
>
> QWORD-DWORD division would change the behavior of your program.
[...]
> If DIV was used, it would effect an exception:

Guess why I use "schoolbook" division?
Please read the end of my post until you understand the code.

regards
Stefan
Re: Widening multiplication, but no narrowing division [i386/AMD64]
On 2023/1/9 20:20, Stefan Kanthak wrote:
> Hi,
>
> GCC (and other C compilers too) support the widening multiplication
> of i386/AMD64 processors, but DON'T support their narrowing division:

QWORD-DWORD division would change the behavior of your program.

Given:

```
uint32_t xdiv(uint64_t x, uint32_t y) { return x / y; }
```

then `xdiv(0x200000002, 2)` should first convert both operands to `uint64_t`, perform the division which yields `0x100000001`, then truncate the quotient to 32 bits, which gives `1`. The result is exact.

If DIV were used, it would raise an exception:

```
mov edx, 2
mov eax, edx   # edx:eax = 0x200000002
mov ecx, edx
div ecx        # division overflows because the quotient
               # can't be stored into EAX
```

-- 
Best regards,
LIU Hao
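The difference between the two behaviours can be sketched in C. This is an illustration added here, not code from the thread: `xdiv` follows the definition given above, and `quotient_fits` is a hypothetical helper expressing the condition under which a single 64-by-32-bit DIV would not fault, namely that the high half of the dividend is smaller than the divisor. The dividend `0x200000002` corresponds to EDX = EAX = 2 in the assembly above.

```c
#include <assert.h>
#include <stdint.h>

/* Semantics required by C: the divisor is converted to uint64_t, the
   division is carried out in 64 bits, and only the quotient is truncated. */
uint32_t xdiv(uint64_t x, uint32_t y)
{
    return (uint32_t)(x / y);
}

/* Hypothetical helper: returns 1 if the full quotient x / y fits into
   32 bits, i.e. if the high 32 bits of x are smaller than y. This is the
   case in which a narrowing DIV would produce the same result as xdiv. */
int quotient_fits(uint64_t x, uint32_t y)
{
    assert(y != 0);                     /* division by zero is excluded */
    return (uint32_t)(x >> 32) < y;
}
```

For the dividend `0x200000002` and divisor `2` the full quotient needs 33 bits, so `quotient_fits` reports 0 while `xdiv` still returns the truncated value 1. A compiler may therefore use the narrowing instruction only when it can prove the condition holds, or when it guards the instruction with an explicit check.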