Your message dated Fri, 5 Apr 2013 23:41:51 +0100
with message-id <CAPQ4b8=F1hF=xykzt9onkfpk7nkjtbakxvr6+cpwtvk+nbp...@mail.gmail.com>
and subject line Re: [PR 18589] could optimize FP multiplies better
has caused the Debian Bug report #268115,
regarding [PR 18589] could optimize FP multiplies better
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)

-- 
268115: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=268115
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Package: gcc-3.4
Version: 3.4.1-7.0.0.1.amd64
Severity: wishlist

compiling this function:

    double baz(double foo, double bar)
    {
        return foo*foo*foo*foo*bar*bar*bar*bar;
    }

on amd64 with -O6 -ffast-math, gcc emits this code:

    foo.o:     file format elf64-x86-64

    Disassembly of section .text:

    ... (some similar functions that I was messing around with) ...

    0000000000000050 <ddbar>:
      50:   f2 0f 59 c0     mulsd  %xmm0,%xmm0
      54:   f2 0f 59 c0     mulsd  %xmm0,%xmm0
      58:   f2 0f 59 c1     mulsd  %xmm1,%xmm0
      5c:   f2 0f 59 c1     mulsd  %xmm1,%xmm0
      60:   f2 0f 59 c1     mulsd  %xmm1,%xmm0
      64:   f2 0f 59 c1     mulsd  %xmm1,%xmm0
      68:   c3              retq

So, it notices that it can do foo*foo*foo*foo with two mulsd
instructions, but it misses the same optimization for bar*bar*bar*bar.
It would save one FP multiply overall to do:

    mulsd  %xmm0, %xmm0
    mulsd  %xmm1, %xmm1
    mulsd  %xmm0, %xmm0
    mulsd  %xmm1, %xmm1
    mulsd  %xmm1, %xmm0
    retq

Also, the two non-dependent muls could run in parallel.

Without -ffast-math, of course, gcc can't take advantage of the laws of
arithmetic like that and has to do all the multiplies the
straightforward way.

Anyway, that's what I noticed while poking around waiting for pure64 to
download from alioth to this fancy new dual Opteron I'm setting up for
one of my users :)

Correct me if I'm all wrong about this optimization being possible...

-- System Information:
Debian Release: 3.1
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.8-1-amd64-k8-smp
Locale: LANG=C, LC_CTYPE=C

Versions of packages gcc-3.4 depends on:
ii  binutils      2.15-1                     The GNU assembler, linker and bina
ii  cpp-3.4       3.4.1-7.0.0.1.amd64        The GNU C preprocessor
ii  gcc-3.4-base  3.4.1-7.0.0.1.amd64        The GNU Compiler Collection (base
ii  libc6         2.3.2.ds1-16.0.0.1.amd64   GNU C Library: Shared libraries an
ii  libgcc1       1:3.4.1-7.0.0.1.amd64      GCC support library

-- no debconf information
--- End Message ---
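
For reference, the five-multiply sequence proposed in the report above can be written out by hand in C. This is only an illustrative sketch: the names baz_naive and baz_squared are hypothetical and do not appear in the report, and it is not a claim about what any particular GCC version emits. Regrouping the multiplies can change floating-point rounding in general, which is why the compiler may only do it itself under -ffast-math.

```c
/* Left-to-right evaluation as written in the report: seven multiplies
 * in source order when no reassociation is allowed. */
double baz_naive(double foo, double bar)
{
    return foo * foo * foo * foo * bar * bar * bar * bar;
}

/* Hand-written equivalent of the proposed instruction sequence:
 * square each operand twice, then combine. Five multiplies in total,
 * and the foo and bar chains are independent of each other until
 * the final multiply. */
double baz_squared(double foo, double bar)
{
    double foo2 = foo * foo;    /* mulsd %xmm0, %xmm0 */
    double bar2 = bar * bar;    /* mulsd %xmm1, %xmm1 */
    double foo4 = foo2 * foo2;  /* mulsd %xmm0, %xmm0 */
    double bar4 = bar2 * bar2;  /* mulsd %xmm1, %xmm1 */
    return foo4 * bar4;         /* mulsd %xmm1, %xmm0 */
}
```

For inputs whose intermediate products are exactly representable (small integers, for example) both functions return the same value; for arbitrary inputs the results may differ in the last bits.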
--- Begin Message ---
fixed -1 1.120exp2
stop

4.8 was released in experimental and probably soon in unstable. gcc
1.120exp2 depends on that version on the major architectures.

Cheers.
--- End Message ---

