Matthias, Thanks for your reply.
Last Saturday I identified the problem, which was my fault, not a gcc or ubuntu bug! Please close. The offending code contains a large array declared and initialised inside a function. It should have been declared "static" to avoid reinitialising it on every call to the function. For some reason failing to declare the array "static" didn't cause an obvious problem with Jaunty, but it shows up in 9.10, which is why I assumed I had found a bug. Jaunty-gcc does behave in similar way, but in my program the performance hit occurs less seriously in a second function that contained the same mistake. Making all big arrays "static" fixes the problem in both Jaunty and Karmic, and the performance of my program is quadrupled! I guess whether or not failing to declare "static" results in grief depends on many factors, but in the end it's the programmer's job to watch the details... Mea culpa. Sorry to have wasted your time : you guys do a fantastic job. Regards, Dave On Tue, 2009-12-29 at 13:12 +0000, Matthias Klose wrote: > please could you recheck with gcc-4.4 and gcc-snapshot from lucid? > > > ** Changed in: gcc-4.4 (Ubuntu) > Status: New => Incomplete > > -- > gcc - example severe performance regression after upgrading > https://bugs.launchpad.net/bugs/474825 > You received this bug notification because you are a direct subscriber > of the bug. > > Status in “gcc-4.4” package in Ubuntu: Incomplete > > Bug description: > I have a C++ program developed as an exercise in tuning to solve Sudoku > puzzles as fast as possible by reading them from a file and bit twiddling. > After upgrading to 9.10 from jaunty and recompiling, the program that > previously solved over 2000 puzzles per second dropped to about 180 puzzles > per second, and CPU increased from about 75% to 100%. > > callgrind reveals that the degradation occurs in a function that returns the > number of bits set in an integer. Before the upgrade, this was most > effectively done with a table lookup, e.g: > > unsint puzzle::noOfDigits( const cell_t n ) > { > // unsint is unsigned int ; cell_t is unsigned short. > > const unsigned int BitsSetTable[] = { > 0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, > 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, > 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, > 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, > 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, > 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, > 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, > 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, > 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, > 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, > 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, > 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, > 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, > 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, > 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, > 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, > 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, > 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, > 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, > 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, > 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, > 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, > 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, > 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, > 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, > 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, > 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, > 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, > 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, > 5, 6, 6, 7, 6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9, > 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 9, 8, 9, 9, 10 > }; > return BitsSetTable[n]; > } > > On Jaunty the table lookup version of my program ran significantly faster > than the simple algorithm below (2200 vs 1500 puzzles per second). > > Recompiled n 9.10 the algorithm version still runs at 1500 puzzles per > second, but the lookup version works hard to achieve 180. > > unsint puzzle::noOfDigits( cell_t cell ) > { > /* > ** return the number of digits stored in a cell_t, ie the number of set > bits > ** eg. 101101101 ("134679") returns 6 > */ > unsint i, n = 0; > > for ( i=0; i<10; i++ ) > { > n += cell & 0x01; > cell >>= 1; > } > return n; > } > > 1) Description: Ubuntu 9.10 > Release: 9.10 > > 2) gcc: > Installed: 4:4.4.1-1ubuntu2 > Candidate: 4:4.4.1-1ubuntu2 > Version table: > *** 4:4.4.1-1ubuntu2 0 > 500 http://gb.archive.ubuntu.com karmic/main Packages > 100 /var/lib/dpkg/status > > 3) Similar performance > 4) Severe performance loss. > > Fujitsu Siemens Amilo Pi1505 Laptop, 1Gb RAM, Centrino duo > > processor : 0 > vendor_id : GenuineIntel > cpu family : 6 > model : 14 > model name : Intel(R) Core(TM) Duo CPU T2250 @ 1.73GHz > stepping : 12 > cpu MHz : 800.000 > cache size : 2048 KB > physical id : 0 > siblings : 2 > core id : 0 > cpu cores : 2 > apicid : 0 > initial apicid : 0 > fdiv_bug : no > hlt_bug : no > f00f_bug : no > coma_bug : no > fpu : yes > fpu_exception : yes > cpuid level : 10 > wp : yes > flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov > clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc arch_perfmon > bts pni monitor est tm2 xtpr pdcm > bogomips : 3466.39 > clflush size : 64 > power management: > > processor : 1 > vendor_id : GenuineIntel > cpu family : 6 > model : 14 > model name : Intel(R) Core(TM) Duo CPU T2250 @ 1.73GHz > stepping : 12 > cpu MHz : 800.000 > cache size : 2048 KB > physical id : 0 > siblings : 2 > core id : 1 > cpu cores : 2 > apicid : 1 > initial apicid : 1 > fdiv_bug : no > hlt_bug : no > f00f_bug : no > coma_bug : no > fpu : yes > fpu_exception : yes > cpuid level : 10 > wp : yes > flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov > clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc arch_perfmon > bts pni monitor est tm2 xtpr pdcm > bogomips : 3466.83 > clflush size : 64 > power management: > > > > Experience with upgrading and using 9.10 otherwise very good with only a few > minor application bugs that are already recognised. Apologies if this > problem is gcc 4.4.1 rather than Ubuntu! > > Thanks, > > Dave > > ProblemType: Crash > Architecture: i386 > CrashCounter: 1 > Date: Sun Nov 1 16:32:56 2009 > DistroRelease: Ubuntu 9.10 > ExecutablePath: /usr/bin/rhythmbox > Package: rhythmbox 0.12.5-0ubuntu4 > ProcCmdline: rhythmbox > ProcEnviron: > PATH=(custom, user) > LANG=en_GB.UTF-8 > SHELL=/bin/bash > ProcVersionSignature: Ubuntu 2.6.31-14.48-generic > SegvAnalysis: > Segfault happened at: 0x5b2b397 <LIBMTP_Get_Filemetadata+23>: mov > 0x8(%eax),%edx > PC (0x05b2b397) ok > source "0x8(%eax)" (0x00000008) not located in a known VMA region (needed > readable region)! > destination "%edx" ok > SegvReason: reading NULL VMA > Signal: 11 > SourcePackage: rhythmbox > StacktraceTop: > LIBMTP_Get_Filemetadata () from /usr/lib/libmtp.so.8 > ?? () > gst_element_change_state () > ?? () from /usr/lib/libgstreamer-0.10.so.0 > gst_element_set_state () > Title: rhythmbox crashed with SIGSEGV in LIBMTP_Get_Filemetadata() > Uname: Linux 2.6.31-14-generic i686 > UserGroups: adm admin cdrom dialout lpadmin mandriva plugdev sambashare > > To unsubscribe from this bug, go to: > https://bugs.launchpad.net/ubuntu/+source/gcc-4.4/+bug/474825/+subscribe -- gcc - example severe performance regression after upgrading https://bugs.launchpad.net/bugs/474825 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list [email protected] https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
