http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56129
Bug #: 56129
Summary: Seg fault on 256.bzip2 from spec2000 with -lto and pre-reload scheduler for x86 Atom
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: ysrum...@gmail.com

While testing Atom with LTO and the pre-reload scheduler, we got a segmentation fault on 256.bzip2:

    ./bzip2 input.graphic 58
    spec_init
    Loading Input Data
    Duplicating 6656364 bytes
    Duplicating 13312728 bytes
    Duplicating 26625456 bytes
    Duplicating 7566496 bytes
    Input data 60817408 bytes in length
    Compressing Input Data, level 7
    Compressed data 44249975 bytes in length
    Uncompressing Data
    Segmentation fault

If we turn off LRA, the test passes.

We did the following investigation:

1. Both -lto and the pre-reload scheduler (-fschedule-insns --param sched-pressure-algorithm=1 -fsched-pressure) are required to exhibit the failure.

2. We also compiled with Atom-specific options (-msse2 -mfpmath=sse -ffast-math -march=atom) and the peak option set (-O3 -funroll-loops).

3. We found that if we turn off the pre-reload scheduler for 'main', the failure disappears.

4. We also determined that if we turn off the pre-reload scheduler in main for regions > 575, the test passes, but it fails if we increase the threshold by 1, i.e. if we skip scheduling of all regions in main starting from 577.

5. Comparing good and bad objdumps (I don't know how to get an assembly file with -lto), we found that a possible cause of the failure is an incorrect stack offset for one of the spills:

in the good case

    fill:  804c205: 8b 84 24 c0 00 00 00  mov 0xc0(%esp),%eax
    spill: 804d1ed: 89 84 24 c0 00 00 00  mov %eax,0xc0(%esp)

in the bad case with restricted scheduling

    fill:  804c205: 8b 84 24 c0 00 00 00  mov 0xc0(%esp),%eax
    spill: 804d1ed: 89 44 24 6c           mov %eax,0x6c(%esp)

in bzip2 built with the standard pre-reload scheduler and lto

    fill:  804c272: 8b 44 24 7c           mov 0x7c(%esp),%eax
    spill: 804d1a9: 89 44 24 78           mov %eax,0x78(%esp)

i.e. we see different offsets for the spill and the fill of the same virtual register.

To reproduce the failure, it is sufficient to run SPEC on any corei7 machine with the options listed above.

Please let me know if any additional information is needed.
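
The spill/fill comparison in item 5 can be automated with a small script. The following is only a minimal sketch, not part of the original investigation: the helper names `esp_offset` and `same_slot` are hypothetical, and it assumes AT&T-syntax `objdump -d` lines of the form quoted in this report. It also assumes that %esp holds the same value at the spill and at the fill, which has to be checked separately.

```python
import re

# %esp-relative load (fill) and store (spill), AT&T syntax as in the report.
LOAD = re.compile(r"mov\s+(0x[0-9a-f]+)\(%esp\),%(\w+)")
STORE = re.compile(r"mov\s+%(\w+),(0x[0-9a-f]+)\(%esp\)")


def esp_offset(line):
    """Return ('fill' | 'spill', offset) for an %esp-relative mov, else None."""
    m = LOAD.search(line)
    if m:
        return ("fill", int(m.group(1), 16))
    m = STORE.search(line)
    if m:
        return ("spill", int(m.group(2), 16))
    return None


def same_slot(fill_line, spill_line):
    """True if the fill reads the same %esp offset that the spill writes."""
    f = esp_offset(fill_line)
    s = esp_offset(spill_line)
    if f is None or s is None or f[0] != "fill" or s[0] != "spill":
        raise ValueError("expected one fill line and one spill line")
    return f[1] == s[1]


if __name__ == "__main__":
    # Lines copied from the good and bad objdumps in this report.
    fill = "804c205: 8b 84 24 c0 00 00 00 mov 0xc0(%esp),%eax"
    good_spill = "804d1ed: 89 84 24 c0 00 00 00 mov %eax,0xc0(%esp)"
    bad_spill = "804d1ed: 89 44 24 6c mov %eax,0x6c(%esp)"
    print("good:", same_slot(fill, good_spill))
    print("bad: ", same_slot(fill, bad_spill))
```

On the lines quoted above, the good pair uses offset 0xc0 for both instructions, while the bad pair reads 0xc0 but writes 0x6c.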