Thanks for your help Vlad. Another bad news about this PR fix, is that it has resurrected the thumb_movhi_clobber bug (PR 58785) but in a different manner as the original failing testcase still pass. I attached a testcase to be compiled with :
cc1 -mthumb -mcpu=cortex-m0 -O2 m.c And Thumb bootstrap seems to be broken with an ICE in check_rtl, I'm checking if it is the same issue. Yvan On 12 December 2013 20:18, Vladimir Makarov <vmaka...@redhat.com> wrote: > On 12/11/2013, 1:59 PM, Yvan Roux wrote: >> >> On 11 December 2013 19:25, Vladimir Makarov <vmaka...@redhat.com> wrote: >>> >>> On 12/11/2013, 5:35 AM, Yvan Roux wrote: >>>> >>>> >>>> Hi Vladimir, >>>> >>>> I've some regressions on ARM after this SP elimination patch, and they >>>> are execution failures. Here is the list: >>>> >>>> g++.dg/cilk-plus/AN/array_test_ND_tplt.cc -O3 -fcilkplus >>>> gcc.c-torture/execute/va-arg-22.c -O2 >>>> gcc.dg/atomic/c11-atomic-exec-5.c -O0 >>>> gfortran.dg/direct_io_12.f90 -O[23] >>>> gfortran.dg/elemental_dependency_1.f90 -O2 >>>> gfortran.dg/matmul_2.f90 -O2 >>>> gfortran.dg/matmul_6.f90 -O2 >>>> gfortran.dg/mvbits_7.f90 -O3 >>>> gfortran.dg/unlimited_polymorphic_1.f03 -O3 >>>> >>>> I reduced and looked at var-arg-22.c and the issue is that in >>>> lra_eliminate_regs_1 (called by get_equiv_with_elimination) we >>>> transformed sfp + 0x4c in sp + 0xfc because of a bad sp offset. What >>>> we try to do here is to change the pseudo 195 of the insn 118 below : >>>> >>>> (insn 118 114 112 8 (set (reg:DI 195) >>>> (unspec:DI [ >>>> (mem:DI (plus:SI (reg/f:SI 215) >>>> (const_int 8 [0x8])) [7 MEM[(struct A35 *)_12 >>>> + 64B]+8 S8 A8]) >>>> ] UNSPEC_UNALIGNED_LOAD)) v2.c:49 146 {unaligned_loaddi} >>>> (expr_list:REG_EQUIV (mem/c:DI (plus:SI (reg/f:SI 192) >>>> (const_int 8 [0x8])) [7 a35+8 S8 A32]) >>>> (nil))) >>>> >>>> with its equivalent (x arg of lra_eliminate_regs_1): >>>> >>>> (mem/c:DI (plus:SI (reg/f:SI 102 sfp) >>>> (const_int 76 [0x4c])) [7 a35+8 S8 A32]) >>>> >>>> lra_eliminate_regs_1 is called with full_p = true (it is not really >>>> clear for what it means), >>> >>> >>> >>> It means we use full offset between the regs, otherwise we use change in >>> the >>> full offset from the previous iteration (it can be changed as we reserve >>> stack memory for spilled pseudos and the reservation can be done several >>> times). As equiv value is stored as it was before any elimination, we >>> need >>> always to use full offset to make elimination. >> >> >> Ok thanks it's clearer. >> >>> but in the PLUS switch case, we have offset >>>> >>>> >>>> = 0xb (given by ep->offset) and as lra_get_insn_recog_data >>>> (insn)->sp_offset value is 0, we will indeed add 0xb to the original >>>> 0x4c offset. >>>> >>> >>> 0 value is suspicious because it is default. We might have not set up it >>> from neighbor insns. >>> >>> >>> >>>> So, here I don't get if it is the sp_offset value of the >>>> lra_insn_recog_data element which is not well updated or if lra_ >>>> eliminate_regs_1 has to be called with update_p and not full_p (which >>>> fixed the value in that particular case). Is it more obvious for you >>>> ? >>>> >>> >>> Yvan, could you send me the reduced preprocessed case and the options for >>> cc1 to reproduce it. >> >> >> >> Here is cc1 command line : >> >> cc1 -quiet -march=armv7-a -mtune=cortex-a15 -mfloat-abi=hard >> -mfpu=neon -mthumb v2.c -O2 >> >> I use a native build on a chromebook, but it's reproducible with a >> cross compiler. >> >> With the attached test case the issue is when processing insn 118. > > > The offset is updated two times and that is wrong. That is because memory > in init insn is shared by ira_reg_equiv and the test involves 2 equivalent > substitutions. As I wrote equiv should be stored in original form by the > current patch design. Simple copying will not work as the first > substitution is not done in this case. > > I need some time to think how to fix it better still I'll try to fix it > tomorrow. I expected that the patch might have some problems. The patch > code is quite big although it is just a long standing PR fix. Therefore that > was my first PR fixed on stage 3. It is good to have it tested earlier and > sorry to break some arm tests. > >
void free (void *) ; extern int *__errno (void); typedef unsigned int size_t; typedef unsigned int __uint32_t; typedef unsigned short __uint16_t; typedef struct _bufhead BUFHEAD; struct _bufhead { BUFHEAD *prev; BUFHEAD *next; BUFHEAD *ovfl; __uint32_t addr; char *page; char flags; }; typedef BUFHEAD **SEGMENT; typedef struct hashhdr { int magic; int version; __uint32_t lorder; int bsize; int bshift; int dsize; int ssize; int sshift; int ovfl_point; int last_freed; int max_bucket; int high_mask; int low_mask; int ffactor; int nkeys; int hdrpages; int h_charkey; int spares[32]; __uint16_t bitmaps[32]; } HASHHDR; typedef struct htab { HASHHDR hdr; int nsegs; int exsegs; __uint32_t (*hash)(const void *, size_t); int flags; int fp; char *tmp_buf; char *tmp_key; BUFHEAD *cpage; int cbucket; int cndx; int error; int new_file; int save_file; __uint32_t *mapp[32]; int nmaps; int nbufs; BUFHEAD bufhead; SEGMENT *dir; } HTAB; extern int collect_data(hashp, bufp, len, set) HTAB *hashp; BUFHEAD *bufp; int len, set; { __uint16_t *bp; char *p; BUFHEAD *xbp; __uint16_t save_addr; int mylen, totlen; p = bufp->page; bp = (__uint16_t *)p; mylen = hashp->hdr.bsize - bp[1]; save_addr = bufp->addr; if (bp[2] == 3) { totlen = len + mylen; if (hashp->tmp_buf) free(hashp->tmp_buf); if ((hashp->tmp_buf = (char *)malloc(totlen)) == ((void *)0)) return (-1); if (set) { hashp->cndx = 1; if (bp[0] == 2) { hashp->cpage = ((void *)0); hashp->cbucket++; } else { hashp->cpage = __get_buf(hashp, bp[bp[0] - 1], bufp, 0); if (!hashp->cpage) return (-1); else if (!((__uint16_t *)hashp->cpage->page)[0]) { hashp->cbucket++; hashp->cpage = ((void *)0); } } } } else { xbp = __get_buf(hashp, bp[bp[0] - 1], bufp, 0); if (!xbp || ((totlen = collect_data(hashp, xbp, len + mylen, set)) < 1)) return (-1); } if (bufp->addr != save_addr) { return (-1); } return (totlen); }