Re: memset() broken in gcc-3.1 on i386's
On Tue, Jun 04, 2002 at 03:02:21AM +, [EMAIL PROTECTED] wrote: > > Actually, it broke fsck_ffs. > > > > Workaround to avoid the known broken case: > > The brokenness in ix86_expand_clrstr is quite visible when you > compare the function with ix86_expand_movstr. Fixed in rev 1.368.2.10. "* i386.c (expand_movstr, expand_clrstr): Fix inline-all-stringops sequence." But their fix is a little different from yours: Index: i386.c === RCS file: /cvs/gcc/egcs/gcc/config/i386/i386.c,v retrieving revision 1.368.2.9 retrieving revision 1.368.2.10 diff -u -r1.368.2.9 -r1.368.2.10 --- i386.c 23 Apr 2002 08:11:22 - 1.368.2.9 +++ i386.c 22 May 2002 12:23:53 - 1.368.2.10 @@ -7959,6 +7959,10 @@ && GET_CODE (ix86_compare_op1) == CONST_INT && mode != HImode && (unsigned int) INTVAL (ix86_compare_op1) != 0x + /* The operand still must be representable as sign extended value. */ + && (!TARGET_64BIT + || GET_MODE (ix86_compare_op0) != DImode + || (unsigned int) INTVAL (ix86_compare_op1) != 0x7fff) && GET_CODE (operands[2]) == CONST_INT && GET_CODE (operands[3]) == CONST_INT) { @@ -9130,6 +9134,9 @@ { rtx countreg2; rtx label = NULL; + int desired_alignment = (TARGET_PENTIUMPRO + && (count == 0 || count >= (unsigned int) 260) + ? 8 : UNITS_PER_WORD); /* In case we don't know anything about the alignment, default to library version, since it is usually equally fast and result in @@ -9159,10 +9166,7 @@ This is quite costy. Maybe we can revisit this decision later or add some customizability to this code. */ - if (count == 0 - && align < (TARGET_PENTIUMPRO && (count == 0 - || count >= (unsigned int) 260) - ? 8 : UNITS_PER_WORD)) + if (count == 0 && align < desired_alignment) { label = gen_label_rtx (); emit_cmp_and_jump_insns (countreg, GEN_INT (UNITS_PER_WORD - 1), @@ -9184,10 +9188,7 @@ emit_label (label); LABEL_NUSES (label) = 1; } - if (align <= 4 - && ((TARGET_PENTIUMPRO && (count == 0 -|| count >= (unsigned int) 260)) - || TARGET_64BIT)) + if (align <= 4 && desired_alignment > 4) { rtx label = ix86_expand_aligntest (destreg, 4); emit_insn (gen_strmovsi (destreg, srcreg)); @@ -9196,6 +9197,12 @@ LABEL_NUSES (label) = 1; } + if (label && desired_alignment > 4 && !TARGET_64BIT) + { + emit_label (label); + LABEL_NUSES (label) = 1; + label = NULL_RTX; + } if (!TARGET_SINGLE_STRINGOP) emit_insn (gen_cld ()); if (TARGET_64BIT) @@ -9341,6 +9348,10 @@ { rtx countreg2; rtx label = NULL; + /* Compute desired alignment of the string operation. */ + int desired_alignment = (TARGET_PENTIUMPRO + && (count == 0 || count >= (unsigned int) 260) + ? 8 : UNITS_PER_WORD); /* In case we don't know anything about the alignment, default to library version, since it is usually equally fast and result in @@ -9355,13 +9366,10 @@ countreg = copy_to_mode_reg (counter_mode, count_exp); zeroreg = copy_to_mode_reg (Pmode, const0_rtx); - if (count == 0 - && align < (TARGET_PENTIUMPRO && (count == 0 - || count >= (unsigned int) 260) - ? 8 : UNITS_PER_WORD)) + if (count == 0 && align < desired_alignment) { label = gen_label_rtx (); - emit_cmp_and_jump_insns (countreg, GEN_INT (UNITS_PER_WORD - 1), + emit_cmp_and_jump_insns (countreg, GEN_INT (desired_alignment - 1), LEU, 0, counter_mode, 1, label); } if (align <= 1) @@ -9382,8 +9390,7 @@ emit_label (label); LABEL_NUSES (label) = 1; } - if (align <= 4 && TARGET_PENTIUMPRO && (count == 0 - || count >= (unsigned int) 260)) + if (align <= 4 && desired_alignment > 4) { rtx label = ix86_expand_aligntest (destreg, 4); emit_insn (gen_strsetsi (destreg, (TARGET_64BIT @@ -9394,6 +9401,13 @@ LABEL_NUSES (label) = 1; } + if (label && desired_alignment > 4 && !TARGET_64BIT) + { + emit_label (label); + LABEL_NUSES (label) = 1; + label = NULL_RTX; + } + if (!TARGET_SINGLE_STRINGOP) emit_insn (gen_cld ()); if (TARGET_64BIT) @@ -9409,18 +9423,18 @@ emit_insn (gen_rep_stossi (destreg, countreg2, zeroreg,
Re: memset() broken in gcc-3.1 on i386's
> Actually, it broke fsck_ffs. > > Workaround to avoid the known broken case: The brokenness in ix86_expand_clrstr is quite visible when you compare the function with ix86_expand_movstr. - Tor Egge Index: contrib/gcc/config/i386/i386.c === RCS file: /home/ncvs/src/contrib/gcc/config/i386/i386.c,v retrieving revision 1.9 diff -u -r1.9 i386.c --- contrib/gcc/config/i386/i386.c 9 May 2002 22:42:39 - 1.9 +++ contrib/gcc/config/i386/i386.c 4 Jun 2002 00:18:49 - @@ -9432,7 +9432,7 @@ gen_rtx_SUBREG (SImode, zeroreg, 0))); if (TARGET_64BIT && (align <= 4 || count == 0)) { - rtx label = ix86_expand_aligntest (destreg, 2); + rtx label = ix86_expand_aligntest (countreg, 4); emit_insn (gen_strsetsi (destreg, gen_rtx_SUBREG (SImode, zeroreg, 0))); emit_label (label); @@ -9443,7 +9443,7 @@ gen_rtx_SUBREG (HImode, zeroreg, 0))); if (align <= 2 || count == 0) { - rtx label = ix86_expand_aligntest (destreg, 2); + rtx label = ix86_expand_aligntest (countreg, 2); emit_insn (gen_strsethi (destreg, gen_rtx_SUBREG (HImode, zeroreg, 0))); emit_label (label); @@ -9454,7 +9454,7 @@ gen_rtx_SUBREG (QImode, zeroreg, 0))); if (align <= 1 || count == 0) { - rtx label = ix86_expand_aligntest (destreg, 1); + rtx label = ix86_expand_aligntest (countreg, 1); emit_insn (gen_strsetqi (destreg, gen_rtx_SUBREG (QImode, zeroreg, 0))); emit_label (label);
Re: memset() broken in gcc-3.1 on i386's
On Tue, 4 Jun 2002, I wrote: > gcc now generates inline code for memset in some cases. Broken code. Actually, it only generates inline code for memset in a few more cases, and the case of a non-constant length is broken (and some cases of constant lengths are pessimized (e.g., length 7)). > ... > This broke newfs (newfs left some garbage in a bitmap). Actually, it broke fsck_ffs. Workaround to avoid the known broken case: %%% Index: builtins.c === RCS file: /home/ncvs/src/contrib/gcc/builtins.c,v retrieving revision 1.1.1.3 diff -u -2 -r1.1.1.3 builtins.c --- builtins.c 13 May 2002 03:35:47 - 1.1.1.3 +++ builtins.c 4 Jun 2002 00:53:22 - @@ -2195,4 +2195,9 @@ len_rtx = expand_expr (len, NULL_RTX, VOIDmode, 0); + /* Give up for non-constant lengths. They are broken on at least +i386's. */ + if (GET_CODE (len_rtx) != CONST_INT) + return 0; + dest_mem = get_memory_rtx (dest); set_mem_align (dest_mem, dest_align); %%% Bruce To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
memset() broken in gcc-3.1 on i386's
gcc now generates inline code for memset in some cases. Broken code. E.g., compiling the following with -O: %%% #include int foo[100]; int x; main() { memset(&foo[0], 0, x); } %%% gives (at least if you have fixed function alignment): %%% .file "z.c" .text .p2align 2,,3 .globl main .type main,@function main: pushl %ebp movl%esp, %ebp pushl %edi pushl %eax movlx, %ecx xorl%eax, %eax shrl$2, %ecx movl$foo, %edi cld rep stosl andl$-16, %esp <-- the lower bits of `len' should be loaded near here testl $2, %edi<-- this seems to be meant to test the 2^1 bit in `len' (not alignment of the pointer like it actually does). %edi is the wrong register for holding the bits, since it is still needed for the pointer. je .L2 stosw .L2: testl $1, %edi<-- similarly for the 2^0 bit. je .L3 stosb .L3: movl-4(%ebp), %edi leave ret .Lfe1: .size main,.Lfe1-main .comm foo,400,32 .comm x,4,4 .ident "GCC: (GNU) 3.1 [FreeBSD] 20020509 (prerelease)" %%% This broke newfs (newfs left some garbage in a bitmap). This seems to only result in (len % 3) bytes not being cleared, since gcc doesn't seem to use the builtin memset unless it knows that the pointer is aligned. If %edi could be misaligned, then too many bytes would be set. Bruce To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message