[Bug ipa/108695] [13 Regression] Wrong code since r13-5215-gb1f30bf42d8d47 for dd_rescue package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108695 --- Comment #18 from Kurt Garloff --- dd_rescue-1.99.13 has been released including the fix to XORN. (Fix uses uint* casts rather than uchar*.)
[Bug ipa/108695] [13 Regression] Wrong code since r13-5215-gb1f30bf42d8d47 for dd_rescue package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108695 --- Comment #17 from Kurt Garloff --- jIt fixes it. Fix committed to git (on bad old sf.net). Will release new dd_rescue tomorrow ... Thanks, Martin, Jakub, Andrew for analyzing this!
[Bug ipa/108695] [13 Regression] Wrong code since r13-5215-gb1f30bf42d8d47 for dd_rescue package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108695 kurt at garloff dot de changed: What|Removed |Added CC||kurt at garloff dot de --- Comment #16 from kurt at garloff dot de --- Using u32 instead of long to access the arrays in XORN should work as well, no? Should result in better performance than doing it byte by byte...
[Bug ipa/108695] [13 Regression] Wrong code since r13-5215-gb1f30bf42d8d47 for dd_rescue package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108695 --- Comment #15 from Martin Liška --- (In reply to Jakub Jelinek from comment #14) > (In reply to Martin Liška from comment #10) > > > where the XOR16 is implemented as: > > > > > > #define XORN(in1,in2,out,len) \ > > > do { \ > > > uint _i;\ > > > for (_i = 0; _i < len/sizeof(ulong); ++_i) \ > > > *((ulong*)(out)+_i) = *((ulong*)(in1)+_i) ^ > > > *((ulong*)(in2)+_i);\ > > > } while(0) > > > > I can confirm that changing that to: > > > > #define XORN(in1, in2, out, len) \ > > do\ > > { \ > > uint _i; \ > > for (_i = 0; _i < len; ++_i) \ > > *(out + _i) = *(in1 + _i) ^ *(in2 + _i); \ > > } while (0) > > > > fixes the problem. It seems very close to what I saw here: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201#c13 > > It depends on if those arrays were stored as ulong or will be later read as > ulong or something else. Yes, that's what happens, they are stored with the aforementioned XORN function as ulong types. And later are read with #define GETU32(p) ntohl(*((u32*)(p))) as u32. That's the aliasing violation.
[Bug ipa/108695] [13 Regression] Wrong code since r13-5215-gb1f30bf42d8d47 for dd_rescue package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108695 --- Comment #14 from Jakub Jelinek --- (In reply to Martin Liška from comment #10) > > where the XOR16 is implemented as: > > > > #define XORN(in1,in2,out,len) \ > > do {\ > > uint _i;\ > > for (_i = 0; _i < len/sizeof(ulong); ++_i) \ > > *((ulong*)(out)+_i) = *((ulong*)(in1)+_i) ^ > > *((ulong*)(in2)+_i);\ > > } while(0) > > I can confirm that changing that to: > > #define XORN(in1, in2, out, len) \ > do\ > { \ > uint _i; \ > for (_i = 0; _i < len; ++_i) \ > *(out + _i) = *(in1 + _i) ^ *(in2 + _i); \ > } while (0) > > fixes the problem. It seems very close to what I saw here: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201#c13 It depends on if those arrays were stored as ulong or will be later read as ulong or something else. One could also use typedef ulong ulong_alias __attribute__((may_alias)); and use ulong_alias* above, or memcpy to/out of ulong temporaries.
[Bug ipa/108695] [13 Regression] Wrong code since r13-5215-gb1f30bf42d8d47 for dd_rescue package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108695 --- Comment #13 from Martin Liška --- Got it: https://sourceforge.net/p/ddrescue/tickets/6/
[Bug ipa/108695] [13 Regression] Wrong code since r13-5215-gb1f30bf42d8d47 for dd_rescue package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108695 --- Comment #12 from Martin Liška --- (In reply to Sam James from comment #11) > Can you drop a link in here if/when reported upstream? Thanks I was unable to find a bugzilla insntance for the dd_rescue project. Is there any?
[Bug ipa/108695] [13 Regression] Wrong code since r13-5215-gb1f30bf42d8d47 for dd_rescue package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108695 --- Comment #11 from Sam James --- Can you drop a link in here if/when reported upstream? Thanks
[Bug ipa/108695] [13 Regression] Wrong code since r13-5215-gb1f30bf42d8d47 for dd_rescue package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108695 Martin Liška changed: What|Removed |Added Resolution|--- |INVALID Status|WAITING |RESOLVED --- Comment #10 from Martin Liška --- > where the XOR16 is implemented as: > > #define XORN(in1,in2,out,len) \ > do { \ > uint _i;\ > for (_i = 0; _i < len/sizeof(ulong); ++_i) \ > *((ulong*)(out)+_i) = *((ulong*)(in1)+_i) ^ > *((ulong*)(in2)+_i);\ > } while(0) I can confirm that changing that to: #define XORN(in1, in2, out, len) \ do\ { \ uint _i; \ for (_i = 0; _i < len; ++_i) \ *(out + _i) = *(in1 + _i) ^ *(in2 + _i); \ } while (0) fixes the problem. It seems very close to what I saw here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201#c13
[Bug ipa/108695] [13 Regression] Wrong code since r13-5215-gb1f30bf42d8d47 for dd_rescue package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108695 --- Comment #9 from Martin Liška --- Actually, looking at the tree dumps before and after the revision, it's leading to a different place: First difference happens in: test_aes.ltrans0.ltrans.116t.dse2 [local count: 8687547526]: - _118 = MEM[(ulong *)iv_4(D)]; - _120 = MEM[(ulong *)input_19]; - _121 = _118 ^ _120; - MEM[(ulong *)iv_4(D)] = _121; - _129 = MEM[(ulong *)iv_4(D) + 8B]; - _131 = MEM[(ulong *)input_19 + 8B]; - _132 = _129 ^ _131; - MEM[(ulong *)iv_4(D) + 8B] = _132; (there's one more optimized out block like this. Which maps to: int AES_Gen_CBC_Enc(AES_Crypt_Blk_fn *cryptfn, const uchar* rkeys, uint rounds, uchar *iv, uint pad, const uchar *input, uchar *output, ssize_t len, ssize_t *olen) { *olen = len; while (len >= 16) { XOR16(iv, input, iv); cryptfn(rkeys, rounds, iv, iv); memcpy(output, iv, 16); len -= 16; input += 16; output += 16; } if (len || pad == PAD_ALWAYS) { uchar *in = crypto->blkbuf2; fill_blk(input, in, len, pad); XOR16(iv, in, iv); cryptfn(rkeys, rounds, iv, output); /* Store last IV */ memcpy(iv, output, 16); *olen += 16-(len&15); //memset(in, 0, 16); //LFENCE; } return (pad == PAD_ALWAYS || (len&15))? 16-(len&15): 0; } where the XOR16 is implemented as: #define XORN(in1,in2,out,len) \ do {\ uint _i;\ for (_i = 0; _i < len/sizeof(ulong); ++_i) \ *((ulong*)(out)+_i) = *((ulong*)(in1)+_i) ^ *((ulong*)(in2)+_i);\ } while(0)
[Bug ipa/108695] [13 Regression] Wrong code since r13-5215-gb1f30bf42d8d47 for dd_rescue package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108695 --- Comment #8 from Martin Liška --- Ok, one suspicious casting of memory comes from: typedef union _roundkey { unsigned char data[16]; unsigned int data32[4]; } roundkey; ... typedef struct _sec_fields { ... roundkey ekeys[40]; ... uchar *rkeys = (uchar*)crypto->ekeys; //malloc(alg->ctx_size); It's then passed to rijndaelEncrypt(const u8 *rkeys /*u32 rk[4*(Nr + 1)]*/, uint Nr, const u8 pt[16], u8 ct[16]) as the first argument, where it's later used as: const u32 *rk = (u32*)rkeys; Might be the violation we face? Here are mixed 'roundkey *' and 'u32*'.
[Bug ipa/108695] [13 Regression] Wrong code since r13-5215-gb1f30bf42d8d47 for dd_rescue package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108695 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #7 from Jakub Jelinek --- (In reply to Martin Liška from comment #6) > Yes, I'm also suspecting this code and I can verify that using optimize("O0") > for rijndaelEncrypt fixes the issue. > > The thing below is cast from 'const u8 *' and I thought it's valid to case > to 'u32 *' and then access it. Can you explain to me how exactly the > violation happens? A cast is fine, what matters are the accesses. If the object has u32 type, then casting its address to const u8 * and later on back to u32 * and accessing through that type is fine, but if it has an incompatible type, it is not. Similarly, if it is heap allocated, the type is given to it through the stores to it and later reads from it should be done with a compatible (aliasing wise) type. See section 6.5 in e.g. C17 for the details (definition of effective type and the aliasing requirements).
[Bug ipa/108695] [13 Regression] Wrong code since r13-5215-gb1f30bf42d8d47 for dd_rescue package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108695 --- Comment #6 from Martin Liška --- (In reply to Andrew Pinski from comment #5) > I am 99% sure there is aliasing violations in this code too: > #if _MSC_VER > #define GETU32(p) SWAP(*((u32 *)(p))) > #define PUTU32(ct, st) \ > { \ > *((u32 *)(ct)) = SWAP((st));\ > } Yes, I'm also suspecting this code and I can verify that using optimize("O0") for rijndaelEncrypt fixes the issue. The thing below is cast from 'const u8 *' and I thought it's valid to case to 'u32 *' and then access it. Can you explain to me how exactly the violation happens? > #else /* _MSC_VER */ > # if __BYTE_ORDER == __BIG_ENDIAN > #define GETU32(p) *((u32*)(p)) > #define PUTU32(ct, st) *((u32*)(ct)) = (st) > #else /* BIG_ENDIAN */ > #if 0 //def HAVE_LINUX_SWAB_H > #define GETU32(p) __swab32(*((u32*)(p))) > #define PUTU32(ct, st) *((u32*)(ct)) = __swab32((st)) > #else /* __GNUC__ */ > #include > #define GETU32(p) ntohl(*((u32*)(p))) > #define PUTU32(ct, st) *((u32*)(ct)) = htonl((st)) > #endif /* __GNUC__ */ > #endif /* BIG_ENDIAN */ > #endif /*_MSC_VER */ > This part below is guarded with '#if 0'.. > #if 0 > #define GETU32(pt) (((u32)(pt)[0] << 24) ^ ((u32)(pt)[1] << 16) ^ > ((u32)(pt)[2] << 8) ^ ((u32)(pt)[3])) > #define PUTU32(ct, st) \ > { \ > (ct)[0] = (u8)((st) >> 24); \ > (ct)[1] = (u8)((st) >> 16); \ > (ct)[2] = (u8)((st) >> 8); \ > (ct)[3] = (u8)(st); \ > } > #endif > > A few other places too ...
[Bug ipa/108695] [13 Regression] Wrong code since r13-5215-gb1f30bf42d8d47 for dd_rescue package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108695 --- Comment #5 from Andrew Pinski --- I am 99% sure there is aliasing violations in this code too: #if _MSC_VER #define GETU32(p) SWAP(*((u32 *)(p))) #define PUTU32(ct, st) \ { \ *((u32 *)(ct)) = SWAP((st));\ } #else /* _MSC_VER */ # if __BYTE_ORDER == __BIG_ENDIAN #define GETU32(p) *((u32*)(p)) #define PUTU32(ct, st) *((u32*)(ct)) = (st) #else /* BIG_ENDIAN */ #if 0 //def HAVE_LINUX_SWAB_H #define GETU32(p) __swab32(*((u32*)(p))) #define PUTU32(ct, st) *((u32*)(ct)) = __swab32((st)) #else /* __GNUC__ */ #include #define GETU32(p) ntohl(*((u32*)(p))) #define PUTU32(ct, st) *((u32*)(ct)) = htonl((st)) #endif /* __GNUC__ */ #endif /* BIG_ENDIAN */ #endif /*_MSC_VER */ #if 0 #define GETU32(pt) (((u32)(pt)[0] << 24) ^ ((u32)(pt)[1] << 16) ^ ((u32)(pt)[2] << 8) ^ ((u32)(pt)[3])) #define PUTU32(ct, st) \ { \ (ct)[0] = (u8)((st) >> 24); \ (ct)[1] = (u8)((st) >> 16); \ (ct)[2] = (u8)((st) >> 8); \ (ct)[3] = (u8)(st); \ } #endif A few other places too ...
[Bug ipa/108695] [13 Regression] Wrong code since r13-5215-gb1f30bf42d8d47 for dd_rescue package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108695 --- Comment #4 from Martin Liška --- (In reply to Andrew Pinski from comment #2) > Also does adding -fno-strict-aliasing help? Yes, it helps.
[Bug ipa/108695] [13 Regression] Wrong code since r13-5215-gb1f30bf42d8d47 for dd_rescue package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108695 --- Comment #3 from Martin Liška --- (In reply to Andrew Pinski from comment #1) > We need a better testcase than the direction on how to build a full package. Sure, I'll try to reduce it.
[Bug ipa/108695] [13 Regression] Wrong code since r13-5215-gb1f30bf42d8d47 for dd_rescue package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108695 Andrew Pinski changed: What|Removed |Added Keywords||lto --- Comment #2 from Andrew Pinski --- Also does adding -fno-strict-aliasing help?
[Bug ipa/108695] [13 Regression] Wrong code since r13-5215-gb1f30bf42d8d47 for dd_rescue package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108695 Andrew Pinski changed: What|Removed |Added Status|NEW |WAITING Keywords||needs-reduction --- Comment #1 from Andrew Pinski --- We need a better testcase than the direction on how to build a full package.
[Bug ipa/108695] [13 Regression] Wrong code since r13-5215-gb1f30bf42d8d47 for dd_rescue package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108695 Martin Liška changed: What|Removed |Added Last reconfirmed||2023-02-07 Target Milestone|--- |13.0 Status|UNCONFIRMED |NEW Priority|P3 |P1 Ever confirmed|0 |1