[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 Carl Lovechanged: What|Removed |Added Status|RESOLVED|CLOSED -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 Carl Lovechanged: What|Removed |Added Resolution|--- |FIXED Status|CONFIRMED |RESOLVED -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #35 from Carl Love--- testsuite fix to give more unique values. In testing issues with the new ISA 3.0 instructions in BE mode, it was found that we needed some more unique values in the operands to catch various errors. The issue is a sigle 32-bit value was replicated four times for a V128 operand. The result is testing loads and stores where the word or half word order was swizzled couln't be detected because they were the same. By making the 32-bit chunks unique we were able to catch additional errors. The initialize_buffer() function in none/tests/ppc64/ppc64_helpers.h was changed to ensure we got more unique values. The result is the ISA 3.0 expected output files changed. The expect files were for LE. The expect files were updated for the BE results and the LE result file was added with the "-LE" on the end of the file name per convention. The patch is large so I did not attach it to the bug. Committed in Valgrind commit 16032 -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #34 from Carl Love--- ISA 3.0 BE fixes for various new instructions patch committed VEX commit 3260 -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #33 from Carl Love--- Created attachment 101482 --> https://bugs.kde.org/attachment.cgi?id=101482=edit ISA 3.0 BE fixes for various new instructions This is an additional commit to fix issues found with the new Power ISA 3.0 instructions for BE mode. The instructions fixed in this patch include: lxvl, lxvx, lxvwsx, lxvh8x, lxvh16x, stxvx, stxvh8x, stxvh16x, lxsibzx, lxsihzx, xscvqpdp, xscvqpdp0, xvcvsphp. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #32 from Carl Love--- Patch to fix missing fnptr_to_fnentry() wrapper funion calls for clean and dirty helpers was committed, VEX commit 3251. The BE fixes for the function FPU_rounding_mode_isOdd(), stxvl and stxvx instructions was committed, VEX commit 3252 Patch for the load and store instructions in BE mode is still pending. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #31 from Carl Love--- Created attachment 101364 --> https://bugs.kde.org/attachment.cgi?id=101364=edit Some BE fixes The BE simulator runs found a few more issues that cause Valgrind to fail. These are minor things that were easy to fix. The attached patch addresses these issues. The BE testing has also detected that some of the load and store instructions are not loading the data in the expected BE order. Working on a patch for these instructions. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #30 from Mark Wielaard--- (In reply to Carl Love from comment #29) > Please review the patch and let me know if you see any issues. If everyone > approves the patch and the ISA 3.0 Big endian test passes, I will commit the > fix. The fix will need to get staged for the upcoming Valgrind release. Looks good to me and it passes the testsuite on my setup (power8 both ppc64 and ppc64le). -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #29 from Carl Love--- Created attachment 101332 --> https://bugs.kde.org/attachment.cgi?id=101332=edit Patch to fix missing fnptr_to_fnentry() wrapper function calls for clean and dirty helpers The attached patch fixes the missing fnptr_to_fnentry() calls for the clean and dirty calls. This patch is based on patch 101296 by Mark Wielaard but with the fnptr_to_fentry() call added to all of the other clean and dirty handlers where it was missing. Note, the order of the arguments in the function calls was also changed but that is just a "cosmetic" change The fnptr_to_fnentry() fix was tested and verified on a Power 7 (Big endian) system by applying patch 101299. The patch basically allows us to test the control flow for the bcdadd and bcdsub instructions without actually generating the bcdadd or bcdsub instructions on a Power 7. Instead of the bcdadd and bcdsub instructions being generated, the vand instruction which is supported by Power 7 is generated. With patch 101299, the bug in the clean helper was verified on a big endian system. Then when the fnptr_to_fentry() fix (patch 101296) was applied the clean helper executes correctly. The full regtest suite has been run with the attached patch applied to the current upstream code base on a Power 8 Big endian and Power 8 Little endian systems. The regtest suite on the patched upstream code has been tested and verified on an ISA 3.0 Little endian simulator. The ISA 3.0 Big endian test is currently still running. Unfortunately, building Valgrind from scratch and running the regtest literally takes all day. Please review the patch and let me know if you see any issues. If everyone approves the patch and the ISA 3.0 Big endian test passes, I will commit the fix. The fix will need to get staged for the upcoming Valgrind release. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #28 from Carl Love--- Posted a patch to replace the bcdadd instruction generation in Valgrind with vand. This patch allows us to run a bcd add test on a power 7 (big endian) machine to test the ISA 3.0 code that calls is_BCDstring128() function to see if the function pointer stuff for clean helpers is broken on the older big endian machines. Also posted the compiled bcd_test.c test program. I ran the patch and binary bcd_test program on a Power 7. It does appear to segfault the same as we are seeing on Power 8 BE. Note, this was done WITH OUT Mark's patch applied. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #27 from Carl Love--- Created attachment 101300 --> https://bugs.kde.org/attachment.cgi?id=101300=edit binary for bcd_add.c test Binary for the bcd_test.c program. Runs under valgrind on Power 7 with the valgrind bcdadd patch previously posted. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #26 from Carl Love--- Created attachment 101299 --> https://bugs.kde.org/attachment.cgi?id=101299=edit patch, replace isa2_07 bcdadd/bcdsub with isa2_06 vand inst to we can run bcdadd test on Power 7 Replace bcdadd, bcdsub with vand so we can run bcdadd test on a power 7 to test the clean helper issue. Ignore compiler warnings about unused variables. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #25 from Ulrich Weigand--- (In reply to Julian Seward from comment #24) > (In reply to Ulrich Weigand from comment #23) > > However, adding calls to fnptr_to_fnentry at a high level likewise seems > > wrong, since once you've done that, you've forgotten where the function > > descriptor was and have no chance of then retrieving the correct TOC value > > to load into r2 before the call. And in fact, the crash in comment #7 seems > > a typical example of what happens when calling a function without setting up > > its TOC pointer correctly. > > Ulrich, you're right. Valgrind actually kludges this on ppc64be, by > completely > ignoring the TOC pointer question and not saving or restoring r2 across calls. > I suspect that the reason this works is that Valgrind itself is compiled into > a > statically linked executable, and so there is only ever one required r2 value, > which is set up at start time and stays the same for ever more. Does that > sound plausible? I see. For the "host" side (valgrind itself), I guess that's a valid approach, as long as you make sure that there is indeed never any shared library involved (not even via dlopen), and the main executable actually only uses a single TOC (the linker in fact supports generating executables using multiple TOCs). With code generated by a current compiler (in particular using the medium memory model), you'll probably always get single-TOC executables anyway, but if you actually rely on that property, I guess it cannot hurt to make sure by passing --no-multi-toc to the linker. Of course on the "target" side (the program running under valgrind), you'll have shared libraries and thus multiple TOCs, so if you inject code on that side, you may need to consider reloading the emulated r2 at some point. That probably isn't an issue with the "helpers" discussed in this bug though. Given that, simply doing the fnptr_to_fentry actually may just be the right fix. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #24 from Julian Seward--- (In reply to Ulrich Weigand from comment #23) > However, adding calls to fnptr_to_fnentry at a high level likewise seems > wrong, since once you've done that, you've forgotten where the function > descriptor was and have no chance of then retrieving the correct TOC value > to load into r2 before the call. And in fact, the crash in comment #7 seems > a typical example of what happens when calling a function without setting up > its TOC pointer correctly. Ulrich, you're right. Valgrind actually kludges this on ppc64be, by completely ignoring the TOC pointer question and not saving or restoring r2 across calls. I suspect that the reason this works is that Valgrind itself is compiled into a statically linked executable, and so there is only ever one required r2 value, which is set up at start time and stays the same for ever more. Does that sound plausible? -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 Ulrich Weigandchanged: What|Removed |Added CC||uweig...@de.ibm.com --- Comment #23 from Ulrich Weigand --- Will asked me to look at this bug and comment from the perspective of the Power ABI ... Now, with the ppc64(be) (or AIX) ABI, to call a function via function pointer you need to do two things: - read the target functions' TOC pointer from the function descriptor (word 1) and load it into r2, *and* - read the target functions' entry point address from the function descriptor (word 0) and transfer control to it (And of course you need to save the old value of r2 before the call and restore it afterwards. To be fully correct, you also need to load the static chain pointer from word 2 of the function descriptor and load it into r11 -- but that only matters when calling a nested function.) If you're jumping into a function descriptor (as in comment #19), then you're clearly treating a function descriptor address as if it were an entry point address, which is of course wrong. However, adding calls to fnptr_to_fnentry at a high level likewise seems wrong, since once you've done that, you've forgotten where the function descriptor was and have no chance of then retrieving the correct TOC value to load into r2 before the call. And in fact, the crash in comment #7 seems a typical example of what happens when calling a function without setting up its TOC pointer correctly. I'm not very familiar with valgrind internals, but what I'd have expected to happen is that the function pointer value (i.e. function descriptor address) is being kept around in the IR until that (platform-specific) point where actual function call instructions (bctrl etc.) are emitted; and only at *that* point look into the function descriptor and use it to load both TOC pointer and entry point address. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #22 from Julian Seward--- Looking for helper calls in the the whole of guest_ppc_toIR.c, by searching for the string "mkIRExprVec_", I found the following non-wrapped uses of function pointers. They should all be wrapped in fnptr_to_fnentry(). The reason for my claim of "impossible" in comment 21 is that I would have thought that the BCD test cases exercise all of these helper functions, so any missing wrapper would cause the case to segfault. So maybe we don't really yet understand what is going on. static IRExpr * is_BCDstring128 (UInt Signed, IRExpr *src) _BCDstring128_helper, static IRTemp increment_BCDstring (IRExpr *src, IRExpr *carry_in) _BCDstring32_helper, _BCDstring32_helper, _BCDstring32_helper, _BCDstring32_helper, static IRExpr * convert_to_zoned ( IRExpr *src, IRExpr *upper_byte ) _to_zoned_helper, _to_zoned_helper, static IRExpr * convert_to_national ( IRExpr *src ) { _to_national_helper, _to_national_helper, static IRExpr * convert_from_zoned ( IRExpr *src ) { _from_zoned_helper, static IRExpr * convert_from_national ( IRExpr *src ) { _from_national_helper, static Bool dis_av_load ( const VexAbiInfo* vbi, UInt theInstr ) _dirtyhelper_LVS, (although this one doesn't matter, because it is LE only. Still, inconsistent) same for lvsr a few lines later -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #21 from Mark Wielaard--- Created attachment 101296 --> https://bugs.kde.org/attachment.cgi?id=101296=edit Patch to wrap _BCDstring128_helper address in fnptr_to_fnentry After some debugging Julian suggested this fix. It makes the original testcase PASS. But Julian claims this is impossible :) There are other calls that arguably should use fnptr_to_fnentry to get/call the actual function entry (instead the function descriptor). So either or testcases are not stress testing things enough, or we still don't fully understand what is going on. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #20 from Mark Wielaard--- Created attachment 101295 --> https://bugs.kde.org/attachment.cgi?id=101295=edit Trace for smaller badness2 reproducer Trace output for a smaller reproducer: $ cat badness2.c #include static vector unsigned long long vec_out, vec_inA, vec_inB; int main ( void ) { memset(_inA, 0x12, sizeof(vec_inA)); memset(_inB, 0x34, sizeof(vec_inB)); memset(_out, 0x56, sizeof(vec_out)); __asm__ __volatile__ ("bcdadd. %0, %1, %2, 0" : "=v" (vec_out): "v" (vec_inA),"v" (vec_inB)); return 0; } $ gcc -Winline -Wall -g -O -mregnames -maltivec -m64 -o badness2 badness2.c -mvsx -Wa,-mvsx -mcpu=power8 -Wa,-mpower8 -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #19 from Will Schmidt--- Chased this one around under gdb for a while, looking for where the R2 or the stack frame are getting messed up.Something seems to be going horribly wrong here, but it's possible valgrind just always does this and I'm totally misunderstanding what I am seeing. With that said: It looks like something is going badly in m_dispatch/dispatch-ppc64be-linux.S, line 225, in or around the vspltisw instruction. Noting the changes between the backtrace before and the backtrace after the step over that instruction. (gdb) 223beq .LafterVMX2 => 0x38092e88 <.vgPlain_disp_run_translations+320>:41 82 00 0c beq 0x38092e94 <.vgPlain_disp_run_translations+332> (gdb) bt #0 .vgPlain_disp_run_translations () at m_dispatch/dispatch-ppc64be-linux.S:223 #1 0x380a3114 in run_thread_for_a_while (two_words=two_words@entry=0x802b3fe20, dispatchCtrP=dispatchCtrP@entry=0x802b3fe34, tid=tid@entry=1, alt_host_addr=alt_host_addr@entry=0, use_alt_host_addr=use_alt_host_addr@entry=0 '\000') at m_scheduler/scheduler.c:947 #2 0x380a5a18 in vgPlain_scheduler (tid=) at m_scheduler/scheduler.c:1336 #3 0x380bf3c8 in thread_wrapper (tidW=) at m_syswrap/syswrap-linux.c:103 #4 run_a_thread_NORETURN (tidW=) at m_syswrap/syswrap-linux.c:156 #5 0x in ?? () (gdb) stepi .vgPlain_disp_run_translations () at m_dispatch/dispatch-ppc64be-linux.S:225 225vspltisw 3,0x0 /* generate zero */ => 0x38092e8c <.vgPlain_disp_run_translations+324>:10 60 03 8c vspltisw v3,0 (gdb) bt #0 .vgPlain_disp_run_translations () at m_dispatch/dispatch-ppc64be-linux.S:225 #1 0x380a3114 in run_thread_for_a_while (two_words=0x0, dispatchCtrP=0x0, tid=, alt_host_addr=0, use_alt_host_addr=) at m_scheduler/scheduler.c:947 #2 0x in ?? () (gdb) stepi 226mtvscr 3 => 0x38092e90 <.vgPlain_disp_run_translations+328>:10 00 1e 44 mtvscr v3 (gdb) 230stdu1,-48(1) => 0x38092e94 <.vgPlain_disp_run_translations+332>:f8 21 ff d1 stdur1,-48(r1) (gdb) .vgPlain_disp_run_translations () at m_dispatch/dispatch-ppc64be-linux.S:233 233mr 31,4 /* r31 (generated code gsp) = r4 */ => 0x38092e98 <.vgPlain_disp_run_translations+336>:7c 9f 23 78 mr r31,r4 (gdb) 240mtctr 5 => 0x38092e9c <.vgPlain_disp_run_translations+340>:7c a9 03 a6 mtctr r5 (gdb) 241bctr => 0x38092ea0 <.vgPlain_disp_run_translations+344>:4e 80 04 20 bctr (gdb) 0x00080324 in ?? () => 0x00080324:83 df 00 08lwz r30,8(r31) (gdb) ... at which point I suspect we're off in the weeds with our IP/SP.Did a bit more single-stepping after that,.. Neither I nor gdb are able to map whatever we are executing to anything.. 0x00080324 in ?? () => 0x00080324:83 df 00 08lwz r30,8(r31) (gdb) 0x000803240004 in ?? () => 0x000803240004:37 de ff ffaddic. r30,r30,-1 (gdb) 0x000803240008 in ?? () => 0x000803240008:93 df 00 08stw r30,8(r31) (gdb) 0x00080324000c in ?? () => 0x00080324000c:40 80 00 10bge 0x80324001c (gdb) eventually we stumble across ppc32g_dirtyhelper_MFSPR_ (gdb) 0x0008032401ec in ?? () => 0x0008032401ec:4e 80 04 21bctrl (gdb) ppc32g_dirtyhelper_MFSPR_268_269 (r269=0) at priv/guest_ppc_helpers.c:98 98 if (r269) { => 0x3817ce80 :2f a3 00 00 cmpdi cr7,r3,0 0x3817ce84 :40 9e 00 1c bne cr7,0x3817cea0 a bctrl in that takes us back into dispatch-ppc64be-linux.S (gdb) 0x00080324026c in ?? () => 0x00080324026c:4e 80 04 21bctrl (gdb) .vgPlain_disp_cp_chain_me_to_fastEP () at m_dispatch/dispatch-ppc64be-linux.S:435 435li 6, VG_TRC_CHAIN_ME_TO_FAST_EP => 0x38093034 <.vgPlain_disp_cp_chain_me_to_fastEP+0>:38 c0 00 33 li r6,51 (gdb) 436mflr 7 eventually we branch into postamble/vgPlain_disp_run_translations... (gdb) 442b.postamble => 0x38093040 <.vgPlain_disp_cp_chain_me_to_fastEP+12>:4b ff fe 64 b 0x38092ea4 <.vgPlain_disp_run_translations+348> (gdb) .vgPlain_disp_run_translations () at m_dispatch/dispatch-ppc64be-linux.S:256 256addi1,1,48 -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #17 from Mark Wielaard--- Since some of the output looks as if we jump into (or just read from?) the ODP and that gives a permission error/SEGV, could you provide the section and segments of the tool to double check everything is at the right place? eu-readelf -Sl none-ppc64be-linux -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #18 from Mark Wielaard--- Never mind the objdump -h gives the right information. The 0x3824B2B0 is indeed in the middle of the ODP. The ODP is loaded and DATA (not CODE), which I assume means it is readable (so we can load/read the actual function address and jump to that) but not execute it. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #16 from Will Schmidt--- A bit of an info dump from the IRC discussions occurring on this bug: A gdb session revealed: Program received signal SIGSEGV, Segmentation fault. 0x3824b2b0 in is_BCDstring128_helper () (gdb) disas 0x3824b2b0 Dump of assembler code for function is_BCDstring128_helper: => 0x3824b2b0 <+0>: .long 0x0 0x3824b2b4 <+4>: addir0,r23,-11904 Disassembly of the function by name shows a different address. (gdb) disas is_BCDstring128_helper Dump of assembler code for function is_BCDstring128_helper: 0x3817d180 <+0>: cmpdi cr7,r3,1 0x3817d184 <+4>: li r3,1 per some disassembly, I believe we have confirmed we just tried to execute the OPD... 3824b2b0 g F .opd 0080 is_BCDstring128_helper > objdump -h none/none-ppc64be-linux none/none-ppc64be-linux: file format elf64-powerpc Sections: Idx Name Size VMA LMA File off Algn 0 .note.gnu.build-id 0024 38000158 38000158 0158 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 1 .text 001c6ed0 38000180 38000180 0180 2**5 CONTENTS, ALLOC, LOAD, READONLY, CODE 2 .rodata 00059198 381c7050 381c7050 001c7050 2**3 CONTENTS, ALLOC, LOAD, READONLY, DATA 3 .eh_frame 0030 382201e8 382201e8 002201e8 2**3 CONTENTS, ALLOC, LOAD, READONLY, DATA 4 .data.rel.ro 0030 3823ffd0 3823ffd0 0022ffd0 2**3 CONTENTS, ALLOC, LOAD, DATA 5 .data 25f0 3824 3824 0023 2**3 CONTENTS, ALLOC, LOAD, DATA 6 .opd ad98 382425f0 382425f0 002325f0 2**3 CONTENTS, ALLOC, LOAD, DATA 7 .got 0030 3824d388 3824d388 0023d388 2**3 CONTENTS, ALLOC, LOAD, DATA 8 .bss 009da2a0 3824d3c0 3824d3c0 0023d3b8 2**4 ALLOC 9 .comment 002c 0023d3b8 2**0 CONTENTS, READONLY 10 .debug_aranges 1c20 0023d3f0 2**4 CONTENTS, READONLY, DEBUGGING 11 .debug_info 0030b9a3 0023f010 2**0 CONTENTS, READONLY, DEBUGGING 12 .debug_abbrev 000156fa 0054a9b3 2**0 CONTENTS, READONLY, DEBUGGING 13 .debug_line 00055a2f 005600ad 2**0 CONTENTS, READONLY, DEBUGGING 14 .debug_frame 00023b30 005b5ae0 2**3 CONTENTS, READONLY, DEBUGGING 15 .debug_str0002e1c5 005d9610 2**0 CONTENTS, READONLY, DEBUGGING 16 .debug_loc002b2e89 006077d5 2**0 CONTENTS, READONLY, DEBUGGING 17 .debug_ranges 00051bb0 008ba65e 2**0 CONTENTS, READONLY, DEBUGGING -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 Will Schmidtchanged: What|Removed |Added Attachment #101244|0 |1 is obsolete|| --- Comment #15 from Will Schmidt --- Created attachment 101246 --> https://bugs.kde.org/attachment.cgi?id=101246=edit comment 3 step 3 output respin comment 3 step 3 respin with --trace-notbelow=2092 -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #14 from Carl Love--- Created attachment 101245 --> https://bugs.kde.org/attachment.cgi?id=101245=edit simple test program The attachment is a stripped down test for the bcdadd instruction. It was pulled out of the jm_vec_isa_2_07 test. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #13 from Will Schmidt--- Created attachment 101244 --> https://bugs.kde.org/attachment.cgi?id=101244=edit comment 3 step 3 output results from comment3 step 3 -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #12 from Will Schmidt--- Created attachment 101243 --> https://bugs.kde.org/attachment.cgi?id=101243=edit output from monitor v.translate on segfaulting instruction -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #11 from Will Schmidt--- (In reply to Mark Wielaard from comment #9) > Assuming that is with remote vgdb attached then you should be able to get > the generated IR and assembly with: > monitor v.translate 0x100016dc 0b0011 > > (See the manual or valgrind --help-debug for the explanation of the > --trace-flags) Ok. So given this: Program received signal SIGSEGV, Segmentation fault. 0x100016d8 in test_bcdadd () at test_isa_2_07_part1.c:710 710 __asm__ __volatile__ ("bcdadd. %0, %1, %2, 0" : "=v" (vec_out): "v" (vec_inA),"v" (vec_inB)); I did the monitor command against the ... 16d8 addr. (gdb) monitor v.translate 0x100016d8 0b0011 The output is larger than I had expected...attaching it momentarily. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #10 from Will Schmidt--- Created attachment 101242 --> https://bugs.kde.org/attachment.cgi?id=101242=edit trace flags output last portion of output from a run in attachment. $ ./vg-in-place --tool=none --trace-flags=10001110 --trace-notbelow=2000 none/tests/ppc64/test_isa_2_07_part1 -a very last bit of output also inline: 377 std %r14,1296(%r31) 378 ld %r14,1304(%r31) 379 li_word %r15,0xFFFC 380 and %r16,%r14,%r15 381 (xIndir) if (always) { std %r16,1296(%r31); imm64 r30,$disp_cp_xindir; mtctr r30; bctr } VexExpansionRatio 36 1872 520 :10 ==4341== ==4341== Process terminating with default action of signal 11 (SIGSEGV): dumping core ==4341== Bad permissions for mapped region at address 0x3824B2B0 ==4341==at 0x100016D8: test_bcdadd (test_isa_2_07_part1.c:710) ==4341==by 0x10001A2F: test_av_bcd (test_isa_2_07_part1.c:1419) ==4341==by 0x41E4D87: generic_start_main.isra.0 (libc-start.c:289) ==4341==by 0x41E4FB7: (below main) (libc-start.c:80) ==4341== -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #9 from Mark Wielaard--- Assuming that is with remote vgdb attached then you should be able to get the generated IR and assembly with: monitor v.translate 0x100016dc 0b0011 (See the manual or valgrind --help-debug for the explanation of the --trace-flags) -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #8 from Will Schmidt--- off-by-one on my disassembly "==>" pointer, the SIGSEGV is actually on the bcdadd instruction. (showed up differently while single-stepping). 0x100016d4 <+64>:lxvd2x vs32,r10,r9 0x100016d8 <+68>:lxvd2x vs33,0,r9 => 0x100016dc <+72>:bcdadd. v0,v1,v0,0 0x100016e0 <+76>:li r10,16 still poking.. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #7 from Will Schmidt--- (gdb) c Continuing. Program received signal SIGSEGV, Segmentation fault. 0x100016d8 in test_bcdadd () at test_isa_2_07_part1.c:710 710 __asm__ __volatile__ ("bcdadd. %0, %1, %2, 0" : "=v" (vec_out): "v" (vec_inA),"v" (vec_inB)); (gdb) disas Dump of assembler code for function test_bcdadd: 0x10001694 <+0>:nop 0x10001698 <+4>:lwz r9,-32228(r2) 0x1000169c <+8>:cmpwi cr7,r9,0 0x100016a0 <+12>:beq cr7,0x100016c8 0x100016a4 <+16>:nop 0x100016a8 <+20>:addir9,r2,-32328 0x100016ac <+24>:li r10,32 0x100016b0 <+28>:lxvd2x vs32,r10,r9 0x100016b4 <+32>:lxvd2x vs33,0,r9 0x100016b8 <+36>:bcdadd. v0,v1,v0,1 0x100016bc <+40>:li r10,16 0x100016c0 <+44>:stxvd2x vs32,r9,r10 0x100016c4 <+48>:b 0x100016e8 0x100016c8 <+52>:nop 0x100016cc <+56>:addir9,r2,-32328 0x100016d0 <+60>:li r10,32 0x100016d4 <+64>:lxvd2x vs32,r10,r9 => 0x100016d8 <+68>:lxvd2x vs33,0,r9 0x100016dc <+72>:bcdadd. v0,v1,v0,0 0x100016e0 <+76>:li r10,16 0x100016e4 <+80>:stxvd2x vs32,r9,r10 0x100016e8 <+84>:blr 0x100016ec <+88>:.long 0x0 0x100016f0 <+92>:.long 0x0 0x100016f4 <+96>:.long 0x0 End of assembler dump. (gdb) info reg r10 r100x10001958268441944 (gdb) info reg r9 r9 0x10020f40268570432 (gdb) info reg vs32 vs32 {uint128 = 0x6090180378642006002244669113354d, v2_double = {0x, 0x0}, v4_float = { 0x, 0x, 0x0, 0x0}, v4_int32 = {0x60901803, 0x78642006, 0x224466, 0x9113354d}, v8_int16 = { 0x6090, 0x1803, 0x7864, 0x2006, 0x22, 0x4466, 0x9113, 0x354d}, v16_int8 = {0x60, 0x90, 0x18, 0x3, 0x78, 0x64, 0x20, 0x6, 0x0, 0x22, 0x44, 0x66, 0x91, 0x13, 0x35, 0x4d}} (gdb) info reg vs33 vs33 {uint128 = 0x8045090189321003001122334556677d, v2_double = {0x0, 0x0}, v4_float = {0x0, 0x0, 0x0, 0xd66}, v4_int32 = {0x80450901, 0x89321003, 0x112233, 0x4556677d}, v8_int16 = {0x8045, 0x901, 0x8932, 0x1003, 0x11, 0x2233, 0x4556, 0x677d}, v16_int8 = {0x80, 0x45, 0x9, 0x1, 0x89, 0x32, 0x10, 0x3, 0x0, 0x11, 0x22, 0x33, 0x45, 0x56, 0x67, 0x7d}} (gdb) -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #6 from Mark Wielaard--- (In reply to Julian Seward from comment #3) > I tried to reproduce this on gcc112.fsffrance.org (a P8 system) but failed -- > it runs OK. I'd be happy to chase this if I could reproduce on a machine > that I can access. gcc112 is ppc64le which seems to work fine. Unfortunately the ppc64[be] machines in the GCC compile farm are all power7, not power8 (as gcc112 is). > (3) with --tool=none, run it with --trace-flags=10001110 and >--trace-notbelow= set to the appropriate value so you actually >get output for the failing block. From that we might be able >to guess what the problem is. It is sometimes convenient to do that from gdb. Run with ./vg-in-place --vgdb-error=0 none/tests/ppc64/test_isa_2_07_part1 in one terminal gdb none/tests/ppc64/test_isa_2_07_part1 in another. Attach in gdb with target remote | coregrind/vgdb. continue. It gets the SEGV. Find the address where it crashed. Then use monitor v.translate [] to dump the block. The traceflags are the same. See http://www.valgrind.org/docs/manual/manual-core-adv.html#manual-core-adv.valgrind-monitor-commands -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #5 from Julian Seward--- Still can't repro it, but with a test case for this insn, the two calls look like this: IR and virtual-registerised code: -- t127 = 1Sto32(32to1(64to32(And64(is_BCDstring128_helper{0x38174610}(0x1:I64,t117,t118):I64,is_BCDstring128_helper{0x38174610}(0x1:I64,t120,t121):I64 li_word %vR721,0x0001 mr %r3,%vR721 mr %r4,%vR117 mr %r5,%vR118 call: { li_word %r10,0x38174610 ; mtctr r10 ; bctrl [r3,r4,r5,RLPri_Int] } mr %vR722,%r3 li_word %vR723,0x0001 mr %r3,%vR723 mr %r4,%vR120 mr %r5,%vR121 call: { li_word %r10,0x38174610 ; mtctr r10 ; bctrl [r3,r4,r5,RLPri_Int] } mr %vR724,%r3 and %vR720,%vR722,%vR724 andi. %vR725,%vR720,1 cmplwi %cr7,%vR725,1 set (cr7.eq=1),%vR719: { mfcr r0 ; rlwinm %vR719,r0,31,31,31 } slwi %vR719,%vR719,31 srawi %vR719,%vR719,31 mr %vR127,%vR719 Register allocation around the two calls: 194 li_word %r14,0x0001 195 mr %r3,%r14 196 mr %r4,%r15 197 mr %r5,%r16 198 call: { li_word %r10,0x38174610 ; mtctr r10 ; bctrl [r3,r4,r5,RLPri_Int] } 199 mr %r14,%r3 200 li_word %r19,0x0001 201 mr %r3,%r19 202 mr %r4,%r17 203 mr %r5,%r18 204 call: { li_word %r10,0x38174610 ; mtctr r10 ; bctrl [r3,r4,r5,RLPri_Int] } 205 mr %r19,%r3 206 and %r20,%r14,%r19 207 andi. %r14,%r20,1 So the result of the first call is parked in r14, and then we make the second call. That seems OK because r14 (in fact, r14 to r28) are callee saved so the result of the first call won't get trashed. Also, the args for the second call enter this fragment in r19/r18/r17, so we can be fairly sure the first call doesn't trash them. So there's nothing *obviously* wrong here that I can see. We need the real crash block. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #4 from Julian Seward--- Comment 3 assumes that the block that segfaults is the same one where the (we assume) mis-translation occurred. It might be that some previous block was mis-translated and causes the simulated machine state to be corrupted, and only in some later block does it crash. But that's a less likely scenario. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #3 from Julian Seward--- This kind of thing could well be due to incorrect register allocation around the calls, perhaps corrupting the values passed to the calls or corrupting values in registers around the call site, that both caller and callee expect the other to preserve. Maybe something along those lines. I tried to reproduce this on gcc112.fsffrance.org (a P8 system) but failed -- it runs OK. I'd be happy to chase this if I could reproduce on a machine that I can access. Failing that .. I'd suggest: (1) make the smallest possible testcase that segfaults (2) make sure it still segfaults with --tool=none (3) with --tool=none, run it with --trace-flags=10001110 and --trace-notbelow= set to the appropriate value so you actually get output for the failing block. From that we might be able to guess what the problem is. (4) Find the failing instruction in GDB by setting env var VALGRIND_LAUNCHER=(to nothing) and then running gdb none/none-ppc64be-linux run --tool=none ./path/to/the/test/binary and disassemble say 10 insns each side of the failing insn, so we can locate it in the JIT generated code. Be aware that GDB may catch segfaults that are unrelated to this crash, prior to the crash point; allow it to pass those on to valgrind to be handled. (5) Post the results from (3) and (4) here. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 Carl Lovechanged: What|Removed |Added CC||c...@us.ibm.com, ||will_schm...@vnet.ibm.com Ever confirmed|0 |1 Status|UNCONFIRMED |CONFIRMED --- Comment #2 from Carl Love --- I have some what isolated where the issue when running on Power 7 BE (issue does not occur on LE) is but at this point could use some help getting to the root cause. in VEX/priv/guest_ppc_toIR.c in the function dis_av_bcd() there is a case statement that handles the bcdadd and bcdsub instructions. Once the result of the operation is completed, we have to compute some values to set the CR field 6. The calculation of the valid bit is as follows: valid = unop( Iop_64to32, binop( Iop_And64, is_BCDstring128( /* Signed */True, mkexpr( vA ) ), is_BCDstring128( /* Signed */True, mkexpr( vB ) ) ) ); Note, is_BCDstring128() calls a clean helper written in C. The issue is making the two calls above leads to a segmentation fault. I rewrote the computation as follows to isolate out the computations: assign( A_isBCDstr, is_BCDstring128( /* Signed */True, mkexpr( vA ) ) ); // assign( A_isBCDstr, mkU64( 0 ) ); assign( B_isBCDstr, is_BCDstring128( /* Signed */True, mkexpr( vB ) ) ); // assign( B_isBCDstr, mkU64( 0 ) ); valid = unop( Iop_64to32, binop( Iop_And64, mkexpr( A_isBCDstr ), mkexpr( B_isBCDstr ) ) ); Now if I make the call to is_BCDstring128() for vA and vB I get the segmentation fault. If I comment out either computation and put in the appropriate commented out line to just explicitly set the result to zero, I do not get a segmentation fault. The segmentation fault occurs later when the IR code is actually executed. I am guessing that somehow the generation of the IR code with two back to back clean helper calls results in bad code that causes the segmentation fault. Unfortunately, the issue is a bit deeper in Valgrind then I am familiar with. I am looking for help to debug this issue. Please let me know if there is a way we can dump out the generated code and so we can examine it. Thanks. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 369175] jm_vec_isa_2_07 test crashes on ppc64
https://bugs.kde.org/show_bug.cgi?id=369175 --- Comment #1 from Mark Wielaard--- Note that running the program itself (not under valgrind) seems fine. It is only when ran under valgrind that it produces the SIGSEGV: ==28394== Process terminating with default action of signal 11 (SIGSEGV) ==28394== Bad permissions for mapped region at address 0x3828C978 ==28394==at 0x100016D4: test_bcdadd (test_isa_2_07_part1.c:710) ==28394==by 0x10001A2B: test_av_bcd (test_isa_2_07_part1.c:1419) ==28394==by 0x420646B: generic_start_main.isra.0 (in /usr/lib64/libc-2.17.so) ==28394==by 0x4206693: (below main) (in /usr/lib64/libc-2.17.so) static int PS_bit; static void test_bcdadd (void) { if (PS_bit) __asm__ __volatile__ ("bcdadd. %0, %1, %2, 1" : "=v" (vec_out): "v" (vec_inA),"v" (vec_inB)); else => __asm__ __volatile__ ("bcdadd. %0, %1, %2, 0" : "=v" (vec_out): "v" (vec_inA),"v" (vec_inB)); } => is line 710. -- You are receiving this mail because: You are watching all bug changes.