https://bugs.exim.org/show_bug.cgi?id=2540
Bug ID: 2540
Summary: Valgrind errors in PCRE2 JIT code
Product: PCRE
Version: 10.34 (PCRE2)
Hardware: x86-64
OS: Linux
Status: NEW
Severity: bug
Priority: medium
Component: Code
Assignee: p...@hermes.cam.ac.uk
Reporter: vesse...@awcreator.com
CC: pcre-dev@exim.org

Created attachment 1283
  --> https://bugs.exim.org/attachment.cgi?id=1283&action=edit
ZIP file with bugreport files

While investigating a problem in the TeXstudio search function, one of the users ran it under Valgrind+GDB and came across a few memory errors reported by Valgrind. I investigated the problems in more detail, and they appear to be caused by the PCRE2 JIT code, which uses XMM registers/instructions to access memory past the end of the allocated string buffer.

TeXstudio uses QString from the Qt framework, which in turn uses PCRE2 for regular expression search. The exact versions are:

  Qt 5.13.2 (as shipped with Fedora 31)
  pcre2-10.34 (as shipped with Fedora 31)

The string being searched (the subject) is "wo?wo??" (without the double quotes). The string being searched for (the pattern) is "wo" (that is, the pattern contains no special regular expression characters).

The actual Valgrind error is:
---------------------------------------------
==2547== Invalid read of size 16
==2547==    at 0x253C92B5: ???
==2547==    by 0x187A604B: ???
==2547==  Address 0x187a604e is 30 bytes inside a block of size 40 alloc'd
---------------------------------------------
(the full error message is attached as valgrind_error.txt)

A full GDB disassembly of the offending PCRE2 JIT code is attached as jit_stage_2_disassembly.txt

The actual offending instruction is

=> 0x00000000253c92b5: f3 0f 6f 4e fe    movdqu xmm1,XMMWORD PTR [rsi-0x2]

and in the full disassembly it is prefixed by "=>" (the GDB marker for the current instruction).
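As a cross-check, the operand of the offending instruction can be recovered by hand from its last two bytes (the ModRM byte 0x4e and the 8-bit displacement 0xfe). The following is only an illustrative sketch: the struct and function names are made up for this example and are not part of PCRE2 or sljit, and it assumes that no REX prefix is present, so that r/m value 6 denotes RSI.

```c
#include <stdint.h>

/* Fields of an x86 ModRM byte followed by an 8-bit displacement.
 * Hypothetical helper type for this example only. */
struct modrm_disp8 {
    unsigned mod;  /* addressing mode; 1 means [base + disp8] */
    unsigned reg;  /* register operand (XMM register number here) */
    unsigned rm;   /* base register number; 6 is RSI without a REX prefix */
    int disp;      /* displacement after sign extension */
};

/* Split a ModRM byte into its three bit fields and sign-extend the
 * displacement byte that follows it. */
struct modrm_disp8 decode_modrm_disp8(uint8_t modrm, uint8_t disp8)
{
    struct modrm_disp8 d;

    d.mod = (modrm >> 6) & 0x3;  /* top two bits */
    d.reg = (modrm >> 3) & 0x7;  /* middle three bits */
    d.rm = modrm & 0x7;          /* low three bits */
    /* Sign-extend manually to avoid implementation-defined conversions. */
    d.disp = (disp8 >= 0x80) ? (int)disp8 - 256 : (int)disp8;
    return d;
}
```

Calling decode_modrm_disp8(0x4e, 0xfe) gives mod = 1, reg = 1 (xmm1), rm = 6 (rsi) and disp = -2, which matches the [rsi-0x2] operand shown by GDB.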

I also captured the first stage of the PCRE2 JIT code, where the separate JIT instructions are kept in the compiler buffer but have not yet been merged into a single code block and the jump addresses have not yet been adjusted. The dump of the first-stage JIT code is attached as jit_stage_1_dump.txt

The binary code of the offending instruction is at offset 0x0155 (341) from the start of the buffer, at address 0x2f86fa75:

0x2f86fa75: 0xf3 0x0f 0x6f 0x4e 0xfe

(see jit_stage_1_dump.txt for the full hex dump)

I took a GDB backtrace of the code that builds the offending instruction; it is attached as jit_stage_1_bt.txt

From the stage 1 backtrace it seems that the offending instruction is generated in pcre2_jit_simd_inc.h on line 568. I am also attaching the corresponding source file pcre2_jit_simd_inc.h

Lines 567-568 are
--------------------------------------------------------------------
load_from_mem_sse2(compiler, data1_ind, str_ptr_reg_ind, 0);
load_from_mem_sse2(compiler, data2_ind, str_ptr_reg_ind, -(sljit_s8)diff);
--------------------------------------------------------------------
and they generate the two adjacent instructions from the stage 2 disassembly
--------------------------------------------------------------------
   0x00000000253c92b1: 66 0f 6f 06       movdqa xmm0,XMMWORD PTR [rsi]
=> 0x00000000253c92b5: f3 0f 6f 4e fe    movdqu xmm1,XMMWORD PTR [rsi-0x2]
--------------------------------------------------------------------
where the second instruction is the actual offending one, which reads past the end of the allocated buffer.

Overall it seems that the JIT code does not check for the end of the string being searched, and therefore it simply reads past the end of the allocated string buffer.

So I would appreciate it if any of the PCRE2 developers could tell me:

1. Is this Valgrind warning a serious issue? In this case it seems fairly harmless, but I would imagine that it could lead to an exception and a crash if the JIT code tries to read from memory that it has no read access to.

2. Is there an easy workaround for this kind of read past the end of the buffer when using PCRE2?
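
For reference, the size of the over-read can be worked out from the numbers in the Valgrind message alone (a 16-byte read starting 30 bytes into a 40-byte block). The helper below is a minimal sketch written for this report; its name is hypothetical and it is not taken from Valgrind or PCRE2.

```c
#include <stddef.h>

/* Number of bytes by which an access of access_size bytes, starting
 * offset bytes into a block of block_size bytes, extends past the end
 * of that block. Returns 0 when the access stays within the block. */
size_t bytes_past_end(size_t offset, size_t access_size, size_t block_size)
{
    size_t end = offset + access_size;  /* one past the last byte read */

    return (end > block_size) ? end - block_size : 0;
}
```

With the values from the error message, bytes_past_end(30, 16, 40) is 6: the read covers offsets 30 to 45, so its last 6 bytes lie beyond the 40-byte block.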