https://bugs.llvm.org/show_bug.cgi?id=48649

            Bug ID: 48649
           Summary: unsafe pointer arithmetic in llvm_regcomp()
           Product: libraries
           Version: trunk
          Hardware: All
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Support Libraries
          Assignee: [email protected]
          Reporter: [email protected]
                CC: [email protected]

llvm/lib/Support/regcomp.c is borrowed from OpenBSD, where the following
issue has been reported and fixed (report and patch:
https://marc.info/?l=openbsd-tech&m=160923823113340&w=2 ).

regcomp.c uses the "start + count < end" idiom to check that "count" bytes are
available in the array of char that "start" and "end" both point into.

This is fine unless "start + count" goes more than one element past the end of
the array. In that case, a pedantic interpretation of the C standard (C11
6.5.6p8) makes the pointer arithmetic itself, and therefore the comparison
against "end", undefined, and optimizers from hell will happily remove as much
code as possible because of this.

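An example of this occurs in regcomp.c's bothcases(), which defines bracket[3],
sets "next" to "bracket" and "end" to "bracket + 2". Then it invokes
p_bracket(), which starts with "if (p->next + 5 < p->end)"... Here
"p->next + 5" is "bracket + 5", which lies past the end of the three-element
array, so evaluating it is already undefined.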

Because bothcases() and p_bracket() are static functions in regcomp.c, there is
a real risk of miscompilation if aggressive inlining happens. The following
diff rewrites the "start + count < end" constructs into "end - start > count".
Assuming "end" and "start" always point into the same array, or one past its
last element (such as "bracket[3]" above), "end - start" is well-defined and
can be compared without trouble.
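
The rewrite described above can be sketched as follows. This is only an
illustration, not the actual patch: has_more_than() and bracket_demo() are
hypothetical names that do not exist in regcomp.c, and bracket_demo() merely
mirrors the bracket[3] layout from bothcases().

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of regcomp.c): nonzero when more than
 * `count` chars remain in [start, end). Both pointers must point into,
 * or one past the end of, the same array, so `end - start` is defined.
 */
static int has_more_than(const char *start, const char *end, ptrdiff_t count)
{
    /* Equivalent in intent to `start + count < end`, but never forms
     * a pointer outside the array. */
    return end - start > count;
}

/*
 * Hypothetical demo mirroring bothcases(): a 3-element array with
 * next = bracket and end = bracket + 2.
 */
static int bracket_demo(ptrdiff_t count)
{
    char bracket[3] = { 'a', 'A', '\0' };
    const char *next = bracket;
    const char *end = bracket + 2;

    return has_more_than(next, end, count);
}
```

With this form, the p_bracket()-style check against 5 simply evaluates to
false for the 3-element array, instead of computing "bracket + 5".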

As a bonus, MORE2() implies MORE(), so SEETWO() can be simplified a bit.
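
A sketch of what the simplified macros might look like, assuming the
subtraction-based form described above. The macro names follow regcomp.c,
but the bodies here are an approximation rather than a copy of the patch;
"struct parse" is reduced to the two fields used, and see_two() is a
hypothetical wrapper added only so the macros can be exercised.

```c
#include <stddef.h>

/* Reduced stand-in for regcomp.c's struct parse (assumption: only the
 * two pointers used by the macros are modelled here). */
struct parse {
    const char *next;   /* next character to read */
    const char *end;    /* one past the last character */
};

#define PEEK()   (*p->next)
#define PEEK2()  (*(p->next + 1))
#define MORE()   (p->end - p->next > 0)   /* at least one char left */
#define MORE2()  (p->end - p->next > 1)   /* at least two chars left */
/* MORE2() already implies MORE(), so MORE() need not be tested here. */
#define SEETWO(a, b)  (MORE2() && PEEK() == (a) && PEEK2() == (b))

/* Hypothetical wrapper: do the first two chars of s[0..len) equal a, b? */
static int see_two(const char *s, size_t len, char a, char b)
{
    struct parse st = { s, s + len };
    struct parse *p = &st;

    return SEETWO(a, b);
}
```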
