[Bug libstdc++/86164] std::regex crashes when matching long lines
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86164

--- Comment #15 from Nadav Har'El ---
More than 5 years later, more and more projects are discovering this bug the hard way and moving from std::regex to boost::regex, which doesn't have this bug. boost::regex defaults to BOOST_REGEX_NON_RECURSIVE mode, which uses a stack on the heap instead of recursion (but I don't know whether the specific examples shown in the various duplicates all need this stack in practice; for example, it would be unfortunate if matching " *" needed to copy the entire input string onto a stack).

The latest example of this exodus is https://github.com/scylladb/scylladb/pull/13452.

So I think it's about time this issue is solved. Maybe even the Boost implementation can be studied for inspiration and implementation ideas?
[Bug libstdc++/86164] std::regex crashes when matching long lines
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86164

--- Comment #14 from Jonathan Wakely ---
Running out of stack space is not acceptable; that's why this is considered a bug. As already stated in comment 8, I started work on fixing it, but the rewritten code had bugs that I haven't had time to resolve yet.
[Bug libstdc++/86164] std::regex crashes when matching long lines
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86164

Maarten L. Hekkelman changed:

  What |Removed |Added
  -----+--------+----------------------------
  CC   |        |maarten at hekkelman dot com

--- Comment #13 from Maarten L. Hekkelman ---
Too bad this bug has still not been dealt with. And it is even worse that simply running out of stack space seems to be acceptable.

And no, I'm not using inputs on the order of 27kB, more like just a few hundred characters at most with quite complex expressions.

Fortunately, it is now very easy to use boost::regex as a standalone library as a replacement. But alas, that's still a dependency.
[Bug libstdc++/86164] std::regex crashes when matching long lines
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86164

Jonathan Wakely changed:

  What |Removed |Added
  -----+--------+----------------------
  CC   |        |semi1 at posteo dot de

--- Comment #12 from Jonathan Wakely ---
*** Bug 84738 has been marked as a duplicate of this bug. ***
[Bug libstdc++/86164] std::regex crashes when matching long lines
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86164

Jonathan Wakely changed:

  What |Removed |Added
  -----+--------+------------------------------
  CC   |        |nyh at math dot technion.ac.il

--- Comment #11 from Jonathan Wakely ---
*** Bug 93502 has been marked as a duplicate of this bug. ***
[Bug libstdc++/86164] std::regex crashes when matching long lines
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86164

Jonathan Wakely changed:

  What |Removed |Added
  -----+--------+----------------------------
  CC   |        |shaoqin2 at illinois dot edu

--- Comment #10 from Jonathan Wakely ---
*** Bug 84865 has been marked as a duplicate of this bug. ***
[Bug libstdc++/86164] std::regex crashes when matching long lines
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86164

Boris Kolpackov changed:

  What |Removed |Added
  -----+--------+--------------------------
  CC   |        |boris at kolpackov dot net

--- Comment #9 from Boris Kolpackov ---
Any progress on this? I get the segfault (due to stack overflow) with the following trivial regex:

  regex re ("#+");
  regex_search (string (32 * 1024, '#'), re);

In comparison, MSVC's implementation crashes only on much larger input (in the above test it is still able to match a 4MB string), while libc++ doesn't seem to have any stack-related limits (I was able to match 40MB).

I see two issues here:

1. It would have been nice if implementation-related limits were reported with an exception rather than a crash.

2. The limits seem to be really low, both practically (matching 32KB doesn't feel unreasonable) and compared to other implementations.
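
As a stopgap for the first point, a caller can enforce its own input limit and report it as an exception. The sketch below is only an illustration under loud assumptions: `guarded_regex_search` and `kMaxRegexInput` are hypothetical names, and the 8 KiB threshold is a guess. As other comments in this thread show, the input size that actually overflows depends on the pattern and on the stack size, so no fixed length is guaranteed to be safe.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical limit, NOT a libstdc++ constant. The real safe size depends
// on the pattern and on the available stack, so treat this as a guess.
constexpr std::size_t kMaxRegexInput = 8 * 1024;

// Hypothetical wrapper: refuse oversized input with an exception instead of
// letting the recursive matcher run out of stack.
bool guarded_regex_search(const std::string& input, const std::regex& re) {
    if (input.size() > kMaxRegexInput) {
        // error_stack is the standard error code for "insufficient memory
        // to determine whether the regular expression could match".
        throw std::regex_error(std::regex_constants::error_stack);
    }
    return std::regex_search(input, re);
}
```

A caller would catch `std::regex_error` and inspect `code()`. This only turns some crashes into a recoverable error; it does nothing about the low limit itself.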
[Bug libstdc++/86164] std::regex crashes when matching long lines
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86164

Patrick Palka changed:

  What     |Removed                       |Added
  ---------+------------------------------+------------------------
  CC       |                              |ppalka at gcc dot gnu.org
  Assignee |unassigned at gcc dot gnu.org |ppalka at gcc dot gnu.org
  Status   |NEW                           |ASSIGNED
[Bug libstdc++/86164] std::regex crashes when matching long lines
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86164

--- Comment #8 from Jonathan Wakely ---
I started working on a patch to replace the recursion with iteration, but haven't got it working yet.
[Bug libstdc++/86164] std::regex crashes when matching long lines
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86164

Giuliano Belinassi changed:

  What |Removed |Added
  -----+--------+-------------------------------
  CC   |        |giuliano.belinassi at usp dot br

--- Comment #7 from Giuliano Belinassi ---
It seems that the issue is the backtracking required by the NFA: the matcher enters a deep recursion when calling _M_dfs in _M_main_dispatch (regex_executor.tcc). Moving the DFS stack from the call stack to the heap and using an iterative DFS might fix this, but converting the NFA to a DFA may be a better choice, as it removes the need for backtracking while iterating over the string.
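
To illustrate the first idea (explicit heap stack plus iterative DFS), here is a minimal sketch. It is a toy full-match engine, not libstdc++ code and not the patch discussed in this bug: `iterative_match` is a hypothetical function that supports only literal characters, `.`, and a postfix `*`. The visited table is an extra assumption added here to keep the search polynomial; it is valid only because this toy syntax has no backreferences.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy matcher (not libstdc++ code). Returns true if `pat`
// matches the whole of `text`. Supports literals, '.', and postfix '*'.
bool iterative_match(const std::string& pat, const std::string& text) {
    const std::size_t width = text.size() + 1;
    // One bit per (pattern index, text index) state, so no state is
    // expanded twice.
    std::vector<bool> visited((pat.size() + 1) * width, false);
    // The DFS stack lives on the heap, so its depth is bounded by available
    // memory rather than by the thread's call stack.
    std::vector<std::pair<std::size_t, std::size_t>> stack;
    stack.emplace_back(0, 0);
    while (!stack.empty()) {
        const std::size_t p = stack.back().first;
        const std::size_t t = stack.back().second;
        stack.pop_back();
        if (visited[p * width + t]) continue;
        visited[p * width + t] = true;
        if (p == pat.size()) {
            if (t == text.size()) return true;  // pattern and input both used up
            continue;                           // dead end: backtrack
        }
        const char c = pat[p];
        const bool starred = p + 1 < pat.size() && pat[p + 1] == '*';
        const bool hit = t < text.size() && (c == '.' || c == text[t]);
        if (starred) {
            stack.emplace_back(p + 2, t);           // leave the loop (tried last)
            if (hit) stack.emplace_back(p, t + 1);  // consume one more (tried first)
        } else if (hit) {
            stack.emplace_back(p + 1, t + 1);
        }
    }
    return false;
}
```

With this shape, `iterative_match(".*", std::string(1 << 20, 'a'))` needs memory proportional to the input length on the heap but only constant call-stack depth. A DFA-based matcher would avoid even the heap stack, at the cost of a construction step and no support for backreferences.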
[Bug libstdc++/86164] std::regex crashes when matching long lines
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86164

Jonathan Wakely changed:

  What             |Removed     |Added
  -----------------+------------+-----------
  Status           |UNCONFIRMED |NEW
  Last reconfirmed |            |2019-01-22
  Ever confirmed   |0           |1
[Bug libstdc++/86164] std::regex crashes when matching long lines
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86164

Nils Gladitz changed:

  What |Removed |Added
  -----+--------+---------------------------
  CC   |        |nilsgladitz at gmail dot com

--- Comment #6 from Nils Gladitz ---
I think I am hitting this issue somewhat earlier on an ARM system with a more limited stack size.

I was able to reproduce it on desktop x86_64 Linux with e.g.:

  #include <regex>
  #include <string>

  int main()
  {
      std::regex_match(
          std::string(2000, 'a'),
          std::regex(".*")
      );
  }

  $ ulimit -s 256 # 256kb stack; which is what I have by default on the ARM system
  $ g++ test.cpp -o regex_test
  $ ./regex_test
  Segmentation fault (core dumped)
[Bug libstdc++/86164] std::regex crashes when matching long lines
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86164

--- Comment #5 from Vadim Zeitlin ---
I obviously meant that it makes it unusable in my use case, where I can't guarantee that the input is bounded by this (smallish) size.
[Bug libstdc++/86164] std::regex crashes when matching long lines
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86164

--- Comment #4 from Jonathan Wakely ---
(In reply to Vadim Zeitlin from comment #3)
> This makes std::regex simply unusable.

Yes, because there are no uses with inputs below 27KiB.
[Bug libstdc++/86164] std::regex crashes when matching long lines
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86164

Vadim Zeitlin changed:

  What |Removed |Added
  -----+--------+--------------------------
  CC   |        |vz-gcc at zeitlins dot org

--- Comment #3 from Vadim Zeitlin ---
BTW, this is unrelated to using grouping in the regex: searching for something as simple as "A.*B" also crashes for input longer than ~27KiB on Linux amd64 with g++ 8.2.0.

This makes std::regex simply unusable.