[Bug libstdc++/61582] C++11 regex memory corruption

2021-12-16 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

Jonathan Wakely  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2014-06-25 00:00:00         |2021-12-16
           Assignee|timshen at gcc dot gnu.org  |redi at gcc dot gnu.org

--- Comment #24 from Jonathan Wakely  ---
(In reply to Maksymilian Arciemowicz from comment #12)
> Oops. Check this (.*{100}{300})

This one still results in a stack overflow on trunk, with an 8MB stack. That
is:

std::regex_match("a", std::regex("(.*{100}{300})"));

I have a proof-of-concept patch replacing the recursion in _Executor. The
example above runs successfully with a 16k stack limit.

[Bug libstdc++/61582] C++11 regex memory corruption

2021-12-15 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

--- Comment #23 from Jonathan Wakely  ---
(In reply to M Welinder from comment #22)
> FWIW, there is an excellent overview of regular expression engine pitfalls
> and methods here:
> 
> https://swtch.com/~rsc/regexp/regexp1.html
> https://swtch.com/~rsc/regexp/regexp2.html
> https://swtch.com/~rsc/regexp/regexp3.html

Yes, there have been links to the first one in libstdc++ headers since 2013.

[Bug libstdc++/61582] C++11 regex memory corruption

2021-05-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

[Bug libstdc++/61582] C++11 regex memory corruption

2019-02-21 Thread terra at gnome dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

M Welinder  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |terra at gnome dot org

--- Comment #22 from M Welinder  ---
FWIW, there is an excellent overview of regular expression engine pitfalls
and methods here:

https://swtch.com/~rsc/regexp/regexp1.html
https://swtch.com/~rsc/regexp/regexp2.html
https://swtch.com/~rsc/regexp/regexp3.html

Those are about 10 years old, but not outdated.

The TL;DR is "use NFAs and DFAs, not back-tracking".
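
To make the TL;DR concrete, here is a minimal sketch of simulating an NFA over a set of live states, so that each input character is examined once and nothing is ever backtracked. The automaton is hand-built for the pattern (a|b)*abb; the names Transition, kTransitions and nfa_match are made up for this illustration and have nothing to do with the libstdc++ internals.

```cpp
#include <set>
#include <string>
#include <vector>

// Hand-built NFA for (a|b)*abb; each transition is (from, symbol, to).
// Illustrative only: this is not how libstdc++ represents its automaton.
struct Transition { int from; char symbol; int to; };

const std::vector<Transition> kTransitions = {
    {0, 'a', 0}, {0, 'b', 0}, {0, 'a', 1}, {1, 'b', 2}, {2, 'b', 3},
};
const int kStart = 0;
const int kAccept = 3;

// Advances every live state in lockstep. The work per input character is
// bounded by the number of transitions, and no recursion is involved.
bool nfa_match(const std::string& input)
{
    std::set<int> current = {kStart};
    for (char c : input) {
        std::set<int> next;
        for (const Transition& t : kTransitions)
            if (t.symbol == c && current.count(t.from))
                next.insert(t.to);
        current.swap(next);
        if (current.empty())
            return false;  // no state survived this character
    }
    return current.count(kAccept) != 0;
}
```

With this automaton, nfa_match("abb") and nfa_match("aabb") succeed, while nfa_match("ab") and nfa_match("abba") do not.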

[Bug libstdc++/61582] C++11 regex memory corruption

2017-02-10 Thread timshen at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

--- Comment #21 from Tim Shen  ---
(In reply to Pádraig Brady from comment #20)
> Any status update on this? GCC 7 is looming...
> Thanks.

Unfortunately I haven't had a chance to work on this. I plan to put up a
one-line tweak to the internal state limit so that the library throws an
exception instead of crashing. That's probably a strict improvement.

[Bug libstdc++/61582] C++11 regex memory corruption

2017-02-10 Thread P at draigBrady dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

Pádraig Brady  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |P at draigBrady dot com

--- Comment #20 from Pádraig Brady  ---
Any status update on this? GCC 7 is looming...
Thanks.

[Bug libstdc++/61582] C++11 regex memory corruption

2016-03-30 Thread timshen at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

Tim Shen  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |chaoskeeper at mail dot ru

--- Comment #19 from Tim Shen  ---
*** Bug 70459 has been marked as a duplicate of this bug. ***

[Bug libstdc++/61582] C++11 regex memory corruption

2016-03-25 Thread timshen at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

Tim Shen  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bisqwit at iki dot fi

--- Comment #18 from Tim Shen  ---
*** Bug 70411 has been marked as a duplicate of this bug. ***

[Bug libstdc++/61582] C++11 regex memory corruption

2015-12-03 Thread timshen at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

Tim Shen  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kerukuro at gmail dot com

--- Comment #17 from Tim Shen  ---
*** Bug 68688 has been marked as a duplicate of this bug. ***

[Bug libstdc++/61582] C++11 regex memory corruption

2015-08-14 Thread timshen at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

Tim Shen <timshen at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |morandidodo at gmail dot com

--- Comment #15 from Tim Shen <timshen at gcc dot gnu.org> ---
*** Bug 66456 has been marked as a duplicate of this bug. ***


[Bug libstdc++/61582] C++11 regex memory corruption

2015-08-14 Thread timshen at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

Tim Shen <timshen at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |antialize at gmail dot com

--- Comment #16 from Tim Shen <timshen at gcc dot gnu.org> ---
*** Bug 67212 has been marked as a duplicate of this bug. ***


[Bug libstdc++/61582] C++11 regex memory corruption

2014-07-04 Thread max at cert dot cx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

--- Comment #13 from Maksymilian Arciemowicz <max at cert dot cx> ---
@Tim: do you need help?


[Bug libstdc++/61582] C++11 regex memory corruption

2014-07-04 Thread timshen at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

--- Comment #14 from Tim Shen <timshen at gcc dot gnu.org> ---
(In reply to Maksymilian Arciemowicz from comment #13)
> @Tim: do you need help?

This is what I'm going to do:
https://gcc.gnu.org/ml/libstdc++/2014-07/msg8.html

Please send to libstdc++ ml if you have any ideas.


[Bug libstdc++/61582] C++11 regex memory corruption

2014-07-01 Thread max at cert dot cx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

--- Comment #12 from Maksymilian Arciemowicz <max at cert dot cx> ---
Oops. Check this (.*{100}{300})

gcc version 4.10.0 20140701 (experimental) (GCC)

Starting program: /home/cx/REtrunk/kozak5/t3 '(.*{100}{300})'

Program received signal SIGSEGV, Segmentation fault.
0x0040c22a in std::__detail::_Executor<char const*,
std::allocator<std::sub_match<char const*> >, std::regex_traits<char>,
true>::_M_dfs(std::__detail::_Executor<char const*,
std::allocator<std::sub_match<char const*> >, std::regex_traits<char>,
true>::_Match_mode, long) ()



[Bug libstdc++/61582] C++11 regex memory corruption

2014-06-30 Thread timshen at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

--- Comment #11 from Tim Shen <timshen at gcc dot gnu.org> ---
Author: timshen
Date: Tue Jul  1 03:05:45 2014
New Revision: 212185

URL: https://gcc.gnu.org/viewcvs?rev=212185&root=gcc&view=rev
Log:
PR libstdc++/61061
PR libstdc++/61582
* include/bits/regex_automaton.h (_NFA::_M_insert_state): Add
an NFA state limit. If it's exceeded, regex_constants::error_space
will be thrown.
* include/bits/regex_automaton.tcc (_StateSeq::_M_clone): Use
map (which is sparse) instead of vector. This reduces the cost of
n clones from O(n^2) to O(n).
* include/std/regex: Add map dependency.
* testsuite/28_regex/algorithms/regex_match/ecma/char/61601.cc: New
testcase.


Added:
trunk/libstdc++-v3/testsuite/28_regex/algorithms/regex_match/ecma/char/61601.cc
Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/include/bits/regex_automaton.h
trunk/libstdc++-v3/include/bits/regex_automaton.tcc
trunk/libstdc++-v3/include/std/regex
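
The state limit described in the log above can be pictured with a toy builder that refuses to grow past a fixed number of states and reports the failure as regex_constants::error_space. This is only a sketch: ToyNfa, insert_state and fits are hypothetical names, the limit is an arbitrary parameter, and none of this is the actual _NFA::_M_insert_state code.

```cpp
#include <cstddef>
#include <regex>
#include <vector>

// Toy stand-in for an NFA under construction (not libstdc++'s _NFA).
class ToyNfa {
public:
    explicit ToyNfa(std::size_t max_states) : max_states_(max_states) {}

    // Appends a state and returns its index. Throws
    // std::regex_error(error_space) if the limit is already reached.
    std::size_t insert_state(int opcode)
    {
        if (states_.size() >= max_states_)
            throw std::regex_error(std::regex_constants::error_space);
        states_.push_back(opcode);
        return states_.size() - 1;
    }

private:
    std::size_t max_states_;
    std::vector<int> states_;
};

// Hypothetical driver: returns true if `count` states fit under `limit`,
// false if construction was rejected with error_space.
bool fits(std::size_t count, std::size_t limit)
{
    ToyNfa nfa(limit);
    try {
        for (std::size_t i = 0; i < count; ++i)
            nfa.insert_state(0);
    } catch (const std::regex_error& e) {
        if (e.code() == std::regex_constants::error_space)
            return false;
        throw;  // any other error is unexpected in this sketch
    }
    return true;
}
```

The point of failing here is that an oversized pattern is rejected while the automaton is being built, before any matching takes place.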


[Bug libstdc++/61582] C++11 regex memory corruption

2014-06-26 Thread timshen at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

--- Comment #7 from Tim Shen <timshen at gcc dot gnu.org> ---
(.*{100}{100}{100}) seems to be a stack overflow. It's because the regex
executor uses recursion. It could be fixed (not segfault but memory exhaustion)
by using a std::stack and simulating recursion; IMHO, however, directly throwing
regex_error::error_space is the right thing here to do.
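
As a rough illustration of "using a std::stack and simulating recursion", the sketch below runs a depth-first search with an explicit, heap-allocated stack of pending states. It works on a toy graph, not on the real _Executor; Graph, reachable and make_chain are hypothetical names chosen for this example.

```cpp
#include <cstddef>
#include <stack>
#include <vector>

// Toy graph: next[i] lists the successors of state i.
using Graph = std::vector<std::vector<std::size_t>>;

// Iterative depth-first search: returns true if `target` is reachable from
// `start`. Pending states live in a std::stack (heap-backed), so the depth
// is bounded by available memory rather than by the thread's call stack.
bool reachable(const Graph& next, std::size_t start, std::size_t target)
{
    std::vector<bool> visited(next.size(), false);
    std::stack<std::size_t> pending;
    pending.push(start);
    while (!pending.empty()) {
        std::size_t state = pending.top();
        pending.pop();
        if (state == target)
            return true;
        if (visited[state])
            continue;
        visited[state] = true;
        // Push in reverse so the first successor is explored first,
        // matching the order a recursive DFS would use.
        for (auto it = next[state].rbegin(); it != next[state].rend(); ++it)
            pending.push(*it);
    }
    return false;
}

// Builds a chain 0 -> 1 -> ... -> n-1, deep enough that a naive
// recursive walk would risk exhausting the call stack.
Graph make_chain(std::size_t n)
{
    Graph g(n);
    for (std::size_t i = 0; i + 1 < n; ++i)
        g[i].push_back(i + 1);
    return g;
}
```

A chain of a million states can be walked this way; running out of memory then shows up as an allocation failure rather than a segfault, which is the trade-off mentioned above.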


[Bug libstdc++/61582] C++11 regex memory corruption

2014-06-26 Thread max at cert dot cx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

--- Comment #8 from Maksymilian Arciemowicz <max at cert dot cx> ---
(In reply to Tim Shen from comment #7)
> (.*{100}{100}{100}) seems to be a stack overflow. It's because the regex
> executor uses recursion. It could be fixed (not segfault but memory
> exhaustion) by using a std::stack and simulating recursion; IMHO, however,
> directly throwing regex_error::error_space is the right thing here to do.

Yep, it's a stack overflow. Why regex_error::error_space? Wouldn't
regex_error::error_stack be better?


[Bug libstdc++/61582] C++11 regex memory corruption

2014-06-26 Thread timshen at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

--- Comment #9 from Tim Shen <timshen at gcc dot gnu.org> ---
(In reply to Maksymilian Arciemowicz from comment #8)
> (In reply to Tim Shen from comment #7)
>> (.*{100}{100}{100}) seems to be a stack overflow. It's because the regex
>> executor uses recursion. It could be fixed (not segfault but memory
>> exhaustion) by using a std::stack and simulating recursion; IMHO, however,
>> directly throwing regex_error::error_space is the right thing here to do.
>
> Yep, it's a stack overflow. Why regex_error::error_space? Wouldn't
> regex_error::error_stack be better?

Sorry for not clarifying that: I prefer throwing error_space when constructing
(complaining about too many states) instead of throwing error_stack when
matching. To solve the latter problem, as I said, we can use a std::stack or
something to avoid a stack overflow.


[Bug libstdc++/61582] C++11 regex memory corruption

2014-06-26 Thread max at cert dot cx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

--- Comment #10 from Maksymilian Arciemowicz <max at cert dot cx> ---
There is also another alternative, like this:

http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libc/regex/regcomp.c.diff?r1=1.29&r2=1.30&f=h


[Bug libstdc++/61582] C++11 regex memory corruption

2014-06-25 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |NEW
   Last reconfirmed|                            |2014-06-25
         Resolution|INVALID                     |---
            Summary|C11 regex memory corruption |C++11 regex memory
                   |                            |corruption
     Ever confirmed|0                           |1

--- Comment #3 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Maksymilian A from comment #2)
> cx@cx:~/REstd11/kozak5$ ./c11re '((x|'
> terminate called after throwing an instance of 'std::regex_error'
>   what():  regex_error
> Aborted (core dumped)

I think this is by design.
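
For illustration, a caller that does not want the process to terminate on a malformed pattern such as '((x|' can catch std::regex_error around the constructor. A minimal sketch (the helper name compiles is made up for this example):

```cpp
#include <regex>
#include <string>

// Returns true if `pattern` can be compiled as a std::regex, and false if
// construction throws std::regex_error (for example on unbalanced
// parentheses). Only construction is attempted; no matching is done.
bool compiles(const std::string& pattern)
{
    try {
        std::regex re(pattern);
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}
```

Here compiles("((x|") returns false instead of aborting, while a well-formed pattern such as "(x|y)" returns true.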

> cx@cx:~/REstd11/kozak5$ ./c11re '((.*)()?*{100})'
> Segmentation fault (core dumped)

That's a bug.

(It would be helpful if you didn't put C11 in the subject, this has nothing to
do with C)

[Bug libstdc++/61582] C++11 regex memory corruption

2014-06-25 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |timshen at gcc dot gnu.org

--- Comment #4 from Jonathan Wakely <redi at gcc dot gnu.org> ---
That segfault is already fixed on trunk, although possibly just latent


[Bug libstdc++/61582] C++11 regex memory corruption

2014-06-25 Thread max at cert dot cx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

--- Comment #5 from Maksymilian Arciemowicz <max at cert dot cx> ---
Thanks for the feedback. I'm going to verify this on trunk.


[Bug libstdc++/61582] C++11 regex memory corruption

2014-06-25 Thread max at cert dot cx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582

--- Comment #6 from Maksymilian Arciemowicz <max at cert dot cx> ---
@Jonathan: true, but check this case:

cx@cx:~/REtrunk/kozak5$ ~/gccTRUNK/bin/g++ -v
Using built-in specs.
COLLECT_GCC=/home/cx/gccTRUNK/bin/g++
COLLECT_LTO_WRAPPER=/home/cx/gccTRUNK/libexec/gcc/x86_64-unknown-linux-gnu/4.10.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../trunk/configure --prefix=/home/cx/gccTRUNK/
--disable-multilib
Thread model: posix
gcc version 4.10.0 20140625 (experimental) (GCC) 
cx@cx:~/REtrunk/kozak5$ ~/gccTRUNK/bin/g++ c11re.c -o c11re -std=c++11
cx@cx:~/REtrunk/kozak5$ ./c11re '(.*{100}{100}{100})'
Segmentation fault (core dumped)

Program received signal SIGSEGV, Segmentation fault.
0x0041014e in std::__detail::_Executor<char const*,
std::allocator<std::sub_match<char const*> >, std::regex_traits<char>,
true>::_State_info<std::integral_constant<bool, true>,
std::vector<std::sub_match<char const*>, std::allocator<std::sub_match<char
const*> > > >::_M_visited(long) const ()

BR,
Maksymilian
http://cxsecurity.com/