https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85824
Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |timshen at gcc dot gnu.org

--- Comment #4 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Wanying Luo from comment #0)
> When _M_transform() calls strxfrm() and gets -1 when converting 0x80 under
> the UTF-8 locale on Solaris SPARC, it simply assigns -1 to __res of type
> size_t which creates a very large number. This causes __ret.append(__c,
> __res) to crash. I think it would be nice if the code checks errno and
> issues a better error message than the one above.

N.B. it doesn't just crash, it throws an exception because it can't append
4294967295 bytes to a std::string. Any fix to check errno in
collate<char>::do_transform is still going to involve throwing an exception,
just a slightly different one.

The real problem is that std::regex wants to build a cache of every value from
CHAR_MIN to CHAR_MAX, to decide if it matches the bracket expression "[0-9]".
If calling strxfrm on any 8-bit char value produces an error then we're going
to get an exception.

I think something in the regex compiler (maybe in transform_primary) needs to
handle those exceptions (and either decide the characters that produce errors
do not match, or maybe disable the cache?)

Tim, I'll take care of checking errno in collate<>::_M_transform but could you
advise what to do about the regex compiler?

Maybe:

--- a/libstdc++-v3/include/bits/regex.h
+++ b/libstdc++-v3/include/bits/regex.h
@@ -257,7 +257,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
          const __ctype_type& __fctyp(use_facet<__ctype_type>(_M_locale));
          std::vector<char_type> __s(__first, __last);
          __fctyp.tolower(__s.data(), __s.data() + __s.size());
-         return this->transform(__s.data(), __s.data() + __s.size());
+         __try {
+           return this->transform(__s.data(), __s.data() + __s.size());
+         } catch(const std::runtime_error&) {
+           return string_type();
+         }
        }

       /**
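
For illustration only, here is a rough standalone sketch of the two ideas discussed above: checking for a strxfrm() failure before using its result as a length, and catching the resulting exception so that a failing character yields an empty sort key. The names checked_strxfrm and transform_or_empty are hypothetical and are not part of libstdc++. Treating a return value of (size_t)-1 as an error is an assumption based on the Solaris behaviour reported in comment #0. This is not the actual fix.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical helper (not libstdc++ code): call strxfrm() and report
// failure with an exception instead of using a bogus length.
std::string checked_strxfrm(const char* src)
{
    // First call with a zero-sized buffer only asks for the required length.
    errno = 0;
    const std::size_t len = std::strxfrm(nullptr, src, 0);

    // Assumption: failure shows up as errno != 0 or as (size_t)-1,
    // which is what -1 becomes once stored in a size_t.
    if (errno != 0 || len == static_cast<std::size_t>(-1))
        throw std::runtime_error("strxfrm failed");

    std::string out(len + 1, '\0');   // room for the terminating null
    errno = 0;
    const std::size_t written = std::strxfrm(&out[0], src, out.size());
    if (errno != 0 || written >= out.size())
        throw std::runtime_error("strxfrm failed");

    out.resize(written);              // drop the terminating null
    return out;
}

// Hypothetical wrapper mirroring the try/catch in the patch above:
// a character sequence that cannot be transformed gets an empty key.
std::string transform_or_empty(const char* src)
{
    try {
        return checked_strxfrm(src);
    } catch (const std::runtime_error&) {
        return std::string();
    }
}
```

With a wrapper like this, a loop over CHAR_MIN..CHAR_MAX would not be aborted by a single value that strxfrm() rejects. Whether an empty key is the right answer for such characters is exactly the open question for the regex compiler.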