Re: [libstdc++/71500] make back reference work with icase
On 19/09/17 15:38 +0100, Jonathan Wakely wrote: On 18/09/17 16:54 -0700, Tim Shen wrote: On Mon, Sep 18, 2017 at 4:01 PM, Jonathan Wakely wrote: On 18/09/17 10:58 -0700, Tim Shen via libstdc++ wrote: On Mon, Sep 18, 2017 at 10:26 AM, Jonathan Wakely wrote: We need to rewrite this to check the lengths are equal first, and then call the 3-argument version of std::equal. Alternatively, we could move the implementation of the C++14 std::equal overloads to __equal and make that available for C++11. I'll try that. Here's a proof of concept patch for that. It's a bit ugly. Instead of having iterator tags in the interface, we can probe the random-access-ness inside __equal4/__equal4_p, can't we? It's similar to the existing "if (_RAIters()) { ... }". I'd expect the patches to be renaming the current implementations and adding wrappers, instead of adding new implementations. Well I decided to split the existing functions up and use tag dispatching, which is conceptually cleaner anyway. But as the RandomAccessIterator version doesn't need any operations that aren't valid for other categories, it's not strictly necessary. The tag dispatching version should generate slightly smaller code for unoptimized builds, but that's not very important. Unoptimized builds don't inline small functions, therefore the first patch generate two weak symbols, instead of one by the second patch. Two small functions that only do the necessary work, rather than one large function that has a branch for RAIters even when it can never be taken. It's unclear to me how would number of symbols penalize the performance/binary size. People who care about performance or binary size should be optimizing, and in that case the RAIters branch will be known at compile-time and the dead code should get removed, and the wrapper functions inlined. Here's the patch doing it as you suggest. We can't call the new functions __equal because t hat name is already taken by a helper struct, hence __equal4. Do you prefer this version? Yes, I prefer this version for readability reasons: 1) subjectively, less scattered code; and 2) ideally I want `if constexpr (...)`), the if version is closer. Yes, we could add _GLIBCXX17_CONSTEXPR there, but I'm not sure it's worth doing. 3) The calls to __equal4 in _Backref_matcher are simpler. I agree that it's not a big difference. I just wanted to point out the small difference. I'm fine with either version. I'll commit the second version. Here's what I've committed, with a minimal test to catch this happening in future. I'll re-run the full set of test variations. commit 371c5de025c0fc95420d96bf96f3da84e3725c9d Author: Jonathan Wakely Date: Tue Sep 19 17:36:52 2017 +0100 PR libstdc++/71500 restore C++11 compatibility in PR libstdc++/71500 * include/bits/regex_executor.tcc (_Backref_matcher>::_M_apply): Use std::__equal4 instead of C++14 4-iterator overloads of std::equal. * include/bits/stl_algobase.h (__equal4): New functions implementing 4-iterator overloads of std::equal for use in C++11. (equal(It1, It1, It2, It2), equal(It1, It1, It2, It2, BinaryPred)): Move function bodies to new __equal4 functions. * testsuite/28_regex/simple_c++11.cc: New. diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc index f6149fecf9d..2ceba35e7b8 100644 --- a/libstdc++-v3/include/bits/regex_executor.tcc +++ b/libstdc++-v3/include/bits/regex_executor.tcc @@ -366,17 +366,17 @@ namespace __detail _BiIter __actual_end) { if (!_M_icase) - return std::equal(__expected_begin, __expected_end, - __actual_begin, __actual_end); + return std::__equal4(__expected_begin, __expected_end, + __actual_begin, __actual_end); typedef std::ctype<_CharT> __ctype_type; const auto& __fctyp = use_facet<__ctype_type>(_M_traits.getloc()); - return std::equal(__expected_begin, __expected_end, - __actual_begin, __actual_end, - [this, &__fctyp](_CharT __lhs, _CharT __rhs) - { - return __fctyp.tolower(__lhs) -== __fctyp.tolower(__rhs); - }); + return std::__equal4(__expected_begin, __expected_end, + __actual_begin, __actual_end, + [this, &__fctyp](_CharT __lhs, _CharT __rhs) + { + return __fctyp.tolower(__lhs) + == __fctyp.tolower(__rhs); + }); } bool _M_icase; diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h index f68ecb22b82..a80934c4faa 100644 --- a/libstdc++-v3/include/bits/stl_algobase.h +++ b/libstdc++-v3/include/bits/stl_algobase.h @@ -1082,6 +1082,60 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO return true; } +#if __cplusplus >= 201103L + // 4-iterator version of std::equal for use in C++11. + template +inline bool +__equal4(_II1 __first1, _II1 __last1, _II2 __first2,
Re: [libstdc++/71500] make back reference work with icase
The failures that need to be fixed can be seen at https://gcc.gnu.org/ml/gcc-testresults/2017-09/msg01633.html
Re: [libstdc++/71500] make back reference work with icase
On 18/09/17 16:54 -0700, Tim Shen wrote: On Mon, Sep 18, 2017 at 4:01 PM, Jonathan Wakely wrote: On 18/09/17 10:58 -0700, Tim Shen via libstdc++ wrote: On Mon, Sep 18, 2017 at 10:26 AM, Jonathan Wakely wrote: We need to rewrite this to check the lengths are equal first, and then call the 3-argument version of std::equal. Alternatively, we could move the implementation of the C++14 std::equal overloads to __equal and make that available for C++11. I'll try that. Here's a proof of concept patch for that. It's a bit ugly. Instead of having iterator tags in the interface, we can probe the random-access-ness inside __equal4/__equal4_p, can't we? It's similar to the existing "if (_RAIters()) { ... }". I'd expect the patches to be renaming the current implementations and adding wrappers, instead of adding new implementations. Well I decided to split the existing functions up and use tag dispatching, which is conceptually cleaner anyway. But as the RandomAccessIterator version doesn't need any operations that aren't valid for other categories, it's not strictly necessary. The tag dispatching version should generate slightly smaller code for unoptimized builds, but that's not very important. Unoptimized builds don't inline small functions, therefore the first patch generate two weak symbols, instead of one by the second patch. Two small functions that only do the necessary work, rather than one large function that has a branch for RAIters even when it can never be taken. It's unclear to me how would number of symbols penalize the performance/binary size. People who care about performance or binary size should be optimizing, and in that case the RAIters branch will be known at compile-time and the dead code should get removed, and the wrapper functions inlined. Here's the patch doing it as you suggest. We can't call the new functions __equal because t hat name is already taken by a helper struct, hence __equal4. Do you prefer this version? Yes, I prefer this version for readability reasons: 1) subjectively, less scattered code; and 2) ideally I want `if constexpr (...)`), the if version is closer. Yes, we could add _GLIBCXX17_CONSTEXPR there, but I'm not sure it's worth doing. 3) The calls to __equal4 in _Backref_matcher are simpler. I agree that it's not a big difference. I just wanted to point out the small difference. I'm fine with either version. I'll commit the second version. Thanks for the prototyping! -- Regards, Tim Shen
Re: [libstdc++/71500] make back reference work with icase
On Mon, Sep 18, 2017 at 4:01 PM, Jonathan Wakely wrote: > On 18/09/17 10:58 -0700, Tim Shen via libstdc++ wrote: >> >> On Mon, Sep 18, 2017 at 10:26 AM, Jonathan Wakely >> wrote: We need to rewrite this to check the lengths are equal first, and then call the 3-argument version of std::equal. Alternatively, we could move the implementation of the C++14 std::equal overloads to __equal and make that available for C++11. I'll try that. >>> >>> >>> >>> Here's a proof of concept patch for that. It's a bit ugly. >> >> >> Instead of having iterator tags in the interface, we can probe the >> random-access-ness inside __equal4/__equal4_p, can't we? It's similar >> to the existing "if (_RAIters()) { ... }". >> >> I'd expect the patches to be renaming the current implementations and >> adding wrappers, instead of adding new implementations. > > > Well I decided to split the existing functions up and use tag > dispatching, which is conceptually cleaner anyway. But as the > RandomAccessIterator version doesn't need any operations that aren't > valid for other categories, it's not strictly necessary. The tag > dispatching version should generate slightly smaller code for > unoptimized builds, but that's not very important. Unoptimized builds don't inline small functions, therefore the first patch generate two weak symbols, instead of one by the second patch. It's unclear to me how would number of symbols penalize the performance/binary size. > > Here's the patch doing it as you suggest. We can't call the new > functions __equal because t hat name is already taken by a helper > struct, hence __equal4. > > Do you prefer this version? Yes, I prefer this version for readability reasons: 1) subjectively, less scattered code; and 2) ideally I want `if constexpr (...)`), the if version is closer. I agree that it's not a big difference. I just wanted to point out the small difference. I'm fine with either version. Thanks for the prototyping! -- Regards, Tim Shen
Re: [libstdc++/71500] make back reference work with icase
On 18/09/17 10:58 -0700, Tim Shen via libstdc++ wrote: On Mon, Sep 18, 2017 at 10:26 AM, Jonathan Wakely wrote: We need to rewrite this to check the lengths are equal first, and then call the 3-argument version of std::equal. Alternatively, we could move the implementation of the C++14 std::equal overloads to __equal and make that available for C++11. I'll try that. Here's a proof of concept patch for that. It's a bit ugly. Instead of having iterator tags in the interface, we can probe the random-access-ness inside __equal4/__equal4_p, can't we? It's similar to the existing "if (_RAIters()) { ... }". I'd expect the patches to be renaming the current implementations and adding wrappers, instead of adding new implementations. Well I decided to split the existing functions up and use tag dispatching, which is conceptually cleaner anyway. But as the RandomAccessIterator version doesn't need any operations that aren't valid for other categories, it's not strictly necessary. The tag dispatching version should generate slightly smaller code for unoptimized builds, but that's not very important. Here's the patch doing it as you suggest. We can't call the new functions __equal because t hat name is already taken by a helper struct, hence __equal4. Do you prefer this version? diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc index f6149fecf9d..2ceba35e7b8 100644 --- a/libstdc++-v3/include/bits/regex_executor.tcc +++ b/libstdc++-v3/include/bits/regex_executor.tcc @@ -366,17 +366,17 @@ namespace __detail _BiIter __actual_end) { if (!_M_icase) - return std::equal(__expected_begin, __expected_end, - __actual_begin, __actual_end); + return std::__equal4(__expected_begin, __expected_end, + __actual_begin, __actual_end); typedef std::ctype<_CharT> __ctype_type; const auto& __fctyp = use_facet<__ctype_type>(_M_traits.getloc()); - return std::equal(__expected_begin, __expected_end, - __actual_begin, __actual_end, - [this, &__fctyp](_CharT __lhs, _CharT __rhs) - { - return __fctyp.tolower(__lhs) -== __fctyp.tolower(__rhs); - }); + return std::__equal4(__expected_begin, __expected_end, + __actual_begin, __actual_end, + [this, &__fctyp](_CharT __lhs, _CharT __rhs) + { + return __fctyp.tolower(__lhs) + == __fctyp.tolower(__rhs); + }); } bool _M_icase; diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h index f68ecb22b82..ff5e94d9ae8 100644 --- a/libstdc++-v3/include/bits/stl_algobase.h +++ b/libstdc++-v3/include/bits/stl_algobase.h @@ -1082,6 +1082,58 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO return true; } +#if __cplusplus >= 201103L + template +inline bool +__equal4(_II1 __first1, _II1 __last1, _II2 __first2, _II2 __last2) +{ + using _RATag = random_access_iterator_tag; + using _Cat1 = typename iterator_traits<_II1>::iterator_category; + using _Cat2 = typename iterator_traits<_II2>::iterator_category; + using _RAIters = __and_, is_same<_Cat2, _RATag>>; + if (_RAIters()) + { + auto __d1 = std::distance(__first1, __last1); + auto __d2 = std::distance(__first2, __last2); + if (__d1 != __d2) + return false; + return _GLIBCXX_STD_A::equal(__first1, __last1, __first2); + } + + for (; __first1 != __last1 && __first2 != __last2; + ++__first1, (void)++__first2) + if (!(*__first1 == *__first2)) + return false; + return __first1 == __last1 && __first2 == __last2; +} + + template +inline bool +__equal4(_II1 __first1, _II1 __last1, _II2 __first2, _II2 __last2, + _BinaryPredicate __binary_pred) +{ + using _RATag = random_access_iterator_tag; + using _Cat1 = typename iterator_traits<_II1>::iterator_category; + using _Cat2 = typename iterator_traits<_II2>::iterator_category; + using _RAIters = __and_, is_same<_Cat2, _RATag>>; + if (_RAIters()) + { + auto __d1 = std::distance(__first1, __last1); + auto __d2 = std::distance(__first2, __last2); + if (__d1 != __d2) + return false; + return _GLIBCXX_STD_A::equal(__first1, __last1, __first2, + __binary_pred); + } + + for (; __first1 != __last1 && __first2 != __last2; + ++__first1, (void)++__first2) + if (!bool(__binary_pred(*__first1, *__first2))) + return false; + return __first1 == __last1 && __first2 == __last2; +} +#endif // C++11 + #if __cplusplus > 201103L #define __cpp_lib_robust_nonmodifying_seq_ops 201304 @@ -1112,24 +1164,7 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO __glibcxx_requires_valid_range(__first1, __last1); __glibcxx_requires_valid_range(__first2, __last2); - using _RATag = random_access_iterator_tag; - using _Cat1 = typename iterator_traits<_II1>::iterator_category; - using _Cat2 = typename iterator_traits<_II2>::iterator_category; - using _RAIters = __and_, i
Re: [libstdc++/71500] make back reference work with icase
On Mon, Sep 18, 2017 at 10:26 AM, Jonathan Wakely wrote: >> We need to rewrite this to check the lengths are equal first, and then >> call the 3-argument version of std::equal. >> >> Alternatively, we could move the implementation of the C++14 >> std::equal overloads to __equal and make that available for C++11. >> I'll try that. > > > Here's a proof of concept patch for that. It's a bit ugly. Instead of having iterator tags in the interface, we can probe the random-access-ness inside __equal4/__equal4_p, can't we? It's similar to the existing "if (_RAIters()) { ... }". I'd expect the patches to be renaming the current implementations and adding wrappers, instead of adding new implementations. -- Regards, Tim Shen
Re: [libstdc++/71500] make back reference work with icase
On 15/09/17 16:39 +0100, Jonathan Wakely wrote: On 04/09/17 03:31 -0700, Tim Shen via libstdc++ wrote: This fixes the follow-up comments in 71500. Back-reference matching is different from other matching, as the content the back-reference refers to is at "run-time", aka during regex_match(), not regex() compilation. For compilation we do have an abstraction layer to catch all comparison customizations, namely _M_translator in regex_compiler.h. Until this patch, we don't have an abstraction for "run-time" matching. I believe that back-reference is the only place that needs run-time matching, so I just build a _Backref_matcher in regex_executot.tcc. Tested on x86_64-linux-gnu. Thanks! -- Regards, Tim Shen commit a97b7fecd319e031ffc489a956b8cf3dc63eeb26 Author: Tim Shen Date: Mon Sep 4 03:19:35 2017 -0700 PR libstdc++/71500 * include/bits/regex_executor.tcc: Support icase in regex_tratis<...> for back reference matches. * testsuite/28_regex/regression.cc: Test case. diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc index 226e05856e1..f6149fecf9d 100644 --- a/libstdc++-v3/include/bits/regex_executor.tcc +++ b/libstdc++-v3/include/bits/regex_executor.tcc @@ -335,6 +335,54 @@ namespace __detail _M_states._M_queue(__state._M_next, _M_cur_results); } + template +struct _Backref_matcher +{ + _Backref_matcher(bool __icase, const _TraitsT& __traits) + : _M_traits(__traits) { } + + bool + _M_apply(_BiIter __expected_begin, + _BiIter __expected_end, _BiIter __actual_begin, + _BiIter __actual_end) + { + return _M_traits.transform(__expected_begin, __expected_end) + == _M_traits.transform(__actual_begin, __actual_end); + } + + const _TraitsT& _M_traits; +}; + + template +struct _Backref_matcher<_BiIter, std::regex_traits<_CharT>> +{ + using _TraitsT = std::regex_traits<_CharT>; + _Backref_matcher(bool __icase, const _TraitsT& __traits) + : _M_icase(__icase), _M_traits(__traits) { } + + bool + _M_apply(_BiIter __expected_begin, + _BiIter __expected_end, _BiIter __actual_begin, + _BiIter __actual_end) + { + if (!_M_icase) + return std::equal(__expected_begin, __expected_end, + __actual_begin, __actual_end); This is only valid in C++14 and higher, because the 4-argument version of std::equal isn't present in C++11. + typedef std::ctype<_CharT> __ctype_type; + const auto& __fctyp = use_facet<__ctype_type>(_M_traits.getloc()); + return std::equal(__expected_begin, __expected_end, + __actual_begin, __actual_end, Same here. + [this, &__fctyp](_CharT __lhs, _CharT __rhs) + { + return __fctyp.tolower(__lhs) + == __fctyp.tolower(__rhs); + }); We need to rewrite this to check the lengths are equal first, and then call the 3-argument version of std::equal. Alternatively, we could move the implementation of the C++14 std::equal overloads to __equal and make that available for C++11. I'll try that. Here's a proof of concept patch for that. It's a bit ugly. diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc index f6149fecf9d..4b185cc9d1e 100644 --- a/libstdc++-v3/include/bits/regex_executor.tcc +++ b/libstdc++-v3/include/bits/regex_executor.tcc @@ -366,17 +366,21 @@ namespace __detail _BiIter __actual_end) { if (!_M_icase) - return std::equal(__expected_begin, __expected_end, - __actual_begin, __actual_end); + return std::__equal4(__expected_begin, __expected_end, + __actual_begin, __actual_end, + std::__iterator_category(__expected_begin), + std::__iterator_category(__actual_begin)); typedef std::ctype<_CharT> __ctype_type; const auto& __fctyp = use_facet<__ctype_type>(_M_traits.getloc()); - return std::equal(__expected_begin, __expected_end, - __actual_begin, __actual_end, - [this, &__fctyp](_CharT __lhs, _CharT __rhs) - { - return __fctyp.tolower(__lhs) -== __fctyp.tolower(__rhs); - }); + return std::__equal4_p(__expected_begin, __expected_end, + __actual_begin, __actual_end, + [this, &__fctyp](_CharT __lhs, _CharT __rhs) + { + return __fctyp.tolower(__lhs) + == __fctyp.tolower(__rhs); + }, + std::__iterator_category(__expected_begin), + std::__iterator_category(__actual_begin)); } bool _M_icase; diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h index f68ecb22b82..b7848b3de99 100644 --- a/libstdc++-v3/include/bits/stl_algobase.h +++ b/libstdc++-v3/include/bits/stl_algobase.h
Re: [libstdc++/71500] make back reference work with icase
On 04/09/17 03:31 -0700, Tim Shen via libstdc++ wrote: This fixes the follow-up comments in 71500. Back-reference matching is different from other matching, as the content the back-reference refers to is at "run-time", aka during regex_match(), not regex() compilation. For compilation we do have an abstraction layer to catch all comparison customizations, namely _M_translator in regex_compiler.h. Until this patch, we don't have an abstraction for "run-time" matching. I believe that back-reference is the only place that needs run-time matching, so I just build a _Backref_matcher in regex_executot.tcc. Tested on x86_64-linux-gnu. Thanks! -- Regards, Tim Shen commit a97b7fecd319e031ffc489a956b8cf3dc63eeb26 Author: Tim Shen Date: Mon Sep 4 03:19:35 2017 -0700 PR libstdc++/71500 * include/bits/regex_executor.tcc: Support icase in regex_tratis<...> for back reference matches. * testsuite/28_regex/regression.cc: Test case. diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc index 226e05856e1..f6149fecf9d 100644 --- a/libstdc++-v3/include/bits/regex_executor.tcc +++ b/libstdc++-v3/include/bits/regex_executor.tcc @@ -335,6 +335,54 @@ namespace __detail _M_states._M_queue(__state._M_next, _M_cur_results); } + template +struct _Backref_matcher +{ + _Backref_matcher(bool __icase, const _TraitsT& __traits) + : _M_traits(__traits) { } + + bool + _M_apply(_BiIter __expected_begin, + _BiIter __expected_end, _BiIter __actual_begin, + _BiIter __actual_end) + { + return _M_traits.transform(__expected_begin, __expected_end) + == _M_traits.transform(__actual_begin, __actual_end); + } + + const _TraitsT& _M_traits; +}; + + template +struct _Backref_matcher<_BiIter, std::regex_traits<_CharT>> +{ + using _TraitsT = std::regex_traits<_CharT>; + _Backref_matcher(bool __icase, const _TraitsT& __traits) + : _M_icase(__icase), _M_traits(__traits) { } + + bool + _M_apply(_BiIter __expected_begin, + _BiIter __expected_end, _BiIter __actual_begin, + _BiIter __actual_end) + { + if (!_M_icase) + return std::equal(__expected_begin, __expected_end, + __actual_begin, __actual_end); This is only valid in C++14 and higher, because the 4-argument version of std::equal isn't present in C++11. + typedef std::ctype<_CharT> __ctype_type; + const auto& __fctyp = use_facet<__ctype_type>(_M_traits.getloc()); + return std::equal(__expected_begin, __expected_end, + __actual_begin, __actual_end, Same here. + [this, &__fctyp](_CharT __lhs, _CharT __rhs) + { + return __fctyp.tolower(__lhs) + == __fctyp.tolower(__rhs); + }); We need to rewrite this to check the lengths are equal first, and then call the 3-argument version of std::equal. Alternatively, we could move the implementation of the C++14 std::equal overloads to __equal and make that available for C++11. I'll try that.
Re: [libstdc++/71500] make back reference work with icase
On 04/09/17 03:31 -0700, Tim Shen via libstdc++ wrote: This fixes the follow-up comments in 71500. Back-reference matching is different from other matching, as the content the back-reference refers to is at "run-time", aka during regex_match(), not regex() compilation. For compilation we do have an abstraction layer to catch all comparison customizations, namely _M_translator in regex_compiler.h. Until this patch, we don't have an abstraction for "run-time" matching. I believe that back-reference is the only place that needs run-time matching, so I just build a _Backref_matcher in regex_executot.tcc. Looks like a nice solution. OK for trunk, thanks. I think this looks safe to backport too, but let's leave it on trunk for a while first.
[libstdc++/71500] make back reference work with icase
This fixes the follow-up comments in 71500. Back-reference matching is different from other matching, as the content the back-reference refers to is at "run-time", aka during regex_match(), not regex() compilation. For compilation we do have an abstraction layer to catch all comparison customizations, namely _M_translator in regex_compiler.h. Until this patch, we don't have an abstraction for "run-time" matching. I believe that back-reference is the only place that needs run-time matching, so I just build a _Backref_matcher in regex_executot.tcc. Tested on x86_64-linux-gnu. Thanks! -- Regards, Tim Shen commit a97b7fecd319e031ffc489a956b8cf3dc63eeb26 Author: Tim Shen Date: Mon Sep 4 03:19:35 2017 -0700 PR libstdc++/71500 * include/bits/regex_executor.tcc: Support icase in regex_tratis<...> for back reference matches. * testsuite/28_regex/regression.cc: Test case. diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc index 226e05856e1..f6149fecf9d 100644 --- a/libstdc++-v3/include/bits/regex_executor.tcc +++ b/libstdc++-v3/include/bits/regex_executor.tcc @@ -335,6 +335,54 @@ namespace __detail _M_states._M_queue(__state._M_next, _M_cur_results); } + template +struct _Backref_matcher +{ + _Backref_matcher(bool __icase, const _TraitsT& __traits) + : _M_traits(__traits) { } + + bool + _M_apply(_BiIter __expected_begin, + _BiIter __expected_end, _BiIter __actual_begin, + _BiIter __actual_end) + { + return _M_traits.transform(__expected_begin, __expected_end) + == _M_traits.transform(__actual_begin, __actual_end); + } + + const _TraitsT& _M_traits; +}; + + template +struct _Backref_matcher<_BiIter, std::regex_traits<_CharT>> +{ + using _TraitsT = std::regex_traits<_CharT>; + _Backref_matcher(bool __icase, const _TraitsT& __traits) + : _M_icase(__icase), _M_traits(__traits) { } + + bool + _M_apply(_BiIter __expected_begin, + _BiIter __expected_end, _BiIter __actual_begin, + _BiIter __actual_end) + { + if (!_M_icase) + return std::equal(__expected_begin, __expected_end, + __actual_begin, __actual_end); + typedef std::ctype<_CharT> __ctype_type; + const auto& __fctyp = use_facet<__ctype_type>(_M_traits.getloc()); + return std::equal(__expected_begin, __expected_end, + __actual_begin, __actual_end, + [this, &__fctyp](_CharT __lhs, _CharT __rhs) + { + return __fctyp.tolower(__lhs) +== __fctyp.tolower(__rhs); + }); + } + + bool _M_icase; + const _TraitsT& _M_traits; +}; + // First fetch the matched result from _M_cur_results as __submatch; // then compare it with // (_M_current, _M_current + (__submatch.second - __submatch.first)). @@ -355,9 +403,10 @@ namespace __detail __last != _M_end && __tmp != __submatch.second; ++__tmp) ++__last; - if (_M_re._M_automaton->_M_traits.transform(__submatch.first, - __submatch.second) - == _M_re._M_automaton->_M_traits.transform(_M_current, __last)) + if (_Backref_matcher<_BiIter, _TraitsT>( + _M_re.flags() & regex_constants::icase, + _M_re._M_automaton->_M_traits)._M_apply( + __submatch.first, __submatch.second, _M_current, __last)) { if (__last != _M_current) { diff --git a/libstdc++-v3/testsuite/28_regex/regression.cc b/libstdc++-v3/testsuite/28_regex/regression.cc index ee4d3e1e6f8..3fa9022eac4 100644 --- a/libstdc++-v3/testsuite/28_regex/regression.cc +++ b/libstdc++-v3/testsuite/28_regex/regression.cc @@ -93,6 +93,17 @@ test06() } } +// PR libstdc++/71500 +void +test07() +{ + bool test [[gnu::unused]] = true; + + VERIFY(regex_match_debug("abc abc", regex("([a-z]+) \\1", regex::icase))); + VERIFY(regex_match_debug("Abc abc", regex("([a-z]+) \\1", regex::icase))); + VERIFY(regex_match_debug("abc Abc", regex("([a-z]+) \\1", regex::icase))); +} + int main() { @@ -102,6 +113,7 @@ main() test04(); test05(); test06(); + test07(); return 0; }