Re: [libstdc++/71500] make back reference work with icase

2017-09-19 Thread Jonathan Wakely

On 19/09/17 15:38 +0100, Jonathan Wakely wrote:

On 18/09/17 16:54 -0700, Tim Shen wrote:

On Mon, Sep 18, 2017 at 4:01 PM, Jonathan Wakely  wrote:

On 18/09/17 10:58 -0700, Tim Shen via libstdc++ wrote:


On Mon, Sep 18, 2017 at 10:26 AM, Jonathan Wakely 
wrote:


We need to rewrite this to check the lengths are equal first, and then
call the 3-argument version of std::equal.

Alternatively, we could move the implementation of the C++14
std::equal overloads to __equal and make that available for C++11.
I'll try that.




Here's a proof of concept patch for that. It's a bit ugly.



Instead of having iterator tags in the interface, we can probe the
random-access-ness inside __equal4/__equal4_p, can't we? It's similar
to the existing "if (_RAIters()) { ... }".

I'd expect the patches to be renaming the current implementations and
adding wrappers, instead of adding new implementations.



Well I decided to split the existing functions up and use tag
dispatching, which is conceptually cleaner anyway. But as the
RandomAccessIterator version doesn't need any operations that aren't
valid for other categories, it's not strictly necessary. The tag
dispatching version should generate slightly smaller code for
unoptimized builds, but that's not very important.


Unoptimized builds don't inline small functions, so the first patch
generates two weak symbols, where the second patch generates one.


Two small functions that only do the necessary work, rather than one
large function that has a branch for RAIters even when it can never be
taken.


It's unclear to me how the number of symbols would penalize
performance or binary size.


People who care about performance or binary size should be optimizing,
and in that case the RAIters branch will be known at compile-time and
the dead code should get removed, and the wrapper functions inlined.


Here's the patch doing it as you suggest. We can't call the new
functions __equal because that name is already taken by a helper
struct, hence __equal4.

Do you prefer this version?


Yes, I prefer this version for readability reasons:
1) subjectively, less scattered code; and
2) ideally I want `if constexpr (...)`, and the plain `if` version is closer to that.


Yes, we could add _GLIBCXX17_CONSTEXPR there, but I'm not sure it's
worth doing.

3) The calls to __equal4 in _Backref_matcher are simpler.
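
For reference, the _GLIBCXX17_CONSTEXPR idea mentioned above can be
sketched outside the library roughly as follows. This is only an
illustration under assumptions: SKETCH_CXX17_CONSTEXPR and equal4_sketch
are made-up names, not anything in libstdc++, and the body is modelled
on the "if (_RAIters())" approach discussed in this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical macro playing the role _GLIBCXX17_CONSTEXPR plays in
// libstdc++: it expands to `constexpr` in C++17 mode or later and to
// nothing otherwise, so the same source compiles as C++11.
#if __cplusplus >= 201703L
# define SKETCH_CXX17_CONSTEXPR constexpr
#else
# define SKETCH_CXX17_CONSTEXPR
#endif

// Hypothetical stand-in for a four-iterator equality helper.
template<typename It1, typename It2>
bool
equal4_sketch(It1 first1, It1 last1, It2 first2, It2 last2)
{
  typedef std::random_access_iterator_tag RATag;
  typedef typename std::iterator_traits<It1>::iterator_category Cat1;
  typedef typename std::iterator_traits<It2>::iterator_category Cat2;

  // In C++17 this becomes `if constexpr`, so the branch is discarded at
  // compile time when either iterator is not random access. In C++11 it
  // is an ordinary `if` on a compile-time constant.
  if SKETCH_CXX17_CONSTEXPR (std::is_same<Cat1, RATag>::value
                             && std::is_same<Cat2, RATag>::value)
    {
      if (std::distance(first1, last1) != std::distance(first2, last2))
        return false;
      return std::equal(first1, last1, first2);
    }

  // Generic path: walk both ranges in lock step.
  for (; first1 != last1 && first2 != last2; ++first1, (void)++first2)
    if (!(*first1 == *first2))
      return false;
  return first1 == last1 && first2 == last2;
}
```

The random-access branch only uses operations that are valid for every
iterator category, which is why the plain `if` still compiles in C++11.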


I agree that it's not a big difference. I just wanted to point out the
small difference. I'm fine with either version.


I'll commit the second version.


Here's what I've committed, with a minimal test to catch this
happening in future.

I'll re-run the full set of test variations.


commit 371c5de025c0fc95420d96bf96f3da84e3725c9d
Author: Jonathan Wakely 
Date:   Tue Sep 19 17:36:52 2017 +0100

PR libstdc++/71500 restore C++11 compatibility in <regex>

PR libstdc++/71500
* include/bits/regex_executor.tcc
(_Backref_matcher<_BiIter, std::regex_traits<_CharT>>::_M_apply): Use
std::__equal4 instead of C++14 4-iterator overloads of std::equal.
* include/bits/stl_algobase.h (__equal4): New functions implementing
4-iterator overloads of std::equal for use in C++11.
(equal(It1, It1, It2, It2), equal(It1, It1, It2, It2, BinaryPred)):
Move function bodies to new __equal4 functions.
* testsuite/28_regex/simple_c++11.cc: New.

diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc
index f6149fecf9d..2ceba35e7b8 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -366,17 +366,17 @@ namespace __detail
 	   _BiIter __actual_end)
   {
 	if (!_M_icase)
-	  return std::equal(__expected_begin, __expected_end,
-			__actual_begin, __actual_end);
+	  return std::__equal4(__expected_begin, __expected_end,
+			   __actual_begin, __actual_end);
 	typedef std::ctype<_CharT> __ctype_type;
 	const auto& __fctyp = use_facet<__ctype_type>(_M_traits.getloc());
-	return std::equal(__expected_begin, __expected_end,
-			  __actual_begin, __actual_end,
-			  [this, &__fctyp](_CharT __lhs, _CharT __rhs)
-			  {
-			return __fctyp.tolower(__lhs)
-== __fctyp.tolower(__rhs);
-			  });
+	return std::__equal4(__expected_begin, __expected_end,
+			 __actual_begin, __actual_end,
+			 [this, &__fctyp](_CharT __lhs, _CharT __rhs)
+			 {
+			   return __fctyp.tolower(__lhs)
+ == __fctyp.tolower(__rhs);
+			 });
   }
 
   bool _M_icase;
diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
index f68ecb22b82..a80934c4faa 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -1082,6 +1082,60 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
   return true;
 }
 
+#if __cplusplus >= 201103L
+  // 4-iterator version of std::equal for use in C++11.
+  template<typename _II1, typename _II2>
+inline bool
+__equal4(_II1 __first1, _II1 __last1, _II2 __first2,

Re: [libstdc++/71500] make back reference work with icase

2017-09-19 Thread Jonathan Wakely
The failures that need to be fixed can be seen at 
https://gcc.gnu.org/ml/gcc-testresults/2017-09/msg01633.html





Re: [libstdc++/71500] make back reference work with icase

2017-09-19 Thread Jonathan Wakely

On 18/09/17 16:54 -0700, Tim Shen wrote:

On Mon, Sep 18, 2017 at 4:01 PM, Jonathan Wakely  wrote:

On 18/09/17 10:58 -0700, Tim Shen via libstdc++ wrote:


On Mon, Sep 18, 2017 at 10:26 AM, Jonathan Wakely 
wrote:


We need to rewrite this to check the lengths are equal first, and then
call the 3-argument version of std::equal.

Alternatively, we could move the implementation of the C++14
std::equal overloads to __equal and make that available for C++11.
I'll try that.




Here's a proof of concept patch for that. It's a bit ugly.



Instead of having iterator tags in the interface, we can probe the
random-access-ness inside __equal4/__equal4_p, can't we? It's similar
to the existing "if (_RAIters()) { ... }".

I'd expect the patches to be renaming the current implementations and
adding wrappers, instead of adding new implementations.



Well I decided to split the existing functions up and use tag
dispatching, which is conceptually cleaner anyway. But as the
RandomAccessIterator version doesn't need any operations that aren't
valid for other categories, it's not strictly necessary. The tag
dispatching version should generate slightly smaller code for
unoptimized builds, but that's not very important.


Unoptimized builds don't inline small functions, so the first patch
generates two weak symbols, where the second patch generates one.


Two small functions that only do the necessary work, rather than one
large function that has a branch for RAIters even when it can never be
taken.


It's unclear to me how the number of symbols would penalize
performance or binary size.


People who care about performance or binary size should be optimizing,
and in that case the RAIters branch will be known at compile-time and
the dead code should get removed, and the wrapper functions inlined.


Here's the patch doing it as you suggest. We can't call the new
functions __equal because that name is already taken by a helper
struct, hence __equal4.

Do you prefer this version?


Yes, I prefer this version for readability reasons:
1) subjectively, less scattered code; and
2) ideally I want `if constexpr (...)`, and the plain `if` version is closer to that.


Yes, we could add _GLIBCXX17_CONSTEXPR there, but I'm not sure it's
worth doing.

3) The calls to __equal4 in _Backref_matcher are simpler.


I agree that it's not a big difference. I just wanted to point out the
small difference. I'm fine with either version.


I'll commit the second version.


Thanks for the prototyping!


--
Regards,
Tim Shen


Re: [libstdc++/71500] make back reference work with icase

2017-09-18 Thread Tim Shen via gcc-patches
On Mon, Sep 18, 2017 at 4:01 PM, Jonathan Wakely  wrote:
> On 18/09/17 10:58 -0700, Tim Shen via libstdc++ wrote:
>>
>> On Mon, Sep 18, 2017 at 10:26 AM, Jonathan Wakely 
>> wrote:

>>>> We need to rewrite this to check the lengths are equal first, and then
>>>> call the 3-argument version of std::equal.
>>>>
>>>> Alternatively, we could move the implementation of the C++14
>>>> std::equal overloads to __equal and make that available for C++11.
>>>> I'll try that.
>>>
>>>
>>>
>>> Here's a proof of concept patch for that. It's a bit ugly.
>>
>>
>> Instead of having iterator tags in the interface, we can probe the
>> random-access-ness inside __equal4/__equal4_p, can't we? It's similar
>> to the existing "if (_RAIters()) { ... }".
>>
>> I'd expect the patches to be renaming the current implementations and
>> adding wrappers, instead of adding new implementations.
>
>
> Well I decided to split the existing functions up and use tag
> dispatching, which is conceptually cleaner anyway. But as the
> RandomAccessIterator version doesn't need any operations that aren't
> valid for other categories, it's not strictly necessary. The tag
> dispatching version should generate slightly smaller code for
> unoptimized builds, but that's not very important.

Unoptimized builds don't inline small functions, so the first patch
generates two weak symbols, where the second patch generates one.
It's unclear to me how the number of symbols would penalize
performance or binary size.

>
> Here's the patch doing it as you suggest. We can't call the new
> functions __equal because that name is already taken by a helper
> struct, hence __equal4.
>
> Do you prefer this version?

Yes, I prefer this version for readability reasons:
1) subjectively, less scattered code; and
2) ideally I want `if constexpr (...)`, and the plain `if` version is closer to that.

I agree that it's not a big difference. I just wanted to point out the
small difference. I'm fine with either version.

Thanks for the prototyping!


-- 
Regards,
Tim Shen


Re: [libstdc++/71500] make back reference work with icase

2017-09-18 Thread Jonathan Wakely

On 18/09/17 10:58 -0700, Tim Shen via libstdc++ wrote:

On Mon, Sep 18, 2017 at 10:26 AM, Jonathan Wakely  wrote:

We need to rewrite this to check the lengths are equal first, and then
call the 3-argument version of std::equal.

Alternatively, we could move the implementation of the C++14
std::equal overloads to __equal and make that available for C++11.
I'll try that.



Here's a proof of concept patch for that. It's a bit ugly.


Instead of having iterator tags in the interface, we can probe the
random-access-ness inside __equal4/__equal4_p, can't we? It's similar
to the existing "if (_RAIters()) { ... }".

I'd expect the patches to be renaming the current implementations and
adding wrappers, instead of adding new implementations.


Well I decided to split the existing functions up and use tag
dispatching, which is conceptually cleaner anyway. But as the
RandomAccessIterator version doesn't need any operations that aren't
valid for other categories, it's not strictly necessary. The tag
dispatching version should generate slightly smaller code for
unoptimized builds, but that's not very important.
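
To illustrate what tag dispatching means here, a rough sketch follows.
It is not the proof-of-concept patch itself: equal4_tagged and
equal4_tagged_impl are hypothetical names chosen for the example, and
only the non-predicate form is shown.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical names throughout; this only illustrates the technique.

// Chosen when both iterators are random access: compare the lengths in
// constant time, then defer to the three-iterator std::equal.
template<typename It1, typename It2>
bool
equal4_tagged_impl(It1 first1, It1 last1, It2 first2, It2 last2,
                   std::random_access_iterator_tag,
                   std::random_access_iterator_tag)
{
  if (last1 - first1 != last2 - first2)
    return false;
  return std::equal(first1, last1, first2);
}

// Chosen for every other combination of categories, because every
// standard iterator tag converts to input_iterator_tag.
template<typename It1, typename It2>
bool
equal4_tagged_impl(It1 first1, It1 last1, It2 first2, It2 last2,
                   std::input_iterator_tag, std::input_iterator_tag)
{
  for (; first1 != last1 && first2 != last2; ++first1, (void)++first2)
    if (!(*first1 == *first2))
      return false;
  return first1 == last1 && first2 == last2;
}

// Wrapper that selects an overload from the iterator categories.
template<typename It1, typename It2>
bool
equal4_tagged(It1 first1, It1 last1, It2 first2, It2 last2)
{
  typedef typename std::iterator_traits<It1>::iterator_category Cat1;
  typedef typename std::iterator_traits<It2>::iterator_category Cat2;
  return equal4_tagged_impl(first1, last1, first2, last2, Cat1(), Cat2());
}
```

Each overload contains only the code its iterator category needs, which
is the "slightly smaller code for unoptimized builds" point above.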

Here's the patch doing it as you suggest. We can't call the new
functions __equal because that name is already taken by a helper
struct, hence __equal4.

Do you prefer this version?


diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc
index f6149fecf9d..2ceba35e7b8 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -366,17 +366,17 @@ namespace __detail
 	   _BiIter __actual_end)
   {
 	if (!_M_icase)
-	  return std::equal(__expected_begin, __expected_end,
-			__actual_begin, __actual_end);
+	  return std::__equal4(__expected_begin, __expected_end,
+			   __actual_begin, __actual_end);
 	typedef std::ctype<_CharT> __ctype_type;
 	const auto& __fctyp = use_facet<__ctype_type>(_M_traits.getloc());
-	return std::equal(__expected_begin, __expected_end,
-			  __actual_begin, __actual_end,
-			  [this, &__fctyp](_CharT __lhs, _CharT __rhs)
-			  {
-			return __fctyp.tolower(__lhs)
-== __fctyp.tolower(__rhs);
-			  });
+	return std::__equal4(__expected_begin, __expected_end,
+			 __actual_begin, __actual_end,
+			 [this, &__fctyp](_CharT __lhs, _CharT __rhs)
+			 {
+			   return __fctyp.tolower(__lhs)
+ == __fctyp.tolower(__rhs);
+			 });
   }
 
   bool _M_icase;
diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
index f68ecb22b82..ff5e94d9ae8 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -1082,6 +1082,58 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
   return true;
 }
 
+#if __cplusplus >= 201103L
+  template<typename _II1, typename _II2>
+inline bool
+__equal4(_II1 __first1, _II1 __last1, _II2 __first2, _II2 __last2)
+{
+  using _RATag = random_access_iterator_tag;
+  using _Cat1 = typename iterator_traits<_II1>::iterator_category;
+  using _Cat2 = typename iterator_traits<_II2>::iterator_category;
+  using _RAIters = __and_<is_same<_Cat1, _RATag>, is_same<_Cat2, _RATag>>;
+  if (_RAIters())
+	{
+	  auto __d1 = std::distance(__first1, __last1);
+	  auto __d2 = std::distance(__first2, __last2);
+	  if (__d1 != __d2)
+	return false;
+	  return _GLIBCXX_STD_A::equal(__first1, __last1, __first2);
+	}
+
+  for (; __first1 != __last1 && __first2 != __last2;
+	  ++__first1, (void)++__first2)
+	if (!(*__first1 == *__first2))
+	  return false;
+  return __first1 == __last1 && __first2 == __last2;
+}
+
+  template<typename _II1, typename _II2, typename _BinaryPredicate>
+inline bool
+__equal4(_II1 __first1, _II1 __last1, _II2 __first2, _II2 __last2,
+	 _BinaryPredicate __binary_pred)
+{
+  using _RATag = random_access_iterator_tag;
+  using _Cat1 = typename iterator_traits<_II1>::iterator_category;
+  using _Cat2 = typename iterator_traits<_II2>::iterator_category;
+  using _RAIters = __and_<is_same<_Cat1, _RATag>, is_same<_Cat2, _RATag>>;
+  if (_RAIters())
+	{
+	  auto __d1 = std::distance(__first1, __last1);
+	  auto __d2 = std::distance(__first2, __last2);
+	  if (__d1 != __d2)
+	return false;
+	  return _GLIBCXX_STD_A::equal(__first1, __last1, __first2,
+   __binary_pred);
+	}
+
+  for (; __first1 != __last1 && __first2 != __last2;
+	  ++__first1, (void)++__first2)
+	if (!bool(__binary_pred(*__first1, *__first2)))
+	  return false;
+  return __first1 == __last1 && __first2 == __last2;
+}
+#endif // C++11
+
 #if __cplusplus > 201103L
 
 #define __cpp_lib_robust_nonmodifying_seq_ops 201304
@@ -1112,24 +1164,7 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
   __glibcxx_requires_valid_range(__first1, __last1);
   __glibcxx_requires_valid_range(__first2, __last2);
 
-  using _RATag = random_access_iterator_tag;
-  using _Cat1 = typename iterator_traits<_II1>::iterator_category;
-  using _Cat2 = typename iterator_traits<_II2>::iterator_category;
-  using _RAIters = __and_, i

Re: [libstdc++/71500] make back reference work with icase

2017-09-18 Thread Tim Shen via gcc-patches
On Mon, Sep 18, 2017 at 10:26 AM, Jonathan Wakely  wrote:
>> We need to rewrite this to check the lengths are equal first, and then
>> call the 3-argument version of std::equal.
>>
>> Alternatively, we could move the implementation of the C++14
>> std::equal overloads to __equal and make that available for C++11.
>> I'll try that.
>
>
> Here's a proof of concept patch for that. It's a bit ugly.

Instead of having iterator tags in the interface, we can probe the
random-access-ness inside __equal4/__equal4_p, can't we? It's similar
to the existing "if (_RAIters()) { ... }".

I'd expect the patches to be renaming the current implementations and
adding wrappers, instead of adding new implementations.


-- 
Regards,
Tim Shen


Re: [libstdc++/71500] make back reference work with icase

2017-09-18 Thread Jonathan Wakely

On 15/09/17 16:39 +0100, Jonathan Wakely wrote:

On 04/09/17 03:31 -0700, Tim Shen via libstdc++ wrote:

This fixes the follow-up comments in 71500.

Back-reference matching is different from other matching, as the
content the back-reference refers to is at "run-time", aka during
regex_match(), not regex() compilation.

For compilation we do have an abstraction layer to catch all
comparison customizations, namely _M_translator in regex_compiler.h.
Until this patch, we don't have an abstraction for "run-time"
matching. I believe that back-reference is the only place that needs
run-time matching, so I just build a _Backref_matcher in
regex_executor.tcc.

Tested on x86_64-linux-gnu.

Thanks!

--
Regards,
Tim Shen



commit a97b7fecd319e031ffc489a956b8cf3dc63eeb26
Author: Tim Shen 
Date:   Mon Sep 4 03:19:35 2017 -0700

  PR libstdc++/71500
  * include/bits/regex_executor.tcc: Support icase in
  regex_traits<...> for back reference matches.
  * testsuite/28_regex/regression.cc: Test case.

diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc
index 226e05856e1..f6149fecf9d 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -335,6 +335,54 @@ namespace __detail
  _M_states._M_queue(__state._M_next, _M_cur_results);
   }

+  template<typename _BiIter, typename _TraitsT>
+struct _Backref_matcher
+{
+  _Backref_matcher(bool __icase, const _TraitsT& __traits)
+  : _M_traits(__traits) { }
+
+  bool
+  _M_apply(_BiIter __expected_begin,
+  _BiIter __expected_end, _BiIter __actual_begin,
+  _BiIter __actual_end)
+  {
+   return _M_traits.transform(__expected_begin, __expected_end)
+   == _M_traits.transform(__actual_begin, __actual_end);
+  }
+
+  const _TraitsT& _M_traits;
+};
+
+  template<typename _BiIter, typename _CharT>
+struct _Backref_matcher<_BiIter, std::regex_traits<_CharT>>
+{
+  using _TraitsT = std::regex_traits<_CharT>;
+  _Backref_matcher(bool __icase, const _TraitsT& __traits)
+  : _M_icase(__icase), _M_traits(__traits) { }
+
+  bool
+  _M_apply(_BiIter __expected_begin,
+  _BiIter __expected_end, _BiIter __actual_begin,
+  _BiIter __actual_end)
+  {
+   if (!_M_icase)
+ return std::equal(__expected_begin, __expected_end,
+   __actual_begin, __actual_end);


This is only valid in C++14 and higher, because the 4-argument version
of std::equal isn't present in C++11.


+   typedef std::ctype<_CharT> __ctype_type;
+   const auto& __fctyp = use_facet<__ctype_type>(_M_traits.getloc());
+   return std::equal(__expected_begin, __expected_end,
+ __actual_begin, __actual_end,


Same here.


+ [this, &__fctyp](_CharT __lhs, _CharT __rhs)
+ {
+   return __fctyp.tolower(__lhs)
+   == __fctyp.tolower(__rhs);
+ });


We need to rewrite this to check the lengths are equal first, and then
call the 3-argument version of std::equal.

Alternatively, we could move the implementation of the C++14
std::equal overloads to __equal and make that available for C++11.
I'll try that.


Here's a proof of concept patch for that. It's a bit ugly.


diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc
index f6149fecf9d..4b185cc9d1e 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -366,17 +366,21 @@ namespace __detail
 	   _BiIter __actual_end)
   {
 	if (!_M_icase)
-	  return std::equal(__expected_begin, __expected_end,
-			__actual_begin, __actual_end);
+	  return std::__equal4(__expected_begin, __expected_end,
+			   __actual_begin, __actual_end,
+			   std::__iterator_category(__expected_begin),
+			   std::__iterator_category(__actual_begin));
 	typedef std::ctype<_CharT> __ctype_type;
 	const auto& __fctyp = use_facet<__ctype_type>(_M_traits.getloc());
-	return std::equal(__expected_begin, __expected_end,
-			  __actual_begin, __actual_end,
-			  [this, &__fctyp](_CharT __lhs, _CharT __rhs)
-			  {
-			return __fctyp.tolower(__lhs)
-== __fctyp.tolower(__rhs);
-			  });
+	return std::__equal4_p(__expected_begin, __expected_end,
+			   __actual_begin, __actual_end,
+			   [this, &__fctyp](_CharT __lhs, _CharT __rhs)
+			   {
+ return __fctyp.tolower(__lhs)
+   == __fctyp.tolower(__rhs);
+			   },
+			   std::__iterator_category(__expected_begin),
+			   std::__iterator_category(__actual_begin));
   }
 
   bool _M_icase;
diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
index f68ecb22b82..b7848b3de99 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h

Re: [libstdc++/71500] make back reference work with icase

2017-09-15 Thread Jonathan Wakely

On 04/09/17 03:31 -0700, Tim Shen via libstdc++ wrote:

This fixes the follow-up comments in 71500.

Back-reference matching is different from other matching, as the
content the back-reference refers to is at "run-time", aka during
regex_match(), not regex() compilation.

For compilation we do have an abstraction layer to catch all
comparison customizations, namely _M_translator in regex_compiler.h.
Until this patch, we don't have an abstraction for "run-time"
matching. I believe that back-reference is the only place that needs
run-time matching, so I just build a _Backref_matcher in
regex_executor.tcc.

Tested on x86_64-linux-gnu.

Thanks!

--
Regards,
Tim Shen



commit a97b7fecd319e031ffc489a956b8cf3dc63eeb26
Author: Tim Shen 
Date:   Mon Sep 4 03:19:35 2017 -0700

   PR libstdc++/71500
   * include/bits/regex_executor.tcc: Support icase in
   regex_traits<...> for back reference matches.
   * testsuite/28_regex/regression.cc: Test case.

diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc
index 226e05856e1..f6149fecf9d 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -335,6 +335,54 @@ namespace __detail
  _M_states._M_queue(__state._M_next, _M_cur_results);
}

+  template<typename _BiIter, typename _TraitsT>
+struct _Backref_matcher
+{
+  _Backref_matcher(bool __icase, const _TraitsT& __traits)
+  : _M_traits(__traits) { }
+
+  bool
+  _M_apply(_BiIter __expected_begin,
+  _BiIter __expected_end, _BiIter __actual_begin,
+  _BiIter __actual_end)
+  {
+   return _M_traits.transform(__expected_begin, __expected_end)
+   == _M_traits.transform(__actual_begin, __actual_end);
+  }
+
+  const _TraitsT& _M_traits;
+};
+
+  template<typename _BiIter, typename _CharT>
+struct _Backref_matcher<_BiIter, std::regex_traits<_CharT>>
+{
+  using _TraitsT = std::regex_traits<_CharT>;
+  _Backref_matcher(bool __icase, const _TraitsT& __traits)
+  : _M_icase(__icase), _M_traits(__traits) { }
+
+  bool
+  _M_apply(_BiIter __expected_begin,
+  _BiIter __expected_end, _BiIter __actual_begin,
+  _BiIter __actual_end)
+  {
+   if (!_M_icase)
+ return std::equal(__expected_begin, __expected_end,
+   __actual_begin, __actual_end);


This is only valid in C++14 and higher, because the 4-argument version
of std::equal isn't present in C++11.


+   typedef std::ctype<_CharT> __ctype_type;
+   const auto& __fctyp = use_facet<__ctype_type>(_M_traits.getloc());
+   return std::equal(__expected_begin, __expected_end,
+ __actual_begin, __actual_end,


Same here.


+ [this, &__fctyp](_CharT __lhs, _CharT __rhs)
+ {
+   return __fctyp.tolower(__lhs)
+   == __fctyp.tolower(__rhs);
+ });


We need to rewrite this to check the lengths are equal first, and then
call the 3-argument version of std::equal.

Alternatively, we could move the implementation of the C++14
std::equal overloads to __equal and make that available for C++11.
I'll try that.
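
The first option (check the lengths, then call the 3-argument
std::equal) could look roughly like the sketch below. It is only an
illustration, not library code: ranges_equal_cxx11 is a hypothetical
name, and the sketch assumes forward iterators or better, since
std::distance walks each range once before the comparison does.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>

// Hypothetical helper (not part of libstdc++): a C++11-compatible
// stand-in for the C++14 four-iterator std::equal.
template<typename FwdIt1, typename FwdIt2>
bool
ranges_equal_cxx11(FwdIt1 first1, FwdIt1 last1, FwdIt2 first2, FwdIt2 last2)
{
  // Ranges of different length can never compare equal.
  if (std::distance(first1, last1) != std::distance(first2, last2))
    return false;
  // The lengths match, so the three-iterator overload cannot read past
  // the end of the second range.
  return std::equal(first1, last1, first2);
}
```

For the _BiIter ranges in _Backref_matcher the distance computation is
linear unless the iterators happen to be random access.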




Re: [libstdc++/71500] make back reference work with icase

2017-09-08 Thread Jonathan Wakely

On 04/09/17 03:31 -0700, Tim Shen via libstdc++ wrote:

This fixes the follow-up comments in 71500.

Back-reference matching is different from other matching, as the
content the back-reference refers to is at "run-time", aka during
regex_match(), not regex() compilation.

For compilation we do have an abstraction layer to catch all
comparison customizations, namely _M_translator in regex_compiler.h.
Until this patch, we don't have an abstraction for "run-time"
matching. I believe that back-reference is the only place that needs
run-time matching, so I just build a _Backref_matcher in
regex_executor.tcc.


Looks like a nice solution. OK for trunk, thanks.

I think this looks safe to backport too, but let's leave it on trunk
for a while first.




[libstdc++/71500] make back reference work with icase

2017-09-04 Thread Tim Shen via gcc-patches
This fixes the follow-up comments in 71500.

Back-reference matching is different from other matching, as the
content the back-reference refers to is at "run-time", aka during
regex_match(), not regex() compilation.

For compilation we do have an abstraction layer to catch all
comparison customizations, namely _M_translator in regex_compiler.h.
Until this patch, we don't have an abstraction for "run-time"
matching. I believe that back-reference is the only place that needs
run-time matching, so I just build a _Backref_matcher in
regex_executor.tcc.
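
As a rough standalone illustration of the run-time comparison the icase
path performs, the sketch below compares two strings after lowering
each character through the locale's ctype facet. The function name
equal_ignoring_case and the use of std::string are assumptions made for
the example; the patch itself works on iterator ranges inside
_Backref_matcher.

```cpp
#include <algorithm>
#include <cassert>
#include <locale>
#include <string>

// Hypothetical free function, not part of the patch: compares two
// strings character by character after ctype<char>::tolower.
bool
equal_ignoring_case(const std::string& expected, const std::string& actual,
                    const std::locale& loc = std::locale::classic())
{
  // Check the lengths first so the three-iterator std::equal is safe.
  if (expected.size() != actual.size())
    return false;
  const std::ctype<char>& ct = std::use_facet<std::ctype<char> >(loc);
  return std::equal(expected.begin(), expected.end(), actual.begin(),
                    [&ct](char lhs, char rhs)
                    { return ct.tolower(lhs) == ct.tolower(rhs); });
}
```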

Tested on x86_64-linux-gnu.

Thanks!

-- 
Regards,
Tim Shen
commit a97b7fecd319e031ffc489a956b8cf3dc63eeb26
Author: Tim Shen 
Date:   Mon Sep 4 03:19:35 2017 -0700

PR libstdc++/71500
* include/bits/regex_executor.tcc: Support icase in
regex_traits<...> for back reference matches.
* testsuite/28_regex/regression.cc: Test case.

diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc
index 226e05856e1..f6149fecf9d 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -335,6 +335,54 @@ namespace __detail
 	  _M_states._M_queue(__state._M_next, _M_cur_results);
 }
 
+  template<typename _BiIter, typename _TraitsT>
+struct _Backref_matcher
+{
+  _Backref_matcher(bool __icase, const _TraitsT& __traits)
+  : _M_traits(__traits) { }
+
+  bool
+  _M_apply(_BiIter __expected_begin,
+	   _BiIter __expected_end, _BiIter __actual_begin,
+	   _BiIter __actual_end)
+  {
+	return _M_traits.transform(__expected_begin, __expected_end)
+	== _M_traits.transform(__actual_begin, __actual_end);
+  }
+
+  const _TraitsT& _M_traits;
+};
+
+  template<typename _BiIter, typename _CharT>
+struct _Backref_matcher<_BiIter, std::regex_traits<_CharT>>
+{
+  using _TraitsT = std::regex_traits<_CharT>;
+  _Backref_matcher(bool __icase, const _TraitsT& __traits)
+  : _M_icase(__icase), _M_traits(__traits) { }
+
+  bool
+  _M_apply(_BiIter __expected_begin,
+	   _BiIter __expected_end, _BiIter __actual_begin,
+	   _BiIter __actual_end)
+  {
+	if (!_M_icase)
+	  return std::equal(__expected_begin, __expected_end,
+			__actual_begin, __actual_end);
+	typedef std::ctype<_CharT> __ctype_type;
+	const auto& __fctyp = use_facet<__ctype_type>(_M_traits.getloc());
+	return std::equal(__expected_begin, __expected_end,
+			  __actual_begin, __actual_end,
+			  [this, &__fctyp](_CharT __lhs, _CharT __rhs)
+			  {
+			return __fctyp.tolower(__lhs)
+== __fctyp.tolower(__rhs);
+			  });
+  }
+
+  bool _M_icase;
+  const _TraitsT& _M_traits;
+};
+
   // First fetch the matched result from _M_cur_results as __submatch;
   // then compare it with
   // (_M_current, _M_current + (__submatch.second - __submatch.first)).
@@ -355,9 +403,10 @@ namespace __detail
 	   __last != _M_end && __tmp != __submatch.second;
 	   ++__tmp)
 	++__last;
-  if (_M_re._M_automaton->_M_traits.transform(__submatch.first,
-		  __submatch.second)
-	  == _M_re._M_automaton->_M_traits.transform(_M_current, __last))
+  if (_Backref_matcher<_BiIter, _TraitsT>(
+	  _M_re.flags() & regex_constants::icase,
+	  _M_re._M_automaton->_M_traits)._M_apply(
+		  __submatch.first, __submatch.second, _M_current, __last))
 	{
 	  if (__last != _M_current)
 	{
diff --git a/libstdc++-v3/testsuite/28_regex/regression.cc b/libstdc++-v3/testsuite/28_regex/regression.cc
index ee4d3e1e6f8..3fa9022eac4 100644
--- a/libstdc++-v3/testsuite/28_regex/regression.cc
+++ b/libstdc++-v3/testsuite/28_regex/regression.cc
@@ -93,6 +93,17 @@ test06()
   }
 }
 
+// PR libstdc++/71500
+void
+test07()
+{
+  bool test [[gnu::unused]] = true;
+
+  VERIFY(regex_match_debug("abc abc", regex("([a-z]+) \\1", regex::icase)));
+  VERIFY(regex_match_debug("Abc abc", regex("([a-z]+) \\1", regex::icase)));
+  VERIFY(regex_match_debug("abc Abc", regex("([a-z]+) \\1", regex::icase)));
+}
+
 int
 main()
 {
@@ -102,6 +113,7 @@ main()
   test04();
   test05();
   test06();
+  test07();
   return 0;
 }
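
Outside the testsuite, the behaviour exercised by test07 can be checked
with a small standalone helper along these lines. regex_match_debug and
VERIFY are testsuite facilities, so this sketch uses plain
std::regex_match instead; matches_repeated_word is a hypothetical name
introduced for the example.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if `text` is a lowercase-class word, a
// space, and the same word again via the back-reference \1.
bool
matches_repeated_word(const std::string& text, bool icase)
{
  std::regex::flag_type flags = std::regex::ECMAScript;
  if (icase)
    flags = flags | std::regex::icase;
  const std::regex re("([a-z]+) \\1", flags);
  return std::regex_match(text, re);
}
```

With a library that contains this fix, matches_repeated_word("Abc abc",
true) and matches_repeated_word("abc Abc", true) should also return
true, which is what test07 verifies.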