Re: [Patch] Patch set for regex instantiation

2014-06-29 Thread Jonathan Wakely

On 11/01/14 19:48 -0500, Tim Shen wrote:

> Here are 4 patches that finally led to the _Compiler's instantiation and
> some other optimizations for compile time.
>
> 1) Create class _ScannerBase to make _Scanner pithier. Move const
> static members to src/c++11/regex.cc.


I think it might be time to revisit this change and put the data
members of _ScannerBase into the .so, as we're creating and
initializing 136 bytes of duplicated constants every time we construct
a std::regex.

Alternatively, we could just make them local statics instead of member
data and still remove the duplication (see attached).


commit 82f3e9fa7d16a677bb9ebc6bc3da12c1209c2df5
Author: Jonathan Wakely <jwak...@redhat.com>
Date:   Sun Jun 29 17:58:59 2014 +0100

	* include/bits/regex_scanner.h (_ScannerBase): Replace member data
	with static member functions.
	* include/bits/regex_scanner.tcc (_ScannerBase): Likewise.

diff --git a/libstdc++-v3/include/bits/regex_scanner.h b/libstdc++-v3/include/bits/regex_scanner.h
index 6627db9..fdc1210 100644
--- a/libstdc++-v3/include/bits/regex_scanner.h
+++ b/libstdc++-v3/include/bits/regex_scanner.h
@@ -1,4 +1,4 @@
-// class template regex -*- C++ -*-
+// Regex implementation -*- C++ -*-
 
 // Copyright (C) 2013-2014 Free Software Foundation, Inc.
 //
@@ -92,13 +92,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : _M_state(_S_state_normal),
 _M_flags(__flags),
 _M_escape_tbl(_M_is_ecma()
-		  ? _M_ecma_escape_tbl
-		  : _M_awk_escape_tbl),
+		  ? _S_ecma_escape_tbl()
+		  : _S_awk_escape_tbl()),
 _M_spec_char(_M_is_ecma()
-		 ? _M_ecma_spec_char
+		 ? _S_ecma_spec_char()
 		 : _M_is_basic()
-		 ? _M_basic_spec_char
-		 : _M_extended_spec_char),
+		 ? _S_basic_spec_char()
+		 : _S_extended_spec_char()),
 _M_at_bracket_start(false)
 { }
 
@@ -138,46 +138,60 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { return _M_flags & regex_constants::awk; }
 
   protected:
-const std::pair<char, _TokenT> _M_token_tbl[9] =
-  {
-	{'^', _S_token_line_begin},
-	{'$', _S_token_line_end},
-	{'.', _S_token_anychar},
-	{'*', _S_token_closure0},
-	{'+', _S_token_closure1},
-	{'?', _S_token_opt},
-	{'|', _S_token_or},
-	{'\n', _S_token_or}, // grep and egrep
-	{'\0', _S_token_or},
-  };
-const std::pair<char, char> _M_ecma_escape_tbl[8] =
-  {
-	{'0', '\0'},
-	{'b', '\b'},
-	{'f', '\f'},
-	{'n', '\n'},
-	{'r', '\r'},
-	{'t', '\t'},
-	{'v', '\v'},
-	{'\0', '\0'},
-  };
-const std::pair<char, char> _M_awk_escape_tbl[11] =
-  {
-	{'"', '"'},
-	{'/', '/'},
-	{'\\', '\\'},
-	{'a', '\a'},
-	{'b', '\b'},
-	{'f', '\f'},
-	{'n', '\n'},
-	{'r', '\r'},
-	{'t', '\t'},
-	{'v', '\v'},
-	{'\0', '\0'},
-  };
-const char* _M_ecma_spec_char = "^$\\.*+?()[]{}|";
-const char* _M_basic_spec_char = ".[\\*^$";
-const char* _M_extended_spec_char = ".[\\()*+?{|^$";
+static const std::pair<char, _TokenT>* _S_token_tbl()
+{
+  static const std::pair<char, _TokenT> __token_tbl[9] =
+	{
+	  {'^', _S_token_line_begin},
+	  {'$', _S_token_line_end},
+	  {'.', _S_token_anychar},
+	  {'*', _S_token_closure0},
+	  {'+', _S_token_closure1},
+	  {'?', _S_token_opt},
+	  {'|', _S_token_or},
+	  {'\n', _S_token_or}, // grep and egrep
+	  {'\0', _S_token_or},
+	};
+  return __token_tbl;
+}
+
+static const std::pair<char, char>* _S_ecma_escape_tbl()
+{
+  static const std::pair<char, char> __ecma_escape_tbl[8] =
+	{
+	  {'0', '\0'},
+	  {'b', '\b'},
+	  {'f', '\f'},
+	  {'n', '\n'},
+	  {'r', '\r'},
+	  {'t', '\t'},
+	  {'v', '\v'},
+	  {'\0', '\0'},
+	};
+  return __ecma_escape_tbl;
+}
+static const std::pair<char, char>* _S_awk_escape_tbl()
+{
+  static const std::pair<char, char> __awk_escape_tbl[11] =
+	{
+	  {'"', '"'},
+	  {'/', '/'},
+	  {'\\', '\\'},
+	  {'a', '\a'},
+	  {'b', '\b'},
+	  {'f', '\f'},
+	  {'n', '\n'},
+	  {'r', '\r'},
+	  {'t', '\t'},
+	  {'v', '\v'},
+	  {'\0', '\0'},
+	};
+  return __awk_escape_tbl;
+}
+
+static const char* _S_ecma_spec_char() { return "^$\\.*+?()[]{}|"; }
+static const char* _S_basic_spec_char() { return ".[\\*^$"; }
+static const char* _S_extended_spec_char() { return ".[\\()*+?{|^$"; }
 
 _StateT   _M_state;
 _FlagT    _M_flags;
diff --git a/libstdc++-v3/include/bits/regex_scanner.tcc b/libstdc++-v3/include/bits/regex_scanner.tcc
index 818e47b..ed88f71 100644
--- a/libstdc++-v3/include/bits/regex_scanner.tcc
+++ b/libstdc++-v3/include/bits/regex_scanner.tcc
@@ -170,7 +170,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 		 __c != '}')
 	   || (_M_is_grep() && __c == '\n'))
 	{
-	  auto __it = _M_token_tbl;
+	  auto __it = _S_token_tbl();
 	  auto __narrowc = _M_ctype.narrow(__c, '\0');
 	  for (; __it->first != '\0'; ++__it)
 	if (__it->first == __narrowc)


Re: [Patch] Patch set for regex instantiation

2014-06-29 Thread Tim Shen
On Sun, Jun 29, 2014 at 10:18 AM, Jonathan Wakely <jwak...@redhat.com> wrote:
> Alternatively, we could just make them local statics instead of member
> data and still remove the duplication (see attached).

This is a good idea. Can we use constexpr for the functions/static variables?


-- 
Regards,
Tim Shen


Re: [Patch] Patch set for regex instantiation

2014-06-29 Thread Daniel Krügler
2014-06-29 20:47 GMT+02:00 Tim Shen <timshe...@gmail.com>:
> On Sun, Jun 29, 2014 at 10:18 AM, Jonathan Wakely <jwak...@redhat.com> wrote:
>> Alternatively, we could just make them local statics instead of member
>> data and still remove the duplication (see attached).

> This is a good idea. Can we use constexpr for the functions/static variables?

Not so for local statics within constexpr functions, but you could
have constexpr static data members, yes.

- Daniel


Re: [Patch] Patch set for regex instantiation

2014-01-12 Thread Paolo Carlini
Hi

> On 12 Jan 2014, at 01:48, Tim Shen <timshe...@gmail.com> wrote:
>
> Here are 4 patches that finally led to the _Compiler's instantiation and
> some other optimizations for compile time.
>
> 1) Create class _ScannerBase to make _Scanner pithier. Move const
> static members to src/c++11/regex.cc.
> 2) Make _Compiler and _Scanner independent of _FwdIter. We store the
> input regex string in basic_regex as a basic_string; but when
> compiling it, const _CharT* is used.
> 3) Avoid using std::map, std::set and std::queue to reduce compile time.
> 4) Instantiate _Compiler<regex_traits<char>> and
> _Compiler<regex_traits<wchar_t>>. Export vector<string> and
> vector<wstring>'s ctor and dtor as well for _Compiler's dependency.

Thanks, but as we already tried to explain, instantiating, thus adding many 
exported symbols, is post 4.9 material, can't be committed until we branch. 
Please make sure to have in a separate patch or multiple patches the 
correctness fixes and maybe anything unrelated to instantiation which you 
consider stable and independently useful.

> Booted, and tested with -m64 and -m32; but check-debug failed some
> 23_containers/* cases? I suppose it's not my problem?

Please make sure Francois knows about that...

Paolo