On 16/11/2015 20:22, Georg Baum wrote:
Richard Heck wrote:


Recipe:

1. Open the User Guide.
2. Click somewhere in the main text.
3. Insert > Citation
4. Enter the search field and type "cap"
5. Hit search button.

Error: Software exception Detected
----------------------------------------
LyX has caught an exception, it will now attempt to save all unsaved
documents and exit.

Exception: regex_error

This is on Fedora 22, gcc 5.1.1, with configuration as:
--enable-build-type=dev.

The offending code is this assignment

          static const lyx::regex reg("[].|*?+(){}^$\\[\\\\]");

in escape_special_characters in GuiCitation.cpp. I would guess that this
is a consequence of changes to the regex engine.

I can reproduce with gcc 5.1 in C++11-mode. This is most likely the same
problem as http://www.lyx.org/trac/ticket/9799, since the test fails as well
for the same compiler, so I guess it happens with gcc4 in C++11 mode as
well. Nice to see that the unit test worked! Now we only need to learn not
to ignore failing unit tests.

Some regex expert needs to find out whether the regex syntax we are using is
only valid for boost::regex or for C++11 std::regex as well, and depending
on the outcome we either need to change our regex, or file a gcc bug. Or
maybe it is some multithreading issue? The 'static' rings some alarm
bells...



If I may try to explain in my own words (because I really didn't get your reasoning at first, being far from a C++ regex expert): std::regex and boost::regex do not use the same regex syntax by default (ECMAScript vs. Perl, respectively), but both support other syntaxes if you pass the appropriate flag. Commit 4620034e fixed a crash a while ago, caused by the use of (what is now) std::regex with MSVC, by automatically passing the flag that selects the "grep" syntax instead of ECMAScript.

But in 394e1bf9 you dropped this flag. The reason, if I reconstruct it correctly, is that std's ECMAScript syntax is supposed to be just boost's Perl syntax minus a few perl-isms, so we shouldn't have to select the "grep" syntax (which may even change the meaning of regexes!). The most sensible cross-platform solution is to stick to the ECMAScript subset, and if I infer correctly this is what you mean.
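To illustrate how the syntax flag can change the meaning of a regex, here is a minimal sketch using plain std::regex rather than lyx::regex. The helper name full_match is hypothetical and not part of LyX.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not LyX code): does `text` fully match `pattern`
// when the pattern is compiled with the given syntax flag?
bool full_match(std::string const & text, std::string const & pattern,
                std::regex::flag_type syntax)
{
	std::regex const re(pattern, syntax);
	return std::regex_match(text, re);
}
```

With std::regex::ECMAScript, the pattern "a+" matches "aaa" because '+' is a quantifier. With std::regex::grep (POSIX basic syntax), '+' is an ordinary character, so "a+" matches only the literal string "a+".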

Once this is understood, we turn to the ECMAScript standard and find that "[]" is interpreted as the empty class in ECMAScript. Writing "[]…]" to put a literal ] first in a class is a perl-ism; in ECMAScript the ] must be escaped.
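As a stand-alone sketch of the escaped form, assuming plain std::regex with its default ECMAScript syntax instead of lyx::regex: the function names below are hypothetical, and the replacement string "\\$&" is my assumption about how the match gets prefixed with a backslash.

```cpp
#include <regex>
#include <string>

// Hypothetical stand-alone variant of escape_special_chars (not LyX code).
std::string escape_special_chars_sketch(std::string const & expr)
{
	// ']' is escaped inside the class, so the pattern is valid ECMAScript.
	static std::regex const reg("[.|*?+(){}^$\\[\\]\\\\]");
	// In the ECMAScript format syntax, "$&" expands to the whole match and
	// the backslash is an ordinary character, so each special character
	// ends up prefixed with '\'.
	return std::regex_replace(expr, reg, std::string("\\$&"));
}

// Hypothetical probe: reports whether a pattern compiles with the default
// (ECMAScript) syntax instead of letting regex_error propagate.
bool compiles_as_ecmascript(std::string const & pattern)
{
	try {
		std::regex const re(pattern);
		return true;
	} catch (std::regex_error const &) {
		return false;
	}
}
```

Calling compiles_as_ecmascript on the old pattern "[].|*?+(){}^$\\[\\\\]" is a quick way to check whether a given standard library rejects it, without crashing the caller.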

Lesson for everybody apart from Georg: please write regexes according to the ECMAScript standard in the future, even if boost is happy with your perl-isms.

Here's a patch, which also catches a similar issue elsewhere. I have done my fair share of the exercise; could I please leave it to others to test it and commit the appropriate solution?


Guillaume
diff --git a/src/frontends/qt4/GuiCitation.cpp b/src/frontends/qt4/GuiCitation.cpp
index 052d700..eadff4f 100644
--- a/src/frontends/qt4/GuiCitation.cpp
+++ b/src/frontends/qt4/GuiCitation.cpp
@@ -665,12 +665,12 @@ void GuiCitation::filterByEntryType(BiblioInfo const & bi,
 static docstring escape_special_chars(docstring const & expr)
 {
 	// Search for all chars '.|*?+(){}[^$]\'
-	// Note that '[' and '\' must be escaped.
+	// Note that '[', ']', and '\' must be escaped.
 	// This is a limitation of lyx::regex, but all other chars in BREs
 	// are assumed literal.
-	static const lyx::regex reg("[].|*?+(){}^$\\[\\\\]");
+	static const lyx::regex reg("[.|*?+(){}^$\\[\\]\\\\]");
 
-	// $& is a perl-like expression that expands to all
+	// $& is an ECMAScript format expression that expands to all
 	// of the current match
 	// The '$' must be prefixed with the escape character '\' for
 	// boost to treat it as a literal.
diff --git a/src/frontends/tests/biblio.cpp b/src/frontends/tests/biblio.cpp
index 933c87a..ae5858d 100644
--- a/src/frontends/tests/biblio.cpp
+++ b/src/frontends/tests/biblio.cpp
@@ -15,12 +15,12 @@ using namespace std;
 string const escape_special_chars(string const & expr)
 {
 	// Search for all chars '.|*?+(){}[^$]\'
-	// Note that '[' and '\' must be escaped.
+	// Note that '[', ']', and '\' must be escaped.
 	// This is a limitation of lyx::regex, but all other chars in BREs
 	// are assumed literal.
-	lyx::regex reg("[].|*?+(){}^$\\[\\\\]");
+	lyx::regex reg("[.|*?+(){}^$\\[\\]\\\\]");
 
-	// $& is a perl-like expression that expands to all
+	// $& is an ECMAScript format expression that expands to all
 	// of the current match
 	// The '$' must be prefixed with the escape character '\' for
 	// boost to treat it as a literal.
diff --git a/src/insets/ExternalTransforms.cpp b/src/insets/ExternalTransforms.cpp
index f8ce18d..a3bf82d 100644
--- a/src/insets/ExternalTransforms.cpp
+++ b/src/insets/ExternalTransforms.cpp
@@ -283,7 +283,7 @@ string const sanitizeLatexOption(string const & input)
 	// "[,,,,foo..." -> "foo..." ("foo..." may be empty)
 	string output;
 	lyx::smatch what;
-	static lyx::regex const front("^( *[[],*)(.*)$");
+	static lyx::regex const front("^( *\\[,*)(.*)$");
 
 	regex_match(it, end, what, front);
 	if (!what[0].matched) {
@@ -309,7 +309,7 @@ string const sanitizeLatexOption(string const & input)
 
 	// Strip any trailing commas
 	// "...foo,,,]" -> "...foo" ("...foo,,," may be empty)
-	static lyx::regex const back("^(.*[^,])?,*[]] *$");
+	static lyx::regex const back("^(.*[^,])?,*\\] *$");
 	regex_match(output, what, back);
 	if (!what[0].matched) {
 		lyxerr << "Unable to sanitize LaTeX \"Option\": "
