Re: Bad Exception Crash in Master

Guillaume Munch Mon, 16 Nov 2015 14:54:20 -0800

On 16/11/2015 20:22, Georg Baum wrote:

Richard Heck wrote:


Recipe:

1. Open the User Guide.
2. Click somewhere in the main text.
3. Insert > Citation
4. Enter the search field and type "cap"
5. Hit search button.

Error: Software exception Detected
----------------------------------------
LyX has caught an exception, it will now attempt to save all unsaved
documents and exit.

Exception: regex_error

This is on Fedora 22, gcc 5.1.1, with configuration as:
--enable-build-type=dev.

The offending code is this assignment

          static const lyx::regex reg("[].|*?+(){}^$\\[\\\\]");

in escape_special_characters in GuiCitation.cpp. I would guess that this
is a consequence of changes to the regex engine.


I can reproduce with gcc 5.1 in C++11-mode. This is most likely the same
problem as http://www.lyx.org/trac/ticket/9799, since the test fails as well
for the same compiler, so I guess it happens with gcc4 in C++11 mode as
well. Nice to see that the unit test worked! Now we only need to learn not
to ignore failing unit tests.

Some regex expert needs to find out whether the regex syntax we are using is
only valid for boost::regex or for C++11 std::regex as well, and depending
on the outcome we either need to change our regex, or file a gcc bug. Or
maybe it is some multithreading issue? The 'static' rings some alarm
bells...

If I may try to explain in my own words (because I really didn't get
your reasoning at first, being far from a C++ regex expert):
std::regex and boost::regex do not use the same regex style by default
(ECMAScript vs. perl, respectively), but both support other styles if
you pass the appropriate flag. Commit 4620034e had already fixed a
crash a while ago, caused by the use of (what is now) std::regex in
MSVC, by automatically passing the flag that selects the "grep" regex
style instead of ECMAScript.

But in 394e1bf9 you did without passing this flag. The reason, if I
reconstruct it correctly, is that std's ECMAScript style is only
supposed to be boost's perl style minus a few perl-isms, so we
shouldn't have to set the "grep" style (which may even change the
meaning of regexes!). The most sensible cross-platform solution is to
conform to the ECMAScript subset, and if I infer correctly, this is
what you mean.
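
To make the point about styles concrete, here is a minimal sketch. It
assumes plain std::regex rather than the lyx::regex wrapper, and the
helper name matchesWith is made up for illustration. The same pattern
string is compiled with two different syntax flags:

```cpp
#include <regex>
#include <string>

// Hypothetical helper, not part of LyX: compile `pattern` with the given
// syntax flag and test whether it matches the whole of `text`.
bool matchesWith(std::string const & pattern, std::string const & text,
                 std::regex::flag_type syntax)
{
	std::regex const re(pattern, syntax);
	return std::regex_match(text, re);
}
```

With std::regex::ECMAScript (the default), "a+" matches "aaa" but not
the literal text "a+". With std::regex::grep, '+' is an ordinary
character, so it is the other way round. That is the sense in which
forcing a style can silently change what a regex means.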

Once this is understood, we go to the ECMAScript standard and realize
that "[]" is interpreted as the empty class in ECMAScript. "[]…]" is a
perl-ism, and ] should be escaped.
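
As a rough sketch of what the escaped form looks like in isolation,
assuming std::regex is used directly (LyX itself goes through
lyx::regex) and with a function name chosen only for this example:

```cpp
#include <regex>
#include <string>

// Sketch only: std::regex is used directly here, which is an assumption;
// the real code uses the lyx::regex wrapper and docstring.
std::string escapeSpecialChars(std::string const & expr)
{
	// Inside the class, '[', ']' and '\' are all escaped, so the class
	// no longer starts with "[]" and is valid ECMAScript.
	static std::regex const reg("[.|*?+(){}^$\\[\\]\\\\]");
	// "$&" expands to the current match; std::regex's default format
	// rules copy the leading backslash through literally.
	return std::regex_replace(expr, reg, "\\$&");
}
```

For instance, "[cap]" comes out as "\[cap\]", while a plain "cap" is
returned unchanged.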

Lesson for everybody apart from Georg: please write regexes according
to the ECMAScript standard in the future, even if boost is happy with
your perl-isms.

Here's a patch, which also addresses a similar issue elsewhere. I did
my fair share of the exercise; can I please let other people test it
and commit the appropriate solution?



Guillaume

diff --git a/src/frontends/qt4/GuiCitation.cpp b/src/frontends/qt4/GuiCitation.cpp
index 052d700..eadff4f 100644
--- a/src/frontends/qt4/GuiCitation.cpp
+++ b/src/frontends/qt4/GuiCitation.cpp
@@ -665,12 +665,12 @@ void GuiCitation::filterByEntryType(BiblioInfo const & bi,
 static docstring escape_special_chars(docstring const & expr)
 {
 	// Search for all chars '.|*?+(){}[^$]\'
-	// Note that '[' and '\' must be escaped.
+	// Note that '[', ']', and '\' must be escaped.
 	// This is a limitation of lyx::regex, but all other chars in BREs
 	// are assumed literal.
-	static const lyx::regex reg("[].|*?+(){}^$\\[\\\\]");
+	static const lyx::regex reg("[.|*?+(){}^$\\[\\]\\\\]");
 
-	// $& is a perl-like expression that expands to all
+	// $& is an ECMAScript format expression that expands to all
 	// of the current match
 	// The '$' must be prefixed with the escape character '\' for
 	// boost to treat it as a literal.
diff --git a/src/frontends/tests/biblio.cpp b/src/frontends/tests/biblio.cpp
index 933c87a..ae5858d 100644
--- a/src/frontends/tests/biblio.cpp
+++ b/src/frontends/tests/biblio.cpp
@@ -15,12 +15,12 @@ using namespace std;
 string const escape_special_chars(string const & expr)
 {
 	// Search for all chars '.|*?+(){}[^$]\'
-	// Note that '[' and '\' must be escaped.
+	// Note that '[', ']', and '\' must be escaped.
 	// This is a limitation of lyx::regex, but all other chars in BREs
 	// are assumed literal.
-	lyx::regex reg("[].|*?+(){}^$\\[\\\\]");
+	lyx::regex reg("[.|*?+(){}^$\\[\\]\\\\]");
 
-	// $& is a perl-like expression that expands to all
+	// $& is an ECMAScript format expression that expands to all
 	// of the current match
 	// The '$' must be prefixed with the escape character '\' for
 	// boost to treat it as a literal.
diff --git a/src/insets/ExternalTransforms.cpp b/src/insets/ExternalTransforms.cpp
index f8ce18d..a3bf82d 100644
--- a/src/insets/ExternalTransforms.cpp
+++ b/src/insets/ExternalTransforms.cpp
@@ -283,7 +283,7 @@ string const sanitizeLatexOption(string const & input)
 	// "[,,,,foo..." -> "foo..." ("foo..." may be empty)
 	string output;
 	lyx::smatch what;
-	static lyx::regex const front("^( *[[],*)(.*)$");
+	static lyx::regex const front("^( *\\[,*)(.*)$");
 
 	regex_match(it, end, what, front);
 	if (!what[0].matched) {
@@ -309,7 +309,7 @@ string const sanitizeLatexOption(string const & input)
 
 	// Strip any trailing commas
 	// "...foo,,,]" -> "...foo" ("...foo,,," may be empty)
-	static lyx::regex const back("^(.*[^,])?,*[]] *$");
+	static lyx::regex const back("^(.*[^,])?,*\\] *$");
 	regex_match(output, what, back);
 	if (!what[0].matched) {
 		lyxerr << "Unable to sanitize LaTeX \"Option\": "
