On 09/11/2017 05:04 PM, Julian Büning wrote:
> observed behavior:
>
> $ echo | ptx -S $ &
> [1] 1000
> $ jobs
> [1]+ Running echo | ptx -S $ &
>
> expected behavior:
>
> $ echo | ptx -S $ &
> [1] 1000
> [1]+ Done echo | ptx -S $
>
> ptx does not terminate in case the specified sentence regex can be
> matched but has a match of length zero on input that is non-empty.
>
> The following test cases show the same behavior:
> $ echo | ptx -S ^
> $ echo | ptx -S "a*"
> $ echo | ptx -S "\(\)"
> $ echo test | ptx -S "\n*"
> $ echo foo > non_empty; ptx non_empty -S $
> ...
>
> In ptx.c, find_occurs_in_text() calls re_search() and uses the length of
> a match (which is falsely assumed to be greater than zero) to advance a
> cursor through the input. For a match length of zero, the cursor is
> never advanced.
>
> When switching on the results of re_search(), a case 0 could be added.
> One possible fix would be to then abort with an error message.
>
> We found this behavior in version 8.27 and can reproduce it in version
> 8.25 as well as version 8.28.
>
> This behavior was found using Symbolic Execution techniques developed in
> the course of the SYMBIOSYS research project at COMSYS, RWTH Aachen
> University. This research is supported by the European Research Council
> (ERC) under the EU's Horizon 2020 Research and Innovation Programme
> grant agreement n. 647295 (SYMBIOSYS).
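For illustration, the failure mode described in the report can be sketched outside of ptx. This is a minimal sketch and not the code in ptx.c: it uses POSIX regcomp()/regexec() in place of GNU re_search(), and count_matches() with its return codes is a hypothetical helper made up for this example. It advances a cursor by the end offset of each match and bails out as soon as a match has length zero, since the cursor would otherwise stop moving.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, not taken from ptx.c: walk TEXT and count the
   matches of the basic regular expression PATTERN, advancing a cursor
   by the end offset of each match.
   Returns the number of matches, -1 if PATTERN does not compile, or -2
   if a match of length zero is found.  Without the -2 check the cursor
   would never advance past such a match and the loop would not end.  */
static long
count_matches (const char *pattern, const char *text)
{
  regex_t re;
  regmatch_t m;
  long count = 0;
  const char *cursor = text;

  assert (pattern != NULL && text != NULL);

  if (regcomp (&re, pattern, 0) != 0)
    return -1;

  while (*cursor != '\0')
    {
      /* Only the very first search starts at a real beginning of line.  */
      int eflags = (cursor == text) ? 0 : REG_NOTBOL;

      if (regexec (&re, cursor, 1, &m, eflags) != 0)
        break;                  /* no further match */

      if (m.rm_eo == m.rm_so)
        {
          /* Zero-length match: advancing by its length is a no-op.  */
          regfree (&re);
          return -2;
        }

      cursor += m.rm_eo;        /* continue right after this match */
      count++;
    }

  regfree (&re);
  return count;
}
```

With this sketch, a pattern such as "[.?!]" walks through the text normally, while "^" or "a*" is rejected on the first zero-length match instead of spinning forever.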
Good catch!
The attached patch fixes it; please check.

Have a nice day,
Berny

From f082748dbdcab15de50e2e1dc26a7276aec2432c Mon Sep 17 00:00:00 2001
From: Bernhard Voelker <m...@bernhard-voelker.de>
Date: Wed, 13 Sep 2017 23:37:20 +0200
Subject: [PATCH] ptx: avoid infloop due to zero-length matches with -S regex

* src/ptx.c (find_occurs_in_text): Die with an appropriate error
diagnostic when the given regular expression returns a match of
length 0.
* tests/misc/ptx.pl (S-infloop): Add a test.
* NEWS (Bug fixes): Mention the fix.

Fixes https://bugs.gnu.org/28417
which was detected using Symbolic Execution techniques developed in
the course of the SYMBIOSYS research project at COMSYS, RWTH Aachen
University.
---
 NEWS              | 5 +++++
 src/ptx.c         | 5 +++++
 tests/misc/ptx.pl | 6 ++++++
 3 files changed, 16 insertions(+)

diff --git a/NEWS b/NEWS
index 8be22f1..ca36063 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,11 @@ GNU coreutils NEWS -*- outline -*-
 
 * Noteworthy changes in release ?.? (????-??-??) [?]
 
+** Bug fixes
+
+  ptx -S no longer infloops for a pattern which returns zero-length matches.
+  [the bug dates back to the initial implementation]
+
 
 * Noteworthy changes in release 8.28 (2017-09-01) [stable]
 
diff --git a/src/ptx.c b/src/ptx.c
index 2aababf..b7aa107 100644
--- a/src/ptx.c
+++ b/src/ptx.c
@@ -818,6 +818,11 @@ find_occurs_in_text (int file_index)
               case -1:
                 break;
 
+              case 0:
+                die (EXIT_FAILURE, 0,
+                     _("error: regular expression has a match of length zero: %s"),
+                     quote (context_regex.string));
+
               default:
                 next_context_start = cursor + context_regs.end[0];
                 break;
diff --git a/tests/misc/ptx.pl b/tests/misc/ptx.pl
index d71d065..4d4e1c7 100755
--- a/tests/misc/ptx.pl
+++ b/tests/misc/ptx.pl
@@ -40,6 +40,12 @@ my @Tests =
   {OUT=>".xx \"\" \"\" \"foo\" \"\"\n"}],
  ["format-t", '--format=tex', {IN=>"foo\n"},
   {OUT=>"\\xx {}{}{foo}{}{}\n"}],
+
+# with coreutils-8.28 and earlier, the -S option would infloop with
+# matches of zero-length.
+["S-infloop", '-S ^', {IN=>"a\n"}, {EXIT=>1},
+ {ERR_SUBST=>'s/^.*reg.*ex.*length zero.*$/regexlzero/'},
+ {ERR=>"regexlzero\n"}],
 );
 
 @Tests = triple_test \@Tests;
-- 
2.1.4