There seems to be a bug in rxmatches.  I expect ('|$' rxmatches 'is') to
return three matches, but the final one, at the end of the string, is
omitted.  The same happens with an empty regex.  For comparison with perl:

    cat | perl
    $_= "is"; s/$/--/g; print "$_\n";
    $_= "is"; s/|$/--/g; print "$_\n";
    $_= "is"; s//--/g; print "$_\n";

    is--
    --i--s--
    --i--s--

Note that perl matches the end of the string in all three cases.

    jc
       s=: 'is'
       (<'--') ('$' rxmatches s) rxmerge s
    is--
       (<'--') ('|$' rxmatches s) rxmerge s
    --i--s
       (<'--') ('' rxmatches s) rxmerge s
    --i--s
       exit''

Note that I have used rxmerge to mimic the example given in perl.  However,
the unexpected result comes from rxmatches.

As these examples show, rxmatches is not compatible with perl in the
second and third cases.  The second case should clearly match at the end of
the string, since one alternative of the regex is "$" (end of string).  The
third case seems to be the same bug.

If you visit https://www.pcre.org/original/doc/html/pcreapi.html and search
for "tricky" you will find:

Finding all the matches in a subject is tricky when the pattern can match
an empty string. It is possible to emulate Perl's /g behaviour by first
trying the match again at the same offset, with the PCRE_NOTEMPTY_ATSTART
and PCRE_ANCHORED options, and then if that fails, advancing the starting
offset and trying an ordinary match again. There is some code that
demonstrates how to do this in the pcredemo sample program. In the most
general case, you have to check to see if the newline convention recognizes
CRLF as a newline, and if so, and the current character is CR followed by
LF, advance the starting offset by two characters instead of one.

I have tried pcredemo and it provides results consistent with perl.
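
The scanning loop that the PCRE documentation describes can be sketched in
simplified form.  The Python below is only an illustration, not the J or
PCRE implementation: all_matches and merge are hypothetical helper names,
and instead of retrying at the same offset with PCRE_NOTEMPTY_ATSTART and
PCRE_ANCHORED (Python's re module has no such options), it simply advances
one character after an empty match.  That shortcut can miss a non-empty
match that starts where an empty match was just found, and it ignores the
CRLF case, but it is enough to show that the position just past the last
character has to be tried for the three patterns above.

```python
import re

def all_matches(pattern, subject):
    """Collect (start, end) spans of successive matches, empty ones included."""
    regex = re.compile(pattern)
    spans = []
    pos = 0
    # Use <= so the offset just past the last character is also tried;
    # that is where '$' and the empty alternative match.
    while pos <= len(subject):
        m = regex.search(subject, pos)
        if m is None:
            break
        spans.append(m.span())
        # After an empty match, step one character forward so the loop
        # cannot find the same empty match at the same offset forever.
        pos = m.end() + 1 if m.start() == m.end() else m.end()
    return spans

def merge(replacement, spans, subject):
    """Replace each span of subject with replacement (roughly the role of rxmerge)."""
    out = []
    last = 0
    for start, end in spans:
        out.append(subject[last:start])
        out.append(replacement)
        last = end
    out.append(subject[last:])
    return "".join(out)

for pat in ("$", "|$", ""):
    print(repr(pat), merge("--", all_matches(pat, "is"), "is"))
```

With this sketch the three patterns give is--, --i--s-- and --i--s--,
the same as the perl output shown earlier.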

I have provided test scripts in both perl and ijs, along with a minimal
test file.

Thanks,
Rik
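
Test file (four lines per case: regex, replacement, subject, expected
result; the regex in the fourth case is empty):

    a
    b
    abacus
    bbbcus
    b
    a
    abacus
    aaacus
    $
    --
    is
    is--

    --
    is
    --i--s--
    |$
    --
    is
    --i--s--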
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
