Hello regexp-dev,
Purpose: bug report
Version: jakarta-regexp-1.3
Sample test:
RE r = new RE( "\\*\\*\\*(.+?)\\*\\*" );
String fText = r.subst("aaa ***TEXT** ***AAA** bbb", "<h3>$1</h3>",
RE.REPLACE_ALL | RE.REPLACE_BACKREFERENCES);
System.out.println( fText );
Output:
aaa 3>TEXT</h3> 3>AAA</h3> bbb
I expected every '***some_text**' to be replaced with
'<h3>some_text</h3>', but the replacement I get is '3>some_text</h3>'.
So I looked into the source and found the following code
(RE.java, starting at line 1732):
--
[...]
// Process backreferences
int lCurrentPosition = 0;
int lLastPosition = 0;
int lLength = substitution.length();
while ((lCurrentPosition = substitution.indexOf("$",
lCurrentPosition)) >= 0)
{
if ((lCurrentPosition == 0 || substitution.charAt(lCurrentPosition
- 1) != '\\')
&& lCurrentPosition+1 < lLength)
{
char c = substitution.charAt(lCurrentPosition + 1);
if (c >= '0' && c <= '9')
{
// Append everything between the last and the current $ sign
ret.append(substitution.substring(lLastPosition + 2,
lCurrentPosition));
// Append the parenthesized expression
// Note: if a parenthesized expression of the requested
// index is not available "null" is added to the string
ret.append(getParen(c - '0'));
lLastPosition = lCurrentPosition;
}
}
// Move forward, skipping past match
lCurrentPosition++;
}
// Append everything after the last $ sign
ret.append(substitution.substring(lLastPosition + 2,lLength));
[...]
--
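In particular, look at this line:
ret.append(substitution.substring(lLastPosition + 2, lCurrentPosition));
It works when there is more than one $-variable, but only for the
variables after the first one. Once a variable has been processed,
lLastPosition + 2 correctly steps over the previous "$n". Initially,
however, lLastPosition is 0, so for the first variable the first two
characters of the substitution string are always lost.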
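To make the arithmetic concrete, here is a minimal standalone sketch that reproduces only the index calculation of the first loop iteration, using plain String calls. The class and method names (OffByTwoDemo, prefixAsShipped, prefixExpected) are made up for illustration and are not part of jakarta-regexp.

```java
// Illustration only: mirrors the index arithmetic of the first loop
// iteration quoted above, without using jakarta-regexp itself.
class OffByTwoDemo {
    // Text the quoted code copies in front of the first $-variable.
    static String prefixAsShipped(String substitution) {
        int lLastPosition = 0; // initial value, as in the quoted code
        int lCurrentPosition = substitution.indexOf("$");
        return substitution.substring(lLastPosition + 2, lCurrentPosition);
    }

    // Text that should be copied: everything before the first $ sign.
    static String prefixExpected(String substitution) {
        return substitution.substring(0, substitution.indexOf("$"));
    }

    public static void main(String[] args) {
        String substitution = "<h3>$1</h3>";
        System.out.println(prefixAsShipped(substitution)); // 3>
        System.out.println(prefixExpected(substitution));  // <h3>
    }
}
```

The "$" sits at index 4, so substring(2, 4) yields "3>" and drops the leading "<h", which matches the output shown above.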
It might be a good idea to track whether a previous variable has been
seen, as follows:
--
[...]
// Process backreferences
int lCurrentPosition = 0;
int lLastPosition = 0;
int lLength = substitution.length();
// Track whether a $-variable has been processed yet;
// initially none has
boolean wasSign = false;
while ((lCurrentPosition = substitution.indexOf("$",
lCurrentPosition)) >= 0)
{
if ((lCurrentPosition == 0 || substitution.charAt(lCurrentPosition
- 1) != '\\')
&& lCurrentPosition+1 < lLength)
{
char c = substitution.charAt(lCurrentPosition + 1);
if (c >= '0' && c <= '9')
{
// Append everything between the last and the current $ sign
ret.append(substitution.substring(wasSign ? lLastPosition
+ 2 : 0 , lCurrentPosition));
// from now on at least one variable has been processed
wasSign = true;
// Append the parenthesized expression
// Note: if a parenthesized expression of the requested
// index is not available "null" is added to the string
ret.append(getParen(c - '0'));
lLastPosition = lCurrentPosition;
}
}
// Move forward, skipping past match
lCurrentPosition++;
}
// Append everything after the last $ sign
ret.append(substitution.substring(wasSign ? lLastPosition + 2 :
0,lLength));
[...]
--
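For anyone who wants to try the patched logic without rebuilding the library, here is a self-contained sketch of the same loop. It is an assumption-laden stand-in, not the actual RE.java code: the class BackrefSketch, the method expand, and the groups array (which replaces getParen) are hypothetical names introduced only for this example.

```java
// Standalone sketch of the patched backreference loop.
// groups[i] stands in for getParen(i); this parameter is hypothetical.
class BackrefSketch {
    static String expand(String substitution, String[] groups) {
        StringBuilder ret = new StringBuilder();
        int lCurrentPosition = 0;
        int lLastPosition = 0;
        int lLength = substitution.length();
        boolean wasSign = false; // no $-variable processed yet

        while ((lCurrentPosition = substitution.indexOf("$", lCurrentPosition)) >= 0) {
            if ((lCurrentPosition == 0
                    || substitution.charAt(lCurrentPosition - 1) != '\\')
                    && lCurrentPosition + 1 < lLength) {
                char c = substitution.charAt(lCurrentPosition + 1);
                if (c >= '0' && c <= '9') {
                    // Skip the previous "$n" only if there was one.
                    int from = wasSign ? lLastPosition + 2 : 0;
                    ret.append(substitution.substring(from, lCurrentPosition));
                    wasSign = true;
                    // A missing group is appended as "null", as in the original.
                    int index = c - '0';
                    ret.append(index < groups.length ? groups[index] : null);
                    lLastPosition = lCurrentPosition;
                }
            }
            lCurrentPosition++; // move forward past this $ sign
        }
        // Append everything after the last $-variable (or the whole string).
        ret.append(substitution.substring(wasSign ? lLastPosition + 2 : 0, lLength));
        return ret.toString();
    }

    public static void main(String[] args) {
        String[] groups = { "***TEXT**", "TEXT" };
        System.out.println(expand("<h3>$1</h3>", groups)); // <h3>TEXT</h3>
        System.out.println(expand("plain", groups));       // plain
    }
}
```

The second call shows that a substitution string with no $-variable at all is also copied intact, since the final append starts at 0 when wasSign is still false.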
Thanks for reading. :)
--
Best regards,
Дмитрий mailto:[EMAIL PROTECTED]