Christoph Seibert wrote:
Hi there,

I think there is a problem with the following fix:

amyroh      2003/01/02 17:59:09

  Modified:    catalina/src/share/org/apache/catalina/core
                        StandardServer.java
  Log:
  Fix for bugzilla 15762.
[...]

diff -u -r1.32 -r1.33
--- StandardServer.java 11 Sep 2002 14:19:33 -0000 1.32
+++ StandardServer.java 3 Jan 2003 01:59:08 -0000 1.33
@@ -824,7 +824,15 @@
       } else if (c == '"') {
           filtered.append("&quot;");
       } else if (c == '&') {
-          filtered.append("&amp;");
+          char s1 = input.charAt(i+3);
+          char s2 = input.charAt(i+4);
+          char s3 = input.charAt(i+5);
+          if (((s1 == ';') || (s2 == ';')) || (s3 == ';')) {
+              // do not convert if it's already in converted form
+              filtered.append(c);
+          } else {
+              filtered.append("&amp;");
+          }
       } else {
           filtered.append(c);
       }

(Note: I haven't had a look at the surrounding code yet, so I have to
assume that 'i' is the position of 'c', that is the '&' character.)

This code assumes that character or entity references will not be
shorter than 4 characters (including the delimiters '&' and ';')
and no longer than 6. However, the XML specification does not in
any way define restrictions like that. For example, '&d;' is a
valid entity reference (assuming it was defined in the DTD). Worse,
character or entity references can have arbitrary length. For example,
'&#x0000000000020' is a valid character reference to the ' ' (space)
character.

I'm sorry I don't have a better fix right now, but I assume one
would have to iterate through the characters following the '&'
until either a ';' is found or a character occurs that is not a legal
part of an entity reference name (or in the case of a character
reference, not one of [0-9] for decimal or [0-9a-fA-F] for
hexadecimal).

(Actually, I believe this wheel must already have been invented,
but with only looking at this code snippet, I don't really know.)

I believe iterating through the characters following the '&' until a ';' is found will fix the problem. A reference such as '&#x0000000000020' without a trailing ';' will result in a parsing error, whereas '&#x0000000000020;' will be written as a space (' ').
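
Roughly, the scan described above could look like the sketch below. This is only an illustration, not the committed fix: the class and method names (EntityAwareFilter, referenceEnd, filter) are made up, the name check is a simplified approximation of the XML Name production, and only the '"' and '&' cases from the diff are handled. It uses StringBuilder, so it assumes a newer JDK than the one this code was written for.

```java
final class EntityAwareFilter {

    // Simplified stand-ins for the XML NameStartChar / NameChar productions (assumption).
    private static boolean isNameStart(char c) {
        return Character.isLetter(c) || c == '_' || c == ':';
    }

    private static boolean isNameChar(char c) {
        return isNameStart(c) || (c >= '0' && c <= '9') || c == '.' || c == '-';
    }

    private static boolean isDigit(char c, boolean hex) {
        if (c >= '0' && c <= '9') {
            return true;
        }
        return hex && ((c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F'));
    }

    /**
     * If a well-formed reference starts at index amp (which must hold '&'),
     * returns the index of its closing ';'. Otherwise returns -1.
     */
    static int referenceEnd(String s, int amp) {
        int n = s.length();
        int i = amp + 1;
        if (i >= n) {
            return -1;                      // '&' is the last character
        }
        if (s.charAt(i) == '#') {           // character reference: &#123; or &#x7B;
            i++;
            boolean hex = i < n && s.charAt(i) == 'x';
            if (hex) {
                i++;
            }
            int firstDigit = i;
            while (i < n && isDigit(s.charAt(i), hex)) {
                i++;
            }
            if (i == firstDigit) {
                return -1;                  // no digits at all
            }
        } else {                            // entity reference: &name;
            if (!isNameStart(s.charAt(i))) {
                return -1;
            }
            i++;
            while (i < n && isNameChar(s.charAt(i))) {
                i++;
            }
        }
        return (i < n && s.charAt(i) == ';') ? i : -1;
    }

    /** Escapes '"' and bare '&', copying existing references through unchanged. */
    static String filter(String input) {
        StringBuilder filtered = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '"') {
                filtered.append("&quot;");
            } else if (c == '&') {
                int end = referenceEnd(input, i);
                if (end >= 0) {
                    filtered.append(input, i, end + 1);  // already a reference
                    i = end;
                } else {
                    filtered.append("&amp;");
                }
            } else {
                filtered.append(c);
            }
        }
        return filtered.toString();
    }
}
```

With this, '&d;' and '&#x0000000000020;' pass through untouched regardless of length, a '&' at the very end of the input no longer reads past the end of the string, and '&#x0000000000020' without the ';' gets its '&' escaped.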

Thanks,
Amy

Ciao,
Christoph