Here is the code that we use. It depends on our own variant of
StringBuffer (MHBuffer), but you get the idea:

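public String htmlEncode(String cVal) {
        // Null or empty input encodes to the empty string.
        if (cVal == null || cVal.length() == 0) {
                return "";
        }
        // Reserve 4x the input length so the replacements rarely force a resize.
        MHBuffer buf = new MHBuffer(cVal.length() << 2);
        // "&" must come first; otherwise the "&" in "&lt;", "&gt;" and
        // "&quot;" would itself be encoded a second time.
        final String[] aOld =     {"&",     "<",    ">",    "\""};
        final String[] aReplace = {"&amp;", "&lt;", "&gt;", "&quot;"};

        buf.append(cVal);

        // MHBuffer is our own class; replace() is expected to swap every
        // occurrence of the first argument for the second.
        for (int i = 0; i < aOld.length; i++) {
                buf.replace(aOld[i], aReplace[i]);
        }
        return buf.toString();
}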
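
Since MHBuffer is not something you can download, here is a rough
standard-library sketch of the same idea. The class name HtmlEscaper is
made up for illustration, and it walks the string once instead of doing
four replace passes, so it is not literally our code. It uses
StringBuilder, which needs Java 5 or later; on JDK 1.4, StringBuffer
should work as a drop-in substitute.

```java
public class HtmlEscaper {

    // Hypothetical helper: escapes the same four characters as htmlEncode() above.
    public static String htmlEncode(String value) {
        if (value == null || value.length() == 0) {
            return "";
        }
        StringBuilder buf = new StringBuilder(value.length() + 16);
        // A single pass over the input, so an "&" produced by an earlier
        // replacement can never be encoded a second time.
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':  buf.append("&amp;");  break;
                case '<':  buf.append("&lt;");   break;
                case '>':  buf.append("&gt;");   break;
                case '"':  buf.append("&quot;"); break;
                default:   buf.append(c);
            }
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(htmlEncode("Jeb said, \"Hell & damnation! Is 5 > 4?\""));
        // prints: Jeb said, &quot;Hell &amp; damnation! Is 5 &gt; 4?&quot;
    }
}
```

Like our version, this leaves the single quote alone, so if you write
attribute values in single quotes you would also want to map ' to &#39;.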



-----Original Message-----
From: Greg Ward [mailto:[EMAIL PROTECTED]]
Sent: Thursday, October 02, 2003 3:18 PM
To: [EMAIL PROTECTED]
Subject: HTML quoting


What's the standard way of quoting text for inclusion in a web page in
Java?  I.e., I need a method to convert the string

  Jeb said, "Hell & damnation! Is 5 > 4?"

to

  Jeb said, &quot;Hell &amp; damnation! Is 5 &gt; 4?&quot;

(I think: I've never been entirely sure what the right way to handle
quotes is.)  That is, I want the standard Java equivalent of Python's
cgi.escape(), or Perl's CGI::escapeHTML().

To my utter amazement, I cannot find any indication that such a method
even exists in the standard Java library!  (I tried Googling and
poking through the JDK 1.4 docs.)

So I went looking in the source for Tomcat 4.1.27 -- surely the HTML
version of the manager app must quote at least the webapp's display
name, since it comes from a user-supplied file and therefore might
contain funny characters.  Surprisingly, the manager just lets funny
characters through without touching them.  E.g., if you put

  <display-name>foo &amp; bar webapp</display-name>

then "&amp;" is translated back to "&" by some part of the XML-parsing
chain, and is emitted as "&" in the manager HTML page.  Most browsers
can deal with minor violations like this, but it's still technically
incorrect.  Just for fun I tried this:

  <display-name>my &lt;script&gt;alert("foo");&lt;/script&gt;</display-name>

...and it works!  The manager emits this HTML:

  <td class="row-left"><small>my <script>alert("foo");</script> webapp</small></td>

and my browser pops up a JavaScript window while rendering the manager
page.  Cool!  I doubt this is a security hole -- not many people can
edit web.xml! -- but surely it at least counts as a rendering bug.  ;-)

So: can someone tell me what the standard way of quoting text for
inclusion in a web page generated by a Java web application is?

Thanks!

        Greg

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

