If you refuse to or cannot use org.apache.commons.lang.StringEscapeUtils (noted by Mike Curwen here), or the JSP functionality (noted by Tim Funk), it might help to know that a character entity doesn't have to be named. Every character can be written as a numeric character reference of the form &#nn..; or &#xhh..; where 'nn..' and 'hh..' are the character's Unicode code point, in decimal and hexadecimal respectively. For ASCII characters this is the same as the ASCII code.

It follows that a space would be &#32; or &#x20; since 32 (hex 20) is the ASCII code for a space. Though I cannot quite figure out why you would want to escape a space...
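
To make the two forms concrete, here is a minimal sketch. The class and method names (NumericEntityDemo, decimalEntity, hexEntity) are made up for illustration and are not part of any library:

```java
final class NumericEntityDemo {

    /** Decimal form of a numeric character reference, e.g. 60 gives "&#60;". */
    static String decimalEntity(int codePoint) {
        return "&#" + codePoint + ";";
    }

    /** Hexadecimal form of a numeric character reference, e.g. 60 gives "&#x3c;". */
    static String hexEntity(int codePoint) {
        return "&#x" + Integer.toHexString(codePoint) + ";";
    }
}
```

Calling decimalEntity(' ') gives "&#32;" and hexEntity(' ') gives "&#x20;", the two spellings of a space mentioned above.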

The characters you want to escape are those outside the ASCII range
32 <= c <= 127
plus c == '<', c == '>', c == '&', c == '\''

So this would be (imho) the better method:

public static String escapeHtml ( String s ) {
    StringBuffer buffer = new StringBuffer ();

    for ( int i = 0; i < s.length(); i ++ ) {
        char c = s.charAt ( i );
        // escape control characters, non-ASCII characters and the markup characters
        if ( c < 32 || c > 127 || c == '<' || c == '>' || c == '&' || c == '\'' ) {
            // numeric character reference, e.g. '<' becomes &#60;
            buffer.append ( "&#" + (int)c + ";" );
        } else {
            buffer.append ( c );
        }
    }
    return buffer.toString ();
}
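
Two caveats with the method above, as far as I can tell: charAt returns UTF-16 code units, so a character outside the Basic Multilingual Plane would come out as two surrogate references instead of one, and the double quote is left alone, which matters inside attribute values. A possible variant that walks the string by code point and also escapes '"' could look like this. It is only a sketch, the class name CodePointEscaper is made up, and it needs Java 5 or later:

```java
final class CodePointEscaper {

    /** Escapes like the method above, but per Unicode code point and including '"'. */
    static String escapeHtml(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);       // full code point, not just one UTF-16 unit
            i += Character.charCount(cp);    // advance by 1 or 2 chars accordingly
            if (cp < 32 || cp > 127 || cp == '<' || cp == '>'
                    || cp == '&' || cp == '\'' || cp == '"') {
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }
}
```

With this, "a<b" becomes "a&#60;b" and a character such as U+1F600 becomes the single reference "&#128512;".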


One might consider ordering the conditions in the if statement by occurrence probability to improve performance...
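
A rough sketch of what such a reordering might look like, assuming that most input is ordinary letters (the class name OrderedEscaper and the helper needsEscape are made up, and whether this is actually faster would have to be measured):

```java
final class OrderedEscaper {

    /** Same decision as the original condition, with the assumed common case tested first. */
    static boolean needsEscape(char c) {
        // Letters and everything else between '?' and 127: never escaped, assumed most frequent.
        if (c > '>' && c <= 127) {
            return false;
        }
        // Non-ASCII and control characters: always escaped.
        if (c > 127 || c < 32) {
            return true;
        }
        // What remains is 32..62: only the markup characters are escaped.
        return c == '<' || c == '>' || c == '&' || c == '\'';
    }

    static String escapeHtml(String s) {
        StringBuilder buffer = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (needsEscape(c)) {
                buffer.append("&#").append((int) c).append(';');
            } else {
                buffer.append(c);
            }
        }
        return buffer.toString();
    }
}
```

For plain letters this takes the first branch and skips the remaining comparisons; the output is meant to be identical to the original method.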


HTH, drm

Christopher Williams wrote:
Here's a simple method to quote the most important character entities:

    /**
     * Handles a couple of problematic characters in strings that are printed to
     * an HTML stream, replacing them with their escaped equivalents
     * @param s an input string
     * @return the escaped string
     */
    public static String escapeSpaces(String s)
    {
        StringBuffer sb = new StringBuffer();
        int nChars = s.length();
        for (int i = 0; i < nChars; i++)
        {
            char c = s.charAt(i);
            if (' ' == c)
            {
                sb.append("&#032;");
            }
            else if ('>' == c)
            {
                sb.append("&gt;");
            }
            else if ('<' == c)
            {
                sb.append("&lt;");
            }
            else if ('\"' == c)
            {
                sb.append("&quot;");
            }
            else if ('&' == c)
            {
                sb.append("&amp;");
            }
            else
            {
                sb.append(c);
            }
        }
        return sb.toString();
    }

A more complete solution would be to look up the complete list of character
entities (e.g. in 'HTML and XHTML: The Definitive Guide'), build a lookup table
and use each character as an index into that table.
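
A minimal sketch of such a lookup table, assuming a table that covers the first 256 characters. Only a handful of entries are filled in here for illustration; the class name EntityTable is made up and the full entity list would have to come from a reference:

```java
final class EntityTable {

    // Indexed by character value; null means "no named entity, emit as-is".
    private static final String[] ENTITIES = new String[256];

    static {
        // Illustrative subset only, not the complete list of named entities.
        ENTITIES['"'] = "&quot;";
        ENTITIES['&'] = "&amp;";
        ENTITIES['<'] = "&lt;";
        ENTITIES['>'] = "&gt;";
        ENTITIES[160] = "&nbsp;";
        ENTITIES[169] = "&copy;";
        ENTITIES[233] = "&eacute;";
    }

    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Characters beyond the table are passed through unchanged in this sketch.
            String entity = c < ENTITIES.length ? ENTITIES[c] : null;
            if (entity != null) {
                sb.append(entity);
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

With these entries, "a & b" becomes "a &amp; b" and "caf\u00e9" becomes "caf&eacute;".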

Chris Williams.



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




