On Tuesday 30 December 2008 08:24, [email protected] wrote:
> Author: j16sdiz
> Date: 2008-12-30 08:24:13 +0000 (Tue, 30 Dec 2008)
> New Revision: 24846
> 
> Modified:
>    trunk/freenet/src/freenet/support/HTMLEncoder.java
> Log:
> encodeXML() method
> 
> Modified: trunk/freenet/src/freenet/support/HTMLEncoder.java
> ===================================================================
> --- trunk/freenet/src/freenet/support/HTMLEncoder.java        2008-12-30 07:26:32 UTC (rev 24845)
> +++ trunk/freenet/src/freenet/support/HTMLEncoder.java        2008-12-30 08:24:13 UTC (rev 24846)
> @@ -14,7 +14,7 @@
>  public class HTMLEncoder {
>       public final static CharTable charTable = 
>               new CharTable(HTMLEntities.encodeMap);
> -
> +     
>       public static String encode(String s) {
>               int n = s.length();
>               StringBuilder sb = new StringBuilder(n);
> @@ -41,6 +41,28 @@
>               }
>               
>       }
> +
> +     /**
> +      * Encode String so it is safe to be used in XML attribute value and text.
> +      * 
> +      * HTMLEncode.encode() use some HTML-specific entities (e.g. &) hence not suitable for
> +      * generic XML.
> +      */
> +     public static String encodeXML(String s) {
> +             // Extensible Markup Language (XML) 1.0 (Fifth Edition)
> +             // [10]         AttValue           ::=          '"' ([^<&"] | Reference)* '"'
> +             //                                              |   "'" ([^<&'] | Reference)* "'"
> +             // [14]         CharData           ::=          [^<&]* - ([^<&]* ']]>' [^<&]*)
> +             s = s.replace("&", "&#38;");
> +
> +             s = s.replace("\"", "&#34;");
> +             s = s.replace("'", "&#39;");
> +
> +             s = s.replace("<", "&#60;");
> +             s = s.replace(">", "&#62;"); // CharData can't contain ']]>'
> +
> +             return s;
> +     }

Why is this a blacklist rather than a whitelist? Why does it not encode
newlines? In other words, is it safe if fed arbitrary attacker-specified
data? If not, please label it clearly.
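
To illustrate what I mean by a whitelist, here is a rough sketch, not a proposed patch. The class name XmlWhitelistEncoder and the particular set of pass-through characters are my own assumptions and appear nowhere in the commit. The idea is that only characters known to be harmless are copied through, everything else becomes a numeric character reference, and anything outside the XML 1.0 Char production is replaced with U+FFFD because it cannot appear in a document even as a reference.

```java
// Hypothetical sketch of a whitelist-based encoder; not part of HTMLEncoder.
final class XmlWhitelistEncoder {
    private XmlWhitelistEncoder() {}

    // Assumed whitelist: ASCII letters, digits, space and a few punctuation marks.
    private static boolean isWhitelisted(int cp) {
        return (cp >= 'a' && cp <= 'z')
            || (cp >= 'A' && cp <= 'Z')
            || (cp >= '0' && cp <= '9')
            || cp == ' ' || cp == '.' || cp == ',' || cp == '-' || cp == '_';
    }

    // XML 1.0 production [2] Char: the code points a document may contain at all.
    private static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    static String encodeXML(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); ) {
            // Work on code points so surrogate pairs become a single reference.
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            if (isWhitelisted(cp)) {
                sb.appendCodePoint(cp);
            } else if (isXmlChar(cp)) {
                // Everything not explicitly allowed is written as &#NNN;
                sb.append("&#").append(cp).append(';');
            } else {
                // Not representable in XML 1.0 (e.g. NUL, lone surrogates).
                sb.append("&#65533;");
            }
        }
        return sb.toString();
    }
}
```

With this approach newlines come out as &#10;, so an attribute value survives attribute-value normalization, and a character nobody thought about is encoded by default rather than passed through by default.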

>               
>       private final static class CharTable{
>               private char[] chars;
