On Sat, Jan 10, 2009 at 9:16 AM, Matthew Toseland
<[email protected]> wrote:
> On Tuesday 30 December 2008 08:24, [email protected] wrote:
>> Author: j16sdiz
>> Date: 2008-12-30 08:24:13 +0000 (Tue, 30 Dec 2008)
>> New Revision: 24846
>>
>> Modified:
>>    trunk/freenet/src/freenet/support/HTMLEncoder.java
>> Log:
>> encodeXML() method
>>
>> Modified: trunk/freenet/src/freenet/support/HTMLEncoder.java
>> ===================================================================
>> --- trunk/freenet/src/freenet/support/HTMLEncoder.java        2008-12-30 07:26:32 UTC (rev 24845)
>> +++ trunk/freenet/src/freenet/support/HTMLEncoder.java        2008-12-30 08:24:13 UTC (rev 24846)
>> @@ -14,7 +14,7 @@
>>  public class HTMLEncoder {
>>       public final static CharTable charTable =
>>               new CharTable(HTMLEntities.encodeMap);
>> -
>> +
>>       public static String encode(String s) {
>>               int n = s.length();
>>               StringBuilder sb = new StringBuilder(n);
>> @@ -41,6 +41,28 @@
>>               }
>>
>>       }
>> +
>> +     /**
>> +      * Encode String so it is safe to be used in XML attribute value and text.
>> +      *
>> +      * HTMLEncode.encode() use some HTML-specific entities (e.g. &amp;) hence not suitable for
>> +      * generic XML.
>> +      */
>> +     public static String encodeXML(String s) {
>> +             // Extensible Markup Language (XML) 1.0 (Fifth Edition)
>> +             // [10]         AttValue           ::=          '"' ([^<&"] | Reference)* '"'
>> +             //                                          |   "'" ([^<&'] | Reference)* "'"
>> +             // [14]         CharData           ::=          [^<&]* - ([^<&]* ']]>' [^<&]*)
>> +             s = s.replace("&", "&#38;");
>> +
>> +             s = s.replace("\"", "&#34;");
>> +             s = s.replace("'", "&#39;");
>> +
>> +             s = s.replace("<", "&#60;");
>> +             s = s.replace(">", "&#62;"); // CharData can't contain ']]>'
>> +
>> +             return s;
>> +     }
>
> Why is this a blacklist rather than a whitelist? Why does it not encode double
> quotes, or newlines? In other words is it safe if fed arbitrary
> attacker-specified data? If not, please clearly label it.

It does encode double quotes. This blacklist is exactly what the XML 1.0
specification (5th ed.) specifies. Newlines are okay according to the
specification (but I haven't tested this in a real XML decoder).
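
The newline point can be checked against a real decoder. Below is a
minimal sketch, assuming the JDK's bundled DOM parser (javax.xml.parsers).
The class EncodeXmlRoundTrip and the helper roundTrip() are hypothetical
names; only encodeXML() mirrors the replacements in the commit. Per the
attribute-value normalization rules in XML 1.0 (section 3.3.3), a conforming
parser should turn a literal newline inside an attribute value into a space,
while a newline in element text is kept, so the two results are expected to
differ:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical harness; not part of the Freenet source tree.
final class EncodeXmlRoundTrip {

    // Same replacements as encodeXML() in r24846. '&' goes first so the
    // ampersands of the later character references are not re-encoded.
    static String encodeXML(String s) {
        s = s.replace("&", "&#38;");
        s = s.replace("\"", "&#34;");
        s = s.replace("'", "&#39;");
        s = s.replace("<", "&#60;");
        s = s.replace(">", "&#62;");
        return s;
    }

    // Builds <e a="ENCODED">ENCODED</e>, parses it, and returns
    // { decoded attribute value, decoded text content }.
    static String[] roundTrip(String raw) {
        String enc = encodeXML(raw);
        String xml = "<e a=\"" + enc + "\">" + enc + "</e>";
        try {
            DocumentBuilder db =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Element root =
                db.parse(new InputSource(new StringReader(xml))).getDocumentElement();
            return new String[] { root.getAttribute("a"), root.getTextContent() };
        } catch (Exception e) {
            // Parser configuration, I/O, or well-formedness failure.
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        String[] r = roundTrip("say \"hi\" & <bye>\nnext");
        // Make newlines visible when printing the decoded values.
        System.out.println("attribute: " + r[0].replace("\n", "\\n"));
        System.out.println("text:      " + r[1].replace("\n", "\\n"));
    }
}
```

If newlines in attribute values have to survive the round trip, writing them
as the character reference &#10; would be one option; the committed method
does not do that.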

>
>>
>>       private final static class CharTable{
>>               private char[] chars;
>
> _______________________________________________
> Devl mailing list
> [email protected]
> http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
>
