[ 
https://issues.apache.org/jira/browse/SLING-7658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494700#comment-16494700
 ] 

indra kumar gurjar commented on SLING-7658:
-------------------------------------------

[~kwin] I think XSSAPI.encodeForHTML() is working as designed, since it 
protects against introducing HTML entities. Unicode character escapes behave 
like HTML entities, not like plain text.

We expect the browser to render the encoded string as plain text, not to 
interpret it as an HTML entity. So when we encode "&#x2705;", we expect it to 
be rendered literally as "&#x2705;" and not as a different character, and 
XSSAPI.encodeForHTML() does exactly that.
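
To illustrate the behavior described above, here is a minimal sketch of an 
HTML text encoder. It is a simplified stand-in, not the Sling XSSAPI or OWASP 
Java Encoder implementation; the class and method names are hypothetical, and 
it assumes the escape in question is spelled as the hexadecimal reference for 
U+2705.

```java
// Simplified stand-in for an HTML text encoder (hypothetical names).
// This is NOT the Sling XSSAPI or OWASP Java Encoder code.
final class HtmlEncodeSketch {

    // Escapes every character that is significant in HTML text or attribute
    // values. The ampersand is always escaped, even when it starts something
    // that looks like a character reference.
    static String encodeForHtml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;"); break;
                case '<':  out.append("&lt;");  break;
                case '>':  out.append("&gt;");  break;
                case '"':  out.append("&#34;"); break;
                case '\'': out.append("&#39;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The escape sequence is treated as plain text, so its ampersand is escaped.
        System.out.println(encodeForHtml("&#x2705;")); // prints &amp;#x2705;
    }
}
```

With such an encoder the browser shows the eight characters of the escape 
itself, while a literal U+2705 character in the input passes through 
unchanged.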

I have opened an issue with the OWASP Java Encoder project 
([https://github.com/OWASP/owasp-java-encoder/issues/18]), and this is their 
view:

"The encoder is meant, on purpose, to encode all dangerous characters like you 
are describing. This is not the right tool for you. If you have HTML entities 
that you wish to preserve then your input is HTML. Consider using the OWASP 
HTML Sanitizer instead." [0]

Can XSSFilter [1] help here?
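
For comparison, the behavior requested in the issue (leaving numeric 
character references untouched while escaping everything else) could be 
sketched roughly as follows. This is only an illustration under assumptions: 
the class, method, and pattern are hypothetical, it is not what XSSFilter or 
the OWASP HTML Sanitizer do, named entities are not handled, and whether this 
is acceptable in every output context is not established here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: escape HTML-significant characters but keep
// well-formed numeric character references as they are.
final class PreserveNumericRefsSketch {

    // Matches decimal (&#9989;) or hexadecimal (&#x2705;) numeric references.
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:[0-9]{1,7}|[xX][0-9a-fA-F]{1,6});");

    static String encodePreservingNumericRefs(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = NUMERIC_REF.matcher(input);
        int last = 0;
        while (m.find()) {
            // Escape the text before the reference, then copy the reference verbatim.
            out.append(escape(input.substring(last, m.start())));
            out.append(m.group());
            last = m.end();
        }
        out.append(escape(input.substring(last)));
        return out.toString();
    }

    // Plain escaping; the ampersand is replaced first so that the entities
    // produced by the later replacements are not escaped a second time.
    private static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&#34;")
                .replace("'", "&#39;");
    }

    public static void main(String[] args) {
        System.out.println(encodePreservingNumericRefs("a & b &#x2705; <i>"));
        // prints a &amp; b &#x2705; &lt;i&gt;
    }
}
```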

[0]: [https://github.com/OWASP/java-html-sanitizer/blob/master/src/main/java/org/owasp/html/HtmlSanitizer.java]

[1]: [https://github.com/apache/sling-org-apache-sling-xss/blob/master/src/main/java/org/apache/sling/xss/XSSFilter.java#L51]

 

> XSSApi.encodeForHtml doesn't leave unicode character escapes untouched
> ----------------------------------------------------------------------
>
>                 Key: SLING-7658
>                 URL: https://issues.apache.org/jira/browse/SLING-7658
>             Project: Sling
>          Issue Type: Bug
>          Components: XSS Protection API
>    Affects Versions: XSS Protection API 2.1.0
>            Reporter: Konrad Windszus
>            Priority: Major
>         Attachments: SLING-7658-v01.patch
>
>
> Whenever {{encodeForHtml}} is called with a string containing a Unicode 
> character escape 
> (https://www.w3.org/International/questions/qa-escapes#answer), the {{&}} 
> gets escaped again, i.e. {{&#x2705;}} becomes {{&amp;#x2705;}}.
> Compare with the discussion at 
> https://www.mail-archive.com/[email protected]/msg76863.html.
> Attached is a patch with a failing test.
> Another use case for using unicode character escapes is the soft hyphen 
> (https://developer.mozilla.org/en-US/docs/Web/CSS/hyphens#Suggesting_line_break_opportunities).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
