Hi Joris,

I suspect it's just how the web has developed, where the mixing of
JavaScript and imperfect HTML is normal.

I quite like this video as a demo:

https://www.youtube.com/watch?v=lG7U3fuNw3A

I think it illustrates your point when it compares how these two are parsed:

1) <div><script title="</div>">
2) <script><div title="</script">

My favourite exploit is very similar...

*<script>*
user_name = "Craig*</script>*Hello";
</script>
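
For what it's worth, here is a minimal sketch of one way to avoid that when
the value really does have to go in an inline script: JSON-encode it and
escape every `<`, so the HTML parser never sees `</script` inside the string.
The helper name `jsonForScript` is made up for illustration, and this assumes
the output is only ever placed inside a <script> block, not in an attribute.

```javascript
// Hypothetical helper: encode a value for use inside an inline <script> block.
function jsonForScript(value) {
  // JSON.stringify handles quotes and backslashes; replacing "<" with its
  // \u003c escape stops the HTML parser from seeing "</script" or "<!--".
  return JSON.stringify(value).replace(/</g, '\\u003c');
}

var encoded = jsonForScript('Craig</script>Hello');
console.log('user_name = ' + encoded + ';');
```

The JavaScript engine turns the \u003c escape back into `<` when it reads the
string literal, so the script still gets the original value.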

Personally I'd like to be able to tell the browser, similar to the
old/obsolete <plaintext> element, that it won't find any JavaScript code
after this point (maybe it could block all scripts in the <body>?)... but
that's only because I load my JS files in the <head>, and attach event
listeners after DOMContentLoaded. So few developers do this that it
probably wouldn't be useful to add.

I think this is the main reason Content Security Policy came into
existence, where I can skip "unsafe-inline" to block any inline JavaScript,
and limit the JavaScript files that can be included.
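
As a rough sketch of what such a policy looks like, here is a header value
built from a list of directives. The `buildCsp` helper and the
`https://static.example.com` host are placeholders rather than anything from
a real setup; the point is only that 'unsafe-inline' is never listed in
script-src.

```javascript
// Hypothetical helper: join CSP directives into a single header value.
function buildCsp(directives) {
  return Object.keys(directives).map(function (name) {
    return name + ' ' + directives[name].join(' ');
  }).join('; ');
}

var policy = buildCsp({
  'default-src': ["'none'"],
  // Only script files from these origins; inline scripts are blocked
  // because 'unsafe-inline' is not listed.
  'script-src': ["'self'", 'https://static.example.com'],
  'style-src': ["'self'"]
});

console.log('Content-Security-Policy: ' + policy);
```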

You can kind of get an idea of what the browser's parser does by using
JavaScript to load the HTML into a <template> element... but that does
raise the question of how you get the unsafe variables to the JavaScript
in the first place.
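
A minimal sketch of that idea, assuming it runs in a browser (the function
name `findScripts` and the optional `doc` argument are just for illustration):

```javascript
// Sketch: let the browser's own parser decide what counts as a <script>.
// `doc` defaults to the page's document, so this needs a browser to run.
function findScripts(html, doc) {
  doc = doc || document;
  var template = doc.createElement('template');
  // Content parsed into a <template> is inert: scripts do not run.
  template.innerHTML = html;
  return template.content.querySelectorAll('script');
}
```

For example, `findScripts('<div><script title="</div>">').length` shows
whether the parser treated that markup as containing a script element. It
only finds <script> elements, not things like onclick attributes, and the
fragment parsing used for a <template> is not guaranteed to match how the
same markup would be parsed as part of a full page.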

As an aside, I use <meta name="js_data" content="..." /> tags... sometimes
with JSON encoded data in the content attribute, where I'd use something
like the following to get the content:

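  // Find the <meta> tag holding the data (may not exist on every page).
  var my_data = document.querySelector('meta[name="js_data"]');
  if (my_data) {
    try {
      // The content attribute holds JSON; the browser has already decoded
      // the HTML entities by the time getAttribute() returns it.
      my_data = JSON.parse(my_data.getAttribute('content'));
    } catch (e) {
      my_data = null; // Treat malformed JSON the same as missing data.
    }
  }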

But going forwards, the HTML5 spec does cover how the browser (and
third-party libraries) should be parsing imperfect HTML, so hopefully these
differences will reduce (but I don't imagine they will all be perfectly
aligned, in the same way different browsers aren't).

Craig





On Sun, 7 Apr 2019 at 09:00, joris <joris.gutj...@gmail.com> wrote:

>
> I agree, that would be a vulnerability.
> But I think this is not the core of my question.
> I wonder, why do Web developers have to
> guess what the Browser thinks is JS and executes,
> and what it doesn't?
> Why can't they just ask the Web Browser to do that
> for them?
> That would be more secure because
> all third-party libraries parse somewhat differently
> from the Web Browsers they are used with.
> On 4/6/19 12:51 PM, Craig Francis wrote:
>
> I quite like the simplicity of this idea; it kind of reminds
> me of the @inert attribute.
>
> My main concern is how easily it could be bypassed. Take the code:
>
> <div noscripts="true"><?= $unsafe_user_name ?></div>
>
> Where the attacker can set their username to `X*</div>*
> <script>evil_code</script><div>`
>
> ---
>
> Unfortunately, I think this is why we need to work with more
> complicated/advanced solutions...
>
> We need to sanitise all strings that are included in the HTML on the
> server side - e.g. using templating systems; or passing the string through
> something like HTML Purifier:
>
> http://htmlpurifier.org/
>
> Or, and you have to be careful here... escaping all HTML output through
> functions like htmlentities() / htmlencode(), where this does not fix `<a
> href=<?= htmlentities($unsafe_url) ?>>` due to the url being able to start
> with `javascript:`, or being able to take advantage of the missing
> quotation marks on the attribute via ` onclick=evil_code`.
>
> And when working with strings in JavaScript - you should use safe methods
> like `element.textContent`, or pass them through something to sanitise the
> HTML (both in removing the many ways JavaScript can be included, but also
> just making sure the HTML is well formed):
>
>
> https://github.com/google/closure-library/blob/master/closure/goog/html/sanitizer/htmlsanitizer.js
>
> https://github.com/punkave/sanitize-html
>
> Then you would ideally add a Content Security Policy to limit the scripts
> on the page, just in case you miss something.
>
> https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP
>
> And as an extra bonus, start playing with the (currently in development)
> Trusted Types, to make sure you aren't using unsafe things like
> element.innerHTML.
>
> https://developers.google.com/web/updates/2019/02/trusted-types
>
> Or for even more fun (pain), on your local development server, try setting
> the header:
>
>     Content-Type: application/xhtml+xml; charset=UTF-8
>
> Do not do this on live, as any bad formatting of your HTML will break the
> page - but this ensures all of your attributes are quoted, and all of your
> tags are perfectly nested (this includes `<br>` needing to be `<br />`, the
> attribute `selected` needing to be `selected="selected"`, etc).
>
> Craig
>
>
>
> On Fri, 5 Apr 2019 at 23:47, Yog Bii <joris.gutj...@gmail.com> wrote:
>
>> XSS prevention is a very important and costly part of a website's security.
>> Because XSS is currently prevented by matching for JS in user input,
>> which is then either blocked or masked by the Web Developer, each on his
>> own site,
>> XSS attacks find differences between the matching of the Web Developer
>> and the Browser, such that the Web Developer's matching doesn't
>> recognize JS as JS, but the Browser executes it.
>>
>> This is a constant fight between the Web Developer and the XSS attacker,
>> that costs many resources needed somewhere else instead.
>> And this fight favors larger business over small Web developers.
>>
>> I think that this fight can be ended by letting the
>> Web Developer not guess what the Browser may think to be JS
>> and instead tell it explicitly that somewhere there shouldn't be any code.
>> The Browser then behaves in that region as if
>> it had JS disabled.
>>
>> I would do that with a new attribute, called noscripts.
>> Inside an HTML element with noscripts = "true",
>> the Browser handles anything inside that element as if
>> JS were disabled globally.
>>
>> An example HTML would look like this:
>> <!doctype html>
>> <html>
>> ...
>> <div noscripts="true">
>> <script>
>> // No danger by unescaped <script> tags
>> </script>
>> <button onclick="nor by Event listeners">Click me</button>
>> ...
>> </html>
>>
>> If you know a way to do this without any differences between what the
>> Browser executes and whatever that mechanism lets pass, let me know
>> and let me know why it isn't taught in every HTML/JS Tutorial and
>> every Documentation about Web Development.
>> _______________________________________________
>> dev-security mailing list
>> dev-security@lists.mozilla.org
>> https://lists.mozilla.org/listinfo/dev-security
>>
>
>
