On Tue, Aug 17, 2021 at 6:59 AM Toon Verwaest <[email protected]> wrote:
> Thinking out loud: One idea could be to have a separate sandboxed compiler
> process in which we compile incoming JS code. That could reject the source
> if it doesn't compile; or compile it to a script that just throws with no
> additional info about the actual source.
>
> That process could implement streaming compilation; so we don't block
> streaming until later, we don't double parse, we still have a sandbox (not
> in the network process). There might even be benefits for caching as a
> compromised renderer cannot look at the compilation artefacts until it
> receives them.
>
> If we fully compile and create a code cache from the compilation result we
> don't need a new API on the V8 side, but do additional
> serialization/deserialization work. That should be faster than reparsing
> though. The upper limit of the cost would essentially be the cost of
> serializing / deserializing a code cache for each script.
>
This seems like an interesting idea. I wonder if compilation (with no
evaluation / running of scripts) would be considered safe enough to handle
in a single process that is not bound/locked to an origin or site.

One thing that I don't fully understand (for both the full-JS-parsing and
the partial/hackish-non-JS-detection approaches) is whether the encoding
(e.g. UTF-8 vs UTF-16LE vs Windows-1250) has to be known and communicated
upfront to the parser/sniffer. Or maybe the input to the decoder needs to
already be in UTF-8? Or maybe something in the //net or //network layers
can already handle this aspect of the problem (e.g. ensuring UTF-8 in
URLLoader::DidRead)?
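
To make the encoding question concrete, here is a minimal Python sketch (an
illustration only, not anything taken from Chromium or V8). It shows that the
same leading bytes can yield a different first character, or fail to decode
at all, depending on which encoding the sniffer assumes. The helper name
first_char and the sample bytes are made up for this example.

```python
from typing import Optional


def first_char(data: bytes, encoding: str) -> Optional[str]:
    """Return the first decoded character, or None if the bytes do not decode."""
    try:
        text = data.decode(encoding)
    except UnicodeDecodeError:
        return None
    return text[0] if text else None


# Hypothetical sample: four arbitrary bytes from the start of a response body.
SAMPLE = b"\xd0\xcf\x11\xe0"

for encoding in ("latin-1", "cp1250", "utf-16-le", "utf-8"):
    ch = first_char(SAMPLE, encoding)
    shown = "undecodable" if ch is None else f"U+{ord(ch):04X}"
    print(f"{encoding:10} -> {shown}")
```

If this is right, a sniffer that classifies the first character either has to
agree with the eventual parser on the encoding, or has to be handed text that
was already decoded.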

Also, when trying to explore the partial/hackish-non-JS-detection idea, I
wondered if the very first character in a script may only come from a
relatively limited set of characters. Let's assume that the sniffer can
skip whitespace (space, tab, CR, LF, LS, PS) and HTML/XML comments (e.g.
<!-- ... -->). AFAICT the very next character has to be one of:

- The start of a reserved keyword like "if", "let", etc. (all lowercase
  ASCII)
- The start of an identifier (any Unicode code point with the Unicode
  property “ID_Start”)
- The start of a unary expression: + - ~ !
- The start of a string literal, template literal, or regexp literal
  (or non-HTML comment): " ' ` /
- The start of a numeric literal: 0-9
- An opening paren, bracket or brace: ( [ {
- Not quite sure if a dot or an equal sign can appear as the very first
  character: . =

This would reject PDFs (which start with %) and HTML/XML (which start with
<), but would still accept ZIP files (the first byte is 0x50, a capital P)
and MSOffice files (the first byte is 0xD0, and U+00D0 has the Unicode
ID_Start property). Rejecting ZIP and MSOffice files would require going
beyond the first character, maybe by rejecting control characters like 0x11
or 0x03 outside of comments (not sure if at this point the sniffer's
heuristics are starting to get too complex).
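
The two heuristics above (first-character check, then a control-character
check) could be sketched roughly as follows in Python. This is a rough
illustration under several assumptions, not a proposed implementation: the
input is assumed to be already decoded to text (Latin-1 in the ZIP/MSOffice
cases), ID_Start is approximated by Unicode general categories, <!-- ... -->
is treated as a block comment, and the control-character scan does not
exclude comments. The sketch also accepts $, _, \ and ; as first characters,
which are not in the list above but can begin an identifier or an empty
statement. All function and constant names are made up.

```python
import re
import unicodedata

# Whitespace the sniffer skips, per the list above: space, tab, CR, LF, LS, PS.
WHITESPACE = " \t\r\n\u2028\u2029"
# Punctuation that may start a script per the list above. '.' and '=' are
# left out because the message is unsure about them.
PUNCTUATION_STARTS = frozenset("+-~!\"'`/([{")
# Assumption, not in the list above: '$', '_' and '\' (a Unicode escape)
# can also begin an identifier, and ';' is an empty statement.
EXTRA_STARTS = frozenset("$_\\;")
# Approximation of ID_Start: letter categories plus letter numbers.
ID_START_CATEGORIES = frozenset({"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"})
# Treats <!-- ... --> as a block comment, as the message describes it.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def skip_trivia(text: str) -> str:
    """Drop leading whitespace and complete <!-- ... --> comments."""
    while True:
        text = text.lstrip(WHITESPACE)
        match = HTML_COMMENT.match(text)
        if match is None:
            return text
        text = text[match.end():]


def may_start_script(text: str) -> bool:
    """First-character check: False means 'cannot be the start of JS'."""
    rest = skip_trivia(text)
    if not rest:
        return True  # only trivia seen so far, so nothing rules out JS
    ch = rest[0]
    return (
        ch in "0123456789"
        or ch in PUNCTUATION_STARTS
        or ch in EXTRA_STARTS
        or unicodedata.category(ch) in ID_START_CATEGORIES
    )


def has_control_char(text: str) -> bool:
    """Second heuristic: any C0 control character other than common whitespace.

    Simplification: comments are not excluded from the scan.
    """
    return any(ord(ch) < 0x20 and ch not in "\t\n\r\x0b\x0c" for ch in text)
```

With these definitions, "%PDF-1.7" and "<html>" fail may_start_script, while
"PK\x03\x04" (ZIP) and "\u00d0\u00cf\u0011\u00e0" (MSOffice, decoded as
Latin-1) pass it and are only caught by has_control_char.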
> On Fri, Aug 13, 2021 at 12:26 AM 'Łukasz Anforowicz' via v8-dev <
> [email protected]> wrote:
>
>> On Thu, Aug 12, 2021 at 3:18 PM Łukasz Anforowicz <[email protected]>
>> wrote:
>>
>>>
>>>
>>> On Thu, Aug 12, 2021 at 3:11 PM Jakob Kummerow <[email protected]>
>>> wrote:
>>>
>>>>> ORB-with-html/json/xml-sniffing shows that some security benefits of
>>>>> ORB may be realized without full-fidelity JS sniffing/parsing.
>>>>>
>>>>>
>>>> You may call it a security benefit to block "obvious" parser breakers
>>>> like )]}', but in general, any "when in doubt, don't block it"
>>>> strategy won't be much of an obstacle to intentional attacks. For instance,
>>>> once Mr. Bad Guy has learned that the sniffer only looks at the first 1024
>>>> characters, they can send a response whose first 1024 characters lead to a
>>>> "well, it *might* be valid JS" judgement (such as a JS comment, or
>>>> long string, or whatever). OTOH any "when in doubt, block it" strategy runs
>>>> the risk of breaking existing websites in those doubtful cases.
>>>>
>>>
>>> In the CORB threat model the attacker does *not* control the responses -
>>> CORB tries to prevent https://attacker.com (with either Spectre or a
>>> compromised renderer) from being able to read no-cors responses from
>>> https://victim.com.
>>>
>>>>
>>>>
>>>>> (Although the JSON object syntax is exactly Javascript's
>>>>> object-initializer syntax, a Javascript object-initializer expression is
>>>>> not valid as a standalone Javascript statement.)
>>>>
>>>>
>>>> There is (at least) one subtlety here: JS is more permissive than the
>>>> official JSON spec. The latter requires quotes around property names, the
>>>> former doesn't. I.e. {"foo": is indeed never valid JS, but {foo: is
>>>> (the brace opens a code block, and foo is a label). Also, the colon is
>>>> essential for rejecting the former snippet, because {"foo"; is valid
>>>> JS (code block plus ignored string à la "use strict";), so this is a
>>>> concrete example where the 1024-char prefix issue is relevant.
>>>>
>>>>
>>>>> When the sniffer sees:
>>>>> [ 123, 456, "long string taking X bytes",
>>>>> then it should block the response when the Content-Type is a JSON MIME
>>>>> type
>>>>
>>>>
>>>> I don't follow. When the Content-Type is JSON, and the actual contents
>>>> are valid JSON, why should that be blocked?
>>>>
>>>
>>> Correct. There is no way to read cross-origin JSON via a "no-cors"
>>> fetch. The only way to read cross-origin JSON is via CORS-mediated fetch
>>> (where the victim has to opt-in by responding with
>>> "Access-Control-Allow-Origin: ...").
>>>
>>
>> Maybe another way to look at it is:
>>
>> - Only JavaScript (and images/audio/video/stylesheets) can be sent in
>>   no-cors mode (i.e. without CORS). Non-JavaScript (and
>>   non-image/video/etc), no-cors, cross-origin responses can be blocked.
>> - If the response sniffs as JSON (Content-Type=JSON and
>>   First1024bytes=JSON) then it is *not* JavaScript. Therefore we can block
>>   the response (and prevent disclosing https://victim.com/secret.json
>>   to a no-cors fetch from https://attacker.com).
>>
>>
>>
>>>
>>>> --
>>>> --
>>>> v8-dev mailing list
>>>> [email protected]
>>>> http://groups.google.com/group/v8-dev
>>>> ---
>>>> You received this message because you are subscribed to a topic in the
>>>> Google Groups "v8-dev" group.
>>>> To unsubscribe from this topic, visit
>>>> https://groups.google.com/d/topic/v8-dev/NGGCw9OjatI/unsubscribe.
>>>> To unsubscribe from this group and all its topics, send an email to
>>>> [email protected].
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/v8-dev/CAKSzg3TNvd1jd3yH8xyD767ZhbCqhEZJMFmm7nQ%2BtcQcXfjt_g%40mail.gmail.com
>>>> <https://groups.google.com/d/msgid/v8-dev/CAKSzg3TNvd1jd3yH8xyD767ZhbCqhEZJMFmm7nQ%2BtcQcXfjt_g%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>>> .
>>>>
>>>
>>>
>>> --
>>> Thanks,
>>>
>>> Lukasz
>>>
>>
>>
>> --
>> Thanks,
>>
>> Lukasz
>>
--
Thanks,
Lukasz
--
--
v8-dev mailing list
[email protected]
http://groups.google.com/group/v8-dev
---
You received this message because you are subscribed to the Google Groups
"v8-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/v8-dev/CAA_NCUHjjiB9kMbyk%2Bn1ZMEda%2B8Oehr6ukU1VkK0vt9pcW%2B%3DuQ%40mail.gmail.com.