On Wed, Aug 18, 2021 at 2:29 AM 'Łukasz Anforowicz' via v8-dev < [email protected]> wrote:
>
>
> On Tue, Aug 17, 2021 at 6:59 AM Toon Verwaest <[email protected]>
> wrote:
>
>> Thinking out loud: One idea could be to have a separate sandboxed
>> compiler process in which we compile incoming JS code. That could reject
>> the source if it doesn't compile; or compile it to a script that just
>> throws with no additional info about the actual source.
>>
>> That process could implement streaming compilation; so we don't block
>> streaming until later, we don't double parse, we still have a sandbox (not
>> in the network process). There might even be benefits for caching as a
>> compromised renderer cannot look at the compilation artefacts until it
>> receives them.
>>
>> If we fully compile and create a code cache from the compilation result
>> we don't need a new API on the V8 side, but do additional
>> serialization/deserialization work. That should be faster than reparsing
>> though. The upper limit of the cost would essentially be the cost of
>> serializing / deserializing a code cache for each script.
>>
>
> This seems like an interesting idea. I wonder if compilation (no
> evaluation / running of scripts) would be considered safe enough to handle
> in a single (not origin/site-bound/locked) process.
>

The parser/compiler aren't tiny, so it's not unlikely there's a bug. It's
certainly much less easy to control such bugs than full-blown JS OOB access
though. I could imagine a security bug replacing scripts in another site
(assuming it's sandboxed so well that it can't do much else), which would
be terrible; and it's unclear to me how easy that would be.

>
> One thing that I don't fully understand (For both full-JS-parsing and
> partial/hackish-non-JS-detection approaches) is if the encoding (e.g. UTF8
> vs UTF16-LE vs Win-1250) has to be known and communicated upfront to the
> parser/sniffer? Or maybe the input to the decoder needs to be already in
> UTF8?
> Or maybe something in //net or //network layers can already handle
> this aspect of the problem (e.g. ensuring UTF8 in URLLoader::DidRead)?
>

There's some encoding guessing happening before we streaming compile
(https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/bindings/core/v8/script_streamer.cc;l=584;drc=f0b502c3c977f47c58b49506629b2dd8353e4c59;bpv=1;bpt=1)
and some afterwards; and if we initially compiled with the wrong encoding
we discard and redo iirc. Presumably compilation failed anyway if the
encoding was wrong; but this presumably also doesn't happen too often.

>
> Also - when trying to explore the partial/hackish-non-JS-detection idea, I
> wondered if the very first character in a script may only come from a
> relatively limited set of characters? Let's assume that the sniffer can
> skip whitespace (space, tab, CR, LF, LS, PS) and html/xml comments (e.g.
> <!-- ... -->) - AFAICT the very next character has to be either:
>
>    - The start of a reserved keyword like "if", "let", etc. (all
>      lowercase ASCII)
>    - The start of an identifier (any Unicode code point with the Unicode
>      property “ID_Start”)
>    - The start of a unary expression: + - ~ !
>    - The start of a string literal, string template, or a regexp literal
>      (or non-HTML comment): " ' ` /
>    - The start of a numeric literal: 0-9
>    - An opening paren, bracket or brace: ( [ {
>    - Not quite sure if a dot or an equal sign can appear as the very
>      first character: . =
>
> This would reject PDFs (starts with %) and HTML/XML (starts with <), but
> still would accept ZIP files (first character is a 0x50 - capital P) and
> MSOffice files (first character is a 0xD0 which according to Unicode has
> ID_Start property set to true). Rejecting ZIP and MSOffice files would
> require going beyond the first character - maybe rejecting control
> characters like 0x11 or 0x03 outside of comments (not sure if at this point
> the sniffer's heuristics are starting to get too complex).
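
To make the first-character heuristic quoted above concrete, here is a
rough sketch in Python. It is only an illustration under assumptions, not
anything that exists in Chromium or V8: the function names are made up,
the input is assumed to be already decoded to text, "$" and "_" are added
to the identifier starts, and Python's str.isidentifier() is used as a
stand-in for the Unicode ID_Start check. The dot and the equal sign are
left out because the list above is unsure about them.

```python
WHITESPACE = " \t\r\n\u2028\u2029"  # space, tab, CR, LF, LS, PS
# Unary operators, string/template/regexp/comment starts, opening brackets.
PUNCTUATION_STARTS = set("+-~!\"'`/([{")
# Assumption: JS identifiers may also start with "$" or "_".
EXTRA_IDENTIFIER_STARTS = set("$_")
ASCII_DIGITS = set("0123456789")


def skip_trivia(text: str) -> int:
    """Return the index just past leading whitespace and <!-- ... --> comments."""
    i = 0
    while i < len(text):
        if text[i] in WHITESPACE:
            i += 1
        elif text.startswith("<!--", i):
            end = text.find("-->", i + 4)
            if end == -1:
                return len(text)  # unterminated comment: nothing left to inspect
            i = end + 3
        else:
            break
    return i


def may_start_javascript(text: str) -> bool:
    """Hypothetical sniffer: False means "cannot be the start of a script"."""
    i = skip_trivia(text)
    if i == len(text):
        # Assumption: an empty or comment-only prefix is "in doubt", not rejected.
        return True
    ch = text[i]
    return (
        ch in PUNCTUATION_STARTS
        or ch in EXTRA_IDENTIFIER_STARTS
        or ch in ASCII_DIGITS
        # Approximation of ID_Start; also covers keywords like "if" and "let".
        or ch.isidentifier()
    )
```

With this sketch, "%PDF-1.7" and "<html>" are rejected, while the ZIP
signature b"PK\x03\x04" and the MSOffice signature b"\xd0\xcf\x11\xe0"
(decoded as Latin-1 here, purely for illustration) are still accepted,
which matches the limitation described above.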
>

That was my initial thought too for, e.g., PDF. You'd be blacklisting files
you don't want to leak vs whitelisting JS though, which isn't entirely
ideal security-wise. It might be better than the alternative though, if we
otherwise end up either slowing down the web (repeated parsing,
interfering with streaming) or potentially introducing new security issues
through a shared compiler process.

>
>
>> On Fri, Aug 13, 2021 at 12:26 AM 'Łukasz Anforowicz' via v8-dev <
>> [email protected]> wrote:
>>
>>> On Thu, Aug 12, 2021 at 3:18 PM Łukasz Anforowicz <[email protected]>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Thu, Aug 12, 2021 at 3:11 PM Jakob Kummerow <[email protected]>
>>>> wrote:
>>>>
>>>>>> ORB-with-html/json/xml-sniffing shows that some security benefits of
>>>>>> ORB may be realized without full-fidelity JS sniffing/parsing.
>>>>>>
>>>>>>
>>>>> You may call it a security benefit to block "obvious" parser breakers
>>>>> like )]}', but in general, any "when in doubt, don't block it"
>>>>> strategy won't be much of an obstacle to intentional attacks. For
>>>>> instance, once Mr. Bad Guy has learned that the sniffer only looks at
>>>>> the first 1024 characters, they can send a response whose first 1024
>>>>> characters lead to a "well, it *might* be valid JS" judgement (such as
>>>>> a JS comment, or long string, or whatever). OTOH any "when in doubt,
>>>>> block it" strategy runs the risk of breaking existing websites in
>>>>> those doubtful cases.
>>>>>
>>>>
>>>> In CORB threat model the attacker does *not* control the responses -
>>>> CORB tries to prevent https://attacker.com (with either Spectre or a
>>>> compromised renderer) from being able to read no-cors responses from
>>>> https://victim.com.
>>>>
>>>>>
>>>>>
>>>>>> (Although the JSON object syntax is exactly Javascript's
>>>>>> object-initializer syntax, a Javascript object-initializer expression is
>>>>>> not valid as a standalone Javascript statement.)
>>>>>
>>>>>
>>>>> There is (at least) one subtlety here: JS is more permissive than the
>>>>> official JSON spec. The latter requires quotes around property names, the
>>>>> former doesn't. I.e. {"foo": is indeed never valid JS, but {foo: is
>>>>> (the brace opens a code block, and foo is a label). Also, the colon is
>>>>> essential for rejecting the former snippet, because {"foo"; is valid
>>>>> JS (code block plus ignored string à la "use strict";), so this is a
>>>>> concrete example where the 1024-char prefix issue is relevant.
>>>>>
>>>>>
>>>>>> When the sniffer sees:
>>>>>> [ 123, 456, “long string taking X bytes”,
>>>>>> then it should block the response when the Content-Type is a JSON
>>>>>> MIME type
>>>>>
>>>>>
>>>>> I don't follow. When the Content-Type is JSON, and the actual contents
>>>>> are valid JSON, why should that be blocked?
>>>>>
>>>>
>>>> Correct. There is no way to read cross-origin JSON via a "no-cors"
>>>> fetch. The only way to read cross-origin JSON is via CORS-mediated fetch
>>>> (where the victim has to opt-in by responding with
>>>> "Access-Control-Allow-Origin: ...").
>>>>
>>>
>>> Maybe another way to look at it is:
>>>
>>>    - Only Javascript (and images/audio/video/stylesheets) can be sent
>>>      in no-cors mode (i.e. without CORS). Non-Javascript (and
>>>      non-image/video/etc), no-cors, cross-origin responses can be blocked.
>>>    - If the response sniffs as JSON (Content-Type=JSON and
>>>      First1024bytes=JSON) then it is *not* Javascript. Therefore we can block
>>>      the response (and prevent disclosing https://victim.com/secret.json
>>>      to a no-cors fetch from https://attacker.com).
>>>
>>>
>>>
>>>>
>>>>> --
>>>>> --
>>>>> v8-dev mailing list
>>>>> [email protected]
>>>>> http://groups.google.com/group/v8-dev
>>>>> ---
>>>>> You received this message because you are subscribed to a topic in the
>>>>> Google Groups "v8-dev" group.
>>>>> To unsubscribe from this topic, visit
>>>>> https://groups.google.com/d/topic/v8-dev/NGGCw9OjatI/unsubscribe.
>>>>> To unsubscribe from this group and all its topics, send an email to
>>>>> [email protected].
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/v8-dev/CAKSzg3TNvd1jd3yH8xyD767ZhbCqhEZJMFmm7nQ%2BtcQcXfjt_g%40mail.gmail.com
>>>>> .
>>>>>
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>>
>>>> Lukasz
>>>>
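
For reference, the JSON-versus-JS distinction quoted earlier in the thread
({"foo": is never valid JS, while {foo: and {"foo"; are) together with the
"Content-Type=JSON and First1024bytes=JSON" rule could be sketched roughly
as below. This is a simplified illustration under assumptions, not the
actual CORB/ORB implementation: the function names, the list of JSON MIME
types and the regular expression are all made up for the example, only the
object case is covered (not arrays), and the 1024 limit is applied to
decoded characters rather than bytes.

```python
import re

SNIFF_LIMIT = 1024  # the sniffer only looks at this many leading characters

# '{' followed by a complete double-quoted string and then a colon.
# The colon matters: {"foo"; is valid JS (a block with an ignored string).
_JSON_OBJECT_PREFIX = re.compile(r'\s*\{\s*"(?:[^"\\]|\\.)*"\s*:')


def is_json_mime_type(content_type: str) -> bool:
    """Assumed (not exhaustive) notion of a JSON Content-Type."""
    essence = content_type.split(";", 1)[0].strip().lower()
    return essence in ("application/json", "text/json") or essence.endswith("+json")


def sniffs_as_json_object(body: str) -> bool:
    """True if the sniffed prefix starts like a JSON object, which is never JS."""
    return _JSON_OBJECT_PREFIX.match(body[:SNIFF_LIMIT]) is not None


def should_block_no_cors(content_type: str, body: str) -> bool:
    """Block only when both the declared type and the sniffed prefix say JSON."""
    return is_json_mime_type(content_type) and sniffs_as_json_object(body)
```

Note that a property name longer than the sniffed prefix never reaches its
closing quote and colon, so the sketch does not block it. That is the
1024-char prefix issue mentioned in the quoted discussion.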
