Re: [v8-dev] Utility to check if a given stream can parse as Javascript (ORB)

'Daniel Vogelheim' via v8-dev Thu, 02 Jun 2022 08:36:03 -0700

On Thursday, June 2, 2022 at 9:46:15 AM UTC+2 [email protected] wrote:

> Can we not detect these via some magic number sniffing? I'm fundamentally 
> concerned about an allowlist approach for JS over a blocklist approach for 
> non-JS.
>

This is pretty much the heart of the issue: the whole point of the 
CORB-to-ORB transition is to go from a "blocklist" to an "allowlist", based 
on the observation that blocklists ultimately never seem to work. In 
particular, we don't want a pass-by-default policy, where anything we don't 
recognize automatically passes. That does lead us to an allowlist, in some form. 
Elsewhere, I summarized (my understanding of) the ORB security requirements 
as this: For "no-cors" requests, we want to have some positive evidence 
that the data we're receiving is in a format suitable for the request type.

Being able to drop unknown stuff by default is really the core benefit of 
ORB.
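
To make the difference between the two defaults concrete, here is a minimal 
sketch in Python. Every name in it is hypothetical, and looks_like_script is 
only a toy stand-in for whatever "positive evidence" check ends up being 
chosen; it is not what the ORB draft specifies.

```python
from typing import Callable, Sequence

Check = Callable[[bytes], bool]


def looks_like_pdf(body: bytes) -> bool:
    # PDF files start with the magic bytes "%PDF-".
    return body.startswith(b"%PDF-")


def looks_like_script(body: bytes) -> bool:
    # Toy stand-in for "positive evidence" of JS; NOT a real detector.
    return body.lstrip().startswith((b"var ", b"let ", b"const ", b"function "))


def blocklist_allows(body: bytes, known_bad: Sequence[Check]) -> bool:
    # Default-allow: anything not recognised as bad passes.
    return not any(check(body) for check in known_bad)


def allowlist_allows(body: bytes, known_good: Sequence[Check]) -> bool:
    # Default-deny: only bodies with positive evidence pass.
    return any(check(body) for check in known_good)


# Some binary blob that nobody thought to put on a blocklist.
unknown_format = b"\x08\x96\x01\x12\x07testing"

print(blocklist_allows(unknown_format, [looks_like_pdf]))     # passes by default
print(allowlist_allows(unknown_format, [looks_like_script]))  # dropped by default
```

The unknown blob slips through the blocklist simply because no check 
recognises it, while the allowlist drops it for the same reason.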

I do think we have quite a bit of leeway to decide what form of "positive 
evidence" we'll accept. The current draft specifies a full JS parse, which 
I think is way over the top. But I do think we need *something* that tells 
us with some probability whether a given byte sequence looks like JS or 
not. The only hard criterion is that actually valid JS should pass, because 
otherwise we'll break websites left and right. (To that end, "while (1);" 
was arguably a terrible example.) (Caveat: Those are my opinions. Other 
browsers might have stronger opinions.)

IMHO, checking for "parser breakers", the way CORB does, is a convenient 
temporary solution, because we already know it's web compatible.
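
As a rough illustration of a "parser breaker" check, here is a minimal 
Python sketch. The prefix list is an assumption for illustration only; the 
authoritative list is whatever CORB actually ships, and the function name is 
hypothetical.

```python
# Illustrative JSON security prefixes; assumed, not copied from CORB.
PARSER_BREAKERS = (b")]}'", b"{}&&", b"for(;;);", b"while(1);")


def starts_with_parser_breaker(body: bytes) -> bool:
    """Return True if the response begins with one of the listed prefixes."""
    return body.startswith(PARSER_BREAKERS)


print(starts_with_parser_breaker(b")]}'\n{\"a\": 1}"))
print(starts_with_parser_breaker(b"var x = 1;"))
```

Note that this is a blocklist-style check: a hit is a reason to block the 
response, but a miss is not positive evidence that the body is JS.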

IMHO, a full parse (in the network process, or triggered by the network 
process) is crazy, and I'd really like to have something more lightweight.

Which leads me to the proposal to use only the scanner to look for a few 
tokens, and ideally for TC39 to adopt some sort of SmellsLikeJavaScript 
abstract operation that other standards could point to.

>
> Note that CSV is sadly valid JS, so that won't be blocked at all.
>
> On Wed, Jun 1, 2022 at 6:45 PM 'Łukasz Anforowicz' via v8-dev <
> [email protected]> wrote:
>
>>
>>
>> On Wed, Jun 1, 2022 at 8:34 AM Leszek Swirski <[email protected]> 
>> wrote:
>>
>>> On Wed, Jun 1, 2022 at 5:17 PM 'Łukasz Anforowicz' via v8-dev <
>>> [email protected]> wrote:
>>>
>>>> Benefit of full JS parse over a list of known non-JS prefixes: Stricter 
>>>> is-it-JS checking = more non-JS things get blocked = improved security.  
>>>> Still, there is a balance here - some heuristics (like the ones proposed 
>>>> by Daniel) are almost as secure as full JS parse (while being easier to 
>>>> implement and having less of a performance impact).
>>>>
>>>
>>> Makes sense, I'm just asking to make sure that we strike the right 
>>> balance between security improvements and complexity/performance issues; 
>>> even a JS tokenizer without a full parser is quite a complexity investment 
>>> (it needs e.g. a full regexp parser), plus the language grammar is 
>>> sufficiently broad that I expect exhaustively enumerating all possible 
>>> combinations of even just 3-5 tokens to be prohibitively large (setting 
>>> aside maintainability in the face of ever-updating standards).
>>>
>>> Do we have a measure of how much non-JS coverage the current heuristics 
>>> give, on real-world examples of JSON files? Or perhaps, a measure of how 
>>> many different prefixes there are that we could blocklist? Do we know at 
>>> what point the improved security has diminishing returns?
>>>
>>
>> Examples of response bodies that we would want to block, but that 
>> wouldn't get blocked without full JS parsing/verification (assume that the 
>> responses below are served as text/html or application/octet-stream):
>>
>>    - PDF
>>    - ProtoBuf
>>    - Microsoft Word
>>    - CSV files
>>    
>>
>>> - Leszek
>>>
>>> -- 
>>> -- 
>>> v8-dev mailing list
>>> [email protected]
>>> http://groups.google.com/group/v8-dev
>>> --- 
>>> You received this message because you are subscribed to a topic in the 
>>> Google Groups "v8-dev" group.
>>> To unsubscribe from this topic, visit 
>>> https://groups.google.com/d/topic/v8-dev/NGGCw9OjatI/unsubscribe.
>>> To unsubscribe from this group and all its topics, send an email to 
>>> [email protected].
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/v8-dev/CAGRskv9UUNJ9sjW0FvuHyCN90j%3DfbafSOgGVBG19qRe19_%2BO5w%40mail.gmail.com
>>> .
>>>
>>
>>
>> -- 
>> Thanks,
>>
>> Lukasz
>>
>

-- 
-- 
v8-dev mailing list
[email protected]
http://groups.google.com/group/v8-dev
--- 
You received this message because you are subscribed to the Google Groups 
"v8-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/v8-dev/3ab87558-c9ea-484c-b42a-459380e8ad25n%40googlegroups.com.
