On 2017-03-03 01:02, James Roper wrote:

        How about you misunderstanding the fact that a hash can only
        ever guarantee that two resources are different. A hash can
        never guarantee that two resources are the same. A hash does
        imply a high probability that they are the same, but can never
        guarantee it; such is the nature of a hash. A carefully
        tailored jquery.js that matches the hash of the "original
        jquery.js" could be crafted and contain a hidden payload. Now
        the browser suddenly injects this script into all websites the
        user visits that use that particular version of jquery.js,
        which I'd call an extremely serious security hole. You can't
        rely on length either, as that could also be padded to match.
        Not to mention that this also crosses the CORS threshold (the
        first instance may come from a different domain than the
        current page, for example). Accidental (natural) collision
        probabilities for sha256/sha384/sha512 are very low, but
        intentional ones are higher than accidental ones.


    This is completely wrong. No one has *ever* produced an
    intentional collision in sha256 or greater.

Huh? When did I ever state that? I never said that sha256 or higher has been broken; do not put words/lies in my mouth, please. I find that highly offensive. I said "could"; just ask any cryptographer. It is highly improbable and theoretically possible, but wholly impractical to attempt (the current state of quantum computing has not shown any magic bullet yet).

I'm equally concerned with a natural collision; the probability is incredibly small, but it is not zero (and real files aren't even random data of random lengths, which the textbook estimates assume).
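For scale, here is a back-of-the-envelope birthday-bound estimate (a Python sketch of my own; the file count of 2^40 is an arbitrary assumption):

    # Birthday bound: with n random files, P(any sha256 collision) is
    # roughly at most n^2 / 2^257.
    n = 2**40                       # assume ~a trillion distinct files ever hashed
    p = n * (n - 1) / 2 / 2**256    # upper-bound approximation
    print(p)                        # ~5.2e-54: incredibly small, but not zero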

And as to my statement "a hash can only ever guarantee that two resources are different. A hash can not guarantee that two resources are the same", again, that is true. You can even test this yourself by using a small enough hash (CRC-4 or something equally simple) and editing a file: you'll quickly find two different files with the same hash, which is exactly my point.
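To make that concrete, a minimal Python sketch (my own illustration, not from any spec): truncating sha256 to its first byte stands in for a toy hash like CRC-4/CRC-8, and a brute-force search finds two different inputs with the same hash almost instantly:

    import hashlib

    def tiny_hash(data: bytes) -> int:
        # Toy stand-in for CRC-4/CRC-8: keep only the first byte of sha256.
        return hashlib.sha256(data).digest()[0]

    seen = {}
    i = 0
    while True:
        data = b"file-%d" % i
        h = tiny_hash(data)
        if h in seen:
            print(seen[h].decode(), "and", data.decode(), "both hash to", h)
            break
        seen[h] = data
        i += 1

With only 256 possible outputs a collision is guaranteed within 257 attempts; the same logic applies to sha256, just with astronomically more attempts needed.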

You know how these types of hashes work, right? They are NOT unique; if you want something unique, that is called a "perfect hash", which is not something you want to use for cryptography. There are only 2^256 possible sha256 digests but infinitely many possible inputs, so collisions must exist. If a hash like sha256 were unique, it would be a compression miracle, as you could then just "decompress" the hash back into the original data.

Only if the data you hash is the same size as the hash (or smaller) can you perfectly re-create the data that was hashed. Which is what I proposed with my UUID suggestion. Do note that I'm talking about Version 1 UUIDs, not the random Version 4 ones, which are not unique.
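To make that version distinction concrete, a minimal Python sketch (my own illustration): Version 1 UUIDs are built from a timestamp plus the host's node/MAC address, while Version 4 ones are just random bits whose uniqueness is only probabilistic:

    import uuid

    u1 = uuid.uuid1()  # 60-bit timestamp + clock sequence + node (MAC address)
    u4 = uuid.uuid4()  # 122 random bits; uniqueness is only probabilistic

    print(u1, "-> time:", u1.time, "node:", hex(u1.node))
    print(u4)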

    In case you missed the headlines, last week Google announced it
    created a sha1 collision. That is the first, and only, known sha1
    collision ever created. This means sha1 is broken, and must not
    be used.

    Now it's unlikely (as in, it's not likely to happen in the
    history of a billion universes), but it is possible that at some
    point in the history of sha256 a collision was accidentally
    created. This probability is non-zero, which is greater than the
    impossibility of intentionally creating a collision, hence it is
    more likely that we will get an accidental collision than an
    intentional one.

Sha1 still has its uses. I haven't checked recently, but sha1, just like md5, is still OK to use with HMAC. It's also odd that you say sha1 should not be used at all; there is nothing wrong with using it as a file hash/checksum. With today's file counts and file sizes, CRC32 is not that useful (unless you divide the file into chunks and provide an array of CRC32s instead).
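As an illustration, HMAC with Python's standard library (my own sketch; the key and message are made up). HMAC's security rests on the secret key rather than on the collision resistance of the underlying hash, which is why HMAC-SHA1 and even HMAC-MD5 are not broken by these collision attacks:

    import hashlib
    import hmac

    key = b"shared-secret"     # hypothetical key
    msg = b"GET /jquery.js"    # hypothetical message
    tag = hmac.new(key, msg, hashlib.sha1).hexdigest()

    # Verify with a constant-time comparison.
    expected = hmac.new(key, msg, hashlib.sha1).hexdigest()
    print(hmac.compare_digest(tag, expected))  # True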

A hash is not the right way to do what you want; a UUID plus one (or multiple) trusted shared cache(s) is. The issue with using a hash is that at some point sha256 could become deprecated. Does the browser start ignoring it then? Should it behave as if the javascript file had no hash, or as if the file is potentially dangerous now?

Also take note that a UUID can be made into a valid URI (a urn:uuid: URN), but I suggested adding an attribute, as that would make older browser versions "forward compatible": the src URI still works normally, something like <script src="jquery.js" uuid="..."> (the attribute name is purely illustrative).


And to try not to run your idea entirely into the ground: it's not detailed enough. By that I mean you would need a way for the web designer to inform the browser that they do not want the scripts hosted on their site replaced by ones from another site. Requiring an opt-out is a pain in the ass, and where security is concerned one should never have to "opt out to get more secure"; one should be more secure by default. Which means you would need to add another attribute, or modify the integrity one, to allow cache sharing.

Now, I myself would never do that. Even if the hash matches, I'd never feel comfortable running a script originating from some other site in the page I'm delivering to my visitor. I would not actually want the browser to even cache my script and provide it to other sites' pages.

I might, however, feel comfortable adding a UUID and letting the browser fetch that script from its local cache or from a trusted cloud cache.

If you are going to use the integrity attribute for authentication then you also need to add a method of revocation, so that if, for example, the hash in use is deemed weak/compromised (due to, say, a so-far-undiscovered design flaw), browsers that are up to date can treat those hashes as unsafe. Older browsers will be clueless: all of a sudden some porn site includes a manipulated banking.js, an older browser with a stale cache encounters it and replaces its cached copy, and the next time the user goes to their bank the browser will happily run the trojan script instead. The end result is that banks etc. will not use the integrity attribute, or they will serve a differently versioned script on each visit/page load, which pretty much nukes caching in general. Remember, you did not specify an opt-in/opt-out for the shared integrity-based caching.

You might say that this is all theoretical, but you yourself proclaimed sha1 is no longer safe. Imagine if the most popular version of jquery became a trojan; we're talking tens of thousands of very high-profile sites as possible victims of cache poisoning.


Now, I'm not saying the integrity attribute is useless; for CDNs it's pretty nice. It ensures that when your site uses, say, awesomescript12.js, it really is awesomescript12.js and not a misnamed awesomescript10.js, or worse, notsoawesomescript4.js. But at that point you already trust the CDN (why else would you use them, right?). Another thing the integrity hash is great for is reducing the chance of a damaged script being loaded (sha512 has way more bits than CRC32, for example). And if I were to let a webpage fetch a script from a CDN, I would probably use the integrity attribute, but that is because I trust that CDN. If a browser just caches the first copy of whatever it encounters and then uses that for all subsequent requests for that script, then I want no part of it; that is a security boundary I'm not willing to cross, hash or no hash. So an opt-in would be essential here.
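For reference, producing the attribute's value is trivial; a Python sketch (the file name is hypothetical) that emits the "sha384-..." token the integrity attribute expects:

    import base64
    import hashlib

    with open("awesomescript12.js", "rb") as f:  # hypothetical file
        digest = hashlib.sha384(f.read()).digest()

    # SRI format: hash algorithm name, a dash, then the base64 digest.
    print("sha384-" + base64.b64encode(digest).decode())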

Now, many sites have their own CDN, and I assume those are your focus, but many use global ones (sometimes provided directly or indirectly with the blessing of a script's developers). I don't see this as a major caching issue. The main issue is multiple versions of a script. Many scripts are not always that backward compatible; I have seen cases with 3-4 versions of the same script on the same site. A shared browser cache may help with that if those are the unedited official jquery scripts, but usually they are not: they may have been run through a minifier, or minified with different settings than the official build.

This is why I stress that a UUID-based idea is better on the whole, as the focus would be on versions/APIs/interoperability instead. I.e. v1.1 and v1.2 have the exact same calls, just some bug fixes? Then they can both be given the same UUID, and the CDN or trusted cache will provide v1.2 every time.
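A toy Python sketch of the kind of lookup I mean (every name and UUID here is made up for illustration): the cache keys on an API-stable UUID shared by compatible releases and always hands back the newest build it holds:

    # Hypothetical trusted-cache mapping: v1.1 and v1.2 are
    # API-compatible, so they share one UUID, and the cache always
    # serves the newest compatible build.
    TRUSTED_CACHE = {
        "b8f2c9d4-0000-1000-8000-000000000001": "jquery-1.2.min.js",
    }

    def resolve(script_uuid: str, fallback_src: str) -> str:
        # Unknown UUID: fall back to the page's own src URI.
        return TRUSTED_CACHE.get(script_uuid, fallback_src)

    # A page shipping v1.1 and a page shipping v1.2 both get v1.2.
    print(resolve("b8f2c9d4-0000-1000-8000-000000000001", "jquery-1.1.js"))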



PS! Not trying to sound like an ass here, but could you trim the email next time? While I enjoy hearing my own voice/reading my own text as much as the next person, there is no need to quote the whole thing. Also, why did you CC me a full quote of my email without writing anything yourself; did you hit reply by accident, or is there a bug in the email system somewhere? Which brings me to a nitpick of mine: if you reply to the list, there is no need to also CC me. If I'm posting to the list then I'm also reading the list, and I'd rather not have multiple copies of an email in my inbox. Hit the "Reply to list" button instead of "Reply to all" next time (these options depend on your email client).


--
Roger Hågensen,
Freelancer, Norway.
