Re: [whatwg] checksum attribute in a href tag
On Thu, 25 Oct 2012, Mikko Rantalainen wrote:
> Ian Hickson, 2012-10-24 19:28 (Europe/Helsinki):
>> Anyway, if you have memory corruption there's nothing to say the corruption won't occur _after_ you've done the checksum verification. In particular, there's nothing to say it'll happen between receiving and decoding the packets over TLS and comparing the packets to the checksum, and not either before (in which case TLS will catch it as part of its own integrity checking) or after (in which case the checksum won't help). That's a pretty narrow window. My guess would be that people will screw up their hidden metadata (e.g. updating the .img file without updating the HTML file) more often, much more often, than the checksum will catch an error.
>
> That might be true, I really don't know. I'd guess that this attribute would be used pretty seldom but in those cases the correctness of the transferred file would be important enough to accept the possibility of a false positive.

If it would be used pretty seldom, it's probably not the highest priority in terms of things we should add. :-)

> In addition, if the correctness of the file is important to you and the downloaded file does not match the hidden metadata, you definitely should contact the server administrator in any case.

In practice, users blame the browser, not the server. Especially when using another (older) browser (that doesn't check the checksum) results in it working fine.

> The server administrator can then either fix the checksum attribute or the actual file.

Or tell you to use another browser, because they don't have any idea what this checksum thing is, since they had paid someone to write the site and only later replaced the file being downloaded and aren't HTML experts...

> As a result, I wouldn't expect the false positive error to be the permanent state for important files. And the extra work required for the attribute should prevent its usage for non-important files. I'd trust that casual content authors are too lazy to bother with it.

The problem isn't casual content authors, it's authors who copy other people's pages and don't test with supporting browsers, or authors who got help from their geek Web designer daughter at Christmas and then later changed the file and broke it because they didn't understand it, or the case I referenced above where a company hires a Web designer to do the initial work and then later updates it themselves.

> Furthermore, the checksum attribute could be valuable against both memory corruption and network transfer corruption over unsecured (non-TLS) links. Of course, it wouldn't provide any safety against malicious uses.

It doesn't really help against memory corruption, as discussed above. If network corruption is a concern, then TLS really is the way to go (I would expect network corruption to be more of an issue with an active attacker than passive hardware failure, and in the case of an attacker, they can just update the checksum too).

In conclusion, it seems this proposal only solves a small problem, and doesn't solve it in a particularly successful manner. It's worth considering again if we have no more important things to address, but for now I haven't added it.

-- 
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
Re: [whatwg] checksum attribute in a href tag
Ian Hickson, 2012-10-24 19:28 (Europe/Helsinki):
> Anyway, if you have memory corruption there's nothing to say the corruption won't occur _after_ you've done the checksum verification. In particular, there's nothing to say it'll happen between receiving and decoding the packets over TLS and comparing the packets to the checksum, and not either before (in which case TLS will catch it as part of its own integrity checking) or after (in which case the checksum won't help). That's a pretty narrow window. My guess would be that people will screw up their hidden metadata (e.g. updating the .img file without updating the HTML file) more often, much more often, than the checksum will catch an error.

That might be true, I really don't know. I'd guess that this attribute would be used pretty seldom but in those cases the correctness of the transferred file would be important enough to accept the possibility of a false positive.

In addition, if the correctness of the file is important to you and the downloaded file does not match the hidden metadata, you definitely should contact the server administrator in any case. The server administrator can then either fix the checksum attribute or the actual file. As a result, I wouldn't expect the false positive error to be the permanent state for important files. And the extra work required for the attribute should prevent its usage for non-important files. I'd trust that casual content authors are too lazy to bother with it.

Furthermore, the checksum attribute could be valuable against both memory corruption and network transfer corruption over unsecured (non-TLS) links. Of course, it wouldn't provide any safety against malicious uses.

If @checksum is added to the spec, it should include a notice that the feature is only intended to provide a safety net for non-malicious corruption and that TLS should be used instead, or in addition, if malicious corruption is considered possible.

Example cases of such (non-malicious) checksum failure:
(1) remote server or intermediate network device hardware or software error
(2) local machine hardware or software error
(3) server administrator error (metadata error vs. actual file content error - it's not _always_ the metadata that's corrupted)

-- 
Mikko
Re: [whatwg] checksum attribute in a href tag
On 2012-10-19 14:01, Nils Dagsson Moskopp wrote:
> A. Rauschenbach rauschenb...@annuo.de wrote on Fri, 19 Oct 2012 13:50:04 +0200:
>> I'm sick of copying the checksum of important files by hand or QR-code to the download manager or console. To solve the problem I suggest a checksum attribute in the a href tag.
>
> It seems that problem is solved at the HTTP level with RFC 1864: http://tools.ietf.org/html/rfc1864

The latest spec defining Content-MD5 was RFC 2616. It will not be included in the revision of HTTP/1.1 because of broken interop for Range requests, and because of the weakness of MD5 (see http://trac.tools.ietf.org/wg/httpbis/trac/ticket/178 for context).

That being said, a new response header field that is well-defined with respect to partial responses and more flexible with respect to digest algorithms would be interesting.

...

Best regards, Julian
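For readers unfamiliar with the header under discussion: RFC 1864 defines the Content-MD5 value as the base64 encoding of the 128-bit MD5 digest of the body. The following is only a minimal sketch of that computation; the function name `content_md5` and the sample body are illustrative assumptions, not taken from any message in this thread. The last lines hint at why partial (Range) responses were troublesome: a digest over a slice of the body does not match the digest over the whole body, so both sides have to agree on which one the header describes.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Return the Content-MD5 header value for a body (base64 of the raw MD5 digest)."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


# Hypothetical body, used only for illustration.
full_body = b"0123456789" * 10

# Digest over the complete body versus digest over the first 50 bytes,
# as might be returned for a request with "Range: bytes=0-49".
whole = content_md5(full_body)
partial = content_md5(full_body[:50])

# The two values differ, so a receiver cannot validate a partial response
# against a digest that was computed over the full body.
digests_differ = whole != partial
```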
Re: [whatwg] checksum attribute in a href tag
Anne van Kesteren, 2012-10-19 14:57 (Europe/Helsinki):
> On Fri, Oct 19, 2012 at 1:50 PM, A. Rauschenbach rauschenb...@annuo.de wrote:
>> I'm sick of copying the checksum of important files by hand or QR-code to the download manager or console. To solve the problem I suggest a checksum attribute in the a href tag.
>>
>> example:
>> <a href="http://example.com/important.file" checksum="MD5:32c3675211199b671fbca1304d819289;SHA1:6e1ddeede3979c953788a3499616af35ee5fd772">download</a>
>>
>> Another advantage is that your visitors (browser) can verify that the document (e.g. a pdf) you linked to is still the same.
>
> If you serve important files over HTTP without TLS I don't think a checksum is going to help anyone much.

Checksum can help even with encrypted connections. Example scenario:

User connects to https://download.manufacturer.com/ and clicks the link

<a href="phone-firmware-15.img" checksum="MD5:32c3675211199b671fbca1304d819289;SHA1:6e1ddeede3979c953788a3499616af35ee5fd772">Firmware version 15</a>

The download then starts and the file gets saved to the filesystem. However, the system has memory corruption and, despite the fact that the file got to the user agent intact, the file ends up corrupted on the filesystem. If the user agent had computed and verified the checksum after re-reading the file back from the local filesystem, it would have noticed the error.

You might think that memory corruption is rare but trust me, it happens often enough to be worried about. After it has bitten you once, you learn to be paranoid about that. I'm speaking from experience here - I once had a memory corruption that caused three bits (out of 8GB) to randomly fail and that caused filesystem data corruption. And I had already been running a memory tester (memtest86) for a day without errors after I had installed the memory so I assumed it would be fine. Search for "git corrupt" for more evidence from real world software developers and remember that software developers are usually using high quality hardware.

You don't want to fail with an important opaque file such as a firmware image. Hopefully the firmware image will contain an internal checksum but it wouldn't hurt if the problem were found before trying to flash the image.

-- 
Mikko
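The check described in the scenario above (re-read the saved file from disk, then compare it with the digests in the link) could be sketched roughly as follows. This is a sketch under stated assumptions, not how any browser behaves: the attribute was never specified, so the `LABEL:hex;LABEL:hex` syntax is taken only from the example in this thread, and the names `parse_checksum`, `verify_file` and `ALGORITHMS` are hypothetical.

```python
import hashlib

# Assumed mapping from the labels used in the thread's example to hashlib names.
ALGORITHMS = {"MD5": "md5", "SHA1": "sha1"}


def parse_checksum(value):
    """Split 'MD5:<hex>;SHA1:<hex>' into a dict such as {'MD5': '<hex>', 'SHA1': '<hex>'}."""
    result = {}
    for part in value.split(";"):
        if not part.strip():
            continue
        label, _, digest = part.partition(":")
        result[label.strip().upper()] = digest.strip().lower()
    return result


def verify_file(path, checksum_value):
    """Re-read the saved file from disk and check it against every listed digest."""
    expected = parse_checksum(checksum_value)
    unknown = [label for label in expected if label not in ALGORITHMS]
    if unknown:
        raise ValueError("unsupported digest label(s): " + ", ".join(unknown))
    hashers = {label: hashlib.new(ALGORITHMS[label]) for label in expected}
    with open(path, "rb") as f:
        # Read in chunks so large files (firmware images) need not fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    # An empty attribute verifies nothing, so it is treated as a failure.
    return bool(expected) and all(
        hashers[label].hexdigest() == expected[label] for label in expected
    )
```

Reading the file back from disk, rather than hashing the bytes as they arrive, is what lets this catch corruption introduced while writing. It cannot catch corruption that happens after the check, which is the objection raised elsewhere in the thread.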
Re: [whatwg] checksum attribute in a href tag
On 2012/10/24 15:11, Mikko Rantalainen wrote:
> Checksum can help even with encrypted connections.

I agree. I have checksum and GPG signature verification failures often enough on files I have downloaded via https that I always check them. Automation would be welcome.

Regards
-Mark
Re: [whatwg] checksum attribute in a href tag
On Wed, 24 Oct 2012, Mikko Rantalainen wrote:
> Checksum can help even with encrypted connections. Example scenario:
>
> User connects to https://download.manufacturer.com/ and clicks the link
>
> <a href="phone-firmware-15.img" checksum="MD5:32c3675211199b671fbca1304d819289;SHA1:6e1ddeede3979c953788a3499616af35ee5fd772">Firmware version 15</a>
>
> The download then starts and the file gets saved to the filesystem. However, the system has memory corruption and, despite the fact that the file got to the user agent intact, the file ends up corrupted on the filesystem. If the user agent had computed and verified the checksum after re-reading the file back from the local filesystem, it would have noticed the error.
>
> You might think that memory corruption is rare but trust me, it happens often enough to be worried about.

Memory corruption is indeed more common than people realise. But that's not the important question. The important question is, does memory corruption occur more often than mistakes in the checksum= value will occur? More often enough that when people get the message that there was a download error, they'll trust the message rather than assuming it's just yet another false positive?

Anyway, if you have memory corruption there's nothing to say the corruption won't occur _after_ you've done the checksum verification. In particular, there's nothing to say it'll happen between receiving and decoding the packets over TLS and comparing the packets to the checksum, and not either before (in which case TLS will catch it as part of its own integrity checking) or after (in which case the checksum won't help). That's a pretty narrow window. My guess would be that people will screw up their hidden metadata (e.g. updating the .img file without updating the HTML file) more often, much more often, than the checksum will catch an error.

-- 
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
Re: [whatwg] checksum attribute in a href tag
On Fri, Oct 19, 2012 at 1:50 PM, A. Rauschenbach rauschenb...@annuo.de wrote:
> I'm sick of copying the checksum of important files by hand or QR-code to the download manager or console. To solve the problem I suggest a checksum attribute in the a href tag.
>
> example:
> <a href="http://example.com/important.file" checksum="MD5:32c3675211199b671fbca1304d819289;SHA1:6e1ddeede3979c953788a3499616af35ee5fd772">download</a>
>
> Another advantage is that your visitors (browser) can verify that the document (e.g. a pdf) you linked to is still the same.

If you serve important files over HTTP without TLS I don't think a checksum is going to help anyone much.

We did have something similar to this, but it got dropped: http://html5.org/r/7434

-- 
http://annevankesteren.nl/
Re: [whatwg] checksum attribute in a href tag
A. Rauschenbach rauschenb...@annuo.de wrote on Fri, 19 Oct 2012 13:50:04 +0200:
> I'm sick of copying the checksum of important files by hand or QR-code to the download manager or console. To solve the problem I suggest a checksum attribute in the a href tag.

It seems that problem is solved at the HTTP level with RFC 1864: http://tools.ietf.org/html/rfc1864

> Another advantage is that your visitors (browser) can verify that the document (e.g. a pdf) you linked to is still the same.

Cool URIs should not change.

-- 
Nils Dagsson Moskopp // erlehmann
http://dieweltistgarnichtso.net
Re: [whatwg] checksum attribute in a href tag
On 2012-10-19 14:01, Nils Dagsson Moskopp wrote:
> It seems that problem is solved at the HTTP level with RFC 1864: http://tools.ietf.org/html/rfc1864

If I understand it correctly, this works fine if you serve the file from your own server, but not if you link to a foreign server.

>> Another advantage is that your visitors (browser) can verify that the document (e.g. a pdf) you linked to is still the same.
>
> Cool URIs should not change.

A changing URI isn't the problem. In that case you get a 404. The problem is when the URI stays the same but the content behind it changes (especially when the content is not on your server).
Re: [whatwg] checksum attribute in a href tag
> If you serve important files over HTTP without TLS I don't think a checksum is going to help anyone much.

By "important" I meant that the file has to work right here and right now, not anything to do with security.
Re: [whatwg] checksum attribute in a href tag
On Fri, 19 Oct 2012, A. Rauschenbach wrote:
> I'm sick of copying the checksum of important files by hand or QR-code to the download manager or console. To solve the problem I suggest a checksum attribute in the a href tag.
>
> example:
> <a href="http://example.com/important.file" checksum="MD5:32c3675211199b671fbca1304d819289;SHA1:6e1ddeede3979c953788a3499616af35ee5fd772">download</a>
>
> Another advantage is that your visitors (browser) can verify that the document (e.g. a pdf) you linked to is still the same.

What is the attack scenario you are trying to avoid? Without a discussion of what problem you're trying to solve, it's unclear how to evaluate the proposal.

The idea of a hash= or checksum= attribute on a href has come up before -- about once a year, as far as I can tell! -- but it's always been found lacking in one way or another. e.g.:

http://lists.w3.org/Archives/Public/public-whatwg-archive/2006Nov/thread.html#msg233
http://lists.w3.org/Archives/Public/public-whatwg-archive/2007Jul/0049.html
http://lists.w3.org/Archives/Public/public-whatwg-archive/2008Dec/0376.html

(in the third one, search for "fingerprint".)

-- 
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
Re: [whatwg] checksum attribute in a href tag
On 2012-10-19 18:49, Ian Hickson wrote:
> What is the attack scenario you are trying to avoid? Without a discussion of what problem you're trying to solve, it's unclear how to evaluate the proposal. The idea of a hash= or checksum= attribute on a href has come up before -- about once a year, as far as I can tell! -- but it's always been found lacking in one way or another.

I don't want to avoid any attack scenario! I want trusted information.

If I write an article and link to other documents, I want a solution that lets the visitor be sure that the document he opens is the document I originally linked to. (And if it's not, he gets informed, so he knows that the information may differ from what the article talks about.)

The second point is that verifying whether a file was downloaded correctly is a task for a computer, not a human. A standard for how to supply the verification information enables the browser/plugin vendors to do this task.
Re: [whatwg] checksum attribute in a href tag
On Fri, Oct 19, 2012 at 11:46 AM, A. Rauschenbach rauschenb...@annuo.de wrote:
> On 2012-10-19 18:49, Ian Hickson wrote:
>> What is the attack scenario you are trying to avoid? Without a discussion of what problem you're trying to solve, it's unclear how to evaluate the proposal. The idea of a hash= or checksum= attribute on a href has come up before -- about once a year, as far as I can tell! -- but it's always been found lacking in one way or another.
>
> I don't want to avoid any attack scenario! I want trusted information. If I write an article and link to other documents, I want a solution that lets the visitor be sure that the document he opens is the document I originally linked to. (And if it's not, he gets informed, so he knows that the information may differ from what the article talks about.)

That's also an attack scenario. ^_^

I doubt it would be very useful to use this for confirming that arbitrary destination pages are the same. Those can change in minor, unimportant ways all the time; a lot of pages include some form of dynamic content that means they'll almost *never* be exactly the same from pageload to pageload. It seems highly likely that trying to use a checksum for this scenario would simply result in the browser over-warning people, thus making the warning useless.

Using it specifically to defend against attack scenarios in *downloads*, on the other hand, is more likely to be useful. Downloads don't change nearly as much as pages do, so a change is more likely to be a result of something you don't want, rather than simply something incidental.

However, check out the threads that Hixie referenced. The upsides and downsides of something like this have been discussed quite a bit already.

~TJ
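The over-warning concern above follows from how digests behave: any byte-level difference produces a different value, however unimportant the change is to a reader. A small sketch, assuming two made-up versions of a page that differ only in a generated date (the page contents and the helper name `sha1_hex` are illustrative assumptions):

```python
import hashlib


def sha1_hex(document: bytes) -> str:
    """Return the hexadecimal SHA-1 digest of a document."""
    return hashlib.sha1(document).hexdigest()


# Two hypothetical renderings of the same article; only the footer date differs.
page_first_load = b"<p>Article text.</p><footer>Generated 2012-10-19</footer>"
page_second_load = b"<p>Article text.</p><footer>Generated 2012-10-20</footer>"

# Identical bytes give identical digests.
unchanged = sha1_hex(page_first_load) == sha1_hex(page_first_load)

# A one-character change in incidental content gives a different digest,
# so a checksum comparison would report the page as changed.
flagged = sha1_hex(page_first_load) != sha1_hex(page_second_load)
```

A static download such as a firmware image has no such incidental variation, which is why the same comparison is more meaningful there.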
Re: [whatwg] checksum attribute in a href tag
On Fri, 19 Oct 2012, A. Rauschenbach wrote:
> If I write an article and link to other documents, I want a solution that lets the visitor be sure that the document he opens is the document I originally linked to. (And if it's not, he gets informed, so he knows that the information may differ from what the article talks about.)

I don't think this is something that would be very practical. As Tab says, pages change a _lot_. You'd just always be getting a warning that the page had changed, even if the important content had not.

> The second point is that verifying whether a file was downloaded correctly is a task for a computer, not a human. A standard for how to supply the verification information enables the browser/plugin vendors to do this task.

If the file is downloaded over TLS, then it's already verified. Pretty much any attack scenario in which the file can be corrupted (man-in-the-middle, server-side corruption, client-side corruption, etc) can attack the file just as easily as the hash, so there's not really any gain from checking a hash. (This applies equally well to manual checking.) Providing such a feature would, in most cases, just give users a false sense of security.

-- 
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
Re: [whatwg] checksum attribute in a href tag
A. Rauschenbach rauschenb...@annuo.de wrote on Fri, 19 Oct 2012 20:46:24 +0200:
> […] If I write an article and link to other documents, I want a solution that lets the visitor be sure that the document he opens is the document I originally linked to.

Mirror the information.

-- 
Nils Dagsson Moskopp // erlehmann
http://dieweltistgarnichtso.net