Re: [whatwg] checksum attribute in a href tag

2012-12-28 Thread Ian Hickson
On Thu, 25 Oct 2012, Mikko Rantalainen wrote:
 Ian Hickson, 2012-10-24 19:28 (Europe/Helsinki):
  Anyway, if you have memory corruption there's nothing to say the 
  corruption won't occur _after_ you've done the checksum verification. 
  In particular, there's nothing to say it'll happen between receiving 
  and decoding the packets over TLS and comparing the packets to the 
  checksum, and not either before (in which case TLS will catch it as 
  part of its own integrity checking) or after (in which case the 
  checksum won't help). That's a pretty narrow window.
  
  My guess would be that people will screw up their hidden metadata 
  (e.g. updating the .img file without updating the HTML file) more 
  often, much more often, than the checksum will catch an error.
 
 That might be true, I really don't know. I'd guess that this attribute 
 would be used pretty seldom but in those cases the correctness of 
 transferred file would be important enough to take the possibility of 
 false positive.

If it would be used pretty seldom, it's probably not the highest priority 
in terms of things we should add. :-)


 In addition, if the correctness of the file is important to you and the 
 downloaded file does not match the hidden metadata, you definitely 
 should contact the server administrator in any case.

In practice, users blame the browser, not the server. Especially when 
using another (older) browser (that doesn't check the checksum) results in 
it working fine.


 The server administrator can then either fix the checksum attribute or 
 the actual file.

Or tell you to use another browser, because they don't have any idea what 
this checksum thing is, since they had paid someone to write the site 
and only later replaced the file being downloaded and aren't HTML experts...


 As a result, I wouldn't expect the false positive error to be the 
 permanent state for important files. And the extra work required for the 
 attribute should prevent its usage for non-important files. I'd trust 
 that casual content authors are too lazy to bother with it.

The problem isn't casual content authors, it's authors who copy other 
people's pages and don't test with supporting browsers, or authors who got 
help from their geek Web designer daughter at Christmas and then later 
changed the file and broke it because they didn't understand it, or the 
case I referenced above where a company hires a Web designer to do the 
initial work and then later goes and updates it themselves.


 Furthermore, the checksum attribute could be valuable against both 
 memory corruption and network transfer corruption over unsecured 
 (non-TLS) links. Of course, it wouldn't provide any safety against 
 malicious uses.

It doesn't really help against memory corruption, as discussed above. 
If network corruption is a concern, then TLS really is the way to go (I 
would expect network corruption to be more of an issue with an active 
attacker than passive hardware failure, and in the case of an attacker, 
they can just update the checksum too).


In conclusion, it seems this proposal only solves a small problem, and 
doesn't solve it in a particularly successful manner. It's worth 
considering again if we have no more important things to address, but for 
now I haven't added it.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] checksum attribute in a href tag

2012-10-25 Thread Mikko Rantalainen
Ian Hickson, 2012-10-24 19:28 (Europe/Helsinki):
 Anyway, if you have memory corruption there's nothing to say the 
 corruption won't occur _after_ you've done the checksum verification. In 
 particular, there's nothing to say it'll happen between receiving and 
 decoding the packets over TLS and comparing the packets to the checksum, 
 and not either before (in which case TLS will catch it as part of its own 
 integrity checking) or after (in which case the checksum won't help). 
 That's a pretty narrow window.
 
 My guess would be that people will screw up their hidden metadata (e.g. 
 updating the .img file without updating the HTML file) more often, much 
 more often, than the checksum will catch an error.

That might be true, I really don't know. I'd guess that this attribute
would be used pretty seldom, but in those cases the correctness of
the transferred file would be important enough to accept the possibility
of a false positive.

In addition, if the correctness of the file is important to you and the
downloaded file does not match the hidden metadata, you definitely
should contact the server administrator in any case. The server
administrator can then either fix the checksum attribute or the actual
file. As a result, I wouldn't expect the false positive error to be the
permanent state for important files. And the extra work required for the
attribute should prevent its usage for non-important files. I'd trust
that casual content authors are too lazy to bother with it.

Furthermore, the checksum attribute could be valuable against both
memory corruption and network transfer corruption over unsecured
(non-TLS) links. Of course, it wouldn't provide any safety against
malicious uses.

If @checksum is added to the spec, it should include a notice that the
feature is only intended to provide a safety net for non-malicious
corruption and TLS should be used instead or in addition if malicious
corruption is considered possible.

Example cases of such (non-malicious) checksum failure:

(1) remote server or intermediate network device hardware or software error
(2) local machine hardware or software error
(3) server administrator error (metadata error vs. actual file content
error - it's not _always_ the metadata that's corrupted)

-- 
Mikko





Re: [whatwg] checksum attribute in a href tag

2012-10-25 Thread Julian Reschke

On 2012-10-19 14:01, Nils Dagsson Moskopp wrote:

 A. Rauschenbach rauschenb...@annuo.de wrote on Fri, 19 Oct 2012
 13:50:04 +0200:

  I'm sick of copying the checksum of important files by hand or QR-code
  to the download manager or console.

  To solve the problem I suggest a checksum attribute in the a href
  tag.

 It seems that problem is solved at the HTTP level with RFC 1864:
 http://tools.ietf.org/html/rfc1864


The latest spec defining Content-MD5 was RFC 2616. It will not be 
included in the revision of HTTP/1.1 because of broken interop for Range 
requests, and because of the weakness of MD5 (see 
http://trac.tools.ietf.org/wg/httpbis/trac/ticket/178 for context).
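
For context, the Content-MD5 value defined in RFC 1864 is the base64 encoding of the raw MD5 digest of the body. The following is only a rough sketch of why Range requests are awkward: the digest of a byte range and the digest of the whole body differ, so a client has to know which of the two the header describes. The helper name `content_md5` and the sample bodies are made up for illustration.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Return a Content-MD5 style value: base64 of the raw MD5 digest."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


# Illustrative data only: a 100-byte body and its first 50 bytes,
# i.e. what a response to 'Range: bytes=0-49' would carry.
full_body = b"0123456789" * 10
partial_body = full_body[0:50]

full_value = content_md5(full_body)
partial_value = content_md5(partial_body)

# The two values differ, so a header sent with a partial response is
# ambiguous unless it is specified which body it covers.
values_differ = full_value != partial_value
```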


That being said, a new response header field that is well-defined with 
respect to partial responses and more flexible with respect to digest 
algorithms would be interesting.


 ...

Best regards, Julian


Re: [whatwg] checksum attribute in a href tag

2012-10-24 Thread Mikko Rantalainen
Anne van Kesteren, 2012-10-19 14:57 (Europe/Helsinki):
 On Fri, Oct 19, 2012 at 1:50 PM, A. Rauschenbach rauschenb...@annuo.de 
 wrote:
 I'm sick of copying the checksum of important files by hand or QR-code to the
 download manager or console.

 To solve the problem I suggest a checksum attribute in the a href tag.

 example: <a href="http://example.com/important.file"
 checksum="MD5:32c3675211199b671fbca1304d819289;SHA1:6e1ddeede3979c953788a3499616af35ee5fd772">download</a>

 Another advantage is that your visitors (browser) can verify that the
 document (e.g. a pdf) you linked to is still the same.
 
 If you serve important files over HTTP without TLS I don't think a
 checksum is going to help anyone much.

Checksum can help even with encrypted connections.

Example scenario:

User connects to https://download.manufacturer.com/ and clicks link

<a href="phone-firmware-15.img"
checksum="MD5:32c3675211199b671fbca1304d819289;SHA1:6e1ddeede3979c953788a3499616af35ee5fd772">Firmware
version 15</a>

The download then starts and the file gets saved to the filesystem.
However, the system has memory corruption, and despite the fact that the
file got to the user agent intact, it ends up corrupted on the
filesystem.

However, if user agent had computed and verified the checksum after
re-reading the file back from the local filesystem, it would have
noticed the error.
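
The check described above could be sketched roughly as follows. This is a minimal sketch, assuming the `ALGO:hexdigest;ALGO:hexdigest` syntax from the example link in this thread; the names `parse_checksum_attr` and `verify_file` are hypothetical, and no user agent is known to implement such an attribute.

```python
import hashlib

# Map the algorithm labels used in the example link to hashlib names.
ALGORITHMS = {"MD5": "md5", "SHA1": "sha1"}


def parse_checksum_attr(value):
    """Split 'MD5:<hex>;SHA1:<hex>' into {'MD5': '<hex>', 'SHA1': '<hex>'}."""
    result = {}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue
        algo, _, digest = part.partition(":")
        result[algo.strip().upper()] = digest.strip().lower()
    return result


def verify_file(path, checksum_attr, chunk_size=65536):
    """Re-read the saved file from disk and compare every supported digest."""
    expected = parse_checksum_attr(checksum_attr)
    known = {a: d for a, d in expected.items() if a in ALGORITHMS}
    if not known:
        raise ValueError("no supported algorithm in checksum attribute")
    hashers = {a: hashlib.new(ALGORITHMS[a]) for a in known}
    with open(path, "rb") as f:
        # Hash in chunks so a large firmware image need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for h in hashers.values():
                h.update(chunk)
    return all(hashers[a].hexdigest() == known[a] for a in known)
```

A caller would run `verify_file(saved_path, attr_value)` after the download has been written out, and treat `False` as a reason to warn the user rather than as proof of which side is wrong.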

You might think that memory corruption is rare but trust me, it happens
often enough to be worried about. After it has bitten you once, you
learn to be paranoid about that. I'm speaking from experience here - I
once had a memory corruption that caused three bits (out of 8GB) to
randomly fail and that caused filesystem data corruption. And I had
already been running a memory tester (memtest86) for a day without
errors after I had installed the memory so I assumed it would be fine.
Search for "git corrupt" for more evidence from real-world software
developers, and remember that software developers are usually using
high-quality hardware.

You don't want to fail with an important opaque file such as a firmware
image. Hopefully the firmware image will contain an internal checksum, but
it wouldn't hurt if the problem were found before trying to flash the image.

-- 
Mikko





Re: [whatwg] checksum attribute in a href tag

2012-10-24 Thread Mark Callow
On 2012/10/24 15:11, Mikko Rantalainen wrote:
 Checksum can help even with encrypted connections.
I agree. I have checksum and GPG signature verification failures often
enough on files I have downloaded via https that I always check them.
Automation would be welcome.

Regards

-Mark


-- 
NOTE: This electronic mail message may contain confidential and
privileged information from HI Corporation. If you are not the intended
recipient, any disclosure, photocopying, distribution or use of the
contents of the received information is prohibited. If you have received
this e-mail in error, please notify the sender immediately and
permanently delete this message and all related copies.



Re: [whatwg] checksum attribute in a href tag

2012-10-24 Thread Ian Hickson
On Wed, 24 Oct 2012, Mikko Rantalainen wrote:
 
 Checksum can help even with encrypted connections.
 
 Example scenario:
 
 User connects to https://download.manufacturer.com/ and clicks link
 
 <a href="phone-firmware-15.img"
 checksum="MD5:32c3675211199b671fbca1304d819289;SHA1:6e1ddeede3979c953788a3499616af35ee5fd772">Firmware
 version 15</a>
 
 The download then starts and file gets saved to the filesystem. However, 
 the system has memory corruption and despite the fact that the file got 
 to the user agent intact, the file will end up as corrupted to the 
 filesystem.
 
 However, if user agent had computed and verified the checksum after 
 re-reading the file back from the local filesystem, it would have 
 noticed the error.

 You might think that memory corruption is rare but trust me, it happens 
 often enough to be worried about.

Memory corruption is indeed more common than people realise. But that's 
not the important question. The important question is, does memory 
corruption occur more often than mistakes in the checksum= value will 
occur? More often enough that when people get the message that there was a 
download error, they'll trust the message rather than assuming it's just 
yet another false positive?

Anyway, if you have memory corruption there's nothing to say the 
corruption won't occur _after_ you've done the checksum verification. In 
particular, there's nothing to say it'll happen between receiving and 
decoding the packets over TLS and comparing the packets to the checksum, 
and not either before (in which case TLS will catch it as part of its own 
integrity checking) or after (in which case the checksum won't help). 
That's a pretty narrow window.

My guess would be that people will screw up their hidden metadata (e.g. 
updating the .img file without updating the HTML file) more often, much 
more often, than the checksum will catch an error.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] checksum attribute in a href tag

2012-10-19 Thread Anne van Kesteren
On Fri, Oct 19, 2012 at 1:50 PM, A. Rauschenbach rauschenb...@annuo.de wrote:
 I'm sick of copying the checksum of important files by hand or QR-code to the
 download manager or console.

 To solve the problem I suggest a checksum attribute in the a href tag.

 example: <a href="http://example.com/important.file"
 checksum="MD5:32c3675211199b671fbca1304d819289;SHA1:6e1ddeede3979c953788a3499616af35ee5fd772">download</a>

 Another advantage is that your visitors (browser) can verify that the
 document (e.g. a pdf) you linked to is still the same.

If you serve important files over HTTP without TLS I don't think a
checksum is going to help anyone much.

We did have something similar to this, but it got dropped:
http://html5.org/r/7434


-- 
http://annevankesteren.nl/


Re: [whatwg] checksum attribute in a href tag

2012-10-19 Thread Nils Dagsson Moskopp
A. Rauschenbach rauschenb...@annuo.de wrote on Fri, 19 Oct 2012
13:50:04 +0200:

 I'm sick of copying the checksum of important files by hand or QR-code 
 to the download manager or console.
 
 To solve the problem I suggest a checksum attribute in the a href 
 tag.

It seems that problem is solved at the HTTP level with RFC 1864:
http://tools.ietf.org/html/rfc1864

 Another advantage is that your visitors (browser) can verify that the 
 document (e.g. a pdf) you linked to is still the same.

Cool URIs should not change.

-- 
Nils Dagsson Moskopp // erlehmann
http://dieweltistgarnichtso.net


Re: [whatwg] checksum attribute in a href tag

2012-10-19 Thread A. Rauschenbach

On 2012-10-19 14:01, Nils Dagsson Moskopp wrote:

 It seems that problem is solved at the HTTP level with RFC 1864:
 http://tools.ietf.org/html/rfc1864

If I understand it correctly, this works fine if you serve the file from 
your own server, but not if you link to a foreign server.

  Another advantage is that your visitors (browser) can verify that the
  document (e.g. a pdf) you linked to is still the same.

 Cool URIs should not change.


A changing URI isn't the problem. In that case you get a 404. The 
problem is when the URI stays the same but the content behind it 
changes. (especially when the content is not on your server)




Re: [whatwg] checksum attribute in a href tag

2012-10-19 Thread A. Rauschenbach

 If you serve important files over HTTP without TLS I don't think a
 checksum is going to help anyone much.

By "important" I meant that the file has to work right here and right now, 
not any security issues.


Re: [whatwg] checksum attribute in a href tag

2012-10-19 Thread Ian Hickson
On Fri, 19 Oct 2012, A. Rauschenbach wrote:
 
 I'm sick of copying the checksum of important files by hand or QR-code to 
 the download manager or console.
 
 To solve the problem I suggest a checksum attribute in the a href tag.
 
 example: <a href="http://example.com/important.file"
 checksum="MD5:32c3675211199b671fbca1304d819289;SHA1:6e1ddeede3979c953788a3499616af35ee5fd772">download</a>
 
 Another advantage is that your visitors (browser) can verify that the 
 document (e.g. a pdf) you linked to is still the same.

What is the attack scenario you are trying to avoid?

Without a discussion of what problem you're trying to solve, it's unclear 
how to evaluate the proposal.

The idea of a hash= or checksum= attribute on a href has come up 
before -- about once a year, as far as I can tell! -- but it's always been 
found lacking in one way or another.

e.g.:
   http://lists.w3.org/Archives/Public/public-whatwg-archive/2006Nov/thread.html#msg233
   http://lists.w3.org/Archives/Public/public-whatwg-archive/2007Jul/0049.html
   http://lists.w3.org/Archives/Public/public-whatwg-archive/2008Dec/0376.html

(in the third one, search for "fingerprint".)

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] checksum attribute in a href tag

2012-10-19 Thread A. Rauschenbach

On 2012-10-19 18:49, Ian Hickson wrote:

 What is the attack scenario you are trying to avoid?

 Without a discussion of what problem you're trying to solve, it's unclear
 how to evaluate the proposal.

 The idea of a hash= or checksum= attribute on a href has come up
 before -- about once a year, as far as I can tell! -- but it's always been
 found lacking in one way or another.


I don't want to avoid any attack scenario!

I want trusted information.

If I write an article and link to other documents, I want a solution 
that lets the visitor be sure that the document he opens is the document 
I originally linked to. (And if it's not, he gets informed, so he knows 
that the information may differ from what the article talks about.)



The second point is that verifying whether a file was downloaded 
correctly is a task for a computer, not a human. A standard way of 
providing the verification information would enable browser/plugin 
vendors to do this task.


Re: [whatwg] checksum attribute in a href tag

2012-10-19 Thread Tab Atkins Jr.
On Fri, Oct 19, 2012 at 11:46 AM, A. Rauschenbach rauschenb...@annuo.de wrote:
 On 2012-10-19 18:49, Ian Hickson wrote:
 What is the attack scenario you are trying to avoid?

 Without a discussion of what problem you're trying to solve, it's unclear
 how to evaluate the proposal.

 The idea of a hash= or checksum= attribute on a href has come up
 before -- about once a year, as far as I can tell! -- but it's always been
 found lacking in one way or another.

 I don't want to avoid any attack scenario!

 I want trusted information.

 If I write an article and link to other documents I want a solution that the
 visitor can be sure that the document he opens is the document I originally
 linked to. (And if it's not he gets informed. So he knows that the
 information may differ from the one the article talks about.)

That's also an attack scenario. ^_^

I doubt it would be very useful to use this for confirming that
arbitrary destination pages are the same.  Those can change in minor,
unimportant ways all the time; a lot of pages include some form of
dynamic content that means they'll almost *never* be exactly the same
from pageload to pageload.  It seems highly likely that trying to use
a checksum for this scenario would simply result in the browser
over-warning people, thus making the warning useless.
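
A small sketch of the over-warning problem described above, assuming a hypothetical page template (`render_page`) that embeds a per-request timestamp; the article text and timestamps are made up. Any cryptographic digest changes completely when a single byte differs, so two loads of the "same" page fail to match.

```python
import hashlib


def render_page(article: str, served_at: str) -> bytes:
    """Hypothetical template: the same article plus a per-request timestamp."""
    html = (
        "<html><body><p>" + article + "</p>"
        "<footer>Served at " + served_at + "</footer></body></html>"
    )
    return html.encode("utf-8")


# Two loads of the same article, one second apart.
first = render_page("Important content", "2012-10-19 11:46:00")
second = render_page("Important content", "2012-10-19 11:46:01")

first_digest = hashlib.sha1(first).hexdigest()
second_digest = hashlib.sha1(second).hexdigest()

# The article is unchanged, yet the digests differ, so a checksum on the
# link would report a mismatch on every page load.
pages_match = first_digest == second_digest
```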

Using it specifically to defend against attack scenarios in
*downloads*, on the other hand, is more likely to be useful.
Downloads don't change nearly as much as pages do, so a change is more
likely to be a result of something you don't want, rather than simply
something incidental.

However, check out the threads that Hixie referenced.  The upsides and
downsides of something like this have been discussed quite a bit
already.

~TJ


Re: [whatwg] checksum attribute in a href tag

2012-10-19 Thread Ian Hickson
On Fri, 19 Oct 2012, A. Rauschenbach wrote:
 
 If I write an article and link to other documents I want a solution that 
 the visitor can be sure that the document he opens is the document I 
 originally linked to. (And if it's not he gets informed. So he knows that 
 the information may differ from the one the article talks about.) 

I don't think this is something that would be very practical. As Tab says, 
pages change a _lot_. You'd just always be getting a warning that the page 
had changed, even if the important content had not.


 The second point is that verification if a file was downloaded correctly 
 is a computer task not a human task. A standard how to give the 
 verification information enables the browser/plugin vendors to do this 
 task.

If the file is downloaded over TLS, then it's already verified. Pretty 
much any attack scenario in which the file can be corrupted 
(man-in-the-middle, server-side corruption, client-side corruption, etc) 
can attack the file just as easily as the hash, so there's not really any 
gain from checking a hash. (This applies equally well to manual checking.) 
Providing such a feature would, in most cases, just give users a false 
sense of security.
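
To illustrate the point that the hash is as exposed as the file, here is a minimal sketch, assuming the attribute syntax proposed in this thread with SHA1 only. The helpers `make_link` and `link_matches` and the sample payloads are made up for illustration and do not correspond to any real browser behaviour.

```python
import hashlib


def make_link(href: str, payload: bytes) -> str:
    """Build a link in the proposed (hypothetical) checksum syntax."""
    digest = hashlib.sha1(payload).hexdigest()
    return '<a href="' + href + '" checksum="SHA1:' + digest + '">download</a>'


def link_matches(link: str, payload: bytes) -> bool:
    """Check a payload against the SHA1 value embedded in the link."""
    digest = hashlib.sha1(payload).hexdigest()
    return ('checksum="SHA1:' + digest + '"') in link


original = b"genuine file"
modified = b"modified file"

honest_link = make_link("important.file", original)
# Whoever can alter the file on the same server or connection can
# regenerate the link too, and the check then passes for the altered file.
rewritten_link = make_link("important.file", modified)
```

The check only catches a change when the page carrying the link and the file itself cannot both be altered by the same party.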

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] checksum attribute in a href tag

2012-10-19 Thread Nils Dagsson Moskopp
A. Rauschenbach rauschenb...@annuo.de wrote on Fri, 19 Oct 2012
20:46:24 +0200:

 […]

 If I write an article and link to other documents I want a solution 
 that the visitor can be sure that the document he opens is the
 document I originally linked to.

Mirror the information.

-- 
Nils Dagsson Moskopp // erlehmann
http://dieweltistgarnichtso.net