Re: [whatwg] behavior

Tue, 15 Sep 2009 05:53:52 -0700

Ian Hickson wrote:
Since the whole point of text/plain sniffing is to work around a known issue where content is reliably mis-marked as text/plain, and since in this case there is a source of MIME information that's more reliable than that, it's not clear to me why we want to continue sniffing.

Of course if there is no @type there is no problem; I'm specifically concerned about the @type="text/plain" case here.

What exactly are you proposing here?

 - Always honour type="" if it's a UA-supported type, ignoring
   server-provided content-type?
 - Always honour type="" without sniffing if it matches the
   server-provided content-type, even if normally that type would be
   sniffed?
 - Just honour type="text/plain" regardless of the server type, but for
   other UA-supported type=""s, use the server type?

My suggestion is to only perform text/plain "is this text or binary" sniffing where it belongs: on the HTTP level; since it's a workaround for a particular HTTP server bug. It shouldn't affect other type metadata.

Perform the sniffing such that it detects as either text/plain or application/octet-stream.

Then if it's application/octet-stream we'll end up using the @type. Though see below on other sniffing issues.
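
The suggestion above can be sketched roughly as follows. This is only an illustration, not text from any spec: the function names, the 512-byte window, and the exact set of "binary" bytes are assumptions made for the example, and byte-order-mark handling is left out.

```python
# Hypothetical sketch of "sniff text/plain only at the HTTP level".
# The byte set below is an assumed definition of a "binary byte".
BINARY_BYTES = frozenset(
    list(range(0x00, 0x09)) + [0x0B] + list(range(0x0E, 0x1B)) + list(range(0x1C, 0x20))
)


def sniff_text_or_binary(body: bytes, limit: int = 512) -> str:
    """Return text/plain unless a binary byte appears in the first `limit` bytes."""
    if any(b in BINARY_BYTES for b in body[:limit]):
        return "application/octet-stream"
    return "text/plain"


def effective_type(server_type, type_attr, body):
    """Pick the type to use for the resource under the proposal.

    server_type: Content-Type sent by the server.
    type_attr:   value of @type, or None if the attribute is absent.
    """
    resource_type = server_type
    # The text-or-binary check only applies to data labeled text/plain.
    if server_type == "text/plain":
        resource_type = sniff_text_or_binary(body)
    # Only fall back to @type when the HTTP-level result is octet-stream.
    if resource_type == "application/octet-stream" and type_attr:
        return type_attr
    return resource_type
```

With this shape, data labeled text/plain that really is text stays text/plain no matter what @type says, and @type only comes into play once the HTTP-level check has decided the data is binary.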

This does fail to sniff text/plain as the various "non-scriptable" types, but I question how desirable that is anyway, honestly. If we want to preserve this property without clobbering @type="text/plain" then I need to think a bit more about how to specify the behavior here.

Maybe your option 2 is what would give that behavior... I can work through it if you'd like.

Your option 1 would be OK if that's what we want (though it is a change from HTML4 and from what UAs at least _try_ to implement now; I'm not sure whether it's desirable on its own). Your option 3 is a bit too magic for text/plain in @type; unnecessarily so unless we want to go the full option 1 route. All in my opinion, of course.

My concern about text/plain data being sniffed as text/html by your current algorithm (even with the changes you've made) seems to remain unaddressed.

I thought I had. Can you walk me through how anything labeled text/plain could get sniffed as text/html with the new text?

Hmm. Assume the type attribute is not set and HTML data is sent as text/plain and contains a "binary byte" in the first 512 bytes (can just stick it in the <title> or something). Also assume no plug-in claims to support the URI's file extension.

At step 3, the resource type is set to text/plain.

At step 4, the resource type is sniffed as application/octet-stream, since text/html is marked as scriptable in [MIMESNIFF].

At step 5, there is no @type, and the resource type is application/octet-stream, so the resource type is changed to unknown.

At step 6, nothing changes since there is no plug-in supporting the URI's file extension.

At step 7, the resource type is "unknown", so it is changed to the "sniffed type of the resource".
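
To make the walk-through concrete, here is a toy trace of steps 3 through 7 as described above. It is a sketch under stated assumptions, not the spec's algorithm: the helper names, the set of binary bytes, and the very small HTML signature list are all invented for illustration, and step 6 assumes no plug-in claims the file extension. The flag `full_algorithm_at_step7` is a hypothetical switch between the two readings of "sniffed type of the resource".

```python
# Assumed definition of a "binary byte" for the text/plain check.
BINARY_BYTES = frozenset(
    list(range(0x00, 0x09)) + [0x0B] + list(range(0x0E, 0x1B)) + list(range(0x1C, 0x20))
)


def sniff_text_or_binary(body):
    """text/plain special case: never yields a scriptable type."""
    if any(b in BINARY_BYTES for b in body[:512]):
        return "application/octet-stream"
    return "text/plain"


def sniff_unknown(body):
    """Toy stand-in for unknown-type sniffing; only spots a few HTML tags."""
    head = body[:512].lstrip(b"\t\n\x0c\r ").upper()
    if head.startswith((b"<!DOCTYPE HTML", b"<HTML", b"<HEAD", b"<TITLE")):
        return "text/html"
    return "application/octet-stream"


def trace(server_type, type_attr, body, full_algorithm_at_step7=False):
    """Return a list of (step, resource type) pairs for steps 3 to 7."""
    steps = []
    resource_type = server_type                      # step 3
    steps.append((3, resource_type))
    if resource_type == "text/plain":                # step 4
        resource_type = sniff_text_or_binary(body)
    steps.append((4, resource_type))
    if type_attr is None and resource_type == "application/octet-stream":
        resource_type = "unknown"                    # step 5
    steps.append((5, resource_type))
    steps.append((6, resource_type))                 # step 6: no plug-in, no change
    if resource_type == "unknown":                   # step 7
        if full_algorithm_at_step7 and server_type == "text/plain":
            resource_type = sniff_text_or_binary(body)
        else:
            resource_type = sniff_unknown(body)
    steps.append((7, resource_type))
    return steps


# HTML sent as text/plain with a binary byte inside <title>, no @type.
body = b"<title>\x01</title><p>hi</p>"
narrow = trace("text/plain", None, body)
full = trace("text/plain", None, body, full_algorithm_at_step7=True)
```

Under the narrow reading the trace ends at text/html, which is the concern; when the text/plain special case is applied again at step 7, it ends at application/octet-stream.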

Maybe I simply misread that last reference: I contrasted it with the wording of step 4 and took it to mean only section 5 of [MIMESNIFF], whereas you mean to apply the full sniffing algorithm, including the special cases for text/plain. In that case there wouldn't be a problem (the data would get sniffed as application/octet-stream). That wasn't quite clear, but I can see now that this is probably what you meant.

-Boris
