On Jan 6, 2012, at 7:11 PM, Glenn Maynard <[email protected]> wrote:

> On Fri, Jan 6, 2012 at 12:13 PM, Jarred Nicholls <[email protected]> wrote:
> WebKit is used in many walled garden environments, so we consider these 
> scenarios, but as a secondary goal to our primary goal of being a standards 
> compliant browser engine.  The point being, there will always be content 
> that's created solely for WebKit, so that's not a good argument to make.  So 
> generally speaking, if someone is aiming to create content that's x-browser 
> compatible, they'll do just that and use the least common denominators.
> 
> If you support UTF-16 here, then people will use it.  That's always the 
> pattern on the web--one browser implements something extra, and everyone else 
> ends up having to implement it--whether or not it was a good idea--because 
> people accidentally started depending on it.  I don't know why we have to 
> keep repeating this mistake.
> 
> We're not adding anything here, it's a matter of complicating and "taking 
> away" from our decoder for one particular case.  You're acting like we're 
> adding UTF-32 support for the first time.
> 
> Of course you are; you're adding UTF-16 and UTF-32 support to the 
> responseType == "json" API.
> 
> Also, since JSON uses zero-byte detection, which isn't used by HTML at all, 
> you'd still need code in your decoder to support that--which means you're 
> forcing everyone else to complicate *their* decoders with this special case.
> 
> XHR's behavior, if the change I suggested is accepted, shouldn't require 
> special cases in a decoding layer.  I'd have the decoder expose the final 
> encoding in use (which I'd expect to be available already), and when 
> .response is queried, return null if the final encoding used by the decoder 
> wasn't UTF-8.  This means the decoding would still take place for other 
> encodings, but the end result would be discarded by XHR.  This puts the 
> handling for this restriction within the XHR layer, rather than at the 
> decoder layer.

That's why I'd like to see the spec changed to make clear that the response is 
discarded if an encoding was supplied and isn't UTF-8.
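
To illustrate the XHR-layer approach discussed above, here is a minimal sketch 
in TypeScript.  It is not WebKit code and not spec text: DecodeResult, 
finalEncoding and jsonResponse are hypothetical names, and returning null for 
unparseable JSON is an assumption made for the example.  The decoder runs 
unchanged; only the XHR side checks which encoding it ended up using.

```typescript
// Hypothetical shape of what a shared text decoder hands back to XHR.
interface DecodeResult {
  text: string;          // the decoded payload
  finalEncoding: string; // encoding the decoder actually used, e.g. "UTF-16LE"
}

// Hypothetical XHR-side getter for responseType == "json".
// The restriction is enforced here, not inside the decoder.
function jsonResponse(decoded: DecodeResult): unknown {
  if (decoded.finalEncoding.toUpperCase() !== "UTF-8") {
    return null; // decoding happened, but the result is discarded
  }
  try {
    return JSON.parse(decoded.text);
  } catch {
    return null; // assumption: unparseable JSON also yields null
  }
}
```

With this shape, a UTF-16 payload is still decoded by the shared decoder, but 
.response comes back null, so no special case is needed in the decoding layer.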

> 
> I said:
> Also, I'm a bit confused.  You talk about the rudimentary encoding
> detection in the JSON spec (rfc4627 sec3), but you also mention HTTP
> mechanisms (HTTP headers and overrideMimeType).  These are separate
> and unrelated.  If you're using HTTP mechanisms, then the JSON spec
> doesn't enter into it.  If you're using both HTTP headers (HTTP) and
> UTF-32 BOM detection (rfc4627), then you're using a strange mix of the
> two.  I can't tell what mechanism you're actually using.
> 
> Correction: rfc4627 doesn't describe BOM detection, it describes zero-byte 
> detection.  My question remains, though: what exactly are you doing?  Do you 
> do zero-byte detection?  Do you do BOM detection?  What's the order of 
> precedence between zero-byte and/or BOM detection, HTTP Content-Type headers, 
> and overrideMimeType if they disagree?  All of this would need to be 
> specified; currently none of it is.
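
For reference, the zero-byte detection in rfc4627 sec3 can be sketched roughly 
as below.  This is only an illustration of the RFC's four-octet pattern table, 
not a description of what WebKit or any other engine does.  The names 
sniffJsonEncoding and JsonEncoding are hypothetical, falling back to UTF-8 for 
inputs shorter than four bytes is an assumption, and BOMs are deliberately not 
handled.

```typescript
type JsonEncoding = "UTF-8" | "UTF-16BE" | "UTF-16LE" | "UTF-32BE" | "UTF-32LE";

// Guess the encoding from the pattern of zero bytes in the first four octets.
// This relies on a JSON text starting with two ASCII characters.
function sniffJsonEncoding(bytes: Uint8Array): JsonEncoding {
  if (bytes.length < 4) return "UTF-8"; // assumption: too short to sniff
  const z = [bytes[0] === 0, bytes[1] === 0, bytes[2] === 0, bytes[3] === 0];
  if (z[0] && z[1] && z[2] && !z[3]) return "UTF-32BE";  // 00 00 00 xx
  if (z[0] && !z[1] && z[2] && !z[3]) return "UTF-16BE"; // 00 xx 00 xx
  if (!z[0] && z[1] && z[2] && z[3]) return "UTF-32LE";  // xx 00 00 00
  if (!z[0] && z[1] && !z[2] && z[3]) return "UTF-16LE"; // xx 00 xx 00
  return "UTF-8";                                        // xx xx xx xx
}
```

Nothing in this check looks at a Content-Type header or overrideMimeType, which 
is why the order of precedence between the mechanisms has to be defined 
separately.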

None of that matters if a specific codec is the be-all and end-all.  If that's 
the consensus, then that's it, period.

WebKit shares a single text decoder globally for HTML, XML, plain text, etc.; 
the XHR payload runs through it before it is passed to JSON.parse.  Read the 
code if you're interested.  I would need to change the text decoder to skip BOM 
detection for this one case, unless the spec added that wording about 
discarding when encoding != UTF-8; then it can be enforced entirely in XHR with 
no decoder changes.  I don't want to get hung up on explaining WebKit's 
specific impl. details.

> 
>  
> "without breaking existing content" and yet killing UTF-16 and UTF-32 support 
> just for responseType "json" would break existing UTF-16 and UTF-32 JSON.  
> Well, which is it?
> 
> This is a new feature; there isn't yet existing content using a responseType 
> of "json" to be broken.
> 
> Don't get me wrong, I agree with pushing UTF-8 as the sole text encoding for 
> the web platform.  But it's also plausible to push these restrictions not 
> just in one spot in XHR, but across the web platform
> 
> I've yet to see a workable proposal to do this across the web platform, due 
> to backwards-compatibility.  That's why it's being done more narrowly, where 
> it can be done without breaking existing pages.  If you have any novel ideas 
> to do this across the platform, I guarantee everyone on the list would like 
> to hear them.  Failing that, we should do what we can where we can.
> 
> and also where the web platform defers to external specs (e.g. JSON).  In 
> this particular case, an author will be more likely to just use responseText 
> + JSON.parse for content he/she cannot control - the content won't end up 
> changing and our initiative is circumvented.
> 
> Of course not.  It tells the developer that something's wrong, and he has the 
> choice of working around it or fixing his service.  If just 25% of those 
> people make the right choice, this is a win.  It also helps discourage new 
> services from being written using legacy encodings.  We can't stop people 
> from doing the wrong thing, but that doesn't mean we shouldn't point people 
> in the right direction.
> 
> This is an editor's draft of a spec, it's not a recommendation, so it's 
> hardly a violation of anything.
> 
> This is the worst thing I've seen anyone say in here in a long time.

Wtaf, why is everyone taking this point and driving it so out of context? I was 
trying to make a point that things change overnight...I've already explained 
and I won't do it again.  Relax already, it's Friday!

> 
> On Fri, Jan 6, 2012 at 12:25 PM, Julian Reschke <[email protected]> wrote:
> One could argue that it isn't a race "to the bottom" when the component  
> accepts what is defined as valid (by the media type); and that the real 
> problem is that another spec tries to profile that.
> 
> First off, it's common and perfectly normal for an API exposing features from 
> another spec to explicitly limit the allowed profile of that spec.  Saying 
> "JSON through this API must be UTF-8" is perfectly OK.
> 
> Second, this isn't an issue of the JSON spec at all.  As described so far 
> (somewhat vaguely), his charset detection *isn't* what's described by 
> rfc4627, which only describes UTF-16 and UTF-32 zero-byte detection (and that 
> vaguely--it isn't even normative).  Rather, it's also mixing in bits from 
> HTTP (the Content-Type header, which I assume is what was meant by "dictated 
> by the server" in the original message) and XHR (the overrideMimeType 
> method).  None of that is defined by rfc4627, which makes WebKit's behavior 
> ad hoc, and none of this will be fixed by changes to rfc4627 (which obviously 
> shouldn't talk about HTTP headers).
> 
> -- 
> Glenn Maynard
> 
