Re: [whatwg] Parsing RFC3339 constructs
On Thu, 20 Aug 2009, Christoph Päper wrote: Ian Hickson: On Tue, 11 Aug 2009, Nils Dagsson Moskopp wrote: On Tuesday, 11.08.2009, 07:27 +, Ian Hickson wrote: On Tue, 11 Aug 2009, Julian Reschke wrote: Ian Hickson wrote: - the literal letters T and Z must be uppercase It simplifies processing a tiny amount. So for a tiny win, you change the format? By a tiny amount, yes. It will be interesting to see if parsers choose to also accept lowercase letters. I'd half-expect that to work, not least because there may already be RFC-compliant libraries in the wild. The spec explicitly points out that implementors shouldn't naively use ISO8601 libraries. That is not naivety! It is a standard's duty to correctly integrate other standards. If HTML 5 uses a subset of ISO 8601 then content processors must be able to use generic ISO-conformant parsers. No, sorry, ISO-8601 is too vague to make that possible. It doesn't define error handling, for instance. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Parsing RFC3339 constructs
Ian Hickson: On Tue, 11 Aug 2009, Nils Dagsson Moskopp wrote: On Tuesday, 11.08.2009, 07:27 +, Ian Hickson wrote: On Tue, 11 Aug 2009, Julian Reschke wrote: Ian Hickson wrote: - the literal letters T and Z must be uppercase It simplifies processing a tiny amount. So for a tiny win, you change the format? By a tiny amount, yes. It will be interesting to see if parsers choose to also accept lowercase letters. I'd half-expect that to work, not least because there may already be RFC-compliant libraries in the wild. The spec explicitly points out that implementors shouldn't naively use ISO8601 libraries. That is not naivety! It is a standard's duty to correctly integrate other standards. If HTML 5 uses a subset of ISO 8601 then content processors must be able to use generic ISO-conformant parsers. Content providers OTOH may be required to restrict their generators. As soon as they discover what else is accepted by browsers etc., though, they will use it. Ergo HTML 5 should, from the beginning, support whatever parts of ISO 8601 common libraries cover already. Unless, of course, we presume that HTML implementors will use homebrewed code only.
Re: [whatwg] Parsing RFC3339 constructs
On Tue, 11 Aug 2009, Nils Dagsson Moskopp wrote: On Tuesday, 11.08.2009, 07:27 +, Ian Hickson wrote: On Tue, 11 Aug 2009, Julian Reschke wrote: Ian Hickson wrote: On Mon, 27 Apr 2009, Asbjørn Ulsberg wrote: On Mon, 27 Apr 2009 12:59:11 +0200, Julian Reschke julian.resc...@gmx.de wrote: - the literal letters T and Z must be uppercase Any technical reason why they have to? Any reason why they don't? It simplifies processing a tiny amount. So for a tiny win, you change the format? By a tiny amount, yes. It will be interesting to see if parsers choose to also accept lowercase letters. I'd half-expect that to work, not least because there may already be RFC-compliant libraries in the wild. The spec explicitly points out that implementors shouldn't naively use ISO8601 libraries. So if they do by the time HTML n is the standard, will the uppercase restriction be removed in HTML n+1? HTML5 itself will have to change if the implementations don't implement what it says. -- Ian Hickson
Re: [whatwg] Parsing RFC3339 constructs
On Tue, 11 Aug 2009, Julian Reschke wrote: Ian Hickson wrote: On Mon, 27 Apr 2009, Asbjørn Ulsberg wrote: On Mon, 27 Apr 2009 12:59:11 +0200, Julian Reschke julian.resc...@gmx.de wrote: - the literal letters T and Z must be uppercase Any technical reason why they have to? Any reason why they don't? It simplifies processing a tiny amount. So for a tiny win, you change the format? By a tiny amount, yes. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Parsing RFC3339 constructs
On Tuesday, 11.08.2009, 07:27 +, Ian Hickson wrote: On Tue, 11 Aug 2009, Julian Reschke wrote: Ian Hickson wrote: On Mon, 27 Apr 2009, Asbjørn Ulsberg wrote: On Mon, 27 Apr 2009 12:59:11 +0200, Julian Reschke julian.resc...@gmx.de wrote: - the literal letters T and Z must be uppercase Any technical reason why they have to? Any reason why they don't? It simplifies processing a tiny amount. So for a tiny win, you change the format? By a tiny amount, yes. It will be interesting to see if parsers choose to also accept lowercase letters. I'd half-expect that to work, not least because there may already be RFC-compliant libraries in the wild. So if they do by the time HTML n is the standard, will the uppercase restriction be removed in HTML n+1? Cheers -- Nils Dagsson Moskopp http://dieweltistgarnichtso.net
Re: [whatwg] Parsing RFC3339 constructs
On Mon, 27 Apr 2009, Asbjørn Ulsberg wrote: On Mon, 27 Apr 2009 12:59:11 +0200, Julian Reschke julian.resc...@gmx.de wrote: - the literal letters T and Z must be uppercase Any technical reason why they have to? Any reason why they don't? It simplifies processing a tiny amount. It would help people understand what the difference to RFC 3339 is. Indeed, and this is exactly what we did in RFC 4287, as I've pointed out previously. And I can't say that date parsing has proven to be an issue there at all, even with the little work we did on narrowing down and tightening the syntax. Section 3.3 of RFC 4287 says: A Date construct is an element whose content MUST conform to the date-time production in [RFC3339]. In addition, an uppercase T character MUST be used to separate date and time, and an uppercase Z character MUST be present in the absence of a numeric time zone offset. Perhaps HTML5 needs more detailing than this for parsing, but not referencing RFC 3339 just for the sake of not referencing RFC 3339 doesn't make much sense imho. For authoring (and parsing, in fact), RFC 3339 plus a couple of additional guidelines have proven to be enough for implementors of RFC 4287, so I assume HTML5 could be better off doing the same, no? HTML5 now references ISO8601 directly in a non-normative note explaining why ISO8601 isn't referenced normatively. -- Ian Hickson
Re: [whatwg] Parsing RFC3339 constructs
On Fri, 5 Jun 2009, Julian Reschke wrote: Ian Hickson wrote: On Fri, 5 Jun 2009, Julian Reschke wrote: Ian Hickson wrote: Michael(tm) Smith wrote: It seems pretty clear that there isn't anything else to refer to for the date/time parsing rules -- but to me at least, specifying those rules seems orthogonal to specifying the date/time syntax, and I would think the syntax could just be defined by making reference to the productions[1] in RFC 3339 (instead of completely redefining them), while stating any exceptions. [1] http://tools.ietf.org/html/rfc3339#section-5.6 I think the exceptions might just amount to: - the literal letters T and Z must be uppercase Any technical reason why they have to? Not really. We just need a separator. So why make it different from RFC 3339? Limiting the syntax to the simplest possible syntax was an intentional design choice intended to ease the burden on implementors and authors. In practice, pretty much every time we've made syntax case-insensitive, we've ended up having trouble because of it. If this was a totally new syntax, I would agree. But as something based on ISO8601 (and thereby also RFC 3339) it appears to be a bad idea to make it less compatible just for that reason. We've seriously simplified the ISO-8601 syntax in many more ways than just this. This was a conscious design decision. The HTML5 spec defines exactly how to parse dates. Implementors are required to implement what the spec describes, so reusing libraries is implicitly not likely to be useful here. RFC3339 isn't even a particularly important one in the grand scheme of things (ISO8601 comes to mind as a much higher-profile example). I think it's unfortunate that HTML5 doesn't allow using an off-the-shelf parser. But if it doesn't, and the temptation *will* be there to use them, I'd recommend stating it very clearly. Done. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. 
Re: [whatwg] Parsing RFC3339 constructs
Ian Hickson wrote: If this was a totally new syntax, I would agree. But as something based on ISO8601 (and thereby also RFC 3339) it appears to be a bad idea to make it less compatible just for that reason. We've seriously simplified the ISO-8601 syntax in many more ways than just this. This was a conscious design decision. Yes, the same decision was made for RFC 3339 (and the similar W3C Note). I was recommending to stay closer to those, not to ISO8601. ... BR, Julian
Re: [whatwg] Parsing RFC3339 constructs
On Fri, 5 Jun 2009, Julian Reschke wrote: Ian Hickson wrote: Michael(tm) Smith wrote: It seems pretty clear that there isn't anything else to refer to for the date/time parsing rules -- but to me at least, specifying those rules seems orthogonal to specifying the date/time syntax, and I would think the syntax could just be defined by making reference to the productions[1] in RFC 3339 (instead of completely redefining them), while stating any exceptions. [1] http://tools.ietf.org/html/rfc3339#section-5.6 I think the exceptions might just amount to: - the literal letters T and Z must be uppercase Any technical reason why they have to? Not really. We just need a separator. So why make it different from RFC 3339? Limiting the syntax to the simplest possible syntax was an intentional design choice intended to ease the burden on implementors and authors. In practice, pretty much every time we've made syntax case-insensitive, we've ended up having trouble because of it. - a year must be four or more digits, and must be greater that zero a year must be four or more digits -- sounds like an alternative format that an additional RFC, updating RFC 3339 could specify. must be greater that zero -- that's not syntax :-) So yes, I think referring to RFC 3339, even if it's just a narrative mention, would be good. Why? Because it explains to readers how this is different. That is important because it's natural to look for existing libraries to parse date formats. The HTML5 spec defines exactly how to parse dates. Implementors are required to implement what the spec describes, so reusing libraries is implicitly not likely to be useful here. RFC3339 isn't even a particularly important one in the grand scheme of things (ISO8601 comes to mind as a much higher-profile example). I'm certainly not proposing to go through every date format spec and explain how the rules in HTML5 differ from those rules. That is the kind of material that belongs in support documents. 
-- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Parsing RFC3339 constructs
Ian Hickson wrote: On Fri, 5 Jun 2009, Julian Reschke wrote: Ian Hickson wrote: Michael(tm) Smith wrote: It seems pretty clear that there isn't anything else to refer to for the date/time parsing rules -- but to me at least, specifying those rules seems orthogonal to specifying the date/time syntax, and I would think the syntax could just be defined by making reference to the productions[1] in RFC 3339 (instead of completely redefining them), while stating any exceptions. [1] http://tools.ietf.org/html/rfc3339#section-5.6 I think the exceptions might just amount to: - the literal letters T and Z must be uppercase Any technical reason why they have to? Not really. We just need a separator. So why make it different from RFC 3339? Limiting the syntax to the simplest possible syntax was an intentional design choice intended to ease the burden on implementors and authors. In practice, pretty much every time we've made syntax case-insensitive, we've ended up having trouble because of it. If this was a totally new syntax, I would agree. But as something based on ISO8601 (and thereby also RFC 3339) it appears to be a bad idea to make it less compatible just for that reason. - a year must be four or more digits, and must be greater that zero a year must be four or more digits -- sounds like an alternative format that an additional RFC, updating RFC 3339 could specify. must be greater that zero -- that's not syntax :-) So yes, I think referring to RFC 3339, even if it's just a narrative mention, would be good. Why? Because it explains to readers how this is different. That is important because it's natural to look for existing libraries to parse date formats. The HTML5 spec defines exactly how to parse dates. Implementors are required to implement what the spec describes, so reusing libraries is implicitly not likely to be useful here. RFC3339 isn't even a particularly important one in the grand scheme of things (ISO8601 comes to mind as a much higher-profile example). 
I think it's unfortunate that HTML5 doesn't allow using an off-the-shelf parser. But if it doesn't, and the temptation *will* be there to use them, I'd recommend stating it very clearly. I'm certainly not proposing to go through every date format spec and explain how the rules in HTML5 differ from those rules. That is the kind of material that belongs in support documents. BR, Julian
Re: [whatwg] Parsing RFC3339 constructs
On Mon, 27 Apr 2009, Julian Reschke wrote: Michael(tm) Smith wrote: Ian Hickson i...@hixie.ch, 2009-04-25 05:35 +: On Fri, 2 Jan 2009, Asbjørn Ulsberg wrote: Reading the spec, I have to wonder: Does HTML5 need to specify as much as it does inline? Can't more of it be referenced to ISO 8601 or even better; RFC 3339? I really fancy how Atom (RFC 4287) has defined date constructs: http://www.atompub.org/rfc4287.html#date.constructs Does not RFC 3339 defined date and time in a satisfactory manner to use directly in HTML5? The problem isn't so much the syntax definitions as the parsing definitions. We need very specific parsing rules; it's not clear that there is anything to refer to that does the job we need here. It seems pretty clear that there isn't anything else to refer to for the date/time parsing rules -- but to me at least, specifying those rules seems orthogonal to specifying the date/time syntax, and I would think the syntax could just be defined by making reference to the productions[1] in RFC 3339 (instead of completely redefining them), while stating any exceptions. [1] http://tools.ietf.org/html/rfc3339#section-5.6 I think the exceptions might just amount to: - the literal letters T and Z must be uppercase Any technical reason why they have to? Not really. We just need a separator. - a year must be four or more digits, and must be greater that zero a year must be four or more digits -- sounds like an alternative format that an additional RFC, updating RFC 3339 could specify. must be greater that zero -- that's not syntax :-) So yes, I think referring to RFC 3339, even if it's just a narrative mention, would be good. Why? Ian replied: I don't understand what that would gain us. It would help people understand what the difference to RFC 3339 is. Why is that important or desirable? It seems that comparisons to other specs would be better placed in other documents. 
HTML5 doesn't even describe how it differs from its previous version (HTML4), why would it include descriptions of differences from otherwise unrelated RFCs? -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Parsing RFC3339 constructs
Michael(tm) Smith wrote: Ian Hickson i...@hixie.ch, 2009-04-25 05:35 +: On Fri, 2 Jan 2009, Asbjørn Ulsberg wrote: Reading the spec, I have to wonder: Does HTML5 need to specify as much as it does inline? Can't more of it be handled by reference to ISO 8601 or, even better, RFC 3339? I really fancy how Atom (RFC 4287) has defined date constructs: http://www.atompub.org/rfc4287.html#date.constructs Does RFC 3339 not define date and time in a manner satisfactory enough to use directly in HTML5? The problem isn't so much the syntax definitions as the parsing definitions. We need very specific parsing rules; it's not clear that there is anything to refer to that does the job we need here. It seems pretty clear that there isn't anything else to refer to for the date/time parsing rules -- but to me at least, specifying those rules seems orthogonal to specifying the date/time syntax, and I would think the syntax could just be defined by making reference to the productions[1] in RFC 3339 (instead of completely redefining them), while stating any exceptions. [1] http://tools.ietf.org/html/rfc3339#section-5.6 I think the exceptions might just amount to: - the literal letters T and Z must be uppercase Any technical reason why they have to? - a year must be four or more digits, and must be greater than zero "a year must be four or more digits" -- sounds like an alternative format that an additional RFC, updating RFC 3339, could specify. "must be greater than zero" -- that's not syntax :-) So yes, I think referring to RFC 3339, even if it's just a narrative mention, would be good. Ian replied: I don't understand what that would gain us. It would help people understand what the difference to RFC 3339 is. BR, Julian
Re: [whatwg] Parsing RFC3339 constructs
On Mon, 27 Apr 2009 12:59:11 +0200, Julian Reschke julian.resc...@gmx.de wrote: - the literal letters T and Z must be uppercase Any technical reason why they have to? Any reason why they don't? It would help people understand what the difference to RFC 3339 is. Indeed, and this is exactly what we did in RFC 4287, as I've pointed out previously. And I can't say that date parsing has proven to be an issue there at all, even with the little work we did on narrowing down and tightening the syntax. Section 3.3 of RFC 4287 says: A Date construct is an element whose content MUST conform to the date-time production in [RFC3339]. In addition, an uppercase T character MUST be used to separate date and time, and an uppercase Z character MUST be present in the absence of a numeric time zone offset. Perhaps HTML5 needs more detailing than this for parsing, but not referencing RFC 3339 just for the sake of not referencing RFC 3339 doesn't make much sense imho. For authoring (and parsing, in fact), RFC 3339 plus a couple of additional guidelines have proven to be enough for implementors of RFC 4287, so I assume HTML5 could be better off doing the same, no? -- Asbjørn Ulsberg -=|=- asbj...@ulsberg.no «He's a loathsome offensive brute, yet I can't look away»
Re: [whatwg] Parsing RFC3339 constructs
Ian Hickson i...@hixie.ch, 2009-04-25 05:35 +: On Fri, 2 Jan 2009, Asbjørn Ulsberg wrote: Reading the spec, I have to wonder: Does HTML5 need to specify as much as it does inline? Can't more of it be handled by reference to ISO 8601 or, even better, RFC 3339? I really fancy how Atom (RFC 4287) has defined date constructs: http://www.atompub.org/rfc4287.html#date.constructs Does RFC 3339 not define date and time in a manner satisfactory enough to use directly in HTML5? The problem isn't so much the syntax definitions as the parsing definitions. We need very specific parsing rules; it's not clear that there is anything to refer to that does the job we need here. It seems pretty clear that there isn't anything else to refer to for the date/time parsing rules -- but to me at least, specifying those rules seems orthogonal to specifying the date/time syntax, and I would think the syntax could just be defined by making reference to the productions[1] in RFC 3339 (instead of completely redefining them), while stating any exceptions. [1] http://tools.ietf.org/html/rfc3339#section-5.6 I think the exceptions might just amount to: - the literal letters T and Z must be uppercase - a year must be four or more digits, and must be greater than zero -- Michael(tm) Smith http://people.w3.org/mike/
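The two exceptions proposed in the message above can be made concrete with a small sketch. It is only an illustration, not the HTML5 algorithm and not any library's API: the function name `matches_proposed_syntax` and the regular expression are hypothetical, the pattern follows the general shape of the RFC 3339 section 5.6 date-time production, and it checks syntax only (no range checks on month, day, hour and so on).

```python
import re

# Hypothetical pattern: the general shape of the RFC 3339 date-time
# production, narrowed by the two exceptions from the message above.
#  - "T" and "Z" are matched literally, so lowercase "t"/"z" is rejected
#  - the year is four OR MORE ASCII digits
_DATE_TIME = re.compile(
    r"(?P<year>[0-9]{4,})-[0-9]{2}-[0-9]{2}"      # full-date
    r"T[0-9]{2}:[0-9]{2}:[0-9]{2}(?:\.[0-9]+)?"   # "T" partial-time
    r"(?:Z|[+-][0-9]{2}:[0-9]{2})"                # "Z" or numeric offset
)


def matches_proposed_syntax(value: str) -> bool:
    """Return True if value fits the sketched syntax and the year is > 0."""
    match = _DATE_TIME.fullmatch(value)
    if match is None:
        return False
    # "must be greater than zero" is a value constraint rather than syntax,
    # so it is checked separately from the pattern.
    return int(match.group("year")) > 0


if __name__ == "__main__":
    for candidate in ("2009-04-27T12:59:11+02:00", "2009-04-27t12:59:11z"):
        print(candidate, matches_proposed_syntax(candidate))
```

Keeping the year check outside the pattern mirrors the objection raised later in the thread that "greater than zero" is not a syntactic rule.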
Re: [whatwg] Parsing RFC3339 constructs
On Sat, 25 Apr 2009, Michael(tm) Smith wrote: It seems pretty clear that there isn't anything else to refer to for the date/time parsing rules -- but to me at least, specifying those rules seems orthogonal to specifying the date/time syntax, and I would think the syntax could just be defined by making reference to the productions[1] in RFC 3339 (instead of completely redefining them), while stating any exceptions. [1] http://tools.ietf.org/html/rfc3339#section-5.6 I think the exceptions might just amount to: - the literal letters T and Z must be uppercase - a year must be four or more digits, and must be greater than zero I don't understand what that would gain us. -- Ian Hickson
[whatwg] Parsing RFC3339 constructs
On Fri, 2 Jan 2009, Asbjørn Ulsberg wrote: Reading the spec, I have to wonder: Does HTML5 need to specify as much as it does inline? Can't more of it be handled by reference to ISO 8601 or, even better, RFC 3339? I really fancy how Atom (RFC 4287) has defined date constructs: http://www.atompub.org/rfc4287.html#date.constructs Does RFC 3339 not define date and time in a manner satisfactory enough to use directly in HTML5? The problem isn't so much the syntax definitions as the parsing definitions. We need very specific parsing rules; it's not clear that there is anything to refer to that does the job we need here. -- Ian Hickson
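To illustrate the distinction drawn above between a syntax definition and a parsing definition, here is a minimal sketch of what "very specific parsing rules" can look like: a stepwise parser in which every step has exactly one defined outcome, including failure. It is an assumption-laden illustration and not the text of the HTML5 specification; the names `collect_digits` and `parse_date` are hypothetical, and only a date component (year-month-day) is handled.

```python
import calendar
from typing import Optional, Tuple

_DAYS_IN_MONTH = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)


def collect_digits(text: str, pos: int) -> Tuple[str, int]:
    """Collect consecutive ASCII digits from pos; return them and the new position."""
    start = pos
    while pos < len(text) and text[pos] in "0123456789":
        pos += 1
    return text[start:pos], pos


def parse_date(text: str) -> Optional[Tuple[int, int, int]]:
    """Parse 'YYYY-MM-DD' (year of four or more digits); return None on any failure."""
    # Step 1: year, at least four digits and greater than zero.
    digits, pos = collect_digits(text, 0)
    if len(digits) < 4 or int(digits) <= 0:
        return None
    year = int(digits)
    # Step 2: a literal hyphen must follow.
    if pos >= len(text) or text[pos] != "-":
        return None
    # Step 3: month, exactly two digits in the range 1..12.
    digits, pos = collect_digits(text, pos + 1)
    if len(digits) != 2 or not 1 <= int(digits) <= 12:
        return None
    month = int(digits)
    # Step 4: another literal hyphen.
    if pos >= len(text) or text[pos] != "-":
        return None
    # Step 5: day, exactly two digits and valid for that month and year.
    digits, pos = collect_digits(text, pos + 1)
    last_day = _DAYS_IN_MONTH[month - 1]
    if month == 2 and calendar.isleap(year):
        last_day = 29
    if len(digits) != 2 or not 1 <= int(digits) <= last_day:
        return None
    # Step 6: nothing may follow the day.
    if pos != len(text):
        return None
    return year, month, int(digits)
```

Unlike a grammar alone, each step here says what happens to malformed input, which is the property the reply above says a referenced specification would need to provide.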