Re: [whatwg] Fwd: Remarks on HTML5 (ASCII / Unicode)

2009-06-03 Thread Ian Hickson
On Sat, 4 Apr 2009, Innovimax SARL wrote:
 
 In 2.3 Case-sensitivity and string comparison
 
 Please replace
 
 Converting a string to uppercase
 and
 Converting a string to lowercase
 
 by respectively
 
 Converting a string to uppercase ASCII
 and
 Converting a string to lowercase ASCII

Done.
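
For illustration, a minimal Python sketch of what the "ASCII" qualifier means in practice. The function names are invented for this example and are not taken from the spec. Only the letters A-Z and a-z are mapped; every other character is left untouched, unlike Unicode-aware case conversion.

```python
import string

# Translation tables that touch only the 26 ASCII letters.
_ASCII_UPPER = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_uppercase(s: str) -> str:
    """Uppercase a-z only; all other characters are left unchanged."""
    return s.translate(_ASCII_UPPER)


def ascii_lowercase(s: str) -> str:
    """Lowercase A-Z only; all other characters are left unchanged."""
    return s.translate(_ASCII_LOWER)


if __name__ == "__main__":
    print(ascii_uppercase("straße"))  # the sharp s is not expanded
    print("straße".upper())           # Unicode-aware conversion, for comparison
    print(ascii_lowercase("DIV"))
```

The practical difference shows up with characters such as the sharp s or the Kelvin sign, which Unicode case mapping changes and ASCII-only mapping leaves alone.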

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] cross-domain scrollIntoView on frames and iframes

2009-06-03 Thread Ian Hickson
On Fri, 3 Apr 2009, Ojan Vafai wrote:

 I'm suggesting an addition to cross-domain (i)frames that allows 
 scrolling specific content into view. The use case is sites that 
 aggregate data from many sites (e.g. search engines) and want to display 
 that data in an iframe. They can load the page in an iframe, but they 
 have no way to make the content visible as they don't have access to the 
 iframe's contents.
 
 A few possible APIs come to mind. I personally prefer the javascripty 
 option below, but I'll include another one for good measure.
 
 1) Add a scrollPathIntoView (with a better name) on iframes that takes 
 either an xpath or a css selector and scrolls the specified item into 
 view. If no such item exists, it does nothing. If one or more such items 
 exist, it calls scrollIntoView on the first matching item.

 2) Add a CSS or XPath expression to fragment identifiers. The iframe 
 src can be set to http://foo.com/#css(.foo #bar). The same as above 
 applies: if there's no match, it's a no-op; if there is a match, it 
 scrolls the first one into view.
 
 In both cases, no explicit success or failure is returned to the caller 
 as that would leak the iframe's DOM across domains.
 
 This API can obviously be supported on same-domain iframes as well, but 
 it's not really necessary since you can just dig into the DOM of the 
 iframe.

On Mon, 6 Apr 2009, Jonas Sicking wrote:
 
 From my point of view I'm not sure how interesting this whole feature 
 is. We had support in Firefox for XPointer for many years and saw little 
 to no uptake. I'm not even sure anyone complained when we removed the 
 support (which would be pretty remarkable).

It seems that with such an API and with some careful timing measurements, 
you could determine the contents of a foreign iframe. I'm not sure that's 
a good idea.

I tried to come up with some alternative solutions, but I really haven't 
been very successful.



Re: [whatwg] Google's use of FFmpeg in Chromium and Chrome Was: Re: MPEG-1 subset proposal for HTML5 video codec

2009-06-03 Thread Silvia Pfeiffer
On Wed, Jun 3, 2009 at 3:28 PM, Daniel Berlin dan...@google.com wrote:
 On Tue, Jun 2, 2009 at 11:51 PM, Gregory Maxwell gmaxw...@gmail.com wrote:
 On Tue, Jun 2, 2009 at 10:18 PM, Daniel Berlin dan...@google.com wrote:
 On Tue, Jun 2, 2009 at 9:50 PM, Gregory Maxwell gmaxw...@gmail.com wrote:
 On Tue, Jun 2, 2009 at 9:29 PM, Daniel Berlin dan...@google.com wrote:
 [snip]
  I would, however, get in trouble for not having paid patent
 fees for doing so.
 No more or less trouble than you would have gotten in had you gotten
 it from ffmpeg instead of us, which combined with the fact that we do
 For the avoidance of doubt,
 Are you stating that when an end user obtains Chrome from Google they
 do not receive any license to utilize the Google distributed FFMPEG
 code to practice the patented activities essential to H.264 and/or AAC
 decoding, which Google licenses for itself?

 I'm not saying that at all. I'm simply saying that any patent license
 we may have does [not] cause our distribution of ffmpeg to violate the terms
 of the LGPL 2.1

 I now understand that your statement was only that Google's
 distribution of FFMPEG is not in violation of the LGPL due to patent
 licenses. Thank you for clarifying what you have stated. I will ask no
 further questions on that point.


 But I do have one further question:

 Can you please tell me if, when I receive Chrome from you, I also
 receive the patent licensing sufficient to use the Chrome package to
 practice the patents listed in MPEG-LA's 'essential' patent list for
 the decoding of H.264?  I wouldn't want to break any laws.
 Yes, you do.

Ah, that's interesting.


 I believe I know the answer, based on your statement "No more or less
 … than … ffmpeg", as ffmpeg explicitly does not provide any patent
 licensing,
 :)
 Again, that was specifically about ffmpeg as a component of Google
 Chrome, not about Google Chrome as a whole.  Licensing of projects
 that use a lot of open source components with a lot of different
 licenses is a complicated matter, since each component can have a
 license that is separate from the license for the work as a whole.
 I'm trying to be as explicit as I can about which subject I am
 talking about while still providing answers, so that the answers
 are the actual answers, but as you can imagine, it's tricky.  It can
 be hard to differentiate between the questions where people want an
 answer that is more general than what they asked, and those where
 they want to know just about that specific thing.  When it comes to
 matters like these, it's usually better for me to just answer the
 question people actually asked explicitly, and let them ask
 followups, than to try to anticipate what they really wanted to
 know.  It can come off as dodging at times, but I'm doing the best I
 can ;)

Glad you did, since I think this discussion has clarified a lot of
things - at least for me. Thanks a lot!

Regards,
Silvia.


Re: [whatwg] Fwd: Remarks on HTML5 (ASCII / Unicode)

2009-06-03 Thread Innovimax SARL
Thanks, Ian !


-- 
Innovimax SARL
Consulting, Training & XML Development
9, impasse des Orteaux
75020 Paris
Tel : +33 9 52 475787
Fax : +33 1 4356 1746
http://www.innovimax.fr
RCS Paris 488.018.631
SARL au capital de 10.000 €


Re: [whatwg] [html5] r3151 - [] (0) Try to make the magic margin collapsing rule more accurate.

2009-06-03 Thread Kristof Zelechovski
The HTML element cannot have a FIELDSET element as a child.  It can,
however, have a FRAMESET element as a child.
Chris



Re: [whatwg] [html5] r3151 - [] (0) Try to make the magic margin collapsing rule more accurate.

2009-06-03 Thread Ian Hickson
On Wed, 3 Jun 2009, Kristof Zelechovski wrote:

 The HTML element cannot have a FIELDSET element as a child.  It can, 
 however, have a FRAMESET element as a child.

Yes, this is indeed the case. Is there a particular reason you bring this up?



Re: [whatwg] Fwd: Remarks on HTML5 (ASCII / Unicode)

2009-06-03 Thread Ian Hickson
On Sat, 4 Apr 2009, Kristof Zelechovski wrote:

 It seems that getting the element name is not covered at all, it is a core
 interface, so definitions in the HTML specification do not apply.

I don't know what this is in reference to. Could you elaborate on what 
change to the spec you would like?



Re: [whatwg] Fwd: Remarks on HTML5 (ASCII / Unicode)

2009-06-03 Thread Henri Sivonen

On Jun 3, 2009, at 09:45, Ian Hickson wrote:


On Sat, 4 Apr 2009, Innovimax SARL wrote:


In 2.3 Case-sensitivity and string comparison

Please replace

Converting a string to uppercase
and
Converting a string to lowercase

by respectively

Converting a string to uppercase ASCII
and
Converting a string to lowercase ASCII


Done.



Please fix the terminology under Writing HTML documents to match.

--
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/




Re: [whatwg] How long should sessionStorage data persist?

2009-06-03 Thread Ian Hickson
On Sat, 4 Apr 2009, João Eiras wrote:

 On , Jeremy Orlow jor...@google.com wrote:
 
  I think this also applies: NOTE: The lifetime of a browsing context 
  can be unrelated to the lifetime of the actual user agent process 
  itself, as the user agent may support resuming sessions after a 
  restart.
 
 Should that restore sessionStorage data? Aren't you making 
 sessionStorage much more complicated while the same use cases are 
 covered by localStorage? sessionStorage could be optimized to be just a 
 volatile amount of data in memory, but these requirements require 
 sessionStorage to implement the same disk IO heuristics, and a complex 
 heuristic to decide when to erase sessionStorage completely.
 
 I vote for the data to be present just while a page is open or is 
 restored from history or by going back.

User agents aren't required to do the above; but they are allowed to if 
they so desire. So the complexity is entirely optional (and I don't expect 
most UAs to avail themselves of this option).


Re: [whatwg] Google's use of FFmpeg in Chromium and Chrome Was: Re: MPEG-1 subset proposal for HTML5 video codec

2009-06-03 Thread Chris DiBona
Yeah, this is really pretty difficult stuff. The LGPL is probably one of
the least understood and most complicated free software licenses.

Chris


-- 
Open Source Programs Manager, Google Inc.
Google's Open Source program can be found at http://code.google.com
Personal Weblog: http://dibona.com


Re: [whatwg] Google's use of FFmpeg in Chromium and Chrome Was: Re: MPEG-1 subset proposal for HTML5 video codec

2009-06-03 Thread Anne van Kesteren
On Wed, 03 Jun 2009 09:34:08 +0200, Chris DiBona cdib...@gmail.com wrote:
 Yeah, this is really pretty difficult stuff. The LGPL is probably one of
 the least understood and most complicated free software licenses.

Thanks for taking the time to explain it!


-- 
Anne van Kesteren
http://annevankesteren.nl/


Re: [whatwg] Google's use of FFmpeg in Chromium and Chrome Was: Re: MPEG-1 subset proposal for HTML5 video codec

2009-06-03 Thread Chris DiBona
I mostly wanted to explain our position on the use of the library and
the LGPLs. Danny keeps it all straight for us.

Happy hacking, everyone!

Chris


-- 
Open Source Programs Manager, Google Inc.
Google's Open Source program can be found at http://code.google.com
Personal Weblog: http://dibona.com


Re: [whatwg] First or last Content-Type header?

2009-06-03 Thread Adam Barth
On Wed, Jun 3, 2009 at 12:36 AM, Philip Taylor excors+wha...@gmail.com wrote:
 http://blogs.msdn.com/ie/archive/2008/09/02/ie8-security-part-vi-beta-2-update.aspx
 - it's X-Content-Type-Options: nosniff now (and is used a bit in
 practice - it's on about 0.1% of pages from
 http://www.dotnetdotcom.org/, though about half of them are owned by
 Google or Microsoft).

The ironic twist to this story is that HTTP responses that include the
nosniff directive are 50% more likely to have a missing or incorrect
Content-Type header.

Adam


Re: [whatwg] Fwd: Remarks on HTML5 (ASCII / Unicode)

2009-06-03 Thread Kristof Zelechovski
The definition of uppercasing in HTML does not apply to element names
because getting them is covered by the DOM specification and not by the HTML
specification.  This is all right with me; I only think that saying to
uppercase ASCII explicitly is not necessary.
Chris



Re: [whatwg] [html5] Pre-Last Call Comments

2009-06-03 Thread Ian Hickson
On Sun, 5 Apr 2009, Giovanni Campagna wrote:

 A few comments, as requested by Ian Hickson.
 
 - End of 2.2.1, a typo: JavsScript instead of Javascript

Fixed.


 - From section 2.4.2 I don't understand if boolean attributes with
 invalid values represent true or false. In addition, I don't
 understand if an empty value is false (as in XHTML1.0) or true (as in
 HTML4, because of the minimized syntax).
 From my experience, I expect that the empty string (which is
 equivalent to not specify the attribute at all) is false, and any
 other value is true.

The spec says The presence of a boolean attribute on an element 
represents the true value, and the absence of the attribute represents the 
false value; is that not clear?
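
As a rough illustration of that rule, here is a small Python sketch using the standard html.parser module. The class and function names are made up for this example. It models only the "presence means true" rule, not conformance checking of which values are allowed.

```python
from html.parser import HTMLParser


class BooleanAttrProbe(HTMLParser):
    """Records, per start tag, whether a given attribute is present."""

    def __init__(self, attr_name: str) -> None:
        super().__init__()
        self.attr_name = attr_name
        self.results = []  # one bool per start tag encountered

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for <input disabled>.
        names = [name for name, _value in attrs]
        # Presence alone decides; the value is deliberately ignored.
        self.results.append(self.attr_name in names)


def boolean_attr_states(markup: str, attr_name: str) -> list:
    """Return one True/False per start tag in markup, in document order."""
    probe = BooleanAttrProbe(attr_name)
    probe.feed(markup)
    probe.close()
    return probe.results
```

Under this reading, an empty value and even disabled="false" both count as true, because only the presence of the attribute is examined.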


 - In 2.4.3 I don't see the point of the whole digression about 
 contentEditable, since it is noted that it doesn't work like that. I 
 would reduce the note to just "Note: The empty string can be one of the 
 keywords" or "Note: The empty string can be a valid keyword".

Done.


 - In 2.4.4.3 (and maybe in other places) I would prefer [A|E]BNF
 instead of the prose description of a floating point number.

It's not obvious to me that this would be any clearer.


 I'm also not sure that the normative algorithm is needed.

You mean for parsing? How else would you know how to parse it? In some of 
the cases the algorithms don't accept any erroneous content at all, but 
in many cases we have to define how you handle bogus data, and I don't see 
how to do that any other way.


 I've also searched IEEE, IETF, ECMA, ISO and ANSI for another normative 
 version of the syntax and processing, but I've found none. If you think 
 that it is important to have it specified completely, you may submit an 
 ID, so future technologies won't need to rewrite it again.

I'm not sure to what you refer. I certainly wouldn't want anyone reusing 
most of these definitions; many are the result of years of bugs causing 
legacy content to depend on weird quirks.


 - The second paragraph in 2.4.5.6 is hard to understand because the
 verb is at the end. I would rewrite as A week-year with a number *yr*
 has 53 weeks if it corresponds to a year *yr* in the proleptic Gregorian
 calendar that has a Thursday as its first day (January 1st), or if
 *yr* where *yr* is a number divisible by 400, or a number divisible by
 4 but not by 100. In all other cases it has 52 weeks

Done.


 Also, don't rely on styles alone, use different words for identifiers
 and prose. This includes also the Note following, where no styles are
 applied and it is difficult to understand that year year is not a
 typo but rather is the year numbered year.

I made the note use y, but in general I find using anything but year 
here gets really ugly.


 - Can't CSS3 Color simply be referenced in 2.4.6?
 This way, implementors could have body[bgcolor] { background-color:
 attr(bgcolor,color,white); } in the default CSS instead of using HTML5
 specific rules.

The rules for parsing a legacy color value are very constrained and don't 
match CSS, no.


 - In 2.4.9 a valid hash reference must be equal to an ID, name is 
 supported only for backward compatibility.

No, map uses name=.


 - Section 2.6 is superfluous: handling of application cache is specified 
 in the appropriate section, handling of HTTP requests and caches is 
 defined in RFC2616, handling of cookie is defined in the appropriate RFC 
 (I don't remember the number), handling of about:blank is in the 
 proposed about-uri-scheme ID. In addition, serialized queue-based 
 handling of resources should not be mandated by the HTML5 specification 
 (can't UAs be multi-threaded?)

Section 2.6 (fetching) is needed to define how the fetching algorithms 
(HTTP, etc.) fit into the event loop mechanism and the storage mutex.


 - Rewriting 2.6.1 without the HTTP word is definitely better. Browsers 
 are not required to support HTTP, AFAIK. You can write a GET method 
 (because GET is anyway an English word), a response code (most 
 protocols have response codes) and metadata (instead of headers, that 
 SMTP, POP, FTP don't support)

I think that would be far less clear.


 - 2.6.2 should be implied by the HTTP-over-TLS RFC

Apparently implying it isn't good enough, given current implementations.


 - In section 2.7.1, in sentence Extensions must not be used for 
 determining resource types for resources fetched over HTTP., do you 
 mean File extensions, like .txt or .png, or User agent extensions 
 (additions to the algorithm)?

This is fixed in Adam's draft now.


 - Still in section 2.7.1, why is the algorithm a violation of RFC2616? 
 Because it is case insensitive? Because it allows spaces? Because it 
 does not imply ISO-8859-1 if no charset is explicit? Because it does not 
 imply ASCII for text/* mime types?

Because it means not blindly honouring Content-Type.


 - Why don't you add <?xml to the sniffing table?

I'll leave this up to Adam.


 - In section 2.8, x-x-big5 is not a different encoding than big5,
 it 

Re: [whatwg] [html5] Pre-Last Call Comments

2009-06-03 Thread Ian Hickson
On Sun, 5 Apr 2009, Kristof Zelechovski wrote:
 
 Now that classid is gone, what will be the workaround for ActiveX 
 objects where they are needed?

ActiveX controls are a vendor-specific technology, and thus not 
appropriate for explicit support in HTML5 (just like we dropped applet).

The real workaround is to use HTML, CSS, JS, DOM, or, if you really need a 
plugin, to use a type that triggers the right plugin.


   2. Use a custom DTD with classid for validation?

That doesn't make the document valid.


   3. Use a custom type application/vnd.acme-fancy-control+oleobject
 for every control?

Yes.


 Of course, such things are inherently nonportable but they are widely 
 used. It would be nice to have a way to validate them.

You can. If you use it the validator will tell you you're doing something 
non-portable. This is intentional!



Re: [whatwg] [html5] Pre-Last Call Comments

2009-06-03 Thread Ian Hickson
On Sun, 5 Apr 2009, João Eiras wrote:

 The spec does not forbid the use of unsupported attributes and elements. 

Actually, it does. I've just made the spec even clearer about this, 
though.


Re: [whatwg] [html5] Pre-Last Call Comments

2009-06-03 Thread Ian Hickson
On Sun, 5 Apr 2009, Christoph Päper wrote:
 Giovanni Campagna:
  - The second paragraph in 2.4.5.6 is hard to understand because the
  verb is at the end. I would rewrite as
  A week-year with a number *yr* has 53 weeks if it corresponds to a year *yr*
  in the proleptic Gregorian calendar that has a Thursday as its first day
  (January 1st), or if *yr* where *yr* is a number divisible by 400, or a
  number divisible by 4 but not by 100. In all other cases it has 52 weeks
 
 | A week-year with a number $year that corresponds to a year $year in the
 | proleptic Gregorian calendar that has a Thursday as its first day
 | (January 1st), and a week-year $year where $year is a number divisible
 | by 400, or a number divisible by 4 but not by 100, has 53 weeks. All
 | other week-years have 52 weeks.
 
 The description is wrong anyhow: Not every leap year has 53 weeks! (For 
 instance, 2008 and 2012 have 52 weeks only.) The difference to common 
 years is that leap years with 53 weeks can have Jan01 on either Thu or 
 Wed, because Dec31 then is Fri or Thu respectively. (Compare your 2020 
 to your 2004 calendar.)

Fixed, thanks.
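
The corrected rule can be sketched in a few lines of Python. The function name is hypothetical, and the sketch assumes Python's datetime module, which uses the proleptic Gregorian calendar. A week-year has 53 weeks when January 1st is a Thursday, or when the year is a leap year and January 1st is a Wednesday.

```python
import calendar
import datetime

# datetime.date.weekday() numbers the days from Monday = 0.
WEDNESDAY = 2
THURSDAY = 3


def weeks_in_week_year(year: int) -> int:
    """Number of ISO weeks (52 or 53) in the given week-year."""
    jan1 = datetime.date(year, 1, 1).weekday()
    if jan1 == THURSDAY:
        return 53
    if calendar.isleap(year) and jan1 == WEDNESDAY:
        return 53
    return 52


if __name__ == "__main__":
    for y in (2004, 2008, 2012, 2020):
        print(y, weeks_in_week_year(y))
```

The result can be cross-checked against the ISO week number of 28 December, which always falls in the last week of its week-year.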


 Or just reference and rely on ISO 8601. That is what references 
 (especially to standards) are for after all.

I would, except that ISO8601 costs 130 CHF and it seems far easier for 
everyone concerned to just add the paragraph right there instead of 
deferring to some paragraph elsewhere.


Re: [whatwg] [html5] Pre-Last Call Comments

2009-06-03 Thread Kristof Zelechovski
The validator generates an error for the classid attribute (in line with
what the specification says, I think).  An error, unlike a warning, breaks
any complex process that depends on successful validation of the components.

I think the specification text should be rephrased so that the validator can
issue a warning instead.
For the time being, the only practical workaround for this incompatibility
is to use Internet Explorer magic comments.
IMHO,
Chris



Re: [whatwg] on bibtex-in-html5

2009-06-03 Thread Bruce D'Arcus
On Tue, Jun 2, 2009 at 12:05 PM, James Graham jgra...@opera.com wrote:
 Bruce D'Arcus wrote:

 So exactly what is the process by which this gets resolved? Is there one?

 Hixie will respond to substantive emails sent to this list at some point.
 However there are some hundreds of outstanding emails (see [1]) so the
 responses can take a while. If you have a pressing deadline that would
 benefit from your issue being addressed sooner, I suggest you talk to Hixie
 about it.

No problem; I just wanted to know how things worked here. Thanks.

 FWIW I have a few general thoughts about the bibtex section which may or may
 not be interesting:

 1) It seems like this and similar sections (bibtex, vCard, iCalendar) could
 be productively split out of the main spec into separate normative
 documents, since they are rather self-contained and have rather obvious
 interest for communities who are unlikely to find them at present or to be
 interested in the rest of the spec.

+1 to splitting them off.

I think there's still an open question, however, about whether any of
these—and particularly the bibliographic one (at least as it's
currently specified)—should be normative. I don't believe they should
be.

But, moving on ...

 Although the drag and drop stuff being
 dependent on them does mean that you'd need some circular references.

 2) For the bibliographic data the most important issues that I see are ease
 of use and ease of export. Although I am not attached to the bibtex format
 per-se I would be extremely disappointed if a different, harder to author,
 format were used. Formats that are flexible but rarely used are less useful
 overall than more limited formats with ubiquitous deployment. In addition
 formats that are hard to use make it more likely that people will make
 accidental mistakes, so decreasing the reliability of the data and devaluing
 tools that consume the data.

 Although I don't think we have to use bibtex as the basis for the format, I
 do think a canonical mapping to bibtex is a requirement. Obviously this
 reflects my background in the physical sciences but, at least in that field
 LaTeX and, by association, bibtex are overwhelmingly popular. I am well
 aware that the situation in other fields is different but without clean,
 high fidelity, bibtex export (at least to the extent required to support
 common citation patterns within the physical sciences) the format will lose
 out on a large audience with a higher than average number of potential early
 adopters.

Fair enough; all I'm saying is the same deference should be paid to
other research fields. The sciences for too long have dominated these
discussions, to the detriment of other fields. So I would hope we
could avoid that here.

Let's move on to a use case or two to illustrate the issues here.

Zotero is likely to be an early adopter of microdata as well,
certainly as a consumer of these data, and perhaps also as a producer.

http://www.zotero.org/

Zotero is a Firefox extension that can import and export BibTeX, among
a variety of other formats (RIS, MODS, and the new BIBO/DC RDF work,
which is its primary format). It includes a number of components that
allow citation and document metadata to be extracted, and later
republished.

So, for example, a user is browsing the web, and they are reading this
article from the NY Times.

http://www.nytimes.com/2009/06/03/world/asia/03military.html

Zotero has a translator (basically, a dedicated screen-scraper) for
the NY Times, and so the user can simply click an icon in their
toolbar to extract the metadata into their database.  They can then
later cite it in their own documents, and Zotero will be responsible
for correctly formatting those citations and bibliographic entries.

So I have questions on this use case:

1) how do these data about the article get encoded in microdata in
such a way that Zotero (or any other similar tool) doesn't have to
continue to write and maintain dedicated translators for every site?
E.g. how should the newspaper article metadata be encoded?

It seems the assumption that bibtex is only for bibliographies leaves
that out. Instead, the current draft of the spec tells us the title of
the document corresponds to dc:title, and not much else.

My argument is to beef up the ability to describe documents in
general.* In strawman pseudo-code:

title = doc.title
type = doc.type
source = doc.isPartOf.title # or if not dc:isPartOf, something similarly generic
issued = doc.issued
creators = doc.creator
print creators[0].name **

I.e. don't pretend that document metadata is different from
bibliographic metadata. The latter is simply a reference to the former
(usually; there are some exceptions where people cite events).

2) If Zotero consumes these data and then the user cites it in their
document, and elects to export to HTML5, how should that same
newspaper article data be encoded in the bibliography?

BibTeX isn't terribly helpful; example:


[whatwg] clarification on ApplicationCaches cache failure steps

2009-06-03 Thread Andrei Popescu
Hi,

I have a question about the Application Caches update process:

http://dev.w3.org/html5/spec/Overview.html#application-cache-update-process

In the event of a failure during the update process (e.g. some error
reported when attempting to save the downloaded resources to stable
storage), the spec says the cache failure steps should be run:

http://dev.w3.org/html5/spec/Overview.html#cache-failure-steps

However, I am not sure it is perfectly clear what should happen after
that. If all the resources have been successfully downloaded but could
not be saved to stable storage, the newest application cache is
actually functional (as all of its resources are available in RAM).
Step 4 of the cache failure steps says:

If cache group has an application cache whose completeness flag is
incomplete, then discard that application cache.

IMHO, the fact that the application cache is functional (can be used
to serve content) before its resources were persisted to stable
storage is actually an implementation detail. As far as the spec is
concerned, such a cache is still incomplete, so it should be
discarded, right? But what should happen in the case that, prior to
the update process, there was another newest cache? Should the UA
continue to use it? Or should it fall back to the network? I can see
reasons for doing either but I was wondering if this should be agreed
on and written in the spec. Or maybe it is in the spec and I missed
it?
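
To make the question concrete, here is a small Python model of step 4 of the cache failure steps as quoted above. All class and function names are hypothetical and this is not how any UA implements it. The model shows one possible reading, in which the older complete cache stays available after the incomplete one is discarded. Whether the UA should use it or go to the network is exactly the open question.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ApplicationCache:
    label: str
    complete: bool  # models the spec's "completeness flag"


@dataclass
class CacheGroup:
    # Caches in the order they were added; the last one is the most recent.
    caches: List[ApplicationCache] = field(default_factory=list)

    def newest_complete(self) -> Optional[ApplicationCache]:
        """Most recently added cache that is complete, if any."""
        for cache in reversed(self.caches):
            if cache.complete:
                return cache
        return None


def discard_incomplete(group: CacheGroup) -> None:
    """Step 4 as quoted: drop any cache still flagged incomplete."""
    group.caches = [c for c in group.caches if c.complete]
```

In this model, a group holding a complete "v1" and an incomplete "v2" ends up with only "v1" after the failure steps, and newest_complete() returns "v1" again.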

This question appeared while trying to fix the following bug in WebKit:

https://bugs.webkit.org/show_bug.cgi?id=25562

Many thanks,
Andrei


Re: [whatwg] [html5] Pre-Last Call Comments

2009-06-03 Thread Kristof Zelechovski
Regarding
http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#weeks:
A week begins on Sunday, not on Monday.
However, under the present assumption:
Better:
A week-year has 53 weeks if the first day of the year (January 1st)
in the proleptic Gregorian calendar is a Thursday, or if the year is a leap
year and the first day of the year (January 1st) in the proleptic Gregorian
calendar is a Wednesday. All other week-years have 52 weeks.
Better still:
A week-year has 53 weeks if February the 28th in the proleptic
Gregorian calendar is Saturday or February the 29th in the proleptic
Gregorian calendar is Saturday. All other week-years have 52 weeks.
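
The February formulation can be checked mechanically. The following Python sketch (function names invented for this example) compares it with the ISO week number of 28 December, which always falls in the last week of its week-year, assuming weeks start on Monday as in ISO 8601.

```python
import calendar
import datetime

SATURDAY = 5  # datetime.date.weekday() numbers the days from Monday = 0


def has_53_weeks_feb_rule(year: int) -> bool:
    """53 weeks if 28 February, or 29 February when it exists, is a Saturday."""
    if datetime.date(year, 2, 28).weekday() == SATURDAY:
        return True
    if calendar.isleap(year):
        return datetime.date(year, 2, 29).weekday() == SATURDAY
    return False


def has_53_weeks_reference(year: int) -> bool:
    """Reference answer taken from the ISO week number of 28 December."""
    return datetime.date(year, 12, 28).isocalendar()[1] == 53


if __name__ == "__main__":
    mismatches = [y for y in range(1600, 2401)
                  if has_53_weeks_feb_rule(y) != has_53_weeks_reference(y)]
    print("mismatching years:", mismatches)
```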
Also note, x-x-big5 cannot be registered with IANA because it is already
registered for private use.
HTH,
Chris
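
The two formulations proposed above can be checked against each other with a
short script. This is only a sketch, not spec text: the function names are
made up for illustration, it assumes ISO 8601 (Monday-based) weeks, and it
relies on JavaScript's Date, which uses the proleptic Gregorian calendar.
Years below 100 are avoided because Date.UTC treats them as 19xx.

```javascript
// Day-of-week numbers as returned by Date.prototype.getUTCDay:
// 0 = Sunday ... 6 = Saturday.
const WEDNESDAY = 3;
const THURSDAY = 4;
const SATURDAY = 6;

function isLeapYear(year) {
  return (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
}

// First formulation: January 1st is a Thursday, or the year is a leap
// year and January 1st is a Wednesday.
function hasWeek53ByJanuary(year) {
  const jan1 = new Date(Date.UTC(year, 0, 1)).getUTCDay();
  return jan1 === THURSDAY || (isLeapYear(year) && jan1 === WEDNESDAY);
}

// Second formulation: February 28th or February 29th is a Saturday.
function hasWeek53ByFebruary(year) {
  const feb28 = new Date(Date.UTC(year, 1, 28)).getUTCDay();
  // February 29th is the following day and exists only in leap years.
  const feb29IsSaturday = isLeapYear(year) && (feb28 + 1) % 7 === SATURDAY;
  return feb28 === SATURDAY || feb29IsSaturday;
}
```

For instance, hasWeek53ByJanuary(2009) is true because 1 January 2009 was a
Thursday, while hasWeek53ByJanuary(2008) is false.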




Re: [whatwg] [html5] Pre-Last Call Comments

2009-06-03 Thread James Graham

Kristof Zelechovski wrote:

Regarding
http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#weeks:
A week begins on Sunday, not on Monday.


Not according to ISO [1]

[1] http://en.wikipedia.org/wiki/ISO_week_date



Re: [whatwg] [html5] Pre-Last Call Comments

2009-06-03 Thread Garrett Smith
On Sun, Apr 5, 2009 at 9:04 AM, Giovanni Campagna
scampa.giova...@gmail.com wrote:
 A few comments, as requested by Ian Hickson.

[...]

 - In section 3.3.3.7, instead of defining the syntax of style
 attributes, reference http://www.w3.org/TR/css-style-attr


The style property has been part of the HTMLElement interface for
many years. Reading ahead, I see this has been addressed.

It would be useful to have a few others:

  computedStyle
  cascadedStyle
  getStyleValueAs(prop, unit);

The computedStyle is already possible, just clunky. A cascadedStyle
doesn't exist. getStyleValueAs exists as a very clunky API that
doesn't work in browsers.

Garrett


Re: [whatwg] [html5] Pre-Last Call Comments

2009-06-03 Thread Křištof Želechovski
It is possible to create no-script fallback without a NOSCRIPT element.  You
can put it into DIV[@class=noscript] and remove the DIV at run time.
It is worth noting that XHTML 1.0, along with deprecating MAP/@name, still
has the unrealistic assumption about usemap containing an arbitrary URI.  I
would not put much weight on that.
Cheers,
Chris




Re: [whatwg] several messages

2009-06-03 Thread Ian Hickson
On Tue, 7 Apr 2009, Jeff Creamer wrote:

 Hi.  Since March of '06, Opera 9 has supported a custom extension to the 
 canvas context called opera-2dgame. Importantly, their extension adds 
 these methods:
 
 getPixel(x, y)
 Returns the pixel value (colour, opacity) at (x, y). Returned in the 
 form #rrggbb if fully opaque and rgba(r, g, b, a) if it has some alpha 
 transparency.
 
 setPixel (x, y, color)
 Allows you to set the colour of the pixel at (x, y). The third argument 
 should be a CSS color - you could provide a string such as `red', a HTML 
 colour code or even a rgba() value.
 
 I don't see any recent discussion of this.  And I am also aware that the 
 canvas drawing model is not pixel-oriented.  Nonetheless, mightn't these 
 functions be extremely useful?  As the Opera folk point out, they bless 
 game developers, and it occurs to me that they could be used for other 
 neat, useful things, such as giving JavaScript a rudimentary RAM drive.  
 Description including several demo programs and a discussion of security 
 issues is available here.
 
 Why not make getPixel () and setPixel() a standard?

On Tue, 7 Apr 2009, Oliver Hunt wrote:

 The ImageData APIs already provide the ability to do this and are 
 already supported by Firefox, Opera and Safari.

Given the ImageData APIs, and given that they are generally more efficient 
at the typical use cases for getPixel/setPixel, I haven't added getPixel/ 
setPixel to the spec.
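
As an illustration of how per-pixel access can be layered on the ImageData
APIs, here is a minimal sketch. The helper names getPixel, setPixel and
pixelOffset are hypothetical wrappers written for this example (they are not
the Opera methods and take different arguments); the only assumption about
the image object is that it has width, height and a data array holding four
bytes per pixel in R, G, B, A order.

```javascript
// Index of the first (red) byte of pixel (x, y) in imageData.data.
function pixelOffset(imageData, x, y) {
  if (x < 0 || y < 0 || x >= imageData.width || y >= imageData.height) {
    throw new RangeError("pixel (" + x + ", " + y + ") is outside the image");
  }
  return (y * imageData.width + x) * 4;
}

// Read one pixel as an object with r, g, b, a components (0-255 each).
function getPixel(imageData, x, y) {
  const i = pixelOffset(imageData, x, y);
  const d = imageData.data;
  return { r: d[i], g: d[i + 1], b: d[i + 2], a: d[i + 3] };
}

// Overwrite one pixel in place.
function setPixel(imageData, x, y, color) {
  const i = pixelOffset(imageData, x, y);
  const d = imageData.data;
  d[i] = color.r;
  d[i + 1] = color.g;
  d[i + 2] = color.b;
  d[i + 3] = color.a;
}

// In a browser (not executed here), with ctx a 2D canvas context:
//   const img = ctx.getImageData(0, 0, canvas.width, canvas.height);
//   setPixel(img, 10, 20, { r: 255, g: 0, b: 0, a: 255 });
//   ctx.putImageData(img, 0, 0);
```

Reading the whole image once and writing it back once is what makes this
approach cheaper than a call per pixel when many pixels are touched.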

Cheers,
-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Canvas - toTempURL - A dangerous proposal - Summary

2009-06-03 Thread Ian Hickson
On Wed, 8 Apr 2009, Charles Pritchard wrote:
 
 Legacy clients may have terrible support for extensibility. With some 
 HTML consumers, base 64 encoded images are not usable in the global 
 scope. To get around this, we proposed using toTempURL(), to save an 
 image to the local temporary files directory, and return a reference 
 which the legacy client could support.
 [...]
 
 I suggested to Boris, that perhaps I could tie into a custom protocol 
 handler, to hide the location of the file on the user hard drive. 
 Obviously, this was not a well thought-out response. Boris replied:
 
 I guess I'm not clear on one thing: you can add support for 
 customHandler:// to this platform but not support for data: ? 
 
 At this point, I conceded that perhaps trying to support data: was a 
 better goal than trying to advocate toTempURL. We're trying now to 
 implement data:image/png support for Internet Explorer 6.0+.

That would be cool.


 A new problem: Short data URLs.
 
 There is likely a cost, though it could be addressed in implementation, 
 in passing around toDataURL strings. Compressing a bitmap to a png, 
 base64 encoding it, copying the string, twice, and decoding it, can be 
 expensive, and for some implementations and use-cases, completely 
 unnecessary.
 
 If we could reference a short string, of a hundred or so bytes, instead 
 of a very large base64 string, it may help with memory management (and 
 related efficiencies).
 
 Ian Hickson writes:
 
 On the long term I expect once we have a File/Blob API, we'll use that 
 to expose the canvas data as a file.
 
 My response, while we await such an API, is to perhaps introduce a new 
 mime output for toDataURL, one which for now will be implementation 
 dependent, but may hopefully grow to see more use.

It seems that inventing one API instead of another better one is a weird 
way of going around things. If we have the time to invent this API, why 
not just invent the File API?

In fact, Arun is doing the File API this week.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] optgroup element used for selection

2009-06-03 Thread Ian Hickson
On Thu, 9 Apr 2009, Randy Drielinger wrote:

 Currently (HTML4.01) and in the near-future HTML5 spec optgroup elements 
 are used to group options (hence the name :-) and amongst others allow 
 us to disable a group of option elements at once.
 
 Personally, I think it would make sense to add the ability to make an 
 optgroup selectable. The result *could* be to select all the children of 
 that optgroup element (or simply post the child values as an array or 
 whatever).

This has been suggested previously also. I have noted it as a possible 
extension for a future version. I'd like to avoid adding new features at 
this point so that we can get HTML5 out of the door. :-)

Cheers,
-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] localStorage behavior when cookies mode is session-only

2009-06-03 Thread Ian Hickson
On Thu, 9 Apr 2009, Honza Bambas wrote:

 In the W3C spec for localStorage 
 http://dev.w3.org/html5/webstorage/#the-localstorage-attribute is said 
 to present it (the persistent storage) the same way as cookies.
 
 There was a suggestion to throw a DOM_QUOTA_ERROR exception when storing to 
 localStorage when cookies are in a session-only mode for a 
 host/page. It makes sense for several reasons. This way web apps may 
 decide or inform user about situation that the page cannot store to 
 localStorage while there is no way for the page to figure out that 
 cookies mode is session-only and web app still may freely read from the 
 storage.
 
 My suggestion is then:
 - allow a page to obtain valid localStorage object
 - allow read from it
 - throw DOM_QUOTA_ERROR when storing to it
 
 But I don't know what to do in case of call to clear() and removeItem() 
 methods. It would expose the cookie behavior again when it 
 fails/throws.

Wouldn't it be best for the UA to store the data as normal, and report to 
the user in some informative messages somewhere in the interface that data 
has been stored but will be discarded? (The UA could even offer to save 
it, in fact.)

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] localStorage behavior when cookies mode is session-only

2009-06-03 Thread Jeremy Orlow
On Wed, Jun 3, 2009 at 1:15 PM, Ian Hickson i...@hixie.ch wrote:

 On Thu, 9 Apr 2009, Honza Bambas wrote:
 
  In the W3C spec for localStorage
  http://dev.w3.org/html5/webstorage/#the-localstorage-attribute is said
  to present it (the persistent storage) the same way as cookies.
 
  There was a suggestion to throw a DOM_QUOTA_ERROR exception when storing to
  localStorage when cookies are in a session-only mode for a
  host/page. It makes sense for several reasons. This way web apps may
  decide or inform user about situation that the page cannot store to
  localStorage while there is no way for the page to figure out that
  cookies mode is session-only and web app still may freely read from the
  storage.
 
  My suggestion is then:
  - allow a page to obtain valid localStorage object
  - allow read from it
  - throw DOM_QUOTA_ERROR when storing to it
 
  But I don't know what to do in case of call to clear() and removeItem()
  methods. It would expose the cookie behavior again when it
  fails/throws.

 Wouldn't it be best for the UA to store the data as normal, and report to
 the user in some informative messages somewhere in the interface that data
 has been stored but will be discarded? (The UA could even offer to save
 it, in fact.)


This seems sane, but it definitely doesn't fit into Apple's private
browsing model, as far as I understand it.


[whatwg] Something better than DOM_QUOTA_ERROR when LocalStorage is immutable?

2009-06-03 Thread Jeremy Orlow
*Please, keep this on topic.  There's no point in rehashing
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-April/019238.html or
any of the other similar debates on private browsing and localStorage's
persistence guarantees.*

When in private browsing mode, WebKit should not write any data to the
hard drive.  In addition, WebKit
does not allow changes to localStorage that aren't going to be written to
disk.  Currently, it returns a DOM_QUOTA_ERROR on setItem when private
browsing is enabled, and silently fails for removeItem and clear.  The
silent failures are obviously bad, but even the (ab)use of DOM_QUOTA_ERROR
kind of bothers me.

Is there an existing exception that'd work better to tell the script the
change failed because localStorage is currently immutable?  If not, is
there any chance it could be added to the spec?

Obviously only browser vendors that share WebKit's philosophy on
localStorage's guarantee of persistence would actually use this, but I
think it'd be far better than the current behavior.
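
Until something better exists, a script can defend itself against both the
thrown and the silent failure modes described above by reading the value
back. The following is a rough sketch only: trySetItem and tryRemoveItem are
hypothetical helper names, and storage stands for any object offering the
setItem, getItem and removeItem methods of the Storage interface.

```javascript
// Returns true only if the value is actually present afterwards.
function trySetItem(storage, key, value) {
  try {
    storage.setItem(key, value);
  } catch (e) {
    // For example, the quota-style exception raised in private browsing.
    return false;
  }
  return storage.getItem(key) === String(value);
}

// Returns true only if the key is actually gone afterwards, which also
// catches the case where removeItem fails without throwing.
function tryRemoveItem(storage, key) {
  try {
    storage.removeItem(key);
  } catch (e) {
    return false;
  }
  return storage.getItem(key) === null;
}
```

A page could call trySetItem(localStorage, "draft", text) and warn the user
when it returns false, without needing to know why the store is read-only.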

Thanks,
Jeremy


Re: [whatwg] document.contentType

2009-06-03 Thread João Eiras



In HTML5, HTML elements in text/html are put in the XHTML namespace and 
text/html might contain SVG or MathML elements, so you probably want to 
conditionally call getElementsByTagNameNS based on e.g. the root element's 
namespaceURI rather than the document's HTMLness.



I think the major advantage of document.contentType is to know the value of the 
Content-Type header (without charset) sent by the server.
This would be good for 3rd party libraries or client side scripts.
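
The conditional lookup suggested in the first paragraph above might look like
the following sketch. The function name elementsByLocalName is made up for
this example, and doc is assumed to be any Document-like object with a
documentElement and the two standard getElementsByTagName methods.

```javascript
// Look elements up in whatever namespace the root element lives in,
// rather than branching on whether the document is an HTML document.
function elementsByLocalName(doc, localName) {
  const ns = doc.documentElement.namespaceURI;
  if (ns === null || ns === undefined) {
    // Root element is in no namespace: the plain lookup is sufficient.
    return doc.getElementsByTagName(localName);
  }
  return doc.getElementsByTagNameNS(ns, localName);
}
```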


Re: [whatwg] Superset encodings [Re: ISO-8859-* and the C1 control range]

2009-06-03 Thread Ian Hickson

I haven't made any changes to the spec based on the feedback below. Let me 
know if there's anything I missed. I'm not aware of any specific problems 
at this time.

On Sat, 11 Apr 2009, Øistein E. Andersen wrote:

 On 22 May 2008, at 12:40, Ian Hickson wrote:
 
  Do you have input on the EUC-JP issue?
 
 I am now about to finish my analysis of CJK encodings (e-mail forthcoming),
 including EUC-JP.  This encoding does not seem to be particularly problematic,
 however.  Are you referring to a specific problem?
 
  On Thu, 13 Mar 2008, Øistein E. Andersen wrote:
   Note: Similarly, IE apparently handles CS-ISO-2022-JP as distinct from
ISO-2022-JP. This is something to keep in mind when looking at
multi-byte encodings.
  
  What should we say about this?
 
 The issue seems to be that IE's implementation of ISO-2022-JP is a large
 superset of what is actually specified.  (This is the case for several other
 CJK encodings as well.)  See forthcoming e-mail for an actual description of
 the extensions.
 
   (TC)VN5712-2 < (TC)VN5712-1
   
   Opera[?] and Firefox seem to have implemented the superset only.
  
  Should we require this mapping?
 
 For reference:
 (TC)VN5712-3 < (TC)VN5712-2 = VSCII-2 = ISO IR 180 < (TC)VN5712-1
 
 Only the complete set seems to be implemented (and only in Firefox), and MIME
 charset strings referring to one of the subsets do not seem to work at all, so
 no mappings are necessary.
 
 

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: [whatwg] Superset encodings [Re: ISO-8859-* and the C1 control range]

2009-06-03 Thread Ian Hickson
On Sun, 12 Apr 2009, Øistein E. Andersen wrote:
 On 2 Sep 2008, at 06:06, Ian Hickson wrote:
 
  On Wed, 30 Jul 2008, Øistein E. Andersen wrote:
   
   1. Opera, Firefox and Safari all handle US-ASCII as Windows-1252.
  IE7, on the other hand, simply ignores the high bit (as it does for
  a few other 7-bit encodings, by the way).  Perhaps this
  alias could be dropped from the other browsers.
  
  Ignoring the high bit seems like a dangerous security bug; dropping any
  character with a high bit as U+FFFD seems unnecessarily drastic.
 
 According to a test I did using browsershots.org, IE8 actually seems to do
 this (8-bit characters are rendered as squares), which looks like an argument
 in favour of the more `drastic' option.
 
  I've made the spec go with the O/F/S behaviour here.
 
 This has the advantage of not adding ASCII as a separate encoding, and
 Windows-1252 is presumably one of the encodings most often mislabelled as
 ASCII.  However, IE has ignored the high bit at least since 5.01 (IE4 via
 browsershots.org treats it as CP1252, but this could well be
 locale-dependent), so there may not be that many mislabelled pages.  Has
 anyone got a list of pages which are labelled as ASCII and contain 8-bit
 characters?
 
 This is probably not very important.  U+FFFD is `purer', Windows-1252 has the
 potential of rescuing a few pages.  It is however essential that 8-bit
 characters be considered not conforming since they do not in fact work (as
 Windows-1252 bytes) in IE5-IE8.  This is currently the case, but I think Henri
 Sivonen has argued that `misinterpretation for compatibility' should not be
 considered a conformance error (which would probably be fairly harmless for
 other mappings).

I (and the spec) agree with you here, that these should be reported as 
errors.
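
The security concern about ignoring the high bit, raised in the quoted text
above, can be made concrete with a small sketch. Both decoder functions are
hypothetical illustrations written for this example, not browser code; they
take an array of byte values and return a string.

```javascript
// Behaviour attributed to IE7 above: drop the high bit of every byte.
function decodeIgnoringHighBit(bytes) {
  return Array.from(bytes, (b) => String.fromCharCode(b & 0x7f)).join("");
}

// Stricter alternative: every byte above 0x7F becomes U+FFFD.
function decodeWithReplacement(bytes) {
  return Array.from(bytes, (b) =>
    b < 0x80 ? String.fromCharCode(b) : "\uFFFD"
  ).join("");
}

// 0xBC and 0xBE are not ASCII, but with the high bit cleared they turn
// into 0x3C ("<") and 0x3E (">"), so markup can appear out of bytes that
// a filter looking for ASCII "<" would never have seen.
const sample = [0xbc, 0x62, 0xbe];
const masked = decodeIgnoringHighBit(sample);   // "<b>"
const replaced = decodeWithReplacement(sample); // "\uFFFDb\uFFFD"
```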

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: [whatwg] Superset encodings [Re: ISO-8859-* and the C1 control range]

2009-06-03 Thread Ian Hickson
On Tue, 14 Apr 2009, Øistein E. Andersen wrote:

 This e-mail is an attempt to give a relatively concise yet reasonably complete
 overview of non-Unicode character sets and encodings for `Chinese characters',
 excluding those which are not supported by at least one of the four browsers
 IE, Safari, Firefox and Opera (henceforth `all browsers'), and tentatively
 avoiding technical details which are out of scope for HTML5 unless they are
 important to gain a general understanding of the relevant issues.
 
 To avoid unnecessary confusion, the following three concepts are kept
 distinct:
 
 1) Character set: A collection of characters, typically defined as a matrix
 with 94 rows and 94 columns.  (A character set with more than one matrix is
 said to have multiple planes.)  The ones officially registered `for use with
 escape sequences' (typically in ISO-2022 encodings, see below) can be found at
 http://www.itscj.ipsj.or.jp/ISO-IR/overview.htm.
 
 2) Encoding: Defines how a given character (typically defined by its row and
 column numbers) from a given character set can be encoded as a sequence of
 bytes.  All the encodings discussed below allow multiple character sets to be
 encoded.  [ISO-2022 encodings use only 7-bit bytes and employ escape sequences
 to switch between different character sets. EUC encodings use bytes < 128 for
 ASCII (or something similar) and bytes >= 128 to encode other character sets.]
 
 3) MIME charset string: This is the string used, e.g., in a HTTP Content-Type
 header to indicate the *encoding*.  Many of these can be found at
 http://www.iana.org/assignments/character-sets.
 
 Some information about browser support for specific character sets, encodings
 and MIME charset strings can be found at
 http://coq.no/character-tables/mime/iso-2022/en,
 http://coq.no/character-tables/mime/euc/en and
 http://coq.no/character-tables/mime/locale-specific/en.
 
 The notation a < b means that a is a proper subset of b; a and b can be either
 character sets or encodings.
 
 
 ******************************************
 * What should HTML 5 say about all this? *
 ******************************************
 
 This section gives a summary of superset encodings which are either
 universally supported or potentially needed for compatibility.
 
 (Anyone who is going to read the entire e-mail will probably prefer to read
 the sections *Chinese*, *Japanese* and *Korean* at this point and return to
 this section afterwards.)
 
 
 Superset encodings (stricto sensu)
 ----------------------------------
 
 HTML5 currently contains a table of encodings aliases, of which the following
 involve Chinese characters:
 
 1) EUC-KR          ->  Windows-949
 2) GB2312          ->  GBK
 3) GB_2312-80      ->  GBK
 4) KS_C_5601-1987  ->  Windows-949
 5) x-x-big5        ->  Big5
 
 EUC-KR < Windows-949, and all browsers do 1), so this is reasonable and
 probably needed.
 
 GB2312 and GB_2312-80 technically refer to the *character set* GB 2312-80,
 which can be expressed not only in EUC-CN encoding, but also in ISO-2022-CN
 encoding and HZ encoding.  GBK, on the other hand, is an encoding.  EUC-CN <
 GBK.  It would be more correct to remove 2) and 3) and instead add:
    EUC-CN  ->  GBK
 
 Admittedly, EUC-CN is sometimes called `8-bit GB encoding', and registered
 MIME charset strings include GB2312 and GB_2312-80 as distinct entries
 (but not EUC-CN), so a note to this effect might be appropriate.
 
 (Additionally, GBK is slightly ambiguous, so make sure not to reference an
 incomplete or outdated version without pointing out necessary
 amendments/additions.)
 
 Similarly, EUC-KR is sometimes referred to as `eight-bit KS' or
 `KS_C_5601-1987', which Ken Lunde characterises as `incorrect and dangerous'
 in his book /CJKV Information Processing/.  It would be more correct to remove
 4).
 
 Unlike EUC-CN, EUC-KR is a registered MIME charset string, but KS_C_5601-1987
 has a distinct entry, so a note might again be appropriate.
 
 As for 5), the MIME charset string x-x-big5 does indeed correspond to Big5
 encoding (or rather an extension thereof) in all browsers but Opera.  There is
 a large number of unregistered charset strings, however, and the other
 mappings in this table are between encodings.  Unless x-x-big5 is actually
 supposed to refer to an encoding distinct from Big5, 5) should be removed.
 
 Instead (depending on the reference ultimately given for Big5), it may be
 necessary to note that at least certain ETen extensions should be regarded as
 part of Big5.

I believe you misunderstand the purpose of this table. The idea is to give 
a mapping of _labels_ to encodings, not encodings to encodings. I've 
clarified the text to this effect.



 In addition, Shift_JIS < Windows-31J, and all browsers implement this mapping,
 so the following should be added:
    Shift_JIS  ->  Windows-31J

Added.
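
To illustrate the distinction between labels and encodings, the table can be
thought of as a lookup applied to the charset label before a decoder is
chosen. This is a sketch under assumptions, not the spec's algorithm: the
names LABEL_OVERRIDES and encodingForLabel are invented here, the entries are
only the ones discussed in this message, and matching labels
case-insensitively after trimming whitespace is an assumption.

```javascript
// Label -> encoding to actually use, keyed by lower-cased label.
const LABEL_OVERRIDES = new Map([
  ["euc-kr", "Windows-949"],
  ["gb2312", "GBK"],
  ["gb_2312-80", "GBK"],
  ["ks_c_5601-1987", "Windows-949"],
  ["x-x-big5", "Big5"],
  ["shift_jis", "Windows-31J"],
]);

// Returns the encoding to use for a label; labels without an override
// are returned unchanged (apart from trimming).
function encodingForLabel(label) {
  const trimmed = label.trim();
  const key = trimmed.toLowerCase();
  return LABEL_OVERRIDES.has(key) ? LABEL_OVERRIDES.get(key) : trimmed;
}
```

With this reading, encodingForLabel("Shift_JIS") yields "Windows-31J", while
a label such as "UTF-8" passes through untouched.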


I haven't added the mappings described below, since they are not all 
implemented uniformly. If specific mappings are 

Re: [whatwg] The keygen element

2009-06-03 Thread Ian Hickson
On Sun, 12 Apr 2009, Nelson B Bolyard wrote:
 Yngve Nysaeter Pettersen wrote:
 
  The default format, introduced by Netscape, is the SPKAC format, see 
  the above link, and includes the public key and the Keygen challenge 
  attribute, and is signed by the private key.
 
  The actual standardized format is PKCS #10, in form a more advanced 
  and flexible version of SPKAC (it is the format used to request 
  certificates for webservers), and I am not sure if this is now used 
  by default in some clients. In Opera this format can be selected by 
  using a type=pkcs10 attribute in the keygen tag.
 
 That's an interesting idea.  But PKCS#10 is like a self-signed 
 certificate. It has a full-blown X.500 Directory Name in it, just like a 
 certificate, and the KEYGEN tag doesn't provide the input for that.  I 
 guess the browser could prompt the user, perhaps using a form something 
 like:
 
 http://mxr.mozilla.org/security/source/security/nss/cmd/certcgi/main.html
 
 But heaven help the user to fill that in! :-/
 
 Also like a real certificate, a PKCS#10 certificate request may have 
 extensions.  This is the way that a cert requester requests that 
 particular extensions be put into his cert.  Again, the keygen tag has 
 no way of specifying these.  But the browser could use a form like:
 
 http://mxr.mozilla.org/security/source/security/nss/cmd/certcgi/stnd_ext_form.html
 
 There's one other problem with PKCS#10 (and SPKAC too) that I mentioned 
 before: it only works with public keys that can be used for signing. If 
 you have an encryption only key, you can't request a cert for it with 
 PKCS#10 because doing so requires generating a signature with it.
 
 To solve these and other problems, an alternative protocol named CRMF 
 (the Certificate Request Message Format) was created.  Mozilla 
 supports that with the crypto.generateCRMFRequest method.  If we're 
 really going to standardize something like a keygen tag, we should 
 design it to be able to do the things that can be done with 
 crypto.generateCRMFRequest, too. That should not be difficult.  See
 
 https://developer.mozilla.org/En/JavaScript_crypto/GenerateCRMFRequest

I agree that standardising this would be a good idea; I recommend 
approaching the public-webapps WG at the W3C to do this.


  I haven't added this, because right now the only browser I could find 
  which supports more than one algorithm is Firefox, and it just has two 
  (RSA and ECs, as far as I could tell).
 
 And DSA.

The DSA code doesn't appear to be hooked up.


  I haven't added ECs to HTML5 since I couldn't find any documentation 
  on it (the above isn't updated yet). Also, I omitted DSA support which 
  is claimed to be supported on the above page, because as far as I can 
  tell nobody actually supports it.
 
 It's not popular in the commercial world, but I think a certain 
 government still likes it. :)

I'm definitely not adding features to HTML5 for a single vendor, even if 
that vendor has an army.


 Which is more likely to be adopted as a cross browser standard? A new 
 html tag? or a new JavaScript object/method?

It would presumably depend on how it is to be used. If it's for form 
submission, then an element would make more sense. If it's for 
applications, then an API would be better.


On Mon, 13 Apr 2009, Anders Rundgren wrote:
 
 On-line provisioning of PKI is rather little used because the big users 
 of PKI (banks and governments), prefer using physical token distribution 
 like for PIV/CAC/eID.
 
 What those large users have not bothered much with to date is how they 
 are going to use PKI in the most popular IT-device there is, the mobile 
 phone. IMHO the availability of trusted HW at a very small premium 
 motivates a completely new key-generation scheme, presumably based on 
 TPM 1.2 or enhanced TPM-schemes.
 
 Regarding the keygen tag itself, I personally don't see that such 
 mechanisms need any explicit links to an HTML page, at least none of the 
 alternatives including generateCRMFRequest and CertEnroll do, they are 
 just APIs.
 
 To give you an indication that key-generation standards are not an 
 easy task, IETF's KEYPROV has been running for almost three years!
 
 My own contribution to this field, KeyGen2, requires not less than six 
 message rounds compared to keygen's three.  Take a peek at the [beta] 
 XML Schema at: http://keycenter.webpki.org/resources in case you are 
 interested

Thanks for the info!


On Fri, 17 Apr 2009, Anders Rundgren wrote:

 I understand what you are saying, but without a buy-in from Microsoft 
 there is little point in elevating keygen to some kind of standard 
 since it will fail in the majority of cases.

Even if IE's market share stops dropping (which it shows no signs of 
doing), I believe that getting interop amongst three browser vendors is an 
important enough goal that it is still worth standardising even if 
Microsoft never implement keygen.

Cheers,
-- 
Ian Hickson   U+1047E  

Re: [whatwg] Nested optgroups

2009-06-03 Thread Ian Hickson
On Mon, 13 Apr 2009, Markus Ernst wrote:

 I found a message in the list archives from July 2004, where Ian 
 announced to put nested optgroups back into the spec: 
 http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2004-July/001200.html
 
 Anyway in the current spec, the optgroup element is not allowed inside 
 another optgroup element: 
 http://www.whatwg.org/specs/web-apps/current-work/#the-optgroup-element
 
 Has this been removed again since 2004? I did not find more on this in 
 the list archives.

Yeah, this was removed because we couldn't find a good way to get browsers 
to support it without breaking backwards compatibility with legacy content 
(which relies on the non-nesting parser behaviour).

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] several messages

2009-06-03 Thread João Eiras



The ImageData APIs already provide the ability to do this and are
already supported by Firefox, Opera and Safari.


Given the ImageData APIs, and given that they are generally more efficient
at the typical use cases for getPixel/setPixel, I haven't added getPixel/
setPixel to the spec.

Cheers,


The opera-2dgame context also has APIs for collision detection and control 
painting
http://my.opera.com/WebApplications/blog/show.dml/200788
I'm not familiar with the current canvas spec, but are these features supported 
somehow? Or would they have the potential to be included ?


Re: [whatwg] native ordered dictionary data type in web storage draft

2009-06-03 Thread Ian Hickson
On Tue, 14 Apr 2009, Patrick Mueller wrote:

 The last paragraph in section 4.6 of the Web Storage draft (10 April 
 2009), mentions a native ordered dictionary data type.  The URL to the 
 section in the draft is here:
 
 http://dev.w3.org/html5/webstorage/#database-query-results
 
 This is the first time I've seen the requirement for such a beast.  You 
 can understand the desire for it, given the context, but still.  Does 
 anything else in JavaScript make use of such a data structure?

 It's not clear to me how you would even use it, without something like a 
 list comprehension, or some other functional construct.  It's hard to 
 imagine how someone might make use of the ordered-ness in a plain old 
 for/in loop, for instance.
 
 It would also be impossible, in the JavaScript in use today, AFAIK, to 
 emulate this with user-land JavaScript.

On Tue, 14 Apr 2009, Aryeh Gregor wrote:
 
 It says that JavaScript should just use Object.  Isn't that, 
 essentially, an ordered dictionary?

On Tue, 14 Apr 2009, James Graham wrote:
 
 Yes. Indeed there are compatibility requirements for the ordering of 
 ordinary user-created Object Objects in web browser implementations; the 
 order of enumeration must be the same as the order of insertion of the 
 properties.

On Tue, 14 Apr 2009, Patrick Mueller wrote:
 
 Interesting.  I guess this is a JavaScript in web browser 
 implementation difference from the JavaScript spec.  Following the 
 links in jresig's blog post
 
http://ejohn.org/blog/javascript-in-chrome/
 
 in the for loop order section.
 
 Still doesn't seem like it makes sense to go ahead and build 
 dependencies on this (unfortunate, IMO) behavior.

On Tue, 14 Apr 2009, Aryeh Gregor wrote:
 
 Isn't HTML5 all about mandating and building dependencies on unfortunate 
 but entrenched behavior?

On Tue, 14 Apr 2009, Patrick Mueller wrote:
 
 This seems slightly different because it's making a dependency on 
 (unspec'd) JavaScript behavior.  Though I'd guess there are other 
 examples.
 
 This one may also be significant in that, as we start to see JS usage in 
 other environments, like servers, folks may want to reuse something like 
 the sql access defined in here in those environments.  Who wants two 
 different ways to talk to sql?  It would be nice for this bit to be as 
 clean as it can be.

On Tue, 14 Apr 2009, Maciej Stachowiak wrote:
 
 FWIW I believe the next version of the ECMAScript spec will specify the 
 order of for..in enumeration.

On Tue, 14 Apr 2009, Patrick Mueller wrote:
 
 Checking some EcmaScript spec pages, like this one:
 
 http://wiki.ecmascript.org/doku.php?id=es3.1:es3.1_proposal_working_draft
 
 it appears that current versions of the spec have basically removed the 
 description that the properties are unordered, without specifying that 
 they're ordered, or how they're ordered.  But a step in the 'right' 
 direction, I suppose.

On Tue, 14 Apr 2009, Jonas Sicking wrote:
 
 As I understand it, the web already depends on this behavior. IIRC 
 EcmaScript 3.1 is going to mandate this behavior, so it'll be specced 
 behavior soon.

Based on the comments above, I have not changed anything in the spec.
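
For readers unfamiliar with the behaviour being relied on, here is a small
sketch of insertion-ordered enumeration on a plain Object. The function name
enumerationOrder and the sample property names are invented for this
example. Property names that look like array indices (such as "1" or "42")
are deliberately avoided, since engines are known to treat those
differently.

```javascript
// Collect property names in the order a for..in loop visits them.
function enumerationOrder(obj) {
  const keys = [];
  for (const key in obj) {
    keys.push(key);
  }
  return keys;
}

// Properties are added in a non-alphabetical order on purpose.
const row = {};
row.title = "Example";
row.author = "Someone";
row.year = "2009";

const order = enumerationOrder(row); // ["title", "author", "year"]
```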

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


[whatwg] Canvas extensions from opera-2dgame

2009-06-03 Thread Ian Hickson
On Thu, 4 Jun 2009, João Eiras wrote:
 
 The opera-2dgame context also has APIs for collision detection and control
 painting
 http://my.opera.com/WebApplications/blog/show.dml/200788
 I'm not familiar with the current canvas spec, but are these features
 supported somehow? Or would they have the potential to be included ?

The collision detection is basically the same as isPointInPath(), already 
in HTML5, and the locking is implicit in most canvas implementations (it 
only repaints in between script executions, and this is actually called 
out in HTML5 as the right thing to do), so I don't think there's anything 
to add from 2dgame at this point.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: [whatwg] A few comments on the keygen tag

2009-06-03 Thread Ian Hickson
On Wed, 15 Apr 2009, Anders Rundgren wrote:

 Now to the really problematic stuff:  keygen is not really an HTML 
 tag, it is actually 2 phases of a 3-phase key provisioning protocol.
 I don't see why a protocol should be plugged into a page GUI.  The 
 alternatives all use APIs or specific plugins that indeed may be spawned 
 from an HTML page but that's something completely different.

I agree, keygen seems like a poor design. That's one of the reasons I 
didn't extend it in HTML5; we're just defining what it does in browsers so 
that new browsers can implement it if they want to be compatible with the 
existing browsers.


On Wed, 15 Apr 2009, Maciej Stachowiak wrote:
 
 HTML5 is meant to specify every HTML feature that you need to implement 
 a browser that can handle the real-world Web. At this point, anyone 
 implementing a new browser engine would have to support keygen. 
 However, none of this rules out the possibility of putting more advanced 
 crypto functionality into browsers, either via HTML or a separate spec. 

Indeed.



Re: [whatwg] Vulgar fractions

2009-06-03 Thread Ian Hickson
On Thu, 16 Apr 2009, Øistein E. Andersen wrote:

 Currently, only a limited set of vulgar fractions can be expressed in 
 HTML, viz, those that exist as pre-composed characters in Unicode.  
 (For example, 16ths and 32nds, which are often used with imperial units, 
 are not included.) This can be solved in several ways:
 
 1) According to Unicode (http://unicode.org/book/ch06.pdf, p. 154), 
 `any sequence of one or more decimal digits, followed by the fraction 
 slash [U+2044], followed by any sequence of one or more decimal digits 
 ... should be displayed as a unit, such as ¾'.  Furthermore, Unicode 
 Technical Report No. 20, `Unicode in XML and other Markup Languages', 
 says that the fraction slash is suitable for use with mark-up.  
 Unfortunately, no browser seems to transform a sequence like 3⁄16 into 
 a proper vulgar fraction.  If, however, browsers are willing to 
 implement this, no changes to HTML are needed.
 
 2) Unicode Technical Report No. 20 also suggests that specific mark-up, MathML
 in particular, may be used instead.  MathML mark-up is a bit more verbose:
   <math xmlns="http://www.w3.org/1998/Math/MathML">
   <mfrac>
   <mn>3</mn>
   <mn>16</mn>
   </mfrac>
   </math>
 Unfortunately, this corresponds to a mathematical fraction rather than a
 vulgar fraction, which means that it has a horizontal fraction bar and that it
 takes up too much vertical space.  (Admittedly, vulgar fractions may have
 horizontal fraction bars as well, but this is not suitable for on-screen
 viewing since the numbers get tiny, whereas a diagonal fraction bar only
 requires the numbers to be scaled to 60% vertically and 65% horizontally, as
 suggested in the PostScript Language Cookbook.)
 
 2') There is a MathML attribute called `bevelled' which indicates that a
 diagonal line should be used to separate numerator from denominator:
   <math xmlns="http://www.w3.org/1998/Math/MathML">
   <mfrac bevelled="true">
   <mn>3</mn>
   <mn>16</mn>
   </mfrac>
   </math>
 However, Firefox and Opera both display this as <small>3 / 16</small>, and the
 example given in the MathML specification
 http://www.w3.org/TR/MathML3/image/f3008.gif suggests that this is not
 really meant for vulgar fractions at all.  This could still be a solution if
 <mfrac bevelled="true"/> is defined to correspond to a vulgar fraction if both
 numerator and denominator are <mn/> elements consisting of digits 0--9.
 
 2'') Alternatively, a new attribute (e.g., `vulgar') could be added to cover
 this case.
 
 3) If neither special handling of the fraction slash nor a MathML solution for
 vulgar/non-mathematical fractions is possible, the only remaining solution
 would be to add specific mark-up to HTML directly.
 
 (I am aware that fractions have been proposed earlier in the context of
 mathematical formulae, but I have not been able to find any previous
 discussion regarding vulgar fractions.)
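
Option 1 in the quoted message can be illustrated from the authoring side. The sketch below is an assumption-laden example, not anything proposed in the thread: the function name vulgar_fraction is hypothetical, it expects non-negative integers, and it only builds the character sequence. Whether the result is drawn as a single fraction still depends on the browser and font.

```python
import unicodedata

FRACTION_SLASH = "\u2044"

# Map each compatibility decomposition (for example "3\u20444") to its
# precomposed character, using only code points that the Unicode database
# tags with a <fraction> decomposition.
PRECOMPOSED = {
    unicodedata.normalize("NFKD", chr(cp)): chr(cp)
    for cp in range(0x80, 0x3000)
    if unicodedata.decomposition(chr(cp)).startswith("<fraction>")
}


def vulgar_fraction(numerator: int, denominator: int) -> str:
    """Prefer a precomposed character; otherwise join digits with U+2044."""
    plain = f"{numerator}{FRACTION_SLASH}{denominator}"
    return PRECOMPOSED.get(plain, plain)
```

Three quarters comes back as the single character U+00BE, while 3/16 has no precomposed form and comes back as the digits 3 and 16 joined by the fraction slash.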

Since HTML5 supports MathML natively now, it seems that MathML is the 
solution to use here.


Re: [whatwg] scope of link type license

2009-06-03 Thread Ian Hickson
On Fri, 17 Apr 2009, Nils Dagsson Moskopp wrote:
 
 I am unsure on what scope the current document should have in 
 subsection 5.11.3.9 (Link type license) of the spec and how it 
 functions (or should play) together with sectioning.
 
 Let us consider this example, a creative-commons-licensed video:
 
 <figure>
 <video src="foo.ogg"/>
 <legend>
 <a href="http://example.org/person/misterx" rel="author">
   Mister X.
 </a>
 <a href="http://creativecommons.org/licenses/by-sa/3.0/"
 rel="license">
   CC BY-SA 3.0
 </a>
 </legend>
 </figure>
 
 As I understand it, in this example, due to the provision of subsection 
 5.11.3.3 (Link type author) "For a [...] elements, the author keyword 
 indicates that the referenced document provides further information 
 about the author of the section that the element defining the hyperlink 
 applies to.", the linked author could be considered the author of the 
 video.
 
 However, subsection 5.11.3.9 (Link type license) has no such 
 provision. I therefore propose to add "For a and area elements, the 
 license keyword indicates that the referenced document provides further 
 information about the author of the section that the element defining 
 the hyperlink applies to." to subsection 5.11.3.9 of the HTML 5 spec.

Your interpretation of the spec is correct. This is based on actual 
implementations of rel=license, which apply it to the whole document. I 
expect that if we need per-sub-resource licensing information then a more 
specific solution will be developed, possibly using the microdata idea.



Re: [whatwg] document.contentType

2009-06-03 Thread Boris Zbarsky

João Eiras wrote:
I think the major advantage of document.contentType is to know the value 
of the Content-Type header (without charset) sent by the server.


That's not what document.contentType returns in Gecko, though... 
(especially if the server-sent value was empty, or didn't parse as a 
valid content-type header).  What's the use case for knowing what the 
server sent, as opposed to what the UA is treating the page as?
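
To make the distinction concrete, here is a minimal sketch of "the value of the Content-Type header (without charset)". It is an illustration only: the function name server_media_type is hypothetical, the token pattern is the usual HTTP token character set, and nothing here reproduces what Gecko's document.contentType actually returns. An empty or unparseable header yields None, which is exactly the case where the server-sent value and the type the UA ends up using must differ.

```python
import re
from typing import Optional

# A media type is token "/" token; parameters such as charset follow a ";".
_TOKEN = r"[!#$%&'*+.^_`|~0-9A-Za-z-]+"
_MEDIA_TYPE = re.compile(rf"^({_TOKEN})/({_TOKEN})$")


def server_media_type(header_value: str) -> Optional[str]:
    """Return the lower-cased type/subtype the server sent, or None."""
    essence = header_value.split(";", 1)[0].strip().lower()
    return essence if _MEDIA_TYPE.match(essence) else None
```

For "text/html; charset=utf-8" this returns "text/html"; for an empty string or a value with no slash it returns None.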


-Boris


Re: [whatwg] Nested optgroups

2009-06-03 Thread Brett Zamir

Ian Hickson wrote:

On Mon, 13 Apr 2009, Markus Ernst wrote:
   

I found a message in the list archives from July 2004, where Ian
announced to put nested optgroups back into the spec:
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2004-July/001200.html

Anyway in the current spec, the optgroup element is not allowed inside
another optgroup element:
http://www.whatwg.org/specs/web-apps/current-work/#the-optgroup-element

Has this been removed again since 2004? I did not find more on this in
the list archives.
 


Yeah, this was removed because we couldn't find a good way to get browsers
to support it without breaking backwards compatibility with legacy content
(which relies on the non-nesting parser behaviour).
   
Would there be a way to allow a new element to trigger this behavior 
(maybe deprecating optgroup as well if an attribute on the new element 
could indicate compactness)? Along the lines of expanding HTML more 
toward regular applications, I would think this could help quite nicely 
for building menu bars or the frequently used navigation bars 
recommended by accessibility guidelines without JavaScript (CSS-only 
ones are rare)...


Brett


Re: [whatwg] Vulgar fractions

2009-06-03 Thread Jonas Sicking
On Wed, Jun 3, 2009 at 3:53 PM, Ian Hickson i...@hixie.ch wrote:
 Since HTML5 supports MathML natively now, it seems that MathML is the
 solution to use here.

Actually, it seems to me that the best solution is for browsers to
support the Unicode fraction slash [U+2044], unless doing so causes
serious performance problems.

Øistein: Have you talked to any browser vendor to see if there's a
reason they don't support this yet? At least Firefox supports
ligatures these days, which is somewhat similar.

But ultimately I agree, since we support MathML, it seems like a
solution should live there if a markup solution is needed.

/ Jonas


Re: [whatwg] Pre-Last Call Comments

2009-06-03 Thread Andrew W. Hagen
In current-work, section 4.6.6, there is this explanation of the small 
element:


Small print is typically legalese describing disclaimers, caveats, 
legal restrictions, or copyrights. Small print is also sometimes used 
for attribution.


This paragraph should be removed. Please do not advocate, encourage, or 
indicate acceptance that any legal text should appear in small print. In 
the law, in some circumstances, the size of the print can support an 
argument that a contract, disclaimer, restriction, caveat, legalese, or 
other legal text should be ruled invalid by a court. Furthermore, it is 
generally a bad idea to encourage people to put legal text in small 
print. This includes copyright notices, as well as any other notice. 
Legal text should appear in regular-sized print to keep it as readable 
as possible. When legal text is put into small print, that is 
regrettable. The paragraph should be removed. The best policy would be 
to not mention legal text in the context of the small element.


Secondly, a subsequent example paragraph is not quite right. The 
statement that it contains a copyright is off. It's just a notice of a 
copyright. Furthermore the general principle that should be recognized 
here is that no one should be led by example to place legal text in 
small print. The example in question might be changed to:


In this example the footer contains contact information and an aside.

<footer>
<address>
  For more details, contact
<a href="mailto:j...@example.com">John Smith</a>.
</address>
<p><small>E-mail checked regularly.</small></p>
</footer>


Thank you.

Andrew Hagen
contact2...@awhlink.com



[whatwg] the cite element

2009-06-03 Thread Andrew W. Hagen
The cite element should be slightly changed. Under this proposal, the 
cite element should be used only for titles of works, but may be used 
for other things that web authors may wish to cite. This conforms with 
how the cite element is used in practice.


In the current HTML 5 specification, the cite element can only represent 
a title of a work. This has several negative implications. First, it 
goes against what the word cite means. The common English usage of the 
word cite includes making reference to non-titular authorities. For 
example, a writer may cite Aristotle. See 
http://www.merriam-webster.com/dictionary/cite


Furthermore, the current restriction makes the cite element useless for 
works which do not have a title. See a list of such works at 
http://en.wikipedia.org/wiki/Untitled


Trying to enforce a titles-only rule for the cite element is 
impossible. The best that can be done is for small bands of advocates to 
ringingly criticize any web author who breaks the rule. That is herding 
cats. It is not as if browsers will refuse to render <cite 
style="font-style: normal">Lincoln</cite> or that validators can 
distinguish that from Gore Vidal's <cite>Lincoln</cite> (a historical 
novel). The restrictive rule cannot be enforced.


Finally, HTML 5 has a broad definition for some elements, such as kbd. 
The kbd element can represent any form of user input, even if it is not 
made with a keyboard. In current-work, one example is given of 
<kbd><kbd>Shift</kbd>+<kbd>F3</kbd></kbd> for Shift+F3, even though in 
that keyboard chord, the user would not actually input the + character 
on the keyboard. It is so broadly defined that 
<kbd>Shift</kbd>+<kbd>F3</kbd> would also be valid. Some elements, like 
kbd, are very broad.


Logical consistency cannot be perfectly maintained when specifying the 
next version of HTML, but it should be a goal, and we ought to regret a 
logical inconsistency between the cite element and elements like kbd. 
One is narrow. The other is broad. Broadening the definition of cite 
will not cause harm. It would only allow web authors to fully embrace 
the cite element.


This solution is workable. The cite element's default style is italics 
in display mode, and this proposal would not change that. If a web 
author writes: <cite>Aristotle</cite>, the web author can live with it 
or re-style the cite element as desired.


To conclude, slightly broadening the cite element would improve HTML.

Andrew Hagen
contact2...@awhlink.com