[whatwg] Microdata feedback
On Thu, 12 Nov 2009, Philip Jägenstedt wrote: I've been playing with the microdata DOM APIs again, continuing the JavaScript experimental implementation http://gitorious.org/microdatajs. It's not small or elegant, but at least some spec issues have come up in the process.

What is the http://www.w3.org/1999/xhtml/microdata# URI?

It provides a way to map microdata property names to URLs in an unambiguous way.

http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#associating-names-with-items

Otherwise, if one of the other elements in pending is an ancestor element of candidate, and that element is scope, then remove candidate from pending. Otherwise, if one of the other elements in pending is an ancestor element of candidate, and that element also has scope as its nearest ancestor element with an itemscope attribute specified, then remove candidate from pending.

The intention of these requirements seems to be to eliminate redundant elements in pending, but a comment on the intention of each in the spec would be helpful as it's quite cryptic right now.

Added some brief explanations.

http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#microdata-dom-api

itemtype and itemid are both URL attributes and therefore when getting itemType and itemId relative URLs should be resolved (even if only absolute URLs are valid). Correct?

That was a correct interpretation of the spec, but was only intended to be the case for itemid. I've corrected the spec to say that itemType is just a regular DOMString with no resolution.

itemprop and itemref are both unordered set of unique space-separated tokens, but in HTMLElement only itemProp is a DOMSettableTokenList while itemRef is a DOMString. This doesn't really make sense, so make itemRef a DOMSettableTokenList too?

Fixed. That was an oversight.

From reading the spec it's not obvious (without following cross-references) that itemProp isn't just a plain string.
An example using .itemProp.contains(name) or similar would make this more difficult to miss. Done. http://www.whatwg.org/specs/vocabs/current-work/#vcard Having clickable cross-references in this spec would help a lot when reviewing! I've put them back in the HTML5 spec, which makes this a moot point. Grammar: Let value *be* the result of collecting the first vCard subproperty named value in subitem. Fixed. Let n1 be the value of the first property named family-name in subitem, or the empty string if there is no such property or the property's value is itself an item. Why not use collecting the first vCard subproperty here? Not doing so had me trying to find how the two were different, but I couldn't find any differences given that the values are later escaped. Oops. Fixed. There's also the issue of how newlines from textContent values are escaped. Applying the vCard extraction algorithm to the spec example gives: BEGIN:VCARD PROFILE:VCARD VERSION:3.0 SOURCE:http://foolip.org/microdatajs/demo/vcard.html NAME:vCard demo FN:Jack Bauer PHOTO;VALUE=URI:http://foolip.org/microdatajs/demo/jack-bauer.jpg ORG:Counter-Terrorist Unit;Los Angeles Division ADR:;;10201 W. Pico Blvd.;Los Angeles;CA;90064;United States GEO:34.052339;-118.410623 TEL;TYPE=work:+1 (310)\n 597 3781 URL;VALUE=URI:http://en.wikipedia.org/wiki/Jack_Bauer URL;VALUE=URI:http://www.jackbauerfacts.com/ EMAIL:j.ba...@la.ctu.gov.invalid TEL;TYPE=cell:+1 (310) 555\n 3781 NOTE:If I'm out in the field\, you may be better off\n contacting Chloe O'B rian if it's about\n work\, or ask Tony Almeida if\n you're interested in the CTU five-a-side football team we're trying\n to get going. 
AGENT;VALUE=VCARD:BEGIN:VCARD\nPROFILE:VCARD\nVERSION:3.0\nSOURCE:http://fo olip.org/microdatajs/demo/vcard.html\nNAME:vCard demo\nEMAIL\;VALUE=URI:ma ilto:c.obr...@la.ctu.gov.invalid\nfn:Chloe O'Brian\nN:O'Brian\;Chloe\;\;\; \nEND:VCARD\n AGENT:Tony Almeida REV:2008-07-20T21:00:00+0100 TEL;TYPE=home:01632 960 123 N:Bauer;Jack;;; END:VCARD

TEL and NOTE have line breaks that are just because of how the HTML source is formatted. Importing this into Gmail preserves these linebreaks, which looks quite broken. Unless we expect text fields to contain meaningful formatting, perhaps simply collapsing all whitespace into a single space is OK? In the best of worlds <br> would be converted to \n, but I'm not sure if it's worth the trouble.

We're screwed either way. If we convert newlines to spaces, then we lose formatting from <pre>. If we don't convert newlines, we gain spurious linebreaks (and spaces). The latter is less destructive, which is why I picked it, but it's not ideal, I agree. I'd like at some point to introduce some sort of semantic textContent that handles <br>, <pre>, <bdo>, dir="", <img alt>, <del>, space-collapsing, and newline elimination, but there hasn't been much enthusiasm around the idea, and it's not clear what else
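
The trade-off discussed above (keeping source newlines versus collapsing whitespace before escaping) can be illustrated with a minimal Python sketch. The function names `escape_vcard_text` and `collapse_whitespace` are hypothetical and do not come from the spec or from microdatajs; the escaping rules are assumed to be the usual vCard 3.0 text escaping (backslash, semicolon, comma, newline), which is consistent with the `\,` and `\n` sequences in the output quoted above.

```python
import re


def escape_vcard_text(value: str) -> str:
    """Escape a text value for a vCard 3.0 content line (assumed rules)."""
    # Backslash must be escaped first so later escapes are not doubled.
    value = value.replace("\\", "\\\\")
    value = value.replace(";", "\\;").replace(",", "\\,")
    # Normalise CRLF and CR to LF, then emit the two-character sequence \n.
    value = value.replace("\r\n", "\n").replace("\r", "\n")
    return value.replace("\n", "\\n")


def collapse_whitespace(value: str) -> str:
    """Collapse every run of whitespace (including newlines) to one space."""
    return re.sub(r"\s+", " ", value).strip()


# textContent of the TEL element as it might appear in the HTML source.
raw_tel = "+1 (310)\n 597 3781"

# Option 1: keep newlines (spurious line break survives in the vCard).
preserved = escape_vcard_text(raw_tel)

# Option 2: collapse whitespace first (loses any <pre> formatting).
collapsed = escape_vcard_text(collapse_whitespace(raw_tel))

print(preserved)
print(collapsed)
```

Neither option recovers line breaks that were written as `<br>`, which is the gap a "semantic textContent" would have to fill.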
Re: [whatwg] Microdata feedback
Hixie wrote: Finally on vCard, the final part of the extraction algorithm goes to great trouble to guess what is the family name and what is the given name. This guess will be broken for transliterated east Asian names (CJKV that I know of, maybe others too). Just saying. Also, why is it important to explicitly add N: for organizations?

This is intended to be compatible with Microformats vCard, which has these weird rules. If you think we should remove them, please at least first speak to Tantek and see what he thinks.

The fn optimisation pattern isn't intended to catch 100% of cases, just the situation Firstname Lastname or Firstname Middlename Lastname. So if you just use fn (formatted name) and don't use n (name), the name will be extracted/guessed using the optimisation pattern. In cases where the pattern doesn't work (e.g. Anne van Kesteren, or east Asian names) you can still explicitly specify the family name and given name, overriding the fn optimisation pattern. If you do this, you need to explicitly state this is the name (n) as well as the formatted name (fn).

Similarly, for organisations, you don't have to explicitly set n (name) if you apply both fn (formatted name) and org (organisation name) to a string. This time, the optimisation pattern assumes that the fn is the name of the organisation.

Technically, the n property is *always* required but if you use either of those two optimisation patterns, the n is inferred from fn.

HTH, Jeremy

-- Jeremy Keith a d a c t i o http://adactio.com/
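
As a rough illustration of the two optimisation patterns described above, here is a naive Python sketch. The function name `infer_n_from_fn` is hypothetical, and the logic is a simplification based only on the description in this message; the actual hCard and vocabulary specs define more cases than are reproduced here. The empty `;;;;` value for organisations is an assumption about how "n is inferred" is serialised.

```python
from typing import Optional


def infer_n_from_fn(fn: str, org: Optional[str] = None) -> Optional[str]:
    """Guess a vCard N value (family;given;additional;prefixes;suffixes).

    Returns None when no guess is attempted, in which case the author
    has to mark up n explicitly.
    """
    if org is not None and fn == org:
        # Organisation pattern: fn is the organisation's name, so the
        # personal-name components are left empty (assumed serialisation).
        return ";;;;"
    words = fn.split()
    if len(words) == 2:
        # "Firstname Lastname"
        given, family = words
        return f"{family};{given};;;"
    if len(words) == 3:
        # "Firstname Middlename Lastname"
        given, middle, family = words
        return f"{family};{given};{middle};;"
    return None


print(infer_n_from_fn("Jack Bauer"))
print(infer_n_from_fn("Anne van Kesteren"))
```

The second call shows why the guess has to be overridable: the sketch treats "van" as a middle name and "Kesteren" as the whole family name, which is the kind of failure the message mentions.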
[whatwg] bidi embedding for block-level elements
On 01/14/2010 12:49 AM, Simon Montagu wrote: On 01/11/2010 11:35 PM, fantasai wrote: On 11/26/2009 10:54 PM, Simon Montagu wrote: I assume your Gecko example is using a very recent version of Gecko, such as a nightly build or a beta of Firefox 3.6? I fixed this issue only a few months ago. The HTML standard does specify what to do in this case, see http://www.w3.org/TR/REC-html40/struct/dirlang.html#style-bidi: When a block element that does not have a dir attribute is transformed to the style of an inline element by a style sheet, the resulting presentation should be equivalent, in terms of bidirectional formatting, to the formatting obtained by explicitly adding a dir attribute (assigned the inherited value) to the transformed element. In practice, however, since browsers are not consistent, authors will have to use CSS properties to achieve the expected results.

Does this mean applying unicode-bidi: embed to all block-level elements? Because that seems like it fulfills those requirements.

I was thinking in terms of applying unicode-bidi: embed ad hoc whenever applying display: inline to a specific element, but applying it wholesale to all block-level elements will also work, of course.

In that case, I suggest that we add it to the sample default style sheet for HTML 4 in the CSS2.1 appendix, and recommend the HTMLWG add some wording about block-level elements defining bidi embedding boundaries to the HTML5 spec (and perhaps using CSS's unicode-bidi: embed rule as an example).

~fantasai
Re: [whatwg] about:blank synchronicity
On 1/15/10 5:05 AM, Henri Sivonen wrote: I've located a Mozilla test case that seems to depend on the event loop task mapping of data: URL loads (http://mxr.mozilla.org/mozilla-central/source/layout/base/tests/chrome/test_bug533845.xul). Er... it does? Where? Does anyone happen to have data on whether the Web already depends on data: URLs that don't block the parser loading as a single event loop task? I don't think the web depends on data: URLs at all, really, so I would guess no. -Boris
Re: [whatwg] about:blank synchronicity
On 1/13/10 4:56 PM, Ian Hickson wrote: The spec currently distinguishes between the initial about:blank load (creation of a new browsing context), which actually doesn't involve navigation, and navigating to about:blank. It seems like simply making the first one synchronous, but making the latter asynchronous, would satisfy your use case. Would other vendors be ok with this?

In case it wasn't clear from the relevant Gecko thread, I would personally be fine with this. That said, would initial about:blank load only include <iframe/> (no src at all), or also <iframe src=""/> or also <iframe src="about:blank"/>? I suspect it doesn't matter that much, actually, but would welcome confirmation.

Would it have other problems? Are there cases other than navigation where about:blank being synchronous is detectable? (I couldn't find any.)

I'm not sure what you're asking here...

-Boris
Re: [whatwg] Microdata feedback
On Mon, Jan 18, 2010 at 7:58 AM, Ian Hickson i...@hixie.ch wrote: I've made it redirect to the spec. Could you say that the URL *should* provide human-readable information about the vocabulary? We all know the problems with having centrally-stored machine-readable data about your specs, but encouraging the URL to provide human-readable info seems helpful. (If they aren't supposed to be dereferenced, why use HTTP?) Graphs are intended to be supported in v2, using a mechanism You seem to have left this sentence unfinished.
Re: [whatwg] Microdata feedback
Aryeh Gregor wrote: On Mon, Jan 18, 2010 at 7:58 AM, Ian Hickson i...@hixie.ch wrote: I've made it redirect to the spec. Could you say that the URL *should* provide human-readable information about the vocabulary? We all know the problems with having centrally-stored machine-readable data about your specs, but encouraging the URL to provide human-readable info seems helpful. (If they aren't supposed to be dereferenced, why use HTTP?) ... SHOULD return human-readable information is good, if you also add SHOULD NOT automatically dereference. BR, Julian
Re: [whatwg] about:blank synchronicity
On Mon, 18 Jan 2010, Boris Zbarsky wrote: On 1/13/10 4:56 PM, Ian Hickson wrote: The spec currently distinguishes between the initial about:blank load (creation of a new browsing context), which actually doesn't involve navigation, and navigating to about:blank. It seems like simply making the first one synchronous, but making the latter asynchronous, would satisfy your use case. Would other vendors be ok with this?

In case it wasn't clear from the relevant Gecko thread, I would personally be fine with this. That said, would initial about:blank load only include <iframe/> (no src at all), or also <iframe src=""/> or also <iframe src="about:blank"/>? I suspect it doesn't matter that much, actually, but would welcome confirmation.

It would include any browsing context creation, including, e.g. window.open(), <object> pointing to an HTML file before the HTML file is loaded, etc.

-- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] about:blank synchronicity
On 1/18/10 6:02 PM, Ian Hickson wrote: In case it wasn't clear from the relevant Gecko thread, I would personally be fine with this. That said, would initial about:blank load only include <iframe/> (no src at all), or also <iframe src=""/> or also <iframe src="about:blank"/>? I suspect it doesn't matter that much, actually, but would welcome confirmation.

It would include any browsing context creation, including, e.g. window.open(), <object> pointing to an HTML file before the HTML file is loaded, etc.

That wasn't quite my question. If I have an <iframe src="about:blank"/> in my source, would there be a sync about:blank document creation followed by an about:blank load? Or would the @src value just get ignored if it's about:blank?

-Boris
Re: [whatwg] img copyright attribute
On Sat, 9 Jan 2010, will surgent wrote: It would be nice if there was a copyright attribute for the HTML 5 img tag. This would make it easy for users and search engines to filter out images that cannot be used for certain purposes.

On Sun, 10 Jan 2010, Jonny Barnes wrote: Or maybe a license attribute instead, that would include copyrighted work and stuff licensed under some CC or alternative.

On Sat, 9 Jan 2010, Aryeh Gregor wrote: This is one of the things microdata/RDFa are meant to do.

On Sun, 10 Jan 2010, Philip Jägenstedt wrote: http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#examples-4

On Sun, 10 Jan 2010, will surgent wrote: Hmm I didn't know about that. Thanks!

On Sun, 10 Jan 2010, Dawid Czyzewski wrote: And why img only? This would also be good for audio and video.

On Mon, 11 Jan 2010, will surgent wrote: That sounds like a good idea (about the audio and video elements being included as well). I just thought of it because Google does not allow one to specify the copyright or license in an image search as far as I know. Having a license attribute would make it intuitive for developers to add the license the same way title and alt attributes are specified.

On Tue, 12 Jan 2010, timeless wrote: external metadata on copyright is a disaster. it gets lost immediately. GIF and friends have supported embedding (c) into images for decades. As google is fully capable of caching images (and obviously does so), I question how adding a tag to html will solve a problem which is already solved by the native image formats themselves. For lack of a more useful reference about comment fields, i'll just point to one application which is aware of them (although at the time of the posting it only supported them for certain image types): http://www.group42.com/ts-wi04.htm

Based on the above comments, I haven't changed anything -- the work vocabulary pretty much already addresses this use case in HTML, and addressing it in other formats is a problem for another working group.

-- Ian Hickson http://ln.hixie.ch/