Re: [whatwg] Interaction of explicit and implicit sections (was: Re: Question on (new) header and hgroup)
On Fri, 08 May 2009 00:58:21 +0200, Simon Pieters sim...@opera.com wrote: Actually I believe it would be:
+--HTML 5
+--A new era of loveliness
+--Navigation
This surprised me when I used implicit sections and just wrapped articles around news items (which were h3s). I expected the outline to be like it was without the article:
+--Site heading
+--Page heading
+--News item
...but instead it became (according to your and gsnedders' outliners):
+--Site heading
+--Page heading
+--News item
Maybe the spec should change here to match people's expectations better? OTOH, if we do this, then the default style sheet (the x x x h1 stuff) will be wrong. I guess the proper solution to that is to introduce :heading-level(n) that uses the outline algorithm. (But how to select and style subtitles in hgroup though?) -- Simon Pieters Opera Software
Re: [whatwg] Question on (new) header and hgroup
jgra...@opera.com writes: Quoting Smylers smyl...@stripey.com : James Graham writes: hgroup affects the document structure, header does not. That explains _how_ they are different (as does the spec), but not _why_ it is like that. More specifically: * Are there significant cases where header needs _not_ to imply hgroup? Consider wrapping an hgroup inside every header; how many places has that broken the semantics? I could believe that in most of the cases where a page header appropriately contains multiple headings they are subtitles rather than subsections. The semantic that authors seem to want from an element named header is "all the top matter of my page before the main content". That could include headers, subheaders, navigation, asides and almost anything else. It could. But most of the above have no effect on the outline algorithm. In practice, how often do current <div class=header> sections contain headers of multiple sections, without those nested sections being separately wrapped in their own div-s (or similar, which could become section or whatever's appropriate in HTML 5)? Since the header can contain multiple distinct logical sections of the document, each with their own headers, it makes no sense to implicitly wrap its contents in hgroup. You're right. What I was really thinking of is something closer to: inside header, if any hx elements are encountered before any nested sectioning elements, then treat all the hx elements as being a single heading. So header could still contain section-s, with their own headings. And a header with no hx elements wouldn't create an empty entry in the outline. * Given the newness and nuance of header and hgroup and the distinction between them, it's likely that some authors will confuse them. Given that hgroup doesn't appear to do anything on the page (it's similar to invisible meta-data), it's likely that some authors will omit it[*1] when it's needed to convey the semantics they intend. Yes, that is possible. 
The thinking behind the change (or, at least, part of my reason for proposing it) was that it is less harmful if authors omit something where it would be useful than that they use it incorrectly in such a way that tools which follow the spec would be broken from the point of view of end users. That's a good point. In particular the old formulation of header would have caused the h2 element to be omitted from the outline in cases like <header> <h1>My Blog</h1> <nav> <h2>Navigation</h2> </nav> </header>, which would be confusing for users. Indeed. What I intended to raise for consideration (and hopefully now have done) is that header would not merge the above, because nav starts a new section inside header. Consider a similar example: <header> <h1>My Blog</h1> <h2>Ramblings of an internet nobody</h2> <nav><h2>Navigation</h2> ... </nav> </header> The spec currently has both the h2-s as subsections. The alternative I was thinking of would treat the h1 and first h2 as being a single heading (of the entire document), but keep the second h2 (as the heading of the navigation). On the other hand, in the current formulation of the spec, the most likely error (omitting hgroup) only has the effect that the outline hierarchy is slightly wrong, with the subheader appearing as an actual header; it does not lead to data loss. This seems like a much better failure mode. That's true. But if the number of failures can be minimized, it matters less what the failure mode is. My concern is that with hgroup being so esoteric, combined with its effect being largely invisible, it will hardly be used and therefore possibly not worth adding to HTML 5. Authors don't have a good track record on accurately adding invisible metadata. If we can algorithmically get it right in most cases, while leaving a way for careful authors to explicitly override it if necessary, that may be better overall. * Are there significant cases where hgroup will be useful outside of header? hgroup exists to allow for subtitles and the like. 
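The alternative discussed above (treating the h1 and first h2 as a single heading of the document, while a heading inside a nested nav keeps its own entry) can be sketched as follows. Plain objects stand in for DOM nodes, and mergedHeadingText is a hypothetical helper name, not anything from the spec:

```javascript
// Sketch of the proposed rule: inside <header>, any h1-h6 elements
// that appear before the first nested sectioning element are merged
// into a single heading; headings inside later sections (e.g. <nav>)
// are left for those sections' own outlines.
const SECTIONING = new Set(['SECTION', 'ARTICLE', 'ASIDE', 'NAV']);
const HEADINGS = new Set(['H1', 'H2', 'H3', 'H4', 'H5', 'H6']);

function mergedHeadingText(headerChildren) {
  const parts = [];
  for (const child of headerChildren) {
    const name = String(child.nodeName).toUpperCase();
    if (SECTIONING.has(name)) break; // stop at the first nested section
    if (HEADINGS.has(name)) parts.push(child.textContent);
  }
  return parts.join(': ');
}
```

For the "My Blog" example above, this would yield a single merged heading for the document and leave the nav's h2 to the nav's section.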
It's fairly common for documents to have these -- where it's likely there's use for a header element anyway. It's much less common for a mere section of a document to warrant a multi-part title; is that a case which is worth solving? If it is, would it be problematic to force authors to use header there? It seems highly odd to have header perform a dual role where sometimes it means section header and sometimes it means group of heading/subheading elements. Much more confusing than one element per role. I think the two concepts are sufficiently overlapping that it isn't really a dual role. header could mean 'section (or document) header' -- it would be used when a section's header consists of more than just a single hx element. Whether those elements are because of multi-part titles or search boxes or whatever is a distinction that authors would
Re: [whatwg] MessagePorts in Web Workers: implementation feedback
On May 7, 2009, at 5:40 PM, Drew Wilson wrote: Agreed that removing this requirement: User agents must act as if MessagePort objects have a strong reference to their entangled MessagePort object. would make MessagePort implementation much easier, as it would remove the need to track reachability across multiple threads. This requirement can get tricky especially as both sides can be cloned, in-flight to a new owner, etc. My only concern is that removing this requirement introduces non-deterministic behavior - if I have an entangled MessagePort and I register an onmessage() handler with it, then drop my reference to it, after which someone calls postMessage() on the entangled port, there's no way to tell if my onmessage() handler will be invoked; it entirely depends on whether a GC happens first or not. That seems bad. That's a fair concern. I would mention a few counterpoints: 1) Nondeterministic behavior is inevitable with an API designed for concurrency. There are surely already possible cases of nondeterminism in the MessagePort API. Consider sending a message to two different workers and waiting for the reply. The replies may arrive in either order; indeed, the workers may receive the messages in either order, so if they are in communication with each other you cannot rely on one getting the message and performing its action first. 2) The nondeterministic behavior in this case is easily avoided by what are in any case good coding practices: (a) don't drop all references to a MessagePort you are still using, and (b) call close() on the MessagePort when you are done with it and don't want more messages. 3) The alternatives on the table to removing this requirement are either removing the ability to use MessagePorts to communicate with Workers, or leaving the spec as-is with its attendant high implementation cost. 
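The two coding practices in point (2) can be sketched as a thin wrapper. Assumptions: a browser-style MessagePort object with onmessage, postMessage() and close(); PortHandle is a hypothetical helper, not part of any proposal:

```javascript
// Practice (a): the handle keeps an explicit strong reference to the
// port, so GC timing can never silently drop an in-use channel.
// Practice (b): teardown is an explicit close() call, not something
// left to collection of the entangled pair.
class PortHandle {
  constructor(port, onMessage) {
    this.port = port;       // strong reference: practice (a)
    this.closed = false;
    this.port.onmessage = onMessage;
  }
  send(data) {
    if (this.closed) throw new Error('port already closed');
    this.port.postMessage(data);
  }
  close() {                 // explicit teardown: practice (b)
    if (!this.closed) {
      this.port.close();
      this.closed = true;
    }
  }
}
```

Code written this way behaves the same whether or not the keepalive requirement stays in the spec, which is the point of the counterargument.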
Given all these factors, I think avoiding nondeterminism in the one particular case you describe, when authors can already avoid it in a reasonable way, is not worth the order of magnitude increase in implementation complexity imposed by the entanglement keepalive requirement. I also think accepting this small amount of potential nondeterminism is preferable to excluding Workers from using MessagePorts. Thus, on the whole, I think the best option is to remove the keepalive requirement. Regards, Maciej -atw On Thu, May 7, 2009 at 3:28 PM, Maciej Stachowiak m...@apple.com wrote: I agree with Drew's assessment that MessagePorts in combination with Workers are extremely complicated to implement correctly, as currently specified. In fact, the design seems to push towards having lockable shared state, even though one potential advantage of the message passing design is to avoid locking and shared state. Besides removing MessagePorts as a way to communicate with workers, another possibility is simplifying the life cycle requirements. For example, getting rid of the keepalive rule, whereby both MessagePorts remain live so long as either is otherwise live, would remove the majority of the complexity. I don't think the slight convenience of that rule is worth the extra implementation cost. On May 7, 2009, at 1:39 PM, Drew Wilson wrote: Hi all, I've been hashing through a bunch of the design issues around using MessagePorts within Workers with IanH and the Chrome/WebKit teams and I wanted to follow up with the list with my progress. The problems we've encountered are all solvable, but I've been surprised at the amount of work involved in implementing worker MessagePorts (and the resulting implications that MessagePorts have on worker lifecycles/reachability). My concern is that the amount of work to implement MessagePorts within Worker context may be so high that it will prevent vendors from implementing the SharedWorker API. 
Have other implementers started working on this part of the spec yet? Let me quickly run down some of the implementation issues I've run into - some of these may be WebKit/Chrome specific, but other browsers may run into some of them as well: 1) MessagePort reachability is challenging in the context of separate Worker heaps In WebKit, each worker has its own heap (in Chrome, they will have their own process as well). The spec reads: User agents must act as if MessagePort objects have a strong reference to their entangled MessagePort object. Thus, a message port can be received, given an event listener, and then forgotten, and so long as that event listener could receive a message, the channel will be maintained. Of course, if this was to occur on both sides of the channel, then both ports would be garbage collected, since they would not be reachable from live code, despite having a strong reference to each other. Furthermore, a MessagePort object must not be garbage collected while there exists a message in a
Re: [whatwg] video/audio feedback
On Fri, May 8, 2009 at 9:43 AM, David Singer sin...@apple.com wrote: At 8:45 +1000 8/05/09, Silvia Pfeiffer wrote: On Fri, May 8, 2009 at 5:04 AM, David Singer sin...@apple.com wrote: At 8:39 +0200 5/05/09, Kristof Zelechovski wrote: If the author wants to show only a sample of a resource and not the full resource, I think she does it on purpose. It is not clear why it is vital for the viewer to have an _obvious_ way to view the whole resource instead; if it were the case, the author would provide for this. IMHO, Chris It depends critically on what you think the semantics of the fragment are. In HTML (the best analogy I can think of), the web page is not trimmed or edited in any way -- you are merely directed to one section of it. There are critical differences between HTML and video, such that this analogy has never worked well. could you elaborate? At the risk of repeating myself ... HTML is text and therefore whether you download a snippet only or the full page and then do an offset does not make much of a difference. Even for a long page. In contrast, downloading a snippet of video compared to the full video will make a huge difference, in particular for long-form video. So, the difference is that in HTML the user agent will always have the context available within its download buffer, while for video this may not be the case. This admittedly technical difference also has an influence on the user interface. If you have all the context available in the user agent, it is easy to just grab a scroll-bar and jump around in the full content manually to look for things. This is not possible in the video case without many further download actions, which will each incur a network delay. This difference opens the door for user agents to have a choice in display: either provide the full context, or just the fragment focus. 
Thus, while comparing media fragments to HTML fragments is a simple way to introduce the concept - and I use it, too, to explain to my less technical peers - it doesn't really help for detailed specifications. Regards, Silvia.
Re: [whatwg] Interaction of explicit and implicit sections (was: Re: Question on (new) header and hgroup)
On Fri, May 8, 2009 at 1:46 AM, Simon Pieters sim...@opera.com wrote: I guess the proper solution to that is to introduce :heading-level(n) that uses the outline algorithm. (But how to select and style subtitles in hgroup though?) If you used a :heading-level() pseudo-class, you'd just do :heading-level(x) h2 to style the h2 within an element of heading level x. (This would only target h2s within hgroups right now, not h2s by themselves.) ~TJ
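A rough sketch of how the level such a pseudo-class matches against could be computed. Assumptions: plain objects stand in for DOM nodes, and sectionDepth is a hypothetical helper that only counts explicit sectioning ancestors, deliberately ignoring the implicit sections the real outline algorithm creates from h1-h6 rank changes:

```javascript
// Approximate an element's "heading level" by counting the explicit
// sectioning elements that enclose it. This is only a first-order
// approximation of the outline algorithm's section nesting.
const SECTIONING_ELEMENTS = new Set(['BODY', 'SECTION', 'ARTICLE', 'ASIDE', 'NAV']);

function sectionDepth(node) {
  let depth = 0;
  for (let n = node.parentNode; n; n = n.parentNode) {
    if (SECTIONING_ELEMENTS.has(String(n.nodeName).toUpperCase())) depth++;
  }
  return depth;
}
```

A :heading-level(n) selector in the sense proposed would match on a number computed like this, rather than on the h1-h6 tag name, which is what would let a subtitle h2 be styled independently of a "real" level-2 heading.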
Re: [whatwg] Suitable video codec
yea.. the take home point is that Theora now has an encoder that puts it in the same ballpark as contemporary proprietary codecs. I would not say Theora is outdoing H.264. The results of a given PSNR test are impressive and important to publicize but I think my wording in posting about that test might have promoted overstating the quality factor. The only quality that really mattered in terms of standardization has stayed constant: which is that Ogg Theora is /royalty free/ and implementable in both proprietary and free software browsers. --michael David Gerard wrote: H.264 was advocated here for the video element as higher quality than competing codecs such as Theora could ever manage. The Thusnelda encoder is outdoing H.264 in current tests: http://web.mit.edu/xiphmont/Public/theora/demo7.html This is of course developmental work. I'm sure the advocates of H.264 can also tune its encoders to keep up, and not make Theora the only reasonable candidate for the video element. - d.
Re: [whatwg] video/audio feedback
At 23:46 +1000 8/05/09, Silvia Pfeiffer wrote: On Fri, May 8, 2009 at 9:43 AM, David Singer sin...@apple.com wrote: At 8:45 +1000 8/05/09, Silvia Pfeiffer wrote: On Fri, May 8, 2009 at 5:04 AM, David Singer sin...@apple.com wrote: At 8:39 +0200 5/05/09, Kristof Zelechovski wrote: If the author wants to show only a sample of a resource and not the full resource, I think she does it on purpose. It is not clear why it is vital for the viewer to have an _obvious_ way to view the whole resource instead; if it were the case, the author would provide for this. IMHO, Chris It depends critically on what you think the semantics of the fragment are. In HTML (the best analogy I can think of), the web page is not trimmed or edited in any way -- you are merely directed to one section of it. There are critical differences between HTML and video, such that this analogy has never worked well. could you elaborate? At the risk of repeating myself ... HTML is text and therefore whether you download a snippet only or the full page and then do an offset does not make much of a difference. Even for a long page. you might try loading, say, the one-page version of the HTML5 spec. from the WhatWG site...it takes quite a while. Happily Ian also provides a multi-page version, but this is not always the case. In contrast, downloading a snippet of video compared to the full video will make a huge difference, in particular for long-form video. there are short and long pages and videos. But we're talking about a point of principle here, which should be informed by the practical, for sure, but not dominated by it. The reason I want clarity is that this has ramifications. For example, if a UA is asked to play a video with a fragment indication #time=10s-20s, and then a script seeks to 5s, does the user see the video at the 5s point of the total resource, or 15s? I think it has to be 5s. 
So, the difference is that in HTML the user agent will always have the context available within its download buffer, while for video this may not be the case. I'm sorry, I am lost. We could quite easily extend HTTP to allow for anchor-based retrieval of HTML (i.e. convert a 'please start at anchor X' into a pair of byte-range responses, for the global material, and then the document from that anchor onwards). This admittedly technical difference also has an influence on the user interface. If you have all the context available in the user agent, it is easy to just grab a scroll-bar and jump around in the full content manually to look for things. This is not possible in the video case without many further download actions, which will each incur a network delay. This difference opens the door to enable user agents with a choice in display to either provide the full context, or just the fragment focus. But we can optimize for the fragment without disallowing the seeking. -- David Singer Multimedia Standards, Apple Inc.
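The seeking question raised above can be made concrete with a small sketch. Assumptions: the #time=10s-20s notation used in this thread, not any finalized media-fragment syntax, and both function names are hypothetical:

```javascript
// Parse the "#time=10s-20s" style fragment used in this thread.
function parseTimeFragment(hash) {
  const m = /^#time=(\d+(?:\.\d+)?)s-(\d+(?:\.\d+)?)s$/.exec(hash);
  return m ? { start: Number(m[1]), end: Number(m[2]) } : null;
}

// Under the "fragment focuses, but does not trim" reading argued for
// above, a script seek addresses the timeline of the whole resource.
function resolveSeek(fragment, t) {
  return t; // NOT fragment.start + t: seeking to 5s shows the 5s point
}
```

The opposite reading would make resolveSeek return fragment.start + t, i.e. the fragment would behave like an edit of the resource rather than a view onto it.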
Re: [whatwg] Micro-data/Microformats/RDFa Interoperability Requirement
On Thu, 7 May 2009, Manu Sporny wrote: That's certainly not what the WHATWG blog stated just 20 days ago for rel=license [...] The WHATWG blog is an open platform on which anyone can post, and content is not vetted for correctness. Mark can sometimes make mistakes. Feel free to post a correction. :-) and the spec doesn't seem to clearly outline the difference in definition either (at least, that's not my reading of the spec): http://www.whatwg.org/specs/web-apps/current-work/multipage/history.html#link-type-license http://www.whatwg.org/specs/web-apps/current-work/multipage/history.html#link-type-tag Actually I just looked at the rel-tag FAQ and found that it disagrees with what Tantek had told me, so (assuming the FAQ is normative or that the rel-tag spec does mention this somewhere that I didn't find) the specs do match here. For rel-license, the HTML5 spec defines the value to apply to the content and not the page as a whole. This is a recent change to match actual practice and I will be posting about this shortly. The RDFa specification is very confusing to me (e.g. I don't understand how the normative processing model is separate from the section RDFa Processing in detail), so I may be misinterpreting things, but as far as I can tell: <html xmlns="http://www.w3.org/1999/xhtml"> <head> <base href="http://example.com/"/> <link about="http://example.net/" rel="dc.author" href="http://a.example.org/"/> ... ...will result in the following triple: <http://example.net/> <http://example.com/dc.author> <http://a.example.org/> . Two corrections: The first is that an RDFa processor would not generate this triple. My apologies, I misinterpreted 5.4.4. Use of CURIEs in Specific Attributes to mean that rel= was a relative-uri-or-curie attribute. (5.4.4. Use of CURIEs in Specific Attributes says it's link-type-or-curie, but 5.4.3. 
General Use of CURIEs in Attributes doesn't list that as a possibility and at the end says that rel= is an exception only insofar as it supports specific link types as well, which I interpreted differently.) For example, it would be somewhat presumptuous of RDFa to prevent any future version of HTML from being able to use the word resource as an attribute name. What if we want to extend the forms features to have an XForms datatype compatibility layer; why should we not be able to use the datatype and typeof attributes? As long as their legacy nature was preserved, and those uses didn't create ambiguity in RDFa processors and semantic equivalence was ensured, I don't see why they shouldn't be re-used. Ah, ok. If such attributes are re-used, I suppose that it should be possible to make sure that it is possible to re-use them in a way that doesn't conflict with RDFa (e.g. by triggering the non-curie-non-uri behaviour for property= or by having authors who want RDFa compatibility use xmlns:http=http: declarations or some such). Noted. Surely this is what namespaces were intended for. Uhh, what sort of namespaces are we talking about here? xmlns-style namespaces? The idea of XML Namespaces was to allow people to extend vocabularies with new features without clashing with older features by putting the new names in new namespaces. It seems odd that RDFa, a W3C technology for an XML vocabulary, didn't use namespaces to do it. For example, the way that n:next and next can end up being equivalent in RDFa processors despite being different per HTML rules (assuming an n namespace is appropriately declared). If they end up being equivalent in RDFa, the RDFa author did so explicitly when declaring the 'n' prefix to the default prefix mapping and we should not second-guess the author's intentions. 
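The n:next versus next point can be illustrated with a deliberately naive expansion sketch (expandCurie is a hypothetical toy, not the actual RDFa processing rules):

```javascript
// Naive illustration only: if an author maps a prefix to the same
// vocabulary URI that bare link-type keywords resolve against, a
// prefixed value and a bare keyword expand to the same term, even
// though HTML4/HTML5 treat the two attribute values as different.
function expandCurie(value, prefixes, defaultVocab) {
  const i = value.indexOf(':');
  if (i > 0) {
    const prefix = value.slice(0, i);
    if (Object.prototype.hasOwnProperty.call(prefixes, prefix)) {
      return prefixes[prefix] + value.slice(i + 1);
    }
  }
  return defaultVocab + value; // treat as a bare link-type keyword
}
```

With the XHTML vocabulary as both the 'n' mapping and the default, 'n:next' and 'next' collapse to one term, which is exactly the compatibility concern being raised.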
My only point is that it is not compatible with HTML4 and HTML5, because they end up with different results in the same situation (one can treat two different values as the same, while the other can treat two different values as different). It is only not compatible with HTML5 if this community chooses for it to not be compatible with HTML5. Do you agree or disagree that we shouldn't second-guess the author's intentions if they go out of their way to declare a mapping for 'n'? I don't think that's a relevant question. My point is that it is possible in RDFa to put two strings that have different semantics in HTML4 and yet have them have the same semantics in RDFa. This means RDFa is not compatible with HTML4. Another example would be: <html xmlns="http://www.w3.org/1999/xhtml"> <head about=""> <link rel="stylesheet alternate next" href="..."> ... ...which in RDFa would cause the following triples to be created: <http://www.w3.org/1999/xhtml/vocab#stylesheet> ... . <http://www.w3.org/1999/xhtml/vocab#alternate> ... . <http://www.w3.org/1999/xhtml/vocab#next> ... . ...but according to
[whatwg] Helping people seaching for content filtered by license
One of the use cases I collected from the e-mails sent in over the past few months was the following:

USE CASE: Help people searching for content to find content covered by licenses that suit their needs.

SCENARIOS:
* If a user is looking for recipes of pies to reproduce on his blog, he might want to exclude from his results any recipes that are not available under a license allowing non-commercial reproduction.
* Lucy wants to publish her papers online. She includes an abstract of each one in a page, but because they are under different copyright rules, she needs to clarify what the rules are. A harvester such as the Open Access project can actually collect and index some of them with no problem, but may not be allowed to index others. Meanwhile, a human finds it more useful to see the abstracts on a page than have to guess from a bunch of titles whether to look at each abstract.
* There are mapping organisations and data producers and people who take photos, and each may place different policies. Being able to keep that policy information helps people with further mashups avoid violating a policy. For example, if GreatMaps.com has a public domain policy on their maps, CoolFotos.org has a policy that you can use data other than images for non-commercial purposes, and Johan Ichikawa has a photo there of my brother's cafe, which he has licensed as "must pay money", then it would be reasonable for me to copy the map and put it in a brochure for the cafe, but not to copy the data and photo from CoolFotos. On the other hand, if I am producing a non-commercial guide to cafes in Melbourne, I can add the map and the location of the cafe photo, but not the photo itself.
* Tara runs a video sharing web site for people who want licensing information to be included with their videos. When Paul wants to blog about a video, he can paste a fragment of HTML provided by Tara directly into his blog. The video is then available inline in his blog, along with any licensing information about the video.
* Fred's browser can tell him what license a particular video on a site he is reading has been released under, and advise him on what the associated permissions and restrictions are (can he redistribute this work for commercial purposes, can he distribute a modified version of this work, how should he assign credit to the original author, what jurisdiction the license assumes, whether the license allows the work to be embedded into a work that uses content under various other licenses, etc).
* Flickr has images that are CC-licensed, but the pages themselves are not.
* Blogs may wish to reuse CC-licensed images without licensing the whole blog as CC, but while still including attribution and license information (which may be required by the licenses in question).

REQUIREMENTS:
* Content on a page might be covered by a different license than other content on the same page.
* When licensing a subpart of the page, existing implementations must not just assume that the license applies to the whole page rather than just part of it.
* License proliferation should be discouraged.
* License information should be able to survive from one site to another as the data is transferred.
* Expressing copyright licensing terms should be easy for content creators, publishers, and redistributors to provide.
* It should be more convenient for the users (and tools) to find and evaluate copyright statements and licenses than it is today.
* Shouldn't require the consumer to write XSLT or server-side code to process the license information.
* Machine-readable licensing information shouldn't be on a separate page than human-readable licensing information.
* There should not be ambiguous legal implications.
* Parsing rules should be unambiguous.
* Should not require changes to HTML5 parsing rules. 
The scenarios described above fall into three categories: searching for content, publishing content, and obtaining legal advice. First, I will examine the search scenario: * If a user is looking for recipes of pies to reproduce on his blog, he might want to exclude from his results any recipes that are not available under a license allowing non-commercial reproduction. This is technically possible today. The rel=license link type allows authors to specify the license that applies to the main content on a page, in this case recipes; search engines can be programmed with the most common licenses, and the user can tell the search engine what characteristics he wants (compatible with GPLv2, no advertising clause, doesn't have patent implications,
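The "technically possible today" claim rests on rel being a set of space-separated tokens matched ASCII case-insensitively. A minimal sketch of the check a search engine or harvester might apply (hasLicenseRel is a hypothetical helper):

```javascript
// rel="" holds space-separated tokens, compared ASCII
// case-insensitively, so "license" must match as a whole token:
// rel="License next" qualifies, rel="licenses" does not.
function hasLicenseRel(relAttr) {
  return relAttr
    .trim()
    .split(/\s+/)
    .some((token) => token.toLowerCase() === 'license');
}
```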
[whatwg] microdata use cases and Getting data out of poorly written Web pages
It's difficult to tell where one should comment on the so-called microdata use cases. I'm forced to send to multiple mailing lists. Ian, I would like to see the original request that went into this particular use case. In particular, I'd like to know who originated it, so that we can ensure that the person has read your follow-up, as well as how you condensed the use case down (to check if your interpretation is proper or not). In addition, from my reading of this posting of yours titled [whatwg] Getting data out of poorly written Web pages, is this open for any discussion? It seems to me that you received the original data, generated a use case document from the data, unilaterally, and now you're making unilateral decisions as to whether the use case requires a change in HTML5 or not. Is this what we can expect from all of the use cases? Shelley
Re: [whatwg] microdata use cases and Getting data out of poorly written Web pages
On Fri, 8 May 2009, Shelley Powers wrote: It's difficult to tell where one should comment on the so-called microdata use cases. I'm forced to send to multiple mailing lists. Please don't cross-post to the WHATWG list and other lists -- you may pick either one, I read all of them. (Cross-posting results in a lot of confusion because some of the lists only allow members to post, while others allow anyone to post, so we end up with fragmented threads.) Ian, I would like to see the original request that went into this particular use case. In particular, I'd like to know who originated it, so that we can ensure that the person has read your follow-up, as well as how you condensed the use case down (to check if your interpretation is proper or not). I did not keep track of where the use cases came from (I generally ignore the source of requests so as to avoid any possible bias). However, I can probably figure out some of the sources of a particular scenario if you have a specific one in mind. Could you clarify which scenario or requirement you are particularly interested in? In addition, from my reading of this posting of yours titled [whatwg] Getting data out of poorly written Web pages, is this open for any discussion? Naturally, all input is always welcome. It seems to me that you received the original data, generated a use case document from the data, unilaterally, and now you're making unilateral decisions as to whether the use case requires a change in HTML5 or not. Is this what we can expect from all of the use cases? Yes. If my proposals don't actually address the use cases, then please do point out how that is the case. Similarly, if there are missing use cases, please bring them up. All input is always welcome (whether on the lists, or direct e-mail, on blogs, or wherever). None of the text in the HTML5 spec is frozen, it's merely a proposal. If there are use cases that should be addressed that are not addressed then we should address them. 
(Regarding microdata note that I've so far only sent proposals for three of the 20 use cases that I collected. I've still got a lot to go through.) -- Ian Hickson, http://ln.hixie.ch/ "Things that are impossible just take longer."
Re: [whatwg] microdata use cases and Getting data out of poorly written Web pages
Ian Hickson wrote: On Fri, 8 May 2009, Shelley Powers wrote: It's difficult to tell where one should comment on the so-called microdata use cases. I'm forced to send to multiple mailing lists. Please don't cross-post to the WHATWG list and other lists -- you may pick either one, I read all of them. (Cross-posting results in a lot of confusion because some of the lists only allow members to post, while others allow anyone to post, so we end up with fragmented threads.) But different people respond to the mailings in different ways, depending on the list. This isn't just you, Ian. How can I ensure that the W3C people have access to the same concerns? Ian, I would like to see the original request that went into this particular use case. In particular, I'd like to know who originated it, so that we can ensure that the person has read your follow-up, as well as how you condensed the use case down (to check if your interpretation is proper or not). I did not keep track of where the use cases came from (I generally ignore the source of requests so as to avoid any possible bias). Documenting the originator of a use case is introducing bias? In what universe? If anything, documenting where the use cases come from, and providing access to the original, raw data helps to ensure that bias has not been introduced. More importantly, it gives your teammates a chance to verify your interpretation of the use cases, and provide corrections, if needed. However, I can probably figure out some of the sources of a particular scenario if you have a specific one in mind. Could you clarify which scenario or requirement you are particularly interested in? Ian, I think it's important that you provide a place documenting the original raw data. This provides a historical perspective on the decisions going into HTML5, if nothing else. If you need help, I'm willing to help you. You'll need to forward me the emails you received, and send me links to the other locations. 
I'll then put all these into a document and we can work to map it to your condensed document. That way there's accountability at all steps in the decision process, as well as transparency. Once I put the document together, we can put it with the other documents that also provide a history of the decision processes. In addition, from my reading of this posting of yours titled [whatwg] Getting data out of poorly written Web pages, is this open for any discussion? Naturally, all input is always welcome. No, I didn't ask if input was welcome. I asked if this was still open for discussion, or if you have made up your mind and further discussion will just be wasting everyone's time. It seems to me that you received the original data, generated a use case document from the data, unilaterally, and now you're making unilateral decisions as to whether the use case requires a change in HTML5 or not. Is this what we can expect from all of the use cases? Yes. That's not appropriate for a team environment. If my proposals don't actually address the use cases, then please do point out how that is the case. Similarly, if there are missing use cases, please bring them up. All input is always welcome (whether on the lists, or direct e-mail, on blogs, or wherever). None of the text in the HTML5 spec is frozen; it's merely a proposal. If there are use cases that should be addressed that are not addressed, then we should address them. Again, how can I? I don't have the original data. (Regarding microdata, note that I've so far only sent proposals for three of the 20 use cases that I collected. I've still got a lot to go through.) After digging, I found another one, at http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-May/019620.html Again, though, the writing style indicates the item is closed, and discussion is not welcome. I have to assume that this is how you mentally perceive the item, and therefore though we may respond, the response will make no difference. 
And I can't find the third one. Perhaps you can provide a direct link. I'm concerned, too, about the fact that the discussion for these is happening on the WHATWG group, but not in the HTML WG email list. I've never understood the two different email lists, and have felt having both is confusing, and potentially misleading. Regardless, shouldn't this discussion be taking place in the HTML WG, too? Isn't the specification the W3C HTML5 specification, also? I'm just concerned because, from what I can see of both groups, interests and concerns differ between the groups. That means addressing issues in only one group would leave out potentially important discussions in the other group. Shelley
[whatwg] Allowing authors to annotate their documents to explain things for readers
One of the use cases I collected from the e-mails sent in over the past few months was the following: USE CASE: Allow authors to annotate their documents to highlight the key parts, e.g. as when a student highlights parts of a printed page, but in a hypertext-aware fashion. SCENARIOS: * Fred writes a page about Napoleon. He can highlight the word Napoleon in a way that indicates to the reader that that is a person. Fred can also annotate the page to indicate that Napoleon and France are related concepts. This use case isn't altogether clear, but if the target audience of the annotations is human readers (as opposed to machines and readers using automated processing tools), then it seems like this is already possible in a number of ways in HTML5. The easiest way of addressing this is just to include text bringing the user's attention to relationships: <p>This page is about Napoleon. He was my uncle and lived in France.</p> Individual keywords can be highlighted with <b>: <p>This page is about <b>Napoleon</b>. He was my uncle and lived in <b>France</b>.</p> Prose annotations can be added to individual words or phrases using the title= attribute: <p>This page is about <span title="A person">Napoleon</span>. He was my uncle and lived in <span title="A hamlet near Drummond, in Idaho, USA">France</span>.</p> These typically show as tooltips. To highlight material on the page that might be relevant to the user, e.g. if the user searched for the word Uncle and the site wanted to highlight the word Uncle, the <mark> element can be used: <p>This page is about Napoleon. He was my <mark>uncle</mark> and lived in France.</p> The same element can be used by a reader editing an existing document to highlight the parts that warrant further study, possibly using the title= attribute to include notes: <p>This page is about Napoleon. He was my uncle and <mark title="really?">lived in France</mark>.</p> Links can be used to link parts of a document together to indicate relationships: <p id="napoleon">My uncle was called Napoleon. 
See also: <a href="#france">France</a>, <a href="#uncle">Uncle</a>.</p> ... <p id="france">France is a hamlet near Drummond, ID. My uncle lived there. See also: <a href="#napoleon">Napoleon</a>.</p> In conclusion, this use case doesn't seem to need any new changes to the language. A number of further use cases remain to be examined, including some more specifically looking at machine-readable annotations rather than annotations aimed directly at human readers. I will send further e-mail next week as I address them. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
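For reference, the techniques listed above can be combined in a single document; a minimal sketch (the ids, title texts, and link targets are the illustrative ones from the examples above, not a definitive markup recommendation):

```html
<!-- Combining the annotation techniques from the examples above:
     <b> for keywords, title="" for prose notes shown as tooltips,
     <mark> for highlighted spans, and ids + links for indicating
     relationships between parts of the document. -->
<p id="napoleon">This page is about
  <b><span title="A person">Napoleon</span></b>.
  He was my <mark>uncle</mark> and lived in
  <span title="A hamlet near Drummond, in Idaho, USA">France</span>.
  See also: <a href="#france">France</a>.</p>

<p id="france"><b>France</b> is a hamlet near Drummond, ID.
  My uncle lived there.
  See also: <a href="#napoleon">Napoleon</a>.</p>
```

All of these annotations are aimed at human readers; as the email notes, machine-readable annotations are a separate set of use cases still to be addressed.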
Re: [whatwg] video/audio feedback
On Sat, May 9, 2009 at 2:25 AM, David Singer sin...@apple.com wrote: At 23:46 +1000 8/05/09, Silvia Pfeiffer wrote: On Fri, May 8, 2009 at 9:43 AM, David Singer sin...@apple.com wrote: At 8:45 +1000 8/05/09, Silvia Pfeiffer wrote: On Fri, May 8, 2009 at 5:04 AM, David Singer sin...@apple.com wrote: At 8:39 +0200 5/05/09, Kristof Zelechovski wrote: If the author wants to show only a sample of a resource and not the full resource, I think she does it on purpose. It is not clear why it is vital for the viewer to have an _obvious_ way to view the whole resource instead; if it were the case, the author would provide for this. IMHO, Chris It depends critically on what you think the semantics of the fragment are. In HTML (the best analogy I can think of), the web page is not trimmed or edited in any way -- you are merely directed to one section of it. There are critical differences between HTML and video, such that this analogy has never worked well. Could you elaborate? At the risk of repeating myself ... HTML is text, and therefore whether you download only a snippet or the full page and then jump to an offset does not make much of a difference, even for a long page. You might try loading, say, the one-page version of the HTML5 spec from the WHATWG site... it takes quite a while. Happily, Ian also provides a multi-page version, but this is not always the case. That just confirms the problem, and it's obviously worse with video. :-) The reason I want clarity is that this has ramifications. For example, if a UA is asked to play a video with a fragment indication #time=10s-20s, and a script then seeks to 5s, does the user see the video at the 5s point of the total resource, or at 15s? I think it has to be 5s. I agree, it has to be 5s. The discussion was about what timeline is displayed and what the user can easily access by seeking through the displayed timeline. A script can access any time, of course. But a user is restricted by what the user interface offers. 
So, the difference is that in HTML the user agent will always have the context available within its download buffer, while for video this may not be the case. I'm sorry, I am lost. We could quite easily extend HTTP to allow for anchor-based retrieval of HTML (i.e. convert a 'please start at anchor X' into a pair of byte-range responses: one for the global material, and then the document from that anchor onwards). Yes, but that's not the way it currently works, and it is not a proposal currently under discussion. This admittedly technical difference also has an influence on the user interface. If you have all the context available in the user agent, it is easy to just grab a scroll-bar and jump around in the full content manually to look for things. This is not possible in the video case without many further download actions, each of which will incur a network delay. This difference opens the door for user agents to choose whether to display the full context or just the fragment focus. But we can optimize for the fragment without disallowing seeking. What do you mean by optimize for the fragment? Of course, none of this discussion will inherently disallow seeking - scripts will always be able to do the seeking. But the user may not find it easy to seek to a section that is not accessible through the displayed timeline, which can be both a good and a bad thing. Cheers, Silvia.
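The seek semantics the thread converges on can be sketched in JavaScript. This is illustrative only: the `#time=10s-20s` syntax is the one quoted in the emails (not a final specification), and the function names are my own. The key point agreed above is that a script seek addresses the total resource timeline, while a UA that displays only the fragment's timeline must map displayed positions back into the full resource:

```javascript
// Sketch: parse the "#time=10s-20s" style fragment quoted in the
// thread into a { start, end } pair of seconds. Returns null if the
// hash doesn't match that shape.
function parseTimeFragment(hash) {
  const m = /^#time=(\d+(?:\.\d+)?)s-(\d+(?:\.\d+)?)s$/.exec(hash);
  if (!m) return null;
  return { start: parseFloat(m[1]), end: parseFloat(m[2]) };
}

// Per the agreement in the thread ("it has to be 5s"), a script seek
// addresses the *total* resource timeline: the fragment does not
// offset it, so seeking to 5s shows the 5s point of the full video.
function resolveScriptSeek(fragment, seconds) {
  return seconds;
}

// A UA that chooses to display only the fragment's timeline would map
// a position on that displayed timeline back into the full resource
// by offsetting from the fragment's start.
function displayedToResourceTime(fragment, displayedSeconds) {
  if (!fragment) return displayedSeconds;
  return fragment.start + displayedSeconds;
}
```

For example, with the fragment `#time=10s-20s`, a script seek to 5s plays from 5s of the total resource, while the 5s mark on a fragment-only timeline corresponds to 15s of the resource, which is exactly the ambiguity the thread is pinning down.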