Re: [whatwg] Implementation difficulties for MediaController
On Mar 27, 2011, at 8:01 PM, Ian Hickson wrote:

> It's been brought to my attention that there are aspects of the MediaController design that are hard to implement; in particular around the ability to synchronise or desynchronise media while it is playing back. To help with this, I propose to put in some blocks on the API in the short term so that things that are hard to implement today will simply throw exceptions or otherwise fail in detectable and predictable ways. However, to do that I need a better idea of what exactly is hard to implement. It would be helpful if you could describe exactly what is easy and what is hard (that is, glitchy or simply unsupported by common media frameworks) in terms of media synchronisation, in particular along the following axes:

Hi Ian,

Contained is Eric's and my feedback as to the difficulty of implementing this proposal in Apple's port of WebKit:

> * multiple in-band tracks vs multiple independent files

Playing in-band tracks from a single element will always be more efficient than playing multiple independent files or tracks, because the media engine can optimize its I/O and decoding pipelines at the lowest level.

> * playing tracks synchronised at different offsets

However, if the in-band tracks are played at different time offsets, or at different rates, playback becomes just as inefficient as playing independent files. To implement this we will have to open two instances of a movie, enable different tracks on each, and then play the two instances in sync.

> * playing tracks at different rates

In addition to the limitation listed above, efficient playback of tracks at different rates will require all tracks to be played in the same direction.

> * changing any of the above while media is playing vs when it is stopped

Modifying the media groups while the media is playing is probably impossible to do without stalling.
The media engine may have thrown out unneeded data from disabled tracks and may have to rebuffer that data, even in the case of in-band tracks.

> * adding or removing tracks while media is playing vs when it is stopped

As above.

> * changing overall playback rate while a synced set of media is playing

This is possible to do efficiently.

> Based on this I can then limit the API accordingly. (Any other feedback you may have on this proposed API is of course also very welcome.)

From a user's point of view, your proposal seems more complicated than the basic use cases merit. For example, attempting to fix the synchronization of improperly authored media with micro-adjustments of the playback rate isn't likely to be very successful or accurate. The metronome case, while an interesting experiment, would be better served through something like the proposed Audio API.

Slaving multiple media elements' playback rate and current time to a single master media element, as in Silvia and Eric's proposal, seems to achieve the needs of the broadest use cases. If adding independent playback rates becomes necessary later, adding this support in a future revision will be possible.

-Jer

Jer Noble
jer.no...@apple.com
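As a concrete sketch of the "multiple independent files" case discussed above, the proposal's mediaGroup attribute slaves separate elements to a single controller. This markup is a hypothetical illustration (file names invented), not a quote from the spec:

```html
<!-- Hypothetical sketch: a movie and a separate commentary file slaved
     together via the mediaGroup attribute from the MediaController
     proposal. The UA creates one MediaController shared by both. -->
<video src="movie.webm" mediaGroup="movie"></video>
<audio src="commentary.ogg" mediaGroup="movie"></audio>
```

This is the configuration Eric and I describe as inherently less efficient than in-band tracks, since the engine cannot share an I/O and decoding pipeline across the two files.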
Re: [whatwg] Implementation difficulties for MediaController
On Mar 29, 2011, at 9:05 PM, Ian Hickson wrote:

> On Tue, 29 Mar 2011, Jer Noble wrote:
>> Contained is Eric's and my feedback as to the difficulty of implementing this proposal in Apple's port of WebKit:

> Thank you very much for your feedback. I'll look into it more tomorrow when I update the spec, but in the meantime I had some additional questions:

>>> * playing tracks synchronised at different offsets

>> However, if the in-band tracks are played at different time offsets, or at different rates, playback becomes just as inefficient as playing independent files. To implement this we will have to open two instances of a movie, enable different tracks on each, and then play the two instances in sync.

> Is that acceptable? That is, are you ok with implementing multiple file (or two instances of the same file at different offsets) synchronization?

Yes, this would be acceptable.

>>> * playing tracks at different rates

>> In addition to the limitation listed above, efficient playback of tracks at different rates will require all tracks to be played in the same direction.

> Ah, interesting. Is it acceptable to implement multiple playback at different rates if they're all in the same direction, or would you (at least for now) be significantly helped by forcing the playback rates to be the same for all slaved media tracks?

It would be significantly easier to implement an across-the-board playback rate for all media elements in a media group. This seems like a reasonable restriction for the first version of the API.

>>> * changing any of the above while media is playing vs when it is stopped

>> Modifying the media groups while the media is playing is probably impossible to do without stalling. The media engine may have thrown out unneeded data from disabled tracks and may have to rebuffer that data, even in the case of in-band tracks.

> That makes sense.
> There are several ways to handle this; the simplest is probably to say that when the list of synchronised tracks is changed, or when the individual offsets of each track or the individual playback rates of each track are changed, the playback of the entire group should be automatically stopped. Is that sufficient?

I would say that, instead, it would be better to treat this as similar to seeking into an unbuffered region of a media file. Some implementers will handle this case better than others, so this seems to be a Quality of Service issue.

> (In the future, if media frameworks optimise these cases, or if hardware advances sufficiently that even inefficient implementations of this are adequate, we could add a separate flag that controls whether or not this automatic pausing happens.)

It seems that this could be determined on the authors' side by pausing before operations that may cause significant buffering delays, without the need for a new flag.

>> From a user's point of view, your proposal seems more complicated than the basic use cases merit. For example, attempting to fix the synchronization of improperly authored media with micro-adjustments of the playback rate isn't likely to be very successful or accurate. The metronome case, while an interesting experiment, would be better served through something like the proposed Audio API.

> Indeed. The use cases you mention here aren't the driving factor in this design; they're just enabled mostly as a side-effect. The driving factor is to avoid the asymmetry problem described below:

>> Slaving multiple media elements' playback rate and current time to a single master media element, Silvia and Eric's proposal, seems to achieve the needs of the broadest use cases.

> The problem with this design is that it is highly asymmetric. The implementation of a media element needs to have basically two modes: slave and master, where the logic for both can be quite different.
> (Actually, three modes if you count the lone media element case as a separate mode.) This then also spills into the API, where the master exposes both the network state of its own media and the overall state of playback. We end up having to handle all kinds of special cases, such as what happens when the master track is shorter than a slaved track, or what happens when the master track is paused vs when a slaved track is paused. It's not impossible to do, but it is significantly messier than simply having a distinct master object and having all the media elements deal with only one mode (two if a lone media element counts as separate), namely the slave mode. Any asymmetry is reflected as differences between the controller and the media element. Each media element only has to deal with its own networking state, etc.
>
> For an example of why this matters, consider the use case of a movie site with the option of playing movies with a director's commentary track. Some director's commentaries are shorter than the movie (most
Re: [whatwg] How to handle multitrack media resources in HTML
On Apr 7, 2011, at 11:54 PM, Ian Hickson wrote:

>> The distinction between a master media element and a master media controller is, in my mind, mostly a distinction without a difference. However, a welcome addition to the media controller would be convenience APIs for the above properties (as well as playbackState, networkState, seekable, and buffered).

> I'm not sure what networkState means in this context. playbackState, assuming you mean 'paused', is already exposed.

Sorry, by playbackState, I meant readyState. And I was suggesting that, much in the same way that you've provided .buffered and .seekable properties which expose the intersection of the slaved media elements' corresponding ranges, a readyState property could similarly reflect the readyState values of all the slaved media elements. In this case, the MediaController's hypothetical readyState wouldn't flip to HAVE_ENOUGH_DATA until all the constituent media elements' ready states reached at least that value.

Of course, this would imply that the load events fired by a media element (e.g. loadedmetadata, canplaythrough) were also fired by the MediaController, and I would support this change as well.

Again, this would be just a convenience for authors, as this information is already available in other forms and could be relatively easily calculated on the fly in scripts. But UAs are likely going to have to do these calculations anyway to support things like autoplay, so adding explicit support for them in API form would not (imho) be unduly burdensome.

-Jer

Jer Noble
jer.no...@apple.com
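The intersection behavior referred to above can be sketched as a plain function over [start, end] pairs. This is a simplified stand-in for TimeRanges, not the spec's actual algorithm:

```javascript
// Sketch: intersect several elements' buffered (or seekable) ranges, as a
// MediaController-style convenience property would report them.
// Each range list is an array of [start, end] pairs, sorted and disjoint.
function intersectRanges(a, b) {
  const out = [];
  for (const [aStart, aEnd] of a) {
    for (const [bStart, bEnd] of b) {
      const start = Math.max(aStart, bStart);
      const end = Math.min(aEnd, bEnd);
      if (start < end) out.push([start, end]);
    }
  }
  return out;
}

// Fold the pairwise intersection across all slaved elements' range lists.
function intersectAll(rangeLists) {
  return rangeLists.reduce(intersectRanges);
}
```

For example, two elements buffered over [0, 10] and [5, 15] share only [5, 10] as a commonly playable region.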
Re: [whatwg] How to handle multitrack media resources in HTML
On Apr 11, 2011, at 5:26 PM, Ian Hickson wrote:

> On Fri, 8 Apr 2011, Jer Noble wrote:
>> Sorry, by playbackState, I meant readyState. And I was suggesting that, much in the same way that you've provided .buffered and .seekable properties which expose the intersection of the slaved media elements' corresponding ranges, a readyState property could similarly reflect the readyState values of all the slaved media elements. In this case, the MediaController's hypothetical readyState wouldn't flip to HAVE_ENOUGH_DATA until all the constituent media elements' ready states reached at least that value.

> So basically it would return the lowest possible value amongst the slaved elements? I guess we could expose such a convenience accessor, but what's the use case? It seems easy enough to implement manually in JS, so unless there's a compelling case, I'd be reluctant to add it.

Yes, this would be just a convenience, as I tried to make clear below. So I don't want to seem like I'm pushing this too hard. But since you asked...

>> Of course, this would imply that the load events fired by a media element (e.g. loadedmetadata, canplaythrough) were also fired by the MediaController, and I would support this change as well.

> I don't see why it would imply that, but certainly we could add events like that to the controller. Again though, what's the use case?

The use case for the events is the same one as for the convenience property: without a convenience event, authors would have to add event listeners to every slave media element. So by "imply", I simply meant that if the use case for the first was compelling enough to warrant new API, the second would be warranted as well.

Let's say, for example, an author wants to change the color of a play button when the media in a media group all reaches the HAVE_ENOUGH_DATA readyState.
Current API:

  function init() {
    var mediaGroupElements = document.querySelectorAll('*[mediaGroup="group1"]');
    for (var i = 0; i < mediaGroupElements.length; ++i)
      mediaGroupElements.item(i).addEventListener('canplaythrough', readyStateChangeListener, false);
  }

  function readyStateChangeListener(e) {
    var mediaGroupElements = document.querySelectorAll('*[mediaGroup="group1"]');
    var ready = mediaGroupElements.length > 0;
    for (var i = 0; i < mediaGroupElements.length; ++i)
      if (mediaGroupElements.item(i).readyState < HAVE_ENOUGH_DATA)
        ready = false;
    if (ready)
      changePlayButtonColor();
  }

Convenience API:

  function init() {
    var controller = document.querySelector('*[mediaGroup="group1"]').controller;
    controller.addEventListener('canplaythrough', changePlayButtonColor, true);
  }

I think the convenience benefits are pretty obvious. Maybe not compelling enough, however. :)

>> Again, this would be just a convenience for authors, as this information is already available in other forms and could be relatively easily calculated on the fly in scripts. But UAs are likely going to have to do these calculations anyway to support things like autoplay, so adding explicit support for them in API form would not (imho) be unduly burdensome.

> Autoplay is handled without having to do these calculations, as far as I can tell. I don't see any reason why the UA would need to do these calculations, actually. If there are compelling use cases, though, I'm happy to add such accessors.

Well, how exactly is autoplay handled in a media group? Does the entire media group start playing when the first media element in a group with its autoplay attribute set reaches HAVE_ENOUGH_DATA?

-Jer

Jer Noble
jer.no...@apple.com
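The aggregation rule under discussion (the controller reports the lowest readyState among the slaved elements) can be sketched as a pure function. The constants mirror HTMLMediaElement's; the function name is hypothetical:

```javascript
// HTMLMediaElement readyState constants (per the HTML spec).
const HAVE_NOTHING = 0;
const HAVE_ENOUGH_DATA = 4;

// Sketch: a MediaController-style readyState would be the minimum of the
// slaved media elements' readyState values.
function controllerReadyState(readyStates) {
  return readyStates.length
    ? Math.min.apply(null, readyStates)
    : HAVE_NOTHING;
}
```

Under this rule, a hypothetical 'canplaythrough' on the controller would fire only once every slaved element has reached HAVE_ENOUGH_DATA.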
[whatwg] Full Screen API Feedback
WebKit is in the process of implementing Mozilla's proposed Full Screen API https://wiki.mozilla.org/Gecko:FullScreenAPI. Basic full screen support is available in WebKit Nightlies http://nightly.webkit.org/ on Mac and Windows (other ports are adding support as well), and can be enabled through user defaults (WebKitFullScreenEnabled=1).

To test the feasibility of this API, we have mapped the full screen button in the default controls of video elements to this new API. The webkit-only webkitenterfullscreen() method on HTMLMediaElement has also been mapped to this new API. In so doing, we have been able to collect test case results from live websites. In this process, I believe we have uncovered a number of issues with the API proposal as it currently stands that I'd like to see addressed.

1. Z-index as the primary means of elevating full screen elements to the foreground.

The spec suggests that a full screen element is given a z-index of BIGNUM in order to cause the full screen element to be visible on top of the rest of the page content. The spec also notes that it is possible for a document to position content over an element with the :full-screen pseudo-class, for example if the :full-screen element is in a container with z-index not 'auto'. In our testing, we have found that this caveat causes extreme rendering issues on many major video-serving websites, including Vimeo and Apple.com.

In order to fix rendering under the new full-screen API to be on par with WebKit's existing full-screen support for video elements, we chose to add a new pseudo-class and associated style rule to forcibly reset z-index styles and other stacking-context styles. This is of course not ideal, and we have only added this fix for full screen video elements. This rendering quirk makes it much more difficult for authors to elevate a single element to full-screen mode without modifying styles on the rest of their page.

Proposal: the current API proposal simply recommends a set of CSS styles.
The proposal should instead require that no other elements render above the current full-screen element and its children, and leave it up to implementers to achieve that requirement. (E.g., WebKit may implement this by walking up the ancestors of the full-screen element, disabling any styles which create stacking contexts.)

2. Animating into and out of full screen.

WebKit's current video full-screen support will animate an element between its full-screen and non-full-screen states. This has both security and user experience benefits. However, with the current z-index-based rendering technique recommended by the proposed Full Screen API, animating the full-screen transition is extremely difficult.

Proposal: the full-screen element should create a new view, separate from its parent document's view. This would allow the UA to resize and animate the view separately from the parent document's view. This would also solve issue 1 above.

3. fullscreenchange events and their targets.

The current proposal states that a fullscreenchange event must be dispatched when a document enters or leaves full-screen. Additionally, when the event is dispatched, if the document's current full-screen element is an element in the document, then the event target is that element; otherwise the event target is the document. This has the side effect that, if an author adds an event listener for this event to an element, he will get notified when that element enters full screen, but never when it exits full screen (if the current full-screen element is cleared, as it should be, before the event is dispatched). In addition, if the current full-screen element is changed while in full-screen mode (e.g. by calling requestFullScreen() on a different element), then an event will be dispatched to only one of the two possible targets.

Proposal: split the fullscreenchange event into two: fullscreenentered and fullscreenexited (or some variation thereof), and fire each at the appropriate element.

4.
A lack of rejection.

The current proposal provides no notification to authors that a request to enter full screen has been denied. From a UA implementer's perspective, this makes writing test cases much more difficult. From an author's perspective, it makes failing over to another full screen technique (such as a full-window substitute mode) impossible.

Proposal: add a fullscreenrequestdenied event and require it to be dispatched when and if the UA denies a full-screen request.

Thanks,

-Jer

Jer Noble
jer.no...@apple.com
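A minimal sketch of the stacking-context problem from issue 1 (class names hypothetical): because the ancestor's z-index establishes a stacking context, the full-screen element's huge z-index only competes with siblings inside that container, and unrelated page content can still paint on top.

```css
/* The container establishes a stacking context. */
.player { position: relative; z-index: 1; }

/* Per the spec's recommended styles, the full-screen element gets a huge
   z-index, but it remains trapped inside .player's stacking context. */
.player video:full-screen { position: fixed; z-index: 2147483647; }

/* A sibling of .player with a higher z-index still renders above the
   full-screen video. */
.overlay { position: relative; z-index: 2; }
```

This is the pattern we observed on Vimeo and Apple.com, where page chrome popped in front of the full-screen video.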
Re: [whatwg] Full Screen API Feedback
On May 11, 2011, at 3:03 PM, Jonas Sicking wrote:

> On Wed, May 11, 2011 at 11:27 AM, Jer Noble jer.no...@apple.com wrote:
>> 3. fullscreenchange events and their targets.
>>
>> The current proposal states that a fullscreenchange event must be dispatched when a document enters or leaves full-screen. Additionally, when the event is dispatched, if the document's current full-screen element is an element in the document, then the event target is that element; otherwise the event target is the document. This has the side effect that, if an author adds an event listener for this event to an element, he will get notified when that element enters full screen, but never when it exits full screen (if the current full-screen element is cleared, as it should be, before the event is dispatched). In addition, if the current full-screen element is changed while in full-screen mode (e.g. by calling requestFullScreen() on a different element), then an event will be dispatched to only one of the two possible targets.
>>
>> Proposal: split the fullscreenchange event into two: fullscreenentered and fullscreenexited (or some variation thereof), and fire each at the appropriate element.

> Couldn't you simply define that fullscreenchange is fired after the fullscreen is cleared, but still fire it on the element which used to be the fullscreened element? It's nicer for authors to not have to deal with two events.

That takes care of one case. But for the case where the full-screen element changes due to requestFullScreen() being called while already in full-screen mode, which element should you fire the fullscreenchange event at? The first or the second? Or both? This would be made much clearer if the element which lost current full-screen element status received one message, and the one gaining that status received another.

If requiring authors to deal with two event names is too cumbersome (which I'm not sure I agree with), perhaps a new Event type is in order.
Something like FullScreenEvent.targetEntered and FullScreenEvent.targetExited (or similar) would also solve the problem.

>> 4. A lack of rejection.
>>
>> The current proposal provides no notification to authors that a request to enter full screen has been denied. From a UA implementer's perspective, this makes writing test cases much more difficult. From an author's perspective, it makes failing over to another full screen technique (such as a full-window substitute mode) impossible.
>>
>> Proposal: add a fullscreenrequestdenied event and require it to be dispatched when and if the UA denies a full-screen request.

> Wasn't the idea that if the user denies the fullscreen request, the browser can still full-window the element inside the normal browser window, thus taking care of the substitute for the website?

Was it? It doesn't seem to be in the proposed API document. And absent any explicit requirement that the browser also implement a pseudo-full-screen mode, I think that the above event is still necessary.

-Jer

Jer Noble
jer.no...@apple.com
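The two-event proposal can be sketched as a pure routing function (event names hypothetical, per the proposal above): given the elements losing and gaining current full-screen element status, it says which event fires at which target, resolving the ambiguity when requestFullScreen() is called while already in full-screen mode.

```javascript
// Sketch: compute which proposed events fire for a full-screen transition.
// `prev` is the element losing full-screen status, `next` the one gaining
// it; either may be null.
function fullscreenTransitionEvents(prev, next) {
  const events = [];
  if (prev && prev !== next) {
    events.push({ type: 'fullscreenexited', target: prev });
  }
  if (next && next !== prev) {
    events.push({ type: 'fullscreenentered', target: next });
  }
  return events;
}
```

For example, calling requestFullScreen() on element B while element A is full screen would fire fullscreenexited at A and fullscreenentered at B, whereas a single fullscreenchange event could only target one of them.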
Re: [whatwg] Full Screen API Feedback
On May 11, 2011, at 7:41 PM, Robert O'Callahan wrote:

> On Thu, May 12, 2011 at 6:27 AM, Jer Noble jer.no...@apple.com wrote:
>> 1. Z-index as the primary means of elevating full screen elements to the foreground.
>>
>> The spec suggests that a full screen element is given a z-index of BIGNUM in order to cause the full screen element to be visible on top of the rest of the page content. The spec also notes that it is possible for a document to position content over an element with the :full-screen pseudo-class, for example if the :full-screen element is in a container with z-index not 'auto'. In our testing, we have found that this caveat causes extreme rendering issues on many major video-serving websites, including Vimeo and Apple.com.

> Can you describe these issues in more detail?

Sure. Here's what Vimeo looked like in full-screen mode: http://i.imgur.com/Rl4Gp.png. And Apple.com: http://i.imgur.com/71Glg.png.

Each page has already placed the video element in a stacking context one way or another. And so, even though the full screen element has a large z-index, many other elements of the page pop in front of it.

>> In order to fix rendering under the new full-screen API to be on par with WebKit's existing full-screen support for video elements, we chose to add a new pseudo-class and associated style rule to forcibly reset z-index styles and other stacking-context styles. This is of course not ideal, and we have only added this fix for full screen video elements. This rendering quirk makes it much more difficult for authors to elevate a single element to full-screen mode without modifying styles on the rest of their page.
>>
>> Proposal: the current API proposal simply recommends a set of CSS styles. The proposal should instead require that no other elements render above the current full-screen element and its children, and leave it up to implementers to achieve that requirement.
>> (E.g., WebKit may implement this by walking up the ancestors of the full-screen element, disabling any styles which create stacking contexts.)

> This could have side effects observable to the page. I'd prefer to standardize exactly what happens here.

I agree that an explicit requirement is desirable.

>> 2. Animating into and out of full screen.
>>
>> WebKit's current video full-screen support will animate an element between its full-screen and non-full-screen states. This has both security and user experience benefits. However, with the current z-index-based rendering technique recommended by the proposed Full Screen API, animating the full-screen transition is extremely difficult.
>>
>> Proposal: the full-screen element should create a new view, separate from its parent document's view. This would allow the UA to resize and animate the view separately from the parent document's view. This would also solve issue 1 above.

> I'm not sure what you mean exactly by a new view. Depending on what you mean, that could create all kinds of implementation and spec issues. For example, if an element can have different style or layout in the two views, DOM APIs that return those things become ambiguous. I would strongly object to that.

I'm not suggesting that the element exists in two views simultaneously, but rather that it becomes the root of a new viewport.

> It seems to me you could animate the transition without having multiple concurrent views. For example, freeze the rendering of the document in its browser window, put the document into the fullscreen state, and display it in a popup window that starts off matching the geometry of the fullscreen element and zooms out to cover the screen.

That is much more difficult than it sounds. :) Freezing the non-full-screen content is already undesirable. The animation can take an arbitrary amount of time to complete, and any animations or dynamic content will appear to hang until the animation completes or the dynamic content is obscured.
But you're right that it would be required in order for this technique to work at all. I've managed to implement a full screen animation which allows the non-full-screen content to continue live during the full screen animation, but it relies on hardware acceleration and required a large number of intrusive changes to the rendering engine.

Creating a new viewport for the full-screen content would serve the same purpose, and it would solve the z-index issue as well. Resizing the full-screen viewport wouldn't affect the layout of the non-full-screen content, allowing for efficient animation of just the full-screen element and its children.

>> 3. fullscreenchange events and their targets.
>>
>> The current proposal states that a fullscreenchange event must be dispatched when a document enters or leaves full-screen. Additionally, when the event is dispatched, if the document's current full-screen element is an element in the document, then the event target is that element
Re: [whatwg] Full Screen API Feedback
On May 11, 2011, at 11:25 PM, Robert O'Callahan wrote:

> On Thu, May 12, 2011 at 4:45 PM, Jer Noble jer.no...@apple.com wrote:
>> 2. Animating into and out of full screen.
>>
>> WebKit's current video full-screen support will animate an element between its full-screen and non-full-screen states. This has both security and user experience benefits. However, with the current z-index-based rendering technique recommended by the proposed Full Screen API, animating the full-screen transition is extremely difficult.
>>
>> Proposal: the full-screen element should create a new view, separate from its parent document's view. This would allow the UA to resize and animate the view separately from the parent document's view. This would also solve issue 1 above.

> I'm not sure what you mean exactly by a new view. Depending on what you mean, that could create all kinds of implementation and spec issues. For example, if an element can have different style or layout in the two views, DOM APIs that return those things become ambiguous. I would strongly object to that.

>> I'm not suggesting that the element exists in two views simultaneously, but rather that it becomes the root of a new viewport.

> What does that mean in CSS terms? Does the element cease to exist in the old viewport? If so, what would that mean in CSS terms?

I would imagine that, yes, the element ceases to exist in the old viewport. I'm not sure what that would mean in terms of CSS.

> Having elements in the same document be in different viewports still creates all kinds of spec and implementation issues :-(.

It very well might. The current proposal has issues of its own, though. :)

> It seems to me you could animate the transition without having multiple concurrent views. For example, freeze the rendering of the document in its browser window, put the document into the fullscreen state, and display it in a popup window that starts off matching the geometry of the fullscreen element and zooms out to cover the screen.
>> That is much more difficult than it sounds. :) Freezing the non-full-screen content is already undesirable. The animation can take an arbitrary amount of time to complete,

> Really? Why? It shouldn't take more than a second to complete, surely?

This is hypothetical, but imagine a touch-based UI where the user can pinch to enter and exit full screen. In this UI, the full-screen animation is under direct control of the user, and so can take as long as the user wants it to take.

>> 4. A lack of rejection.
>>
>> The current proposal provides no notification to authors that a request to enter full screen has been denied. From a UA implementer's perspective, this makes writing test cases much more difficult. From an author's perspective, it makes failing over to another full screen technique (such as a full-window substitute mode) impossible.
>>
>> Proposal: add a fullscreenrequestdenied event and require it to be dispatched when and if the UA denies a full-screen request.

> My main concern is that with some UI scenarios there might not be a good time to fire the denied event. For example, in Firefox 4 when an application requests geolocation, a popup appears, and if the user clicks anywhere outside the popup the popup disappears, but there is still UI allowing the user to grant the request later. If we used the same approach for fullscreen, I think we wouldn't want to fire the denied event unless the user actually selects "no" in the popup. (It would seem confusing to deny the request and then grant it later.) I'm wary of authors writing code that assumes a denied event will fire and breaks when it doesn't, or when it fires later than they expect.

>> The current API already requires that authors listen for events that may occur in the far future. I don't see how this event would be any different.

> You mean fullscreenchanged? I'm confident authors will understand that fullscreenchanged might fire late or never and will encounter that during testing.
> I'm less confident it will be obvious to authors that both fullscreenchanged and fullscreendenied might never fire and will encounter that during testing.

I'm not sure I get the distinction. In fact, it seems to me to be the opposite:

A) If an author calls requestFullScreen(), at some point in the future they will receive either a fullscreenchanged event or a fullscreendenied event.

B) If an author calls requestFullScreen(), at some point in the future they may receive a fullscreenchanged event, or not.

I'd argue that A) is easier to grasp.

>> And your geolocation example actually argues the other way: the existing geolocation API includes an asynchronous error handler that is explicitly called when a request is denied. This would be a similar if not identical use case.

> I don't necessarily agree with that part of the geolocation API :-).

Fair enough
Re: [whatwg] Full Screen API Feedback
On May 12, 2011, at 12:31 AM, Boris Zbarsky wrote:

> On 5/12/11 3:24 AM, Jer Noble wrote:
>> A) If an author calls requestFullScreen(), at some point in the future they will receive either a fullscreenchanged event or a fullscreendenied event.
>>
>> B) If an author calls requestFullScreen(), at some point in the future they may receive a fullscreenchanged event, or not.
>>
>> I'd argue that A) is easier to grasp.

> (A) is easier to grasp incorrectly too. In practice, "at some point in the future" means "maybe you'll get it, or maybe you won't", because for any finite time period the future may not have arrived yet. (B) just makes that explicit so authors don't get confused.

No, that still doesn't make sense. At the time when the user decides to allow or deny full screen access, either a fullscreenchanged or a fullscreendenied event is fired. Saying that fullscreendenied will confuse authors is akin to saying that fullscreenchanged will confuse them as well.

>>> I don't necessarily agree with that part of the geolocation API :-).
>> Fair enough. But it is an API in relatively wide use now. Have authors complained that the timing of the error handler is too confusing?

> Yes. http://stackoverflow.com/questions/5947637/function-fail-never-called-not-if-user-declines-to-share-geolocation-in-firefox for example (where the author misunderstood the difference between "denied" and "hasn't decided yet").

That doesn't seem like a confusion about the API, but with Firefox's UI. Note that they are not confused by Chrome's behavior. I don't believe that Firefox's UI decisions should justify removing what would otherwise be a useful piece of API. So far, neither you nor Roc have been able to articulate why this event should be omitted beyond vague handwaving about developer confusion. On the contrary, there are real use cases for the denial event:

- Failing over to a browser-specific full screen mechanism (such as WebKit's video element full screen mode)
- Removing or disabling the full screen button from a web app.
- If a web app requested keyboard access, re-requesting with a no-keyboard full screen mode. - General user feedback True, without the fullscreendenied event, authors will be forced to pre-fallback to a full-window mode. But with the fullscreendenied event, they can decide whether to do that, or a more traditional post-denial full-window mode. And what do they do for the arbitrarily long time before getting any event at all? Display an indeterminate progress meter? Disable the full screen button? To be quite honest, the way Firefox implements this feature seems like a usability nightmare. Surely there's a way to achieve the security benefits you're hoping for without requiring intentionally obtuse API? -Jer
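The denial-event use cases listed above are straightforward to express in code. Here is a sketch, assuming the proposed 'fullscreendenied' event name and WebKit's prefixed webkitEnterFullScreen() video method; neither is standardized, and the helper name is hypothetical:

```javascript
// Sketch only: 'fullscreendenied' is the event proposed in this
// thread, and webkitEnterFullScreen is WebKit's prefixed <video>
// full screen mechanism. Neither is guaranteed to exist in a UA.
function requestFullScreenWithFallback(video, button) {
  video.addEventListener('fullscreendenied', function () {
    if (typeof video.webkitEnterFullScreen === 'function') {
      // Fail over to the browser-specific full screen mechanism.
      video.webkitEnterFullScreen();
    } else {
      // Otherwise, stop advertising a feature the user has refused.
      button.style.display = 'none';
    }
  }, false);
  video.requestFullScreen();
}
```

Without the denial event, neither branch can ever run: the page has no signal on which to trigger its fallback.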
Re: [whatwg] Full Screen API Feedback
On May 12, 2011, at 12:54 AM, Jer Noble wrote: Surely there's a way to achieve the security benefits you're hoping for without requiring intentionally obtuse API? Okay, here's another proposal that should work with Firefox's passive permission system: Proposal: - Add a new boolean Element property canRequestFullScreen. This would map to Firefox's Never permission choice. - Add the fullscreendenied event. This would map to Firefox's Not now permission choice. Use case/Scenario: A video player would first query canRequestFullScreen to decide whether to display the full screen button in its UI. If the user hadn't previously decided to never allow this site to enter full screen, this property would return true (or perhaps maybe). Upon clicking the full screen button, the user would be presented with a notification. If the user chooses Never, a fullscreendenied event is dispatched at the requesting element, and subsequent calls to canRequestFullScreen would return false (or perhaps an empty string). In this situation, the video player would hide their full-screen button. If the user chooses Not now, a fullscreendenied event is dispatched at the requesting element, but subsequent calls to canRequestFullScreen would still return true/maybe. Alternative: The canRequestFullScreen property could be replaced with a function which takes the same flags as requestFullScreen. -Jer
Re: [whatwg] Full Screen API Feedback
On May 12, 2011, at 1:42 AM, timeless wrote: Your proposal makes it fairly easy for sites to stick up annoying, full-content-blocking "you must change your settings to proceed" elements. That ability exists in the current API as well. A malicious website would just require the fullscreenchanged event before allowing the user to continue. -Jer Jer Noble jer.no...@apple.com
Re: [whatwg] Full Screen API Feedback
On May 12, 2011, at 5:44 AM, Boris Zbarsky wrote: On 5/12/11 3:54 AM, Jer Noble wrote: No, that still doesn't make sense. At the time when the user decides to allow or deny full screen access The point is this may be never. They might just wait forever to make a decision. Saying that fullscreendenied will confuse users is akin to saying that fullscreenchanged will confuse them as well. I'm saying that if authors expect to get one or the other but then never do, that will confuse authors. Again, I fail to see how this is a problem for the denial event but not for the change event. That doesn't seem like a confusion about the API, but with Firefox's UI. Firefox's UI simply allows a user to defer the decision. There's no problem there. Right, I'm saying the developer is confused about Firefox's UI. He (apparently) expects "Not now" to generate an error. All that happened is that the _developer_ (not a user!) got confused about the meaning of "Not Now". It really does mean "I haven't decided yet", not "I'm not sharing". Exactly. I'm saying it's a UI confusion, and not one that justifies removing the error notification. I don't believe that Firefox's UI decisions should justify removing what would otherwise be a useful piece of API. The piece of API is broken, as Chrome's behavior described above shows. All it's doing is creating incorrect author expectations. I strongly disagree. Firefox's UI behavior is causing confusion, not the API. This problem is not endemic to the geolocation feature, but rather to one (or two) implementations of that feature. So far, neither you nor Roc have been able to articulate why this event should be omitted beyond vague handwaving about developer confusion. I'm not sure how "you can't depend on this event ever firing, so you have to code on the assumption that it won't fire, but the spec makes you think that it will fire" can be any clearer. I can: by adding an explicit error event.
On the contrary, there are real use cases for the denial event: - Failing over to a browser specific full screen mechanism (such as webkit's video element full screen mode) - Removing or disabling the full screen button from a web-app. - If a web app requested keyboard access, re-requesting with a no-keyboard full screen mode. - General user feedback None of these work if the event can't be expected to fire on any set schedule! Sure they can! Every single one of these can. And what do they do for the arbitrarily long time before getting any event at all? Display an indeterminate progress meter? Disable the full screen button? That doesn't seem reasonable, honestly. Once a user clicks that [x] in Chrome, what happens? They get stuck? Stuck? They're already in full screen purgatory. :) What would happen if they clicked on the full screen button again? Would Firefox pop up another notification? To be quite honest, the way Firefox implements this feature seems like a usability nightmare. It's just fine for the users. The only problem in the geolocation case is that the way the API is described creates unrealistic expectations on the part of _developers_. I don't consider the following to be a usable UI: - User clicks a full screen button - Content resizes to occupy full window - Browser pops up a permissions dialog - User has to click the Allow button* - Window then becomes full screen * This line is especially egregious. I can understand asking for permission if the original full screen request did not originate with a mouse click. Heck, I'm fine with /requiring/ full screen to initiate with a mouse click. But asking the user to confirm "did you really mean to do this?" for an easily reversible action is poor UI. If the browser inadvertently exposes the user's geolocation to a website, that's an action that can never be undone. The same is not true for the full screen case.
Now, this UI might be necessary in order to fend off unwanted, full-screen, pop-under advertising or phishing attacks. That doesn't make it good UI, but possibly a minimally bad UI. All I'm saying is that there may be an even less bad UI which would provide the same benefits. Surely there's a way to achieve the security benefits you're hoping for without requiring intentionally obtuse API? Not if we want to allow users to actually take however long they want to make the decision. Which we do. That's fine. But I still don't agree that this requires there to be no error event when the user eventually does make that decision. -Jer Jer Noble jer.no...@apple.com
Re: [whatwg] Full Screen API Feedback
On May 12, 2011, at 5:47 AM, Boris Zbarsky wrote: On 5/12/11 4:12 AM, Jer Noble wrote: - Add a new boolean Element property canRequestFullScreen. This would map to Firefox's Never permission choice. - Add the fullscreendenied event. This would map to Firefox's Not now permission choice. So if the user just dismisses the notification without picking any of the choices then fullscreendenied would fire in this proposal? I'm not trying to tell Firefox how to write their UI. And I would never suggest requiring this behavior in a spec. But, for the purposes of exploring this proposal, yes. What happens if the user then reopens the notification and selects Allow? Assuming the targeted element still exists, and that the page hasn't issued a cancelFullScreen() request (or perhaps either of those conditions would cause the notification to disappear?) then the page enters full-screen mode and generates a fullscreenchange event. Yeah, it's somewhat weird to get a fullscreenchange event after a fullscreendenial. But the spec already specifies that "The user agent may transition a Document into or out of the full-screen state at any time, whether or not script has requested it." So the developer must already expect un-requested fullscreenchange events. -Jer Jer Noble jer.no...@apple.com
Re: [whatwg] Full Screen API Feedback
First things first: On May 12, 2011, at 11:24 AM, Boris Zbarsky wrote: I believe you have _completely_ misunderstood what I said. I'm describing a problem in the geolocation API as it stands. You're talking about something else. Unfortunately, I'm not quite sure how you got there from here. I think we really need to get this complete failure of communication sorted out for the rest of this discussion to be productive. :( and: Hold on. We're talking about geolocation here and whether it's a good model to follow, no? I'm not presuming to design UI for the full-screen case, and I have no indication that this would be the UI used for full-screen. Ah, okay. Sorry, I was under the opposite impression. I had thought the permissions model and UI that Firefox was suggesting was identical to the Geolocation case. If not, then you're right, I'm conflating the two issues. I'll try to limit my responses to the topic at hand. :) As I'm not really looking to rehash or debate the geolocation API and browser's implementation of it, I'm going to leave off responding to some of the points raised below. I'm not trying to be evasive in doing so, but just trying to focus in on the full-screen API. Continuing... On May 12, 2011, at 11:24 AM, Boris Zbarsky wrote: On 5/12/11 12:48 PM, Jer Noble wrote: I'm saying that if authors expect to get one or the other but then never do, that will confuse authors. Again, I fail to see how this is a problem for the denial event but not for the change event. The problem is not for any particular event. The problem is in creating an author expectation that one of the two events will definitely be called. This expectation is incorrect. If there is only one event, then there can be no such expectation, for obvious reasons: the behavior when full screen is denied is that there is no event, so authors have to handle the case of no event in a sane way. I understand what you're saying. 
By making the error case deliberately ambiguous, you're trying to force the author to behave in a certain way. However, I disagree that this is a) likely to work and b) likely to be less confusing than the alternative. Of course, one solution to both confusion and incorrect expectations is documentation. :-) If it were made both clear and explicit that either of those events may never be dispatched after requestFullScreen() is called, shouldn't that suffice? At the same time, such situations are clearly considered beneficial by multiple UAs, and I think you will have a hard time finding a UI designer who thinks that actually forcing the user to decide in this case (i.e. forcing a modal dialog on them) is a good idea. (skipping ahead) Keep in mind that the user denies case is very likely to be a _rare_ case. The common cases would be user accepts and user defers. I agree with the first statement. However, I don't expect that user defers will actually be that common. If you look at the Suggested UA Policy portion of the spec, most cases are either implicitly accepted or denied without user prompting. I expect that, for the overwhelming majority of cases, full-screen requests will either be implicitly accepted (for user-action driven, non-keyboard requests), or implicitly denied (non-user-action driven requests). For the remainder (user-driven, keyboard-access requests), the requests will be overwhelmingly non-malicious, resulting in user accepts, and those that are spam/popups will result in user denies. So I'd argue that the case where a page author would have to wait any appreciable amount of time before receiving a fullscreendenied event is actually quite rare. -Jer
Re: [whatwg] Full Screen API Feedback
On May 12, 2011, at 4:23 PM, Robert O'Callahan wrote: That only works if the browser can detect a deferral. If the user simply ignores the browser's UI, you wouldn't know when to fire the event. And there's also the issue of a fullscreendenied being followed by a fullscreenchange, which is weird, but I guess we could live with it if it was the only issue. Although, if the user simply ignores the browser's UI, maybe the browser could fire the fullscreendenied event when there's next keyboard input or a mouse click into the tab that requested fullscreen. But I just made that up so I'll need to think about whether it's reasonable :-). Of course. I was really only talking about explicit user deferral actions; there might not be a good way to solve the problem of ignoring the notification. If it's anything like the current Firefox Geolocation notification, wouldn't a click in the non-popup area dismiss the notification? So I'd argue that the case where a page author would have to wait any appreciable amount of time before receiving a fullscreendenied event is actually quite rare. For what my sample size of one is worth, when Firefox pops up its passive This Web page tried to open a popup window UI, I usually ignore it rather than dismiss it. Interesting. Does Firefox display that message for non-user-action driven pop-ups? Or are those blocked silently? It displays that message for non-user-action driven pop-ups. Popups in mouse click events are automatically allowed (and open a new tab). Okay. Assuming Firefox adopts the Suggested UA Policy portion of the API, equivalent full-screen window requests would be implicitly denied, so you would hopefully not be spammed by full-screen request notifications quite as frequently. -Jer
Re: [whatwg] Proposal: Remove canplaythrough event from audio/video tag
On Nov 1, 2011, at 3:10 PM, Victoria Kirst wrote: - *What are some real examples of how canplaythrough is useful for a web developer?* Even if it were 100% accurate, what is the benefit of the event? Given that it's *not* 100% accurate and that the accuracy is largely up to the discretion of the web browser, what is the benefit? The purpose of the canplaythrough event (and of the HAVE_ENOUGH_DATA ready state) is to signal to the page author that playback is likely to keep up without stalling. This seems to me to have a fairly obvious benefit. Here's a hypothetical scenario: Assume a 100% accurate implementation of canplaythrough (in that the UA can predict with 100% accuracy at what time in the future the final byte of media data will be downloaded.) Assume magic or time travel if necessary. In this scenario, a media element with a canplaythrough listener will always begin playing at the earliest possible time and will never stall during normal playback. This is a clear benefit. The question I keep running into is *how inaccurate can the browser be until the event is no longer useful?* This seems to be a Quality of Service issue. Different UAs will have different implementations of canplaythrough at varying degrees of accuracy. Some UAs will favor a lower possibility of stalling over an earlier start time. Others may cut closer to the margin-of-error and favor earlier start times. There are many ways to approximate download speed, but it quickly becomes complicated to maintain accuracy: - *Throttled downloads:* So as not to unnecessarily download too much data, browsers may postpone their download of the media file after it has reached a comfortable buffer. In this case, how should the download speed be approximated? Should the browser only measure during active downloading, or should the deferring somehow be factored in? I think this is a very bad example for your case.
If the browser has decided to postpone further downloading once it has reached a comfortable buffer, shouldn't it have already fired a canplaythrough event and set its readyState to HAVE_ENOUGH_DATA? Isn't that the very definition of reaching a comfortable buffer? - *Amount of data for accurate bandwidth estimation:* When can the browser feel comfortable with its estimation of download speed? It seems like the browser's calculation would be pretty noisy unless it has at least a few (~5) seconds of data. But the browser won't always have that luxury: for example, if the browser is throttling download under a high-speed internet connection, the browser could mostly be downloading in very short bursts, and each burst on its own may not be long enough to get a comfortable estimate. Again, this is a clear example of a situation where the browser could easily and safely emit a canplaythrough event. - *Inherent inaccuracy*: As I previously stated, the only way to be 100% accurate in firing this event is to wait until the last byte has been downloaded before firing the event. Firing the event at any time before then will *always* be a guess, not a guarantee. Someone can unplug their connection, or move to a stronger/weaker wifi connection, which will immediately invalidate any estimation from data collected before. The spec makes it clear that HAVE_ENOUGH_DATA is an estimate. And this situation makes a more compelling argument to /add/ a 'cannolongerplaythrough' event for use when the UA detects a suddenly bandwidth-limited connection. For example, with such an event, a page author may decide to switch to a lower-bitrate version of the media file. Given the uncertainty of the event's usefulness, and the trickiness involved in implementing it accurately, I propose removing canplaythrough from the spec if there is not an abundance of compelling cases in support of the event.
A web developer can implement his or her own version of canplaythrough by monitoring progress events and buffered(). No, a page author cannot. For example: if progress events stall, a page author cannot tell that the UA has decided to postpone loading after reaching a comfortable buffer. The point of the event and the readyState is that the UA has knowledge of media loading and decoding that the page author does not. The UA is simply in a much better position to estimate whether a media file can play through than is the page author. -Jer
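The basic benefit argued for above is easy to state in code: start playback only once the UA estimates it can play to the end without stalling. A sketch follows; HAVE_ENOUGH_DATA (4) and the 'canplaythrough' event are from the HTML media element spec, while autoPlayWhenReady is a hypothetical helper name:

```javascript
// Sketch: defer playback until the UA signals HAVE_ENOUGH_DATA,
// i.e. its estimate that playback can continue uninterrupted.
function autoPlayWhenReady(video) {
  var HAVE_ENOUGH_DATA = 4;
  if (video.readyState >= HAVE_ENOUGH_DATA) {
    video.play(); // already past the threshold; no event will fire
    return;
  }
  video.addEventListener('canplaythrough', function onReady() {
    video.removeEventListener('canplaythrough', onReady, false);
    video.play();
  }, false);
}
```

Note the readyState check before listening: this is exactly the kind of race a page author cannot resolve by watching progress events and buffered() alone.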
[whatwg] MediaController feedback
Hi, I'm currently working on implementing MediaController in WebKit https://bugs.webkit.org/show_bug.cgi?id=71341, and have a couple pieces of feedback from an implementor's POV: * MediaController Playback State and Ready State The spec defines both a most recently reported readiness state[1] and a most recently reported playback state[2] which, when changed, trigger a variety of events. Because the previous values of these states must be compared each time they are recomputed[3], we must store these values in our MediaController implementation, which is no huge burden. However, when I was writing testcases for my implementation, I noticed that there was no way to query the current value of either the playback state or the ready state, as neither was present in the IDL for MediaController. This makes writing test cases much more difficult, as they must now rely on waiting for edge-triggered events. In addition, there is a use case for having playbackState and readyState in the MediaController IDL. When adding a MediaController to an HTMLMediaElement, the spec does not require the media controller to report the controller state. (It does require that the MediaController bring the media element up to speed with the new controller.) In this case, the media controller should also be required to report the controller state, as adding a blocking media element to a controller should probably cause the playbackState to revert to WAITING. But if the current playbackState is already WAITING, no waiting event will be emitted, and the client waiting on such an event will wait forever. So I would like to propose two changes to the spec: + MediaController should expose the following attributes in IDL: readonly attribute unsigned short readyState; readonly attribute unsigned short playbackState; Exposing these attributes would have approximately zero implementation cost (at least in my implementation) as these values are stored and easily queryable anyway.
+ Modify the media.controller()[4] section to require that setting the controller report the controller state. * MediaController.play() The MediaController play() function does not actually cause its slaved media elements to play. If all the slaved media elements are paused, the MediaController is a blocked media controller, and none will play until at least one element has play() called on it directly. And even in that case, only the playing elements will begin playing. In addition, the user interface section of the spec says the following: When a media element has a current media controller, and all the slaved media elements of that MediaController are paused, the user agent should unpause all the slaved media elements when the user invokes a user agent interface control for beginning playback. So now, an individual media control must be able to access all other HTMLMediaElements associated with a given MediaController, because there is no facility in MediaController to actually unpause all the slaved media elements. In a previous paragraph in that same section: When a media element has a current media controller, the user agent's user interface for pausing and unpausing playback, for seeking, for changing the rate of playback, for fast-forwarding or rewinding, and for muting or changing the volume of audio of the entire group must be implemented in terms of the MediaController API exposed on that current media controller. Except, in the case of unpausing, this extra requirement of unpausing the slaved media elements is somewhat in conflict with this paragraph. I would like to propose three changes to the spec: + Modify the section bring the media element up to speed with the new controller[5] to require that a media element added to a playing media controller must begin playing, and one added to a paused media controller must pause. + Modify the section controller . play()[6] to require that the user agent unpause all the slaved media elements.
+ Modify the section controller . pause()[7] to require that the user agent pause all the slaved media elements. + Remove the section from user interface[8] which requires the user agent unpause all the slaved media elements, quoted above. Thanks, -Jer [1] http://www.w3.org/TR/html5/video.html#most-recently-reported-playback-state [2] http://www.w3.org/TR/html5/video.html#most-recently-reported-playback-state [3] http://www.w3.org/TR/html5/video.html#report-the-controller-state [4] http://www.w3.org/TR/html5/video.html#dom-media-controller [5] http://www.w3.org/TR/html5/video.html#bring-the-media-element-up-to-speed-with-its-new-media-controller [6] http://www.w3.org/TR/html5/video.html#dom-mediacontroller-play [7] http://www.w3.org/TR/html5/video.html#dom-mediacontroller-pause [8] http://www.w3.org/TR/html5/video.html#user-interface
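As an illustration of the testability point above: with a queryable playbackState, a test can avoid racing the edge-triggered 'waiting' event. A sketch (the playbackState attribute is part of the proposal above, not the current spec, and the numeric WAITING value is illustrative):

```javascript
// Proposed: playbackState is queryable, readyState-style.
var WAITING = 0; // illustrative numeric value for the waiting state

// Without a queryable playbackState, a client must rely on the
// 'waiting' event, which never fires if the controller is already
// WAITING when the listener is added. With the attribute, the
// race goes away:
function whenWaiting(controller, callback) {
  if (controller.playbackState === WAITING) {
    callback(); // already waiting: no event is coming, call back now
    return;
  }
  controller.addEventListener('waiting', callback, false);
}
```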
Re: [whatwg] Firing canplaythrough when caches/buffers are full
On May 27, 2012, at 5:51 PM, Robert O'Callahan rob...@ocallahan.org wrote: I propose fixing this by having the UA enter the HAVE_ENOUGH_DATA readyState when the UA decides to suspend a download indefinitely and the preload state is Automatic (or overridden by autoplay being set). We have checked in a patch to Gecko to do this. (Note that for a long time, Gecko has triggered playback of autoplay elements when suspending due to media buffers being full. The new change makes us enter HAVE_ENOUGH_DATA as well.) For what it's worth, the Mac port of WebKit has this exact behavior: http://trac.webkit.org/changeset/97944. It would be good to formalize this, however. -Jer
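The rule being formalized above can be stated as a small decision function. This is a pure-logic illustration; the function name and string preload values are assumptions for the sketch, not spec text:

```javascript
// Sketch of the proposed rule: when the UA suspends a download
// indefinitely, report HAVE_ENOUGH_DATA if the preload state is
// automatic, or if autoplay overrides it; otherwise leave the
// readyState alone.
var HAVE_ENOUGH_DATA = 4;

function readyStateOnSuspend(preload, autoplay, currentReadyState) {
  if (preload === 'auto' || autoplay) {
    return HAVE_ENOUGH_DATA;
  }
  return currentReadyState;
}
```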
Re: [whatwg] Fullscreen events dispatched to elements
On Jun 1, 2012, at 6:45 PM, Chris Pearce cpea...@mozilla.com wrote: Because we exit fullscreen when the fullscreen element is removed from the document, so if you dispatch events to the context element, the fullscreenchange event never bubbles up to the containing document in the exit-on-remove case. Actually, in WebKit, we explicitly also message the document from which the element was removed in that case. I don't see why this behavior couldn't be standardized. -Jer
Re: [whatwg] Fullscreen events dispatched to elements
On Jun 4, 2012, at 10:43 PM, Robert O'Callahan rob...@ocallahan.org wrote: On Tue, Jun 5, 2012 at 9:13 AM, Jer Noble jer.no...@apple.com wrote: On Jun 1, 2012, at 6:45 PM, Chris Pearce cpea...@mozilla.com wrote: Because we exit fullscreen when the fullscreen element is removed from the document, so if you dispatch events to the context element, the fullscreenchange event never bubbles up to the containing document in the exit-on-remove case. Actually, in WebKit, we explicitly also message the document from which the element was removed in that case. I don't see why this behavior couldn't be standardized. Did you inform the spec editor(s) when you decided to make this change? What did they say? As it is a holdover from when we implemented the Mozilla Full Screen API proposal[1], which required that behavior, no. -Jer [1] https://wiki.mozilla.org/Gecko:FullScreenAPI#fullscreenchange_event
Re: [whatwg] MediaController feedback
On Jun 4, 2012, at 5:12 PM, Ian Hickson i...@hixie.ch wrote: On Wed, 2 Nov 2011, Jer Noble wrote: I'm currently working on implementing MediaController in WebKit https://bugs.webkit.org/show_bug.cgi?id=71341, and have a couple pieces of feedback from an implementor's POV: * MediaController Playback State and Ready State The spec defines both a most recently reported readiness state[1] and a most recently reported playback state[2] which, when changed, trigger a variety of events. Because the previous values of these states must be compared each time they are recomputed[3], we must store these values in our MediaController implementation, which is no huge burden. However, when I was writing testcases for my implementation, I noticed that there was no way to query the current value of either the playback state or the ready state, as neither was present in the IDL for MediaController. This makes writing test cases much more difficult, as they must now rely on waiting for edge-triggered events. In addition, there is a use case for having playbackState and readyState in the MediaController IDL. When adding a MediaController to an HTMLMediaElement, the spec does not require the media controller to report the controller state. (It does require that the MediaController bring the media element up to speed with the new controller.) In this case, the media controller should also be required to report the controller state, as adding a blocking media element to a controller should probably cause the playbackState to revert to WAITING. But if the current playbackState is already WAITING, no waiting event will be emitted, and the client waiting on such an event will wait forever. I've updated to report the controller state. Looks good, thanks. Actually exposing the controller state is not as trivial as it may first appear, in particular if we want to maintain the synchronous illusion (i.e. only change the state as the events fire, not before). But I've done that too.
This too looks good. We already store the results when we report the controller state, so at a first glance, exposing this property will be trivial. * MediaController.play() The MediaController play() function does not actually cause its slaved media elements to play. If all the slaved media elements are paused, the MediaController is a blocked media controller, and none will play until at least one element has play() called on it directly. And even in that case, only the playing elements will begin playing. In addition, the user interface section of the spec says the following: When a media element has a current media controller, and all the slaved media elements of that MediaController are paused, the user agent should unpause all the slaved media elements when the user invokes a user agent interface control for beginning playback. So now, an individual media control must be able to access all other HTMLMediaElements associated with a given MediaController, because there is no facility in MediaController to actually unpause all the slaved media elements. In a previous paragraph in that same section: When a media element has a current media controller, the user agent's user interface for pausing and unpausing playback, for seeking, for changing the rate of playback, for fast-forwarding or rewinding, and for muting or changing the volume of audio of the entire group must be implemented in terms of the MediaController API exposed on that current media controller. Except, in the case of unpausing, this extra requirement of unpausing the slaved media elements is somewhat in conflict with this paragraph. I tried to fix this. Looks good. I would like to propose three changes to the spec: + Modify the section bring the media element up to speed with the new controller[5] to require that a media element added to a playing media controller must begin playing, and one added to a paused media controller must pause. + Modify the section controller . play()[6] to require that the user agent unpause all the slaved media elements. + Modify the section controller . pause()[7] to require that the user agent pause all the slaved media elements. + Remove the section from user interface[8] which requires the user agent unpause all the slaved media elements, quoted above. I don't really understand this proposal. Could you elaborate on this? Sure. The overall purpose of the modifications is to achieve the following: when controller.play() is called, all slaved media elements unconditionally will begin playing. Whatever use case is served by allowing a paused media element to remain paused in a playing media controller, that use case could also be achieved by removing the element from the media controller, then pausing it. However, I now realize that the first change I proposed would turn all
Re: [whatwg] Fullscreen events dispatched to elements
On Jun 4, 2012, at 11:23 PM, Robert O'Callahan rob...@ocallahan.org wrote: If you implemented that proposal as-is then authors would usually need a listener on the document as well as the element, and as Chris pointed out, it's simpler to just always listen on the document. Is that true for the Webkit implementation or did you implement something slightly different? Sorry, you're right; we did implement something slightly different. We always dispatch a message to the element, and additionally one to the document if the element has been removed from the document. So authors only have to add event listeners to one or the other. -Jer
Re: [whatwg] Fullscreen events dispatched to elements
On Jun 5, 2012, at 1:06 AM, Anne van Kesteren ann...@annevk.nl wrote: Why should we standardize this if we always notify the document? Is there a benefit to notifying both the element and the document? I think Vincent put forward a reasonable argument. The document is a finite, shared resource. Requiring authors to share that resource will inevitably lead to conflicts. Those (hypothetical) conflicts may be manageable, but including the fullscreen element in the event dispatch gives developers a means to avoid them entirely. For better or worse, existing implementations are still prefixed as far as I know and incompatible with each other. So that in itself is not really an argument for changing the standard. Of course. I was just pointing out an alternate solution. -Jer
Re: [whatwg] MediaController feedback
On Jun 5, 2012, at 3:02 PM, Ian Hickson i...@hixie.ch wrote: On Mon, 4 Jun 2012, Jer Noble wrote: This too looks good. We already store the results when we report the controller state, so at a first glance, exposing this property will be trivial. Make sure you're setting the attribute at the right time. There's some careful jumping through hoops in the spec to make sure the attribute doesn't update before the events are just about to fire. Will do. I would like to propose three changes to the spec: + Modify the section bring the media element up to speed with the new controller[5] to require that a media element added to a playing media controller must begin playing, and one added to a paused media controller must pause. + Modify the section controller . play()[6] to require that the user agent unpause all the slaved media elements. + Modify the section controller . pause()[7] to require that the user agent pause all the slaved media elements. + Remove the section from user interface[8] which requires the user agent unpause all the slaved media elements, quoted above. I don't really understand this proposal. Could you elaborate on this? Sure. The overall purpose of the modifications is to achieve the following: when controller.play() is called, all slaved media elements unconditionally will begin playing. I don't think this is a good idea. If the user has paused one of the slaves, and then pauses and resumes the whole thing, the paused media element shouldn't resume. It should remain paused. Why? For one, I don't know how a user will end up pausing just one slaved media element. It appears that won't be possible with the UA provided play/pause button, as those are required to be implemented in terms of the MediaController.
There is a non-normative line in the spec reading: "When a media element has a current media controller, user agents may additionally provide the user with controls that directly manipulate an individual media element without affecting the MediaController, but such features are considered relatively advanced and unlikely to be useful to most users." …But even in this (optional and "unlikely to be useful") case, the mandatory UA controls will just unpause the slaved elements the next time the user hits the UA-provided play button. With JavaScript, it's certainly possible for a page author to play() or pause() a slaved media element directly, but that author could just as easily remove the media element from the media group / media controller. So, I don't really know what use case not resuming solves, but the general use case is made confusing by this requirement. E.g., a page author is setting up some custom UI to control two slaved media elements:

mediaController = new MediaController();
mediaController.pause();
video1.controller = mediaController;
video2.controller = mediaController;
button.addEventListener('click', function() { mediaController.play(); }, false);
// If developers forget this step, their play button will never work:
video1.play();
video2.play();

And then, once they discover the reason their custom play button doesn't work, a significant fraction of page authors will do something like:

button.addEventListener('click', function() {
  video1.play();
  video2.play();
  mediaController.play();
}, false);

Which will, hypothetically speaking, break the intent of the default behavior. [As an aside, this exact scenario played out as a developer was asking me why their MediaController demo wasn't working. They were quite incredulous that a call to MediaController.play() didn't actually cause their videos to play. I think that this confusion will be quite common.]
Whatever use case is served by allowing a paused media element to remain paused in a playing media controller, that use case could also be achieved by removing the element from the media controller, then pausing it. That only works if there's JavaScript doing the removing. The idea here is that this should all work even without any JS, just with UA UI. With just the UA UI, the behavior would be exactly the same, as the spec currently requires the UA provided play button unpause the slaved media elements.[1] This would just add that requirement to the MediaController.play() method as well. -Jer
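[Editorial sketch, not part of the original email or the spec.] The semantics Jer proposes above — controller.play() unconditionally unpausing every slave — can be made concrete with a toy model. ToyMediaController and the plain objects standing in for media elements are invented for illustration; they are not the real DOM API:

```javascript
// Toy model of the proposed behavior: controller.play()/pause()
// unconditionally unpause/pause every slaved element.
// (ToyMediaController is illustrative only, not the real DOM API.)
class ToyMediaController {
  constructor() {
    this.slaves = [];
    this.paused = true;
  }
  add(element) { this.slaves.push(element); }
  play() {
    this.paused = false;
    // Proposed change: no slave stays paused after controller.play().
    for (const el of this.slaves) el.paused = false;
  }
  pause() {
    this.paused = true;
    for (const el of this.slaves) el.paused = true;
  }
}

const video1 = { paused: true };
const video2 = { paused: true };
const mc = new ToyMediaController();
mc.add(video1);
mc.add(video2);
mc.play(); // no per-element video1.play() / video2.play() needed
```

Under this model the author's click handler needs only mediaController.play(), and the footgun described in the email disappears.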
Re: [whatwg] MediaController feedback
On Aug 27, 2012, at 5:02 PM, Ian Hickson i...@hixie.ch wrote: With JavaScript, it's certainly possible for a page author to play() or pause() a slaved media element directly, but that author could just as easily remove the media element from the media group / media controller. [...] That only works if there's JavaScript doing the removing. The idea here is that this should all work even without any JS, just with UA UI. With just the UA UI, the behavior would be exactly the same [...] If you remove the element from the media controller, the media controller's timeline changes. So? In the general case (alternative audio, sign-language) the timelines will be exactly the same. If there's an edge case where a change in the timeline is a problem, a page author could hide the slaved media element (e.g. display:none or element.muted = true) instead. It'll be quite common for there to be videos that are not currently playing, e.g. sign-language tracks. I think you're making an incorrect distinction. The author may not want the sign-language track to *display*. Pausing the video is one mechanism which achieves that (sort of). Hiding it is another. Removing the video from the MediaController and pausing it is a third. The side effects of this particular mechanism are causing a lot of confusion. We gave a session at WWDC about the MediaController and collected a lot of developer feedback at the labs, and the general theme was that the API didn't make sense. Here's a good example of the kinds of bug reports we're seeing: MediaController play() doesn't work https://bugs.webkit.org/show_bug.cgi?id=94786. If we change anything here, I think it would be the currently required UI behaviour which requires all the videos to start playing when the user overrides the JS-provided controls and just uses the UA controls. This change would break the UI controls in the basic case of <video controls mediagroup="foo">. -Jer
Re: [whatwg] video feedback
On Sep 17, 2012, at 12:43 PM, Ian Hickson i...@hixie.ch wrote: On Mon, 9 Jul 2012, adam k wrote: i have a 25fps video, h264, with a burned in timecode. it seems to be off by 1 frame when i compare the burned in timecode to the calculated timecode. i'm using rob coenen's test app at http://www.massive-interactive.nl/html5_video/smpte_test_universal.html to load my own video. what's the process here to report issues? please let me know whatever formal or informal steps are required and i'll gladly follow them. Depends on the browser. Which browser? i'm aware that crooked framerates (i.e. the notorious 29.97) were not supported when frame accuracy was implemented. in my tests, 29.97DF timecodes were incorrect by 1 to 3 frames at any given point. will there ever be support for crooked framerate accuracy? i would be more than happy to contribute whatever i can to help test it and make it possible. can someone comment on this? This is a Quality of Implementation issue, basically. I believe there's nothing inherently in the API that would make accuracy to such timecodes impossible. TL;DR: for precise navigation, you need to use a rational time class, rather than a float value. The nature of floating point math makes precise frame navigation difficult, if not impossible. Rob's test is especially hairy, given that each frame has a timing bound of [startTime, endTime), and his test attempts to navigate directly to the startTime of a given frame, a value which gives approximately zero room for error. I'm most familiar with MPEG containers, but I believe the following is also true of the WebM container: times are represented by a rational number, timeValue / timeScale, where both numerator and denominator are unsigned integers. To seek to a particular media time, we must convert a floating-point time value into this rational time format (e.g. when calculating the 4th frame's start time, from 3 * (1/29.97) to 3003/30000).
If there is a floating-point error in the wrong direction (e.g., as above, a numerator of 3002 vs 3003), the end result will not be the frame's startTime, but one 1/timeScale unit before it. We've fixed some frame accuracy bugs in WebKit (and Chromium) by carefully rounding the incoming floating point time value, taking into account the media's time scale, and rounding to the nearest 1/timeScale value. This fixes Rob's precision test, but at the expense of accuracy. (I.e. in a 30 fps movie, currentTime = 0.99 / 30 will navigate to the second frame, not the first, due to rounding, which is technically incorrect.) This is a common problem, and Apple media frameworks (for example) therefore provide rational time classes which provide enough accuracy for precise navigation (e.g. QTTime, CMTime). Using a floating point number to represent time with any precision is not generally accepted as good practice when these rational time classes are available. -Jer
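[Editorial sketch, not from the original email.] The truncation-versus-rounding issue described above can be demonstrated in a few lines. toMediaTime is a hypothetical helper, not a WebKit API, and 30000 is simply a typical time scale for 29.97 fps media:

```javascript
// Floating-point truncation can land one time unit early:
console.log(Math.floor(0.29 * 100)); // 28, because 0.29 * 100 is just under 29

// Rounding to the nearest 1/timeScale value, as the WebKit fix does, avoids this.
const timeScale = 30000; // a typical time scale for 29.97 fps media

function toMediaTime(seconds) {
  // Hypothetical helper: convert a float time into rational media-time units.
  return Math.round(seconds * timeScale);
}

// The 4th frame of 29.97 fps video starts at 3003/30000 seconds:
console.log(toMediaTime(3 * (1001 / 30000))); // 3003
```

Truncating the same product would risk producing 3002 when the double sits fractionally below the exact value, which is precisely the "one unit before the frame's startTime" failure described above.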
Re: [whatwg] video feedback
On Dec 20, 2012, at 7:27 PM, Mark Callow callow.m...@artspark.co.jp wrote: On 2012/12/21 2:54, Ian Hickson wrote: On Thu, 20 Dec 2012, Mark Callow wrote: I draw your attention to Don't Store that in a float http://randomascii.wordpress.com/2012/02/13/dont-store-that-in-a-float/ and its suggestion to use a double starting at 2^32 to avoid the issue around precision changing with magnitude as the time increases. Everything in the Web platform already uses doubles. Yes, except as noted by Boris. The important point is the idea of using 2^32 as zero time which means the precision barely changes across the range of time values of interest to games, videos, etc. I don't believe the frame accuracy problem in question had to do with precision instability, per se. Many of Rob Coenen's frame accuracy issues were found within the first second of video. Admittedly, this is where the available precision is changing most rapidly, but it is also where available precision is greatest by far. An integral rational number has a benefit over even the 2^32 zero time suggestion: for common time scale values[1], it is intrinsically stable over the range of time t=[0..2^43). It has the added benefit of being exactly the representation used by the underlying media engine. On Dec 17, 2012, at 4:01 PM, Ian Hickson i...@hixie.ch wrote: Should we add a preciseSeek() method with two arguments that does a seek using the given rational time? This method would be more useful if there were a way to retrieve the media's time scale. Otherwise, the script would have to pick an arbitrary scale value, or provide the correct media scale through other means (such as querying the server hosting the media). Additionally, authors like Rob are going to want to retrieve this precise representation of the currentTime. If rational time values were encapsulated into their own interface, a preciseCurrentTime (or similar) read-write attribute could be used instead.
-Jer [1] E.g., 1001 is a common time scale factor for 29.97 and 23.976 FPS video.
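[Editorial sketch.] A rational-time interface of the kind suggested above might take the following shape. This is purely hypothetical: preciseSeek was only a suggestion in the thread, and the property names here are invented for illustration:

```javascript
// Hypothetical shape for a rational media time (not in any spec):
// an integer time value plus the media's integer time scale.
const frameTime = { timeValue: 3003, timeScale: 30000 }; // 4th frame at 29.97 fps

function toSeconds(t) {
  // Lossy conversion: useful for display, but not for frame-accurate seeking.
  return t.timeValue / t.timeScale;
}

// A preciseSeek(timeValue, timeScale) method would consume the integers
// directly, e.g.: video.preciseSeek(frameTime.timeValue, frameTime.timeScale);
console.log(toSeconds(frameTime)); // 0.1001
```

Keeping the two integers together in one object is what the email means by "encapsulated into their own interface": a preciseCurrentTime attribute could then return the exact rational value rather than a rounded double.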
Re: [whatwg] Background audio channels
On Mar 15, 2013, at 10:57 AM, Wesley Johnston wjohns...@mozilla.com wrote: In most situations, when the user puts a webpage in the background, any media being played by the page should be paused. Any attempts to play audio by a background page should also be prevented. However, for some sites (music or radio apps) the user would like to continue to hear the app while they do something else. These pages should be able to designate their audio as a type that should keep playing while in the background. The user agent should also attempt to avoid having the stream killed by the operating system if possible. Why can't this just be handled by the UA? MobileSafari, for instance, already supports playing audio while the app is backgrounded. It even supports playing and pausing audio elements with all the standard media playback controls. Were it to support this spec, it would break every page which does not explicitly opt into the background channel. This is especially true on mobile devices, but the problem is also already prevalent on desktop. What does "in the background" mean in a desktop context? A non-frontmost window? Minimized? A non-topmost tab? I think semantically we need a way to describe to the user agent how to play a particular track. I'd suggest we add an optional attribute to media elements, audiochannel, designating the output and priority of this audio. The audiochannel attribute can potentially take on three different values: normal, background, and telephony. normal channels are the default for all media elements. Using them doesn't require any special permissions. Audio playing with these channels is paused when the web page moves into the background. In addition, calling play() on a media element with this channel while in the background will put the element into the "paused for user interaction" state (i.e. playback won't start until the webapp is brought to the foreground). background channels will continue to play when the page is put into the background.
Trying to play a background channel while in the background should also work. The ability to play audio on this channel may require requesting permission from the UA first (i.e. possibly a prompt when the audio is first played or when moving to the background). If the user doesn't grant permission, these should throw a MediaError (MEDIA_ERR_CHANNEL_PERMISSION_NOT_GRANTED?) so that the page can know what has happened and do something appropriate. The normal channel will be incredibly frustrating, especially for mobile users. For the overwhelming majority of audio use cases, a user will be incredibly annoyed if audio pauses while switching tabs or switching to another app. Every single page will have to update in order to opt into the background channel to get (what is currently the default) optimum experience. If this spec is going to move forward, background should be the default. normal should be opt-in, or removed entirely. telephony channels are similar to background channels and can play even if the page is in the background. Playing audio on a telephony channel may cause any audio playing on normal or background channels to be paused or have their volume severely decreased. They also, on devices where it's supported, will likely play over handset speakers rather than normal speakers. Similar to background, these may require permission from the UA. Users already have permission UI to allow apps use of the handset speakers: the mute switch. Throwing up another permission dialog when the user is trying to answer a webapp telephone call is going to suck. (Presumably that webapp will also need permission to use the microphone as well, so there will be multiple UA permission dialogs up.) And when some user accidentally grants a malicious site telephony permission, that site can now blare ads over their handset speakers, and the mute switch is powerless to stop it. Without the "ignore the mute switch" behavior this channel seems identical to background.
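[Editorial sketch.] The backgrounding policy implied by the proposal can be reduced to a small decision function. The function name and shape are invented for illustration; this is not a shipping API:

```javascript
// Sketch of the channel semantics described in the proposal (not a real API):
// 'normal' pauses when the page is backgrounded; 'background' and 'telephony'
// keep playing. (The reply above argues the default should effectively be
// 'background' instead.)
function shouldKeepPlaying(channel, pageInBackground) {
  if (!pageInBackground) return true;
  return channel === 'background' || channel === 'telephony';
}

console.log(shouldKeepPlaying('normal', true));     // false — paused on backgrounding
console.log(shouldKeepPlaying('background', true)); // true
```

Writing the policy down this way makes the objection concrete: under the proposal every existing page implicitly gets the 'normal' row of this table, which is a behavior change from what MobileSafari ships today.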
Note: This is all based rather loosely on the AudioChannels implementation written for B2G recently [1]. It includes a few other use cases on its wiki page, along with definitions of additional channels to accommodate them. I've been trying to simplify it down to handle the most common use cases. Finding the correct terminology here is difficult, though. For instance, it seems likely that games will see the background channel and think it's an appropriate place to play game background music, the exact type of audio you'd like to have paused when you leave the game. Ideas for better ways to describe it are welcome. This mechanism may make sense for installed apps. iOS has a similar concept of Audio Session Categories [1] which govern how audio is routed, how audio apps interact with one another, how interruptions are handled, and whether playback resumes after an interruption.
Re: [whatwg] Background audio channels
On Apr 10, 2013, at 12:14 PM, Wesley Johnston wjohns...@mozilla.com wrote: Again, IMO 1.) The EVENTUAL default behavior here has to be to mute tabs in the background. I disagree. The current default behavior (allowing audio to play in the background) is working just fine for Safari. Maybe it isn't for Gecko, but that should be a choice left up to the UA, and not a specced requirement. -Jer
Re: [whatwg] VIDEO and pitchAdjustment
> On Mar 11, 2016, at 1:11 PM, Garrett Smith <dhtmlkitc...@gmail.com> wrote:
>
> On Tuesday, March 8, 2016, Jer Noble <jer.no...@apple.com> wrote:
>
>> On Mar 8, 2016, at 4:42 PM, Garrett Smith <dhtmlkitc...@gmail.com> wrote:
>>
>> On Fri, Mar 4, 2016 at 3:43 PM, Jer Noble <jer.no...@apple.com> wrote:
>>>
>>>> On Mar 4, 2016, at 3:19 PM, Garrett Smith <dhtmlkitc...@gmail.com> wrote:
>>>>
>>>> On Fri, Mar 4, 2016 at 1:55 PM, Jer Noble <jer.no...@apple.com> wrote:
>>>>>
>>>>>> On Mar 1, 2016, at 8:00 PM, Philip Jägenstedt <phil...@opera.com> wrote:
>>>>>>
>>>>>> On Wed, Mar 2, 2016 at 9:19 AM, Garrett Smith <dhtmlkitc...@gmail.com> wrote:
>>>>>>> On Thu, Nov 12, 2015 at 11:32 AM, Philip Jägenstedt <phil...@opera.com> wrote:
>>>>>>>> On Thu, Nov 12, 2015 at 10:55 AM, Garrett Smith <dhtmlkitc...@gmail.com> wrote:
>>>>>>>>> On 11/12/15, Philip Jägenstedt <phil...@opera.com> wrote:
>>>>>>>>>> On Thu, Nov 12, 2015 at 9:07 AM, Garrett Smith <dhtmlkitc...@gmail.com> wrote:
>>>>>>>>>>> On 10/19/15, Philip Jägenstedt <phil...@opera.com> wrote:
>>>>>>>>>>>> On Tue, Sep 1, 2015 at 11:21 AM, Philip Jägenstedt <phil...@opera.com> wrote:
>>>>>>>>>>>>> On Mon, Aug 31, 2015 at 9:48 PM, Domenic Denicola <d...@domenic.me> wrote:
>>>>>>>>>>>>>> From: Eric Carlson [mailto:eric.carl...@apple.com]
>>
>> The Web Audio equivalent would be:
>>
>> var video = document.querySelector('video');
>> video.preservesPitch = false;
>> var context = new AudioContext();
>> var sourceNode = context.createMediaElementSource(video);
>> var pitchShiftNode = context.createPitchShift();
>> pitchShiftNode.shiftAmount = 1.023;
>> sourceNode.connect(pitchShiftNode);
>> pitchShiftNode.connect(context.destination);
>
> Which implementations does that work in?

None, because there is no such thing as a PitchShiftNode.

> I see.
>
> That code is more complex than should be necessary. I see where you're
> coming from on separating the audio. Could we move the media decorator
> behind the scenes, and replace it with a simple getter/setter property
> like `videoElement.audio` so that that can happen automagically?
> Reminds me of createElement, createRange, document.implementation,
> etc. Warts!

I'm not entirely sure what you're asking here. If it's that you don't like the `context.createMediaElementSource()` or `context.createPitchShift()` syntax and would rather a constructor syntax, Issue #250 <https://github.com/WebAudio/web-audio-api/issues/250> in the Web Audio spec is the issue for you.

> But then again, you also just said that there are no APIs on OS X that
> allow an arbitrary pitch shift to be added to audio. If that is true,
> then your `createPitchShift` code would be possible anyway, is that

There is no such API for such post-processing built into the OS X media frameworks.

> Oh. Poor hardware integration, and now this…. Being Apple CEO is not Tim
> Cook's greatest gift…

I'm not sure how that's relevant.

As an example, there is an API for preserving pitch across rate changes: -[AVPlayerItem setAudioTimePitchAlgorithm:] <https://developer.apple.com/library/prerelease/ios/documentation/AVFoundation/Reference/AVPlayerItem_Class/index.html#//apple_ref/occ/instp/AVPlayerItem/audioTimePitchAlgorithm>
Re: [whatwg] VIDEO and pitchAdjustment
> On Mar 1, 2016, at 8:00 PM, Philip Jägenstedt wrote: > > On Wed, Mar 2, 2016 at 9:19 AM, Garrett Smith wrote: >> On Thu, Nov 12, 2015 at 11:32 AM, Philip Jägenstedt >> wrote: >>> On Thu, Nov 12, 2015 at 10:55 AM, Garrett Smith >>> wrote: On 11/12/15, Philip Jägenstedt wrote: > On Thu, Nov 12, 2015 at 9:07 AM, Garrett Smith > wrote: >> >> On 10/19/15, Philip Jägenstedt wrote: >>> On Tue, Sep 1, 2015 at 11:21 AM, Philip Jägenstedt >>> wrote: On Mon, Aug 31, 2015 at 9:48 PM, Domenic Denicola wrote: > From: Eric Carlson [mailto:eric.carl...@apple.com] >> >>> Two things. >>> >>> 1. Do the underlying media frameworks that browsers are using support >>> arbitrary pitch changes, or do they also only have the limited >>> preservesPitch-style API? >> >> Are there any problems getting in the way of pitch adjustment (without >> depending on playbackRate)? > > I don't know, that was basically my question too. If the underlying > APIs don't support it, that's a problem that needs to be fixed first. There are no such APIs on OS X which would allow an arbitrary pitch shift to be added to an otherwise normally playing piece of audio. IMO, this is a more appropriate request for the Web Audio API (adding an AudioNode which can add an arbitrary amount of pitch shift). At which point, there would be no need for this in HTMLMediaElement, as authors could make a simple node graph consisting of a MediaElementAudioSourceNode and a PitchShiftNode. -Jer
Re: [whatwg] VIDEO and pitchAdjustment
> On Mar 4, 2016, at 3:19 PM, Garrett Smith <dhtmlkitc...@gmail.com> wrote: > > On Fri, Mar 4, 2016 at 1:55 PM, Jer Noble <jer.no...@apple.com> wrote: >> >>> On Mar 1, 2016, at 8:00 PM, Philip Jägenstedt <phil...@opera.com> wrote: >>> >>> On Wed, Mar 2, 2016 at 9:19 AM, Garrett Smith <dhtmlkitc...@gmail.com> >>> wrote: >>>> On Thu, Nov 12, 2015 at 11:32 AM, Philip Jägenstedt <phil...@opera.com> >>>> wrote: >>>>> On Thu, Nov 12, 2015 at 10:55 AM, Garrett Smith <dhtmlkitc...@gmail.com> >>>>> wrote: >>>>>> On 11/12/15, Philip Jägenstedt <phil...@opera.com> wrote: >>>>>>> On Thu, Nov 12, 2015 at 9:07 AM, Garrett Smith <dhtmlkitc...@gmail.com> >>>>>>> wrote: >>>>>>>> >>>>>>>> On 10/19/15, Philip Jägenstedt <phil...@opera.com> wrote: >>>>>>>>> On Tue, Sep 1, 2015 at 11:21 AM, Philip Jägenstedt <phil...@opera.com> >>>>>>>>> wrote: >>>>>>>>>> On Mon, Aug 31, 2015 at 9:48 PM, Domenic Denicola <d...@domenic.me> >>>>>>>>>> wrote: >>>>>>>>>>> From: Eric Carlson [mailto:eric.carl...@apple.com] >>>> >>>>> Two things. >>>>> >>>>> 1. Do the underlying media frameworks that browsers are using support >>>>> arbitrary pitch changes, or do they also only have the limited >>>>> preservesPitch-style API? >>>> >>>> Are there any problems getting in the way of pitch adjustment (without >>>> depending on playbackRate)? >>> >>> I don't know, that was basically my question too. If the underlying >>> APIs don't support it, that's a problem that needs to be fixed first. >> >> >> There are no such APIs on OS X which would allow an arbitrary pitch shift to >> be added to an otherwise normally playing piece of audio. >> >> IMO, this is a more appropriate request for the Web Audio API (adding a >> Audio Node which can add an arbitrary amount of pitch shift). At which >> point, there would be no need for this in HTMLMediaElement, as authors could >> make a simple node graph consisting of an MediaElementAudioSourceNode and a >> PitchShiftNode. >> >> -Jer > > But that can't work on OSX, right? 
> I wonder how audio software on mac does it; Audacity, Amazing Slow
> Downer, GarageBand, Logic, and many others can all do this.

None of them use the built-in platform APIs to shift the pitch of encoded media. Each does it manually, within their app, and each probably uses a different algorithm to achieve the shift in pitch.

> Plus how would the Web Audio API solve for the use case?
>
> To frame an example, go to YT and pull up "Take It Easy" (Eagles). The
> song is about 50 cents flat of standard tuning. The pitch can be
> adjusted by setting playbackRate to 1.023 and setting
> mozPreservesPitch to false:
>
> var vv = document.querySelector("video");
> vv.mozPreservesPitch = 0;
> vv.playbackRate = 1.023;
>
> but that speeds it up. I don't want speed coupled with pitch.

The Web Audio equivalent would be:

var video = document.querySelector('video');
video.preservesPitch = false;
var context = new AudioContext();
var sourceNode = context.createMediaElementSource(video);
var pitchShiftNode = context.createPitchShift();
pitchShiftNode.shiftAmount = 1.023;
sourceNode.connect(pitchShiftNode);
pitchShiftNode.connect(context.destination);

-Jer

> Thanks,
> --
> Garrett
> @xkit
> ChordCycles.wordpress.com
> garretts.github.io
> personx.tumblr.com
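[Editorial sketch, appended to the archive.] The playbackRate value in the example above can be derived from the pitch offset itself, assuming equal temperament (100 cents per semitone, 1200 per octave). centsToRatio is an invented helper, not part of any web API:

```javascript
// Convert a pitch offset in cents to a frequency (and playback-rate) ratio.
// Equal temperament: 1200 cents per octave, so ratio = 2^(cents / 1200).
function centsToRatio(cents) {
  return Math.pow(2, cents / 1200);
}

// Raising a track that is ~50 cents flat back up to standard tuning:
console.log(centsToRatio(50).toFixed(4)); // 1.0293
```

Note this yields roughly 1.0293 for a 50-cent correction, slightly more than the 1.023 used in the quoted example, which was presumably tuned by ear.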