Re: [whatwg] Video with MIME type application/octet-stream
On 18.08.2010 13:47, Julian Reschke wrote: In the meantime, Ian did some tests, see http://krijnhoetmer.nl/irc-logs/whatwg/20100819#l-28 and http://hixie.ch/tests/adhoc/html/video/001.html Ian, any chance you could test for an *absent* Content-Type? Best regards, Julian
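For anyone wanting to reproduce the setup locally, here is a minimal sketch using Python's standard http.server that serves the same file twice: once with an explicit application/octet-stream Content-Type and once with no Content-Type header at all. The file name, paths, and port are illustrative assumptions, not part of Ian's test:

    # Serve one video file under two paths: one with Content-Type
    # application/octet-stream, one with no Content-Type header at all.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    VIDEO = open("test.webm", "rb").read()  # assumed local test file

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            if self.path == "/octet-stream.webm":
                self.send_header("Content-Type", "application/octet-stream")
            # for /no-type.webm (or any other path) Content-Type is deliberately omitted
            self.send_header("Content-Length", str(len(VIDEO)))
            self.end_headers()
            self.wfile.write(VIDEO)

    HTTPServer(("", 8000), Handler).serve_forever()

Pointing a <video src="http://localhost:8000/no-type.webm"> at the second path then shows whether a given browser sniffs the payload when the header is absent, versus the explicit octet-stream case.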
Re: [whatwg] Bluetooth devices
Is there any intention to provide access to Bluetooth devices through the Device element? Addressing Bluetooth specifically might or might not be out of scope of the device element, depending on what the scope will be :) This is too low-level for my taste; what would a web page want direct communication over Bluetooth for? If RS-232 will be left out, Bluetooth probably should be [left out] too. Higher-level JavaScript interfaces for accessing filesystems and user-selected audio streams cover all the use cases I can think of. But then, I don't fully understand what use cases device is trying to solve in the first place.
Re: [whatwg] Bluetooth devices
I am not sure whether the physical connectivity used has much bearing on what devices are connected and usable, honestly. Why does the 'virtual' wire matter (USB, serial, Bluetooth, built-in, IEEE 1394, ...)? On Aug 19, 2010, at 14:28, Bjartur Thorlacius wrote: [...] David Singer Multimedia and Software Standards, Apple Inc.
Re: [whatwg] Bluetooth devices
Agreed. Serial is the only case where it might make sense (if you were making a web version of HyperTerminal). J On Thu, Aug 19, 2010 at 1:52 PM, David Singer sin...@apple.com wrote: [...]
Re: [whatwg] Bluetooth devices
My specific interest is in the ability to connect a mobile phone to BT-serial-capable medical devices through a JS interchange with the user, rather than having to load a handset-specific program on the handset to do this. There are kludgy ways to do this now, but it would be far more elegant to build this into the Device element. Don Rosen CTO dro...@generationone.com 718-874-9449
Re: [whatwg] On implementing videos with multiple tracks in HTML5
On Sat, 22 May 2010, Carlos Andrés Solís wrote: Imagine a hypothetical website that delivers videos in multiple languages. Like on a DVD, where you can choose your audio and subtitle language. And also imagine there is the possibility of downloading a file with the video, along with either the chosen audio/sub tracks, or all of them at once. Right now, though, there's no way to deliver multiple audio and subtitle streams with HTML5 and WebM. Since the latter supports only one audio and one video track, with no embedded subtitles, creating a file with multiple tracks is impossible, unless using full Matroska instead of WebM - save for the fact that the standard proposed is WebM and not Matroska. A solution could be to stream the full Matroska with all tracks embedded. This, though, would be inefficient, since the user will often select only one language to view the video, and there's no way yet to stream only the selected tracks to the user. I have thought of two solutions for this: * Solution 1: Server-side demuxing. The video with all tracks is stored as a Matroska file. The server demuxes the file, generates a new one with the chosen tracks, and streams only the tracks chosen by the user. When the user chooses to download the full video, the full Matroska file is downloaded with no overhead. The downside is the server-side demuxing and remuxing; fortunately most users only need to choose once. Also, there's the problem of having to download the full file instead of a file with only the wanted tracks; this could be solved by even more muxing.

On Sun, 23 May 2010, Silvia Pfeiffer wrote: For the last 10 years, we have tried to solve many of the media challenges on servers, making servers increasingly intelligent, and thereby slow, and no longer real HTTP servers. Much of that happened in proprietary software, but others tried it with open software, too. For example, I worked on a project called Annodex which was trying to make open media resources available on normal HTTP servers with only a CGI script installed that would allow remuxing files for serving time segments of the media resources. Or look at any of the open-source RTSP streaming servers that were created. We have learnt in the last 10 years that the Web is better served with a plain HTTP server than with custom media servers, and we have started putting the intelligence into user agents instead. User agents now know how to do byte-range requests to retrieve temporal segments of a media resource. I believe for certain formats it's even possible to retrieve tracks through byte-range requests only. In short, the biggest problem with your idea of dynamic muxing on a server is that it's very CPU-intensive and doesn't lead easily to a scalable server. Also, it leads to specialised media servers in contrast to just using a simple HTTP server. It's possible, of course, but it's complex and not general-purpose.

On Mon, 31 May 2010, Lachlan Hunt wrote: WebM, just like Matroska, certainly does support multiple video and audio tracks. The current limitation is that browser implementations don't yet provide an interface or API for track selection. Whether or not authors would actually do this depends on their use case and what trade-offs they're willing to make. The use cases I'm aware of for multiple tracks include offering stereo and surround-sound alternatives, audio descriptions, audio commentaries, or multiple languages. The trade-off here is in bandwidth usage vs. storage space (or processing time if you're doing dynamic server-side muxing). Duplicating the video track in each file, with each file containing only a single audio track, saves bandwidth for users while increasing storage space. Storing all audio tracks in one multi-track WebM file avoids duplication, while increasing the bandwidth for users downloading tracks they may not need. The latter theoretically allows the user to dynamically switch audio tracks to, e.g., change language or listen to commentary, without having to download a whole new copy of the video. The former requires the user to choose which tracks they want prior to downloading the appropriate file. If there's only a choice between 2 or maybe 3 tracks, then the extra bandwidth may be insignificant. If, however, you're offering several alternate languages in both stereo and surround sound, with audio descriptions and director's commentary — the kind of stuff you'll find on many commercial DVDs — then the extra bandwidth wasted by users downloading so many tracks they don't need may not be worth it.

On Sat, 22 May 2010, Carlos Andrés Solís wrote: * Solution 2: User-side muxing. Each track (video, audio, subtitles) is stored in standalone files. The server streams the tracks chosen by the user, and the web browser muxes them back. When the user chooses to download the video, the
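To put rough numbers on Lachlan's trade-off, here is a back-of-the-envelope calculation with assumed bitrates (a 1500 kbit/s video track and eight 128 kbit/s audio tracks; these figures are illustrative, not from the thread):

    video_kbps, audio_kbps, n_audio = 1500, 128, 8

    # One multi-track file: stored once, but every viewer downloads every audio track.
    multitrack_wire = video_kbps + n_audio * audio_kbps  # 2524 kbit/s per viewer
    multitrack_disk = multitrack_wire                    # one copy on the server

    # One file per language: viewers download only what they need,
    # but the video track is stored n_audio times over.
    pertrack_wire = video_kbps + audio_kbps              # 1628 kbit/s per viewer
    pertrack_disk = n_audio * pertrack_wire              # 13024 kbit/s-equivalent on disk

    print(f"multi-track wire overhead:  {multitrack_wire / pertrack_wire:.2f}x")  # ~1.55x
    print(f"per-track storage overhead: {pertrack_disk / multitrack_disk:.2f}x")  # ~5.16x

With eight audio tracks, the single multi-track file costs each viewer about 1.5x the bandwidth of a per-language file, while per-language files cost the server about 5x the storage; the multi-track approach only pays off when tracks are few or viewers actually switch between them.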
Re: [whatwg] On implementing videos with multiple tracks in HTML5
On Fri, Aug 20, 2010 at 9:58 AM, Ian Hickson i...@hixie.ch wrote: [...]
Re: [whatwg] A standard for adaptive HTTP streaming for media resources
On Tue, 25 May 2010, Silvia Pfeiffer wrote: We've in the past talked about how there is a need to adapt the bitrate version of an audio or video resource that is being delivered to a user agent based on the available bandwidth on the network, the available CPU cycles, and possibly other conditions. It has been discussed to do this using @media queries and providing links to alternative versions of a media resource through the source element inside it. But this is a very inflexible solution, since the side conditions for choosing a bitrate version may change over time and what is good at the beginning of video playback may not be good 2 minutes later (in particular if you're on a mobile device driving through town). Further, we have discussed the need for supporting a live streaming approach such as RTP/RTSP - but RTP/RTSP has its own non-Web issues that will make it difficult to make it part of a Web application framework - in particular it requires a custom server and won't just work with an HTTP server. In recent times, vendors have indeed started moving away from custom protocols and custom servers and have moved towards more intelligence in the UA and special approaches to streaming over HTTP. Microsoft developed Smooth Streaming, Apple developed HTTP Live Streaming, and Adobe recently launched HTTP Dynamic Streaming. (Also see a comparison at). As these vendors are working on it for MPEG files, so are some people for Ogg. I'm not aware of anyone looking at it for WebM yet. Standards bodies haven't held back either. The 3GPP organisation has defined 3GPP Adaptive HTTP Streaming (AHS) in their March 2010 release 9 of 3GPP. Now, MPEG has started consolidating approaches for adaptive bitrate streaming over HTTP for MPEG file formats. Adaptive bitrate streaming over HTTP is the correct approach towards solving the double issues of adapting to dynamic bandwidth availability, and of providing a live streaming approach that is reliable. Right now, no standard exists that has been proven to work in a format-independent way. This is particularly an issue for HTML5, where we want at least support for MPEG-4, Ogg Theora/Vorbis, and WebM. I know that it is not difficult to solve this issue in a format-independent way, which is why solutions are springing up everywhere. They are, however, not compatible and create a messy environment where people have to install solutions for multiple different approaches to make sure they are covered for different platforms, different devices, and different formats. It's a clear situation where a new standard is necessary. The standard basically needs to provide three different things: * authoring of content in a specific way * description of the alternative files on the server and their features for the UA to download and use for switching * a means to easily switch mid-way between these alternative files

On Mon, 24 May 2010, Chris Holland wrote: I don't have something decent to offer for the first and last bullets, but I'd like to throw in something for the middle bullet: the HTTP protocol is vastly under-utilized today when it comes to URIs and the various Accept* headers. Today developers might embed an image in a document as chris.png. Web daemons know to find that resource and serve it; in this sense, chris.png is a resource locator. Technically one might reference the image as a resource identifier named chris. The user's browser may send image/gif as the only value of an Accept header, signaling the following to the server: "I'm supposed to download an image of chris here, but I only support GIF, so don't bother sending me a .png." In a perhaps more useful scenario the user agent may tell the server "don't bother sending me an image, I'm a screen reader, do you have anything my user could listen to?". In this sense, the document's author didn't have to code against or account for every possible context out there; the author merely puts in a reference to a higher-level representation that should remain forward-compatible with evolving servers and user agents. By passing a list of accepted MIME types, the Accept HTTP header provides this ability to serve context-aware resources, which starts to feel like a contender for catering to your middle bullet. To that end, new MIME types could be defined to encapsulate media type/bitrate combinations. Or the Accept header might remain confined to media types, and acceptable bitrate information might get encapsulated into a new header, such as X-Accept-Bitrate. If you combined the above approach with existing standards for HTTP byte-range requests, there may be a mechanism there to cater to your 3rd bullet as well: when network conditions deteriorate, the client could interrupt the current stream and issue a new request to the server where it left off. Although this likel
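As a concrete sketch of that last idea (not from the thread), the following client pulls a resource in byte ranges over plain HTTP and steps down to a lower-bitrate variant when measured throughput drops; the URLs, chunk size, and threshold are all assumptions:

    import time
    import urllib.error
    import urllib.request

    VARIANTS = [  # hypothetical alternative encodings, highest bitrate first
        "http://example.com/video-2000k.webm",
        "http://example.com/video-500k.webm",
    ]
    CHUNK = 1024 * 1024  # fetch 1 MiB per range request

    def fetch_range(url, start, end):
        # Fetch bytes [start, end]; return (data, bytes/sec), or (b"", 0) past EOF.
        req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
        t0 = time.monotonic()
        try:
            with urllib.request.urlopen(req) as resp:
                data = resp.read()
        except urllib.error.HTTPError as err:
            if err.code == 416:  # requested range lies past the end of the file
                return b"", 0.0
            raise
        return data, len(data) / max(time.monotonic() - t0, 1e-6)

    variant, offset = 0, 0
    while True:
        data, throughput = fetch_range(VARIANTS[variant], offset, offset + CHUNK - 1)
        if not data:
            break
        offset += len(data)
        # Naive adaptation: step down one variant if throughput falls below ~250 kB/s.
        # A real player would switch at time-aligned segment boundaries, since byte
        # offsets are not comparable between differently encoded variants.
        if throughput < 250_000 and variant + 1 < len(VARIANTS):
            variant += 1

This is only the switching skeleton; as Silvia notes, the missing standard is precisely the description of the alternative files and their time-aligned switch points that the UA would need.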
Re: [whatwg] A standard for adaptive HTTP streaming for media resources
On Fri, Aug 20, 2010 at 11:08 AM, Ian Hickson i...@hixie.ch wrote: [...]