Re: [whatwg] Memory management problem of video elements

2014-08-20 Thread Philip Jägenstedt
On Tue, Aug 19, 2014 at 3:54 PM, duanyao duan...@ustc.edu wrote:
 On 2014-08-19 20:23, Philip Jägenstedt wrote:

 On Tue, Aug 19, 2014 at 11:56 AM, duanyao duan...@ustc.edu wrote:

 If the media element object keeps track of its current playing URL and
 current position (this requires little memory), and the media file is
 seekable, then the media is always resumable. The UA can drop any other
 memory associated with the media element, and users will not notice any
 difference except a small delay when they resume playing.

 That small delay is a problem, at least when it comes to audio
 elements used for sound effects. For video elements, there's the
 additional problem that getting back to the same state will require
 decoding video from the previous keyframe, which could take several
 seconds of CPU time.

 Of course, anything is better than crashing, but tearing down a media
 pipeline and recreating it in the exact same state is quite difficult,
 which is probably why nobody has tried it, AFAIK.

 The UA can pre-create the media pipeline according to some hints, e.g. the
 video element is becoming visible, so that the delay may be minimized.

 There is a load() method on media elements; can it be extended to instruct
 the UA to recreate the media pipeline? Then script can reduce the delay if
 it knows the media is about to be played.

load() resets all state and starts resource selection anew, so without
a way of detecting when a media element has destroyed its media
pipeline to save memory, calling load() can in the worst case increase
the time until play.

 Audio elements usually use much less memory, so UAs may have a different
 strategy for them.

 Many native media players can save the playing position on exit, and resume
 playing from that position on the next run.
 Most users are satisfied with such a feature. Is recovering the exact same
 state important to some web applications?

I don't know what is required for site compat, but ideally destroying
and recreating a pipeline should get you back to the exact same
currentTime and continue playback at the correct video frame and audio
sample. It could be done.

 I'm not familiar with game programming. Are sound effects small audio files
 that are usually played as a whole? Then it should be safe to recreate the
 pipeline.

There's also a trick called audio sprites where you put all sound
effects into a single file with some silence in between and then seek
to the appropriate offset.
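
As a rough illustration of the audio sprite trick, here is a minimal TypeScript sketch. The sprite names, the offsets, and the SeekablePlayer interface are made up for the example; a real page would pass an audio element and call stopIfPast() from a timeupdate handler or a timer.

```typescript
// Hypothetical sprite table: each effect's start offset and duration, in seconds.
interface SpriteClip {
  start: number;
  duration: number;
}

// Minimal structural view of a media element, so the sketch needs no DOM typings.
interface SeekablePlayer {
  currentTime: number;
  play(): void;
  pause(): void;
}

const sprites: Record<string, SpriteClip> = {
  jump: { start: 0.0, duration: 0.4 },
  coin: { start: 1.0, duration: 0.25 },
  hit: { start: 2.0, duration: 0.6 },
};

// Seek to the clip and start playback; returns the time at which to pause again.
function playSprite(player: SeekablePlayer, name: string): number {
  const clip = sprites[name];
  if (clip === undefined) {
    throw new Error("unknown sprite: " + name);
  }
  player.currentTime = clip.start;
  player.play();
  return clip.start + clip.duration;
}

// Pause once playback has reached the end of the clip; returns true if it paused.
function stopIfPast(player: SeekablePlayer, stopAt: number): boolean {
  if (player.currentTime >= stopAt) {
    player.pause();
    return true;
  }
  return false;
}
```

Because every effect lives in one file behind one element, tearing down that element's pipeline would delay all effects at once, which is why the resume delay matters here.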

Philip


Re: [whatwg] Memory management problem of video elements

2014-08-20 Thread duanyao

On 2014-08-20 15:52, Philip Jägenstedt wrote:

On Tue, Aug 19, 2014 at 3:54 PM, duanyao duan...@ustc.edu wrote:

On 2014-08-19 20:23, Philip Jägenstedt wrote:


On Tue, Aug 19, 2014 at 11:56 AM, duanyao duan...@ustc.edu wrote:

If the media element object keeps track of its current playing URL and
current position (this requires little memory), and the media file is
seekable, then the media is always resumable. The UA can drop any other
memory associated with the media element, and users will not notice any
difference except a small delay when they resume playing.

That small delay is a problem, at least when it comes to audio
elements used for sound effects. For video elements, there's the
additional problem that getting back to the same state will require
decoding video from the previous keyframe, which could take several
seconds of CPU time.

Of course, anything is better than crashing, but tearing down a media
pipeline and recreating it in the exact same state is quite difficult,
which is probably why nobody has tried it, AFAIK.

The UA can pre-create the media pipeline according to some hints, e.g. the
video element is becoming visible, so that the delay may be minimized.

There is a load() method on media elements; can it be extended to instruct
the UA to recreate the media pipeline? Then script can reduce the delay if
it knows the media is about to be played.

load() resets all state and starts resource selection anew, so without
a way of detecting when a media element has destroyed its media
pipeline to save memory, calling load() can in the worst case increase
the time until play.

I meant we could add an optional parameter to load() to support a soft
reload, e.g. load(boolean soft), which doesn't reset state or re-select
the resource.

Maybe it is better to reuse the pause() method to request that the UA
recreate the media pipeline. If a media element is in the memory-saving
state, it must be in the paused state as well, so invoking pause() should
not have undesired side effects.


Anyway, it seems the spec needs to introduce a new state for media
elements: a memory-saving state.
In low-memory conditions, the UA can select some low-priority media
elements and put them into the memory-saving state.


Suggested priorities for videos, from highest to lowest, are:
(1) recently (re)started, playing, and visible videos
(2) previously (re)started, playing, and visible videos
(3) paused and visible videos; playing and invisible videos
(4) paused and invisible videos

Priorities for audio elements remain to be considered.

The memory-saving state implies the paused state.

If memory becomes sufficient, or a media element's priority is about to
change, the UA can restore some of them to the normal paused state
(previously playing media doesn't automatically resume playback).

If the pause() method is invoked on a media element in the memory-saving
state, the UA must restore it to the normal paused state.
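
To make the suggested ordering concrete, here is a minimal TypeScript sketch of how a UA might rank videos for reclamation. It is only one reading of the priorities above: the VideoInfo shape and the use of a start timestamp to separate classes (1) and (2) are assumptions, not part of any spec or browser.

```typescript
// Snapshot of the facts the proposed priorities depend on (names are illustrative).
interface VideoInfo {
  id: string;
  playing: boolean;
  visible: boolean;
  lastStartedAt: number; // time of the most recent (re)start; larger is more recent
}

// Lower number = higher priority. Classes (1) and (2) of the proposal are both
// "playing and visible"; recency separates them in reclaimOrder() below.
function priorityClass(v: VideoInfo): number {
  if (v.playing && v.visible) return 1;
  if (v.playing || v.visible) return 3; // paused+visible, or playing+invisible
  return 4; // paused and invisible
}

// Returns a copy ordered so that the first entries are the first candidates
// for the memory-saving state.
function reclaimOrder(videos: VideoInfo[]): VideoInfo[] {
  return videos.slice().sort((a, b) => {
    const byClass = priorityClass(b) - priorityClass(a); // lowest priority first
    if (byClass !== 0) return byClass;
    return a.lastStartedAt - b.lastStartedAt; // within a class, oldest start first
  });
}
```

A UA following this scheme would walk the returned list from the front and stop as soon as enough memory has been freed.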



Audio elements usually use much less memory, so UAs may have a different
strategy for them.

Many native media players can save the playing position on exit, and resume
playing from that position on the next run.
Most users are satisfied with such a feature. Is recovering the exact same
state important to some web applications?

I don't know what is required for site compat, but ideally destroying
and recreating a pipeline should get you back to the exact same
currentTime and continue playback at the correct video frame and audio
sample. It could be done.


I'm not familiar with game programming. Are sound effects small audio files
that are usually played as a whole? Then it should be safe to recreate the
pipeline.

There's also a trick called audio sprites where you put all sound
effects into a single file with some silence in between and then seek
to the appropriate offset.

I think if the UA can get and set the currentTime property accurately, it
should be able to recreate the pipeline with the same accuracy. What are the
main factors limiting the accuracy?
However, a UA using priorities to manage media memory is unlikely to
reclaim the memory of an in-use audio sprite element.


Philip





Re: [whatwg] Memory management problem of video elements

2014-08-20 Thread Philip Jägenstedt
On Wed, Aug 20, 2014 at 12:04 PM, duanyao duan...@ustc.edu wrote:
 On 2014-08-20 15:52, Philip Jägenstedt wrote:

 On Tue, Aug 19, 2014 at 3:54 PM, duanyao duan...@ustc.edu wrote:

 I'm not familiar with game programming. Are sound effects small audio files
 that are usually played as a whole? Then it should be safe to recreate the
 pipeline.

 There's also a trick called audio sprites where you put all sound
 effects into a single file with some silence in between and then seek
 to the appropriate offset.

 I think if the UA can get and set the currentTime property accurately, it
 should be able to recreate the pipeline with the same accuracy. What are the
 main factors limiting the accuracy?

I don't know, but would guess that not all media frameworks can seek
to an exact audio sample but only to the beginning of a video frame or
an audio frame, in which case currentTime would be slightly off. One
could just lie about currentTime until playback continues, though.
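
To give a feel for the size of the error being guessed at, here is a small TypeScript sketch that snaps a requested time to the start of the frame containing it. The frame size (1024 samples at 48 kHz) and the requested time are assumptions chosen for illustration, not measurements of any media framework.

```typescript
// Snap a requested time to the start of the frame that contains it.
function snapToFrameStart(time: number, frameDuration: number): number {
  return Math.floor(time / frameDuration) * frameDuration;
}

// Assumed audio frame: 1024 samples at 48 kHz, roughly 21 ms.
const audioFrame = 1024 / 48000;
const requested = 12.345;
const actual = snapToFrameStart(requested, audioFrame);

// The position error is never negative and always smaller than one frame.
const seekError = requested - actual;
```

Under these assumptions the resume position is off by less than one audio frame; for video, the same calculation with the frame interval gives a correspondingly larger bound.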

Philip


Re: [whatwg] Memory management problem of video elements

2014-08-20 Thread duanyao

On 2014-08-20 19:26, Philip Jägenstedt wrote:

On Wed, Aug 20, 2014 at 12:04 PM, duanyao duan...@ustc.edu wrote:

On 2014-08-20 15:52, Philip Jägenstedt wrote:


On Tue, Aug 19, 2014 at 3:54 PM, duanyao duan...@ustc.edu wrote:

I'm not familiar with game programming. Are sound effects small audio files
that are usually played as a whole? Then it should be safe to recreate the
pipeline.

There's also a trick called audio sprites where you put all sound
effects into a single file with some silence in between and then seek
to the appropriate offset.

I think if the UA can get and set the currentTime property accurately, it
should be able to recreate the pipeline with the same accuracy. What are the
main factors limiting the accuracy?

I don't know, but would guess that not all media frameworks can seek
to an exact audio sample but only to the beginning of a video frame or
an audio frame, in which case currentTime would be slightly off. One
could just lie about currentTime until playback continues, though.

Such a limitation also affects seeking, not only the memory-saving feature,
and the spec allows for quality-of-implementation issues, so I think this
is acceptable.
Additionally, media in the memory-saving state must be paused, so I think
users won't care about a small error in the resume position.




Philip





Re: [whatwg] Memory management problem of video elements

2014-08-20 Thread Ian Hickson
On Wed, 20 Aug 2014, Philip Jägenstedt wrote:

 I don't know, but would guess that not all media frameworks can seek to 
 an exact audio sample but only to the beginning of a video frame or an 
 audio frame, in which case currentTime would be slightly off. One could 
 just lie about currentTime until playback continues, though.

Note that setting currentTime is required to be precise, even if that 
means actually playing the content in silence for a while to get to the 
precise point. To seek fast, we have a separate fastSeek() method.
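
The distinction between the two seeking paths can be sketched in TypeScript as follows. The Seekable interface and the seekTo() helper are hypothetical; currentTime and fastSeek() are the real media element members, and fastSeek() is treated as optional on the assumption that not every browser implements it.

```typescript
// Structural subset of a media element; fastSeek() may be missing.
interface Seekable {
  currentTime: number;
  fastSeek?: (time: number) => void;
}

// Seeks and reports which path was taken, so callers know whether the
// resulting position is exact.
function seekTo(media: Seekable, time: number, exact: boolean): "exact" | "fast" {
  if (!exact && typeof media.fastSeek === "function") {
    media.fastSeek(time); // may trade precision for speed
    return "fast";
  }
  media.currentTime = time; // required to be precise
  return "exact";
}
```

A UA restoring a torn-down pipeline would presumably need the precise path if the goal is to resume at the same frame and sample.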

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Memory management problem of video elements

2014-08-19 Thread Philip Jägenstedt
On Tue, Aug 19, 2014 at 9:12 AM, duanyao duan...@ustc.edu wrote:
 Hi,

 Recently I investigated the memory usage of the HTML video element in
 several desktop browsers (Firefox and Chrome on Windows and Linux, and
 IE 11), and found some disappointing results:

 1. A video element in a playable state consumes a significant amount of
 memory. For each playing, paused, or preload=auto video element, the
 memory usage is up to 30~80MB; for those with preload=metadata, memory
 usage is 6~13MB; for those with preload=none, memory usage is negligible.
 These numbers were measured with 720p to 1080p H.264 videos; videos in
 lower resolutions use less memory.

 2. For a page with multiple video elements, memory usage scales up
 linearly. So a page with tens of videos can exhaust the memory space of
 a 32-bit browser. In my tests, such a page may crash the browser or
 freeze a low-memory system.

 3. Even if a video element is or becomes invisible, either by being out
 of the viewport, having a display:none style, or being removed from the
 active DOM tree (but not released), almost the same amount of memory is
 still occupied.

 4. The methods for reducing the memory occupied by video elements require
 script, and the element must be modified, for example by removing and
 releasing the element.

 Although this looks like an implementors' problem rather than a spec
 problem, I think the current spec encourages implementors to push the
 responsibility for memory management of media elements onto authors, which
 is very bad. See section 4.8.14.18
 (http://www.whatwg.org/specs/web-apps/current-work/multipage/embedded-content.html#best-practices-for-authors-using-media-elements):

4.8.14.18 Best practices for authors using media elements
it is a good practice to release resources held by media elements when
 they are done playing, either by being very careful about removing all
 references to the element and allowing it to be garbage collected, or,
 even better, by removing the element's src attribute and any source
 element descendants, and invoking the element's load() method.

 Why is this BAD, in my opinion?

 1. It requires script. What if the UA doesn't support script or has it
 disabled (email reader, EPUB reader, etc.), or the script simply fails to
 download? What if users insert many video elements into a page hosted by a
 site that is not aware of this problem (so no video management script is
 available)? Users' browsers may crash, or their systems may freeze,
 for no obvious reason.

 2. It is hard to make the script correct. Authors can't simply depend on
 "done playing", because users may pause a video in the middle,
 start playing another one, and then resume the first one. So authors
 have to determine which video is out of the viewport, remove its src,
 and record its currentTime; when it comes back into the viewport, they
 must set src and seek to the previous currentTime. This is quite
 complicated. For WYSIWYG HTML editors based on browsers, it is even more
 complicated because of the interaction with the undo manager.

 3. Browsers are in a much better position to get memory management
 right. Browsers should be able to save most of the memory of an
 invisible video by keeping only its state (optionally with the current
 frame), and to limit the total amount of memory used by media elements.

 So I think the spec should remove section 4.8.14.1, and instead stress
 the responsibility of the UA for memory management of media elements.
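
For reference, the author-side workaround described in point 2 above might look roughly like the following TypeScript sketch. The VideoLike interface and both function names are hypothetical, viewport detection is left out, and the seek on resume assumes the UA honors a currentTime set before metadata has loaded; waiting for loadedmetadata is the more conservative alternative.

```typescript
// Structural subset of a video element used by this sketch.
interface VideoLike {
  currentTime: number;
  getAttribute(name: string): string | null;
  setAttribute(name: string, value: string): void;
  removeAttribute(name: string): void;
  load(): void;
}

interface SavedState {
  src: string;
  position: number;
}

// Call when the video leaves the viewport. Returns what is needed to restore it,
// or null if the element has no src to release.
function suspendVideo(video: VideoLike): SavedState | null {
  const src = video.getAttribute("src");
  if (src === null) return null;
  const saved: SavedState = { src: src, position: video.currentTime };
  video.removeAttribute("src");
  video.load(); // with no src, this resets the element so resources can be freed
  return saved;
}

// Call when the video comes back into the viewport.
function resumeVideo(video: VideoLike, saved: SavedState): void {
  video.setAttribute("src", saved.src); // setting src starts loading again
  video.currentTime = saved.position;   // assumption: kept as the initial position
}
```

Even this simplified version loses the paused/playing flag, volume, playback rate, and any source children, which illustrates why getting such a script right is tedious.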

What concrete advice should the spec give to UAs on memory management?
If a script creates a thousand media elements and seeks those to a
thousand different offsets, what is a browser to do? It looks like a
game preparing a lot of sound effects with the expectation that they
will be ready to go, so which ones should be thrown out?

A media element in an active document never gets into a state where it
could never start playing again, so I don't know what to do other than
trying to use less memory per media element. Have you filed bugs at
the browsers that crash or freeze the system?

Regardless of what the UA does, section 4.8.14.1 is still good advice
when the script knows that a resource won't be needed but the browser
cannot. Example: a sound effect is played for the last time in a game as
the last secret in the level is found.

Philip


Re: [whatwg] Memory management problem of video elements

2014-08-19 Thread duanyao

On 2014-08-19 16:00, Philip Jägenstedt wrote:

What concrete advice should the spec give to UAs on memory management?
If a script creates a thousand media elements and seeks those to a
thousand different offsets, what is a browser to do? It looks like a
game preparing a lot of sound effects with the expectation that they
will be ready to go, so which ones should be thrown out?


The UA can limit the number of simultaneously playing media elements
according to available memory or user preference, and fire error events on
media elements if the limit is hit. We may need another error code;
currently some UAs fire MEDIA_ERR_DECODE, which is misleading.

If the thousand media elements are just sought, not playing, the UA can seek
them one by one and drop the cached frames afterwards, keeping only the
current frames; if memory is even more limited, the current frames can also
be dropped.

For HTML-based slideshows or textbooks, it is quite possible to have tens
of videos in one HTML file.

For audio elements, I think it is less problematic because they usually
use far less memory than videos.

A media element in an active document never gets into a state where it
could never start playing again, so I don't know what to do other than
trying to use less memory per media element.

What do you mean by a state where it could never start playing again?
If the media element object keeps track of its current playing URL and
current position (this requires little memory), and the media file is
seekable, then the media is always resumable. The UA can drop any other
associated memory of

Re: [whatwg] Memory management problem of video elements

2014-08-19 Thread Philip Jägenstedt
On Tue, Aug 19, 2014 at 11:56 AM, duanyao duan...@ustc.edu wrote:
 On 2014-08-19 16:00, Philip Jägenstedt wrote:

 What concrete advice should the spec give to UAs on memory management?
 If a script creates a thousand media elements and seeks those to a
 thousand different offsets, what is a browser to do? It looks like a
 game preparing a lot of sound effects with the expectation that they
 will be ready to go, so which ones should be thrown out?


 The UA can limit the number of simultaneously playing media elements
 according to available memory or user preference, and fire error events on
 media elements if the limit is hit. We may need another error code;
 currently some UAs fire MEDIA_ERR_DECODE, which is misleading.

Opera 12.16 using Presto had such a limit to avoid address space
exhaustion on 32-bit machines, limiting the number of concurrent media
pipelines to 200. However, when the limit was reached it just acted as
if the network was stalling while waiting for an existing pipeline to
be destroyed.

It wasn't a great model, but if multiple browsers (want to) impose
limits like this, maybe a way for script to tell the difference would
be useful.
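
A pipeline cap of this kind can be sketched in TypeScript as below. This is not how Presto was implemented; the PipelineLimiter class is hypothetical, and the boolean returned by acquire() stands in for the kind of signal that would let script tell a real network stall from waiting on the limit.

```typescript
// Caps the number of live media pipelines; extra requests wait for a free slot.
class PipelineLimiter {
  private active = 0;
  private readonly waiting: Array<() => void> = [];

  constructor(private readonly limit: number) {}

  // Runs onReady immediately if a slot is free, otherwise queues it.
  // Returns true if the pipeline could be created right away.
  acquire(onReady: () => void): boolean {
    if (this.active < this.limit) {
      this.active++;
      onReady();
      return true;
    }
    this.waiting.push(onReady);
    return false;
  }

  // Call when a pipeline is destroyed; the oldest waiter, if any, takes the slot.
  release(): void {
    const next = this.waiting.shift();
    if (next !== undefined) {
      next(); // slot passes directly to the waiter, so the active count is unchanged
    } else if (this.active > 0) {
      this.active--;
    }
  }

  // Number of requests currently waiting for a slot.
  pending(): number {
    return this.waiting.length;
  }
}
```

With a limit of 200, the 201st element would simply sit in the queue until some other pipeline is released, which matches the stalling behavior described above.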

 If the thousand media elements are just sought, not playing, the UA can seek
 them one by one and drop the cached frames afterwards, keeping only the
 current frames; if memory is even more limited, the current frames can also
 be dropped.

 For HTML-based slideshows or textbooks, it is quite possible to have tens
 of videos in one HTML file.

 For audio elements, I think 

Re: [whatwg] Memory management problem of video elements

2014-08-19 Thread duanyao

On 2014-08-19 20:23, Philip Jägenstedt wrote:

On Tue, Aug 19, 2014 at 11:56 AM, duanyao duan...@ustc.edu wrote:

On 2014-08-19 16:00, Philip Jägenstedt wrote:


What concrete advice should the spec give to UAs on memory management?
If a script creates a thousand media elements and seeks those to a
thousand different offsets, what is a browser to do? It looks like a
game preparing a lot of sound effects with the expectation that they
will be ready to go, so which ones should be thrown out?


The UA can limit the number of simultaneously playing media elements
according to available memory or user preference, and fire error events on
media elements if the limit is hit. We may need another error code;
currently some UAs fire MEDIA_ERR_DECODE, which is misleading.

Opera 12.16 using Presto had such a limit to avoid address space
exhaustion on 32-bit machines, limiting the number of concurrent media
pipelines to 200. However, when the limit was reached it just acted as
if the network was stalling while waiting for an existing pipeline to
be destroyed.

It wasn't a great model, but if multiple browsers (want to) impose
limits like this, maybe a way for script to tell the difference would
be useful.


I think it is even better for the UA to play the media element that the
user/script tried to play most recently, and drop the pipelines of those
that are paused and/or invisible.

P.S. I forgot to say that UAs that fire a MEDIA_ERR_DECODE event for a
not-enough-memory error also show a "decode error" message on the UI of
video elements, which confuses users too.

If the thousand media