On 8/1/2013 11:52 PM, Richard Bair wrote:
as far as I understand it, your idea is to start preparing the next
frame right after synchronization (scene graph to render tree) is
completed for the previous frame. Do I get it correctly? If yes,
we'll likely re-introduce the old problem of input event starvation.
There will be no window, or only a very small one, in which events
can be processed on the event thread, because the thread will always
be either busy handling CSS, animations, etc., or blocked waiting for
the render thread to finish rendering.
I think the difference is that I was going to use the vsync as the
limiter. That is, the first time through we do a pulse, then we
schedule another pulse, then we run that other pulse (almost
immediately), then we hit the sync point with the render thread and
have to wait for it because it is blocked on vsync. Meanwhile the
user events are being queued up. When we get back from this, the next
pulse is placed at the end of the queue, so we process all the queued
input events and then the next pulse.
I now see the picture.
As I wrote in the previous email, it seems that we are currently not
blocked waiting for vsync, at least on Windows with the D3D pipeline.
Anyway, even if we "fix" that, what you propose means that sometimes
both threads will be blocked (the render thread waiting for vsync, the
event thread waiting for the render thread), which doesn't sound ideal.
Note that on Windows and Mac OS X, input events and application
runnables are handled differently at the native level (either using
different mechanisms, or having different priorities). To implement this
proposal, we'll need to eliminate the difference, which may be a
difficult task.
Thanks,
Artem
Whenever an animation starts, the runningAnimationCounter is
incremented. When an animation ends, it is decremented (or it could
be a Set<Animation> or whatever). The pendingPulse flag is simply false
to start with, and is checked before we submit another pulse. Whenever a
node in the scene graph becomes dirty, or the scene is resized, or
stylesheets are changed, or in any case something happens that
requires us to draw again, we check this flag and fire a new pulse if
one is not already pending.
The scene graph is only changed on the event thread. So my guess is that
"fire a new pulse" is just:
Platform.runLater(() -> pulse())
Right.
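In sketch form, the bookkeeping I have in mind would be something like this
(the class and method names are made up for illustration, not the actual
Quantum/Glass API; everything here runs on the FX thread):

    import javafx.application.Platform;

    final class PulseScheduler {

        // Incremented when an animation starts, decremented when it ends
        // (could just as well be a Set<Animation>).
        private int runningAnimationCounter = 0;

        // True while a pulse has been posted but not yet run.
        private boolean pendingPulse = false;

        void animationStarted() { runningAnimationCounter++; }
        void animationStopped() { runningAnimationCounter--; }

        // Called whenever a node goes dirty, the scene is resized,
        // stylesheets change, etc. -- anything that requires us to draw again.
        void requestPulse() {
            if (!pendingPulse) {
                pendingPulse = true;
                Platform.runLater(this::pulse);   // "fire a new pulse"
            }
        }

        void pulse() {
            // animations -> CSS -> layout -> bounds -> sync; filled in below.
        }
    }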
When a pulse occurs, we process animations first, then CSS, then
layout, then validate all the bounds, and *then we block* until the
rendering thread is available for synchronization. I believe this is
what we are doing today (it was a change Steve and I looked at with
Jasper a couple of months ago, IIRC).
But now for the new part. Immediately after synchronization, we check
the runningAnimationCounter. If it is > 0, then we fire off a new
pulse and leave the pendingPulse flag set to true. If
runningAnimationCounter == 0, then we flip pendingPulse to false.
Other than the pick that always happens at the end of the pulse, we
do nothing else new and, if the pick didn't cause state to change, we
are now quiescent.
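Filling in pulse() from the sketch above (again, placeholder names; the
process*/synchronize*/pick methods just stand in for the real work):

    void pulse() {
        processAnimations();       // animations first
        processCSS();              // then CSS
        doLayout();                // then layout
        validateBounds();          // then validate all the bounds

        // ... and *then we block* until the render thread is available and
        // copy the dirty state from the scene graph into the render tree.
        synchronizeWithRenderThread();

        // The new part: decide, immediately after synchronization, whether
        // another pulse is needed.
        if (runningAnimationCounter > 0) {
            Platform.runLater(this::pulse);   // new pulse; pendingPulse stays true
        } else {
            pendingPulse = false;             // quiescent until something goes dirty
        }

        pick();   // the pick that always happens at the end of the pulse
    }

    private void processAnimations()           { /* ... */ }
    private void processCSS()                  { /* ... */ }
    private void doLayout()                    { /* ... */ }
    private void validateBounds()              { /* ... */ }
    private void synchronizeWithRenderThread() { /* ... */ }
    private void pick()                        { /* ... */ }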
Meanwhile, the render thread has run off doing its thing. The last
step of rendering is the present, where we block until the frame is
presented; when we return, that puts us *immediately* at the start of
the next 16.66ms cycle. Since the render thread has just
completed its duties, it goes back to waiting until the FX thread
comes around asking to sync up again.
If there is an animation going on such that a new pulse had been
fired immediately after synchronization, then that new pulse would
have been handled while the previous frame was being rendered. Most
likely, by the time the render thread completes presenting and comes
back to check with the FX thread, it will find that the FX thread is
already waiting for it with the next frame's data. It will synchronize
immediately and then carry on rendering another frame.
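A corresponding sketch of the render thread's side of the handshake
(again hypothetical; the real loop lives in Quantum/Prism):

    final class RenderThreadLoop implements Runnable {

        private volatile boolean running = true;

        @Override
        public void run() {
            while (running) {
                // Block until the FX thread comes around asking to sync up,
                // then copy over the next frame's data.
                waitForSyncFromFXThread();

                renderFrame();   // draw the frame into the back buffer

                // Present blocks until the frame is actually on screen, which
                // lines us up with the start of the next ~16.66ms vsync cycle.
                present();
            }
        }

        private void waitForSyncFromFXThread() { /* ... */ }
        private void renderFrame()             { /* ... */ }
        private void present()                 { /* ... */ }
    }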
Given that you propose to fire a new pulse() whenever anything is changed in
the scene graph, and also right after synchronization, there is no longer any
need for an external timer (QuantumToolkit.pulseTimer()).
Correct.
I think the way this would behave is that, when an animation is first
played, you will get two pulses close to each other. The first pulse
will do its business and then synchronize and then immediately fire
off another pulse. That next pulse will then also get processed and
then the FX thread will block until the previous frame finishes
rendering. During this time, additional events (either application
generated via runLater calls happening on background threads, or from
OS events) will get queued up. Between pulse #2 and pulse #3, then, a
bunch of other events will get processed, essentially playing
catch-up. My guess is that this won't be a problem, but you might see
a hiccup at the start of a new animation if the event queue is too
full and it can't process all that stuff in 16ms (because at this
point we're really multi-threaded between the FX and render threads
and have nearly 16ms for each thread to do its business, instead of
only 8ms, which is what you'd have in a single-threaded system).
Another question I have is around resize events and how those work.
If they also come in to Glass on the FX thread (but at a higher
priority than user events like a pulse or other input events?), then
what will happen is that we will get a resize event and process half
a pulse (or maybe a whole pulse? animations+css+layout, or just
css+layout?) and then render, pretty much just as fast as we can.
As for multiple scenes, I'm actually curious how this happens today.
If I have 2 scenes, and we have just a single render thread servicing
both, then when I go to present, it blocks? Or is there a
non-blocking present method that we use instead? Because if we block,
then having 2 scenes would cut you down to 30fps maximum, wouldn't it?
This is a very interesting question... Experiments show that we can have more
than one window/scene running at 60 fps. Someone from the graphics team should
comment on this. My only guess (at least, in the case of the D3D pipeline)
is that present() doesn't block if it's called no more than once between
vsyncs (but the frame is shown on the screen on vsync anyway).
And certainly with full-speed enabled we aren't blocking. The way I guess this
would have to work is that if we have 10 scenes to render, we will end up
rendering them all and then only block on the last render. My goal is to use
the video card as the timer, in essence, such that:
a) We have a full 16.666ms for the render thread to do its business each time
and
b) We never have "timer drift" between some external timer and the vsync timer
There is potentially another benefit, which is that we never need to enable
hi-res timers on Windows or whatnot, and never put the machine into a
non-power-save state.
Richard