Hi Steve,

It could be a benchmark problem, although I wouldn’t be surprised at all if the 
benchmark was exercising the platform in some way that was CPU limited. 
Assuming it is CPU limited (and not going multi-core), I think the problem 
really comes down to what Markus said:

> The limiting factor is the single-thread architecture of rather all parts of 
> JavaFX. The only real difference you see between machines is not correlating 
> with neither number of CPU cores nor GPU cores, but only with CPU frequency, 
> roughly spoken. Short term fixes will only provide little improvement, by 
> optimizing the critical execution path (i. e. produce hot spot histogram 
> using a profiler), for example improvement clipping, caching, etc. Huge 
> performance optimizations need an architectural change within JavaFX's 
> "scenegraph-to-bitmapframe" (a.k.a. rendering) pipeline to use parallel 
> execution in lots of places. Typical design patterns would be parallel 
> iterations, work-stealing executors, fibers (a.k.a cooperative 
> multi-threading, a.k.a CompletableFuture), and last but not least partitioned 
> rendering (a.k.a tiled rendering).
> 
> I am pretty sure you can add a lot more ideas to the list and produce great 
> performance, scaling linearly with number of CPU cores / GPU cores, but this 
> comes at a cost: Risk to introduce hard to track bugs, and needed manpower.
> 
> If somebody has at least a lot of free spare time, I am pretty sure Kevin 
> could easily provide a huge set of work items in this area. :-)


JavaFX was set up to be multi-threaded; in fact, there are always at least 2 
threads: the application / scene graph thread and the render thread. Going 
way, way back the goal was for multi-core computation/rasterizing on the NG 
side (Prism), but it didn’t get done for a variety of reasons. I couldn’t even 
say what kind of performance win/loss it would bring. I’m sure for some 
workloads it would be way better, but for many others it probably wouldn’t make 
any difference. A lot of other more pressing features had to be implemented 
first which would allow people to build apps at all on top of FX (like controls 
and effects and animations and so forth), and Prism has served us really well.

There are a few places we could play with fork/join to see if we can get 
performance boosts, all of which would be tricky and have to be done very 
carefully because they are part of highly tuned code paths:
  - Computing and applying CSS styles
  - Computing bounds
  - Computing layout
  - Syncing state between scene graph and the render graph
  - Updating state on the render graph (not sure much time is spent here…)
  - Rasterizing (fonts, anti-aliased glyphs, etc.)
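
To make the fork/join idea concrete, here is a minimal sketch of computing 
bounds over a tree in parallel. It is only an illustration under stated 
assumptions: Box, ToyNode, BoundsTask and ParallelBoundsSketch are hypothetical 
stand-ins, not JavaFX classes, and the real bounds code has dirty flags, 
transforms and caching that this ignores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical axis-aligned box; NOT javafx.geometry.Bounds.
final class Box {
    final double minX, minY, maxX, maxY;

    Box(double minX, double minY, double maxX, double maxY) {
        this.minX = minX; this.minY = minY; this.maxX = maxX; this.maxY = maxY;
    }

    // Smallest box that contains both this box and the other one.
    Box union(Box o) {
        return new Box(Math.min(minX, o.minX), Math.min(minY, o.minY),
                       Math.max(maxX, o.maxX), Math.max(maxY, o.maxY));
    }
}

// Hypothetical scene graph node: its own local box plus children.
final class ToyNode {
    final Box local;
    final List<ToyNode> children = new ArrayList<>();

    ToyNode(Box local) { this.local = local; }

    ToyNode add(ToyNode child) { children.add(child); return this; }
}

// Fork one subtask per child, then fold the child results into the node's box.
final class BoundsTask extends RecursiveTask<Box> {
    private final ToyNode node;

    BoundsTask(ToyNode node) { this.node = node; }

    @Override
    protected Box compute() {
        List<BoundsTask> subtasks = new ArrayList<>();
        for (ToyNode child : node.children) {
            BoundsTask task = new BoundsTask(child);
            task.fork();               // schedule the subtree on the pool
            subtasks.add(task);
        }
        Box result = node.local;
        for (BoundsTask task : subtasks) {
            result = result.union(task.join());  // wait for and merge each subtree
        }
        return result;
    }
}

final class ParallelBoundsSketch {
    static Box computeBounds(ToyNode root) {
        return ForkJoinPool.commonPool().invoke(new BoundsTask(root));
    }

    public static void main(String[] args) {
        ToyNode root = new ToyNode(new Box(0, 0, 10, 10))
                .add(new ToyNode(new Box(5, 5, 20, 8)))
                .add(new ToyNode(new Box(-3, 2, 4, 15))
                        .add(new ToyNode(new Box(0, 0, 1, 30))));
        Box b = computeBounds(root);
        System.out.println(b.minX + " " + b.minY + " " + b.maxX + " " + b.maxY);
    }
}
```

A sketch like this only works because the tree is read-only while the tasks 
run; the tricky part in the real code paths is that they mutate shared state, 
and for small scenes the task overhead could easily outweigh any gain.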

Some of these might be easier to try out than others. For example, using 
multiple threads at sync time should be safe, but probably won’t yield much of 
a gain (the sync takes a couple of milliseconds at most anyway). The more 
difficult one is probably making the rasterizing steps multi-threaded, but that 
would probably bring the biggest win. Computing bounds and layout and CSS in 
parallel would be very tricky, but if it could be done, would likely result in 
a nice speed boost for very large scenes (it would have no impact on rendering 
performance).
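
As a rough illustration of what multi-threaded rasterizing could look like, 
here is a sketch of the tiled (partitioned) approach Markus mentioned. This is 
not how Prism works; TiledRasterSketch, shade and the 64-pixel tile size are 
assumptions made up for the example. Each tile writes to a disjoint region of 
one shared pixel array, so the tiles need no locking.

```java
import java.util.stream.IntStream;

final class TiledRasterSketch {
    static final int TILE = 64;  // assumed tile edge length in pixels

    // Hypothetical per-pixel "shader": a filled circle on a white background.
    static int shade(int x, int y, int w, int h) {
        double dx = x - w / 2.0;
        double dy = y - h / 2.0;
        double r = Math.min(w, h) / 3.0;
        return (dx * dx + dy * dy <= r * r) ? 0xFF336699 : 0xFFFFFFFF;
    }

    // Fill one tile; tiles at the right and bottom edges are clipped.
    static void renderTile(int[] pixels, int w, int h, int tileX, int tileY) {
        int x0 = tileX * TILE;
        int y0 = tileY * TILE;
        int x1 = Math.min(x0 + TILE, w);
        int y1 = Math.min(y0 + TILE, h);
        for (int y = y0; y < y1; y++) {
            for (int x = x0; x < x1; x++) {
                pixels[y * w + x] = shade(x, y, w, h);
            }
        }
    }

    // Render the whole image, one tile per stream element.
    static int[] render(int w, int h, boolean parallel) {
        int[] pixels = new int[w * h];
        int tilesX = (w + TILE - 1) / TILE;
        int tilesY = (h + TILE - 1) / TILE;
        IntStream tiles = IntStream.range(0, tilesX * tilesY);
        if (parallel) {
            tiles = tiles.parallel();  // runs on the common fork/join pool
        }
        tiles.forEach(i -> renderTile(pixels, w, h, i % tilesX, i / tilesX));
        return pixels;
    }

    public static void main(String[] args) {
        int[] image = render(200, 130, true);
        System.out.println("rendered " + image.length + " pixels");
    }
}
```

Real rasterizing is harder than this because shapes span tiles and things like 
glyph caches are shared, which is where the careful work would go.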

Cheers!
Richard

