As promised in the talk, I moved the qmlbench tool to a separate repo and gave it a readme: https://github.com/sletta/qmlbench
Still under my GitHub account, but now it is at least documented so others can take part in interpreting the results.

> On 22 Jun 2015, at 14:47, Robin Burchell <[email protected]> wrote:
>
> It seems that nobody took proper notes of the session (oops!) so I'm
> sending the notes Gunnar and I wrote up in preparation for the session.
> If anyone else has any recollections they'd like to add, please go ahead
> and do so :)
>
> =====
>
> # Performance!
>
> * Why it's important
> * What needs to be checked?
>
> # Benchmarking
>
> * Need support in qmlbench for measuring memory usage.
>   * For each set of creation benchmarks, we should run them a second
>     time collecting memory information.
>   * Once we have that, graphing it across a single test run makes
>     memory leaks easy to spot
>   * Graphed over time, regressions between commits become easy to spot
>   * Likewise for the number of items per frame
>
> # Start up performance
>
> We don't have good benchmarks for this, and creating them isn't trivial
> to measure. Need a few generated examples, maybe.
>
> # Memory usage
>
> * Our memory usage is pretty bad in a lot of areas.
> * Sharing data across processes (fork-booster)..
>   * Similar but better sharing achievable through qmlcompiler
>   * Benefits could be achieved by carefully allocating all write-once
>     expected-to-be-shared data contiguously
> * Dropping CPU-side image data. We recently added QSG_TRANSIENT_IMAGES
>   to Qt Quick.
>
> # Item creation performance
>
> * We have pretty good benchmarks for this :)
> * But we (constantly!) keep regressing
>   * This week's example: two separate regressions in QQuickImageBase
>     (high DPI, and automatic transform)
>     * Image items per frame dropped from ~550/frame to 492/frame -- ~10%
>       regression
>     * Allocations for 5000 images increased by 41 MB
>   * This is just one case - it's nobody's "fault", there's just nobody
>     taking care of it.
> * Ideally need to do some work on automating runs of it (more on this
>   later)
> * What is a good target?
>   * 1000 items / frame in qmlbench on a modern desktop / laptop (mbp
>     is one such)
>   * 100 items / frame on mobile and embedded
>
> # Binding performance
>
> I have no idea whether or not we have good benchmarks for this.
>
> * Probably the creation ones cover the simple cases, but the more
>   advanced ones probably need separate coverage.
> * Ideally we also need to monitor the impact of things changing in
>   bindings (& multiple things changing at once, and so on?)
> * Help creating benchmarks welcome!
>
> # Graphics Performance:
>
> Are we good here? I think we're close at least..
> * Clipping has been brought up as an issue, 'simplerenderer' solves that
> * Poor batching gives worse results, 'simplerenderer' solves that also,
>   but be mindful that simplerenderer also has ~2-3x worse performance
>   overall.
>
> # Recommendations for working on QtQuick
>
> * Avoid structures like QHash unless you are sure they are needed (they
>   have a heavy cost, and for <1k items or so, QVector, or std::vector, are
>   often a better choice)
> * Avoid signal connections
>   * Virtuals instead might be a good choice
>   * If you really have to use them, use qmlobject_connect
>   * See prior art:
>     * qtdeclarative: 0de680c8e8fab36e386dca35e5008ffaa27e8ef6
>     * qtdeclarative: 7568922fa240e6e9440e9c6e93bf8ec00c06ec17
> * Memory compactness:
>   * Don't introduce padding holes
>   * Don't increase the size of frequently allocated things (nodes,
>     items) "accidentally" without careful consideration
>   * Use a lazily-allocated ExtraData for things that aren't needed
>     often
>   * Consider your data types carefully - don't use a 64 bit int for an
>     "on/off" toggle
> * Consider custom data structures & allocation (page allocation of the
>   shadow nodes was a big win)
>
> # Specific Items?
>
> * Rectangle implementation could be improved quite a bit (Gunnar?)
> * Text node improvements (Eskil?)
> * Hash of shadow nodes in the batched renderer is a large problem
>   * https://codereview.qt-project.org/#/c/97708/
> * Delaying compilation (or whatever it is) on inline components until
>   they are used. Right now, these can have massive impacts on
>   performance/memory unless moved to external files. e.g. "Component {
>   Dialog { ... foo ... } }", only used when a button is pressed. It may
>   never be pressed.
> * "Don't use" classes, like SpriteSequence :)
> * In general, small items take up huge amounts of memory.. Repeater {
>   model: 500; Rectangle { width: 100; height: 100; radius: 10 } } and
>   you have 1 MB or something :)
>   * QObject -> QQmlData / QQuickItem / QQuickItemPrivate etc all adds up
>     on individual items - but what can we do to fix that?
> * The recent introduction of 'padding' in the box model might need a
>   second look to make sure it isn't increasing item sizes in common cases.
>   4 extra doubles is quite a large addition to item sizes.
> _______________________________________________
> Development mailing list
> [email protected]
> http://lists.qt-project.org/mailman/listinfo/development
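
PS: for anyone who wants the "memory compactness" points from the quoted notes in concrete form, here is a rough sketch in plain C++. The struct names (PaddedItem, CompactItem, LazyItem) and their members are made up for illustration and are not the actual Qt Quick classes; the four padding doubles only echo the size concern raised in the notes. It shows member reordering to avoid padding holes, a bool instead of a 64 bit int for a toggle, and an ExtraData block that is only allocated when needed.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical layouts for illustration only; these are not Qt Quick classes.

// Members ordered carelessly: on typical platforms each bool is followed by
// padding so the next member is aligned, and a 64-bit int stores a flag.
struct PaddedItem {
    bool visible;
    double width;
    bool enabled;
    double height;
    std::int64_t clip; // only ever 0 or 1
};

// Same information: largest members first, flags stored as bools.
struct CompactItem {
    double width;
    double height;
    bool visible;
    bool enabled;
    bool clip;
};

static_assert(sizeof(CompactItem) < sizeof(PaddedItem),
              "reordering members should shrink the struct");

// Rarely used state lives behind a pointer and is allocated on first use.
struct LazyItem {
    struct ExtraData {
        double padding[4] = {0.0, 0.0, 0.0, 0.0}; // left, top, right, bottom
    };

    double width = 0.0;
    double height = 0.0;
    std::unique_ptr<ExtraData> extra; // stays null until padding is set

    void setPadding(int side, double value) {
        assert(side >= 0 && side < 4);
        if (!extra)
            extra = std::make_unique<ExtraData>();
        extra->padding[side] = value;
    }

    double paddingAt(int side) const {
        assert(side >= 0 && side < 4);
        return extra ? extra->padding[side] : 0.0;
    }
};
```

Items that never set padding pay for one pointer instead of four doubles. The trade-off is an extra allocation and indirection for the items that do use it.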

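
Similarly, the "avoid QHash for small sets" recommendation in the quoted notes could look something like the sketch below: a sorted std::vector with binary search standing in for a hash. SmallMap is a hypothetical helper written for this mail, not something that exists in Qt, and whether it actually beats QHash for a given use should be measured rather than assumed.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper for illustration; not part of Qt.
// Keeps entries sorted by key in one contiguous allocation.
class SmallMap {
public:
    // Inserts a new entry or overwrites the value of an existing key.
    void insert(int key, std::string value) {
        auto it = std::lower_bound(m_entries.begin(), m_entries.end(),
                                   key, lessThanKey);
        if (it != m_entries.end() && it->first == key)
            it->second = std::move(value);
        else
            m_entries.emplace(it, key, std::move(value));
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const std::string *find(int key) const {
        auto it = std::lower_bound(m_entries.begin(), m_entries.end(),
                                   key, lessThanKey);
        return (it != m_entries.end() && it->first == key) ? &it->second
                                                           : nullptr;
    }

    std::size_t size() const { return m_entries.size(); }

private:
    using Entry = std::pair<int, std::string>;

    static bool lessThanKey(const Entry &entry, int key) {
        return entry.first < key;
    }

    std::vector<Entry> m_entries;
};
```

Lookups are O(log n) over contiguous memory and there is no per-node allocation. Inserts are O(n) because later entries shift, which is usually acceptable at the sizes the notes talk about.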