On Fri, 2011-04-08 at 08:41 -0700, Richard Purdie wrote: > We've done a lot of good work in optimising task execution and latency > however I was looking at some of our build data and I think we still > have work to do. The attached picture is a "bootchart" style picture > showing the task execution at the start of a build. Each vertical > division represents one second. As you can see, the fetch/patch tasks > themselves run very quickly. Those markers are from the task entry/exit. > > The worrying part is the empty space between the tasks which looks to be > about 1.5 seconds in size *per task*. This is a scary amount of latency > to be losing for task execution when you consider we have 4500 of these > for our reference build. > > More investigation is needed to figure out where this time is being > spent but I'm guessing a significant fraction is in parsing the recipe > of the task that is about to be executed. We need to confirm that but if > that is the case but one possible optimisation is we should ensure we're > not parsing every BBCLASSEXTEND variant of the recipe, just the one we > want to run. I'm sure there are other ways we might be able to improve > this too.
To follow up on this, I added instrumentation to runqueue.py fork_off_task(). There are two things of note. On my laptop which is a much slower system than the one I showed in the benchmark (a quad core), the task overhead isn't 1+ seconds as shown in the graph but 0.4 seconds per task. I think the reason is there is a compile job running in parallel on the other graph which is starving new tasks of CPU. The second thing of note is 95% of the 0.4 seconds are in the loadDataFull() call as suspected. I did notice that the logging changes around that point in the code can distort timings but the time is really being spent there. Cheers, Richard _______________________________________________ Openembedded-core mailing list [email protected] http://lists.linuxtogo.org/cgi-bin/mailman/listinfo/openembedded-core
