On Fri, 2011-04-08 at 10:29 -0700, Richard Purdie wrote:
> To follow up on this, I added instrumentation to runqueue.py
> fork_off_task(). There are two things of note. On my laptop which is a
> much slower system than the one I showed in the benchmark (a quad core),
> the task overhead isn't 1+ seconds as shown in the graph but 0.4 seconds
> per task. I think the reason is there is a compile job running in
> parallel on the other graph which is starving new tasks of CPU.
>
> The second thing of note is 95% of the 0.4 seconds are in the
> loadDataFull() call as suspected. I did notice that the logging changes
> around that point in the code can distort timings but the time is really
> being spent there.
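For anyone wanting to reproduce this kind of measurement, a minimal
sketch of the timing approach might look like the following. It is not
the instrumentation actually added to runqueue.py: timed(), timings,
load_data_full_stub() and fork_off_task_stub() are hypothetical names
standing in for the real code, and it assumes a current Python 3.

```python
import time
from contextlib import contextmanager

# Hypothetical store of elapsed times, keyed by label.
timings = {}

@contextmanager
def timed(label):
    """Record the wall-clock time spent inside the with-block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings.setdefault(label, []).append(time.perf_counter() - start)

def load_data_full_stub():
    # Stand-in for the real loadDataFull() call (assumption: it dominates).
    time.sleep(0.01)

def fork_off_task_stub():
    # Stand-in for fork_off_task(): time the whole call and the inner load.
    with timed("fork_off_task"):
        with timed("loadDataFull"):
            load_data_full_stub()

fork_off_task_stub()
share = sum(timings["loadDataFull"]) / sum(timings["fork_off_task"])
print(f"loadDataFull share of task overhead: {share:.0%}")
```

Nesting the two timers gives the fraction of per-task overhead spent in
the inner call, which is the sort of figure quoted above.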

I've dug into this a bit. When running a given task, it is parsing all
bbclassextend variants again. I have a simple patch that fixes this,
which I'll clean up and share.

What I also noticed, which was more odd, was that the first finalise()
call was taking 0.2s while subsequent ones were taking 0.1s. It turns
out that the parsercache, which is used by the siggen code, isn't
functioning the way it should and is seeing a lot of cache misses. I
think this is related to the parallel parsing, with cached data only
being saved out from the core and not from the individual recipes (i.e.
the cache from the parser subthreads). I suspect we could get some
performance improvement by fixing this.

Cheers,

Richard
