On Thursday 18 November 2010, edA-qa mort-ora-y wrote:
> On 11/18/2010 06:34 PM, Josef Weidendorfer wrote:
> > We could extend the simulator to assume separate L1 caches per thread,
> > assuming a fixed pinning of each thread to its own core...
>
> By default Linux appears to pride itself on moving a thread in circle
> through all the cores on a machine! :)

Wow. I was not aware of this. I thought Linux did quite well here
(especially compared to Windows).

> We do explicit pinning of threads to cores. If valgrind did support it,
> would it be possible to intercept the function to set affinity and pin
> the threads?

With separate L1 caches and an L2 shared by all threads, everything is
symmetric, so there is no need for affinity settings. I agree that this
gets interesting once NUMA nodes are simulated.

> I have no idea what cache hit/miss I have. Is there a way to see it in
> KCachegrind? I have a feeling there is something wrong in the GUI: the
> event pane just has a single type "Instruction Fetch". You talk about
> other event types but I can't see any such thing...?

The default is to count only the number of instructions executed; that is
the event you see. You should switch on cache simulation with
"--simulate-cache=yes".

> > This gives quite a rough estimation, but usually it highlights the
> > bottlenecks.
> > If you do not have cache issues (i.e. a large number of misses) anyway,
> > the estimation of course will be quite wrong.
>
> So once I optimize enough and think that my major issue is a context
> switching issue, can I use valgrind to find those? That is, can I
> somehow get callgrind to flag the points where a kernel call is made to
> warn me of a possible context switch, suspend, or whatnot?

Valgrind's current profiling tools have a given use case: seeing how your
code performs with regard to cache behavior. If that is not an issue, you
should probably use other tools, such as strace, perf (for system-wide
sampling), or the newly announced "trace" tool.

Sure, it is possible to extend the Valgrind tools to cover further use
cases, but sometimes you have to accept that another measurement strategy
is just better. E.g. Valgrind simply cannot see what is going on in the OS.

Josef
_______________________________________________
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users
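For readers who want to try the "fixed pinning of each thread to its own core" mentioned in this thread, here is a minimal sketch, assuming Linux and Python's os.sched_setaffinity. The helper names pin_current_thread and run_pinned_workers are hypothetical and are not part of Valgrind or of either poster's code. The sketch does not involve Valgrind and intercepts nothing; it only shows each worker thread pinning itself from inside the program.

```python
import os
import threading


def pin_current_thread(cpu):
    """Pin the calling thread to one CPU and return its new affinity set.

    Assumption: Linux, where sched_setaffinity(2) with pid 0 applies to
    the calling thread only.
    """
    os.sched_setaffinity(0, {cpu})
    return os.sched_getaffinity(0)


def run_pinned_workers(n_workers):
    """Start n_workers threads, each pinned to one of the allowed CPUs.

    Returns a dict mapping worker index to (requested cpu, resulting
    affinity set). CPUs are assigned round-robin, so workers share a CPU
    when there are more workers than allowed CPUs.
    """
    cpus = sorted(os.sched_getaffinity(0))  # CPUs this process may use
    seen = {}

    def worker(index):
        cpu = cpus[index % len(cpus)]
        seen[index] = (cpu, pin_current_thread(cpu))

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


if __name__ == "__main__":
    for index, (cpu, affinity) in sorted(run_pinned_workers(4).items()):
        print(f"worker {index}: requested CPU {cpu}, affinity {sorted(affinity)}")
```

The main thread's affinity is left untouched, so only the workers are restricted. Whether pinning helps at all depends on the workload; as noted in the reply above, with separate L1 caches and a shared L2 the simulated setup is symmetric.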