> One silly idea while reading this thread (not the paper):
> For single core, the idea makes sense.
> It may even extend easily to multicore VM (just forget about Smalltalk
> for a while) by counting message sent per core and taking the max.

But here you are assuming that all the CPUs have the same speed (#messages/sec), right?

Alexandre

>
> Nicolas
>
> 2011/4/29 Alexandre Bergel <[email protected]>:
>> +1
>>
>> Alexandre
>>
>>
>> On 28 Apr 2011, at 17:35, Michael Haupt wrote:
>>
>>> Hi Alex,
>>>
>>> On 29 April 2011 00:08, Alexandre Bergel <[email protected]> wrote:
>>>>> I think you're being a bit harsh on stack sampling there. It is exact
>>>>> enough to drive optimisation in some really high-performance VMs. It
>>>>> is also deterministic enough to yield data leading to very good
>>>>> performance results in those VMs. Whether focusing on counting
>>>>> messages instead of taking samples is more beneficial would have to be
>>>>> determined by experiment ...
>>>>
>>>> Yes, 25 pages of experiment :-)
>>>
>>> oh, I was not referring to your paper. I was referring to the general
>>> applicability of message counting as opposed to sampling. The latter
>>> is true-and-tried in many high performance VMs. For the former, an
>>> experiment has yielded good results (from what I take from this thread
>>> - I still haven't read your paper, sorry, it's on top of the pile) in
>>> a constrained setting. All I was saying is that it is not possible to
>>> conclude anything for the broader area from the experiments you
>>> conducted. But we're probably of the same opinion here.
>>>
>>>>> What you mean with "non-portable" I do not understand.
>>>>
>>>> The information about the execution time contained in your profile cannot
>>>> be compared with a new profile realized on a different machine, with a
>>>> different CPU.
>>>
>>> That is correct, but the approach itself is portable - may I quote
>>> you: "Most profilers, including MessageTally, count stack frames at a
>>> regular interval. This is doomed to be inexact, non-deterministic and
>>> non-portable". You weren't talking about the results. Those are
>>> obviously specific to the platform, clock frequency, L1/L2/L3 cache
>>> and memory sizes, application input (!) and other factors.
>>>
>>> Best,
>>>
>>> Michael
>>>
>>
>> --
>> _,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
>> Alexandre Bergel  http://www.bergel.eu
>> ^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.
>>
>

--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel  http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.
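
To make the question at the top of this message concrete, here is a minimal Python sketch. It is not taken from the paper or from any VM. The function names, the message counts and the per-core speeds are all made up for illustration. It compares "take the max of the per-core message counts" with a variant that first divides each count by that core's speed in messages/sec:

```python
from typing import Sequence


def cost_in_messages(counts: Sequence[int]) -> int:
    """Naive multicore cost: the largest per-core message count.

    This is only meaningful as a time estimate if every core
    executes messages at the same rate.
    """
    return max(counts)


def cost_in_seconds(counts: Sequence[int], rates: Sequence[float]) -> float:
    """Cost when each core has its own speed (messages per second).

    The slowest core to finish determines the elapsed time, and that
    is not necessarily the core that sent the most messages.
    """
    if len(counts) != len(rates):
        raise ValueError("need exactly one rate per core")
    return max(count / rate for count, rate in zip(counts, rates))


# Hypothetical numbers: core 0 is fast and busy, core 1 is slow.
counts = [1_000_000, 600_000]       # messages sent on each core
rates = [2_000_000.0, 500_000.0]    # messages/sec for each core (assumed)

print(cost_in_messages(counts))         # 1000000 (core 0 has the max count)
print(cost_in_seconds(counts, rates))   # 1.2 (core 1 is the real bottleneck)
```

With these made-up numbers, the core with the largest count would finish in 0.5 s while the slower core needs 1.2 s, so the plain max of the counts points at the wrong core. When all rates are equal, both functions pick the same core.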
