On 07/11/2011 07:36 AM, Josef Weidendorfer wrote:
> On Monday 11 July 2011, John Reiser wrote:
>> [snip]
> From my example in my last mail, you can see that a recursive
> implementation can produce the same code as a loop-based one. So you can
> never say with certainty that some machine code was programmed as a loop.

But the recursion depth (number of actual pending return addresses) is
certain.  If the depth does not change, and the target is in the same
routine, then it's a loop.  If the depth does not change and the target is
in some other routine, then it's a tail-recursive continuation.  If the
depth increases [by 1], then it's a CALL.  If the depth decreases by 1,
then it's a RETurn; if the depth decreases by more than 1, then it's an
"unwind".  (A sketch of this classification appears below.)

> Up to now, Callgrind tries to reconstruct what the programmer has coded.
> Which means that it tries to reconstruct recursive behavior even when the
> compiler did tail recursion optimization (and the number of pending
> returns does not change).
>
> I see that your main criterion is: "if the number of pending returns
> doesn't change, it is a loop".
>
> The problem is highlighted exactly with your example:
>
>> Consider the source-code call chain A ==> B ==> C where B ==> C is
>> a tail recursion which the compiler recognizes and implements. During
>> execution of C, then the actual dynamic _return_ chain is C ==> A.
>
> Typo: "A ==> C".

C will return to A, and a return happens in the future, so I write the
arrow for a _return_ chain in the direction of future motion.  A _call_
chain records the direction of past motion: "A called B" as a call chain
is "A ==> B"; the resulting _return_ chain is "B ==> A".  Perhaps I should
write "A <== B"?  (Most often a return does not go to the beginning, but
that does not matter here.  There is a style which explicitly writes
[constructs, schedules] the continuation stack; this style often _does_
"return" [continue] to the beginning of many routines.)

>> B has been erased by the tail recursion. The stack contains no record
>> of B ever having been invoked. So the effective dynamic call chain
>> [backtrace] at this point is A ==> C.
>
> This is impossible to represent in one single call graph for the execution
> of the program. C is either called from B or from A. The latter would mean
> an abstract return from B to A before A calls C.

The CPU executes only continuations ("what is the [address of the] next
instruction?") and is indifferent to the terminology "call" and "return".
Often there is hardware (such as an on-chip stack of subroutine return
addresses) to speed the execution of common idioms such as CALL with RET,
but to the instruction fetch+decode unit and to the execution unit this is
a mere optimization, not a fundamental concept.

> This of course is possible, but as you say yourself:
>
>> This may surprise the programmer
>> who never wrote any call from A to C, but this is much the same as
>> stopping at a breakpoint immediately after a taken branch.
>> You know where you are [$pc], and where you're going (the backtrace of
>> pending _return_s), but not every detail of how you got here (taken
>> branches or tail recursions are not known.)
>
> But you simply cannot compare it with debugging, as Callgrind wants
> to come up with that single call graph.
> For this call graph, I much prefer Callgrind to use "A ==> B ==> C".
>
> The whole point of Callgrind/KCachegrind is to present performance
> metrics attributed to call graphs in a top-down way.
> Your suggestion seems incompatible with that goal.

Going from "top-down" to insisting on "call and return" is problematic.
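To make the depth-based classification at the top of this reply concrete,
here is a small C sketch.  It only illustrates the rule and is not
Callgrind code; the names (classify, target_in_same_routine, the enum
constants) are invented for this example, and "depth" means the number of
pending return addresses.

    /* Hypothetical sketch: classify a control transfer by the change in
     * the number of pending return addresses.  Not Callgrind code.
     */
    enum transfer { LOOP, TAIL_CONTINUATION, CALL, RETURN, UNWIND, UNKNOWN };

    enum transfer
    classify(int depth_before, int depth_after, int target_in_same_routine)
    {
        int delta = depth_after - depth_before;

        if (delta == 0)           /* no return address pushed or popped */
            return target_in_same_routine ? LOOP : TAIL_CONTINUATION;
        if (delta == 1)           /* one new pending return address */
            return CALL;
        if (delta == -1)          /* one pending return consumed */
            return RETURN;
        if (delta < -1)           /* longjmp, exception unwinding, ... */
            return UNWIND;
        return UNKNOWN;           /* depth jumped up by more than one */
    }

In practice, target_in_same_routine would be decided by whether the
transfer target lies inside the address range of the routine that executed
the transfer; that detail is left out of the sketch.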
For one thing, a strict call-and-return view fumbles true co-routines,
which never return.  The instant that B executes a tail-recursive
continuation to C, the call path A ==> B ==> C is transformed into
A ==> C, and I expect the call graph to have such an arc, because that is
what the true state has become.  B has vanished, A and C are still there,
and [for the moment] C is going to return to A.  This is still "top-down",
except based on continuations.

>>> There is a similar scenario: machine code can have jumps between
>>> functions (e.g. as a result of tail recursion optimization, or
>>> hand-crafted assembler). Callgrind output is about call relationships.
>>> There is no way to reason about jumps between functions. Thus, Callgrind
>>> has to map a jump between functions either to a "call" (with an
>>> additional return when returning from the function jumped to) or to a
>>> "return/call" pair.
>>
>> It is better to think of this as a _continuation_, for which
>> some programmers might use the idea of "return+call" as a crutch
>> until they become familiar with continuations.  In many cases
>> it is important to realize that a principal side effect during
>> execution is the dynamic writing of the program which constitutes
>> the continuation from the current subroutine.  Of course, this
>> "program of continuations" is the stack.
>
> And how to map this to a call graph?
> Do you suggest introducing nodes which are continuations of multiple
> functions?

If $pc is in C, and the next return address is in A, then there is an arc
A ==> C which should be charged (billed) accordingly.

> I think this would be even more confusing, as regular programming
> languages do not allow one to code continuations, so programmers usually
> don't know this concept.

Tail recursion necessarily introduces the concept of continuation.
Trying to hide continuations causes problems, as you have noticed.
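To see the A ==> B ==> C scenario end to end, here is a minimal,
self-contained example.  The file name tail.c and the exact compiler
behavior are assumptions; __attribute__((noinline)) is a GCC/Clang
extension used only to keep A, B and C as separate functions.

    /* tail.c -- hypothetical illustration of the discussion above.
     * Source-code call chain: A ==> B ==> C, with B calling C in tail
     * position.  With "gcc -O2" (which enables -foptimize-sibling-calls)
     * the call in B is typically emitted as a plain jump, so while C runs
     * the pending return addresses lead back to A and to main; B has
     * vanished from the stack.  Behavior varies by compiler and target.
     */
    #include <stdio.h>

    #define NOINLINE __attribute__((noinline))   /* GCC/Clang extension */

    NOINLINE int C(int x)
    {
        return x + 1;      /* a breakpoint here typically shows A, not B,
                              as the caller when built with -O2 */
    }

    NOINLINE int B(int x)
    {
        return C(x * 2);   /* tail position: eligible for tail-call
                              optimization, B's frame is reused */
    }

    NOINLINE int A(int x)
    {
        return B(x) + 1;   /* NOT a tail call: A must add 1 after B returns,
                              so A keeps a real frame */
    }

    int main(int argc, char **argv)
    {
        (void)argv;
        /* argc keeps the compiler from folding the chain to a constant */
        printf("%d\n", A(argc + 20));
        return 0;
    }

Built with -O0, a backtrace taken inside C shows main ==> A ==> B ==> C as
the programmer wrote it; built with -O2 it typically shows main ==> A ==> C,
which is exactly the arc A ==> C argued for above.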