Hello Ben,

Thanks for your suggestions. The decision to adapt GHCi came out of a
discussion with my supervisor and his colleagues. At this point the
full set of capabilities we want from this work is still unknown;
however, we do consider GHCi-compatible programs to be a large enough
set for future analysis, and the ease of mapping breakpoints back to
source code is a significant benefit.

I do plan on using the ghc-heap-view (I assume that's what you meant by
ghc-heap, or is there another library I don't know about?) logic in the
project, although I'm currently more focused on implementing the proper
hook mechanism. I expect that events in which deeply nested thunks are
forced will be quite important. The possibility of tracking control
flow via breakpoints/tracepoints also seems appealing. I'm not aware
of any existing solutions which would allow dynamic tracing, although
it's quite possible I didn't look hard enough.
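
As a rough sketch of the kind of check I have in mind (not the hook
mechanism itself), the ghc-heap package that ships with GHC exposes
getClosureData in GHC.Exts.Heap, which can tell whether a value is still
an unevaluated closure. The helper name isThunk is made up for
illustration, I'm assuming the ghc-heap package is visible to the
compiler (e.g. via -package ghc-heap), and what it reports can differ
between GHCi and optimised builds:

```haskell
import GHC.Exts.Heap (GenClosure (..), getClosureData)

-- | Hypothetical helper: True if the value is currently an unevaluated
-- closure (a thunk, an AP thunk or a selector thunk). Looking at the
-- closure does not force the value.
isThunk :: a -> IO Bool
isThunk x = do
  closure <- getClosureData x
  pure $ case closure of
    ThunkClosure {}    -> True
    APClosure {}       -> True
    SelectorClosure {} -> True
    _                  -> False

main :: IO ()
main = do
  let xs = map (+ 1) [1 :: Int, 2, 3]
  before <- isThunk xs
  -- Force the spine of the list, then inspect the same binding again.
  after <- length xs `seq` isThunk xs
  putStrLn ("thunk before forcing: " ++ show before)
  putStrLn ("thunk after forcing:  " ++ show after)
```

Measuring how deep a chain of thunks goes would mean walking the
pointers of each closure recursively, which is the part I'd hope to
reuse from ghc-heap-view.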

Regarding ghc-debug, I'm not sure what trade-offs it offers compared to
the approach I'm currently taking. It looks like a fairly new project;
do you think it's mature enough for the proposed use cases? I couldn't
find docs online, although I did come across [0], which Discourse [1]
says is related. I've yet to watch the introduction video. Support for
unboxed tuples and other features not supported by GHCi would of course
be nice, although performance is not a concern. Keeping the relationship
between source code spans and heap objects in the info tables is an
intriguing idea.

> Note that Luite's recent work on refactoring the bytecode generator to
> produce code from STG is quite relevant here. In particular, you will
> likely want to look at !4589 [1], which does the work of refactoring
> Tickish to follow the Trees That Grow pattern. You would likely want to
> do the same to capture your free variable information.

Excellent, I was not aware of this. Thank you!


Regards,
Andrew



[0]: https://well-typed.com/blog/2021/01/first-look-at-hi-profiling-mode/
[1]: https://discourse.haskell.org/t/an-introduction-to-ghc-debug-precise-memory-analysis-for-haskell-programs/1771


On 26/01/2021 16:28, Ben Gamari wrote:
Andrew Kvapil <vil...@seznam.cz> writes:

Hello,

I'm interested in inspecting the strictness of functions at runtime
and the depth of thunks "in the wild."

Hi Andrew,

Interesting. When I first read your introduction my first thought was to
walk the "normal" heap using ghc-heap or, perhaps, the relatively
new ghc-debug library [1] rather than introduce this feature in GHCi.
It's hard to know whether a non-GHCi-based approach is viable without
knowing more about what you are doing, but it potentially brings the
benefit of generality: your analysis is not limited to programs which
can be run under GHCi.

Of course, it also brings some challenges: you would need to find a way
to associate info table symbols with whatever information you need
from the Core program, and in the presence of simplification, tying your
results back to the source program may not be possible at all.

Cheers,

- Ben


[1] https://gitlab.haskell.org/ghc/ghc-debug

For this reason I'm modifying
GHC 8.10.2, essentially to add additional information to breakpoints.
I'd like to reuse the logic behind GHCi's :print command
(pprintClosureCommand, obtainTermFromId, ...) for which I suppose I
need Id's. Those however don't exist for destructuring patterns, such
as those in the following equations:

      last [x] = x
      last (_:xs) = last xs

So I'm wondering what would be a good place in the pipeline to
transform patterns like these into at-patterns, to give them Id's.
However, the breakpoint logic only looks at the free variables of the
right-hand sides and not transitively, which means that e.g. in the
following example neither ':print arg1' nor ':print as' works when the
interpreter hits a breakpoint in the top level expression on the RHS:

      qsort arg1@(a:as) = qsort left ++ [a] ++ qsort right
        where (left, right) = (filter (<=a) as, filter (>a) as)

Thus I'd also like to know how to extend the free var logic for
Tickish that eventually leads to CgBreakInfo and :print's ability to
inspect these bindings at runtime. My goal would be to determine to what
extent a thunk was evaluated during function application.
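
For illustration, this is the kind of source-level rewrite I mean by
turning the patterns of `last` into at-patterns. The names lastAt and
arg are placeholders of my own, and the empty-list equation is added
only so the sketch is total; a real implementation would presumably
generate fresh Id's inside the compiler rather than rewrite the source:

```haskell
-- Same equations as `last` above, but each argument pattern is wrapped
-- in an at-pattern so that the whole argument gets a name of its own.
lastAt :: [a] -> a
lastAt arg@[x]      = x
lastAt arg@(_ : xs) = lastAt xs
lastAt []           = error "lastAt: empty list"

main :: IO ()
main = print (lastAt [1, 2, 3 :: Int])  -- prints 3
```

With the argument named, a breakpoint on the right-hand side would
still need `arg` to be counted among the free variables before :print
could inspect it, which is the second half of the question.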

Note that Luite's recent work on refactoring the bytecode generator to
produce code from STG is quite relevant here. In particular, you will
likely want to look at !4589 [1], which does the work of refactoring
Tickish to follow the Trees That Grow pattern. You would likely want to
do the same to capture your free variable information.

Cheers,

- Ben


[1] https://gitlab.haskell.org/ghc/ghc/-/merge_requests/4589
