Hi Bolo.

This should be a larger group discussion, so I am cc'ing the group. I'd like to bring the collective expertise to bear on this one.

Everyone: I have a conflict this Wednesday morning, so I propose we move the meeting to 1pm on Wednesday.

--bart

On 9/18/2023 11:57 AM, Bolo wrote:
Hi Bart -- I've been taking a look at memgaze.

I wanted to chat briefly about my current work on this,
to see if you might have some knowledge or intuition about
what I've discovered before I start a debug hunt on their
very large software stack and dyninst.


They are correct that sometimes dyninst is inserting extra
code and register spill/recovery where they are not needed.

The register spills and recoveries are also accompanied
by a temporary stack area.

Comparing the generated code to the code-emission logic in
the dyninst source, it appears that in _some_ cases of
instrumentation dyninst is emitting a dyninst stack area
to hold the "dyninst virtual instrumentation machine"'s chunk
of registers on the stack, as well as saved processor state,
so that it could then emit dyninst instrumentation that uses
that facility... and be able to restore the machine state
post-instrumentation.

        Those operations might change CPU flags, so that spill/restore
        is generated both for the "VM" and to keep the CPU state
        constant, undisturbed by the instrumentation.

However, nothing like that is being generated -- no
dyninst instrumentation, just the instructions rendered by
memgaze to add a ptwrite.  My original analysis, that perhaps
dyninst is trying to save machine state (the flags register)
across instrumentation, does seem to explain some cases:

        an instruction that, e.g., performs a comparison and sets flags
                dyninst allocate stack space via sp
                spill rax and flags
                instrumentation (ptwrite some register)
                dyninst restore rax and flags
                dyninst reclaim stack space
        jCOND somewhere

However, in other cases it does not do that.  Is dyninst performing
dataflow analysis to try to avoid that issue... even when it
doesn't issue "dyninst virtual machine instructions"?
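
For illustration only, the dataflow question above can be pictured as a
flags-liveness check. The sketch below is a toy model over a straight-line
instruction list, not dyninst's actual analysis; the names Insn and
flags_live_before, and the reads_flags/writes_flags annotations, are
hypothetical and introduced only for this example. It assumes the flags
must be saved at an instrumentation point exactly when some later
instruction may read them before any instruction overwrites them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insn:
    """Hypothetical, hand-annotated instruction record (not a dyninst type)."""
    text: str
    reads_flags: bool = False
    writes_flags: bool = False


def flags_live_before(insns, index):
    """Return True if the flags value just before insns[index] may still be read.

    Walks forward from the instrumentation point. A reader seen first means
    the flags are live; a writer seen first means they are dead. Running off
    the end of the list is treated conservatively as live.
    """
    for insn in insns[index:]:
        if insn.reads_flags:   # checked first, so read-then-write counts as live
            return True
        if insn.writes_flags:
            return False
    return True


# Instrumenting between a compare and its conditional jump: flags are live.
cmp_then_branch = [
    Insn("cmp %rax,%rbx", writes_flags=True),
    Insn("jne somewhere", reads_flags=True),
]

# Instrumenting before code that overwrites flags without reading them: dead.
overwritten = [
    Insn("mov (%rdi),%rax"),
    Insn("add %rax,%rbx", writes_flags=True),
    Insn("ret"),
]

needs_save_at_branch = flags_live_before(cmp_then_branch, 1)
needs_save_at_mov = flags_live_before(overwritten, 0)
```

Under this model the site between the compare and the jCOND needs a
flags save, while the site in front of the flag-overwriting code does not.
A real analysis would also have to follow branches and calls, which this
sketch deliberately ignores.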

Most of the problem rewrites appear to be in the C library,
not in the numeric code they are analyzing.

There is another case (this happens quite frequently) when the Intel
string instructions are being used (in the C library, actually)... and
the same type of code is emitted around them -- even though there is
no context to be saved:

        setup rcx, rsi, rdi for a comparison
                dyninst allocate stack space via sp
                spill rax and flags
                instrumentation (add ptwrite on rdi, rsi)
                dyninst restore rax and flags
                dyninst reclaim stack space
        repz cmpsb %es:(%rdi), %ds:(%rsi)

In other cases dyninst just adds the instrumentation verbatim
and everything is great.  That happens mostly in the numeric code,
not in the C library code, which is where the unneeded
spills/restores happen.

        It's almost as if the complexity of the branching in the
        C library code is affecting dyninst's choices, or as if
        the possible (but not actually needed) processor state
        save mentioned earlier is being triggered.


There was also a side issue that the version of objdump they used
disassembled the x86_64 instructions incorrectly, making it appear that
dyninst was emitting instructions incorrectly in the instrumentation.
        
        That is NOT the case: when I disassembled by hand and then
        re-verified with xed, dyninst was generating perfect code.
        I didn't see anything in dyninst that would generate
        invalid code of that nature.  I'm checking to see if a
        newer version of objdump fixes that bug.



Initially I couldn't build their software; it corrupted my spack
and caused some other problems.  I've now managed to compile
their stack locally, so that I can get a debugger on it and see
what logic dyninst is using in those "odd" cases.

Before I delve into that, however, I would like to ask you
first whether you have some intuition or knowledge about how
dyninst chooses to add those kinds of "dyninst vm spills and
restores" to code -- when it isn't generating code there using
its instrumentation VM.

        The difference between the sites looks a bit unsystematic,
        but perhaps it is systematic and I am missing something
        about dyninst's methodology that would help with the analysis.

I've looked at the code, and I'm going to re-read some
of the dyninst papers first -- but I thought checking in with
you might be the thing to do, as you might have better
knowledge of that scenario than the papers do.

Bolo -- Josef T. Burger
_______________________________________________
Dyninst-api mailing list
[email protected]
https://lists.cs.wisc.edu/mailman/listinfo/dyninst-api
