Hi all (specifically Gabe),

I was trying to run some tests that use the RDTSCP instruction and I found
that the rdtsc micro op's current implementation isn't quite serializing
enough. From the Intel manual:

The RDTSCP instruction is not a serializing instruction, but it does wait
until all previous instructions have executed and all previous loads are
globally visible. But it does not wait for previous stores to be globally
visible, and subsequent instructions may begin execution before the read
operation is performed. The following items may guide software seeking to
order executions of RDTSCP:
• If software requires RDTSCP to be executed only after all previous stores
are globally visible, it can execute MFENCE immediately before RDTSCP.
• If software requires RDTSCP to be executed prior to execution of any
subsequent instruction (including any memory accesses), it can execute
LFENCE immediately after RDTSCP.

This sounds like the microop should be "serializing before" in gem5's
parlance. I believe that the two instructions RDTSC and RDTSCP have the
same semantics, but that is not clearly stated in the instruction manual. I
don't see any reason not to implement them the same in gem5. Correct me if
I'm wrong.

In testing, I found that making the macro-op serializing doesn't work
because it only serializes the final instruction and the TSC has already
been read. For instance if you have the following code sequence:

rdtscp
load miss
rdtscp

The difference in the two counters is ~load miss time on real hardware. In
gem5, the difference is ~4 cycles.
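
A minimal sketch of such a measurement, assuming GCC or Clang and the `__rdtscp`, `_mm_clflush` and `_mm_mfence` intrinsics from `<x86intrin.h>`. The helper names (`read_counter`, `evict`, `time_one_load`) are made up for illustration, and the non-x86 branch exists only so the sketch still builds elsewhere; it does not exercise RDTSCP.

```cpp
#include <atomic>
#include <cstdint>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>  // __rdtscp, _mm_clflush, _mm_mfence (GCC/Clang)

static inline uint64_t read_counter() {
    unsigned int aux;  // receives IA32_TSC_AUX; not used in this sketch
    return __rdtscp(&aux);
}

static inline void evict(const void* p) {
    _mm_clflush(p);  // flush the cache line holding *p
    _mm_mfence();    // order the flush before the timed region
}
#else
#include <chrono>

// Fallback for other architectures: a plain clock read, no RDTSCP involved.
static inline uint64_t read_counter() {
    return static_cast<uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
}

static inline void evict(const void*) {}
#endif

// Counter ticks elapsed across one load that is likely to miss in the cache.
uint64_t time_one_load(const volatile int* p) {
    evict(const_cast<const int*>(p));

    uint64_t t0 = read_counter();
    std::atomic_signal_fence(std::memory_order_seq_cst);  // compiler barrier only
    int value = *p;                                       // the "load miss"
    std::atomic_signal_fence(std::memory_order_seq_cst);
    uint64_t t1 = read_counter();

    (void)value;
    return t1 - t0;
}
```

Calling `time_one_load(&x)` on some `int x` returns the tick difference between the two reads. On real x86 hardware this should be on the order of a memory access, though the exact value depends on the machine.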

I've found that this can be fixed by adding the following code to the RDTSC
micro-op implementation generated by the ISA description in
decode-ns.cc.inc.

flags[IsSerializeBefore] = true;

After this change, gem5 reports numbers closer to real hardware.
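
To illustrate why a serialize-before flag on the TSC read changes the measured difference, here is a toy timing model. It is not gem5's CPU model: the `Op` struct, the one-issue-per-cycle rule and the latencies are all assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// One instruction in the toy model (hypothetical, not a gem5 type).
struct Op {
    bool reads_tsc;    // true for a TSC read, false for anything else
    uint64_t latency;  // cycles from issue to completion
};

// Returns the cycle at which each TSC read samples the counter.
// Ops issue one per cycle without waiting for older ops, unless the op is a
// TSC read and serialize_before is set; then it waits for all older ops.
std::vector<uint64_t> simulate(const std::vector<Op>& ops, bool serialize_before) {
    std::vector<uint64_t> tsc_values;
    uint64_t next_issue = 0;  // earliest cycle the next op may issue
    uint64_t all_done = 0;    // cycle by which every older op has completed

    for (const Op& op : ops) {
        uint64_t issue = next_issue;
        if (op.reads_tsc && serialize_before) {
            issue = std::max(issue, all_done);
        }
        if (op.reads_tsc) {
            tsc_values.push_back(issue);
        }
        all_done = std::max(all_done, issue + op.latency);
        next_issue = issue + 1;
    }
    return tsc_values;
}

// Difference between the last and first TSC values in the sequence.
uint64_t tsc_delta(const std::vector<Op>& ops, bool serialize_before) {
    std::vector<uint64_t> v = simulate(ops, serialize_before);
    return v.back() - v.front();
}
```

For the sequence TSC read, 200-cycle load, TSC read (that is, `{{true, 1}, {false, 200}, {true, 1}}`), this model gives a delta of 2 cycles without serialization and 201 cycles with it, which mirrors the gap between the gem5 and real-hardware numbers above.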

I can't figure out the "right" way to get this code generated, though! I
assume I need to somehow change the rdtsc micro-op definition in regop.isa.
Any help would be greatly appreciated!

Other quick question: RDTSCP is supposed to return the CPU number along
with the TSC value. Any hints as to how to get this from the ISA language?
Would the best way be to add a new micro-op for this?
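
For context on what that value looks like: on real hardware RDTSCP loads the IA32_TSC_AUX MSR into ECX, and it is up to the OS to put a CPU number there. The sketch below decodes that value assuming the Linux convention of `(node << 12) | cpu`. The struct and function names are made up for illustration, and other operating systems may use a different encoding.

```cpp
#include <cstdint>

// Decoded view of the value RDTSCP returns in ECX (hypothetical type).
struct TscAux {
    uint32_t cpu;   // logical CPU number
    uint32_t node;  // NUMA node
};

// Assumes the Linux encoding: IA32_TSC_AUX = (node << 12) | cpu.
TscAux decode_tsc_aux_linux(uint32_t aux) {
    return TscAux{aux & 0xFFFu, aux >> 12};
}
```

With GCC or Clang, the `aux` argument would come from `__rdtscp(&aux)`. In a simulator, the corresponding step would be to return whatever the guest last wrote to IA32_TSC_AUX.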

Thanks,
Jason
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev