Hi all,

I've been working on integrating McVerSi [1] (a framework for
simulation-based memory consistency verification) into the latest Gem5
version; in doing so I have also discovered new bugs (see end of email).

At [1] I provide a patch and instructions on how to use McVerSi with the
current Gem5 version. There is a chance that (due to differing
checkpoints, timing, etc.) other bugs may be discovered than the
mentioned ones here.

Ultimately, it would be good to have this in upstream, but currently
doesn't appear clear to me what the best approach for this is. It was
raised at HPCA (during discussion with Brad and David), that it might
be nice to have for regression testing. Comments would be appreciated.

The patch at [1] needs some work to be suitable to be accepted into
upstream; does it still make sense to upload to RB to discuss? I am also
unsure if Gem5 currently provides a clean way to implement the hooks
required, and would rather not need to change other subsystems.

(A small portion of the patch at [1] is already on the RB:
http://reviews.gem5.org/r/3398/)

Furthermore, it requires a substantial external component
(https://github.com/melver/mc2lib) to live in ext/; should this be put
onto RB at the same time?

Many thanks,

--
Marco Elver
Research Associate | University of Edinburgh | ac.marcoelver.com

[1] http://ac.marcoelver.com/research/mcversi


The following are new bugs found; neither look particularly easy to fix:

1. In the classic memory system, same address read->read reordering (SC
PER LOCATION violation) can occur with O3CPU (found while using ARM ISA,
but probably affects all other ISAs requiring at least same address
read->read ordering):

Consider the case of a replacement (Cache::allocateBlock), which chooses
a block corresponding to a prior request that has already been
satisfied, i.e. no longer in MSHR. However, the request is still not
committed in the LQ on CPU side, and if we don't send an invalidation to
the CPU side on a replacement we may miss any potential external
invalidations after. This bug is found after running for about 30 min.

2. In O3CPU LSQ, write->read reordering (SC PER LOCATION violation) can
occur in the same thread when using virtual address synonyms (found with
X86+Ruby, but will affect all other configurations that use the O3CPU):

When forwarding in-flight writes from the SQ, only virtual addresses are
checked; if two virtual addresses map to the same physical address, no
forwarding takes place. I realize this might be a rare use-case for the
kinds of workloads being run on Gem5 right now? Found with first
generated test, i.e. instantly.

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

_______________________________________________
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev

Reply via email to