-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2181/
-----------------------------------------------------------

Review request for Default.


Repository: gem5


Description
-------

Changeset 10101:1cb43a0d4ec6
---------------------------
o3: Fix occupancy checks for SMT

A number of calls to isEmpty() and numFreeEntries()
should be thread-specific.

In cpu.cc, the tid argument is /*commented*/ out, which is a bug. Say the rob
has instructions from thread 0 (so isEmpty() returns false) and none from
thread 1. If we then try to squash all of thread 1,
readTailInst(thread 1) is called because rob->isEmpty() returns
false. As a result, end_it is not in the list, and the while
statement loops indefinitely over the cpu's instList.
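
To illustrate the cpu.cc case, here is a minimal sketch, assuming a toy ROB
that keeps one instruction list per thread. ToyRob, mayReadTailGlobal and
mayReadTailForThread are hypothetical names made up for this example; they are
not gem5's classes and not the actual patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical toy model for illustration only; this is not gem5's ROB.
constexpr std::size_t kNumThreads = 2;

struct ToyRob {
    // One list of instruction sequence numbers per hardware thread.
    std::array<std::list<int>, kNumThreads> instList;

    // True only when no thread has any instruction in the ROB.
    bool isEmpty() const {
        for (const auto &insts : instList) {
            if (!insts.empty())
                return false;
        }
        return true;
    }

    // True when the given thread has no instruction in the ROB.
    bool isEmpty(std::size_t tid) const { return instList[tid].empty(); }
};

// Guard based on the global check: allows the tail read for any thread
// as soon as some thread has instructions in flight.
inline bool mayReadTailGlobal(const ToyRob &rob) { return !rob.isEmpty(); }

// Guard based on the per-thread check: allows the tail read only when
// the thread being squashed actually owns instructions.
inline bool mayReadTailForThread(const ToyRob &rob, std::size_t tid) {
    return !rob.isEmpty(tid);
}
```

With only thread 0 occupying the toy ROB, the global guard still allows a tail
read for thread 1, while the per-thread guard does not.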

In iew_impl.hh, every thread is told it has the entire remaining IQ, when
each thread actually has its own allocation. The result is extra stalls at
the iew dispatch stage, which the rename stage usually takes care of.
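
The IQ case can be sketched the same way, assuming a statically partitioned
queue in which each thread has a fixed allocation. ToyIQ and its fields are
hypothetical and only mimic the idea of a global versus a per-thread
numFreeEntries(); gem5's real IQ sharing policies are not modeled here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical toy model for illustration only; this is not gem5's IQ.
constexpr std::size_t kIqThreads = 2;

struct ToyIQ {
    std::array<unsigned, kIqThreads> maxEntries{}; // per-thread allocation
    std::array<unsigned, kIqThreads> count{};      // per-thread occupancy

    // Free entries summed over all threads.
    unsigned numFreeEntries() const {
        unsigned total = 0;
        for (std::size_t tid = 0; tid < kIqThreads; ++tid)
            total += maxEntries[tid] - count[tid];
        return total;
    }

    // Free entries left in the given thread's own allocation.
    unsigned numFreeEntries(std::size_t tid) const {
        return maxEntries[tid] - count[tid];
    }
};
```

If thread 0 has used up its allocation while thread 1 has not, the global
count still reports free entries, but thread 0 has none left to dispatch into.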

In commit_impl.hh, rob->readHeadInst(thread 1) can be called if the rob only
contains instructions from thread 0. This returns a dummyInst (which may work
since we are trying to squash all instructions, but hardly seems like the right
way to do it).

In rob_impl.hh, this fix skips the rest of the function more often, which is
more efficient.


Diffs
-----

  src/cpu/o3/commit_impl.hh 24cfe67c0749 
  src/cpu/o3/cpu.cc 24cfe67c0749 
  src/cpu/o3/iew_impl.hh 24cfe67c0749 
  src/cpu/o3/rob_impl.hh 24cfe67c0749 

Diff: http://reviews.gem5.org/r/2181/diff/


Testing
-------

Ran the quick Alpha debug regression; the SMT test's stats change as expected.

***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/linux/inorder-timing passed.
***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/linux/o3-timing passed.
***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/linux/simple-atomic passed.
***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/linux/simple-timing passed.
***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/linux/simple-timing-ruby passed.
***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/tru64/o3-timing passed.
***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/tru64/simple-atomic passed.
***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/tru64/simple-timing passed.
***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/tru64/simple-timing-ruby passed.
***** build/ALPHA/tests/debug/quick/se/01.hello-2T-smt/alpha/linux/o3-timing CHANGED!
***** build/ALPHA/tests/debug/quick/se/20.eio-short/alpha/eio/simple-atomic skipped.
***** build/ALPHA/tests/debug/quick/se/20.eio-short/alpha/eio/simple-timing skipped.
***** build/ALPHA/tests/debug/quick/se/30.eio-mp/alpha/eio/simple-atomic-mp skipped.
***** build/ALPHA/tests/debug/quick/se/30.eio-mp/alpha/eio/simple-timing-mp skipped.
***** build/ALPHA/tests/debug/quick/se/50.memtest/alpha/linux/memtest-ruby passed.
***** build/ALPHA/tests/debug/quick/se/60.rubytest/alpha/linux/rubytest-ruby passed.

===== Statistics differences =====
Maximum error magnitude: +9999.000000%

                                               Reference  New Value   Abs Diff   Pct Chg
Key statistics:

  host_inst_rate                                   46987       9998     -36989   -78.72%
  host_mem_usage                                  231368     228376      -2992    -1.29%
  sim_insts                                        12745      12745          0    +0.00%
  sim_ops                                          12745      12745          0    +0.00%
  sim_ticks                                     24229500   24353500     124000    +0.51%
  system.cpu.commit.committedInsts::0               6390       6390          0    +0.00%
  system.cpu.commit.committedInsts::1               6389       6389          0    +0.00%
  system.cpu.commit.committedInsts::total          12779      12779          0    +0.00%
  system.cpu.commit.committedOps::0                 6390       6390          0    +0.00%
  system.cpu.commit.committedOps::1                 6389       6389          0    +0.00%
  system.cpu.commit.committedOps::total            12779      12779          0    +0.00%
  system.cpu.committedInsts::0                      6373       6373          0    +0.00%
  system.cpu.committedInsts::1                      6372       6372          0    +0.00%
  system.cpu.committedInsts_total                  12745      12745          0    +0.00%
  system.cpu.committedOps::0                        6373       6373          0    +0.00%
  system.cpu.committedOps::1                        6372       6372          0    +0.00%
  system.cpu.ipc::0                             0.131511   0.130841  -0.000670    -0.51%
  system.cpu.ipc::1                             0.131490   0.130820  -0.000670    -0.51%
  system.cpu.ipc_total                          0.263000   0.261661  -0.001339    -0.51%

Differences > 0%:

  system.cpu.iew.iewLSQFullEvents                      0          2          2 +9999.00%
  system.cpu.iew.iewIQFullEvents                      23          4        -19   -82.61%
  system.cpu.iew.lsq.thread1.ignoredResponses          2          3          1   +50.00%
  system.physmem.bytesPerActivate::832             2.000      1.000     -1.000   -50.00%
  system.cpu.iew.iewBlockCycles                     2954       1837      -1117   -37.81%
  system.cpu.iew.lsq.thread1.forwLoads                47         64         17   +36.17%
  system.cpu.memDep0.conflictingLoads                  9          6         -3   -33.33%
  system.physmem.bytesPerActivate::320             3.000      2.000     -1.000   -33.33%
  system.physmem.bytesPerActivate::448             3.000      4.000      1.000   +33.33%
  system.physmem.bytesPerActivate::960             3.000      4.000      1.000   +33.33%
  system.cpu.iq.issued_per_cycle::8               22.000     15.000     -7.000   -31.82%
  system.cpu.rename.ROBFullEvents                     54         71         17   +31.48%
  system.physmem.bytesPerActivate::576             4.000      5.000      1.000   +25.00%
  system.physmem.bytesPerActivate::640             4.000      5.000      1.000   +25.00%
  system.physmem.bytesPerActivate::704             5.000      6.000      1.000   +20.00%
  system.physmem.rdQLenPdf::4                         15         12         -3   -20.00%
  system.cpu.iq.iqSquashedInstsIssued                131        105        -26   -19.85%
  system.cpu.rename.serializeStallCycles            1585       1276       -309   -19.50%
  system.cpu.rename.BlockCycles                     6164       5248       -916   -14.86%
  system.cpu.iew.iewUnblockCycles                     42         36         -6   -14.29%
[... showing top 20 errors only, additional errors omitted ...]

Missing 2 reference statistics:

  system.physmem.bytesPerActivate::1536                 2      0.92%     98.16%
  system.physmem.bytesPerActivate::896                  2      0.92%     94.47%

Found 2 new statistics:

  system.cpu.rename.IQFullEvents                        39
  system.physmem.bytesPerActivate::1664                 2      0.94%     98.59%


Thanks,

Faissal Sleiman

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
