----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://reviews.gem5.org/r/2181/ -----------------------------------------------------------
Review request for Default. Repository: gem5 Description ------- Changeset 10101:1cb43a0d4ec6 --------------------------- o3: Fix occupancy checks for SMT A number of calls to isEmpty() and numFreeEntries() should be thread-specific. In cpu.cc, the fact that tid is /*commented*/ out is a bug. Say the rob has instructions from thread 0 (isEmpty() returns false), and none from thread 1. If we are trying to squash all of thread 1, then readTailInst(thread 1) will be called because rob->isEmpty() returns false. The result is end_it is not in the list and the while statement loops indefinitely back over the cpu's instList. In iew_impl.hh, all threads are told they have the entire remaining IQ, when each thread actually has a certain allocation. The result is extra stalls at the iew dispatch stage which the rename stage usually takes care of. In commit_impl.hh, rob->readHeadInst(thread 1) can be called if the rob only contains instructions from thread 0. This returns a dummyInst (which may work since we are trying to squash all instructions, but hardly seems like the right way to do it). In rob_impl.hh this fix skips the rest of the function more frequently and is more efficient. Diffs ----- src/cpu/o3/commit_impl.hh 24cfe67c0749 src/cpu/o3/cpu.cc 24cfe67c0749 src/cpu/o3/iew_impl.hh 24cfe67c0749 src/cpu/o3/rob_impl.hh 24cfe67c0749 Diff: http://reviews.gem5.org/r/2181/diff/ Testing ------- quick Alpha debug regression, SMT test's stats change as expected. ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/linux/inorder-timing passed. ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/linux/o3-timing passed. ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/linux/simple-atomic passed. ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/linux/simple-timing passed. ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/linux/simple-timing-ruby passed. ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/tru64/o3-timing passed. ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/tru64/simple-atomic passed. ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/tru64/simple-timing passed. ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/tru64/simple-timing-ruby passed. ***** build/ALPHA/tests/debug/quick/se/01.hello-2T-smt/alpha/linux/o3-timing CHANGED! ***** build/ALPHA/tests/debug/quick/se/20.eio-short/alpha/eio/simple-atomic skipped. ***** build/ALPHA/tests/debug/quick/se/20.eio-short/alpha/eio/simple-timing skipped. ***** build/ALPHA/tests/debug/quick/se/30.eio-mp/alpha/eio/simple-atomic-mp skipped. ***** build/ALPHA/tests/debug/quick/se/30.eio-mp/alpha/eio/simple-timing-mp skipped. ***** build/ALPHA/tests/debug/quick/se/50.memtest/alpha/linux/memtest-ruby passed. ***** build/ALPHA/tests/debug/quick/se/60.rubytest/alpha/linux/rubytest-ruby passed. ===== Statistics differences ===== Maximum error magnitude: +9999.000000% Reference New Value Abs Diff Pct Chg Key statistics: host_inst_rate 46987 9998 -36989 -78.72% host_mem_usage 231368 228376 -2992 -1.29% sim_insts 12745 12745 0 +0.00% sim_ops 12745 12745 0 +0.00% sim_ticks 24229500 24353500 124000 +0.51% system.cpu.commit.committedInsts::0 6390 6390 0 +0.00% system.cpu.commit.committedInsts::1 6389 6389 0 +0.00% system.cpu.commit.committedInsts::total 12779 12779 0 +0.00% system.cpu.commit.committedOps::0 6390 6390 0 +0.00% system.cpu.commit.committedOps::1 6389 6389 0 +0.00% system.cpu.commit.committedOps::total 12779 12779 0 +0.00% system.cpu.committedInsts::0 6373 6373 0 +0.00% system.cpu.committedInsts::1 6372 6372 0 +0.00% system.cpu.committedInsts_total 12745 12745 0 +0.00% system.cpu.committedOps::0 6373 6373 0 +0.00% system.cpu.committedOps::1 6372 6372 0 +0.00% system.cpu.ipc::0 0.131511 0.130841 -0.000670 -0.51% system.cpu.ipc::1 0.131490 0.130820 -0.000670 -0.51% system.cpu.ipc_total 0.263000 0.261661 -0.001339 -0.51% Differences > 0%: system.cpu.iew.iewLSQFullEvents 0 2 2 +9999.00% system.cpu.iew.iewIQFullEvents 23 4 -19 -82.61% system.cpu.iew.lsq.thread1.ignoredResponses 2 3 1 +50.00% system.physmem.bytesPerActivate::832 2.000 1.000 -1.000 -50.00% system.cpu.iew.iewBlockCycles 2954 1837 -1117 -37.81% system.cpu.iew.lsq.thread1.forwLoads 47 64 17 +36.17% system.cpu.memDep0.conflictingLoads 9 6 -3 -33.33% system.physmem.bytesPerActivate::320 3.000 2.000 -1.000 -33.33% system.physmem.bytesPerActivate::448 3.000 4.000 1.000 +33.33% system.physmem.bytesPerActivate::960 3.000 4.000 1.000 +33.33% system.cpu.iq.issued_per_cycle::8 22.000 15.000 -7.000 -31.82% system.cpu.rename.ROBFullEvents 54 71 17 +31.48% system.physmem.bytesPerActivate::576 4.000 5.000 1.000 +25.00% system.physmem.bytesPerActivate::640 4.000 5.000 1.000 +25.00% system.physmem.bytesPerActivate::704 5.000 6.000 1.000 +20.00% system.physmem.rdQLenPdf::4 15 12 -3 -20.00% system.cpu.iq.iqSquashedInstsIssued 131 105 -26 -19.85% system.cpu.rename.serializeStallCycles 1585 1276 -309 -19.50% system.cpu.rename.BlockCycles 6164 5248 -916 -14.86% system.cpu.iew.iewUnblockCycles 42 36 -6 -14.29% [... showing top 20 errors only, additional errors omitted ...] Missing 2 reference statistics: system.physmem.bytesPerActivate::1536 2 0.92% 98.16% system.physmem.bytesPerActivate::896 2 0.92% 94.47% Found 2 new statistics: system.cpu.rename.IQFullEvents 39 system.physmem.bytesPerActivate::1664 2 0.94% 98.59% Thanks, Faissal Sleiman _______________________________________________ gem5-dev mailing list [email protected] http://m5sim.org/mailman/listinfo/gem5-dev
