> On Aug. 26, 2012, 2:26 p.m., Steve Reinhardt wrote: > > src/sim/clocked_object.hh, line 85 > > <http://reviews.gem5.org/r/1321/diff/4/?file=29012#file29012line85> > > > > Since tick and clock correspond to the *next* clock edge, I think this > > '==' could be '>='. It wouldn't change the results at all (I'm pretty > > sure) but would avoid the divide in more cases. > > > > Also, re: your problem with tick += clock being outside the if -- I'd > > say the real problem is that between tick += clock and cycle += 1, one was > > outside the if and the other was inside. So there's an alternate solution > > like this: > > > > tick += clock; > > cycle += 1; > > > > if (tick >= curTick()) { > > return; > > } > > > > The elapsedCycles code doesn't have to change at all, though the value > > of elapsedCycles would now be smaller by one than the value with the > > current code. > > > > I'm not sure which version is better; I thought of this new one in the > > process of trying to eliminate the redundant addition of tick to clock > > that's in the current code, though (1) I'd think a good compiler could > > eliminate that anyway and (2) it's just an integer add so who cares. Feel > > free to pick whichever version you prefer.
The proposed change causes a lot of regressions to behave quite differently. I haven't managed to figure out exactly why yet, but intuitively I find my previous solution easier to follow. I'll do the changes one by one and see what the impact is (== to >= followed by moving the addition outside the if-statement). - Andreas ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://reviews.gem5.org/r/1321/#review3332 ----------------------------------------------------------- On Aug. 23, 2012, 2:48 a.m., Andreas Hansson wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > http://reviews.gem5.org/r/1321/ > ----------------------------------------------------------- > > (Updated Aug. 23, 2012, 2:48 a.m.) > > > Review request for Default. > > > Description > ------- > > Changeset 9168:571d3ae56173 > --------------------------- > Clock: Rework clocks to avoid tick-to-cycle transformations > > This patch introduces the notion of a clock update function that aims > to avoid costly divisions when turning the current tick into a > cycle. Each clocked object advances a private (hidden) cycle member > and a tick member and uses these to implement functions for getting > the tick of the next cycle, or the tick of a cycle some time in the > future. > > In the different modules using the clocks, changes are made to avoid > counting in ticks only to later translate to cycles. There are a few > oddities in how the O3 and inorder CPU count idle cycles, as seen by a > few locations where a cycle is subtracted in the calculation. This is > done such that the regression does not change any stats, but should be > revisited in a future patch. > > Another, much needed, change that is not done as part of this patch is > to introduce a new typedef uint64_t Cycle to be able to at least hint > at the unit of the variables counting Ticks vs Cycles. This will be > done as a follow-up patch. > > As an additional follow up, the thread context still uses ticks for > the book keeping of last activate and last suspend and this should > probably also be changed into cycles as well. > > > Diffs > ----- > > src/arch/arm/table_walker.cc 1d983855df2c > src/arch/x86/mmapped_ipr.hh 1d983855df2c > src/cpu/base.cc 1d983855df2c > src/cpu/inorder/cpu.hh 1d983855df2c > src/cpu/inorder/cpu.cc 1d983855df2c > src/cpu/inorder/resource.cc 1d983855df2c > src/cpu/inorder/resource_pool.cc 1d983855df2c > src/cpu/o3/O3CPU.py 1d983855df2c > src/cpu/o3/commit.hh 1d983855df2c > src/cpu/o3/commit_impl.hh 1d983855df2c > src/cpu/o3/cpu.hh 1d983855df2c > src/cpu/o3/cpu.cc 1d983855df2c > src/cpu/o3/fetch_impl.hh 1d983855df2c > src/cpu/o3/inst_queue_impl.hh 1d983855df2c > src/cpu/o3/lsq_unit.hh 1d983855df2c > src/cpu/simple/atomic.cc 1d983855df2c > src/cpu/simple/timing.hh 1d983855df2c > src/cpu/simple/timing.cc 1d983855df2c > src/cpu/testers/memtest/memtest.cc 1d983855df2c > src/cpu/testers/networktest/networktest.cc 1d983855df2c > src/dev/arm/pl111.cc 1d983855df2c > src/dev/i8254xGBe.cc 1d983855df2c > src/dev/ns_gige.cc 1d983855df2c > src/sim/clocked_object.hh 1d983855df2c > > Diff: http://reviews.gem5.org/r/1321/diff/ > > > Testing > ------- > > util/regress all passing (disregarding t1000 and eio) > > A minor update. This change did improve performance. Running the > full regression, including a clean compile of all the ISAs went > down by 8%. Note that this includes the time for building as well. > > > Thanks, > > Andreas Hansson > > _______________________________________________ gem5-dev mailing list [email protected] http://m5sim.org/mailman/listinfo/gem5-dev
