> On Aug. 26, 2012, 2:26 p.m., Steve Reinhardt wrote:
> > src/sim/clocked_object.hh, line 85
> > <http://reviews.gem5.org/r/1321/diff/4/?file=29012#file29012line85>
> >
> >     Since tick and clock correspond to the *next* clock edge, I think this 
> > '==' could be '>='.  It wouldn't change the results at all (I'm pretty 
> > sure) but would avoid the divide in more cases.
> >     
> >     Also, re: your problem with tick += clock being outside the if -- I'd 
> > say the real problem is that between tick += clock and cycle += 1, one was 
> > outside the if and the other was inside.  So there's an alternate solution 
> > like this:
> >     
> >         tick += clock;
> >         cycle += 1;
> >     
> >         if (tick >= curTick()) {
> >             return;
> >         }
> >     
> >     The elapsedCycles code doesn't have to change at all, though the value 
> > of elapsedCycles would now be smaller by one than the value with the 
> > current code.
> >     
> >     I'm not sure which version is better; I thought of this new one in the 
> > process of trying to eliminate the redundant addition of tick to clock 
> > that's in the current code, though (1) I'd think a good compiler could 
> > eliminate that anyway and (2) it's just an integer add so who cares.  Feel 
> > free to pick whichever version you prefer.
> 
> Andreas Hansson wrote:
>     The proposed change causes a lot of regressions to behave quite 
> differently. I haven't managed to figure out exactly why yet, but intuitively 
> I find my previous solution easier to follow.
>     
>     I'll do the changes one by one and see what the impact is (== to >= 
> followed by moving the addition outside the if-statement).
> 
> Steve Reinhardt wrote:
>     Interesting... do you agree that they look like they should be completely 
> equivalent?
>     
>     Thanks for looking into this... I thought it would be a simple change, 
> didn't mean to cause you a lot of extra work.
> 
> Andreas Hansson wrote:
>     I also do not see why they are not equivalent. My guess is that it has to 
> do with the case when the original tick < curTick() but tick + clock > 
> curTick(), but that is just a guess. I'll do some more digging.
> 
> Andreas Hansson wrote:
>     Indeed, it is exactly the case I mention that is causing problems with 
> the code you suggested. With my original code, the if-statement will not hold 
> (tick + clock is not curTick()) and so the update is based on the 
> elapsedCycles.
>     
>     With your suggested change we have already added a clock period and a 
> cycle, and actually we shouldn't have done so. I guess we can either add yet 
> another if statement to catch this case, or fall back on the code as I had it 
> previously.
>     
>     Ideas? Opinions?

At this point I'm mostly curious about why I was wrong when I thought that this 
code would be equivalent.  Your original code handles two special cases:
    tick == curTick()
    tick + clock == curTick()
I'm trying to handle the additional case where
    tick < curTick() < tick + clock
in which case the elapsedCycles calculation should end up having the same 
effect as
    { tick += clock; clock += 1; }
because divCeil(curTick() - tick, clock) should be 1 if tick < curTick() <= 
tick + clock.  Right?

So in other words if tick < curTick() then we should always be adding a clock 
period and a cycle, and the only question is whether adding one is sufficient 
or if we really need to add more than one.

It just occured to me that the case that seems like it might be causing 
problems is if update() gets called with tick > curTick().  In this case I 
think what's happening with the current code is that both of the ifs fail, and 
we go through the elapsedCycles computation but curTick() - tick turns out to 
be negative so elapsedCycles is zero and nothing happens.

So here's my new hypothesis: if you change the very first condition to;
    if (tick >= curTick())
        return;

then all will be well, and we'll save even more trips through divCeil() than we 
even realized were happening before...

Of course, I have been wrong before.


- Steve


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/1321/#review3332
-----------------------------------------------------------


On Aug. 23, 2012, 2:48 a.m., Andreas Hansson wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/1321/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2012, 2:48 a.m.)
> 
> 
> Review request for Default.
> 
> 
> Description
> -------
> 
> Changeset 9168:571d3ae56173
> ---------------------------
> Clock: Rework clocks to avoid tick-to-cycle transformations
> 
> This patch introduces the notion of a clock update function that aims
> to avoid costly divisions when turning the current tick into a
> cycle. Each clocked object advances a private (hidden) cycle member
> and a tick member and uses these to implement functions for getting
> the tick of the next cycle, or the tick of a cycle some time in the
> future.
> 
> In the different modules using the clocks, changes are made to avoid
> counting in ticks only to later translate to cycles. There are a few
> oddities in how the O3 and inorder CPU count idle cycles, as seen by a
> few locations where a cycle is subtracted in the calculation. This is
> done such that the regression does not change any stats, but should be
> revisited in a future patch.
> 
> Another, much needed, change that is not done as part of this patch is
> to introduce a new typedef uint64_t Cycle to be able to at least hint
> at the unit of the variables counting Ticks vs Cycles. This will be
> done as a follow-up patch.
> 
> As an additional follow up, the thread context still uses ticks for
> the book keeping of last activate and last suspend and this should
> probably also be changed into cycles as well.
> 
> 
> Diffs
> -----
> 
>   src/arch/arm/table_walker.cc 1d983855df2c 
>   src/arch/x86/mmapped_ipr.hh 1d983855df2c 
>   src/cpu/base.cc 1d983855df2c 
>   src/cpu/inorder/cpu.hh 1d983855df2c 
>   src/cpu/inorder/cpu.cc 1d983855df2c 
>   src/cpu/inorder/resource.cc 1d983855df2c 
>   src/cpu/inorder/resource_pool.cc 1d983855df2c 
>   src/cpu/o3/O3CPU.py 1d983855df2c 
>   src/cpu/o3/commit.hh 1d983855df2c 
>   src/cpu/o3/commit_impl.hh 1d983855df2c 
>   src/cpu/o3/cpu.hh 1d983855df2c 
>   src/cpu/o3/cpu.cc 1d983855df2c 
>   src/cpu/o3/fetch_impl.hh 1d983855df2c 
>   src/cpu/o3/inst_queue_impl.hh 1d983855df2c 
>   src/cpu/o3/lsq_unit.hh 1d983855df2c 
>   src/cpu/simple/atomic.cc 1d983855df2c 
>   src/cpu/simple/timing.hh 1d983855df2c 
>   src/cpu/simple/timing.cc 1d983855df2c 
>   src/cpu/testers/memtest/memtest.cc 1d983855df2c 
>   src/cpu/testers/networktest/networktest.cc 1d983855df2c 
>   src/dev/arm/pl111.cc 1d983855df2c 
>   src/dev/i8254xGBe.cc 1d983855df2c 
>   src/dev/ns_gige.cc 1d983855df2c 
>   src/sim/clocked_object.hh 1d983855df2c 
> 
> Diff: http://reviews.gem5.org/r/1321/diff/
> 
> 
> Testing
> -------
> 
> util/regress all passing (disregarding t1000 and eio)
> 
> A minor update. This change did improve performance. Running the
> full regression, including a clean compile of all the ISAs went
> down by 8%. Note that this includes the time for building as well.
> 
> 
> Thanks,
> 
> Andreas Hansson
> 
>

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev

Reply via email to