I'd like the CPUs to remain as dumb as possible as far as ISA
semantics and mechanisms go, so that neither they nor the ISAs are
unnecessarily constrained or complicated. In light of that, I think
it's actually better if the CPU has no idea what predication is or
when it may or may not have happened.
This is a class of stats I don't think we have a good way to collect,
though. We might want to know how often store conditionals fail, how
often compare and swaps fail, how often a register window push/pop
needs to fill/spill, etc. I think a more general mechanism that solved
-this- problem would solve the larger problem, and I think could be
quite useful.
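One way to picture the more general mechanism described above is a sink of opaque event counters that the CPU owns and the instructions report into. This is only a minimal sketch under stated assumptions: `InstEvent`, `EventCounters`, and `executeStoreConditional` are hypothetical names invented for illustration, not existing M5 classes or functions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch: none of these names exist in M5.
// Outcomes an instruction might want to report without the CPU
// understanding what they mean.
enum class InstEvent : std::size_t {
    PredicatedFalse,
    StoreCondFailed,
    CompareSwapFailed,
    RegWindowSpill,
    RegWindowFill,
    NumEvents
};

// The CPU owns one of these and only ever counts opaque event ids.
class EventCounters
{
  public:
    void record(InstEvent e) { ++counts[index(e)]; }
    std::uint64_t count(InstEvent e) const { return counts[index(e)]; }

  private:
    static std::size_t index(InstEvent e)
    { return static_cast<std::size_t>(e); }

    // Value-initialized, so every counter starts at zero.
    std::array<std::uint64_t,
               static_cast<std::size_t>(InstEvent::NumEvents)> counts{};
};

// Stand-in for an instruction's execute(): the ISA decides what
// happened and reports it; the CPU just provides the sink.
inline void executeStoreConditional(EventCounters &stats, bool lockHeld)
{
    if (!lockHeld)
        stats.record(InstEvent::StoreCondFailed);
}
```

With something along these lines the CPU would count events without knowing what a store conditional or a predicate is; whether this fits the existing stats package is an open question.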
Gabe
Quoting Ali Saidi <sa...@umich.edu>:
Yes, but we've not really solved the larger problem, which is that a
CPU model should be able to see if an instruction is predicated false.
For example, it's impossible to create a statistic that is the number
of instructions that were predicated false, when we should just be able
to do if (inst->readPredicate() == false) predicatedFalse++; in the CPU
models.
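The statistic described above can be sketched as follows. `SketchInst` is a hypothetical stand-in for a dynamic instruction; only the `readPredicate()` and `setPredicate()` names appear in this thread, and the rest is assumed for illustration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a dynamic instruction; only the
// readPredicate()/setPredicate() names come from the thread.
struct SketchInst
{
    bool predicate = true;   // true unless the ISA says otherwise
    bool readPredicate() const { return predicate; }
    void setPredicate(bool p) { predicate = p; }
};

// Count the instructions in a stream whose predicate was false.
inline std::uint64_t
countPredicatedFalse(const std::vector<SketchInst> &insts)
{
    std::uint64_t predicatedFalse = 0;
    for (const SketchInst &inst : insts) {
        if (!inst.readPredicate())
            ++predicatedFalse;
    }
    return predicatedFalse;
}
```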
Ali
On Fri, 20 Aug 2010 11:42:23 -0700, Gabe Black <gbl...@eecs.umich.edu>
wrote:
Well the problem is just the load/store instructions, right? Otherwise
the execute method can just do/not do whatever it needs without having
to coordinate with the CPU. If you make your proposed isa parser
changes, the instructions should be able to handle all the other cases
internally without too much fuss.
Gabe
Ali Saidi wrote:
Interesting... we're top posting on this thread and bottom posting on
the other one...
Anyway... yes, you're correct: initiateAcc() is called one instruction
at a time.
For memory ops alone, the o3 model could be changed in
executeLoad/Store():
    inst->predicated = false; // assume failure
    load_fault = inst->initiateAcc();
    if (inst->predicated) ....
However, to make this work, the read()/write() methods would have to
set:
    inst->predicated = true;
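Put together, the "absence of a call" approach could look roughly like the sketch below. `SketchMemInst`, `LoadOutcome`, and the `condition` member are hypothetical; only `predicated`, `initiateAcc()`, `read()`, and `executeLoad` are names taken from this thread, and their real signatures differ.

```cpp
// Hypothetical sketch of the "absence of a call" approach.
struct SketchMemInst
{
    bool predicated = false;  // becomes true once an access is issued
    bool condition = true;    // stand-in for the ISA-level predicate test

    // Stand-in for the execution-context read(); records that an
    // access actually happened.
    void read() { predicated = true; }

    // Stand-in for initiateAcc(): only touches memory if the
    // predicate passes.
    void initiateAcc() { if (condition) read(); }
};

enum class LoadOutcome { WaitForData, SendToCommit };

inline LoadOutcome executeLoad(SketchMemInst &inst)
{
    inst.predicated = false;  // assume failure
    inst.initiateAcc();
    // No read() call means there is no data to wait for.
    return inst.predicated ? LoadOutcome::WaitForData
                           : LoadOutcome::SendToCommit;
}
```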
That isn't so bad, but I still have two problems with this method:
1) This only works for load/store instructions. There isn't a
corresponding way to do this for any other type of op, since in
either case they're going to call setInt/FloatReg().
2) I'm not really a fan of the idea that the absence of a call is
what triggers the mechanism. It seems like a convoluted approach
compared to having:
if (testPredicate(....)) {
.....
} else {
....
xc->setPredicate(false);
}
Doing it the way it's currently implemented also means that the same
mechanism works for multiple cpu models that might need it.
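To illustrate how one explicit call could serve several CPU models, here is a minimal sketch. All class names are hypothetical; only `setPredicate()` is named in this thread. A model that doesn't care about predication provides a no-op, while an O3-like model records the value.

```cpp
// Hypothetical sketch; only setPredicate() is named in the thread.
struct ExecContextSketch
{
    virtual ~ExecContextSketch() = default;
    virtual void setPredicate(bool p) = 0;
};

// A simple CPU that has no use for the information can ignore it.
struct SimpleContextSketch : ExecContextSketch
{
    void setPredicate(bool) override {}
};

// An O3-like context records it so later stages can consult it.
struct O3ContextSketch : ExecContextSketch
{
    bool predicate = true;
    void setPredicate(bool p) override { predicate = p; }
};

// Stand-in for generated instruction code: either do the work or
// tell the context that the predicate was false.
inline void executeSketch(ExecContextSketch &xc, bool condition,
                          int a, int b, int &dest)
{
    if (condition) {
        dest = a + b;
    } else {
        xc.setPredicate(false);
    }
}
```

The instruction code is identical for both models; only the context decides whether the predicate result is kept.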
Thanks,
Ali
On Thu, 19 Aug 2010 20:45:41 -0700, Gabe Black <gbl...@eecs.umich.edu>
wrote:
O3 only -seems- to execute multiple memory instructions at a time.
Each initiateAcc is called one at a time and completes execution before
the next starts, so the same thing should apply as far as that goes.
Gabe
Ali Saidi wrote:
Remember that the timing cpu is only executing one instruction at a
time. If the instruction calls read() and no access isn't set, the
timing cpu packages up the request, ships it out, and sets its state to
DcacheWaitResponse. If the instruction doesn't call read(), it continues
on like nothing happened (because it didn't). With the o3 cpu there are
multiple instructions in flight, so simply waiting for it to call read
or not isn't an option. Something has to be passed through the xc to
tell the cpu that this instruction is done, since the normal mechanism
won't take care of it. The way this works now is by passing back
something other than NoFault. However, the instruction didn't actually
fault, so we would then have to special-case everything that reads that
fault later on in the pipeline to say: if it's a predication fault, do
the same thing you would do for no fault. This seems worse and more
error prone.
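The special-casing concern can be made concrete with a small sketch. The enum and function names are hypothetical and do not reflect M5's actual fault classes; the point is only that every consumer of the fault has to remember the extra check.

```cpp
// Hypothetical sketch of why a sentinel fault is awkward.
enum class FaultSketch { NoFault, PredicationFault, PageFault };

// Every consumer of the fault needs this extra comparison.
inline bool isRealFault(FaultSketch f)
{
    return f != FaultSketch::NoFault &&
           f != FaultSketch::PredicationFault;
}

// Two stand-ins for later pipeline stages. If either one compared
// against NoFault directly, a predicated-false instruction would be
// treated as a genuine fault.
inline bool shouldSquash(FaultSketch f) { return isRealFault(f); }
inline bool shouldInvokeHandler(FaultSketch f) { return isRealFault(f); }
```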
Thanks,
ali
On Thu, 19 Aug 2010 15:16:22 -0400, Gabriel Michael Black
<gbl...@eecs.umich.edu> wrote:
I don't think that's true, but I may be confused. I think, at least
for the timing CPU, that it checks if a read/write was called and
doesn't just fall through. The timing CPU would wait forever for a
read/write response that would never come otherwise. (digs around a
little) I think it's sort of like what I described. The CPU will
continue if it's not waiting for anything, which would be the case if
no access actually happened. We could probably get the same behavior
if we checked if the instruction was waiting for a read/write
response, but that might be kept somewhere annoying to get at.
Generally, if we can hide the existence of predication from the CPUs,
I think that'll make everyone's life easier (except for the ARM ISA's,
I suppose).
Gabe
Quoting Ali Saidi <sa...@umich.edu>:
It's not the same issue here. The simple cpus just have their
execute/completeAccess methods guarded by a predicate condition test.
If nothing happens in there, so be it, and the cpu goes on to the next
instruction without complaint. The out of order cpu, on the other hand,
needs to know if the instruction was predicated false so it can notify
commit that it is complete, even though it hasn't done anything. If
commit isn't notified, the instruction will never commit and the
processor will stall.
This information should clearly belong in the dyninst. Unless there is
some other way to access the class from the isa description, I think
the change is correct. An alternate approach would be to have the
method in threadstate do nothing, because it's unimportant for it.
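A toy model of the stall described above, assuming an in-order commit over a queue of entries. `RobEntrySketch`, `executeEntry`, and `commitReady` are hypothetical names and greatly simplify what the O3 model actually does.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical sketch of an in-flight instruction record.
struct RobEntrySketch
{
    bool predicate = true;   // result of the predicate test
    bool executed = false;   // set once commit may retire the entry
};

// A predicated-false instruction has nothing to wait for, so it is
// marked complete immediately; otherwise wait for the data.
inline void executeEntry(RobEntrySketch &e, bool dataReturned)
{
    if (!e.predicate || dataReturned)
        e.executed = true;
}

// Commit retires in order and stops at the first incomplete entry.
// Returns how many entries were retired.
inline std::size_t commitReady(std::deque<RobEntrySketch> &rob)
{
    std::size_t committed = 0;
    while (!rob.empty() && rob.front().executed) {
        rob.pop_front();
        ++committed;
    }
    return committed;
}
```

If `executeEntry` did not consult the predicate, a predicated-false load at the head of the queue would never be marked executed and nothing behind it could retire.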
Ali
On Tue, 17 Aug 2010 19:04:10 -0400, Gabriel Michael Black
<gbl...@eecs.umich.edu> wrote:
Sorry if I wasn't clear before (I reread my post and it sounded a
little vague) but what the simple CPU does is keep track of whether
the supposed memory instruction actually calls read or write on the
execution context. If not, then the CPU doesn't try to complete any
access, it just considers that part over. Ideally we can do the same
thing here.
Gabe
Quoting Ali Saidi <sa...@umich.edu>:
Anyone have comments on this? It seems like this is the only way to
access the DynInst from the isa description. Threadstate does have the
current instruction in it, as well as things like "Temporary storage to
pass the source address from copy_load to". It doesn't seem too out of
place to include current instruction predication state in there.
Ali
On Sat, 14 Aug 2010 07:08:35 -0000, "Gabe Black"
<gbl...@eecs.umich.edu>
wrote:
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.m5sim.org/r/177/#review185
-----------------------------------------------------------
src/cpu/thread_context.hh
<http://reviews.m5sim.org/r/177/#comment326>
This isn't really a property of a thread, it's the property of a
single instruction. I don't think this is being done in the right
place. I think we should have a discussion on m5-dev to determine the
best way to handle this. There was a little code added to the simple
CPU that does what this is supposed to do if a memory instruction
didn't actually read or write memory, and I think this is a better way
to handle this. We should have a discussion about this on m5-dev,
especially since it touches lots of low level bits like *contexts,
instruction behavior, CPUs, etc. These sorts of changes need to be
made carefully.
- Gabe
On 2010-08-13 10:12:35, Ali Saidi wrote:
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.m5sim.org/r/177/
-----------------------------------------------------------
(Updated 2010-08-13 10:12:35)
Review request for Default and Min Kyu Jeong.
Summary
-------
ARM/O3: store the result of the predicate evaluation in DynInst or
Threadstate.
This allows the CPU to handle predicated-false instructions
accordingly. This particular patch makes loads that are
predicated-false get sent straight to the commit stage, without waiting
for the return of data that was never requested.
Diffs
-----
src/arch/arm/isa/templates/mem.isa 3c48b2b3cb83
src/arch/arm/isa/templates/pred.isa 3c48b2b3cb83
src/cpu/base_dyn_inst.hh 3c48b2b3cb83
src/cpu/base_dyn_inst_impl.hh 3c48b2b3cb83
src/cpu/o3/lsq_unit_impl.hh 3c48b2b3cb83
src/cpu/simple/base.hh 3c48b2b3cb83
src/cpu/simple_thread.hh 3c48b2b3cb83
src/cpu/thread_context.hh 3c48b2b3cb83
Diff: http://reviews.m5sim.org/r/177/diff
Testing
-------
Thanks,
Ali
_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev