I'd like the CPUs to remain as dumb as possible as far as ISA semantics and mechanisms so neither they nor the ISAs are unnecessarily constrained or complicated. In light of that, I think it's actually better if the CPU has no idea what predication is or when it may or may not have happened.

This is a class of stats I don't think we have a good way to collect, though. We might want to know how often store conditionals fail, how often compare and swaps fail, how often a register window push/pop needs to fill/spill, etc. I think a more general mechanism that solved -this- problem would solve the larger problem, and I think could be quite useful.

Gabe

Quoting Ali Saidi <sa...@umich.edu>:


Yes, but we've not really solved the larger problem that is a CPU model
should be able to see if a instruction is predicated false. For example,
it's impossible to create a statistics that is the number of instructions
that were predicated false when we should just be able to do if
(inst->readPredicate() == false) predicatedFalse++; in the CPU models.

Ali


On Fri, 20 Aug 2010 11:42:23 -0700, Gabe Black <gbl...@eecs.umich.edu>
wrote:
Well the problem is just the load/store instructions, right? Otherwise
the execute method can just do/not do whatever it needs without having
to coordinate with the CPU. If you make your proposed isa parser
changes, the instructions should be able to handle all the other cases
internally without too much fuss.

Gabe

Ali Saidi wrote:
Interesting... we're top posting on this thread and bottom posting on
the
other one...

Anyway... yes you're correct initiateAcc() is called one instruction at
at
time.

For memory ops alone the o3 model could be changed in
executeLoad/Store()
inst->predicated = false; // assume failure
load_fault = inst->initiateAcc();

if (inst->predicated) ....

However, to make this work, the read()/write() methods would have to
set
inst->predicated = true;

That isn't so bad, but I still have two problems with this method:
1) This only works for load/store instructions. There isn't a
corresponding way to do this for any other type of op since they're in
either case they're going to call setInt/FloatReg().
2) I'm not really a fan on the idea that the absence of a call triggers
is
what triggers the mechanism. It seems like a convoluted way to instead
of
having:
if (testPredicate(....)) {
   .....
} else {
  ....
  xc->setPredicate(false);
}

Doing it the way it's currently implemented also means that the same
mechanism works for multiple cpu models that might need it.

Thanks,
Ali



On Thu, 19 Aug 2010 20:45:41 -0700, Gabe Black <gbl...@eecs.umich.edu>
wrote:

O3 only -seems- to execute multiple memory instructions at a time.
Each
initiateAcc is called one at a time and completes execution before the
next starts, so the same thing should apply as far as that goes.

Gabe

Ali Saidi wrote:

Remember that the timing cpu is only executing one instruction at a

time.

If the instruction calls read() and no access isn't set the timing
cpu
packages up the request ships it out and sets it's state to
DcacheWaitResponse. If the instruction doesn't call read() it
continues
on
like nothing happened {because it didn't). With the o3 cpu there are
multiple instructions in-flight, so simply waiting for it to call
read

or

not isn't an option. Something has to be passed through the xc to
tell
the
cpu that this instruction is done since the normal mechanism won't
take
care of it. The way this works now is by passing back something other
than
NoFault. However, the instruction didn't actually fault, so then we

would

have to special case everything that reads that fault later on in the
pipeline to say if its a predicationfault, do the same thing you
would

do

for no fault. This seems worse and more error prone.

Thanks,
ali


On Thu, 19 Aug 2010 15:16:22 -0400, Gabriel Michael Black
<gbl...@eecs.umich.edu> wrote:


I don't think that's true, but I may be confused. I think, at least

for the timing CPU, that it checks if a read/write was called and
doesn't just fall through. The timing CPU would wait forever for a
read/write response that would never come otherwise. (digs around a

little) I think it's sort of like what I described. The CPU will
continue if it's not waiting for anything, which would be the case
if



no access actually happened. We could probably get the same behavior

if we checked if the instruction was waiting for a read/write
response, but that might be kept somewhere annoying to get at.
Generally, if we can hide the existence of predication from the
CPUs,



I think that'll make everyones life easier (except for the ARM
ISA's,



I suppose).

Gabe

Quoting Ali Saidi <sa...@umich.edu>:



It's not the same issue here. The simple cpus just have their
execute/completeAccess methods guarded by a predicate condition
test.


If


nothing happens in there, so be it and the cpu goes onto the next
instruction without complaint.  The out of order cpu on the other

hand

needs to know if the instruction was predicated false so it can

notify

commit that it is complete, even though it hasn't done anything. If
commit
isn't notified, the instruction will never commit and the processor


will


stall.

This information should clearly belong in the dyninst. Unless there

is

some other way to access the class from the the isa description, I


think


the change is correct. An alternate approach would be to have the


method


in
threadstate do nothing because it's unimportant for it.

Ali



On Tue, 17 Aug 2010 19:04:10 -0400, Gabriel Michael Black
<gbl...@eecs.umich.edu> wrote:


Sorry if I wasn't clear before (I reread my post and it sounded a
little vague) but what the simple CPU does is keep track of
whether
the supposed memory instruction actually calls read or write on
the
execution context. If not, then the CPU doesn't try to complete
any
access, it just considers that part over. Ideally we can do the
same
thing here.

Gabe

Quoting Ali Saidi <sa...@umich.edu>:



Anyone have comments on this? It seems like this is the only way
to
access
the DynInst from the isa description. Threadstate does have the


current


instruction in it, as well as things like "Temporary storage to

pass



the


source address from copy_load to". It doesn't seem to out of
place

to

include current instruction predication state in there.

Ali


On Sat, 14 Aug 2010 07:08:35 -0000, "Gabe Black"


<gbl...@eecs.umich.edu>


wrote:


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.m5sim.org/r/177/#review185
-----------------------------------------------------------



src/cpu/thread_context.hh
<http://reviews.m5sim.org/r/177/#comment326>

    This isn't really a property of a thread, it's the property
of

a

    single instruction. I don't think this is being done in the


right


    place. I think we should have a discussion on m5-dev to


determine


the


    best way to handle this. There was a little code added to
the


simple


    CPU that does what this is supposed to do if a memory


instruction


    didn't actually read or write memory, and I think this is a


better


way


    to handle this. We should have a discussion about this on


m5-dev,


    especially since it touches lots of low level bits like


*contexts,


    instruction behavior, CPUs, etc. These sorts of changes need

to



be


made


    carefully.


- Gabe


On 2010-08-13 10:12:35, Ali Saidi wrote:


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.m5sim.org/r/177/
-----------------------------------------------------------

(Updated 2010-08-13 10:12:35)


Review request for Default and Min Kyu Jeong.


Summary
-------

ARM/O3: store the result of the predicate evaluation in DynInst

or

Threadstate.
THis allows the CPU to handle predicated-false instructions


accordingly.


This particular patch makes loads that are predicated-false to
be


sent


straight to the commit stage directly, not waiting for return
of


the


data


that was never requested since it was predicated-false.


Diffs
-----

  src/arch/arm/isa/templates/mem.isa 3c48b2b3cb83
  src/arch/arm/isa/templates/pred.isa 3c48b2b3cb83
  src/cpu/base_dyn_inst.hh 3c48b2b3cb83
  src/cpu/base_dyn_inst_impl.hh 3c48b2b3cb83
  src/cpu/o3/lsq_unit_impl.hh 3c48b2b3cb83
  src/cpu/simple/base.hh 3c48b2b3cb83
  src/cpu/simple_thread.hh 3c48b2b3cb83
  src/cpu/thread_context.hh 3c48b2b3cb83

Diff: http://reviews.m5sim.org/r/177/diff


Testing
-------


Thanks,

Ali




_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev



_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev



_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev
_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev



_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

Reply via email to