Actually, maybe this falls into a subcategory of 2. which is why it seems more acceptable.

2.a If the ISA says it's safe to use this optimization which doesn't affect behavior and which could be turned off and only result in lower simulator performance, skip this code/check/whatever.

This still has some of the drawbacks I mentioned for the list-of-checkboxes model, but at least if it doesn't fit there's always a failsafe option. These would be optional hints, like telling the compiler a function never returns.

This may have been what you were talking about in the first place, in which case great, we're on the same page. I'd been debating whether to send out my little list anyway since I think it's productive to spend a few cycles thinking about this stuff.

Gabe

Quoting Gabriel Michael Black <gbl...@eecs.umich.edu>:

To me there's actually a spectrum of ISA dependence, incompletely described below:

1. If ISA == suchandsuch do this, otherwise do that.
2. If ISA has characteristic such and such, do this, otherwise do that.
3. Here, ISA, you take care of this.
4. ISA data parameterizing non-ISA stuff.

Number 1 is the worst since it's hard to maintain, can be cumbersome to specify, and it isn't always clear -why- ISA suchandsuch needs a particular behavior.

I'd say number 2 is second worst, and that's what this is an example of. It's better since it's obvious why the code is separated out and there can be sharing between CPU models, etc., but at the same time it increases the CPU's awareness of the ISA on it and partially breaks down the barriers of abstraction. It also sets up special case code paths where, for instance, only x86 on the timing CPU would possibly exhibit a certain bug. If someone changes things for ARM and everything seems to work, they could be subtley breaking x86 which they aren't familiar with and weren't thinking about when they made their change.

3. Three is better in some ways and worse in others. It's clear what's happening, there's a lot of flexibility, and the CPU isn't actually aware of -how- something is being done, just that, say, this would be a good time to check for interrupts, whatever that means. It's not as great because you have more complex interactions between ISAs and CPUs, and you have to do a bit more work in the ISA. It can also be hard to make some of this functionality work sensibly in order, out of order, single cycle, multi cycle, timing mode, atomic mode, etc. etc.

4. This one is the best when you can get away with it. This is where you, say, make your integer register file 32 registers because the ISA says that's how many it needs. Basically the only draw back to this is that behavior changes a little based on each ISA, but if you can get away with it this is the safest.



I think having a multi ISA simulator that will be modified by its end users, especially one with as large a cross product of options as ours, needs to try to be safe and simple before being as absolutely fast as it can be. You know what they say about premature optimization. Ideally we should design things so the big, unnecessary pieces of code just aren't part of the equation because they're in the ISA defined objects, control just doesn't go that way when it wouldn't make sense, etc. And if, for instance, a pointer should always be 0, the code should still behave correctly. The code should do its job correctly no matter -why- it's being asked to do it.

I think it's bad news to have a big list of yes or no checkboxes in each ISA directory which turn on and off behaviors. This is especially true when it's ambiguous whether to say yes or no, if the behavior changes based on circumstances, etc., and if/when the checkbox is interpretted subtley (or not so subtley) differently by the consumer.

In this particular limitted case it seems relatively ok although not necessarily advisable, but it's a slippery slope I don't consider us have a completely clean history with.

Gabe

Quoting Steve Reinhardt <ste...@gmail.com>:

In generalI think this is the kind of ISA hook we should be using...
in the sense that checking TheISA::HasUnalignedMemAcc is much better
than "(TheISA == x86 || TheISA == Power)".  I think it's useful not
only to avoid the overhead of a dynamic check for an ISA that doesn't
need it, but also to clarify that this is code that never gets
executed on those ISAs.  Maybe for a little one-liner like this it's
not a big deal either way, but for bigger hunks of code I think that
clarification is potentially useful.

Steve

On Thu, Jul 22, 2010 at 1:10 PM, Gabriel Michael Black
<gbl...@eecs.umich.edu> wrote:
Yes it does, and that sounds reasonable to me. I'd still like to see us use
ISA hooks as minimally as possible, but this seems ok.

Gabe

Quoting Timothy M Jones <tjon...@inf.ed.ac.uk>:

Oh, ok, I see where you're going with that.  However, the main idea of
having TheISA::HasUnalignedMemAcc was that it is a constant specific to each
ISA.  Therefore, the compiler should really recognise this and optimise it
away wherever it's used.  In this case, for ISAs that don't have unaligned
memory accesses the whole 'if' block should disappear.  For ISAs that do
have them then the condition should be reduced to just checking sreqLow.
 Therefore it's better for the first set of ISAs for this to be kept in.
 Does that make sense?

Tim

On Thu, 22 Jul 2010 14:30:56 -0400, Gabe Black <gbl...@eecs.umich.edu>
wrote:

I think you missed my point. If the check of TheISA::HasUnalignedMemAcc
is redundant, we shouldn't be checking it at all. It's a free, though
very small, performance bump, but more significantly it removes a direct
dependence on the ISA.

Gabe

Timothy M. Jones wrote:

changeset 3bd51d6ac9ef in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=3bd51d6ac9ef
description:
       O3CPU: Fix a bug where stores in the cpu where never marked as
split.

diffstat:

src/cpu/o3/lsq_unit.hh |  6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)

diffs (16 lines):

diff -r 02b471d9d400 -r 3bd51d6ac9ef src/cpu/o3/lsq_unit.hh
--- a/src/cpu/o3/lsq_unit.hh    Thu Jul 22 18:47:52 2010 +0100
+++ b/src/cpu/o3/lsq_unit.hh    Thu Jul 22 18:52:02 2010 +0100
@@ -822,6 +822,12 @@
   storeQueue[store_idx].sreqLow = sreqLow;
   storeQueue[store_idx].sreqHigh = sreqHigh;
   storeQueue[store_idx].size = sizeof(T);
+
+    // Split stores can only occur in ISAs with unaligned memory
accesses.  If
+    // a store request has been split, sreqLow and sreqHigh will be
non-null.
+    if (TheISA::HasUnalignedMemAcc && sreqLow) {
+        storeQueue[store_idx].isSplit = true;
+    }
   assert(sizeof(T) <= sizeof(storeQueue[store_idx].data));

   T gData = htog(data);
_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev



--
Timothy M. Jones
http://homepages.inf.ed.ac.uk/tjones1

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev



_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev



_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev



_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

Reply via email to