> On 2010-05-08 11:08:07, Steve Reinhardt wrote:
> > src/arch/arm/isa/formats/data.isa, line 41
> > <http://reviews.m5sim.org/r/20/diff/1/?file=206#file206line41>
> >
> >     Is there a reason we can't use predefined bitfields here (and a ton of 
> > other places)?
> 
> Gabe Black wrote:
>     ARM defines bitfields like movie theaters make popcorn. There was a huge 
> list in ExtMachInst (or the isa description, if you prefer) which all had 
> nearly meaningless names and which still only covered a small fraction of 
> what was needed. Usually the names are ambiguous (globally at least) and 
> nearly meaningless even in the manual, ie. things like "op", "op1", "u", "a", 
> "b", "c", etc. By just extracting most of them in place they stay meaningless 
> and arbitrary, but they don't spill into a global context and don't end up in 
> a giant list. There -are- cases, I'm sure, were something wasn't ambiguous, 
> had a real name, and I decoded it this way anyway. I hope those are very rare.
> 
> Steve Reinhardt wrote:
>     OK, this is the other bitfield issue I raised above.  It seems like I did 
> see some spots where these appeared to be redundant wrt the defined 
> bitfields.  Even when that's not the case, it would be nice to get rid of the 
> redundancy where we have "bits(machInst, X, Y)" with the same X and Y 
> appearing in multiple places and referring to the same logical field.  My 
> preference would be to have the bitfields defined and labeled somewhere even 
> if there are a bazillion of them, maybe with the possible exception of the 
> truly one-off fields; to a large extent this is a corollary of the general 
> "no magic constants" rule.
>
> 
> Gabe Black wrote:
>     I'm pretty sure (although not entirely sure) that when X and Y are the 
> same it's coincidence. There are very few places that I can recall where the 
> placement of bitfields or their names seemed to have any greater purpose than 
> that was just an unambiguous (strictly speaking) place to stuff some bits. I 
> think if you looked through the manual you'll see that most bitfields are 
> one-off which was actually pretty frustrating while working on the decoder. 
> Going with the magic number analogy, It'd be like having a file filed with
>     
>     #define ONE 1
>     #define TWO 2
>     
>     to avoid using magic numbers like 1 and 2. The problem is if they 
> actually -are- arbitrary, meaningless values, giving them arbitrary 
> meaningless names doesn't make them any less magical.
>     
>     As far as what's there right now, I think things are probably not quite 
> right in both directions. There are probably a few (rare, at least I hope) 
> instances of bitfields with actual significance not being given a real name, 
> and some bitfields that were still in use by code that's since been replaced 
> with meaningless names like these
>     
>     def bitfield TOPCODE_15_13  topcode15_13;
>     def bitfield TOPCODE_13_11  topcode13_11;
>     def bitfield TOPCODE_12_11  topcode12_11;
>     def bitfield TOPCODE_12_10  topcode12_10;
>     def bitfield TOPCODE_11_9   topcode11_9;
>     def bitfield TOPCODE_11_8   topcode11_8;
>     def bitfield TOPCODE_10_9   topcode10_9;
>     def bitfield TOPCODE_10_8   topcode10_8;
>     
>     OPCODE is even a misnomer here. These are nothing more than bits X_Y of a 
> 16 bit thumb instruction. The old version of things and older, pre thumb 
> versions of the architecture make a passing attempt at named, consistent 
> bitfields, although it wasn't perfect. I tried extending that to the 
> complete, modern version of the ISA, but it fell apart when they'd move 
> around, get recycled, etc. so I went to what's above. Those were tedious and 
> frustrating to work with, so I just gave up and grabbed the bits I needed 
> when I needed them. Even the manual backed off of some fields that used to 
> have more descriptive names.
> 
> Steve Reinhardt wrote:
>     I do see lots of repeated definitions of rn, rd, and rm.  Also for other 
> fields like lsb, msb, rotation, that are defined in multiple places (e.g., 2 
> out of 3 branches of an if/else if tree) I'd prefer to see them defined at 
> the top in a common place; I don't think that avoiding extracting them in the 
> one case where they're not needed is a real performance issue.  If our decode 
> cache isn't broken then actual from-scratch decode should not be a 
> performance issue at all.
>     
>     I had a couple of other ideas:
>     1. Maybe we should define bits() as a method on ExtMachInst, so we can 
> say machInst.bits(X,Y).  That's not a huge win by itself, but if we couple it 
> with getting rid of "def bitfield" and have bitfields in the ISA language be 
> attributes on the ExtMachInst, then you could even say "decode bits(5,3)" 
> right in the ISA language.
>     2. We could extend bits() to take multiple pairs of bit indices, 
> concatenating the indicated ranges, so expressions like:
>         timm = (bits(machInst, 19, 16) << 12) | bits(machInst, 11, 0);
>     could be rewritten as:
>         timm = machInst.bits(19, 16, 11, 0);
>     
>     I think #2 would even be useful in converting some of your if/else if 
> structures back to switches, which I find preferable because they make it 
> clear(er) that you've covered all the possible cases.
>
> 
> Ali Saidi wrote:
>     I think that this would certainly clear up a few things.
> 
> Gabe Black wrote:
>     I've had thoughts similar to these. For instance, it would be nice to add 
> a function to either the ExtMachInst (or the StaticInst) that would return an 
> IntRegIndex from a set of bits. That would get rid of some of the stacks of 
> casts that tend to happen around register indexes. It wouldn't be that bad 
> (knock on wood) to add new types of bitfields to BitUnions that would do 
> arbitrarily complex things when reading or writing bits. The problem is that 
> they'd need to be set up first, and if they only show up once or twice the 
> overhead makes them a lot more verbose than just calling bits directly.
>     
>     I'd be fine with moving the common definitions to the tops of the blocks 
> they're used in. The thing to watch out for is bitfields that are -almost- 
> the same but that are changed around sometimes. There are instances of that 
> and they have caused some bugs.
>     
>     While these sorts of extensions are a good idea and would clean things up 
> even outside of ARM in some places, they won't remove all instances of having 
> to do things the hard way because there are plenty of places where the bits 
> of interest shift for what seems like (or literaly is) every different 
> instruction in a group, or where things are combined like bit a == bit b, or 
> a == 1 and b != 010. It's sort of a mess.

Are the IntRegIndex casts really necessary?  I'm surprised you can't get by on 
an implicit conversion.

I saw some of those weird cases... I think some (maybe many) of them could be 
cleaned up, but I would not be surprised if there are a few that resist 
normalization.


> On 2010-05-08 11:08:07, Steve Reinhardt wrote:
> > src/arch/arm/insts/vfp.hh, line 314
> > <http://reviews.m5sim.org/r/20/diff/1/?file=190#file190line314>
> >
> >     What's the purpose of these __asm__ statements?
> >
> 
> Gabe Black wrote:
>     gcc was reordering setting fp rounding modes after the operation I needed 
> them set for, extracting condition flags before them, etc. It was generally a 
> mess. These say "Look out gcc! You can't see what I'm doing so you better be 
> careful!" and it puts things were they're supposed to be. I think there's 
> supposed to be a way to do this without clubbing the compiler over the head, 
> but it isn't actually supported properly in gcc.
> 
> Steve Reinhardt wrote:
>     Interesting... do you need to declare some variables as volatile?  It 
> seems like that ought to be sufficient, and of course would be a lot more 
> portable.
> 
> Gabe Black wrote:
>     If I'm remembering right I tried that and it didn't work. I think by 
> using __volatile__ __asm__ you get stronger guarantees, maybe because they 
> inseparably both consume and produce the value at a certain point. One 
> nice-ish property of the asm route is that you sort of turn on the 
> volatileness at the critical points and let it slip back into a register 
> later. These are sort of portable since they don't actually have any code in 
> them, although they do mean the compiler you're working with needs to support 
> that sort of inline asm syntax. We do have gcc dependencies elsewhere, too. 
> I'm not a huge fan of these (they were about the 5th attempt), but they 
> seemed to be the only way to get things to work.
> 
> Steve Reinhardt wrote:
>     Hmm, I'd be surprised there's no better solution.  Certainly this is the 
> kind of thing that "volatile" is supposed to fix in a standardized fashion.  
> Anybody else have an opinion on this?  Nate?
> 
> Ali Saidi wrote:
>     We went through several revisions before coming up with this one. I 
> imagine that feXXXXXX() should actually be marked in the standard library in 
> such a way that disallows code motion around it, but it doesn't appear to 
> work. I think the best solution would be to add something a define in 
> base/compiler.hh called M5_CODE_BARRIER and define it to the asm at the 
> moment. I don't like the volatile in this use case since it prevents many 
> compiler optimizations for the code.
> 
> Steve Reinhardt wrote:
>     I'm surprised that copying op1 and op2 to local volatile vars before 
> calling fesetround() and then doing the operation on those temp vars wouldn't 
> fix it.  Seems like that would be a gcc bug if it didn't work.
> 
> Gabe Black wrote:
>     I thought of putting them in a macro, but I think an important part of 
> the voodoo is that they refer to the variables in question as inputs and 
> outputs. I'm not sure exactly how that would get wrapped into a macro. A 
> generic code barrier would be just as effective, though, if we can find a 
> nice way to build one.
>     
>     This is in some ways a gcc deficiency since there's supposed to be a way 
> around this problem generally. It's not implemented, though, or something 
> like that. I forget the details.

Yea, like I said, it seems to me that if using a set of volatile local vars in 
an appropriate way doesn't fix it, that'd be a gcc bug.


> On 2010-05-08 11:08:07, Steve Reinhardt wrote:
> > src/arch/arm/insts/mem.hh, line 252
> > <http://reviews.m5sim.org/r/20/diff/1/?file=181#file181line252>
> >
> >     seems like this should go in a .cc file
> 
> Gabe Black wrote:
>     Wouldn't that prevent it from being inlined? Or do we not care since it's 
> disassembly?
> 
> Steve Reinhardt wrote:
>     We don't care... disassembly is never performance critical to begin with, 
> and we only call the methods at most once per StaticInst anyway (the return 
> string is cached and reused after that).
> 
> Gabe Black wrote:
>     This may not be the best place to mention it, but there's a glitch in 
> that that the Florida people had at least attempted to address, although I 
> wiped out their fix in these changes. Basically, -most- of the disassembly is 
> constant, but the pc value may change between uses as far as I can tell and 
> it won't invalidate the disassembled string.
> 
> Steve Reinhardt wrote:
>     That's not a bug, that's a feature :-).  Seriously, it's always been the 
> case that StaticInst objects are reused for identical instructions appearing 
> at different PCs.  I don't think we've ever done any measurement on it, but 
> my recollection from other projects is that this is a pretty significant 
> memory savings over creating a new isntance for every PC.  Expecting that the 
> PC value would be constant reflects not understanding how it's supposed to 
> work.  Is PC-relative addressing the issue?  If it's really that important to 
> have those values be printed as absolute addresses, we could extend the 
> disassembly to incorporate PC-relative values in such a way that they do get 
> recalculated.
> 
> Gabe Black wrote:
>     The fix was specifically for PC relative branches that print their target 
> along with a symbol+offset, although the problem is more general than that. 
> Those branches are pretty common, I think, and I would guess are also used in 
> the Alpha stuff.
>     
>     Note that I'm not advocating making the decode cache on hit on the same 
> inst at the same PC, although I'm also not arguing the opposite since I don't 
> know one way or the other. I'm just saying our disassembled string may be 
> stale and it would be nice to fix that.

Sometimes it helps to actually look at the code... there is a documented way to 
deal with this:
http://repo.m5sim.org/m5/file/24379f92cc10/src/cpu/static_inst.hh#l423

We can discuss whether that documented method could be improved on, but there's 
no "glitch" to be worked around.  Not only that, but this code is so old it 
predates our revision control (it came in on changeset 2)... I think that means 
it pre-dates our switch to bitkeeper.


- Steve


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.m5sim.org/r/20/#review15
-----------------------------------------------------------


On 2010-04-29 15:33:58, Ali Saidi wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.m5sim.org/r/20/
> -----------------------------------------------------------
> 
> (Updated 2010-04-29 15:33:58)
> 
> 
> Review request for Default.
> 
> 
> Summary
> -------
> 
> Initial set of patches to improve the M5 support of the ARM ISA. Bundled into 
> one large change for review. This change implements the majority of thumb, 
> thumb2, and arm instructions and allows the running of all tested SPEC2000 
> benchmarks in atomic mode. 
> 
> 
> Diffs
> -----
> 
>   configs/common/cpu2000.py ad784e759a74 
>   src/arch/arm/ArmTLB.py ad784e759a74 
>   src/arch/arm/SConscript ad784e759a74 
>   src/arch/arm/faults.hh ad784e759a74 
>   src/arch/arm/faults.cc ad784e759a74 
>   src/arch/arm/insts/branch.hh ad784e759a74 
>   src/arch/arm/insts/branch.cc ad784e759a74 
>   src/arch/arm/insts/macromem.hh ad784e759a74 
>   src/arch/arm/insts/macromem.cc PRE-CREATION 
>   src/arch/arm/insts/mem.hh ad784e759a74 
>   src/arch/arm/insts/mem.cc ad784e759a74 
>   src/arch/arm/insts/misc.hh PRE-CREATION 
>   src/arch/arm/insts/misc.cc PRE-CREATION 
>   src/arch/arm/insts/mult.hh PRE-CREATION 
>   src/arch/arm/insts/pred_inst.hh ad784e759a74 
>   src/arch/arm/insts/pred_inst.cc ad784e759a74 
>   src/arch/arm/insts/static_inst.hh ad784e759a74 
>   src/arch/arm/insts/static_inst.cc ad784e759a74 
>   src/arch/arm/insts/vfp.hh PRE-CREATION 
>   src/arch/arm/insts/vfp.cc PRE-CREATION 
>   src/arch/arm/interrupts.hh ad784e759a74 
>   src/arch/arm/interrupts.cc ad784e759a74 
>   src/arch/arm/intregs.hh ad784e759a74 
>   src/arch/arm/isa.hh ad784e759a74 
>   src/arch/arm/isa.cc PRE-CREATION 
>   src/arch/arm/isa/bitfields.isa ad784e759a74 
>   src/arch/arm/isa/copyright.txt ad784e759a74 
>   src/arch/arm/isa/decoder.isa ad784e759a74 
>   src/arch/arm/isa/decoder/arm.isa PRE-CREATION 
>   src/arch/arm/isa/decoder/decoder.isa PRE-CREATION 
>   src/arch/arm/isa/decoder/thumb.isa PRE-CREATION 
>   src/arch/arm/isa/formats/basic.isa ad784e759a74 
>   src/arch/arm/isa/formats/branch.isa ad784e759a74 
>   src/arch/arm/isa/formats/breakpoint.isa PRE-CREATION 
>   src/arch/arm/isa/formats/data.isa PRE-CREATION 
>   src/arch/arm/isa/formats/formats.isa ad784e759a74 
>   src/arch/arm/isa/formats/fp.isa ad784e759a74 
>   src/arch/arm/isa/formats/macromem.isa ad784e759a74 
>   src/arch/arm/isa/formats/mem.isa ad784e759a74 
>   src/arch/arm/isa/formats/misc.isa PRE-CREATION 
>   src/arch/arm/isa/formats/mult.isa PRE-CREATION 
>   src/arch/arm/isa/formats/pred.isa ad784e759a74 
>   src/arch/arm/isa/formats/uncond.isa PRE-CREATION 
>   src/arch/arm/isa/formats/unimp.isa ad784e759a74 
>   src/arch/arm/isa/formats/unknown.isa ad784e759a74 
>   src/arch/arm/isa/formats/util.isa ad784e759a74 
>   src/arch/arm/isa/includes.isa ad784e759a74 
>   src/arch/arm/isa/insts/basic.isa PRE-CREATION 
>   src/arch/arm/isa/insts/branch.isa PRE-CREATION 
>   src/arch/arm/isa/insts/data.isa PRE-CREATION 
>   src/arch/arm/isa/insts/div.isa PRE-CREATION 
>   src/arch/arm/isa/insts/fp.isa PRE-CREATION 
>   src/arch/arm/isa/insts/insts.isa PRE-CREATION 
>   src/arch/arm/isa/insts/ldr.isa PRE-CREATION 
>   src/arch/arm/isa/insts/macromem.isa PRE-CREATION 
>   src/arch/arm/isa/insts/mem.isa PRE-CREATION 
>   src/arch/arm/isa/insts/misc.isa PRE-CREATION 
>   src/arch/arm/isa/insts/mult.isa PRE-CREATION 
>   src/arch/arm/isa/insts/str.isa PRE-CREATION 
>   src/arch/arm/isa/insts/swap.isa PRE-CREATION 
>   src/arch/arm/isa/main.isa ad784e759a74 
>   src/arch/arm/isa/operands.isa ad784e759a74 
>   src/arch/arm/isa/templates/basic.isa PRE-CREATION 
>   src/arch/arm/isa/templates/branch.isa PRE-CREATION 
>   src/arch/arm/isa/templates/macromem.isa PRE-CREATION 
>   src/arch/arm/isa/templates/mem.isa PRE-CREATION 
>   src/arch/arm/isa/templates/misc.isa PRE-CREATION 
>   src/arch/arm/isa/templates/mult.isa PRE-CREATION 
>   src/arch/arm/isa/templates/pred.isa PRE-CREATION 
>   src/arch/arm/isa/templates/templates.isa PRE-CREATION 
>   src/arch/arm/isa/templates/vfp.isa PRE-CREATION 
>   src/arch/arm/isa_traits.hh ad784e759a74 
>   src/arch/arm/linux/linux.hh ad784e759a74 
>   src/arch/arm/linux/process.hh ad784e759a74 
>   src/arch/arm/linux/process.cc ad784e759a74 
>   src/arch/arm/miscregs.hh ad784e759a74 
>   src/arch/arm/miscregs.cc PRE-CREATION 
>   src/arch/arm/nativetrace.cc ad784e759a74 
>   src/arch/arm/pagetable.hh ad784e759a74 
>   src/arch/arm/pagetable.cc ad784e759a74 
>   src/arch/arm/predecoder.hh ad784e759a74 
>   src/arch/arm/process.hh ad784e759a74 
>   src/arch/arm/process.cc ad784e759a74 
>   src/arch/arm/registers.hh ad784e759a74 
>   src/arch/arm/table_walker.hh PRE-CREATION 
>   src/arch/arm/table_walker.cc PRE-CREATION 
>   src/arch/arm/tlb.hh ad784e759a74 
>   src/arch/arm/tlb.cc ad784e759a74 
>   src/arch/arm/types.hh ad784e759a74 
>   src/arch/arm/utility.hh ad784e759a74 
>   src/arch/arm/utility.cc ad784e759a74 
>   src/arch/isa_parser.py ad784e759a74 
>   src/base/loader/elf_object.cc ad784e759a74 
>   src/base/loader/object_file.hh ad784e759a74 
>   src/cpu/BaseCPU.py ad784e759a74 
>   src/cpu/exetrace.cc ad784e759a74 
>   src/cpu/simple/base.cc ad784e759a74 
>   src/cpu/simple_thread.hh ad784e759a74 
>   src/dev/arm/SConscript ad784e759a74 
>   src/dev/arm/Versatile.py ad784e759a74 
>   src/dev/arm/versatile.hh ad784e759a74 
>   src/dev/arm/versatile.cc ad784e759a74 
>   src/dev/copy_engine.cc ad784e759a74 
>   src/dev/io_device.hh ad784e759a74 
>   src/dev/io_device.cc ad784e759a74 
>   src/sim/process.cc ad784e759a74 
>   tests/quick/00.hello/ref/arm/linux/simple-atomic/config.ini ad784e759a74 
>   tests/quick/00.hello/ref/arm/linux/simple-atomic/simerr ad784e759a74 
>   tests/quick/00.hello/ref/arm/linux/simple-atomic/simout ad784e759a74 
>   tests/quick/00.hello/ref/arm/linux/simple-atomic/stats.txt ad784e759a74 
>   util/statetrace/arch/tracechild_arm.hh ad784e759a74 
>   util/statetrace/arch/tracechild_arm.cc ad784e759a74 
>   util/statetrace/statetrace.cc ad784e759a74 
> 
> Diff: http://reviews.m5sim.org/r/20/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Ali
> 
>

_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev

Reply via email to