> On 2010-05-08 11:08:07, Steve Reinhardt wrote: > > src/arch/arm/isa/formats/data.isa, line 41 > > <http://reviews.m5sim.org/r/20/diff/1/?file=206#file206line41> > > > > Is there a reason we can't use predefined bitfields here (and a ton of > > other places)? > > Gabe Black wrote: > ARM defines bitfields like movie theaters make popcorn. There was a huge > list in ExtMachInst (or the isa description, if you prefer) which all had > nearly meaningless names and which still only covered a small fraction of > what was needed. Usually the names are ambiguous (globally at least) and > nearly meaningless even in the manual, ie. things like "op", "op1", "u", "a", > "b", "c", etc. By just extracting most of them in place they stay meaningless > and arbitrary, but they don't spill into a global context and don't end up in > a giant list. There -are- cases, I'm sure, were something wasn't ambiguous, > had a real name, and I decoded it this way anyway. I hope those are very rare. > > Steve Reinhardt wrote: > OK, this is the other bitfield issue I raised above. It seems like I did > see some spots where these appeared to be redundant wrt the defined > bitfields. Even when that's not the case, it would be nice to get rid of the > redundancy where we have "bits(machInst, X, Y)" with the same X and Y > appearing in multiple places and referring to the same logical field. My > preference would be to have the bitfields defined and labeled somewhere even > if there are a bazillion of them, maybe with the possible exception of the > truly one-off fields; to a large extent this is a corollary of the general > "no magic constants" rule. > > > Gabe Black wrote: > I'm pretty sure (although not entirely sure) that when X and Y are the > same it's coincidence. There are very few places that I can recall where the > placement of bitfields or their names seemed to have any greater purpose than > that was just an unambiguous (strictly speaking) place to stuff some bits. I > think if you looked through the manual you'll see that most bitfields are > one-off which was actually pretty frustrating while working on the decoder. > Going with the magic number analogy, It'd be like having a file filed with > > #define ONE 1 > #define TWO 2 > > to avoid using magic numbers like 1 and 2. The problem is if they > actually -are- arbitrary, meaningless values, giving them arbitrary > meaningless names doesn't make them any less magical. > > As far as what's there right now, I think things are probably not quite > right in both directions. There are probably a few (rare, at least I hope) > instances of bitfields with actual significance not being given a real name, > and some bitfields that were still in use by code that's since been replaced > with meaningless names like these > > def bitfield TOPCODE_15_13 topcode15_13; > def bitfield TOPCODE_13_11 topcode13_11; > def bitfield TOPCODE_12_11 topcode12_11; > def bitfield TOPCODE_12_10 topcode12_10; > def bitfield TOPCODE_11_9 topcode11_9; > def bitfield TOPCODE_11_8 topcode11_8; > def bitfield TOPCODE_10_9 topcode10_9; > def bitfield TOPCODE_10_8 topcode10_8; > > OPCODE is even a misnomer here. These are nothing more than bits X_Y of a > 16 bit thumb instruction. The old version of things and older, pre thumb > versions of the architecture make a passing attempt at named, consistent > bitfields, although it wasn't perfect. I tried extending that to the > complete, modern version of the ISA, but it fell apart when they'd move > around, get recycled, etc. so I went to what's above. Those were tedious and > frustrating to work with, so I just gave up and grabbed the bits I needed > when I needed them. Even the manual backed off of some fields that used to > have more descriptive names. > > Steve Reinhardt wrote: > I do see lots of repeated definitions of rn, rd, and rm. Also for other > fields like lsb, msb, rotation, that are defined in multiple places (e.g., 2 > out of 3 branches of an if/else if tree) I'd prefer to see them defined at > the top in a common place; I don't think that avoiding extracting them in the > one case where they're not needed is a real performance issue. If our decode > cache isn't broken then actual from-scratch decode should not be a > performance issue at all. > > I had a couple of other ideas: > 1. Maybe we should define bits() as a method on ExtMachInst, so we can > say machInst.bits(X,Y). That's not a huge win by itself, but if we couple it > with getting rid of "def bitfield" and have bitfields in the ISA language be > attributes on the ExtMachInst, then you could even say "decode bits(5,3)" > right in the ISA language. > 2. We could extend bits() to take multiple pairs of bit indices, > concatenating the indicated ranges, so expressions like: > timm = (bits(machInst, 19, 16) << 12) | bits(machInst, 11, 0); > could be rewritten as: > timm = machInst.bits(19, 16, 11, 0); > > I think #2 would even be useful in converting some of your if/else if > structures back to switches, which I find preferable because they make it > clear(er) that you've covered all the possible cases. > > > Ali Saidi wrote: > I think that this would certainly clear up a few things.
I've had thoughts similar to these. For instance, it would be nice to add a function to either the ExtMachInst (or the StaticInst) that would return an IntRegIndex from a set of bits. That would get rid of some of the stacks of casts that tend to happen around register indexes. It wouldn't be that bad (knock on wood) to add new types of bitfields to BitUnions that would do arbitrarily complex things when reading or writing bits. The problem is that they'd need to be set up first, and if they only show up once or twice the overhead makes them a lot more verbose than just calling bits directly. I'd be fine with moving the common definitions to the tops of the blocks they're used in. The thing to watch out for is bitfields that are -almost- the same but that are changed around sometimes. There are instances of that and they have caused some bugs. While these sorts of extensions are a good idea and would clean things up even outside of ARM in some places, they won't remove all instances of having to do things the hard way because there are plenty of places where the bits of interest shift for what seems like (or literaly is) every different instruction in a group, or where things are combined like bit a == bit b, or a == 1 and b != 010. It's sort of a mess. > On 2010-05-08 11:08:07, Steve Reinhardt wrote: > > src/arch/arm/isa/bitfields.isa, line 49 > > <http://reviews.m5sim.org/r/20/diff/1/?file=197#file197line49> > > > > I assume these are all instances of your C++ bitfield class? Maybe we > > should add some first-class support in the ISA parser to get rid of this > > redundant-looking header. > > What are the advantages of doing it this way vs. the old way, anyway? > > Gabe Black wrote: > They aren't instances of the C++ bitfield class (BitUnion, coined by Nate > I think), but the ExtMachInst is and so these are members of one. First class > support was added to the parser for x86 where ExtMachInst is actually a > struct, some members of which are BitUnions. Basically, if the descriptive > part of a def bitfield is a name, it's interpreted as a named field of the > ExtMachInst. encoding would turn into a macro defined as machInst.encoding. > This is more important in x86 where there's simply no other way to get at > structure members and the ExtMachInst is too big for anything else. It adds > value here too since it removes redundancy between the ISA definition and > parts outside of there that need to refer to bitfields, for instance the > predecoder. The ExtMachInst type defines everything to be where it needs to > be, and then the ISA description, predecoder, whatever, can get at what it > needs using a named bitfield and not a convention of bit positions. This is > important in the predecoder because it defines a few artificial instruction bits to contextualize an instruction (arm vs. 32 bit thumb vs. 16 bit thumb, etc.) and also to the load/store multiple constructor where most of the decoding for those instructions take place. It's a lot easier to have those constructed programatically as needed than to have all the versions around and to pick the right one. That was something the Florida guys had done originally, and it seemed like the right way to continue to do it. > > If you're suggesting replacing -all- instances of the bitfield > definitions with the BitUnion stuff, that would make sense to me. It would > get rid of some macros and streamline the parser a bit, although the > BitUnions themselves are a little magical too. One important bug in the > BitUnions that we (or really I) have yet to solve is that if you assign one > that's of the same signedness, same position in its parent, and from a parent > of the same type to another, gcc sees that they're the same type and just > copies one on top of the other rather than going through a regular scalar > intermediary. The Bitfields are really all unioned on top of each other and > "contain" the data for the whole BitUnion, so in that case you'll overwrite > the whole value with the other rather than just the Bitfield you thought you > were. I'm sure (or really really hope) there's some C++-foo that can be > applied to correct that, but I wasn't able to find anything that made gcc > happy and worked correctly the last I tried. > > Steve Reinhardt wrote: > There are two things that seem sub-optimal from my 5,000-ft view: > > 1. A file containing a bunch of lines that just say "def bitfield FOO > foo;" for various values of "foo" > 2. The fact that there are still a bunch of expressions like > "bits(machInst, X, Y)" with magic values of X and Y in the C++ part of the > decode logic, and these are frequently redundant in that there are multiple > instances of the same expression with identical values of X and Y. > > Maybe these are unrelated, but they're both dealing with bitfields, which > is why they stay close together in my head. To me it would be best if both > could be made to go away. > > For the first one, instead of having an exception where if a bitfield > refers to a string rather than an actual bitfield expression we assume it's > an ExtMachInst field, why not just assume that anywhere in the grammar you > expect a defined bitfield (via def bitfield) but have an identifier instead > then it must be an ExtMachInst field? I'd be more sympathetic toward the > former if it gave better error messages, but (1) I suspect that may not be > true and (2) even if it is, there's probably a cleaner way to do it (e.g., > allow something like "def bitfields foo, bar, baz;" which is equivalent to > "def bitfield foo foo;" etc. > > For the second one, I need you to explain more why it is the way it is to > begin with (unless that's in another comment somewhere that I haven't read > yet). > > Gabe Black wrote: > As far the second strategy you mention for 1, I don't like how that would > effectively make a macro all lower case. It also collapses a namespace where > you might have a field called, say, n or i or x, and preclude having those as > variable names. You might say i is a lousy name for a bitfield and you'd be > right, but that's only an easy example. I'm sure there are plenty of other > more defensible ways to run into trouble. That's mitigated if you toupper > everything, but that feels like a hack. The first strategy is better since it > avoids all that, but then it precludes using the macros in the actual C++ > code. I can't think of any place that's essential, but it is handy. I > wouldn't mind getting rid of def bitfield entirely since that would gets rid > of some macro magic (generated macro magic at that), but I don't have a > better idea (I like to complain, not fix things!). > > I talk more about 2 below. > > Steve Reinhardt wrote: > What about getting rid of def bitfield entirely and only using defined > fields on ExtMachInst? I agree that the macro stuff would be nice to get rid > of; it looks cute, but it's not very robust. And if they're not macros > anymore, maybe you won't object to them being lowercase? That sounds good to me, and I wouldn't mind if they weren't macros and the C++ had to stick machInst in there. If the extra text got too annoying the StaticInsts could have inline accessors that pulled from machInst for you. I'd still be a lot more comfortable with that if we fixed the BitUnion bug. We should do that in general since it does come up from time to time. Ali ran into it with this Arm stuff, as a matter of fact. > On 2010-05-08 11:08:07, Steve Reinhardt wrote: > > src/arch/arm/insts/mem.hh, line 252 > > <http://reviews.m5sim.org/r/20/diff/1/?file=181#file181line252> > > > > seems like this should go in a .cc file > > Gabe Black wrote: > Wouldn't that prevent it from being inlined? Or do we not care since it's > disassembly? > > Steve Reinhardt wrote: > We don't care... disassembly is never performance critical to begin with, > and we only call the methods at most once per StaticInst anyway (the return > string is cached and reused after that). > > Gabe Black wrote: > This may not be the best place to mention it, but there's a glitch in > that that the Florida people had at least attempted to address, although I > wiped out their fix in these changes. Basically, -most- of the disassembly is > constant, but the pc value may change between uses as far as I can tell and > it won't invalidate the disassembled string. > > Steve Reinhardt wrote: > That's not a bug, that's a feature :-). Seriously, it's always been the > case that StaticInst objects are reused for identical instructions appearing > at different PCs. I don't think we've ever done any measurement on it, but > my recollection from other projects is that this is a pretty significant > memory savings over creating a new isntance for every PC. Expecting that the > PC value would be constant reflects not understanding how it's supposed to > work. Is PC-relative addressing the issue? If it's really that important to > have those values be printed as absolute addresses, we could extend the > disassembly to incorporate PC-relative values in such a way that they do get > recalculated. The fix was specifically for PC relative branches that print their target along with a symbol+offset, although the problem is more general than that. Those branches are pretty common, I think, and I would guess are also used in the Alpha stuff. Note that I'm not advocating making the decode cache on hit on the same inst at the same PC, although I'm also not arguing the opposite since I don't know one way or the other. I'm just saying our disassembled string may be stale and it would be nice to fix that. > On 2010-05-08 11:08:07, Steve Reinhardt wrote: > > src/arch/arm/insts/vfp.hh, line 314 > > <http://reviews.m5sim.org/r/20/diff/1/?file=190#file190line314> > > > > What's the purpose of these __asm__ statements? > > > > Gabe Black wrote: > gcc was reordering setting fp rounding modes after the operation I needed > them set for, extracting condition flags before them, etc. It was generally a > mess. These say "Look out gcc! You can't see what I'm doing so you better be > careful!" and it puts things were they're supposed to be. I think there's > supposed to be a way to do this without clubbing the compiler over the head, > but it isn't actually supported properly in gcc. > > Steve Reinhardt wrote: > Interesting... do you need to declare some variables as volatile? It > seems like that ought to be sufficient, and of course would be a lot more > portable. > > Gabe Black wrote: > If I'm remembering right I tried that and it didn't work. I think by > using __volatile__ __asm__ you get stronger guarantees, maybe because they > inseparably both consume and produce the value at a certain point. One > nice-ish property of the asm route is that you sort of turn on the > volatileness at the critical points and let it slip back into a register > later. These are sort of portable since they don't actually have any code in > them, although they do mean the compiler you're working with needs to support > that sort of inline asm syntax. We do have gcc dependencies elsewhere, too. > I'm not a huge fan of these (they were about the 5th attempt), but they > seemed to be the only way to get things to work. > > Steve Reinhardt wrote: > Hmm, I'd be surprised there's no better solution. Certainly this is the > kind of thing that "volatile" is supposed to fix in a standardized fashion. > Anybody else have an opinion on this? Nate? > > Ali Saidi wrote: > We went through several revisions before coming up with this one. I > imagine that feXXXXXX() should actually be marked in the standard library in > such a way that disallows code motion around it, but it doesn't appear to > work. I think the best solution would be to add something a define in > base/compiler.hh called M5_CODE_BARRIER and define it to the asm at the > moment. I don't like the volatile in this use case since it prevents many > compiler optimizations for the code. > > Steve Reinhardt wrote: > I'm surprised that copying op1 and op2 to local volatile vars before > calling fesetround() and then doing the operation on those temp vars wouldn't > fix it. Seems like that would be a gcc bug if it didn't work. I thought of putting them in a macro, but I think an important part of the voodoo is that they refer to the variables in question as inputs and outputs. I'm not sure exactly how that would get wrapped into a macro. A generic code barrier would be just as effective, though, if we can find a nice way to build one. This is in some ways a gcc deficiency since there's supposed to be a way around this problem generally. It's not implemented, though, or something like that. I forget the details. - Gabe ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/20/#review15 ----------------------------------------------------------- On 2010-04-29 15:33:58, Ali Saidi wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > http://reviews.m5sim.org/r/20/ > ----------------------------------------------------------- > > (Updated 2010-04-29 15:33:58) > > > Review request for Default. > > > Summary > ------- > > Initial set of patches to improve the M5 support of the ARM ISA. Bundled into > one large change for review. This change implements the majority of thumb, > thumb2, and arm instructions and allows the running of all tested SPEC2000 > benchmarks in atomic mode. > > > Diffs > ----- > > configs/common/cpu2000.py ad784e759a74 > src/arch/arm/ArmTLB.py ad784e759a74 > src/arch/arm/SConscript ad784e759a74 > src/arch/arm/faults.hh ad784e759a74 > src/arch/arm/faults.cc ad784e759a74 > src/arch/arm/insts/branch.hh ad784e759a74 > src/arch/arm/insts/branch.cc ad784e759a74 > src/arch/arm/insts/macromem.hh ad784e759a74 > src/arch/arm/insts/macromem.cc PRE-CREATION > src/arch/arm/insts/mem.hh ad784e759a74 > src/arch/arm/insts/mem.cc ad784e759a74 > src/arch/arm/insts/misc.hh PRE-CREATION > src/arch/arm/insts/misc.cc PRE-CREATION > src/arch/arm/insts/mult.hh PRE-CREATION > src/arch/arm/insts/pred_inst.hh ad784e759a74 > src/arch/arm/insts/pred_inst.cc ad784e759a74 > src/arch/arm/insts/static_inst.hh ad784e759a74 > src/arch/arm/insts/static_inst.cc ad784e759a74 > src/arch/arm/insts/vfp.hh PRE-CREATION > src/arch/arm/insts/vfp.cc PRE-CREATION > src/arch/arm/interrupts.hh ad784e759a74 > src/arch/arm/interrupts.cc ad784e759a74 > src/arch/arm/intregs.hh ad784e759a74 > src/arch/arm/isa.hh ad784e759a74 > src/arch/arm/isa.cc PRE-CREATION > src/arch/arm/isa/bitfields.isa ad784e759a74 > src/arch/arm/isa/copyright.txt ad784e759a74 > src/arch/arm/isa/decoder.isa ad784e759a74 > src/arch/arm/isa/decoder/arm.isa PRE-CREATION > src/arch/arm/isa/decoder/decoder.isa PRE-CREATION > src/arch/arm/isa/decoder/thumb.isa PRE-CREATION > src/arch/arm/isa/formats/basic.isa ad784e759a74 > src/arch/arm/isa/formats/branch.isa ad784e759a74 > src/arch/arm/isa/formats/breakpoint.isa PRE-CREATION > src/arch/arm/isa/formats/data.isa PRE-CREATION > src/arch/arm/isa/formats/formats.isa ad784e759a74 > src/arch/arm/isa/formats/fp.isa ad784e759a74 > src/arch/arm/isa/formats/macromem.isa ad784e759a74 > src/arch/arm/isa/formats/mem.isa ad784e759a74 > src/arch/arm/isa/formats/misc.isa PRE-CREATION > src/arch/arm/isa/formats/mult.isa PRE-CREATION > src/arch/arm/isa/formats/pred.isa ad784e759a74 > src/arch/arm/isa/formats/uncond.isa PRE-CREATION > src/arch/arm/isa/formats/unimp.isa ad784e759a74 > src/arch/arm/isa/formats/unknown.isa ad784e759a74 > src/arch/arm/isa/formats/util.isa ad784e759a74 > src/arch/arm/isa/includes.isa ad784e759a74 > src/arch/arm/isa/insts/basic.isa PRE-CREATION > src/arch/arm/isa/insts/branch.isa PRE-CREATION > src/arch/arm/isa/insts/data.isa PRE-CREATION > src/arch/arm/isa/insts/div.isa PRE-CREATION > src/arch/arm/isa/insts/fp.isa PRE-CREATION > src/arch/arm/isa/insts/insts.isa PRE-CREATION > src/arch/arm/isa/insts/ldr.isa PRE-CREATION > src/arch/arm/isa/insts/macromem.isa PRE-CREATION > src/arch/arm/isa/insts/mem.isa PRE-CREATION > src/arch/arm/isa/insts/misc.isa PRE-CREATION > src/arch/arm/isa/insts/mult.isa PRE-CREATION > src/arch/arm/isa/insts/str.isa PRE-CREATION > src/arch/arm/isa/insts/swap.isa PRE-CREATION > src/arch/arm/isa/main.isa ad784e759a74 > src/arch/arm/isa/operands.isa ad784e759a74 > src/arch/arm/isa/templates/basic.isa PRE-CREATION > src/arch/arm/isa/templates/branch.isa PRE-CREATION > src/arch/arm/isa/templates/macromem.isa PRE-CREATION > src/arch/arm/isa/templates/mem.isa PRE-CREATION > src/arch/arm/isa/templates/misc.isa PRE-CREATION > src/arch/arm/isa/templates/mult.isa PRE-CREATION > src/arch/arm/isa/templates/pred.isa PRE-CREATION > src/arch/arm/isa/templates/templates.isa PRE-CREATION > src/arch/arm/isa/templates/vfp.isa PRE-CREATION > src/arch/arm/isa_traits.hh ad784e759a74 > src/arch/arm/linux/linux.hh ad784e759a74 > src/arch/arm/linux/process.hh ad784e759a74 > src/arch/arm/linux/process.cc ad784e759a74 > src/arch/arm/miscregs.hh ad784e759a74 > src/arch/arm/miscregs.cc PRE-CREATION > src/arch/arm/nativetrace.cc ad784e759a74 > src/arch/arm/pagetable.hh ad784e759a74 > src/arch/arm/pagetable.cc ad784e759a74 > src/arch/arm/predecoder.hh ad784e759a74 > src/arch/arm/process.hh ad784e759a74 > src/arch/arm/process.cc ad784e759a74 > src/arch/arm/registers.hh ad784e759a74 > src/arch/arm/table_walker.hh PRE-CREATION > src/arch/arm/table_walker.cc PRE-CREATION > src/arch/arm/tlb.hh ad784e759a74 > src/arch/arm/tlb.cc ad784e759a74 > src/arch/arm/types.hh ad784e759a74 > src/arch/arm/utility.hh ad784e759a74 > src/arch/arm/utility.cc ad784e759a74 > src/arch/isa_parser.py ad784e759a74 > src/base/loader/elf_object.cc ad784e759a74 > src/base/loader/object_file.hh ad784e759a74 > src/cpu/BaseCPU.py ad784e759a74 > src/cpu/exetrace.cc ad784e759a74 > src/cpu/simple/base.cc ad784e759a74 > src/cpu/simple_thread.hh ad784e759a74 > src/dev/arm/SConscript ad784e759a74 > src/dev/arm/Versatile.py ad784e759a74 > src/dev/arm/versatile.hh ad784e759a74 > src/dev/arm/versatile.cc ad784e759a74 > src/dev/copy_engine.cc ad784e759a74 > src/dev/io_device.hh ad784e759a74 > src/dev/io_device.cc ad784e759a74 > src/sim/process.cc ad784e759a74 > tests/quick/00.hello/ref/arm/linux/simple-atomic/config.ini ad784e759a74 > tests/quick/00.hello/ref/arm/linux/simple-atomic/simerr ad784e759a74 > tests/quick/00.hello/ref/arm/linux/simple-atomic/simout ad784e759a74 > tests/quick/00.hello/ref/arm/linux/simple-atomic/stats.txt ad784e759a74 > util/statetrace/arch/tracechild_arm.hh ad784e759a74 > util/statetrace/arch/tracechild_arm.cc ad784e759a74 > util/statetrace/statetrace.cc ad784e759a74 > > Diff: http://reviews.m5sim.org/r/20/diff > > > Testing > ------- > > > Thanks, > > Ali > > _______________________________________________ m5-dev mailing list [email protected] http://m5sim.org/mailman/listinfo/m5-dev
