On Fri, Jul 9, 2010 at 5:41 PM, Min Kyu Jeong <[email protected]> wrote:
> The following is an excerpt from the disassembly.
>
> 117c: e321f013 msr CPSR_c, #19 ; 0x13
> 1180: e24fd08c sub sp, pc, #140 ; 0x8c
> 1184: e321f011 msr CPSR_c, #17 ; 0x11
> 1188: e24fd094 sub sp, pc, #148 ; 0x94
> 118c: e321f012 msr CPSR_c, #18 ; 0x12
> 1190: e24fd09c sub sp, pc, #156 ; 0x9c
> 1194: e321f01b msr CPSR_c, #27 ; 0x1b
> 1198: e24fd0a4 sub sp, pc, #164 ; 0xa4
> 119c: e321f017 msr CPSR_c, #23 ; 0x17
> 11a0: e24fd0ac sub sp, pc, #172 ; 0xac
> 11a4: e321f01f msr CPSR_c, #31 ; 0x1f
> 11a8: e24fd0b4 sub sp, pc, #180 ; 0xb4
> 11ac: e321f013 msr CPSR_c, #19 ; 0x13
> 11b0: ea000002 b 11c0 <skipLabel_00000002>
>
> 000011b4 <LabStr_00000002>:
> 11b4: 5f444441 69736162 00315f63 ADD_basic_1.
>
> After 0x11b0, the branch is predicted fall-through and the string label is
> fetched. The bytes that cause the mess are at address 0x11b8, so I think it
> is the second 4-byte chunk of the label:
> 0x69736162
>
> The following is the relevant part of the trace from the run. There are some
> additional prints that I added.
>
> 17680000: system.cpu.fetch: [tid:0]: Instruction PC 0x11b4 (0) created [sn:1585]
> 17680000: system.cpu.fetch: [tid:0]: Instruction is: svcpl
> 17680000: global: MicroLdrUop, regIdx : 577
> 17680000: global: MicroLdrUop, regIdx : 581
> 17680000: global: MicroLdrUop, regIdx : 582
> 17680000: global: MicroLdrUop, regIdx : 584
> 17680000: global: MicroLdrUop, regIdx : 589
> 17680000: global: MicroLdrUop, regIdx : 590
> 17680000: system.cpu.fetch: [tid:0]: Instruction PC 0x11b8 (0) created [sn:1586]
> 17680000: system.cpu.fetch: [tid:0]: Instruction is: addi_uopvs r34, r3, #0
> 17680000: system.cpu.fetch: [tid:0]: Instruction PC 0x11b8 (1) created [sn:1587]
> 17680000: system.cpu.fetch: [tid:0]: Instruction is: subi_uopvs r3, r3, #24
> 17680000: system.cpu.fetch: [tid:0]: Done fetching, reached fetch bandwidth for this cycle.
> 17680000: system.cpu.fetch: [tid:0]: Setting PC to 0x0011b8.
> ...
> 17690000: system.cpu.fetch: [tid:0]: Instruction PC 0x11b8 (2) created [sn:1588]
> 17690000: system.cpu.fetch: [tid:0]: Instruction is: ldr_uopvs XXX, [r34, #24]
> 17690000: system.cpu.fetch: [tid:0]: Instruction PC 0x11b8 (3) created [sn:1589]
> 17690000: system.cpu.fetch: [tid:0]: Instruction is: ldr_uopvs XXX, [r34, #20]
> 17690000: system.cpu.fetch: [tid:0]: Instruction PC 0x11b8 (4) created [sn:1590]
> 17690000: system.cpu.fetch: [tid:0]: Instruction is: ldr_uopvs XXX, [r34, #16]
> 17690000: system.cpu.fetch: [tid:0]: Instruction PC 0x11b8 (5) created [sn:1591]
> 17690000: system.cpu.fetch: [tid:0]: Instruction is: ldr_uopvs XXX, [r34, #12]
> 17690000: system.cpu.fetch: [tid:0]: Instruction PC 0x11b8 (6) created [sn:1592]
> 17690000: system.cpu.fetch: [tid:0]: Instruction is: ldr_uopvs XXX, [r34, #8]
> 17690000: system.cpu.fetch: [tid:0]: Instruction PC 0x11b8 (7) created [sn:1593]
> 17690000: system.cpu.fetch: [tid:0]: Instruction is: ldr_uopvs XXX, [r34, #4]
> ...
> 18440000: system.cpu.rename: [tid:0]: Processing instruction [sn:1588] with PC 0x11b8.
> 18440000: system.cpu.rename: Adjusting reg index from 105 to 105.
> 18440000: system.cpu.rename: [tid:0]: Looking up arch reg 105, got physical reg 512.
> 18440000: system.cpu.rename: [tid:0]: Register 512 is ready.
> 18440000: global: [sn:1588] has 1 ready out of 4 sources. RTI 0)
> 18440000: system.cpu.rename: Flattening index 35 to 35.
> 18440000: system.cpu.rename: [tid:0]: Looking up arch reg 35, got physical reg 147.
> 18440000: system.cpu.rename: [tid:0]: Register 147 is ready.
> 18440000: global: [sn:1588] has 2 ready out of 4 sources. RTI 0)
> 18440000: system.cpu.rename: Flattening index 34 to 34.
> 18440000: system.cpu.rename: [tid:0]: Looking up arch reg 34, got physical reg 72.
> 18440000: system.cpu.rename: [tid:0]: Register 72 is not ready.
> 18440000: system.cpu.rename: Adjusting reg index from 577 to 577.
> 18440000: system.cpu.rename: [tid:0]: Looking up arch reg 577, got physical reg 984.
> 18440000: system.cpu.rename: [tid:0]: Register 984 is ready.
> 18440000: global: [sn:1588] has 3 ready out of 4 sources. RTI 0)
> 18440000: system.cpu.rename: Adjusting reg index from 577 to 577.
> 18440000: global: Renamed misc reg 472
> *18440000: global: Renamed reg 472 to physical reg 984 old mapping was 984*
> *18440000: system.cpu.rename: [tid:0]: Renaming arch reg 577 to physical reg 984.*
> 18440000: system.cpu.rename: [tid:0]: Adding instruction to history buffer (size=3), [sn:1588].
>
> On Fri, Jul 9, 2010 at 4:23 PM, Gabriel Michael Black <[email protected]> wrote:
>
>> Thanks for the extra info, which should be very helpful. Can you please
>> tell us what the actual bytes are for the junk instruction?
>>
>> Gabe
>>
>> Quoting Min Kyu Jeong <[email protected]>:
>>
>>> I looked into this, but I still don't fully understand how the
>>> out-of-bounds reg index causes the segfault. Instead, I will just
>>> describe what is happening, hoping someone can catch a clue from it.
>>>
>>> The register index that goes out of bounds is the architectural register
>>> index, stored in _destRegIdx[0] of the StaticInst class. The particular
>>> StaticInst I am getting this from is MicroLdrUop. MicroLdrUop instances
>>> are generated in the constructor of the MacroMemOp (this invalid garbage
>>> instruction from the mispredicted path is decoded as LdmStm). The
>>> destination register indices for the uops are generated from the bit
>>> vector, and there is this bit of code
>>>
>>> if (force_user) {
>>>     regIdx = instRegInMode(MODE_USER, regIdx);
>>> }
>>>
>>> that changes regIdx from 1 to 577. This is stored in the _destRegIdx[0]
>>> variable of the MicroLdrUop StaticInst.
>>> During renaming, 577 is renamed to
>>> 984 = 577 - numLogicalRegs + numPhysicalRegs
>>>
>>> This instRegInMode() call is what I found suspicious, since the register
>>> window is handled by the ArmISA::flattenIntIndex() call.
>>>
>>> Anyway, the simulation segfaults during the advance() call of the
>>> timebuffer at the end of the NEXT tick. The following is the call stack.
>>>
>>> #0 0x000000000040a2fa in RefCounted::decref (this=0x8d4810708d48c84d) at build/ARM_FS/base/refcnt.hh:51
>>> #1 0x0000000000761daa in RefCountingPtr<BaseO3DynInst<O3CPUImpl> >::del (this=0x9cd200) at build/ARM_FS/base/refcnt.hh:69
>>> #2 0x0000000000761dc1 in ~RefCountingPtr (this=0x9cd200) at build/ARM_FS/base/refcnt.hh:85
>>> #3 0x000000000077ebf9 in ~commitComm (this=0x9cd1a8) at build/ARM_FS/cpu/o3/comm.hh:153
>>> #4 0x000000000077ec49 in ~TimeBufStruct (this=0x9ccec8) at build/ARM_FS/cpu/o3/comm.hh:110
>>> #5 0x00000000007884c4 in TimeBuffer<TimeBufStruct<O3CPUImpl> >::advance (this=0x1991038) at build/ARM_FS/base/timebuf.hh:187
>>> #6 0x0000000000798c31 in FullO3CPU<O3CPUImpl>::tick (this=0x198d310) at build/ARM_FS/cpu/o3/cpu.cc:523
>>>
>>> I made this segfault go away by overriding index 577 with 0.
>>>
>>> Thanks,
>>>
>>> On Wed, Jun 30, 2010 at 2:17 PM, Gabriel Michael Black <[email protected]> wrote:
>>>
>>>> Could you be more specific? There are a lot of register-related indexes
>>>> and I'm not sure exactly which ones you're talking about. Could you walk
>>>> through what's happening from the illegal encoding, through the decoder,
>>>> through the CPU and up to the segfault? I don't totally understand the
>>>> mechanics of the failure at the moment, but my gut reaction is that the
>>>> decoder should have returned an "Unknown(machInst)" when the index was
>>>> out of bounds. I'm not convinced that's what's happening, though.
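
The renaming arithmetic quoted above can be checked against the trace. The
sketch below is only an illustration, not M5 code: the two constants are
inferred from the log lines in this thread (arch reg 105 maps to physical
reg 512, and arch reg 577 is reported as misc reg 472), and the names
RenameResult, renameMiscChecked and numMiscRegs are hypothetical. It
reproduces 577 -> 984 and marks the spot where a bounds check could reject
an index instead of mapping it.

```cpp
#include <cassert>
#include <cstdint>

// Inferred from the trace in this thread, NOT read from the M5 source:
// "Looking up arch reg 105, got physical reg 512" and
// "Renamed misc reg 472" for arch reg 577 (577 - 105 = 472).
constexpr std::uint32_t kNumLogicalRegs = 105;   // assumed first misc arch index
constexpr std::uint32_t kNumPhysicalRegs = 512;  // assumed first misc physical index

// Hypothetical result type: ok is false when the index is rejected.
struct RenameResult {
    bool ok;
    std::uint32_t phys;
};

// Hypothetical helper: map a misc-range architectural index to a physical
// index, rejecting anything past a caller-supplied number of misc registers.
RenameResult renameMiscChecked(std::uint32_t archIdx, std::uint32_t numMiscRegs)
{
    if (archIdx < kNumLogicalRegs)
        return {false, 0};  // not in the misc range at all
    const std::uint32_t miscIdx = archIdx - kNumLogicalRegs;
    if (miscIdx >= numMiscRegs)
        return {false, 0};  // out of bounds: a fault could be raised here
    return {true, miscIdx + kNumPhysicalRegs};
}

// Reproduces the two mappings visible in the trace.
inline void checkAgainstTrace()
{
    assert(renameMiscChecked(105, 1000).phys == 512);
    assert(renameMiscChecked(577, 1000).phys == 984);
}
```

The limit passed as numMiscRegs is arbitrary here. With a limit smaller than
473, index 577 is rejected rather than renamed, which is roughly where
raising a fault or substituting the zero register would take effect.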
>>>>
>>>> Gabe
>>>>
>>>> Quoting Min Kyu Jeong <[email protected]>:
>>>>
>>>>> I just found a case somewhat related to this. It is not exactly an
>>>>> assertion, but a segfault from (non-)instructions on the mispredicted
>>>>> path.
>>>>>
>>>>> When the operand register index is out of range, the call to
>>>>> timeBuffer.advance() right after the renaming of such registers causes
>>>>> a segfault. I bypassed this problem by mapping the out-of-bounds
>>>>> register index to the ZERO register during renaming (more
>>>>> specifically, during index flattening). I think raising a fault would
>>>>> be a better solution, but I am holding off on actually doing it. Any
>>>>> suggestions would be appreciated.
>>>>>
>>>>> P.S. Is the [m5-dev] tag in the title added by the mailing list, or
>>>>> should I add it myself?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Min
>>>>>
>>>>> On Mon, Jun 14, 2010 at 3:52 PM, Gabriel Michael Black <[email protected]> wrote:
>>>>>
>>>>>> It's important to distinguish between M5 making sense and the code
>>>>>> it's executing making sense. We shouldn't (and I hope don't) have any
>>>>>> asserts that check conditions controllable from the simulated code,
>>>>>> since those should generally just cause a fault and may, as you point
>>>>>> out, be misspeculated. It's fine to check that M5 is internally
>>>>>> consistent, though. This is supposed to work in all the CPU models
>>>>>> and as far as I know generally does. M5's CPU models should, to the
>>>>>> first order, correctly do whatever whacky, nonsensical things the
>>>>>> instruction memory tells them to do without complaining. If you've
>>>>>> found a case where it doesn't (which has happened before) please let
>>>>>> us know so we can fix it.
>>>>>>
>>>>>> Gabe
>>>>>>
>>>>>> Quoting Min Kyu Jeong <[email protected]>:
>>>>>>
>>>>>>> Is it possible that speculatively fetched instructions can cause
>>>>>>> programming assertions to fail? Until a branch is resolved, whatever
>>>>>>> is in the predicted path (even non-instructions) could be fetched
>>>>>>> and decoded. Can't assertions on instruction sanity fail for those?
>>>>>>>
>>>>>>> I am trying to get the O3 CPU model working for ARM. In many cases
>>>>>>> the first instruction is a branch followed by an interrupt vector
>>>>>>> table. I was wondering whether such cases exist for other CPU models
>>>>>>> and, if so, how they are handled.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Min
>>>>>>>
>>>>>> _______________________________________________
>>>>>> m5-dev mailing list
>>>>>> [email protected]
>>>>>> http://m5sim.org/mailman/listinfo/m5-dev
>>>>>>
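
Decoding the word at 0x11b8 by hand lines up with the trace. The bytes of
0x69736162 in little-endian order are 62 61 73 69, i.e. "basi" from the
label string. Read as an ARM instruction, the same word has the
load/store-multiple (block data transfer) pattern. The sketch below pulls
out the fields; the struct and function names are hypothetical, the bit
positions follow the ARM LDM/STM encoding, and the idea that M5 derives
force_user from the S bit is an assumption on my part, not something
confirmed in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Fields of an ARM (A32) load/store-multiple encoding.
// Hypothetical type for illustration; not an M5 class.
struct LdmStmFields {
    std::uint32_t cond;     // bits 31:28, condition code
    bool isBlockTransfer;   // bits 27:25 == 0b100
    bool p, u, s, w, l;     // bits 24, 23, 22, 21, 20
    std::uint32_t rn;       // bits 19:16, base register
    std::vector<int> regs;  // indices of set bits in the 16-bit register list
};

LdmStmFields decodeLdmStm(std::uint32_t word)
{
    LdmStmFields f;
    f.cond = word >> 28;
    f.isBlockTransfer = ((word >> 25) & 0x7) == 0x4;
    f.p = ((word >> 24) & 1) != 0;  // pre-index
    f.u = ((word >> 23) & 1) != 0;  // up (increment) when set
    f.s = ((word >> 22) & 1) != 0;  // user-mode registers / SPSR restore
    f.w = ((word >> 21) & 1) != 0;  // base writeback
    f.l = ((word >> 20) & 1) != 0;  // load when set, store when clear
    f.rn = (word >> 16) & 0xF;
    for (int i = 0; i < 16; ++i) {
        if ((word >> i) & 1)
            f.regs.push_back(i);
    }
    return f;
}
```

For 0x69736162 this gives cond 0x6 (VS, matching the "vs" suffix on the
uops), a load with the S bit set, base register r3, and the register list
{1, 5, 6, 8, 13, 14}. Six registers account for the "subi_uopvs r3, r3, #24"
in the trace, and the list entries are exactly the MicroLdrUop regIdx values
577, 581, 582, 584, 589 and 590 minus 576.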
_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev
