On Fri, Jul 9, 2010 at 5:41 PM, Min Kyu Jeong <[email protected]> wrote:

> The following is the excerpt from the disassembly.
>
>     117c: e321f013 msr CPSR_c, #19 ; 0x13
>     1180: e24fd08c sub sp, pc, #140 ; 0x8c
>     1184: e321f011 msr CPSR_c, #17 ; 0x11
>     1188: e24fd094 sub sp, pc, #148 ; 0x94
>     118c: e321f012 msr CPSR_c, #18 ; 0x12
>     1190: e24fd09c sub sp, pc, #156 ; 0x9c
>     1194: e321f01b msr CPSR_c, #27 ; 0x1b
>     1198: e24fd0a4 sub sp, pc, #164 ; 0xa4
>     119c: e321f017 msr CPSR_c, #23 ; 0x17
>     11a0: e24fd0ac sub sp, pc, #172 ; 0xac
>     11a4: e321f01f msr CPSR_c, #31 ; 0x1f
>     11a8: e24fd0b4 sub sp, pc, #180 ; 0xb4
>     11ac: e321f013 msr CPSR_c, #19 ; 0x13
>     11b0: ea000002 b 11c0 <skipLabel_00000002>
>
> 000011b4 <LabStr_00000002>:
>     11b4: 5f444441 69736162 00315f63              ADD_basic_1.
>
>
> After x11b0, branch is predicated fall-through and the string label is
> fetched. The particular bytes that causes the
>

predicated (X) -> predicted (O)

> mess is at address 11b8, so I think it is the second 4B chunk of the label:
> x69736162
>
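To double-check how that word decodes, here is a rough sketch that pulls out the fields of the ARM (A32) block data transfer (LDM/STM) encoding by hand. It is independent of the M5 decoder; the names bits, BlockTransfer, decodeBlockTransfer and countRegs are made up for this illustration.

```cpp
#include <cstdint>

// Extract bits [hi:lo] of a 32-bit instruction word.
constexpr std::uint32_t bits(std::uint32_t word, unsigned hi, unsigned lo)
{
    return (word >> lo) & ((1u << (hi - lo + 1)) - 1u);
}

// Fields of the A32 block data transfer encoding (LDM/STM).
// Hypothetical helper type, not an M5 class.
struct BlockTransfer {
    std::uint32_t cond;     // bits 31:28, condition code (0x6 is VS)
    bool isBlockTransfer;   // true when bits 27:25 are 100
    bool p, u, s, w, l;     // bits 24, 23, 22, 21, 20
    std::uint32_t rn;       // bits 19:16, base register
    std::uint32_t regList;  // bits 15:0, one bit per register
};

constexpr BlockTransfer decodeBlockTransfer(std::uint32_t word)
{
    return BlockTransfer{
        bits(word, 31, 28),
        bits(word, 27, 25) == 0x4,
        bits(word, 24, 24) != 0,
        bits(word, 23, 23) != 0,
        bits(word, 22, 22) != 0,
        bits(word, 21, 21) != 0,
        bits(word, 20, 20) != 0,
        bits(word, 19, 16),
        bits(word, 15, 0)
    };
}

// Number of registers named in a register list.
constexpr unsigned countRegs(std::uint32_t regList)
{
    return regList == 0 ? 0u : (regList & 1u) + countRegs(regList >> 1);
}

// 0x69736162 is "basi" in little-endian ASCII, viewed here as an instruction.
static_assert(decodeBlockTransfer(0x69736162u).isBlockTransfer, "LDM/STM class");
static_assert(decodeBlockTransfer(0x69736162u).cond == 0x6, "condition VS");
static_assert(decodeBlockTransfer(0x69736162u).l, "load");
static_assert(decodeBlockTransfer(0x69736162u).s, "S bit set");
static_assert(decodeBlockTransfer(0x69736162u).rn == 3, "base register r3");
static_assert(countRegs(decodeBlockTransfer(0x69736162u).regList) == 6,
              "six registers, 24 bytes");
```

The register list 0x6162 has bits 1, 5, 6, 8, 13 and 14 set. Adding 576 to each gives 577, 581, 582, 584, 589 and 590, which matches the MicroLdrUop regIdx prints in the trace, and six registers times 4 bytes matches the "subi_uopvs r3, r3, #24". The S bit is set with r15 absent from the list, which is presumably what turns on force_user.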
> The following is the relevant part of trace from the run. There are some
> additional prints that I added.
>
> 17680000: system.cpu.fetch: [tid:0]: Instruction PC 0x11b4 (0) created [sn:1585]
> 17680000: system.cpu.fetch: [tid:0]: Instruction is:   svcpl
> 17680000: global: MicroLdrUop, regIdx : 577
> 17680000: global: MicroLdrUop, regIdx : 581
> 17680000: global: MicroLdrUop, regIdx : 582
> 17680000: global: MicroLdrUop, regIdx : 584
> 17680000: global: MicroLdrUop, regIdx : 589
> 17680000: global: MicroLdrUop, regIdx : 590
> 17680000: system.cpu.fetch: [tid:0]: Instruction PC 0x11b8 (0) created [sn:1586]
> 17680000: system.cpu.fetch: [tid:0]: Instruction is:   addi_uopvs   r34, r3, #0
> 17680000: system.cpu.fetch: [tid:0]: Instruction PC 0x11b8 (1) created [sn:1587]
> 17680000: system.cpu.fetch: [tid:0]: Instruction is:   subi_uopvs   r3, r3, #24
> 17680000: system.cpu.fetch: [tid:0]: Done fetching, reached fetch bandwidth for this cycle.
> 17680000: system.cpu.fetch: [tid:0]: Setting PC to 0x0011b8.
> ...
> 17690000: system.cpu.fetch: [tid:0]: Instruction PC 0x11b8 (2) created [sn:1588]
> 17690000: system.cpu.fetch: [tid:0]: Instruction is:   ldr_uopvs   XXX, [r34, #24]
> 17690000: system.cpu.fetch: [tid:0]: Instruction PC 0x11b8 (3) created [sn:1589]
> 17690000: system.cpu.fetch: [tid:0]: Instruction is:   ldr_uopvs   XXX, [r34, #20]
> 17690000: system.cpu.fetch: [tid:0]: Instruction PC 0x11b8 (4) created [sn:1590]
> 17690000: system.cpu.fetch: [tid:0]: Instruction is:   ldr_uopvs   XXX, [r34, #16]
> 17690000: system.cpu.fetch: [tid:0]: Instruction PC 0x11b8 (5) created [sn:1591]
> 17690000: system.cpu.fetch: [tid:0]: Instruction is:   ldr_uopvs   XXX, [r34, #12]
> 17690000: system.cpu.fetch: [tid:0]: Instruction PC 0x11b8 (6) created [sn:1592]
> 17690000: system.cpu.fetch: [tid:0]: Instruction is:   ldr_uopvs   XXX, [r34, #8]
> 17690000: system.cpu.fetch: [tid:0]: Instruction PC 0x11b8 (7) created [sn:1593]
> 17690000: system.cpu.fetch: [tid:0]: Instruction is:   ldr_uopvs   XXX, [r34, #4]
> ...
> 18440000: system.cpu.rename: [tid:0]: Processing instruction [sn:1588] with PC 0x11b8.
> 18440000: system.cpu.rename: Adjusting reg index from 105 to 105.
> 18440000: system.cpu.rename: [tid:0]: Looking up arch reg 105, got physical reg 512.
> 18440000: system.cpu.rename: [tid:0]: Register 512 is ready.
> 18440000: global: [sn:1588] has 1 ready out of 4 sources. RTI 0)
> 18440000: system.cpu.rename: Flattening index 35 to 35.
> 18440000: system.cpu.rename: [tid:0]: Looking up arch reg 35, got physical reg 147.
> 18440000: system.cpu.rename: [tid:0]: Register 147 is ready.
> 18440000: global: [sn:1588] has 2 ready out of 4 sources. RTI 0)
> 18440000: system.cpu.rename: Flattening index 34 to 34.
> 18440000: system.cpu.rename: [tid:0]: Looking up arch reg 34, got physical reg 72.
> 18440000: system.cpu.rename: [tid:0]: Register 72 is not ready.
> 18440000: system.cpu.rename: Adjusting reg index from 577 to 577.
> 18440000: system.cpu.rename: [tid:0]: Looking up arch reg 577, got physical reg 984.
> 18440000: system.cpu.rename: [tid:0]: Register 984 is ready.
> 18440000: global: [sn:1588] has 3 ready out of 4 sources. RTI 0)
> 18440000: system.cpu.rename: Adjusting reg index from 577 to 577.
> 18440000: global: Renamed misc reg 472
> *18440000: global: Renamed reg 472 to physical reg 984 old mapping was 984*
> *18440000: system.cpu.rename: [tid:0]: Renaming arch reg 577 to physical reg 984.*
> 18440000: system.cpu.rename: [tid:0]: Adding instruction to history buffer (size=3), [sn:1588].
>
>
>
> On Fri, Jul 9, 2010 at 4:23 PM, Gabriel Michael Black <
> [email protected]> wrote:
>
>> Thanks for the extra info which should be very helpful. Can you please
>> tell us what the actual bytes are for the junk instruction?
>>
>>
>> Gabe
>>
>> Quoting Min Kyu Jeong <[email protected]>:
>>
>>> I looked into this, but I still don't fully understand how the
>>> out-of-bounds register index causes the segfault. Instead, I will just
>>> describe what is happening, hoping someone can pick up a clue from it.
>>>
>>> The register index that goes out of bounds is the architectural register
>>> index stored in _destRegIdx[0] of the StaticInst class. The particular
>>> StaticInst I am getting this from is MicroLdrUop. MicroLdrUop instances
>>> are generated in the constructor of MacroMemOp (the invalid garbage
>>> instruction from the mispredicted path is decoded as LdmStm). The
>>> destination register indices for the uops are generated from the bit
>>> vector, and there is this bit of code:
>>>
>>> if (force_user) {
>>>   regIdx = instRegInMode(MODE_USER, regIdx);
>>> }
>>>
>>> that changes regIdx from 1 to 577. This is stored in the _destRegIdx[0]
>>> member of the MicroLdrUop StaticInst. During renaming, 577 is renamed to
>>> 984 = 577 - numLogicalRegs + numPhysicalRegs.
>>>
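As a sanity check on that arithmetic, here is a minimal sketch of the mapping. The two constants are not taken from the M5 source; they are assumptions inferred from the trace above (arch reg 105 maps to physical reg 512, and arch reg 577 is reported as misc reg 472). The function names are made up for illustration.

```cpp
// Assumed values, inferred from the trace rather than read from the source.
constexpr unsigned kNumLogicalRegs = 105;
constexpr unsigned kNumPhysicalRegs = 512;

// Index within the misc register space, as printed by "Renamed misc reg".
constexpr unsigned miscRegIndex(unsigned archIdx)
{
    return archIdx - kNumLogicalRegs;
}

// Renamed index for an architectural index treated as a misc register:
// archIdx - numLogicalRegs + numPhysicalRegs.
constexpr unsigned renamedMiscReg(unsigned archIdx)
{
    return archIdx - kNumLogicalRegs + kNumPhysicalRegs;
}

static_assert(miscRegIndex(577) == 472, "matches 'Renamed misc reg 472'");
static_assert(renamedMiscReg(577) == 984, "matches 'physical reg 984'");
static_assert(renamedMiscReg(105) == 512, "matches 'arch reg 105 ... reg 512'");
```

If those two constants are right, the user-mode integer index 577 produced by instRegInMode() lands in the misc register range during renaming.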
>>> This instRegInMode() call is what I found suspicious, since the reg
>>> window is handled by the ArmISA::flattenIntIndex() call.
>>>
>>> Anyway, the simulation segfaults during the advance() call of the
>>> TimeBuffer at the end of the NEXT tick. The following is the call stack.
>>>
>>>
>>> #0  0x000000000040a2fa in RefCounted::decref (this=0x8d4810708d48c84d) at build/ARM_FS/base/refcnt.hh:51
>>> #1  0x0000000000761daa in RefCountingPtr<BaseO3DynInst<O3CPUImpl> >::del (this=0x9cd200) at build/ARM_FS/base/refcnt.hh:69
>>> #2  0x0000000000761dc1 in ~RefCountingPtr (this=0x9cd200) at build/ARM_FS/base/refcnt.hh:85
>>> #3  0x000000000077ebf9 in ~commitComm (this=0x9cd1a8) at build/ARM_FS/cpu/o3/comm.hh:153
>>> #4  0x000000000077ec49 in ~TimeBufStruct (this=0x9ccec8) at build/ARM_FS/cpu/o3/comm.hh:110
>>> #5  0x00000000007884c4 in TimeBuffer<TimeBufStruct<O3CPUImpl> >::advance (this=0x1991038) at build/ARM_FS/base/timebuf.hh:187
>>> #6  0x0000000000798c31 in FullO3CPU<O3CPUImpl>::tick (this=0x198d310) at build/ARM_FS/cpu/o3/cpu.cc:523
>>>
>>> I made this segfault go away by overriding index 577 with 0.
>>>
>>> Thanks,
>>>
>>> On Wed, Jun 30, 2010 at 2:17 PM, Gabriel Michael Black <
>>> [email protected]> wrote:
>>>
>>>> Could you be more specific? There are a lot of register-related indexes
>>>> and I'm not sure exactly which ones you're talking about. Could you walk
>>>> through what's happening from the illegal encoding, through the decoder,
>>>> through the CPU and up to the segfault? I don't totally understand the
>>>> mechanics of the failure at the moment, but my gut reaction is that the
>>>> decoder should have returned an "Unknown(machInst)" when the index was
>>>> out of bounds. I'm not convinced that's what's happening, though.
>>>>
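To make the "return Unknown instead of crashing" idea concrete, here is a rough sketch of a decode-time range check. Everything in it is hypothetical: DecodeStatus, checkDestRegs and the limit of 64 are invented for illustration and are not the M5 decoder API or a real register count. Only the index values are taken from the trace.

```cpp
#include <cstddef>

// Hypothetical result of a decode-time sanity check.
enum class DecodeStatus { Ok, Unknown };

// Returns Unknown if any destination index is at or beyond 'limit'.
// Written recursively so it stays a valid C++11 constexpr function.
constexpr DecodeStatus checkDestRegs(const unsigned *idx, std::size_t count,
                                     unsigned limit)
{
    return count == 0 ? DecodeStatus::Ok
         : idx[0] >= limit ? DecodeStatus::Unknown
         : checkDestRegs(idx + 1, count - 1, limit);
}

// Destination indices printed for the junk LdmStm in the trace.
constexpr unsigned kTraceDestRegs[] = {577, 581, 582, 584, 589, 590};
// The same register list before instRegInMode() is applied.
constexpr unsigned kPlainDestRegs[] = {1, 5, 6, 8, 13, 14};
// Hypothetical bound; the correct value depends on the ISA description.
constexpr unsigned kHypotheticalLimit = 64;

static_assert(checkDestRegs(kTraceDestRegs, 6, kHypotheticalLimit)
                  == DecodeStatus::Unknown, "out-of-range index is rejected");
static_assert(checkDestRegs(kPlainDestRegs, 6, kHypotheticalLimit)
                  == DecodeStatus::Ok, "in-range indices pass");
```

Whether 577 should actually be rejected at decode time depends on whether instRegInMode() indices are meant to be legal inputs to flattening, so the bound here is only a placeholder.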
>>>>
>>>> Gabe
>>>>
>>>> Quoting Min Kyu Jeong <[email protected]>:
>>>>
>>>>> I just found a case somewhat related to this. Not exactly an assertion,
>>>>> but a segfault from (non-)instructions on the mispredicted path.
>>>>>
>>>>> When the operand register index is out of range, the call to
>>>>> timeBuffer.advance() right after the renaming of such registers causes
>>>>> a segfault. I bypassed this problem by mapping the out-of-bounds register
>>>>> index to the ZERO register during renaming (more specifically, during
>>>>> index flattening). I think raising a fault would be a better solution,
>>>>> but I am holding off on actually doing it. Any suggestions would be
>>>>> appreciated.
>>>>>
>>>>> P.S. Is the [m5-dev] tag in the subject added by the mailing list, or
>>>>> should I add it myself?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Min
>>>>>
>>>>> On Mon, Jun 14, 2010 at 3:52 PM, Gabriel Michael Black <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> It's important to distinguish between M5 making sense and the code
>>>>>> it's executing making sense. We shouldn't (and I hope don't) have any
>>>>>> asserts that check conditions controllable from the simulated code,
>>>>>> since those should generally just cause a fault and may, as you point
>>>>>> out, be misspeculated. It's fine to check that M5 is internally
>>>>>> consistent, though. This is supposed to work in all the CPU models and,
>>>>>> as far as I know, generally does. M5's CPU models should, to a first
>>>>>> order, correctly do whatever wacky, nonsensical things the instruction
>>>>>> memory tells them to do without complaining. If you've found a case
>>>>>> where they don't (which has happened before), please let us know so we
>>>>>> can fix it.
>>>>>>
>>>>>> Gabe
>>>>>>
>>>>>>
>>>>>> Quoting Min Kyu Jeong <[email protected]>:
>>>>>>
>>>>>>> Is it possible that speculatively fetched instructions can cause
>>>>>>> programming assertions to fail? Until a branch is resolved, whatever
>>>>>>> is on the predicted path (even non-instructions) could be fetched and
>>>>>>> decoded. Can't assertions on instruction sanity fail for those?
>>>>>>>
>>>>>>> I am trying to get the O3 CPU model working for ARM. In many cases the
>>>>>>> first instruction is a branch followed by an interrupt vector table. I
>>>>>>> was wondering whether such cases exist for other CPU models and, if
>>>>>>> so, how they are handled.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Min
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>  _______________________________________________
>>>>>> m5-dev mailing list
>>>>>> [email protected]
>>>>>> http://m5sim.org/mailman/listinfo/m5-dev
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>
>>
>
>