Re: [lldb-dev] Stepping into function generates EXC_BAD_INSTRUCTION signal

jingham Tue, 02 Dec 2014 15:06:18 -0800

Again, this is a good analysis of why we are crashing.  But it isn't the right 
solution for stepping.  When we are stepping through the IT & the instructions 
it governs we need to make sure we DON'T try to do the fast-stepping, but just 
use the hardware single step and go instruction by instruction.
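
A minimal sketch of that policy, written in Python for brevity (LLDB itself is C++). The `Insn` record, its `it_block_len` field and `single_step_addresses` are hypothetical names for illustration, not LLDB API. The instruction list is the one quoted further down in this thread.

```python
from dataclasses import dataclass

@dataclass
class Insn:                # hypothetical record, not an LLDB type
    addr: int
    size: int              # 2 or 4 bytes in Thumb-2
    text: str
    it_block_len: int = 0  # for an IT instruction: how many instructions it governs

def single_step_addresses(insns):
    """Addresses that should be hardware single-stepped: each IT and its block."""
    addrs = []
    i = 0
    while i < len(insns):
        n = insns[i].it_block_len
        if n:
            block = insns[i:i + n + 1]  # the IT itself plus the governed instructions
            addrs.extend(insn.addr for insn in block)
            i += len(block)
        else:
            i += 1                      # eligible for fast-stepping
    return addrs

listing = [
    Insn(0x27b2e0, 2, "ldr r2, [r1]"),
    Insn(0x27b2e2, 2, "ldr r2, [r2, #0x30]"),
    Insn(0x27b2e4, 4, "tst.w r2, #0x100000"),
    Insn(0x27b2e8, 2, "it ne", it_block_len=1),
    Insn(0x27b2ea, 4, "blne 0x466290"),
    Insn(0x27b2ee, 2, "add sp, #0x4"),
]
print([hex(a) for a in single_step_addresses(listing)])
```

Everything outside the returned addresses could still be covered by run-to-breakpoint stepping; only the IT and the instructions it governs are stepped one at a time.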


Jim

> On Dec 2, 2014, at 2:52 PM, Greg Clayton <gclay...@apple.com> wrote:
> 
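> So the nitty gritty details are: IT is followed by up to four "i" (if) or 
> "e" (else) markers, one per instruction in the block. (In ARM assembler 
> syntax these are the T and E suffixes on the IT mnemonic, and the first 
> slot is always "then".)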
> 
> 
> So if you have:
> 
> IT ieei
> op1  (if)
> op2  (else)
> op3  (else) 
> op4  (if)
> 
> The IT state is stored in the CPSR so the processor knows that an IT block 
> is in progress: some bits hold the condition and others hold the if/else 
> pattern.
> 
> When the processor hits the IT instruction it sets those bits in the CPSR, 
> and it will then either execute or skip each of the next instructions (four 
> in this case). So we _must_ have a 4 byte breakpoint opcode that the CPU 
> pipeline recognizes as a single instruction when it wants to skip a 4 byte 
> instruction. Otherwise your 16 bit trap + NOP would count for two of the 
> instructions in the IT block. So if we have:
> 
> IT ieei
> op1_32 (if)
> op2_16 (else)
> op3_16 (else)
> op4_16 (if)
> 
> And we replace the op1_32 with a trap16 + nop16 we would execute:
> 
> IT ieei
> trap16 (if)
> nop16  (else)
> op2_16 (else)
> op3_16 (if)
> op4_16
> 
> So now the if/else slots no longer line up with the intended instructions...
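
The slot shift described above can be reproduced with a small model. This is only a sketch: `annotate` is a hypothetical helper, and the "ieei" pattern string is the informal notation used in this message, not assembler syntax. The one rule it encodes is that each instruction consumes exactly one IT slot, whether it is 16 or 32 bits wide.

```python
def annotate(it_pattern, stream):
    """Pair each instruction in `stream` with the IT slot it consumes."""
    roles = {"i": "if", "e": "else"}
    out = []
    for n, name in enumerate(stream):
        # Instructions past the end of the pattern are outside the IT block.
        role = roles[it_pattern[n]] if n < len(it_pattern) else "unconditional"
        out.append((name, role))
    return out

original = annotate("ieei", ["op1_32", "op2_16", "op3_16", "op4_16"])
patched = annotate("ieei", ["trap16", "nop16", "op2_16", "op3_16", "op4_16"])

for name, role in patched:
    print(f"{name:7s} ({role})")
```

With the 32 bit instruction split into two 16 bit ones, op3_16 moves from an "else" slot to an "if" slot and op4_16 falls out of the block entirely.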
> 
> 
> 
>> On Dec 2, 2014, at 2:35 PM, Mario Zechner <badlogicga...@gmail.com> wrote:
>> 
>> I guess the 4-byte instruction could be replaced by a 2-byte trap and a 
>> 2-byte nop (if that exists on ARM). Or any 2-byte instruction instead of the 
>> nop. If i understand correctly, once the trap is hit, the original 4-byte 
>> instruction should be restored again, no? If that's not the case, i could 
>> manually replace the 2nd 2-byte nop with the original 2-bytes after whatever 
>> mechanism restores the memory at the trap location.
>> 
>> I'm still confused why IT isn't considered a branch, I'll go hit the ARM 
>> docs again :)
>> 
>> On Tue, Dec 2, 2014 at 10:59 PM, Greg Clayton <gclay...@apple.com> wrote:
>> The problem here is that we are overwriting a 32 bit instruction with a 16 
>> bit trap. The "IT" instruction isn't a branch and it shouldn't be considered 
>> one. A 4 byte thumb breakpoint must be used; this will work for Linux, but 
>> won't work on MacOSX because our kernel, to my knowledge, doesn't support a 
>> 32 bit thumb breakpoint. I will check on this. So the real fix is to use a 
>> 32 bit thumb breakpoint for 32 bit thumb instructions.
>> 
>> Greg
>> 
>>> On Dec 2, 2014, at 12:01 PM, jing...@apple.com wrote:
>>> 
>>> The question that the stepping code is really asking is "can I predict the 
>>> next instruction that will follow on from this one before I get there."  If 
>>> IsBranch isn't sufficient to know that for some instruction or class of 
>>> instructions, then adding whatever other tests are required seems okay 
>>> formally.  It would be nicer if the MC instructions had some other way to 
>>> characterize this and whatever other instructions behave the same way, but 
>>> I'm not clear enough on what the "way" is to know what question to ask the 
>>> instruction other than "IsBranch".  It would definitely be worth having a 
>>> conversation with the LLVM folks about some way to determine this.  
>>> Otherwise, having to special-case some instructions as "these we know 
>>> confuse us" seems ugly but not terrible.
>>> 
>>> Jim
>>> 
>>> 
>>>> On Dec 2, 2014, at 4:11 AM, Mario Zechner <badlogicga...@gmail.com> wrote:
>>>> 
>>>> Sorry Stephane, forgot to hit "Reply all".
>>>> 
>>>> I dug a bit deeper. The problem is in LLVM's instruction table for ARM 
>>>> Thumbv2. Here's the definition of the IT instruction 
>>>> (llvm/lib/Target/ARM/ARMInstrThumb2.td):
>>>> 
>>>> // IT block
>>>> let Defs = [ITSTATE] in
>>>> def t2IT : Thumb2XI<(outs), (ins it_pred:$cc, it_mask:$mask),
>>>>                   AddrModeNone, 2,  IIC_iALUx,
>>>>                   "it$mask\t$cc", "", []>,
>>>>          ComplexDeprecationPredicate<"IT"> {
>>>> // 16-bit instruction.
>>>> let Inst{31-16} = 0x0000;
>>>> let Inst{15-8} = 0b10111111;
>>>> 
>>>> bits<4> cc;
>>>> bits<4> mask;
>>>> let Inst{7-4} = cc;
>>>> let Inst{3-0} = mask;
>>>> 
>>>> let DecoderMethod = "DecodeIT";
>>>> }
>>>> 
>>>> The instruction isn't marked as isBranch (e.g. via let isBranch=1).
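
The bit layout in the definition above (fixed 0b10111111 in bits 15-8, `cc` in bits 7-4, `mask` in bits 3-0) can be decoded as in the sketch below. `decode_it` is a hypothetical helper. The rule that the lowest set bit of `mask` marks the end of the block, and that a zero mask encodes a hint such as NOP rather than IT, comes from the ARM architecture manual and is not stated in this thread.

```python
def decode_it(halfword):
    """Decode a 16-bit Thumb IT instruction.

    Returns (cc, mask, governed_count), or None if `halfword` is not an IT.
    """
    if (halfword >> 8) != 0b10111111:
        return None
    cc = (halfword >> 4) & 0xF
    mask = halfword & 0xF
    if mask == 0:
        return None  # same prefix with mask 0 is a hint (e.g. NOP), not IT
    # The lowest set bit of the mask terminates the block:
    # 0b1000 -> 1 instruction, 0bx100 -> 2, 0bxx10 -> 3, 0bxxx1 -> 4.
    trailing_zeros = (mask & -mask).bit_length() - 1
    return cc, mask, 4 - trailing_zeros

print(decode_it(0xBF18))  # "it ne": cc = 0b0001 (NE), mask = 0b1000
```

A stepping plan that needs to know how many instructions follow an IT can take the count from here instead of pattern-matching the mnemonic.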
>>>> 
>>>> ThreadPlanStepRange retrieves the next branch instruction for an address 
>>>> range via InstructionList::GetIndexOfNextBranchInstruction, which uses a 
>>>> Disassembler instance that gets all the instruction info from that 
>>>> tablegen file (through the llvm::Target). For each 
>>>> lldb_private::Instruction in the list InstructionLLVMC::DoesBranch is 
>>>> called, which in turn calls 
>>>> DisassemblerLLVMC::LLVMCDisassembler::CanBranch. That method looks up the 
>>>> MCInstrDesc for the instruction's opcode. That MCInstrDesc has a Flag 
>>>> member, which comes from the tablegen file. That is set to 
>>>> MCID::UnmodeledSideEffects for the IT instruction, which is why it's not 
>>>> selected as the next branch instruction.
>>>> 
>>>> Now, i have no idea what side effects it would have to change the tablegen 
>>>> file and regenerate the table. My guess would be that it's a bad idea to 
>>>> change that 4.7k LOC .td file and hope for the best. I guess i'll manually 
>>>> check for the IT instruction in 
>>>> InstructionList::GetIndexOfNextBranchInstruction in case the target arch 
>>>> is ARM. That seems like a really dirty hack though.
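
A rough model of the special case described above, in Python rather than LLDB's C++. `Instr` and `index_of_next_stop` are hypothetical stand-ins for `lldb_private::Instruction` and `InstructionList::GetIndexOfNextBranchInstruction`; the real signatures are not reproduced here. Matching on the mnemonic is itself an assumption about how the check might be written.

```python
import re
from collections import namedtuple

# Hypothetical stand-in for lldb_private::Instruction; not real LLDB API.
Instr = namedtuple("Instr", "addr mnemonic does_branch")

IT_MNEMONIC = re.compile(r"it[te]{0,3}")  # it, itt, ite, ittt, ...

def index_of_next_stop(instrs, start=0, stop_at_it=True):
    """Index of the first instruction at or after `start` where stepping must stop."""
    for i in range(start, len(instrs)):
        if instrs[i].does_branch:
            return i
        if stop_at_it and IT_MNEMONIC.fullmatch(instrs[i].mnemonic):
            return i
    return None

listing = [
    Instr(0x27b2e0, "ldr", False),
    Instr(0x27b2e2, "ldr", False),
    Instr(0x27b2e4, "tst.w", False),
    Instr(0x27b2e8, "it", False),    # not flagged as a branch by the tables
    Instr(0x27b2ea, "blne", True),
    Instr(0x27b2ee, "add", False),
]
print(index_of_next_stop(listing, stop_at_it=False))  # branch flag only
print(index_of_next_stop(listing))                    # with the IT special case
```

Without the special case the search lands on the blne inside the IT block; with it, the search stops on the IT instruction itself, which is outside the block.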
>>>> 
>>>> Any other ideas? Is this something that should be brought up with the LLVM 
>>>> guys?
>>>> 
>>>> On Mon, Dec 1, 2014 at 7:55 PM, Stephane Sezer <s...@fb.com> wrote:
>>>> I suppose it wouldn’t get hit, no. I don’t know about considering it 
>>>> instructions as a branching instruction. I guess it makes sense but I 
>>>> don’t know how the rest would work with it.
>>>> 
>>>> --
>>>> Stephane Sezer
>>>> 
>>>>> On Dec 1, 2014, at 10:47 AM, Mario Zechner <badlogicga...@gmail.com> 
>>>>> wrote:
>>>>> 
>>>>> Thanks, i'm going to try that. I just wonder if it would make more sense 
>>>>> to consider the it instruction a branching instruction. Not sure what 
>>>>> side effects that may have.
>>>>> 
>>>>> Also, if i wrote a 4-byte breakpoint for blne, would it get hit if the it 
>>>>> branches over it? Guess i'll find out :)
>>>>> 
>>>>> On Dec 1, 2014 6:46 PM, "Stephane Sezer" <s...@fb.com> wrote:
>>>>> I remember fighting with this recently in our debug server (ds2), your 
>>>>> understanding of the problem is correct I believe. What you need to do is 
>>>>> to place a four-byte thumb breakpoint instead of a two-byte thumb 
>>>>> breakpoint. I don’t know what the iOS kernel expects exactly, but for 
>>>>> example the Linux kernel understands the following:
>>>>> - two-byte thumb breakpoint: 0xde01
>>>>> - four-byte thumb breakpoint: 0xa000f7f0
>>>>> - arm breakpoint: 0xe7f001f0
>>>>> 
>>>>> If you insert a four-byte thumb breakpoint at 0x27b2ea, the it 
>>>>> instruction will skip four bytes when skipping the breakpoint, and will 
>>>>> end up at address 0x27b2ee, which is what you would expect.
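
A sketch of choosing the trap by the size of the instruction being replaced, using only the three Linux values listed above. `LINUX_TRAPS` and `trap_bytes` are hypothetical names. Writing each value to memory as one little-endian word is an assumption about how the constants are meant to be read, and nothing here applies to the iOS kernel.

```python
import struct

# Values as quoted for the Linux kernel; keyed by (instruction set, size in bytes).
# Assumption: each value is stored as a single little-endian word.
LINUX_TRAPS = {
    ("thumb", 2): struct.pack("<H", 0xDE01),
    ("thumb", 4): struct.pack("<I", 0xA000F7F0),
    ("arm", 4): struct.pack("<I", 0xE7F001F0),
}

def trap_bytes(mode, insn_size):
    """Bytes to write over an instruction of `insn_size` bytes in `mode`."""
    return LINUX_TRAPS[(mode, insn_size)]

print(trap_bytes("thumb", 2).hex())
print(trap_bytes("thumb", 4).hex())
```

The point of the lookup is that the trap always has the same length as the instruction it replaces, so a skipped trap advances the PC by the same amount as the original instruction would have.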
>>>>> 
>>>>> --
>>>>> Stephane Sezer
>>>>> 
>>>>>> On Dec 1, 2014, at 8:13 AM, Mario Zechner <badlogicga...@gmail.com> 
>>>>>> wrote:
>>>>>> 
>>>>>> I think i understand the issue now. 
>>>>>> ThreadPlanStepRange::SetNextBranchBreakpoint is falsely selecting the 
>>>>>> blne instruction instead of the it instruction. The condition is not 
>>>>>> met, so the CPU jumps over the instruction after it. Since we have a 
>>>>>> trap there that's 2 bytes long, it will end up at 0x27b2ec (PC after 2 
>>>>>> byte trap instruction) instead of 0x27b2ee (PC after 4 byte blne). So 
>>>>>> the CPU ends up in the middle of the blne instruction, which is of 
>>>>>> course not a valid instruction.
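
The address arithmetic in this message can be checked directly. The sizes below are read off the disassembly quoted in this thread; `instruction_starts` is a hypothetical helper.

```python
def instruction_starts(first_addr, sizes):
    """Start address of each instruction, given consecutive sizes in bytes."""
    starts, addr = [], first_addr
    for size in sizes:
        starts.append(addr)
        addr += size
    return starts

# it (2 bytes) at 0x27b2e8, blne (4), add (2), pop (2)
starts = instruction_starts(0x27b2e8, [2, 4, 2, 2])

trap_addr = 0x27b2ea
pc_after_2_byte_trap = trap_addr + 2  # what the CPU skips to with a 16 bit trap
pc_after_4_byte_insn = trap_addr + 4  # where the skipped blne would have ended

print(hex(pc_after_2_byte_trap), pc_after_2_byte_trap in starts)
print(hex(pc_after_4_byte_insn), pc_after_4_byte_insn in starts)
```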
>>>>>> 
>>>>>> I guess the next thing i have to figure out is why the it instruction 
>>>>>> isn't marked as a branch instruction, which is why it isn't selected by 
>>>>>> ThreadPlanStepRange::SetNextBranchBreakpoint as the next branch 
>>>>>> breakpoint.
>>>>>> 
>>>>>> On Mon, Dec 1, 2014 at 4:59 PM, Mario Zechner <badlogicga...@gmail.com> 
>>>>>> wrote:
>>>>>> I traced through ThreadPlanStepRange for this 
>>>>>> piece of code:
>>>>>> 
>>>>>> 0x27b2d4 <[J]java.lang.Object.<init>()V>: push   {r7, lr}
>>>>>> 
>>>>>> 0x27b2d6 <[J]java.lang.Object.<init>()V+2>: mov    r7, sp
>>>>>> 
>>>>>> 0x27b2d8 <[J]java.lang.Object.<init>()V+4>: sub    sp, #0x4
>>>>>> 
>>>>>> 0x27b2da <[J]java.lang.Object.<init>()V+6>: movs   r2, #0x0
>>>>>> 
>>>>>> 0x27b2dc <[J]java.lang.Object.<init>()V+8>: str    r2, [sp]
>>>>>> 
>>>>>> 0x27b2de <[J]java.lang.Object.<init>()V+10>: str    r1, [sp]
>>>>>> 
>>>>>> 0x27b2e0 <[J]java.lang.Object.<init>()V+12>: ldr    r2, [r1]
>>>>>> 
>>>>>> 0x27b2e2 <[J]java.lang.Object.<init>()V+14>: ldr    r2, [r2, #0x30]
>>>>>> 
>>>>>> 0x27b2e4 <[J]java.lang.Object.<init>()V+16>: tst.w  r2, #0x100000
>>>>>> 
>>>>>> 0x27b2e8 <[J]java.lang.Object.<init>()V+20>: it     ne
>>>>>> 
>>>>>> 0x27b2ea <[J]java.lang.Object.<init>()V+22>: blne   0x466290             
>>>>>>      ; _bcRegisterFinalizer
>>>>>> 
>>>>>> 0x27b2ee <[J]java.lang.Object.<init>()V+26>: add    sp, #0x4
>>>>>> 
>>>>>> 0x27b2f0 <[J]java.lang.Object.<init>()V+28>: pop    {r7, pc}
>>>>>> 
>>>>>> 0x27b2f2 <[J]java.lang.Object.<init>()V+30>: nop
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Execution is halted at 0x27b2e0 when i issue a source-level step. The 
>>>>>> ThreadPlanStepRange::DidPush method sets up a breakpoint at 0x27b2ea (2 
>>>>>> bytes) successfully after identifying the instruction at 0x27b2ea (blne) 
>>>>>> as the next branch instruction in 
>>>>>> ThreadPlanStepRange::SetNextBranchBreakpoint.
>>>>>> 
>>>>>> Next, the threads are then resumed by the command interpreter. We 
>>>>>> receive an event from the inferior with stop reason eStopReasonException 
>>>>>> (EXC_BAD_INSTRUCTION) right after the resume, stopping the process.
>>>>>> 
>>>>>> I guess this means i need to figure out how "it" and "blne" work 
>>>>>> together (my ARM assembler knowledge is minimal) to then understand why 
>>>>>> the breakpoint instruction that's written to the inferior results in an 
>>>>>> EXC_BAD_INSTRUCTION. If someone knows what could be the culprit let me 
>>>>>> know :)
>>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> Mario
>>>>>> 
>>>>>> 
>>>>>> On Mon, Dec 1, 2014 at 2:07 PM, Mario Zechner <badlogicga...@gmail.com> 
>>>>>> wrote:
>>>>>> Well, i wrote a very long mail detailing my journey to resolve issue #2 
>>>>>> (hanging after setting target.use-fast-stepping=false), only to 
>>>>>> eventually realize that it doesn't hang but instead just waits for the 
>>>>>> above loop to complete.
>>>>>> 
>>>>>> This means turning off target.use-fast-stepping is not an option and i'm 
>>>>>> back to square one. I'd be grateful for any pointers on how to fix issue 
>>>>>> #1 (EXC_BAD_INSTRUCTION). I guess i'll start by investigating the "run 
>>>>>> to next branch" stepping algorithm in LLDB, though my understanding is 
>>>>>> likely not sufficient to make a dent.
>>>>>> 
>>>>>> Thanks,
>>>>>> Mario
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Mon, Dec 1, 2014 at 11:05 AM, Mario Zechner <badlogicga...@gmail.com> 
>>>>>> wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> setting target.use-fast-stepping to false did indeed solve this issue, 
>>>>>> albeit at the cost of increased runtime obviously. However, i ran into 
>>>>>> another issue right after i stepped out of the previously problematic 
>>>>>> function: http://sht.tl/bdAKRC
>>>>>> 
>>>>>> Trying to source-level step this function (with use-fast-stepping=false) 
>>>>>> results in 1) the disassembly getting all kinds of messed up and 2) the 
>>>>>> process not stepping but hanging at the `cmp r1, #0` instruction. The 
>>>>>> original assembly code around that PC looks like this:
>>>>>> 
>>>>>> LBB24_1:                                @ %label0
>>>>>>                                       @ =>This Inner Loop Header: Depth=1
>>>>>>     @DEBUG_VALUE: 
>>>>>> [J]java.lang.Thread.<init>(Ljava/lang/Runnable;Ljava/lang/String;)V:__$env
>>>>>>  <- R5
>>>>>>     ldrexd  r1, r2, [r0]
>>>>>>     strexd  r1, r6, r6, [r0]
>>>>>>     cmp     r1, #0
>>>>>>     bne     LBB24_1
>>>>>> @ BB#2:                                 @ %label0
>>>>>>     @DEBUG_VALUE: 
>>>>>> [J]java.lang.Thread.<init>(Ljava/lang/Runnable;Ljava/lang/String;)V:__$env
>>>>>>  <- R5
>>>>>>     dmb     ish
>>>>>>     movs    r1, #5
>>>>>> 
>>>>>> A simple loop, which is actually part of an inlined function. We had 
>>>>>> some issues with inlined functions previously, i assume this issue is 
>>>>>> related. Interestingly enough, the back trace is also a bit wonky:
>>>>>> 
>>>>>> (lldb) bt
>>>>>> 
>>>>>> * thread #1: tid = 0x18082, 0x0021a9b4 
>>>>>> AttachTestIOSDev`[J]java.lang.Thread.<init>(Ljava/lang/Runnable;Ljava/lang/String;)V
>>>>>>  [inlined] [j]java.lang.Thread.threadPtr(J)[set] + 14 at Thread.java:1, 
>>>>>> stop reason = trace
>>>>>> 
>>>>>> * frame #0: 0x0021a9b4 
>>>>>> AttachTestIOSDev`[J]java.lang.Thread.<init>(Ljava/lang/Runnable;Ljava/lang/String;)V
>>>>>>  [inlined] [j]java.lang.Thread.threadPtr(J)[set] + 14 at Thread.java:1
>>>>>> 
>>>>>>   frame #1: 0x0021a9a6 
>>>>>> AttachTestIOSDev`[J]java.lang.Thread.<init>(__$env=0x01662fc8, 
>>>>>> __$this=0x64da3833, runnable=0xa4f07400, threadName=0x00286000)V + 46 at 
>>>>>> Thread.java:138
>>>>>> 
>>>>>> There should be a lot more frames. I'm gonna try to dig up some more 
>>>>>> details.
>>>>>> 
>>>>>> Thanks a lot!
>>>>>> Mario
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Sun, Nov 30, 2014 at 1:32 AM, Jason Molenda <ja...@molenda.com> wrote:
>>>>>> The size of the breakpoint instruction is set by 
>>>>>> GetSoftwareBreakpointTrapOpcode().  In your case, most likely you're in 
>>>>>> PlatformDarwin::GetSoftwareBreakpointTrapOpcode() - lldb uses the symbol 
>>>>>> table (from the binary file) to determine if the code in a given 
>>>>>> function is arm or thumb.  If it's arm, a 4 byte breakpoint is used.  If 
>>>>>> it's thumb, a 2 byte breakpoint.  Of course thumbv2 or T32 instructions 
>>>>>> can be 4 bytes -- the blne instruction is in your program -- but I 
>>>>>> assume the 2 byte breakpoint instruction still works correctly in these 
>>>>>> cases; the cpu sees the 2-byte instruction and stops execution.
>>>>>> 
>>>>>> I am a little wary about the fact that this comes after an it 
>>>>>> instruction, I kind of vaguely remember issues with that instruction's 
>>>>>> behavior.
>>>>>> 
>>>>>> It shouldn't make any difference but you might want to try
>>>>>> 
>>>>>> (lldb) settings set target.use-fast-stepping false
>>>>>> 
>>>>>> which will force lldb to single instruction step through the function.  
>>>>>> Right now lldb is looking at the instruction stream and putting 
>>>>>> breakpoints on branch/call/jump instructions to do your high-level 
>>>>>> "step" command, instead of stopping on every instruction.  It is 
>>>>>> possible there could be a problem with that approach and the it 
>>>>>> instruction.  Please report back if this changes the behavior.
>>>>>> 
>>>>>> J
>>>>>> 
>>>>>> 
>>>>>>> On Nov 26, 2014, at 9:22 AM, Mario Zechner <badlogicga...@gmail.com> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>> I dug a little deeper, inspecting the GDB remote packets sent by LLDB 
>>>>>>> to perform the stepping. It appears when sending memory breakpoint 
>>>>>>> commands used for stepping, the size of the instruction being replaced 
>>>>>>> isn't taken into account, or writing back the original instruction 
>>>>>>> isn't done properly. The following log shows what happens when stepping 
>>>>>>> into the previously mentioned function:
>>>>>>> 
>>>>>>> (lldb) s
>>>>>>> Process 166 stopped
>>>>>>> * thread #1: tid = 0x0fd9, 0x002602e0 
>>>>>>> AttachTestIOSDev`[J]java.lang.Object.<init>(__$env=0x016bffc8, 
>>>>>>> __$this=0x017864b0)V + 12 at Object.java:136, queue = 
>>>>>>> 'com.apple.main-thread', stop reason = step in
>>>>>>>   frame #0: 0x002602e0 
>>>>>>> AttachTestIOSDev`[J]java.lang.Object.<init>(__$env=0x016bffc8, 
>>>>>>> __$this=0x017864b0)V + 12 at Object.java:136
>>>>>>> (lldb) disassemble -p
>>>>>>> AttachTestIOSDev`[J]java.lang.Object.<init>()V + 12 at Object.java:136:
>>>>>>> -> 0x2602e0:  ldr    r2, [r1]
>>>>>>>  0x2602e2:  ldr    r2, [r2, #0x30]
>>>>>>>  0x2602e4:  tst.w  r2, #0x100000
>>>>>>>  0x2602e8:  it     ne
>>>>>>> (lldb) s
>>>>>>> Process 166 stopped
>>>>>>> * thread #1: tid = 0x0fd9, 0x002602ec 
>>>>>>> AttachTestIOSDev`[J]java.lang.Object.<init>(__$env=0x016bffc8, 
>>>>>>> __$this=0x017864b0)V + 24 at Object.java:136, queue = 
>>>>>>> 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION 
>>>>>>> (code=EXC_ARM_UNDEFINED, subcode=0xffd1b001)
>>>>>>>   frame #0: 0x002602ec 
>>>>>>> AttachTestIOSDev`[J]java.lang.Object.<init>(__$env=0x016bffc8, 
>>>>>>> __$this=0x017864b0)V + 24 at Object.java:136
>>>>>>> (lldb) disassemble -p
>>>>>>> AttachTestIOSDev`[J]java.lang.Object.<init>()V + 24 at Object.java:136:
>>>>>>> -> 0x2602ec:  .long  0xb001ffd1                ; unknown opcode
>>>>>>>  0x2602f0:  pop    {r7, pc}
>>>>>>> 
>>>>>>> AttachTestIOSDev`[J]java.lang.Object.<init>()V + 30:
>>>>>>>  0x2602f2:  nop
>>>>>>> 
>>>>>>> AttachTestIOSDev`[J]java.lang.Object.clone()Ljava/lang/Object; at 
>>>>>>> Object.java:154:
>>>>>>>  0x2602f4:  push   {r4, r5, r7, lr}
>>>>>>> (lldb) disassemble -f
>>>>>>> AttachTestIOSDev`[J]java.lang.Object.<init>()V at Object.java:136:
>>>>>>>  0x2602d4:  push   {r7, lr}
>>>>>>>  0x2602d6:  mov    r7, sp
>>>>>>>  0x2602d8:  sub    sp, #0x4
>>>>>>>  0x2602da:  movs   r2, #0x0
>>>>>>>  0x2602dc:  str    r2, [sp]
>>>>>>>  0x2602de:  str    r1, [sp]
>>>>>>>  0x2602e0:  ldr    r2, [r1]
>>>>>>>  0x2602e2:  ldr    r2, [r2, #0x30]
>>>>>>>  0x2602e4:  tst.w  r2, #0x100000
>>>>>>>  0x2602e8:  it     ne
>>>>>>>  0x2602ea:  blne   0x44b290                  ; _bcRegisterFinalizer
>>>>>>>  0x2602ee:  add    sp, #0x4
>>>>>>>  0x2602f0:  pop    {r7, pc}
>>>>>>> 
>>>>>>> AttachTestIOSDev`[J]java.lang.Object.<init>()V + 30:
>>>>>>>  0x2602f2:  nop
>>>>>>> 
>>>>>>> The first step succeeds and ends up right after the prologue, at 
>>>>>>> 0x2602e0:  ldr    r2, [r1]. The next step ends up at 0x2602ec:  .long  
>>>>>>> 0xb001ffd1 which is wrong, it should be 0x2602ea:  blne   0x44b290.
>>>>>>> 
>>>>>>> The GDB remote conversation between lldb and the debugserver on the 
>>>>>>> device (only relevant parts):
>>>>>>> 
>>>>>>> # First step
>>>>>>> lldb->debugserver: $Z0,2602e0,2#73
>>>>>>> debugserver->lldb: $OK#00
>>>>>>> lldb->debugserver: $vCont;c:0fd9#15
>>>>>>> debugserver->lldb: (320) 
>>>>>>> $T05thread:fd9;qaddr:37ebfad0;threads:fd9,ffa,ffb,ffd,fff,1009,100a,100b;00:c8ff6b01;01:b0647801;02:00000000;03:c87d6a00;04:00000000;05:c8ff6b01;06:fc6a6501;07:0c6a6501;08:90e96b01;09:28000000;0a:74a0ea37;0b:c8ff6b01;0c:b09e5b00;0d:086a6501;0e:d1b22000;0f:
>>>>>>> 
>>>>>>> # Second step
>>>>>>> lldb->debugserver: $Z0,2602ea,2#a4
>>>>>>> debugserver->lldb: $OK#00
>>>>>>> lldb->debugserver: $vCont;c:0fd9#15
>>>>>>> debugserver->lldb: (324) 
>>>>>>> $T92thread:fd9;qaddr:37ebfad0;threads:fd9,ffa,ffb,ffd,fff,1009,100a,100b;00:c8ff6b01;01:b0647801;02:01004300;03:c87d6a00;04:00000000;05:c8ff6b01;06:fc6a6501;07:0c6a6501;08:90e96b01;09:28000000;0a:74a0ea37;0b:c8ff6b01;0c:b09e5b00;0d:086a6501;0e:d1b22000;0f:
>>>>>>> 
>>>>>>> For the first step, a 2 byte memory breakpoint is written to 0x2602e0 
>>>>>>> ($Z0,2602e0,2#73), which is where the first step ended up. The 
>>>>>>> instruction that got replaced is 2 bytes long. The GDB command wrote a 
>>>>>>> 2 bytes memory breakpoint to the address, so all is good.
>>>>>>> 
>>>>>>> For the second step, a 2 byte memory breakpoint is written to 0x2602ea 
>>>>>>> ($Z0,2602ea,2#a4). But instead of stopping at 0x2602ea, we end up at 
>>>>>>> 0x2602ec, which is in the middle of the 4-byte blne instruction.
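
For reference, the framing of the packets in the log above can be reproduced with a few lines. `rsp_packet` and `z0_packet` are hypothetical helper names. The checksum is the sum of the payload bytes modulo 256, written as two hex digits, and the last field of `Z0` is the breakpoint kind, which here is its size in bytes.

```python
def rsp_packet(payload):
    """Frame a GDB remote protocol payload as $payload#checksum."""
    checksum = sum(payload.encode("ascii")) % 256
    return f"${payload}#{checksum:02x}"

def z0_packet(addr, kind):
    """Software breakpoint request: Z0,<addr in hex>,<kind>."""
    return rsp_packet(f"Z0,{addr:x},{kind}")

print(z0_packet(0x2602e0, 2))
print(z0_packet(0x2602ea, 2))
print(rsp_packet("vCont;c:0fd9"))
```

A 4 byte request for the blne would differ only in the final field and the resulting checksum; whether the debugserver on the device accepts such a request is a separate question that this sketch does not answer.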
>>>>>>> 
>>>>>>> Is it correct for LLDB to set a 2 byte memory breakpoint instead of a 
>>>>>>> 4-byte memory breakpoint in this case? The PC will be set to an invalid 
>>>>>>> address, which then causes the EXC_BAD_INSTRUCTION.
>>>>>>> 
>>>>>>> Am i understanding this correctly? Is there a way for me to fix this?
>>>>>>> 
>>>>>>> On Wed, Nov 26, 2014 at 5:26 PM, Mario Zechner 
>>>>>>> <badlogicga...@gmail.com> wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> we generate thumbv7 binaries for iOS devices. We deploy, launch and 
>>>>>>> debug those via LLDB. Stepping into functions seems to almost always 
>>>>>>> generate an EXC_BAD_INSTRUCTION signal. The signal is not generated when 
>>>>>>> running the app without the debugger attached. It is also not generated 
>>>>>>> when we attach a debugger, but simply let the app run without 
>>>>>>> breakpoints or any stepping.
>>>>>>> 
>>>>>>> Here's one of these function's LLVM IR:
>>>>>>> 
>>>>>>> =======================
>>>>>>> define external void @"[J]java.lang.Object.<init>()V"(%Env* %p0, 
>>>>>>> %Object* %p1) nounwind noinline optsize {
>>>>>>> label0:
>>>>>>>   call void @"llvm.dbg.declare"(metadata !{%Env* %p0}, metadata !19), 
>>>>>>> !dbg !{i32 136, i32 0, metadata !{i32 786478, metadata !0, metadata !1, 
>>>>>>> metadata !"[J]java.lang.Object.<init>()V", metadata 
>>>>>>> !"[J]java.lang.Object.<init>()V", metadata !"", i32 136, metadata !15, 
>>>>>>> i1 false, i1 true, i32 0, i32 0, null, i32 256, i1 false, void (%Env*, 
>>>>>>> %Object*)* @"[J]java.lang.Object.<init>()V", null, null, metadata !17, 
>>>>>>> i32 136}, null}
>>>>>>>   %r0 = alloca %Object*
>>>>>>>   store %Object* null, %Object** %r0
>>>>>>>   call void @"llvm.dbg.declare"(metadata !{%Object** %r0}, metadata 
>>>>>>> !21), !dbg !{i32 136, i32 0, metadata !14, null}
>>>>>>>   store %Object* %p1, %Object** %r0
>>>>>>>   call void @"register_finalizable"(%Env* %p0, %Object* %p1), !dbg 
>>>>>>> !{i32 136, i32 0, metadata !18, null}
>>>>>>>   ret void, !dbg !{i32 136, i32 0, metadata !18, null}
>>>>>>> }
>>>>>>> =======================
>>>>>>> 
>>>>>>> The corresponding thumbv7 assembler code as generated by LLVM:
>>>>>>> 
>>>>>>> =======================
>>>>>>>     .globl  "_[J]java.lang.Object.<init>()V"
>>>>>>>     .align  2
>>>>>>>     .code   16                      @ @"[J]java.lang.Object.<init>()V"
>>>>>>>     .thumb_func     "_[J]java.lang.Object.<init>()V"
>>>>>>> "_[J]java.lang.Object.<init>()V":
>>>>>>>     .cfi_startproc
>>>>>>> Lfunc_begin18:
>>>>>>>     .loc    1 136 0                 @ Object.java:136:0
>>>>>>> @ BB#0:                                 @ %label0
>>>>>>>     .loc    1 136 0                 @ Object.java:136:0
>>>>>>>     push    {r7, lr}
>>>>>>>     mov     r7, sp
>>>>>>>     sub     sp, #4
>>>>>>>     @DEBUG_VALUE: [J]java.lang.Object.<init>()V:__$env <- R0
>>>>>>>     movs    r2, #0
>>>>>>>     str     r2, [sp]
>>>>>>>     str     r1, [sp]
>>>>>>>     .loc    1 136 0 prologue_end    @ Object.java:136:0
>>>>>>> Ltmp6:
>>>>>>>     ldr     r2, [r1]
>>>>>>>     ldr     r2, [r2, #48]
>>>>>>>     tst.w   r2, #1048576
>>>>>>> Ltmp7:
>>>>>>>     @DEBUG_VALUE: [J]java.lang.Object.<init>()V:__$env <- R0
>>>>>>>     it      ne
>>>>>>>     blxne   __bcRegisterFinalizer
>>>>>>>     add     sp, #4
>>>>>>>     pop     {r7, pc}
>>>>>>> Ltmp8:
>>>>>>> Lfunc_end18:
>>>>>>> "L_[J]java.lang.Object.<init>()V_end":
>>>>>>> 
>>>>>>>     .cfi_endproc
>>>>>>> =======================
>>>>>>> 
>>>>>>> Now, when stepping into this function, LLDB receives a signal from the 
>>>>>>> debug server:
>>>>>>> 
>>>>>>> =======================
>>>>>>> (lldb) s
>>>>>>> Process 176 stopped
>>>>>>> * thread #1: tid = 0x11f5, 0x0023e2ec 
>>>>>>> AttachTestIOSDev`[J]java.lang.Object.<init>(__$env=0x0169efc8, 
>>>>>>> __$this=0x0174cd10)V + 24 at Object.java:136, queue = 
>>>>>>> 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION 
>>>>>>> (code=EXC_ARM_UNDEFINED, subcode=0xffd1b001)
>>>>>>>   frame #0: 0x0023e2ec 
>>>>>>> AttachTestIOSDev`[J]java.lang.Object.<init>(__$env=0x0169efc8, 
>>>>>>> __$this=0x0174cd10)V + 24 at Object.java:136
>>>>>>> =======================
>>>>>>> 
>>>>>>> Disassembling around the PC gives:
>>>>>>> 
>>>>>>> =======================
>>>>>>> (lldb) disassemble --pc
>>>>>>> AttachTestIOSDev`[J]java.lang.Object.<init>()V + 24 at Object.java:136:
>>>>>>> -> 0x23e2ec:  .long  0xb001ffd1                ; unknown opcode
>>>>>>>  0x23e2f0:  pop    {r7, pc}
>>>>>>> 
>>>>>>> AttachTestIOSDev`[J]java.lang.Object.<init>()V + 30:
>>>>>>>  0x23e2f2:  nop
>>>>>>> 
>>>>>>> Disassembling until the beginning of the frame gives:
>>>>>>> 
>>>>>>> (lldb) disassemble -f
>>>>>>> AttachTestIOSDev`[J]java.lang.Object.<init>()V at Object.java:136:
>>>>>>>  0x23e2d4:  push   {r7, lr}
>>>>>>>  0x23e2d6:  mov    r7, sp
>>>>>>>  0x23e2d8:  sub    sp, #0x4
>>>>>>>  0x23e2da:  movs   r2, #0x0
>>>>>>>  0x23e2dc:  str    r2, [sp]
>>>>>>>  0x23e2de:  str    r1, [sp]
>>>>>>>  0x23e2e0:  ldr    r2, [r1]
>>>>>>>  0x23e2e2:  ldr    r2, [r2, #0x30]
>>>>>>>  0x23e2e4:  tst.w  r2, #0x100000
>>>>>>>  0x23e2e8:  it     ne
>>>>>>>  0x23e2ea:  blne   0x429290                  ; _bcRegisterFinalizer
>>>>>>>  0x23e2ee:  add    sp, #0x4
>>>>>>>  0x23e2f0:  pop    {r7, pc}
>>>>>>> 
>>>>>>> According to this, execution should never end up at address 0x23e2ec. 
>>>>>>> That's right in the middle of the 4-byte blne instruction in the 
>>>>>>> second disassembly. I have a hunch that the debugserver on the device 
>>>>>>> may interfere here, e.g. add a trap instruction to implement the 
>>>>>>> stepping. I'm not quite sure what to make of it.
>>>>>>> 
>>>>>>> I'd appreciate any hints. If you require more information, i got plenty 
>>>>>>> of logs :)
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Mario
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> lldb-dev mailing list
>>>>>>> lldb-dev@cs.uiuc.edu
>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev