Re: [DynInst_API:] where to find the code for handling switch() statements?
Hey there,

ok, I identified one of my mistakes that led to poor performance on
switch()es -- I had marked the code section as writeable, and the switch
handling heuristics then declare all switch jumptables invalid. Fixing this
fixed my issues with the switch code above :)

Cheers,
Thomas

On Tue, Aug 22, 2017 at 4:06 PM, Thomas Dullien wrote:
> Hey there,
>
> regarding 64-bit PE binaries: I am providing the data to Dyninst myself,
> so anything that works "disassembly-wise" should work here, too. In the
> end, to Dyninst, the code is just a blob of assembly.
>
> Thanks a lot for the hints regarding the env var and the code, digging
> into it now :-)
>
> Cheers,
> Thomas
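The pitfall reported in this message can be pictured with a small sketch. It is not Dyninst's code: the `Section` type and the `table_is_trustworthy` helper are hypothetical names, and the rule itself is only an assumption about why a writable region defeats the analysis (a static tool can only trust table entries read from the file if nothing can overwrite them at run time).

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Section:
    """Hypothetical description of one memory region handed to the parser."""
    name: str
    start: int
    size: int
    writable: bool

    def contains(self, addr: int) -> bool:
        return self.start <= addr < self.start + self.size


def table_is_trustworthy(sections: Iterable[Section], table_addr: int) -> bool:
    """Accept a jump table only if it lies in a mapped, read-only region."""
    for sec in sections:
        if sec.contains(table_addr):
            # Entries in writable memory may change at run time, so a
            # conservative analysis refuses to read them statically.
            return not sec.writable
    return False  # address is not mapped at all
```

Under this assumption, flagging the whole code region as writable makes every table inside it fail the check, which matches the symptom of all switch tables being rejected at once.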
Re: [DynInst_API:] where to find the code for handling switch() statements?
On Tue, Aug 22, 2017 at 3:33 PM, Xiaozhu Meng wrote:

Hi Thomas,

While Dyninst fully supports 64-bit ELF binaries, I don't think Dyninst
currently works with 64-bit PE binaries. I would need to ask others how much
effort it would take if you really want to analyze 64-bit PE binaries.

In terms of your 32-bit code example, the jump table construct looks very
primitive, so I am a little surprised that Dyninst currently fails to
analyze it.

To debug this, you can first set "DYNINST_DEBUG_PARSING" to 1 and then run
your program again. This will dump the complete debugging log. In terms of
the code, you want to start with parseAPI/src/IndirectAnalyzer.C, which
performs the analysis of the jump tables. It contains two major pieces:
parseAPI/src/JumpTableFormatPred.C, which contains the code to determine the
jump table locations, jump table index variables, and other format elements,
and parseAPI/src/JumpTableIndexPred.C, which tries to determine the value
bound of the index variables.

In your case, I am guessing that the problem is in JumpTableFormatPred.C.

If you find it difficult to debug this on your own and if it is possible to
share this problematic binary with me, I can take a look at it.

Thanks,

--Xiaozhu
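As a minimal sketch of the debugging step described in this message, the helper below re-runs a tool with `DYNINST_DEBUG_PARSING` set to 1 and saves whatever it prints. The helper names and the command passed in are hypothetical, and capturing both stdout and stderr is an assumption, since the message does not say which stream the log goes to.

```python
import os
import subprocess
from typing import Dict, List, Optional


def debug_parsing_env(base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with parse debugging switched on."""
    env = dict(os.environ if base_env is None else base_env)
    env["DYNINST_DEBUG_PARSING"] = "1"  # variable name taken from the message
    return env


def run_with_parse_log(cmd: List[str], log_path: str) -> int:
    """Run cmd (any program built on Dyninst) and write its output to log_path."""
    with open(log_path, "w") as log:
        result = subprocess.run(
            cmd,
            env=debug_parsing_env(),
            stdout=log,
            stderr=subprocess.STDOUT,  # assumption: capture both streams
        )
    return result.returncode
```

The resulting log can then be searched for the address of the indirect jump to see what the jump table analysis concluded there.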
Re: [DynInst_API:] where to find the code for handling switch() statements?
On Tue, Aug 22, 2017 at 7:50 AM, Thomas Dullien wrote:

Hey there,

I gave the fork a try, but it does not seem to have handled the switch I
encounter either. The construct looks as follows:

.text:5A6E59FA   push  ebp
.text:5A6E59FB   mov   ebp, esp
.text:5A6E59FD   sub   esp, 18h
.text:5A6E5A00   imul  eax, [ebp+arg_4], 28h
.text:5A6E5A04   push  ebx
.text:5A6E5A05   mov   ebx, [ebp+arg_0]
.text:5A6E5A08   push  esi
.text:5A6E5A09   mov   esi, ecx
.text:5A6E5A0B   mov   [ebp+var_8], 17D7840h
.text:5A6E5A12   add   eax, ebx
.text:5A6E5A14   mov   [ebp+var_14], esi
.text:5A6E5A17   mov   [ebp+var_C], ebx
.text:5A6E5A1A   mov   [ebp+var_18], eax
.text:5A6E5A1D   push  edi
.text:5A6E5A1E   cmp   ebx, eax
.text:5A6E5A20   jnb   loc_5A6E608A
.text:5A6E5A26   lea   eax, [ebx+8]
.text:5A6E5A29   mov   ecx, esi
.text:5A6E5A2B   push  eax
.text:5A6E5A2C   call  (..)
.text:5A6E5A31   mov   edi, eax
.text:5A6E5A33   lea   eax, [ebx+18h]
.text:5A6E5A36   push  eax
.text:5A6E5A37   call  (...)
.text:5A6E5A3C   mov   ecx, eax
.text:5A6E5A3E   mov   eax, [ebx]
.text:5A6E5A40   cmp   eax, 36h                 ; switch 55 cases
.text:5A6E5A43   ja    loc_5A6E6095             ; jumptable 5A6E5A49 default case
.text:5A6E5A49   jmp   ds:off_5A6E609A[eax*4]   ; switch jump

Any advice on where in the dyninst codebase I should go digging for the
switch handling code?

Cheers,
Thomas
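For readers unfamiliar with the pattern in this message, here is a rough sketch of what resolving a `cmp eax, 36h` / `ja default` / `jmp ds:table[eax*4]` switch amounts to: the bound 36h allows indices 0 through 36h, so the table holds 55 little-endian 4-byte absolute addresses. The function name and the flat `image` buffer are hypothetical, and this is not how Dyninst implements it.

```python
import struct
from typing import List


def resolve_absolute_table(image: bytes, image_base: int,
                           table_addr: int, max_index: int) -> List[int]:
    """Read max_index + 1 absolute 32-bit targets starting at table_addr."""
    count = max_index + 1              # `cmp eax, N; ja` permits indices 0..N
    offset = table_addr - image_base   # position of the table inside `image`
    if offset < 0 or offset + 4 * count > len(image):
        raise ValueError("jump table lies outside the image")
    # Each entry is a little-endian 32-bit absolute address.
    return list(struct.unpack_from("<%dI" % count, image, offset))
```

Called with `max_index=0x36`, the function returns 55 targets, which agrees with the "switch 55 cases" comment in the listing.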
Re: [DynInst_API:] where to find the code for handling switch() statements?
On Tue, Aug 22, 2017 at 1:26 PM, Thomas Dullien wrote:

Hey there,

an example from 32-bit code where the default switch handling fails:

.text:00412990   sub   esp, 50h
.text:00412993   mov   eax, ___security_cookie
.text:00412998   xor   eax, esp
.text:0041299A   mov   [esp+50h+var_4], eax
.text:0041299E   mov   edx, [esp+50h+arg_0]
.text:004129A2   push  ebx
.text:004129A3   mov   ebx, ecx
.text:004129A5   lea   eax, [edx-1]
.text:004129A8   cmp   eax, 6                   ; switch 7 cases
.text:004129AB   ja    loc_412F7E               ; jumptable 004129B4 default case
.text:004129B1   push  ebp
.text:004129B2   push  esi
.text:004129B3   push  edi
.text:004129B4   jmp   ds:off_412F90[eax*4]     ; switch jump

Enough of this for the moment, though :-)) -- I will check your branch now
:-)

Cheers,
Thomas
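The index computation in this listing (`lea eax, [edx-1]`, then `cmp eax, 6` and `ja`) can be modelled in a few lines. The sketch below uses a hypothetical helper name and only illustrates the x86 semantics: the subtraction wraps at 32 bits and `ja` is an unsigned comparison, so a single check rejects both values below the first case and values above the last.

```python
from typing import Optional


def table_index(case_value: int, bias: int, max_index: int) -> Optional[int]:
    """Return the jump table index for case_value, or None for the default case."""
    # `lea eax, [edx-bias]` wraps modulo 2**32, like the register does.
    idx = (case_value - bias) & 0xFFFFFFFF
    # `cmp eax, max_index; ja default` compares as unsigned integers.
    return idx if idx <= max_index else None
```

With `bias=1` and `max_index=6`, case values 1 through 7 map to indices 0 through 6, while 0 wraps around to 0xFFFFFFFF and falls through to the default case.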
Re: [DynInst_API:] where to find the code for handling switch() statements?
On Tue, Aug 22, 2017 at 1:24 PM, Thomas Dullien wrote:

Hey there,

I am back at work on this :-).

A few questions:
- Your fork is a fork of Dyninst 9?
- Are there any things I need to be aware of when building it?

The particular scenario I am dealing with right now is the following
construct (x86_64 disassembly of Visual Studio compiled code).

.text:00014004D970   mov     [rsp+arg_8], edx
.text:00014004D974   mov     [rsp+arg_0], rcx
.text:00014004D979   push    rdi
.text:00014004D97A   sub     rsp, 220h
.text:00014004D981   mov     rdi, rsp
.text:00014004D984   mov     ecx, 88h
.text:00014004D989   mov     eax, 0h
.text:00014004D98E   rep stosd
.text:00014004D990   mov     rcx, [rsp+228h+arg_0]
.text:00014004D998   mov     rax, cs:__security_cookie
.text:00014004D99F   xor     rax, rsp
.text:00014004D9A2   mov     [rsp+228h+var_18], rax
.text:00014004D9AA   mov     eax, [rsp+228h+arg_8]
.text:00014004D9B1   mov     [rsp+228h+var_80], eax
.text:00014004D9B8   mov     eax, [rsp+228h+var_80]
.text:00014004D9BF   dec     eax
.text:00014004D9C1   mov     [rsp+228h+var_80], eax
.text:00014004D9C8   cmp     [rsp+228h+var_80], 5    ; switch 6 cases
.text:00014004D9D0   ja      loc_14004EA48           ; jumptable 00014004D9EF default case
.text:00014004D9D6   movsxd  rax, [rsp+228h+var_80]
.text:00014004D9DE   lea     rcx, cs:14000h
.text:00014004D9E5   mov     eax, ds:(off_14004EA70 - 14000h)[rcx+rax*4]
.text:00014004D9EC   add     rax, rcx
.text:00014004D9EF   jmp     rax                     ; switch jump
.text:00014004D9F1 ; ---

Cheers,
Thomas
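The 64-bit construct in this message differs from the 32-bit ones in that the table stores 32-bit offsets rather than absolute addresses: the code loads an entry with a zero-extending `mov eax, ...` and then adds the base held in `rcx`. A minimal sketch of that arithmetic follows. The function name and the flat `image` buffer are hypothetical, the base value is whatever the caller supplies rather than something read from the listing, and this is not Dyninst's implementation.

```python
import struct
from typing import List


def resolve_relative_table(image: bytes, image_base: int,
                           table_addr: int, max_index: int) -> List[int]:
    """Resolve a table of unsigned 32-bit offsets that are relative to image_base."""
    count = max_index + 1              # `cmp var, N; ja` permits indices 0..N
    offset = table_addr - image_base   # position of the table inside `image`
    if offset < 0 or offset + 4 * count > len(image):
        raise ValueError("jump table lies outside the image")
    # `mov eax, [...]` zero-extends, so entries are read as unsigned values.
    rel = struct.unpack_from("<%dI" % count, image, offset)
    # `add rax, rcx` turns each offset into an absolute target address.
    return [image_base + r for r in rel]
```

For the listing above, `max_index` would be 5, giving the six targets named in the "switch 6 cases" comment. An analysis that expects absolute 4-byte or 8-byte entries would misread such a table, which is one plausible reason this form is harder to recover.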
Re: [DynInst_API:] where to find the code for handling switch() statements?
On Tue, Jun 13, 2017 at 4:35 PM, Thomas Dullien wrote:

Hey there,

excellent, thanks for your quick response :-) I will give your fork a try in
the next 2-3 days -- I am currently at a conference and hence won't have
time to try it today :-)

Cheers,
Thomas
Re: [DynInst_API:] where to find the code for handling switch() statements?
On Tue, Jun 13, 2017 at 10:30 AM, Xiaozhu Meng wrote:

Hi Thomas,

I am working with an improved jump table analysis. Its prototype is
available at my Dyninst fork
(https://github.com/mxz297/dyninst/tree/jump_table_multi_slices). This
improved version should be merged back to mainstream Dyninst in the near
future. Could you try my version to see whether it solves your problem? If
the problem remains, could you provide me the problematic binary so that I
can further improve my code?

Thanks,

--Xiaozhu

On Tue, Jun 13, 2017 at 7:25 AM, Thomas Dullien wrote:
> Hey all,
>
> I am using DynInst for a small project that helps search for similar
> flowgraph in a search index
> (https://www.github.com/thomasdullien/functionsimsearch)
> and noticed that most switch statements that it encounters are not
> handled properly (e.g. the control flow reconstruction fails to resolve
> the switch targets).
>
> Where in the source code should I go looking for the relevant code?
> I'd love to have a look around to see if it can be improved.
>
> Cheers,
> Thomas

___
Dyninst-api mailing list
Dyninst-api@cs.wisc.edu
https://lists.cs.wisc.edu/mailman/listinfo/dyninst-api