Hi DS,

> Please have a look at https://www.freecalypso.org/members/ds/ccdgen.exe.c
> which is the result of the whole executable decompilation.

Is it from IDA or Ghidra?  In either case, thank you for this point of
reference - it gives me an idea of what can be expected from the
existing tools.

> I recommend you give ghidra a try nonetheless.

Adding it to my long to-do list...  Just "giving it a try" is already
difficult in itself because they don't support 32-bit Linux hosts: I
refuse to defile my pristine primary-use machine with a 64-bit OS, thus
I have to use a computer other than my preferred primary one for this
whole "give it a try" experiment...

> I would be surprised if either IDA's or ghidra output can be recompiled
> as-is. The result of the decompiler was not meant for recompilation,

This part is the most disappointing: the tools are simply not made for
the purpose that's needed here, and the remaining problems then mostly
stem from the mismatch in purpose-orientation between the creator and
the user of the tool...

When the problem to be solved is the recovery of a lost-source math or
data processing or "business logic" application (meaning a program that
is known to not issue any system calls, not access any hw or any network
resources etc, only reading and writing files via stdio), the first step
prior to applying decompilation logic needs to be identification of
program vs standard library code.

I expect that every part of the .text section in a binary such as
ccdgen.exe must fall neatly into one of just a few mutually exclusive
categories:

1) entry point code coming from crt0.o or whatever M$ called their
   version, executing before main() entry;

2) actual program code of interest, beginning with main() and ending at
   points where the code calls fopen(), fgets(), fscanf(), printf(),
   fprintf() etc, as well as other (not stdio) libc functions like
   strcmp(), malloc() and whatnot;

3) bodies of all those libc functions just named and everything they
   call further downstream;

4) bits of code inserted by the linker, whatever is needed for Win32
   environment - my memory is rusty after not touching that stuff for
   over 25 years.

Actual decompilation logic, as in machine generation of recompilable C
code from disassembly, needs to be applied *only* to part 2 of the
just-listed division, and not any other parts.

Also given how linkers typically work, especially old and "dumb" ones, I
would expect the 4 code divisions I just listed above to actually appear
in the .text section in that order: the linker would first process
crt0.o and the application objects listed on its invocation line, then
start pulling modules from whatever was MSVC's equivalent of libc+libgcc
in order to satisfy externals.  Hence I would expect to see a boundary
in the .text section between the end of interesting code and the
beginning of uninteresting bits pulled from the standard library - but
looking at the fully automated decompiler output, it looks like the
tools aren't smart enough to recognize it...
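
To make the expected layout concrete, here is a minimal Python sketch of
the address-based split described above.  It is an illustration under
stated assumptions, not the output of any existing tool: the Func record,
the split_text() helper and all addresses are hypothetical, the list of
function start addresses is assumed to come from some disassembler, and
a handful of library routines (printf, strcmp etc) are assumed to have
been identified already by some other means.  Categories 3 and 4 are
lumped together, and the ordering of the categories is only the
expectation stated above, so the code refuses to answer when that
ordering visibly fails.

```python
from typing import List, NamedTuple, Optional, Set, Tuple


class Func(NamedTuple):
    addr: int             # start address of the function inside .text
    name: Optional[str]   # label if one is known, else None


def split_text(
    funcs: List[Func], main_addr: int, known_lib_names: Set[str]
) -> Tuple[List[Func], List[Func], List[Func]]:
    """Split functions into (startup, program, library) by address.

    Assumption: .text is laid out as startup code, then application
    code starting at main(), then everything pulled from libraries.
    The boundary is taken as the lowest address of any function
    already identified as a library routine.
    """
    ordered = sorted(funcs, key=lambda f: f.addr)
    lib_addrs = [f.addr for f in ordered if f.name in known_lib_names]
    if not lib_addrs:
        raise ValueError("no identified library functions to anchor on")
    boundary = min(lib_addrs)
    if boundary <= main_addr:
        raise ValueError("layout assumption does not hold for this binary")
    startup = [f for f in ordered if f.addr < main_addr]
    program = [f for f in ordered if main_addr <= f.addr < boundary]
    library = [f for f in ordered if f.addr >= boundary]
    return startup, program, library


# Hypothetical function list; none of these addresses come from ccdgen.exe.
demo = [
    Func(0x401000, None),       # startup code before main()
    Func(0x401100, "main"),
    Func(0x401200, None),       # unnamed application function
    Func(0x401800, "strcmp"),   # first identified library routine
    Func(0x401900, None),       # unnamed library helper
    Func(0x401A00, "printf"),
]
startup, program, library = split_text(
    demo, main_addr=0x401100, known_lib_names={"strcmp", "printf", "fopen"}
)
```

Only the functions in the program list would then be handed to the
decompilation step.  A library routine that was missed during
identification and sits below the true boundary would be misfiled as
program code, so the result is only as good as the identification it
starts from.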

It has been 25 years since I did any work with x86 assembly, and 28
years since I did truly hard-core x86 reversing, so my memory is quite a
bit rusty, but it looks like I have no choice but to heavily brush up on
my x86 knowledge, dust off and re-read all those books about Win32 and
PE file format which I should still have somewhere, and then decide on
the most appropriate course of action, which may involve developing some
new tools.

Hasta la Victoria, Siempre,
Mychaela
aka The Mother