RE: [fpc-devel] Manual reload of a DLL snapshot (with relocations) causes multiple AV
Thank you all for the helpful suggestions :)

The problem is now solved. As I suspected earlier, the RTL uses quite a lot of absolute addressing when calling its internal routines. These addresses are stored in both the initialized and the uninitialized data sections of the DLL (.data and .bss respectively). While addresses stored in .data are fixed up by relocations, addresses in .bss are not; the Windows DLL loader simply initializes .bss to zero. My engine now does the same, and everything started to work just fine.

Best regards,
Gennadiy.
___
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
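To make the fix concrete, here is a minimal sketch (in Python, not the engine's own language) of zeroing the bytes that the Windows loader would have zero-filled, applied to a snapshot before it is written to the kernel section. It assumes the snapshot is in mapped (in-memory) layout, so section data sits at its VirtualAddress. The function names and the toy image builder are hypothetical and are not taken from Gennadiy's engine; only the header offsets and the IMAGE_SCN_CNT_UNINITIALIZED_DATA flag come from the PE/COFF format.

```python
import struct

IMAGE_SCN_CNT_UNINITIALIZED_DATA = 0x00000080  # section flag from the PE/COFF format


def iter_sections(image):
    """Yield (name, virtual_size, virtual_address, raw_size, flags) per section."""
    e_lfanew, = struct.unpack_from("<I", image, 0x3C)
    if image[:2] != b"MZ" or image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    num_sections, = struct.unpack_from("<H", image, e_lfanew + 6)
    opt_size, = struct.unpack_from("<H", image, e_lfanew + 20)
    table = e_lfanew + 24 + opt_size  # section table follows the optional header
    for i in range(num_sections):
        off = table + 40 * i  # each section header is 40 bytes
        name, vsize, vaddr, raw_size = struct.unpack_from("<8sIII", image, off)
        flags, = struct.unpack_from("<I", image, off + 36)
        yield name.rstrip(b"\0").decode("ascii"), vsize, vaddr, raw_size, flags


def zero_uninitialized_data(image):
    """Zero every byte with no file-backed content; return the names of touched sections."""
    cleared = []
    for name, vsize, vaddr, raw_size, flags in iter_sections(image):
        if flags & IMAGE_SCN_CNT_UNINITIALIZED_DATA:
            start = vaddr  # whole section is .bss-style data
        else:
            start = vaddr + min(raw_size, vsize)  # only the tail past the file-backed bytes
        end = vaddr + vsize
        if end > start:
            image[start:end] = bytes(end - start)
            cleared.append(name)
    return cleared


def make_demo_image(sections, size=0x4000):
    """Hypothetical helper: build a toy mapped image with just enough header to parse."""
    image = bytearray(b"\xCC" * size)  # 0xCC stands in for stale snapshot contents
    e_lfanew = 0x80
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, e_lfanew)
    image[e_lfanew:e_lfanew + 4] = b"PE\0\0"
    struct.pack_into("<H", image, e_lfanew + 6, len(sections))
    struct.pack_into("<H", image, e_lfanew + 20, 0)  # toy image has no optional header
    for i, (name, vsize, vaddr, raw_size, flags) in enumerate(sections):
        off = e_lfanew + 24 + 40 * i
        struct.pack_into("<8sIII", image, off, name.encode("ascii"), vsize, vaddr, raw_size)
        struct.pack_into("<I", image, off + 36, flags)
    return image


image = make_demo_image([
    (".data", 0x200, 0x1000, 0x200, 0x40),  # fully file-backed, must stay untouched
    (".bss", 0x300, 0x2000, 0, IMAGE_SCN_CNT_UNINITIALIZED_DATA),
])
cleared = zero_uninitialized_data(image)
print(cleared)
```

In this toy image only .bss is cleared, while the file-backed .data bytes (whose pointers the relocation pass is responsible for) are left as they were.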
[fpc-devel] Manual reload of a DLL snapshot (with relocations) causes multiple AV
= Preamble =

The task I'm working on is somewhat unorthodox, but I'd like to get some comments anyway.

I'm writing a semi-stealth DLL: one that is loaded by the process (and visible to others) while not residing on disk as a file. This is done by the following method:

1. Call LoadLibrary on an existing DLL file.
2. Take a memory snapshot of the loaded DLL.
3. Call FreeLibrary on it and delete the DLL file.
4. Create a KnownDlls kernel section object and obtain its new base address.
5. Process relocations and imagebase fixups, then write the patched image to the kernel section.

Now whenever a process tries to load the DLL normally by its magic name, it is loaded from this kernel section, not from a file.

= Problem =

As long as I use a minimalistic DLL with a single call to MessageBoxW (written in masm), everything works just fine. Then comes the following:

---
library demo;
uses Windows;
begin
  MessageBoxW(GetDesktopWindow,'Demo window','Demo',MB_OK or MB_ICONINFORMATION);
end.
---

This one fails with an AV when loaded from the kernel section. I narrowed the problem down to the implicitly linked code from _FPC_DLL_Entry as well as the initialization section of system.pp. The first failure is caused by this:

---
{ pass dummy value }
StackLength := CheckInitialStkLen($100);
StackBottom := StackTop - StackLength;
---

These are threadvars, but the helper functions SysAllocateThreadVars() and suchlike are only initialized _later_ in the code by the call to InitSystemThreads(). Therefore an AV ensues, since SysAllocateThreadVars() is being called through the now invalid absolute addresses. I wonder why it even works normally.

At first I thought InitSystemThreads() was the only culprit, since it remembers the addresses of the threadvar and other helper functions. But moving it ahead of the access to StackLength did not help much. There seems to be a lot of other stuff that remembers other functions' addresses in such a way that they are not recalculated upon DLL reload.

I tried to figure out in what order the procedure calls between begin and end in system.pp should be rearranged, but that didn't help either. Finally, as a last resort, I commented out everything in system.pp's begin-end section, and now my library is “working” just fine. I understand, though, that this method implies stripping most of FPC's juicy features like strings, memory management, maybe arrays etc., and that I can't really do much without them.

= Question =

First, I'd like to know why calling the DLL entry point anew does not re-initialize _all_ RTL internals regardless of what was remembered prior to taking the snapshot. Is it just assuming good behavior of the system loader, which carefully zeroes the data segment while I don't?

Second, I'd be glad to hear any suggestions for workarounds, such as:

a) is there a way to supply a specially modified system.ppu/o at compile time depending on which source file is being compiled? or
b) is there a way to somehow IFDEF the initialization section in system.pp so that my DLL code can do without it while everything else keeps it? or
c) should I take care to zero the data segment after the relocations are processed? or
(in the ideal Universe)
d) is there a proper order of calling the initialization routines in the system.pp section that makes sure EVERY helper function gets its address variable updated prior to use?

Thank you all in advance.

Best Regards,
Gennadiy.
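Step 5 of the method (processing relocations) can be sketched as follows. This is a simplified Python model of base-relocation processing on a byte buffer, not the engine's code: the function name, the buffer layout and the addresses are made up for illustration, while the block format and the type codes (0, 3, 10) follow the PE format. It also shows why a pointer that the RTL stored at run time survives the pass unchanged: it has no entry in the relocation table.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0   # padding entry, skipped
IMAGE_REL_BASED_HIGHLOW = 3    # 32-bit absolute address
IMAGE_REL_BASED_DIR64 = 10     # 64-bit absolute address


def apply_base_relocations(image, reloc_rva, reloc_size, delta):
    """Add delta (new base minus old base) to every slot listed in the relocation table."""
    patched = 0
    pos, end = reloc_rva, reloc_rva + reloc_size
    while pos + 8 <= end:
        page_rva, block_size = struct.unpack_from("<II", image, pos)
        if block_size < 8:
            break  # malformed or terminating block
        for entry_pos in range(pos + 8, pos + block_size, 2):
            entry, = struct.unpack_from("<H", image, entry_pos)
            kind, offset = entry >> 12, entry & 0x0FFF
            target = page_rva + offset
            if kind == IMAGE_REL_BASED_HIGHLOW:
                value, = struct.unpack_from("<I", image, target)
                struct.pack_into("<I", image, target, (value + delta) & 0xFFFFFFFF)
                patched += 1
            elif kind == IMAGE_REL_BASED_DIR64:
                value, = struct.unpack_from("<Q", image, target)
                struct.pack_into("<Q", image, target, (value + delta) & 0xFFFFFFFFFFFFFFFF)
                patched += 1
            # ABSOLUTE entries and any other type are left alone in this sketch
        pos += block_size
    return patched


OLD_BASE, NEW_BASE = 0x10000000, 0x20000000  # made-up load addresses
image = bytearray(0x3000)
struct.pack_into("<I", image, 0x1010, OLD_BASE + 0x1234)  # pointer in .data, has a reloc entry
struct.pack_into("<I", image, 0x2000, OLD_BASE + 0x1500)  # pointer stored at run time, no entry
# One relocation block for page 0x1000: HIGHLOW at offset 0x010 plus one padding entry.
struct.pack_into("<IIHH", image, 0x2800, 0x1000, 12, (IMAGE_REL_BASED_HIGHLOW << 12) | 0x010, 0)

patched = apply_base_relocations(image, 0x2800, 12, NEW_BASE - OLD_BASE)
data_ptr, = struct.unpack_from("<I", image, 0x1010)
bss_ptr, = struct.unpack_from("<I", image, 0x2000)
print(hex(data_ptr), hex(bss_ptr))
```

The slot covered by the relocation table now points into the new image, while the run-time pointer still refers to the old base.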
Re: [fpc-devel] Manual reload of a DLL snapshot (with relocations) causes multiple AV
Hello FPC,

Friday, January 6, 2012, 3:07:31 PM, you wrote:

GP> The task I'm working on is somewhat unorthodox, but I'd like to get some comments anyway.

[...]

GP> I'm writing semi-stealth DLL which is one loaded by the
GP> process (and visible by others) while not residing on disk as a
GP> file. This is done by the following method:

Why do you not load the DLL from memory? There is code available that loads and relocates it the way the Windows OS does.

--
Best regards,
José
RE: [fpc-devel] Manual reload of a DLL snapshot (with relocations) causes multiple AV
> GP> The task I'm working on is somewhat unorthodox, but I'd like to get some comments anyway.
>
> [...]
>
> GP> I'm writing semi-stealth DLL which is one loaded by the process
> GP> (and visible by others) while not residing on disk as a file. This
> GP> is done by the following method:
>
> Why do you not load the DLL from memory? There is code available that loads and relocates it the way the Windows OS does.

I've seen a lot of code samples that do just that, including one for kernel mode. My goal is not to reinvent the wheel and duplicate the whole DLL loader mechanism, but rather to not have a DLL file on disk. I also want to keep the code as small as possible, leaving the OS to do most of the work; that is better for [backward] compatibility too (I need both x86 and x64 versions with a shared codebase).

The current engine is only 7kb of source code, and it omits manual adjustment of sections to page boundaries, import walking, and most of the relocation block processing (ntdll luckily happens to have LdrProcessRelocationBlock), etc., since all of this is done by LdrLoadDll anyway.
Re: [fpc-devel] Manual reload of a DLL snapshot (with relocations) causes multiple AV
On 06.01.2012 15:07, Gennadiy Poryev wrote:
> First, I'd like to know why calling of DLL entry point anew does not re-initialize _all_ RTL internals regardless of what was remembered prior to taking the snapshot. Is it just assuming good behavior of system loader that carefully zeroes data segment while I don't?

The point is: why should a DLL that is used normally take special care of this? The normal lifetime of a DLL is

* DLL is loaded by LoadLibrary
* Windows calls DLL entrypoint with PROCESS_ATTACH
* DLL is used
* DLL is unloaded by FreeLibrary
* Windows calls DLL entrypoint with PROCESS_DETACH
* DLL is gone from memory

If the process now loads that DLL again after a PROCESS_DETACH, it runs the complete PROCESS_ATTACH again.

I have to admit, though, that I don't know either why the StackLength, StackBottom parts work...

Thinking about this a bit... it might be that the TLS value is still set to a value <> Nil. Thus the RTL will reference the old values, which are no longer valid. I'd suggest you take a look at %fpcdir%/rtl/win/systhrd.inc and, in there, SysRelocateThreadVar. It's just a guess though.

Regards,
Sven
Re: [fpc-devel] Manual reload of a DLL snapshot (with relocations) causes multiple AV
On 06.01.2012 18:07, Gennadiy Poryev wrote:
> = Preamble =
>
> First, I'd like to know why calling of DLL entry point anew does not re-initialize _all_ RTL internals regardless of what was remembered prior to taking the snapshot. Is it just assuming good behavior of system loader that carefully zeroes data segment while I don't?

The data segment is not necessarily zeroed; it can contain non-zero initial values. Once you load the DLL using LoadLibrary and let its entrypoint run, the entrypoint overwrites the initialized part with new values, and there is no way to recover the original values. In particular, the tlsindex global variable is initialized with the value -1, not 0.

Several APIs exist which allow loading the image without invoking its entrypoint: LoadLibraryEx, MapImage, etc. They vary in how they process imports and relocations, though.

Regards,
Sergei
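Sergei's point can be illustrated with a toy model. The Python sketch below is not the RTL's actual code: the dictionary fields, the helper offset, the stand-in index value 7 and the "valid code" set are all hypothetical. It only shows the general pattern of a one-time initialization guarded by a sentinel (-1 here, as mentioned for tlsindex), and how a snapshot taken after the entry point has run carries both the consumed sentinel and a stale absolute address to the new base.

```python
PRISTINE_DATA = {"tlsindex": -1, "threadvar_handler": 0}  # values as linked into the file
HANDLER_RVA = 0x1500  # hypothetical offset of a helper routine inside the image


def dll_entry(data, image_base, valid_code):
    """Toy PROCESS_ATTACH: initialise once, then call the helper through the stored address."""
    if data["tlsindex"] == -1:                  # only a fresh image passes this guard
        data["tlsindex"] = 7                    # stand-in for an allocated TLS index
        data["threadvar_handler"] = image_base + HANDLER_RVA
    if data["threadvar_handler"] not in valid_code:
        return "access violation"               # the stored address points into the old image
    return "ok"


old_base, new_base = 0x10000000, 0x20000000     # made-up load addresses

data = dict(PRISTINE_DATA)
first = dll_entry(data, old_base, {old_base + HANDLER_RVA})          # normal LoadLibrary
snapshot = dict(data)                                                # taken after the entry point ran
reloaded = dll_entry(snapshot, new_base, {new_base + HANDLER_RVA})   # snapshot mapped at a new base
fresh = dll_entry(dict(PRISTINE_DATA), new_base, {new_base + HANDLER_RVA})

print(first, reloaded, fresh)
```

In the model, only the image that still holds its pristine initial values re-initializes correctly at the new base, which is the distinction between a snapshot taken after the entry point ran and a freshly loaded file.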
RE: [fpc-devel] Manual reload of a DLL snapshot (with relocations) causes multiple AV
>> First, I'd like to know why calling of DLL entry point anew does not re-initialize _all_ RTL internals regardless of what was remembered prior to taking the snapshot. Is it just assuming good behavior of system loader that carefully zeroes data segment while I don't?
>
> The data segment is not necessarily zeroed, it can contain non-zero initial values. Once you load DLL using LoadLibrary and let its entrypoint run, it will overwrite initialized part with new values and there is no way to recover the original values. In particular, tlsindex global variable is initialized with value of -1, not 0.

That's precisely what I was complaining about. Apparently it does NOT overwrite them with new values, and I want to know how to make sure it does.

> Several APIs exist which allow to load the image without invoking its entrypoint. LoadLibraryEx, MapImage, etc. They vary in processing imports and relocations, though.

Irrelevant. I'm pretty comfortable with what LoadLibrary is supposed to do.
RE: [fpc-devel] Manual reload of a DLL snapshot (with relocations) causes multiple AV
>> First, I'd like to know why calling of DLL entry point anew does not re-initialize _all_ RTL internals regardless of what was remembered prior to taking the snapshot. Is it just assuming good behavior of system loader that carefully zeroes data segment while I don't?
>
> The point is: why should a DLL that is used normally take special care of this? The normal lifetime of a DLL is
>
> * DLL is loaded by LoadLibrary
> * Windows calls DLL entrypoint with PROCESS_ATTACH
> * DLL is used
> * DLL is unloaded by FreeLibrary
> * Windows calls DLL entrypoint with PROCESS_DETACH

Actually, the PROCESS_DETACH call happens before the DLL is unloaded. But my point is that I take the snapshot of the DLL image before all this happens, right after the call of its entry point with PROCESS_ATTACH. Therefore, to the DLL it appears as if PROCESS_ATTACH is called once again, only this time with a different imagebase (i.e. hInstance) and relocs adjusted accordingly. And I don't see why it should not work this way.

> If now the process loads that DLL again after a PROCESS_DETACH it runs the complete PROCESS_ATTACH again.

I wish it was that easy. But there is no reasonable way to take a snapshot image between PROCESS_DETACH and the actual freeing of the memory.

> Thinking about this a bit... it might be that there is still the TLS value set to a value <> Nil. Thus the RTL will reference the old values which are no longer valid. I'd suggest you to take a look at %fpcdir%/rtl/win/systhrd.inc and there SysRelocateThreadVar. It's just a guess though.

If so, how can I make sure these values are properly initialized if I call the DLL entry point again?
Re: [fpc-devel] Manual reload of a DLL snapshot (with relocations) causes multiple AV
On 06.01.2012 19:10, Gennadiy Poryev wrote:
>>> First, I'd like to know why calling of DLL entry point anew does not re-initialize _all_ RTL internals regardless of what was remembered prior to taking the snapshot. Is it just assuming good behavior of system loader that carefully zeroes data segment while I don't?
>>
>> The point is: why should a DLL that is used normally take special care of this? The normal lifetime of a DLL is
>>
>> * DLL is loaded by LoadLibrary
>> * Windows calls DLL entrypoint with PROCESS_ATTACH
>> * DLL is used
>> * DLL is unloaded by FreeLibrary
>> * Windows calls DLL entrypoint with PROCESS_DETACH
>
> Actually PROCESS_DETACH call happens before DLL is unloaded.

I used an unclear description. What I meant was that unloading is triggered by the call to FreeLibrary.

> But my point is that I take snapshot of the DLL image before all this happens, actually right after the call of its entry point with PROCESS_ATTACH. Therefore, for a DLL it appears as if PROCESS_ATTACH is called once again, only this time with different imagebase (i.e. hInstance) and relocs adjusted accordingly. And I don't see why it should not work this way.

I think this is exactly the problem. The RTL code relies on the fact that the entrypoint is called with PROCESS_ATTACH only once during the lifetime of a DLL (maybe it does not rely on that explicitly, but implicitly). If PROCESS_ATTACH is called again, then the DLL should have been loaded freshly.

>> If now the process loads that DLL again after a PROCESS_DETACH it runs the complete PROCESS_ATTACH again.
>
> I wish it was that easy. But there is no reasonable way to take snapshot image between PROCESS_DETACH and actual freeing of the memory.

It might not be the wisest to take a snapshot at that moment. Theoretically it would be best to take the snapshot BEFORE PROCESS_ATTACH is called, or at least before any RTL startup code is run.

I don't know how exactly you achieve that snapshot taking, but you could modify the DLL entrypoint of FPC (located in %fpcdir%\rtl\win\syswin.inc) and e.g. raise an SEH exception that you catch, do the snapshot and continue. I have not tested whether raising an SEH exception during DLL loading will abort the loading process or simply call the installed exception handlers, so this is just a theory...

>> Thinking about this a bit... it might be that there is still the TLS value set to a value <> Nil. Thus the RTL will reference the old values which are no longer valid. I'd suggest you to take a look at %fpcdir%/rtl/win/systhrd.inc and there SysRelocateThreadVar. It's just a guess though.
>
> If so, how can I make sure these values are properly initialized if I call dll entry point again?

AFAIK you'll need to modify the RTL. Additionally, I don't know (perhaps no one really does) in which locations the RTL finalizes in a way that does not allow reinitialization.

Regards,
Sven