Help Debugging Windows Server 2003 Win32 + Apache2.2 + mod_perl + ActiveState Perl 5.8.8 ErrorID 26
Hi,

I'm trying to debug a persistent error that we have had for a long time but that has now become acute. We always "use strict" and test extensively, but this problem has eluded us so far.

Every now and again (4 or 5 times per day) a process fails with a Windows Error 26, generating a Windows Error Reporting popup on a terminal server console. The user data and the process look as if they completed without any error at the Perl and DB level: only the garbage collection shows an error.

In the past this wasn't so bad, as the Apache2 process died, could be restarted, and no harm was done. Now that the code has been moved from a real to a virtual server (which is allegedly identical but clearly isn't), the httpd.exe process hangs, waiting for user input to confirm the error popup window before it terminates.

We have searched high and low and tried all sorts of registry hacks. We've also written more watchdog code that regularly fetches pages from the server and, if they time out, calls taskkill to recycle the process. That works, but it isn't ideal.

Question 1) Does anyone have a reliable way of ignoring this Windows popup error and letting the httpd.exe process terminate silently, so that our standard process watchdog can restart it?

We have also been looking at the output of a crash dump that we triggered manually with MS userdump.exe. We already have symbol files for Apache2, other modules and mod_perl installed. The error occurs during a string copy in the general destructor process of mod_perl, after our code has completed successfully. Although the Windows error is clearly being triggered by mod_perl, we suspect that the root cause lies much deeper and probably has something to do with Win32::OLE not being thread safe. We can't find any symbol files for Perl itself.

Question 2) Can anyone tell us how to reliably dump trace information at the Perl level, so that we can trace precisely which Perl program (and hopefully which subroutine or method) is triggering the problem?

Again, we've searched high and low for a perl58.pdb and perl.pdb symbol table for the ActiveState Perl build, but we cannot find one anywhere. Does this mean we have to recompile our own custom Perl build to get this information? If so, that would be a real pain, as we would then have altered something on the core system. So can anyone point me at a ready-built set of symbol tables (perl58.pdb) for ActiveState Perl Win32?

If you have any other constructive tips for how to dig into this further, please share.

Thanks

--
View this message in context: http://old.nabble.com/Help-Debugging-Windows-Server-2003-Win32-%2B-Apache2.2-%2B-mod_perl-%2B-Activestate-Pelr-5.8.8-ErrorID-26-tp32004421p32004421.html
Sent from the mod_perl - General mailing list archive at Nabble.com.
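For readers following along, the page-fetching watchdog described in the first post could look roughly like the sketch below. It is not the poster's code: the URL, the timeout, the polling interval and all function names are assumptions made for illustration. The only pieces taken from the post are the idea (fetch a page, treat a timeout as a hang) and the use of taskkill on httpd.exe.

```python
import http.client
import subprocess
import time
import urllib.request
from typing import Callable

# Illustrative assumptions, not values from the original post.
CHECK_URL = "http://localhost/"   # hypothetical page served through mod_perl
TIMEOUT_SECONDS = 10.0            # how long a fetch may take before we give up
POLL_SECONDS = 30.0               # pause between two checks
IMAGE_NAME = "httpd.exe"          # process image named in the post


def fetch_ok(url: str, timeout: float) -> bool:
    """Return True if the page answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Timeouts, refused connections and HTTP error statuses end up here.
        return False


def kill_httpd(image_name: str) -> None:
    """Force-terminate all processes with this image name (Windows only)."""
    # taskkill: /F forces termination, /IM selects processes by image name.
    subprocess.run(["taskkill", "/F", "/IM", image_name], check=False)


def check_once(probe: Callable[[], bool], recycle: Callable[[], None]) -> bool:
    """Run one watchdog cycle and recycle the server if the probe fails."""
    healthy = probe()
    if not healthy:
        recycle()
    return healthy


def run_forever(probe: Callable[[], bool], recycle: Callable[[], None],
                poll_seconds: float = POLL_SECONDS) -> None:
    """Repeat the check at a fixed interval until the script is stopped."""
    while True:
        check_once(probe, recycle)
        time.sleep(poll_seconds)
```

On the server this might be started with `run_forever(lambda: fetch_ok(CHECK_URL, TIMEOUT_SECONDS), lambda: kill_httpd(IMAGE_NAME))`. Keeping the probe and the recycle action as plain callables makes the decision logic testable without a real web server. Something else (the service manager or the existing process watchdog) is assumed to restart Apache after the kill.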
Re: Help Debugging Windows Server 2003 Win32 + Apache2.2 + mod_perl + ActiveState Perl 5.8.8 ErrorID 26
Thanks. Answers below.

awarnier wrote:
> Q1 : are you running Apache as a Windows Service ?
> Q2 : if yes, in the Service properties, is the "allow service to interact with the desktop" checkbox checked ?
> Q3 : still if yes, under what user-id ? is it "Local System", or another local user, or a domain user ?
> Q4 : for such an error, there should be some message in the Windows Event Logs. What does it say there ?

A1. Yes, Apache 2.2 running as a service:
    Apache/2.2.19 (Win32) mod_auth_sspi/1.0.4 mod_perl/2.0.4 Perl/v5.8.8
A2. yes. interactive checkbox is checked
A3. Local System account
A4. Exact text is:
    The instruction at "0x2808625d" referenced memory at "0x13715268". The memory could not be "read".
    Click on OK to terminate the program

    The instruction is usually the same, although sometimes it is "0x2802627f". The referenced memory changes every time.

http://old.nabble.com/file/p32004847/httpd-error.png

Further stack debug that we have below:

    0:999> .reload;!analyze -v;r;kv;lmnt;.logclose
    Loading unloaded module list
    *** WARNING: Unable to verify checksum for perl58.dll
    *** ERROR: Symbol file could not be found. Defaulted to export symbols for perl58.dll -
    *** WARNING: Unable to verify checksum for mod_perl.so
    *** ERROR: Symbol file could not be found. Defaulted to export symbols for mod_perl.so -

    *** Exception Analysis ***

    *** WARNING: Unable to verify checksum for libapr-1.dll
    *** WARNING: Unable to verify checksum for libhttpd.dll
    *** WARNING: Unable to verify checksum for httpd.exe
    *** WARNING: Unable to verify checksum for libaprutil-1.dll
    GetPageUrlData failed, server returned HTTP status 404
    URL requested: http://watson.microsoft.com/StageOne/httpd_exe/2_2_19_0/perl58_dll/5_8_8_822/0008627f.htm?Retriage=1

    FAULTING_IP:
    perl58!Perl_my_strlcpy+91f
    2808627f 832500 and dword ptr ds:[0],0

    EXCEPTION_RECORD: -- (.exr 0x)
    .exr 0x
    ExceptionAddress: 2808627f (perl58!Perl_my_strlcpy+0x091f)
    ExceptionCode: c005 (Access violation)
    ExceptionFlags:
    NumberParameters: 2
    Parameter[0]: 0001
    Parameter[1]:
    Attempt to write to address

    DEFAULT_BUCKET_ID: NULL_POINTER_WRITE
    PROCESS_NAME: httpd.exe
    ERROR_CODE: (NTSTATUS) 0xc005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".
    EXCEPTION_CODE: (NTSTATUS) 0xc005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".
    EXCEPTION_PARAMETER1: 0001
    EXCEPTION_PARAMETER2:
    WRITE_ADDRESS:

    FOLLOWUP_IP:
    perl58!Perl_my_strlcpy+91f
    2808627f 832500 and dword ptr ds:[0],0

    MOD_LIST:
    NTGLOBALFLAG: 0
    APPLICATION_VERIFIER_FLAGS: 0
    FAULTING_THREAD: 2814
    PRIMARY_PROBLEM_CLASS: NULL_POINTER_WRITE
    BUGCHECK_STR: APPLICATION_FAULT_NULL_POINTER_WRITE
    LAST_CONTROL_TRANSFER: from 1000ada9 to 2808627f

    STACK_TEXT:
    WARNING: Stack unwind information not available. Following frames may be wrong.
    10b5fe68 1000ada9 00d360f4 145ce478 00c46da8 perl58!Perl_my_strlcpy+0x91f
    10b5fe80 10001d3b 00d360f4 00d360f4 00d360f4 mod_perl!modperl_perl_destruct+0x5f
    10b5fe98 10001dc6 00c46da8 1000268b 00bdeed0 mod_perl!modperl_interp_destroy+0x1f
    10b5fea0 1000268b 00bdeed0 00bdeec0 00c46da8 mod_perl!modperl_interp_pool_destroy+0x47
    10b5febc 100025a4 00bdeed0 00e8e7e0 00c46da8 mod_perl!modperl_tipool_putback_data+0xfa
    10b5ff00 6eec7fdb 01046b18 01046b48 0105d0d0 mod_perl!modperl_tipool_putback_data+0x13
    10b5ff24 6ff0aa33 01046b08 0001 00bdeb2c libapr_1!apr_pool_destroy+0x3b
    10b5ff3c 6ff04d61 0105d0d0 0105d0d0 0105d0d0 libhttpd!ap_process_http_connection+0x83
    10b5ff54 6ff05023 0105d0d0 008d3550 10b5ff84 libhttpd!ap_run_process_connection+0x21
    10b5ff64 6ff1da2c 0105d0d0 00eeb568 libhttpd!ap_process_connection+0x33
    10b5ff84 77bcb530 0105d0c8 libhttpd!worker_main+0x9c
    10b5ffb8 77e6482f 00e9e5b8 msvcrt!_endthreadex+0xa3
    10b5ffec 77bcb4bc 00e9e5b8 kernel32!BaseThreadStart+0x34

    SYMBOL_STACK_INDEX: 0
    SYMBOL_NAME: perl58!Perl_my_strlcpy+91f
    FOLLOWUP_NAME: MachineOwner
    MODULE_NAME: perl58
    IMAGE_NAME: perl58.dll
    DEBUG_FLR_IMAGE_TIMESTAMP: 46aff16d
    STACK_COMMAND: dt ntdll!LdrpLastDllInitializer BaseDllName ; dt ntdll!LdrpFailureData ; ~999s; .ecxr ; kb
    FAILURE_BUCKET_ID: NULL_POINTER_WRITE_c005_perl58.dll!Perl_my_strlcpy
    BUCKET_ID: APPLICATION_FAULT_NULL_POINTER_WRITE_perl58!Perl_my_strlcp
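When comparing several dumps like the one above, it can help to pull the `module!symbol+offset` part out of each STACK_TEXT line and see which module called into perl58. The sketch below is an illustration only: the `Frame` type, the function names and the 0x400 threshold are all assumptions, and the sample text is a subset of the frames posted above. Because perl58.dll was loaded with export symbols only, the debugger labels an address with the nearest exported name, so a large offset such as +0x91f may mean the fault is really in a neighbouring unexported function rather than in `Perl_my_strlcpy` itself.

```python
import re
from typing import List, NamedTuple, Optional


class Frame(NamedTuple):
    module: str
    symbol: str
    offset: int


# Matches the trailing "module!symbol+0xNN" part of a debugger stack line.
FRAME_RE = re.compile(
    r"(?P<module>[\w.-]+)!(?P<symbol>\w+)\+0x(?P<offset>[0-9a-fA-F]+)\s*$"
)


def parse_frames(stack_text: str) -> List[Frame]:
    """Extract one Frame per stack line, skipping lines without a symbol."""
    frames = []
    for line in stack_text.splitlines():
        match = FRAME_RE.search(line.strip())
        if match:
            frames.append(
                Frame(match["module"], match["symbol"], int(match["offset"], 16))
            )
    return frames


def first_frame_outside(frames: List[Frame], module: str) -> Optional[Frame]:
    """Return the innermost frame that does not belong to `module`."""
    for frame in frames:
        if frame.module != module:
            return frame
    return None


def offset_looks_unreliable(frame: Frame, limit: int = 0x400) -> bool:
    """Heuristic (assumed threshold): a large offset from an export-only
    symbol suggests the name may not be the function that actually faulted."""
    return frame.offset > limit


# A subset of the frames from the dump posted above.
STACK_TEXT = """\
10b5fe68 1000ada9 00d360f4 145ce478 00c46da8 perl58!Perl_my_strlcpy+0x91f
10b5fe80 10001d3b 00d360f4 00d360f4 00d360f4 mod_perl!modperl_perl_destruct+0x5f
10b5fe98 10001dc6 00c46da8 1000268b 00bdeed0 mod_perl!modperl_interp_destroy+0x1f
10b5ff24 6ff0aa33 01046b08 0001 00bdeb2c libapr_1!apr_pool_destroy+0x3b
"""

frames = parse_frames(STACK_TEXT)
caller = first_frame_outside(frames, "perl58")
```

Here `caller` is the `mod_perl!modperl_perl_destruct` frame, which matches the poster's reading that the fault happens while mod_perl is destroying an interpreter.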
Re: Help Debugging Windows Server 2003 Win32 + Apache2.2 + mod_perl + ActiveState Perl 5.8.8 ErrorID 26
awarnier wrote:
>> A2. yes. interactive checkbox is checked
>
> That is why you get this error box.
> Have you tried unchecking that option ?
> (I'm not saying that it will solve the underlying problem, but it may remove the annoying symptom).
>
> On a separate note : since you are running this on virtual machines, you may want to try running Apache in a command window, from the command-line.
>
>> A3. Local System account
>
> That's interesting, considering you are logged-in as LocalSystem, which is not a domain account. mod_auth_sspi is to do Windows domain authentication, isn't it ?

I have now unchecked the "interact with desktop" checkbox, although I thought we needed it for some automation scripts in the past (which ran external Windows programs on the server).

I may also try running Apache from the command line, but then again, as you say, it is then a different user and quite different from running as a service.

This server is not part of a domain. Maybe we just need to remove the mod_auth_sspi module (probably a holdover from another system).
Re: Help Debugging Windows Server 2003 Win32 + Apache2.2 + mod_perl + ActiveState Perl 5.8.8 ErrorID 26
HappyPerlUser wrote:
> Have now unchecked the enable interact with desktop checkbox, although I thought we needed that for some automation scripts in the past (that ran external windows programs on the server).
>
> May also try running Apache from the command line, but then again as you say it is then a different user, and quite different from running as a service.
>
> This server is not part of a domain. Maybe we just need to remove the mod_auth_sspi module {probably a hang over from another system}.

Removed the mod_auth_sspi module. It was a holdover. No change in behavior.

Despite "interact with desktop" being unchecked, the popup messages still seem to be appearing on the admin console (need to confirm this).

Running Apache from the command line with our watchdog disabled did give us some further information.

Started httpd.exe: no output, and the server started normally. Requests were serviced normally. Initially there were 2 Windows processes: PIDs 3748 and 9000.

After the server stalled, the popup appeared on the terminal server and no further requests were serviced. After I hit the OK button on the popup, there were still 2 Windows processes, but now they were PIDs 3748 and 1580.

Again the server stalled and a popup appeared. No web requests were serviced. After I clicked OK, requests were serviced normally again. Now the PIDs are 3748 and 3556.

No messages ever appeared in the Windows cmd window running the httpd.exe process (I presume this means that nothing was sent to stderr or stdout). No crash dump was ever created, even though I had userdump configured to monitor the process.

So it looks like something inside Apache is causing the popup and then attempting to recover internally by starting a new worker process. Could this be related to these config parameters?

    # Turn on library reload mechanism
    PerlModule Apache2::Reload
    PerlInitHandler Apache2::Reload

I'd like mod_perl to auto-reload, but I'd like it to do that without waiting for the OK button on the Windows popup.

regards,
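One general Windows mechanism that bears on the "without waiting for the OK button" wish is the per-process error mode. The sketch below is an untested assumption, not something anyone in this thread tried: a small launcher calls the Win32 `SetErrorMode` function and then starts httpd.exe, relying on the documented behaviour that a child process inherits its parent's error mode unless it is created with `CREATE_DEFAULT_ERROR_MODE`. Whether Apache's own worker process keeps that inherited mode, and whether this helps when Apache runs as a service rather than from a command line, would need to be verified. The function names and the httpd.exe path in the usage note are hypothetical.

```python
import ctypes
import subprocess
import sys
from typing import List

# Flag values as documented for the Win32 SetErrorMode function.
SEM_FAILCRITICALERRORS = 0x0001   # no critical-error message boxes
SEM_NOGPFAULTERRORBOX = 0x0002    # no fault dialog on unhandled exceptions
SEM_NOOPENFILEERRORBOX = 0x8000   # no "file not found" message boxes

QUIET_MODE = (
    SEM_FAILCRITICALERRORS | SEM_NOGPFAULTERRORBOX | SEM_NOOPENFILEERRORBOX
)


def suppress_error_dialogs() -> bool:
    """Set a quiet error mode for this process; do nothing off Windows.

    Returns True if SetErrorMode was called, False on other platforms.
    """
    if sys.platform != "win32":
        return False
    ctypes.windll.kernel32.SetErrorMode(QUIET_MODE)
    return True


def launch_quietly(command: List[str]) -> subprocess.Popen:
    """Start `command` as a child so that it inherits the quiet error mode."""
    suppress_error_dialogs()
    return subprocess.Popen(command)
```

A call such as `launch_quietly([r"C:\Apache2.2\bin\httpd.exe"])` (hypothetical path) would then run Apache under the launcher. If the mode is inherited all the way down, a crashing process should exit instead of blocking on the dialog, which would let the existing process watchdog restart it.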
Re: Help Debugging Windows Server 2003 Win32 + Apache2.2 + mod_perl + ActiveState Perl 5.8.8 ErrorID 26
awarnier wrote:
> What happens if you disable Apache2::Reload ?

I didn't think it was related, and we aren't in the habit of changing the Perl libraries, so I've disabled it for now.

It does seem to be a significant contributing factor. I haven't been able to trigger the problem for 30 minutes now, whereas before I could hammer the server and cause the popup in a matter of minutes.

I can't explain this and can only assume it's down to some sort of race condition. I doubt this is the root cause, but it could reduce the pain.

Thanks.
Re: Help Debugging Windows Server 2003 Win32 + Apache2.2 + mod_perl + ActiveState Perl 5.8.8 ErrorID 26
awarnier wrote:
> What happens if you disable Apache2::Reload ?

48 hours of continuous operation now without a single glitch. I'm almost certain that Apache2::Reload was not the root cause, but it was certainly heavily implicated in the problem and the resulting instability.

If anyone is interested in a crash dump, I can re-enable this module temporarily and try to generate one for you.