Help Debugging Windows Server 2003 Win32 + Apache2.2 + mod_perl + ActiveState Perl 5.8.8 ErrorID 26

2011-07-06 Thread HappyPerlUser

Hi, I'm trying to debug a persistent error that we've had for a long time, but
has now become acute.

We always "use strict" and do extensive testing, but this problem has eluded
us so far.

Every now and again (4 or 5 times per day) a process would fail with a
Windows Error 26, generating a Windows Error Reporting popup on a terminal
server console. The user data and process look like they have completed
without any error at the Perl and DB level: it's only the garbage collection
that's showing an error.

In the past this wasn't so bad, as the Apache2 process died and could be
restarted and no harm was done. Now that the code has been moved from a real
to a virtual server (which is allegedly identical but clearly isn't) the
httpd.exe process is hanging awaiting user input to confirm the error popup
window before terminating.

We have searched high and low and tried all sorts of registry hacks.

We've also written more watchdog code to regularly fetch pages from the
server and if they time out to call taskkill to recycle the process, which
although it works, isn't ideal.
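
A minimal sketch of the kind of watchdog described above, written in Python
purely for illustration (the language of the real watchdog is not stated).
The probe URL, the timeout, and the function names page_responds, kill_httpd
and check_once are hypothetical; taskkill with /F and /IM is the standard
Windows command for forcibly ending processes by image name.

```python
import subprocess
import urllib.error
import urllib.request

# Hypothetical values for illustration; adjust to the real server.
PROBE_URL = "http://localhost/"
TIMEOUT_SECONDS = 10


def page_responds(url=PROBE_URL, timeout=TIMEOUT_SECONDS):
    """Return True if the server answers at all within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # An HTTP error status still proves the server is answering.
        return True
    except OSError:
        # Covers timeouts, refused connections and other socket errors.
        return False


def kill_httpd():
    """Forcibly end every httpd.exe so a process watchdog can restart it."""
    subprocess.call(["taskkill", "/F", "/IM", "httpd.exe"])


def check_once(probe=page_responds, kill=kill_httpd):
    """Run one watchdog cycle; return True if a kill was triggered."""
    if probe():
        return False
    kill()
    return True
```

Calling check_once() from a scheduled task every minute or so would
approximate the behaviour described. Killing every httpd.exe is blunt, which
is part of why this approach isn't ideal.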

Question 1) Does anyone have a reliable way of ignoring this Windows popup
error and letting the httpd.exe process terminate silently so that our
standard process watchdog can restart it?
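
One documented Windows mechanism that bears on Question 1 is the per-process
error mode: SetErrorMode with SEM_NOGPFAULTERRORBOX suppresses the
unhandled-exception dialog, and a child process inherits the error mode of
its parent by default. The sketch below is only an illustration in Python:
the launcher approach, the function names and the httpd.exe path are
assumptions, and it has not been tried with Apache started as a service
(where the Service Control Manager, not this launcher, is the parent).

```python
import ctypes
import subprocess
import sys

# Flag values from the Windows SDK headers.
SEM_FAILCRITICALERRORS = 0x0001
SEM_NOGPFAULTERRORBOX = 0x0002


def quiet_error_mode():
    """Flags that suppress the critical-error and GP-fault dialogs."""
    return SEM_FAILCRITICALERRORS | SEM_NOGPFAULTERRORBOX


def launch_quietly(command):
    """Set this process's error mode, then start command as a child.

    The child inherits the error mode, so an unhandled exception should
    terminate it without waiting for someone to click OK.
    """
    if sys.platform != "win32":
        raise RuntimeError("SetErrorMode is only available on Windows")
    ctypes.windll.kernel32.SetErrorMode(quiet_error_mode())
    return subprocess.Popen(command)


if __name__ == "__main__" and sys.platform == "win32":
    # Hypothetical install path; substitute the real one.
    launch_quietly([r"C:\Apache2.2\bin\httpd.exe"]).wait()
```

This only hides the symptom: the access violation still happens, but the
process should exit instead of hanging on the popup.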


We have also been looking at the output of a crash dump that we triggered
manually with MS userdump.exe. We already have symbol files for Apache2,
other modules and mod_perl installed. The error occurs during a string copy
in the general destructor process of mod_perl, after our code has
successfully completed.

Although the Windows error is clearly being triggered by mod_perl, we
suspect that the root cause lies much deeper and is probably something to do
with Win32::OLE not being thread safe.

We can't find any symbol files for Perl itself.

Question 2) Can anyone tell us how to reliably dump trace information at the
perl level so we can trace precisely which perl program (and hopefully which
subroutine or method) is triggering the problem?

Again, we've searched high and low for a perl58.pdb or perl.pdb symbol file
for the ActiveState Perl build but we cannot find one anywhere. Would this
mean we have to compile our own custom Perl build to get this info? If so,
that would be a real pain, as we would then have altered something on the
core system.

So can anyone point me at a ready-built symbol file (perl58.pdb) for
ActiveState Perl Win32?

If you have any other constructive tips for how to dig into this further,
please share.

Thanks
-- 
View this message in context: 
http://old.nabble.com/Help-Debugging-Windows-Server-2003-Win32-%2B-Apache2.2-%2B-mod_perl-%2B-Activestate-Pelr-5.8.8-ErrorID-26-tp32004421p32004421.html
Sent from the mod_perl - General mailing list archive at Nabble.com.



Re: Help Debugging Windows Server 2003 Win32 + Apache2.2 + mod_perl + ActiveState Perl 5.8.8 ErrorID 26

2011-07-06 Thread HappyPerlUser

Thanks. Answers below.


awarnier wrote:
> 
> Q1 : are you running Apache as a Windows Service ?
> Q2 : if yes, in the Service properties, is the "allow service to interact
> with the 
> desktop" checkbox checked ?
> Q3 : still if yes, under what user-id ? is it "Local System", or another
> local user, or a 
> domain user ?
> Q4 : for such an error, there should be some message in the Windows Event
> Logs.
> What does it say there ?
> 
> 
A1. Yes, Apache 2.2 is running as a service:
Apache/2.2.19 (Win32) mod_auth_sspi/1.0.4 mod_perl/2.0.4 Perl/v5.8.8

A2. Yes, the interactive checkbox is checked.

A3. Local System account.

A4. The exact text is:
The instruction at "0x2808625d" referenced memory at "0x13715268". The
memory could not be "read".
Click on OK to terminate the program

The instruction is usually the same, although sometimes it is "0x2802627f"
The referenced memory changes every time.


http://old.nabble.com/file/p32004847/httpd-error.png 

Further stack debug output that we have is below:

0:999> .reload;!analyze -v;r;kv;lmnt;.logclose


Loading unloaded module list

*** WARNING: Unable to verify checksum for perl58.dll
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for
perl58.dll - 
*** WARNING: Unable to verify checksum for mod_perl.so
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for
mod_perl.so - 
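*******************************************************************************
*                                                                             *
*                        Exception Analysis                                   *
*                                                                             *
*******************************************************************************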

*** WARNING: Unable to verify checksum for libapr-1.dll
*** WARNING: Unable to verify checksum for libhttpd.dll
*** WARNING: Unable to verify checksum for httpd.exe
*** WARNING: Unable to verify checksum for libaprutil-1.dll
GetPageUrlData failed, server returned HTTP status 404
URL requested:
http://watson.microsoft.com/StageOne/httpd_exe/2_2_19_0/perl58_dll/5_8_8_822/0008627f.htm?Retriage=1

FAULTING_IP: 
perl58!Perl_my_strlcpy+91f
2808627f 83250000000000  and dword ptr ds:[0],0

EXCEPTION_RECORD:  ffffffff -- (.exr 0xffffffffffffffff)
.exr 0xffffffffffffffff
ExceptionAddress: 2808627f (perl58!Perl_my_strlcpy+0x0000091f)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 00000001
   Parameter[1]: 00000000
Attempt to write to address 00000000

DEFAULT_BUCKET_ID:  NULL_POINTER_WRITE

PROCESS_NAME:  httpd.exe

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced
memory at "0x%08lx". The memory could not be "%s".

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx"
referenced memory at "0x%08lx". The memory could not be "%s".

EXCEPTION_PARAMETER1:  00000001

EXCEPTION_PARAMETER2:  00000000

WRITE_ADDRESS:  00000000

FOLLOWUP_IP: 
perl58!Perl_my_strlcpy+91f
2808627f 83250000000000  and dword ptr ds:[0],0

MOD_LIST: 

NTGLOBALFLAG:  0

APPLICATION_VERIFIER_FLAGS:  0

FAULTING_THREAD:  2814

PRIMARY_PROBLEM_CLASS:  NULL_POINTER_WRITE

BUGCHECK_STR:  APPLICATION_FAULT_NULL_POINTER_WRITE

LAST_CONTROL_TRANSFER:  from 1000ada9 to 2808627f

STACK_TEXT:  
WARNING: Stack unwind information not available. Following frames may be wrong.
10b5fe68 1000ada9 00d360f4 145ce478 00c46da8 perl58!Perl_my_strlcpy+0x91f
10b5fe80 10001d3b 00d360f4 00d360f4 00d360f4 mod_perl!modperl_perl_destruct+0x5f
10b5fe98 10001dc6 00c46da8 1000268b 00bdeed0 mod_perl!modperl_interp_destroy+0x1f
10b5fea0 1000268b 00bdeed0 00bdeec0 00c46da8 mod_perl!modperl_interp_pool_destroy+0x47
10b5febc 100025a4 00bdeed0 00e8e7e0 00c46da8 mod_perl!modperl_tipool_putback_data+0xfa
10b5ff00 6eec7fdb 01046b18 01046b48 0105d0d0 mod_perl!modperl_tipool_putback_data+0x13
10b5ff24 6ff0aa33 01046b08 00000001 00bdeb2c libapr_1!apr_pool_destroy+0x3b
10b5ff3c 6ff04d61 0105d0d0 0105d0d0 0105d0d0 libhttpd!ap_process_http_connection+0x83
10b5ff54 6ff05023 0105d0d0 008d3550 10b5ff84 libhttpd!ap_run_process_connection+0x21
10b5ff64 6ff1da2c 0105d0d0 00eeb568 00000000 libhttpd!ap_process_connection+0x33
10b5ff84 77bcb530 0105d0c8 00000000 00000000 libhttpd!worker_main+0x9c
10b5ffb8 77e6482f 00e9e5b8 00000000 00000000 msvcrt!_endthreadex+0xa3
10b5ffec 00000000 77bcb4bc 00e9e5b8 00000000 kernel32!BaseThreadStart+0x34


SYMBOL_STACK_INDEX:  0

SYMBOL_NAME:  perl58!Perl_my_strlcpy+91f

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: perl58

IMAGE_NAME:  perl58.dll

DEBUG_FLR_IMAGE_TIMESTAMP:  46aff16d

STACK_COMMAND:  dt ntdll!LdrpLastDllInitializer BaseDllName ; dt
ntdll!LdrpFailureData ; ~999s; .ecxr ; kb

FAILURE_BUCKET_ID:  NULL_POINTER_WRITE_c0000005_perl58.dll!Perl_my_strlcpy

BUCKET_ID:  APPLICATION_FAULT_NULL_POINTER_WRITE_perl58!Perl_my_strlcp
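
Because only export symbols were available for perl58.dll, the debugger names
the nearest preceding exported symbol. So Perl_my_strlcpy+0x91f places the
fault 0x91f bytes past that export, quite possibly inside a different,
non-exported function. The arithmetic below is a small Python sketch using
only addresses quoted in this message. It assumes that the last component of
the Watson URL (0008627f) is the offset of the fault within perl58.dll.

```python
# Addresses copied from the popup text and the !analyze output above.
FAULT_IN_DUMP = 0x2808627F      # FAULTING_IP
FAULT_IN_POPUP = 0x2808625D     # "The instruction at ..." in the popup
SYMBOL_OFFSET = 0x91F           # from perl58!Perl_my_strlcpy+0x91f
MODULE_OFFSET = 0x0008627F      # assumed: offset field of the Watson URL

# Address of the exported symbol the debugger fell back to.
nearest_export = FAULT_IN_DUMP - SYMBOL_OFFSET

# Load address of perl58.dll implied by the Watson offset.
module_base = FAULT_IN_DUMP - MODULE_OFFSET

# Distance between the two faulting instructions that were reported.
gap = FAULT_IN_DUMP - FAULT_IN_POPUP

print(hex(nearest_export))  # 0x28085960
print(hex(module_base))     # 0x28000000
print(gap)                  # 34
```

The two reported addresses are only 34 bytes apart, so they likely belong to
the same routine. A PDB or map file for this exact perl58.dll build would be
needed to name it.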

Re: Help Debugging Windows Server 2003 Win32 + Apache2.2 + mod_perl + ActiveState Perl 5.8.8 ErrorID 26

2011-07-06 Thread HappyPerlUser



awarnier wrote:
> 
> 
>> 
>> A2. yes. interactive checkbox is checked
> 
> That is why you get this error box.
> Have you tried unchecking that option ?
> (I'm not saying that it will solve the underlying problem, but it may
> remove the annoying 
> symptom).
> 
> On a separate note : since you are running this on virtual machines, you
> may want to try 
> running Apache in a command window, from the command-line.
> 
>> 
>> A3. Local System account
> 
> That's interesting, considering you are logged-in as LocalSystem, which is
> not a domain 
> account. mod_auth_sspi is to do Windows domain authentication, isn't it ?
> 
Have now unchecked the "allow service to interact with desktop" checkbox,
although I thought we needed that for some automation scripts in the past
(that ran external Windows programs on the server).

May also try running Apache from the command line, but then again, as you
say, it then runs as a different user, which is quite different from running
as a service.

This server is not part of a domain. Maybe we just need to remove the
mod_auth_sspi module (probably a hangover from another system).



Re: Help Debugging Windows Server 2003 Win32 + Apache2.2 + mod_perl + ActiveState Perl 5.8.8 ErrorID 26

2011-07-07 Thread HappyPerlUser



HappyPerlUser wrote:
> 
> Have now unchecked the "allow service to interact with desktop" checkbox,
> although I thought we needed that for some automation scripts in the past
> (that ran external Windows programs on the server).
> 
> May also try running Apache from the command line, but then again, as you
> say, it then runs as a different user, which is quite different from
> running as a service.
> 
> This server is not part of a domain. Maybe we just need to remove the
> mod_auth_sspi module (probably a hangover from another system).
> 

Removed the mod_auth_sspi module. It was a hangover. No change in behavior.

Despite "allow service to interact with desktop" being unchecked, the popup
messages still seem to be appearing on the admin console (need to confirm
this).

Running Apache from the command line and disabling our watchdog did give us
some further information.

Started httpd.exe: no output and normal service started. Requests were
serviced normally.

Initially there were 2 Windows processes: PIDs 3748 and 9000.

After the server stalled, the popup appeared on the terminal server and no
further requests were serviced.

Upon hitting the OK button on the popup, there were still 2 Windows
processes, but now they were PIDs 3748 and 1580.

Again the server stalled and a popup appeared. No web requests were
serviced. After clicking OK, requests were serviced normally again. Now the
PIDs are 3748 and 3556.
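
The PID changes noted above could be logged automatically rather than read
off by hand. A rough Python sketch, assuming tasklist and its /FI, /FO CSV
and /NH switches are available on the server; the function names are made up
for illustration.

```python
import csv
import io
import subprocess


def parse_pids(tasklist_csv):
    """Extract the PID column from `tasklist /FO CSV /NH` output."""
    pids = set()
    for row in csv.reader(io.StringIO(tasklist_csv)):
        # Rows look like: "httpd.exe","3748","Console","0","12,345 K"
        if len(row) >= 2 and row[1].isdigit():
            pids.add(int(row[1]))
    return pids


def httpd_pids():
    """Ask Windows for the PIDs of all running httpd.exe processes."""
    output = subprocess.check_output(
        ["tasklist", "/FI", "IMAGENAME eq httpd.exe", "/FO", "CSV", "/NH"],
        universal_newlines=True,
    )
    return parse_pids(output)


def describe_change(before, after):
    """Report which PIDs disappeared and which are new."""
    return sorted(before - after), sorted(after - before)
```

With the PIDs from this message, describe_change({3748, 9000}, {3748, 1580})
returns ([9000], [1580]). That is consistent with the Apache parent process
(3748) surviving while its single child process is replaced, which is how
the Windows MPM is structured.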

No messages ever appeared in the Windows cmd window running the httpd.exe
process (I presume this means that nothing was sent to stderr or stdout).

No crash dump was ever created, even though I had userdump configured to
monitor the process.

So it looks like something inside Apache is causing the popup and then
attempting to recover internally by regenerating a new worker process.

Could this be related to these config parameters?
# Turn on library reload mechanism
PerlModule Apache2::Reload
PerlInitHandler Apache2::Reload

I'd like mod_perl to auto-reload, but I'd like it to do that without waiting
for the OK button on the Windows popup.

regards,



Re: Help Debugging Windows Server 2003 Win32 + Apache2.2 + mod_perl + ActiveState Perl 5.8.8 ErrorID 26

2011-07-07 Thread HappyPerlUser


awarnier wrote:
> 
> 
> 
> 
> What happens if you disable Apache2::Reload ?
> 
> 

Didn't think it was related, and we aren't in the habit of changing the Perl
libraries, so I've disabled it for now. This does seem to have a significant
effect. I haven't been able to trigger the problem for 30 minutes now,
whereas before I could hammer the server and cause the popup in a matter of
minutes. I can't explain this, and can only assume it's down to some sort of
race condition. I doubt this is the root cause, but it could reduce the
pain. Thanks.



Re: Help Debugging Windows Server 2003 Win32 + Apache2.2 + mod_perl + ActiveState Perl 5.8.8 ErrorID 26

2011-07-09 Thread HappyPerlUser


awarnier wrote:
> 
> 
> 
> 
> What happens if you disable Apache2::Reload ?
> 
> 
48 hours of continuous operation now without a single glitch. Almost
certain that Apache2::Reload was not the root cause, but it was certainly
heavily implicated in the problem and the resulting instability. If anyone's
interested in a crash dump, I can re-enable this module temporarily and try
to generate one for you.
