Re: IIS Deadlocks happening all the sudden after years of running with no issues

2017-05-22 Thread Dominik Psenner
2017-05-22 23:54 GMT+02:00 Jim Scott :

>
>
> *From:* Dominik Psenner
> *Sent:* Monday, May 22, 2017 1:27 PM
>
>
>
>>  >> Please note that it is known that an application can also cause
>> log4net to deadlock [3].
>> >> [3] https://issues.apache.org/jira/browse/LOG4NET-298
>>
>> This particular application is not using any async/task code so not sure
>> if other scenarios apply?
>>
>
> >> That's not true. All it needs is two threads and the well known dead
> lock candidates (waits, locks, ..). Does your web application pass objects
> on to the logging framework or are all >> arguments converted to immutable
> types (string, int, bool, ..)? If the logging framework calls ToString() of
> an object, beware!
>
> I will have to do some investigating to see. However I know we log
> exceptions using log.Error(“Exception details”, ex) and it seems that every
> time this issue occurs it is happening while attempt to write an exception
> to the exception.log we have configured. Note that according to the stack
> trace it appears that the exceptions log is attempting to be rolled when
> the issue happens. I can also see now that our exceptions log is
> intermittently getting rolled whereas the other logs seem to be rolling as
> expected. One thing different about this logging configuration for
> exceptions is that we are using MinimalLock so that the file can be cleared.
>

Put this on the list of possible causes. I would rather always go with
FileAppender.ExclusiveLock
,
also from a performance point of view. I observed a speedup factor that is
often beyond 20.


>
>
>>
>>
>> We have also installed Debug Diagnostics on the machine to take memory
>> dumps and analyze the results and this made it really easy to determine the
>> deadlock issue and provide the traces previously. If you have any
>> suggestions for further diagnosing it would be appreciated. Was thinking
>> that maybe I would have enabled log4net debugging?
>>
>
> >> That's a great starting point. Maybe something fishy jumps out that
> makes it easier. You could further investigate the behavior of your web
> application when the dead lock happened.
>
> What suggestions might you have with regards to investigating the behavior
> of the web application when the deadlock happened?
>

Looks like you already figured out something I was referring to. Often it
is enough to know what the users did to track down what an application
should do (and possibly should but does not). After all it boils down to a
few scenarios and that helps a lot, just like you pointed out the
MinimalLock for exceptions.


>
> >> Please note also that 1.2.11 is from 2011 and rather old. Can you give
> a newer log4net version a shot?
>
> That was my next move actually. We use 1.2.13.0 on a few other projects
> and since we have been using reliably in those applications was going to
> see if moving to it made any difference. This project has a number of
> dependencies that are all compiled again 1.2.11.0 so was going to use the
> following approach.
>
> Update to 1.2.13.0
> Add this to web.config to redirect to 1.2.13.0
> 
> 
>   
> 
>  newVersion="1.2.13.0"/>
>   
> 
> 
>
>
I did not know this was even possible. :-) The api should still be the
same, so there should be no obstacles unless one dependency uses the strong
name assembly. Then it should (to my knowledge) accept only the one it was
linked against.

-- 
Dominik Psenner


Re: IIS Deadlocks happening all the sudden after years of running with no issues

2017-05-22 Thread Jim Scott


From: Dominik Psenner 
Sent: Monday, May 22, 2017 1:27 PM


   >> Please note that it is known that an application can also cause log4net 
to deadlock [3].
  >> [3] https://issues.apache.org/jira/browse/LOG4NET-298

  This particular application is not using any async/task code so not sure if 
other scenarios apply?

>> That's not true. All it needs is two threads and the well known dead lock 
>> candidates (waits, locks, ..). Does your web application pass objects on to 
>> the logging framework or are all >> arguments converted to immutable types 
>> (string, int, bool, ..)? If the logging framework calls ToString() of an 
>> object, beware!

I will have to do some investigating to see. However I know we log exceptions 
using log.Error(“Exception details”, ex) and it seems that every time this 
issue occurs it is happening while attempt to write an exception to the 
exception.log we have configured. Note that according to the stack trace it 
appears that the exceptions log is attempting to be rolled when the issue 
happens. I can also see now that our exceptions log is intermittently getting 
rolled whereas the other logs seem to be rolling as expected. One thing 
different about this logging configuration for exceptions is that we are using 
MinimalLock so that the file can be cleared.



  We have also installed Debug Diagnostics on the machine to take memory dumps 
and analyze the results and this made it really easy to determine the deadlock 
issue and provide the traces previously. If you have any suggestions for 
further diagnosing it would be appreciated. Was thinking that maybe I would 
have enabled log4net debugging?

>> That's a great starting point. Maybe something fishy jumps out that makes it 
>> easier. You could further investigate the behavior of your web application 
>> when the dead lock happened.

What suggestions might you have with regards to investigating the behavior of 
the web application when the deadlock happened?

>> Please note also that 1.2.11 is from 2011 and rather old. Can you give a 
>> newer log4net version a shot?

That was my next move actually. We use 1.2.13.0 on a few other projects and 
since we have been using reliably in those applications was going to see if 
moving to it made any difference. This project has a number of dependencies 
that are all compiled again 1.2.11.0 so was going to use the following approach.

Update to 1.2.13.0
Add this to web.config to redirect to 1.2.13.0

  
  








Re: IIS Deadlocks happening all the sudden after years of running with no issues

2017-05-22 Thread Dominik Psenner
2017-05-22 18:38 GMT+02:00 Jim Scott :
>
> >> Did any modifications happen lately that you are aware of?
>
> The only thing I can see is that the box has been getting regular windows
> updates which of course includes .NET and OS updates. Could it be that
> something in the framework has changed behavior of something that is now
> creating this issue?
>

Not that I am aware of, but that means nothing.


>  >> Please note that it is known that an application can also cause
> log4net to deadlock [3].
> >> [3] https://issues.apache.org/jira/browse/LOG4NET-298
>
> This particular application is not using any async/task code so not sure
> if other scenarios apply?
>

That's not true. All it needs is two threads and the well known dead lock
candidates (waits, locks, ..). Does your web application pass objects on to
the logging framework or are all arguments converted to immutable types
(string, int, bool, ..)? If the logging framework calls ToString() of an
object, beware!


> >> Further there is also this issue [4].
> >> [4] https://issues.apache.org/jira/browse/LOG4NET-178
>
> We are not seeing an issue with logging stopping and we also changed our
> app pool recycle to a specific time rather than the default of 1720
> minutes. When we have had our deadlock issue it has not been during an app
> pool recycle (unless IIS was trying to kick the process for other reasons,
> although nothing in event viewer indicating something was trying to
> recycle). Anything in this particular bug that you think I might want to
> look at that still might be related even though the symptom is not the same?
>

No idea but at the same time this is not the first thing I would spend time
on.


>
>
> We have also installed Debug Diagnostics on the machine to take memory
> dumps and analyze the results and this made it really easy to determine the
> deadlock issue and provide the traces previously. If you have any
> suggestions for further diagnosing it would be appreciated. Was thinking
> that maybe I would have enabled log4net debugging?
>

That's a great starting point. Maybe something fishy jumps out that makes
it easier. You could further investigate the behavior of your web
application when the dead lock happened.

Please note also that 1.2.11 is from 2011 and rather old. Can you give a
newer log4net version a shot?

Cheers
-- 
Dominik Psenner


Re: IIS Deadlocks happening all the sudden after years of running with no issues

2017-05-22 Thread Jim Scott

Dominik, 

Thank you for your reply. 

>> please don't expect a sensible response within hours, especially with such a 
>> complex problem. 

No problem, I looked at the archive and noticed I was the only post this month 
so was thinking that maybe a newer list existed.

>> :-) At first sight, if it did run for over two years and suddenly keeps 
>> crashing on a daily basis it is 
>> very unlikely that something that was not modified is the actual culprit. 

I agree, nothing in the application has changed as I can see all of the dates 
of the files are from the last deployment in Nov 2015. 

>> Did any modifications happen lately that you are aware of?

The only thing I can see is that the box has been getting regular windows 
updates which of course includes .NET and OS updates. Could it be that 
something in the framework has changed behavior of something that is now 
creating this issue?

>> Please note that IIS has some interesting default behavior, it would for 
>> instance start a new instance of a site before the old site is destroyed. 
>> This means that multiple instances of the same site use the same 
>> configuration and are therefore going to log and roll into the same files. 
>> This is well known [1][2] and has been raised as an issue several times on 
>> this list.
>> The symptoms you see sound very much like that, but note that this is just a 
>> guess. 
>> [1] 
>> https://www.google.it/search?q=log4net+iis+deadlock=log4net+iis+deadlock
>> [2] http://hectorcorrea.com/blog/log4net-thread-safe-but-not-process-safe/17

Yes, I read that article [2] I had our operations group modify IIS set “Disable 
Overlapped Recycle” to true. We are also only running 1 worker process so 
nothing in our install applies to the web garden scenario. Note that I figured 
this might be the culprit but we have had the same deadlock happen after this 
change.

>> Please note that it is known that an application can also cause log4net to 
>> deadlock [3]. 
>> [3] https://issues.apache.org/jira/browse/LOG4NET-298

This particular application is not using any async/task code so not sure if 
other scenarios apply?

>> Further there is also this issue [4].
>> [4] https://issues.apache.org/jira/browse/LOG4NET-178

We are not seeing an issue with logging stopping and we also changed our app 
pool recycle to a specific time rather than the default of 1720 minutes. When 
we have had our deadlock issue it has not been during an app pool recycle 
(unless IIS was trying to kick the process for other reasons, although nothing 
in event viewer indicating something was trying to recycle). Anything in this 
particular bug that you think I might want to look at that still might be 
related even though the symptom is not the same?


We have also installed Debug Diagnostics on the machine to take memory dumps 
and analyze the results and this made it really easy to determine the deadlock 
issue and provide the traces previously. If you have any suggestions for 
further diagnosing it would be appreciated. Was thinking that maybe I would 
have enabled log4net debugging?

Re: IIS Deadlocks happening all the sudden after years of running with no issues

2017-05-22 Thread Dominik Psenner

Hi Jim,

please don't expect a sensible response within hours, especially with 
such a complex problem. :-) At first sight, if it did run for over two 
years and suddenly keeps crashing on a daily basis it is very unlikely 
that something that was not modified is the actual culprit. Did any 
modifications happen lately that you are aware of?


Please note that IIS has some interesting default behavior, it would for 
instance start a new instance of a site before the old site is 
destroyed. This means that multiple instances of the same site use the 
same configuration and are therefore going to log and roll into the same 
files. This is well known [1] [2] and has been raised as an issue 
several times on this list. The symptoms you see sound very much like 
that, but note that this is just a guess. Please note that it is known 
that an application can also cause log4net to deadlock [3]. Further 
there is also this issue [4].


I hope this gives you enough information to dig deeper. Let us know what 
the actual root cause was and how you solved it. We may be able to 
condense documentation that helps future readers.


Cheers,
Dominik

[1] 
https://www.google.it/search?q=log4net+iis+deadlock=log4net+iis+deadlock

[2] http://hectorcorrea.com/blog/log4net-thread-safe-but-not-process-safe/17
[3] https://issues.apache.org/jira/browse/LOG4NET-298
[4] https://issues.apache.org/jira/browse/LOG4NET-178

On 2017-05-20 21:26, Jim Scott wrote:

Is this list still active? Would this be better to post to the dev list?

On May 19, 2017, at 3:45 PM, Jim Scott > wrote:


We have a webservice that has been running on Windows 2008R2 since 
Nov 2015 and has not had any issues until just a week or so ago. The 
webservice becomes unresponsive and after installing some debug tools 
I can see that we have a deadlock issue.
Here is some details of the last dump. Note we are running the 
project under .NET 4 and have .NET 2 dependencies in the project that 
depend on the .NET 2 1.2.11.0 version of log4net. I can see also on 
the stack trace that we had a few exceptions happen and I assume the 
blocking thread was attempting to roll our Exceptions.log file 
because the create time on the file was from yesterday. At the bottom 
of the email is a copy of our logging configuration which we wire up 
during Application_Start.
We thought it might have originally been related to permissions as 
someone had deleted the exception.log and recreated it and it was 
owned by them but we fixed that last night and we had another crash 
this morning. I am stumped since no code has changed on this box and 
nothing unusual about the type of activity. Maybe something in a .NET 
patch changed behavior?

Any assistance would be appreciated.
Thanks
Jim
Thread that appears to be blocking everything
Entry point *clr!Thread::intermediateThreadProc*
Create time *5/19/2017 1:05:32 PM*
Time spent in user mode *0 Days 00:10:06.359*
Time spent in kernel mode   *0 Days 00:02:51.203*



This thread is trying to acquire a System.Threading.ReaderWriterLock 
for *reading*


*The thread has evidence of _.net exceptions_ on the stack. Check the 
Previous .NET Exceptions Report (Exceptions in all .NET Thread 
Stacks) to view more details of the associated exception *


*.NET Call Stack*


[[HelperMethodFrame_1OBJ] (System.Array.Copy)] 
System.Array.Copy(System.Array, Int32, System.Array, Int32, Int32, 
Boolean)

mscorlib_ni!System.Array.Copy(System.Array, System.Array, Int32)+3e
mscorlib_ni!System.Security.Permissions.FileIOPermission.AddPathList(System.Security.Permissions.FileIOPermissionAccess, 
System.Security.AccessControl.AccessControlActions, System.String[], 
Boolean, Boolean, Boolean)+5a
mscorlib_ni!System.Security.Permissions.FileIOPermission..ctor(System.Security.Permissions.FileIOPermissionAccess, 
System.String[], Boolean, Boolean)+2a

mscorlib_ni!System.IO.Path.GetFullPath(System.String)+5c
mscorlib_ni!System.IO.Path.GetDirectoryName(System.String)+a2
log4net.Appender.RollingFileAppender.CombinePath(System.String, 
System.String)+38

log4net.Appender.RollingFileAppender.RollOverTime(Boolean)+16c
log4net.Appender.RollingFileAppender.AdjustFileBeforeAppend()+d3
log4net.Appender.RollingFileAppender.Append(log4net.Core.LoggingEvent)+13 


log4net.Appender.AppenderSkeleton.DoAppend(log4net.Core.LoggingEvent)+f0
log4net.Util.AppenderAttachedImpl.AppendLoopOnAppenders(log4net.Core.LoggingEvent)+a0 

log4net.Repository.Hierarchy.Logger.CallAppenders(log4net.Core.LoggingEvent)+82 

log4net.Repository.Hierarchy.Logger.ForcedLog(System.Type, 
log4net.Core.Level, System.Object, System.Exception)+4a
log4net.Repository.Hierarchy.Logger.Log(System.Type, 
log4net.Core.Level, System.Object, System.Exception)+64

log4net.Core.LogImpl.ErrorFormat(System.String, System.Object[])+6a


Example thread blocked


  Thread 36 - System ID 7332

Entry point *clr!Thread::intermediateThreadProc*
Create 

Re: IIS Deadlocks happening all the sudden after years of running with no issues

2017-05-20 Thread Jim Scott
Is this list still active? Would this be better to post to the dev list?

> On May 19, 2017, at 3:45 PM, Jim Scott  wrote:
> 
> We have a webservice that has been running on Windows 2008R2 since Nov 2015 
> and has not had any issues until just a week or so ago. The webservice 
> becomes unresponsive and after installing some debug tools I can see that we 
> have a deadlock issue.
>  
> Here is some details of the last dump. Note we are running the project under 
> .NET 4 and have .NET 2 dependencies in the project that depend on the .NET 2 
> 1.2.11.0 version of log4net. I can see also on the stack trace that we had a 
> few exceptions happen and I assume the blocking thread was attempting to roll 
> our Exceptions.log file because the create time on the file was from 
> yesterday. At the bottom of the email is a copy of our logging configuration 
> which we wire up during Application_Start.
>  
> We thought it might have originally been related to permissions as someone 
> had deleted the exception.log and recreated it and it was owned by them but 
> we fixed that last night and we had another crash this morning. I am stumped 
> since no code has changed on this box and nothing unusual about the type of 
> activity. Maybe something in a .NET patch changed behavior?
>  
> Any assistance would be appreciated.
>  
> Thanks
> Jim
>  
> Thread that appears to be blocking everything
>  
> Entry point clr!Thread::intermediateThreadProc
> Create time 5/19/2017 1:05:32 PM
> Time spent in user mode 0 Days 00:10:06.359
> Time spent in kernel mode   0 Days 00:02:51.203
> 
> 
> This thread is trying to acquire a System.Threading.ReaderWriterLock for 
> reading
> 
> The thread has evidence of .net exceptions on the stack. Check the Previous 
> .NET Exceptions Report (Exceptions in all .NET Thread Stacks) to view more 
> details of the associated exception 
> 
> .NET Call Stack
> 
> [[HelperMethodFrame_1OBJ] (System.Array.Copy)] 
> System.Array.Copy(System.Array, Int32, System.Array, Int32, Int32, Boolean)
> mscorlib_ni!System.Array.Copy(System.Array, System.Array, Int32)+3e
> mscorlib_ni!System.Security.Permissions.FileIOPermission.AddPathList(System.Security.Permissions.FileIOPermissionAccess,
>  System.Security.AccessControl.AccessControlActions, System.String[], 
> Boolean, Boolean, Boolean)+5a
> mscorlib_ni!System.Security.Permissions.FileIOPermission..ctor(System.Security.Permissions.FileIOPermissionAccess,
>  System.String[], Boolean, Boolean)+2a
> mscorlib_ni!System.IO.Path.GetFullPath(System.String)+5c
> mscorlib_ni!System.IO.Path.GetDirectoryName(System.String)+a2
> log4net.Appender.RollingFileAppender.CombinePath(System.String, 
> System.String)+38
> log4net.Appender.RollingFileAppender.RollOverTime(Boolean)+16c
> log4net.Appender.RollingFileAppender.AdjustFileBeforeAppend()+d3
> log4net.Appender.RollingFileAppender.Append(log4net.Core.LoggingEvent)+13
> log4net.Appender.AppenderSkeleton.DoAppend(log4net.Core.LoggingEvent)+f0
> log4net.Util.AppenderAttachedImpl.AppendLoopOnAppenders(log4net.Core.LoggingEvent)+a0
> log4net.Repository.Hierarchy.Logger.CallAppenders(log4net.Core.LoggingEvent)+82
> log4net.Repository.Hierarchy.Logger.ForcedLog(System.Type, 
> log4net.Core.Level, System.Object, System.Exception)+4a
> log4net.Repository.Hierarchy.Logger.Log(System.Type, log4net.Core.Level, 
> System.Object, System.Exception)+64
> log4net.Core.LogImpl.ErrorFormat(System.String, System.Object[])+6a
>  
> 
> Example thread blocked
>  
> Thread 36 - System ID 7332
> 
> Entry point clr!Thread::intermediateThreadProc
> Create time 5/19/2017 1:07:12 PM
> Time spent in user mode 0 Days 00:00:00.421
> Time spent in kernel mode   0 Days 00:00:00.109
> 
> 
> This thread is waiting to enter a .NET Lock
> 
> The thread has evidence of .net exceptions on the stack. Check the Previous 
> .NET Exceptions Report (Exceptions in all .NET Thread Stacks) to view more 
> details of the associated exception 
> 
> .NET Call Stack
> 
> [[GCFrame]]
> [[HelperMethodFrame_1OBJ] (System.Threading.Monitor.Enter)] 
> System.Threading.Monitor.Enter(System.Object)
> log4net.Appender.AppenderSkeleton.DoAppend(log4net.Core.LoggingEvent)+2b
> log4net.Util.AppenderAttachedImpl.AppendLoopOnAppenders(log4net.Core.LoggingEvent)+a0
> log4net.Repository.Hierarchy.Logger.CallAppenders(log4net.Core.LoggingEvent)+82
> log4net.Repository.Hierarchy.Logger.ForcedLog(System.Type, 
> log4net.Core.Level, System.Object, System.Exception)+4a
> log4net.Repository.Hierarchy.Logger.Log(System.Type, log4net.Core.Level, 
> System.Object, System.Exception)+64
> log4net.Core.LogImpl.InfoFormat(System.String, System.Object[])+6a
>  
>  
> Configuration file
>  
> 
> 
>  type="log4net.Appender.RollingFileAppender">
> 
>
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
>