Re: Experiencing issue where CPU is pinned at 100% - more info

2013-11-26 Thread Dave Kelsey
Had this been filed as a bug?

From: Administrator da...@gamehouse.commailto:da...@gamehouse.com
Date: Friday, November 22, 2013 at 2:29 PM
To: users@subversion.apache.orgmailto:users@subversion.apache.org 
users@subversion.apache.orgmailto:users@subversion.apache.org
Subject: Experiencing issue where CPU is pinned at 100% - more info


Config:

OS:  Windows 2012 Server 64bit
1TB SSD
8GiB RAM

Software version4.0.2-3732.117
Subversion version  1.8.3-3732.117
Running since   11/21/2013 14:06:28 PST
Repo health Total repositories: 2 - Go to 
the Repositories tab to create or discover
repositories.
Throughput on primary interface 0.0 B/s IN (over about 5 minutes); 0.0 B/s OUT 
(over about 5 minutes)
Disk Usage  as of 11/22/2013 11:10:31 PST
Used space on root volume   565.43 GB
Used space by repositories  138.65 GB
Free space on repository volume 328.81 GB
Authenticating with LDAP AD

~45 active user



We migrated Two days ago.
I dumped the repo to a new physical machine and imported it through the Web 
Interface.
Every day, at some point, the repository becomes inaccessible.
I RDC to the box and Apache is using 100% of the CPU.  The entire machine is 
performing ridiculously slow.

A restart of the Service restores CPU to normal and everyone carries on until 
the next occurrence.

I parsed the error log for yesterday counting occurrences of all AH errors:

occurrences: DEBUG Code

545112  AH01626:
206385  AH01691:
206377  AH01697:
139630  AH01384:
24  AH00361:
  4  AH01316:
  2  AH01320:
  2  AH01318:
  2  AH00408:
  2  AH00403:
  2  AH00348:
  2  AH00334:
  1  AH01284:
  1  AH00456:
  1  AH00455:
  1  AH00453:
  1  AH00431:
  1  AH00422:
  1  AH00418:
  1  AH00411:
  1  AH00407:
  1  AH00402:
  1  AH00391:
  1  AH00359:
  1  AH00357:
  1  AH00354:
  1  AH00352:
  1  AH00096:
  1  AH00094:

I looked at the error log unto the point of the restart.  I could not see 
anything obvious.
I also look in the event viewer and see nothing.

Dave Kelsey


Re: Experiencing issue where CPU is pinned at 100% - more info

2013-11-26 Thread Ben Reser
On 11/22/13 2:29 PM, Dave Kelsey wrote:
 Software version  4.0.2-3732.117

What is this a software version of?

You didn't mention what version of Apache httpd you're using.  Though I'm
guessing 2.4.x given the log message codes you gave below.  It's possible this
is actually a httpd issue and not a svn issue.

 I parsed the error log for yesterday counting occurrences of all AH errors:
 
 occurrences: DEBUG Code
 
 545112  AH01626:

Unimportant, just logging the result of the authorization hook in 
mod_authz_core.c

 206385  AH01691:

Client didn't send a password log message (mod_authnz_ldap.c).  Probably normal
since client doesn't authenticate until it's asked for authentication.

 206377  AH01697:

access granted in mod_authnz_ldap.c

 139630  AH01384:

mod_deflate.c telling how much it compressed things.

 24  AH00361:

From mpm/winnt/child.c (no idea what this means)
/* Shutdown the worker threads
 * Post worker threads blocked on the ThreadDispatch IOCompletion port

Could be normal I'm not super knowledgeable about the winnt mpm.

   4  AH01316:

Some sort of message LDAP caching modules/ldap/util_ldap.c

   2  AH01320:

LDAP SSL support unavailable modules/ldap/until_ldap.c (this looks suspicious
to me)

   2  AH01318:

logging the LDAP SDK used.  modules/ldap/util_ldap.c

   2  AH00408:

Looks like a normal operation of winnt mpm with duplicating a socket.
mpm/winnt/mpm_winnt.c

   2  AH00403:

Looks like a normal operation of winnt mpm, waiting for data on a socket.
mpm/winnt/mpm_winnt.c

   2  AH00348:

Child: Accept thread exiting.  Could be just part of a normal shutdown.
mpm/winnt/child.c

   2  AH00334:

Child: Listening on an address with a filter.  Normal operation
mpm/winnt/child.c

   1  AH01284:

ldap timed out
modules/ldap/util_ldap.c

   1  AH00456:

logging date and time the server was built.
mpm/winnt/mpm_winnt.c

   1  AH00455:

startup finished (resuming normal operation message)
mpm/winnt/mpm_winnt.c

   1  AH00453:

Child process startup, normal operation
mpm/winnt/mpm_winnt.c

   1  AH00431:

Parent forcing child to terminate, part of shutdown.
mpm/winnt/mpm_winnt.c

   1  AH00422:

shutdown signal received, part of shutdown
mpm/winnt/mpm_winnt.c

   1  AH00418:

Created child process, normal operation
mpm/winnt/mpm_winnt.c

   1  AH00411:

More normal operation of the socket duplication
mpm/winnt/mpm_winnt.c

   1  AH00407:

Child side of the previous message, child got the duplicated sockets.
mpm/winnt/mpm_winnt.c

   1  AH00402:

Sent scoreboard handle to child, normal operation
mpm/winnt/mpm_winnt.c

   1  AH00391:

Child got the soreboard from the previous message
mpm/winnt/mpm_winnt.c

   1  AH00359:

Child released start mutex, normal operation
mpm/winnt/mpm_winnt.c

   1  AH00357:

Child got an exit signal, part of shutdown
mpm/winnt/child.c

   1  AH00354:

Creation of worker threads in child, normal operation
mpm/winnt/child.c

   1  AH00352:

Child got the start mutix, normal operation
mpm/winnt/child.c

   1  AH00096:

removed pid file, part of shutdown
log.c

   1  AH00094:

logging of startup command, normal operation
log.c

 I looked at the error log unto the point of the restart.  I could not see 
 anything obvious.
 I also look in the event viewer and see nothing.

Based on the above it looks to me like you're having some sort of issue with
your LDAP connectivity.  But it's almost impossible to diagnose from the
information provided.

I suggest attaching a debugger and seeing what it's doing when it's spinning at
100% CPU.  WinDbg and the Logexts.dll will let you log system calls:
http://msdn.microsoft.com/en-us/library/windows/hardware/ff560170%28v=vs.85%29.aspx

On 11/26/13 3:35 PM, Dave Kelsey wrote:
 Had this been filed as a bug?

No and at this point I don't see a clear cause.  May not even be a Subversion
problem.  So I wouldn't open an issue yet.



Re: Experiencing issue where CPU is pinned at 100% - more info

2013-11-26 Thread Dave Kelsey
Ben,

Thanks for responding. I'll follow up with the debug route.
I used a couple of tools to query our AD and had no trouble using the settings 
from the leap config.  We only have 96 users.  

Thanks for your help.

Dave

Sent from my iPhone

 On Nov 26, 2013, at 5:52 PM, Ben Reser b...@reser.org wrote:
 
 On 11/22/13 2:29 PM, Dave Kelsey wrote:
 Software version4.0.2-3732.117
 
 What is this a software version of?
 
 You didn't mention what version of Apache httpd you're using.  Though I'm
 guessing 2.4.x given the log message codes you gave below.  It's possible this
 is actually a httpd issue and not a svn issue.
 
 I parsed the error log for yesterday counting occurrences of all AH errors:
 
 occurrences: DEBUG Code
 
 545112  AH01626:
 
 Unimportant, just logging the result of the authorization hook in 
 mod_authz_core.c
 
 206385  AH01691:
 
 Client didn't send a password log message (mod_authnz_ldap.c).  Probably 
 normal
 since client doesn't authenticate until it's asked for authentication.
 
 206377  AH01697:
 
 access granted in mod_authnz_ldap.c
 
 139630  AH01384:
 
 mod_deflate.c telling how much it compressed things.
 
24  AH00361:
 
 From mpm/winnt/child.c (no idea what this means)
/* Shutdown the worker threads
 * Post worker threads blocked on the ThreadDispatch IOCompletion port
 
 Could be normal I'm not super knowledgeable about the winnt mpm.
 
  4  AH01316:
 
 Some sort of message LDAP caching modules/ldap/util_ldap.c
 
  2  AH01320:
 
 LDAP SSL support unavailable modules/ldap/until_ldap.c (this looks suspicious
 to me)
 
  2  AH01318:
 
 logging the LDAP SDK used.  modules/ldap/util_ldap.c
 
  2  AH00408:
 
 Looks like a normal operation of winnt mpm with duplicating a socket.
 mpm/winnt/mpm_winnt.c
 
  2  AH00403:
 
 Looks like a normal operation of winnt mpm, waiting for data on a socket.
 mpm/winnt/mpm_winnt.c
 
  2  AH00348:
 
 Child: Accept thread exiting.  Could be just part of a normal shutdown.
 mpm/winnt/child.c
 
  2  AH00334:
 
 Child: Listening on an address with a filter.  Normal operation
 mpm/winnt/child.c
 
  1  AH01284:
 
 ldap timed out
 modules/ldap/util_ldap.c
 
  1  AH00456:
 
 logging date and time the server was built.
 mpm/winnt/mpm_winnt.c
 
  1  AH00455:
 
 startup finished (resuming normal operation message)
 mpm/winnt/mpm_winnt.c
 
  1  AH00453:
 
 Child process startup, normal operation
 mpm/winnt/mpm_winnt.c
 
  1  AH00431:
 
 Parent forcing child to terminate, part of shutdown.
 mpm/winnt/mpm_winnt.c
 
  1  AH00422:
 
 shutdown signal received, part of shutdown
 mpm/winnt/mpm_winnt.c
 
  1  AH00418:
 
 Created child process, normal operation
 mpm/winnt/mpm_winnt.c
 
  1  AH00411:
 
 More normal operation of the socket duplication
 mpm/winnt/mpm_winnt.c
 
  1  AH00407:
 
 Child side of the previous message, child got the duplicated sockets.
 mpm/winnt/mpm_winnt.c
 
  1  AH00402:
 
 Sent scoreboard handle to child, normal operation
 mpm/winnt/mpm_winnt.c
 
  1  AH00391:
 
 Child got the soreboard from the previous message
 mpm/winnt/mpm_winnt.c
 
  1  AH00359:
 
 Child released start mutex, normal operation
 mpm/winnt/mpm_winnt.c
 
  1  AH00357:
 
 Child got an exit signal, part of shutdown
 mpm/winnt/child.c
 
  1  AH00354:
 
 Creation of worker threads in child, normal operation
 mpm/winnt/child.c
 
  1  AH00352:
 
 Child got the start mutix, normal operation
 mpm/winnt/child.c
 
  1  AH00096:
 
 removed pid file, part of shutdown
 log.c
 
  1  AH00094:
 
 logging of startup command, normal operation
 log.c
 
 I looked at the error log unto the point of the restart.  I could not see 
 anything obvious.
 I also look in the event viewer and see nothing.
 
 Based on the above it looks to me like you're having some sort of issue with
 your LDAP connectivity.  But it's almost impossible to diagnose from the
 information provided.
 
 I suggest attaching a debugger and seeing what it's doing when it's spinning 
 at
 100% CPU.  WinDbg and the Logexts.dll will let you log system calls:
 http://msdn.microsoft.com/en-us/library/windows/hardware/ff560170%28v=vs.85%29.aspx
 
 On 11/26/13 3:35 PM, Dave Kelsey wrote:
 Had this been filed as a bug?
 
 No and at this point I don't see a clear cause.  May not even be a Subversion
 problem.  So I wouldn't open an issue yet.
 


Experiencing issue where CPU is pinned at 100% - more info

2013-11-25 Thread Dave Kelsey
Config:

OS:  Windows 2012 Server 64bit
1TB SSD
8GiB RAM

Software version4.0.2-3732.117
Subversion version  1.8.3-3732.117
Running since   11/21/2013 14:06:28 PST
Repo health Total repositories: 2 - Go to 
the Repositories tab to create or discover
repositories.
Throughput on primary interface 0.0 B/s IN (over about 5 minutes); 0.0 B/s OUT 
(over about 5 minutes)
Disk Usage  as of 11/22/2013 11:10:31 PST
Used space on root volume   565.43 GB
Used space by repositories  138.65 GB
Free space on repository volume 328.81 GB
Authenticating with LDAP AD

~45 active user



We migrated Two days ago.
I dumped the repo to a new physical machine and imported it through the Web 
Interface.
Every day, at some point, the repository becomes inaccessible.
I RDC to the box and Apache is using 100% of the CPU.  The entire machine is 
performing ridiculously slow.

A restart of the Service restores CPU to normal and everyone carries on until 
the next occurrence.

I parsed the error log for yesterday counting occurrences of all AH errors:

occurrences: DEBUG Code

545112  AH01626:
206385  AH01691:
206377  AH01697:
139630  AH01384:
24  AH00361:
  4  AH01316:
  2  AH01320:
  2  AH01318:
  2  AH00408:
  2  AH00403:
  2  AH00348:
  2  AH00334:
  1  AH01284:
  1  AH00456:
  1  AH00455:
  1  AH00453:
  1  AH00431:
  1  AH00422:
  1  AH00418:
  1  AH00411:
  1  AH00407:
  1  AH00402:
  1  AH00391:
  1  AH00359:
  1  AH00357:
  1  AH00354:
  1  AH00352:
  1  AH00096:
  1  AH00094:

I looked at the error log unto the point of the restart.  I could not see 
anything obvious.
I also look in the event viewer and see nothing.

Dave Kelsey