Richard,
 
Thank you very much for the information.  We are going to take a pass on SP4 until we have seen the documentation on 828297 and done some more testing.
 
Side question - you mention specific stress tests that you are going to run against 828297.  What tools or programs are you using to do this?
 
Thanks again for the information.  You have saved us a huge amount of grief. 
 
-Stuart Fuller
State of Montana
 
 

From: Puckett, Richard [mailto:[EMAIL PROTECTED]
Sent: Friday, September 19, 2003 2:48 PM
To: [EMAIL PROTECTED]
Subject: RE: [ActiveDir] SP4 or not SP4? (hotfixes 824226 & 828297)

Stuart,
 
We originally installed SP4 near the beginning of August on all of our production Domain Controllers after testing it in our (mirror of production) lab.  Within two production workdays we began to see the same issues Vladimir mentioned in his BUGTRAQ e-mail, and we opened a case with MS.  Since the problem was readily identifiable, we were able to get a copy of KB824226, which we tested and then installed.  Later that week we found that KB824226 had introduced a previously unknown LSASS problem: global heap allocations were not being released (below are a few of the telltale signs of a post-KB824226 DC in resource distress).  The resulting resource deprivation caused most of the directory service-related functions to fail (replication, logons, LDAP queries, etc.).  At first we were concerned that the problems might somehow be related to the RPC/DCOM vulnerability being exploited by potentially infected hosts on our network, but further analysis ruled this out. 
 
We worked with MS for approximately two weeks to find a resolution for the problem, providing ADPerf, Event, UMDH and LSASS dump data.  Eventually KB828297 came into existence from the analysis of data that we and other customers were providing.  Though MS did work hard to locate and correct the error, KB828297 did not appear in a timely enough fashion for us to use, and with more and more DCs failing we made the decision to back out of SP4 to regain host stability, regressing to SP3.  We're currently running SP4 in one of our lab configurations and are preparing to test KB828297 with some very specific stress tests to ensure we don't encounter any new issues before re-deploying SP4. 
 
Hope this data helps,
Richard
 
 

Post-KB824226 Early (and Late) Resource Consumption Warning Signs
 
Event Type: Error
Event Source: KDC
Event Category: None
Event ID: 7
Date:  8/15/2003
Time:  3:44:00 PM
User:  N/A
Computer: <DOMAIN CONTROLLER NAME>
Description:
The Security Account Manager failed a KDC request in an unexpected way. The error is in the data field. The account name was host/<workstation fqdn> and lookup type 0x48.
Data:
0000: 17 00 00 c0               ...À
 
Event Type: Warning
Event Source: NTDS General
Event Category: Internal Processing
Event ID: 1519
Date:  8/15/2003
Time:  12:59:50 PM
User:  Everyone
Computer: <DOMAIN CONTROLLER NAME>
Description:
A Directory Service operation failed because the database has run out of version storage.  If this error repeats frequently it most likely indicates that an object that is too large for the Directory Service to handle is attempting to replicate in.  This object must be deleted or shrunk on a Directory Server where it already exists.
 
 The internal id is 2020743.
 
Event Type: Error
Event Source: NTDS General
Event Category: Internal Processing
Event ID: 1168
Date:  8/20/2003
Time:  11:52:44 PM
User:  DOMAIN\userid
Computer: <DOMAIN CONTROLLER NAME>
Description:
Error 8(8) has occurred (Internal ID 302022c).  Please contact Microsoft Product Support Services for assistance.
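
A note on reading the KDC event above: it says the error is in the data field. Assuming those four bytes are a little-endian 32-bit NTSTATUS value (an assumption; the event text does not state the encoding), they decode to 0xC0000017, which is STATUS_NO_MEMORY. That would be consistent with the resource deprivation described, as would the "Error 8" in event 1168 if it is the Win32 code ERROR_NOT_ENOUGH_MEMORY. A minimal Python sketch of the decoding, where the function name and the one-entry lookup table are illustrative only:

```python
import struct

# Illustrative lookup only; not an exhaustive NTSTATUS table.
NTSTATUS_NAMES = {0xC0000017: "STATUS_NO_MEMORY"}


def decode_event_data(hex_bytes: str) -> int:
    """Decode a 4-byte event Data dump as a little-endian unsigned 32-bit value.

    Assumes the bytes are listed in memory order, as in the event viewer dump.
    """
    raw = bytes.fromhex(hex_bytes)       # whitespace between byte pairs is ignored
    (code,) = struct.unpack("<I", raw)   # "<I" = little-endian unsigned 32-bit
    return code


code = decode_event_data("17 00 00 c0")
print(f"0x{code:08X}", NTSTATUS_NAMES.get(code, "unknown"))
```

The same function can be pointed at the Data bytes of other events of this kind, provided the field is exactly four bytes long.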
 
 


From: Fuller, Stuart [mailto:[EMAIL PROTECTED]
Sent: Friday, September 19, 2003 2:24 PM
To: '[EMAIL PROTECTED]'
Subject: [ActiveDir] SP4 or not SP4? (hotfixes 824226 & 828297)

I *was* planning to go ahead and install SP4 on all of our production DCs this weekend.  We have successfully tested it on our test bench and as a pilot in a small separate forest. 
 
However, I have been following the notes by Vladimir Markovic on the NTbugtraq mailing list about LSASS and LDAP and those are making me a bit nervous to say the least.  (These notes deal with hotfixes 824226 and 828297). 
 
I would like any comments from admins on the list with real-world experience with SP4 and AD.  Specifically, those people running larger production environments (1,000+ users) and using applications that authenticate against AD via LDAP (e.g. PeopleSoft, Digite/Tufan, etc...).  Has anyone else experienced the problems described in 824226?  
 
I have looked at the posts on Google from the Microsoft newsgroup and there do seem to be other admins who have been affected by this.  I am trying to get a sense of whether this is a global problem or is limited to specific "unique" environments. 
 
Thanks,
Stuart Fuller
AD Dweeb
State of Montana
 
