(Sorry about the long-winded email.)
Hi all,

I'm hoping for a little insight here so I can avoid another evening like last night.

The client has a main office with a PDC and a second DC, plus one satellite office with one DC. The PDC and the backup DC both have DNS installed. At 6 pm, authentication to anything at the main site started failing. I could not log on to either DC in the main office with the admin user, and I could not log on to the TS with my own user account. The remote office is a time zone behind and was still working, so I received calls that everyone had lost their connection to the Exchange server at the main office.

Authentication from remote DC continued to function.

I rebooted both DCs at the main office, after which AD authentication resumed and dcdiag came back clean. Looking at the logs, I can see what happened, but I don't know why it was so catastrophic.

It looks like the PDC's F drive, which is a fiber-connected RAID array, hiccuped for some reason and was unavailable for about one minute. The first error in the Directory Services log is:

NTDS (544) NTDSA: An attempt to write to the file "F:\NTDS\edb.log" at offset 10049536 (0x0000000000995800) for 512 (0x00000200) bytes failed after 23 seconds with system error 2 (0x00000002): "The system cannot find the file specified. ".

At this point, the DC stopped acting as a DC and rejected all authentication requests until it was rebooted. I don't understand a couple of things. First, I was surprised that there is an NTDS folder on the F drive at all. This is the file server, and I certainly wouldn't want NTDS on the same partition as the company file shares. There is also an NTDS folder under C:\WINDOWS, and I see this in the logs this morning, which I assume means it is using that partition for NTDS:
NTDS (544) NTDSA: Online defragmentation has completed a full pass on database 'C:\WINDOWS\NTDS\ntds.dit'.
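
In case it helps anyone compare the two events, here is a small Python sketch that pulls the quoted file paths out of the messages above and groups them by drive letter. This is just a throwaway helper of my own: the name paths_by_drive and the regex are assumptions about what the message text looks like, and it only reads the strings pasted in this email, not the live event log.

```python
import re

# The two Directory Services events quoted in this email, copied as plain text.
EVENTS = [
    r'NTDS (544) NTDSA: An attempt to write to the file "F:\NTDS\edb.log" at offset 10049536 (0x0000000000995800) for 512 (0x00000200) bytes failed after 23 seconds with system error 2 (0x00000002): "The system cannot find the file specified. ".',
    r"NTDS (544) NTDSA: Online defragmentation has completed a full pass on database 'C:\WINDOWS\NTDS\ntds.dit'.",
]

# Matches a quoted Windows path such as "F:\NTDS\edb.log" or 'C:\WINDOWS\NTDS\ntds.dit'.
# (Assumed pattern: drive letter, colon, backslash, then anything up to the closing quote.)
PATH_RE = re.compile(r"""["']([A-Za-z]:\\[^"']+)["']""")


def paths_by_drive(events):
    """Group every quoted file path found in the event text by drive letter."""
    found = {}
    for text in events:
        for path in PATH_RE.findall(text):
            found.setdefault(path[0].upper(), []).append(path)
    return found


if __name__ == "__main__":
    for drive, paths in sorted(paths_by_drive(EVENTS).items()):
        print(drive + ":", ", ".join(paths))
```

Going only by these two messages, the failed write was to edb.log on F: while the defragmentation pass ran against ntds.dit on C:, so the two events point at different drives.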

Can anyone help me understand why NTDS would exist on both drives and, if it really is on both, why this hiccup would bring the network to its knees?

My other big question is why my secondary DC did not take over authentication. I couldn't even log on to it during the downtime, and it has AD and DNS running on it. There are no errors in the Directory Services log on this server during the downtime, but there are many replication errors logged on the primary DC, related to the backup DC trying to replicate to the primary while the primary was unresponsive.

Thanks for any help, guys.

Bill
