RE: [ActiveDir] Ntds.dit file corruption
Over Teriyaki a few minutes ago, Brett posited the question: well, if USN rollback is corruption, what else? Valid question. I would concede that if USN rollback is considered distributed systems corruption, so too would be other conditions which yield divergence. Perhaps this is a slippery slope that goes too far. I need to think about this some more.

I guess another example of distributed systems corruption could be a bad schema change that successfully replicates out to all systems. While not a problem for the ESE Logical and Physical Layers, the AD Logical Layer Brett listed in his post could certainly be heavily affected. Most notable would be a bad schema change that prevents DCs from booting up correctly - call it corruption or not, it sure wouldn't be pretty.

/Guido

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Eric Fleischman
Sent: Tuesday, 6 December 2005 23:42
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

<snip>
I would generally not call USN rollback a corruption either, but I think Dean makes a fair and quasi-valid point that if you consider the distributed system, yes, such a thing is a corruption. Feel free to shim in an AD Distributed System Logical Layer in the above stack, between AD Logical Layer and App Logical Layer. I'm waffling on this point though, as something smells different than other types of corruption. I'm going to think about that for a long time ... in fact Eric (yes, the ~Eric) is at my door and says he would consider it corruption, so there is a long debate in my future as well ...
</snip>

Over lunch, Brett and I discussed this some more. My contention is that USN rollback would be a form of corruption under a somewhat broad definition. The reality is that there is a layer that Brett mentioned which actually has two parts when looked at from a high level.
Namely, this layer:

AD Logical Layer

The first piece could be thought of as the local logical layer. That is, data hierarchy conforming to the code's assumptions of how it should be, data conforming to the schema as defined, etc. This is a layer of data that clearly needs to be proper (leaving the definition of proper to another day), else we are in some sort of corrupt state. Brett and I both agree on this, I'm pretty sure.

However, there is then distributed systems corruption. In AD, one of the services we aim to provide is convergence. If we do not converge, we define this divergence as at a minimum bad, perhaps corrupt. USN rollback breaks our convergence guarantees; it breaks replication such that you will not attain convergence in the system. As such, I would consider it a form of corruption.

Over Teriyaki a few minutes ago, Brett posited the question: well, if USN rollback is corruption, what else? Valid question. I would concede that if USN rollback is considered distributed systems corruption, so too would be other conditions which yield divergence. Perhaps this is a slippery slope that goes too far. I need to think about this some more.

I would also toss out there that corruption should not be confused with "forever broken". There are many states in which the directory can exist where it is functional, but in some way broken. Such divergences can typically be repaired with administrative action, so long as it is a savvy administrator. :) If we are willing to assume that divergence is corruption, I'd tend to believe that most people on this list have recovered from some form of corruption before. The worse the corruption, the more help you likely want to recover from it. :)

Anyway, we'll likely debate this for a few months, as we usually do on such points. More thoughts to come as we debate further.
~Eric

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Brett Shirley
Sent: Tuesday, December 06, 2005 12:04 PM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

I wouldn't say that, joe ...

Let's take another hypothetical real quick: let's say you have a column for the RDN of an AD object (well, we do) and that value is NULL. From AD's perspective this object is, well, not really an object; it would be corrupt, and might even crash lsass.exe (I don't know, it might). From ESE's perspective, however, the table/row/column is valid; it has a particular column that doesn't have a value. A column which, I might add, is declared optional (real term is tagged) in the ESE layer schema (real term is catalog). ESE is simply a store of data; it passes no judgement on the data as long as it fits the schema guidelines for the column.

Joe, is the DB corrupt? An AD object without an RDN?

I have a tendency to think in layers and sources of corruption.

App Logical Layer
AD Logical Layer
ESE Logical Layer
[ESE] Physical Layer

Corruptions coming top down through that stack are protected by the schema configuration
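Brett's NULL-RDN hypothetical can be made concrete with a small sketch. The Python below is an illustration only, not AD or ESE code: the `Row` type, the `row_id` field, and the check functions are hypothetical names invented here, and the "every object needs an RDN" rule is an assumption taken from the example above. It shows one record passing the storage layer's check (the column is optional there) while failing the directory layer's check.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    """Hypothetical stand-in for one record in the directory database."""
    row_id: int
    rdn: Optional[str]  # optional ("tagged") as far as the storage layer cares


def storage_layer_ok(row: Row) -> bool:
    # Storage-layer view: values only have to fit the declared column types.
    return isinstance(row.row_id, int) and (
        row.rdn is None or isinstance(row.rdn, str)
    )


def directory_layer_ok(row: Row) -> bool:
    # Directory-layer view (assumed rule): an object needs a non-empty RDN.
    return storage_layer_ok(row) and bool(row.rdn)


def lowest_failing_layer(row: Row) -> str:
    """Name the lowest layer that rejects the row, or 'none'."""
    if not storage_layer_ok(row):
        return "storage"
    if not directory_layer_ok(row):
        return "directory"
    return "none"


if __name__ == "__main__":
    print(lowest_failing_layer(Row(1, "CN=joe")))  # healthy at both layers
    print(lowest_failing_layer(Row(2, None)))      # valid row, invalid object
```

Keeping the checks per layer mirrors the stack in the message: a lower layer can report "fine" about data that a higher layer would call corrupt.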
RE: [ActiveDir] Ntds.dit file corruption
That's true for the corruptions you detect - naturally, this depends on what a corruption actually is, or what we define a corruption to be. As mentioned in my previous post, I consider specific bad changes to the schema a distributed corruption; however, AD DCs will happily replicate them around...

I've seen other corruptions in AD where the local store truly reported an error and refused to replicate with any DC. I wish I had the details on that incident; however, the customer didn't want to delve into finding the root cause, and as such the DC was forcefully demoted, the HW was checked and the box was repromoted (after metadata cleanup etc.)

/Guido

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Steve Linehan
Sent: Monday, 5 December 2005 20:26
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

We do not replicate corruption, so if you have local corruption as noted below there is no worry that it would replicate around to other servers in the environment.

Thanks,
-Steve

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Phil Renouf
Sent: Monday, December 05, 2005 1:04 PM
To: ActiveDir@mail.activedir.org
Subject: Re: [ActiveDir] Ntds.dit file corruption

Will Read-Only DCs take care of this? I don't know much about them yet, but it makes sense that if the copy of the dit that a DC has is RO, it won't try to replicate that anywhere and would only be the recipient of replication. Anyone with more knowledge about how RO DCs will work care to comment on that?

Phil

On 12/5/05, Medeiros, Jose [EMAIL PROTECTED] wrote:

Well, at least the corruption occurred on just a single DC. One thing that has bugged me about Active Directory is not being able to prevent a DC in a remote office from replicating back in a large enterprise environment.
Since most remote offices only have a few people at the location and a DC is usually placed there to improve logon and authentication time, many companies will either use a very low-end server or a very old decommissioned one from their production data center (which is probably close to the end of its usable life). I am always concerned that once the NTDS.DIT file becomes corrupt it will replicate the corruption to the other DCs in the forest.

Maybe I am just being a worrywart and this really is not an issue.

Sincerely,
Jose Medeiros
ADP | National Account Services
ProBusiness Division | Information Services
925.737.7967 | 408-449-6621 CELL

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP]
Sent: Monday, December 05, 2005 8:53 AM
To: ActiveDir@mail.activedir.org
Subject: Re: [ActiveDir] Ntds.dit file corruption

I did? :-)

I think I still said all I know is what the poster said :-)

I think I need a course in event log reading because even with the logs, and the default size of the logs, I still don't see a smoking gun. The directory services one is filled with events 'post' blow up.

What is interesting is that it seems to me big server land goes .. oh yeah... ntds.dit corruption... and sbsland freaks out. Either we do indeed need to ensure we have a secondary DC or we need to park a second copy of a system state offsite [say at the vap/var]

Brett Shirley wrote:

She replied offline: very likely a single bit flip. Tragedy, they aren't one release later (Longhorn), where this would've probably been non-disruptively handled, logged, and possibly self-healed:

http://blogs.technet.com/efleis/archive/2005/01.aspx

Anyway, this kind of thing is usually hardware ...

While there are much better disk sub-system testers, one that is freely available to any box with Exchange is jetstress. You might give that a try. If you can reproduce the event / error with jetstress I would not use that box in production.
If you do reproduce the issue several times (several times is key, as you want a trend before you start playing the variable game), some things you might vary (one at a time):

- Try making sure you have the latest driver and motherboard / controller firmware. Then see if you can reproduce.
- Try a different RAID configuration, such as RAID1/RAID1+0 if you're on RAID5.
- Try swapping out the hard drives, one at a time.
- Add the jetstress files to the exclude list in the anti-virus software. (A low probability; I've never heard of anti-virus causing this particular type of error, and I can't imagine the mistake an anti-virus product would have to make to cause this side effect.)
- If you can reproduce it several times, you could follow up with Dell.

Good luck. I'm not sure if I answered your question ...

Cheers,
BrettSh

On Sun, 4 Dec 2005, Eric Fleischman wrote:

Going back to the original post, I'm not sure I fully understand the problem yet. Susan, can you define
RE: [ActiveDir] Ntds.dit file corruption
The more I think about it, the less I think of the distributed system examples as corruption, especially as you branch out and add more examples, such as bad schema values. What if someone deleted parts of the schema, is that corruption? What if a site link is missing so a site can't replicate? What if there is a bad value for the replication period in a site link? A bad SPN for a DC? I just don't see those things as corruption like the disk falling down and ESE failing the data on the integrity checks.
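For readers who want to see why USN rollback breaks convergence, here is a deliberately simplified toy model in Python. It is a sketch under stated assumptions, not how AD replication is implemented: the `DC` class, its `watermark` dictionary, and the `demo` function are hypothetical names, and real replication tracks far more state. The only idea it illustrates is that a partner remembers the highest USN it has pulled from a source, so a source that is rolled back to an older image under the same identity reissues USNs the partner believes it has already seen.

```python
import copy


class DC:
    """Toy domain controller: a USN counter plus a log of originating writes."""

    def __init__(self, name):
        self.name = name
        self.usn = 0
        self.objects = {}    # key -> value
        self.log = []        # (usn, key, value) for writes made on this DC
        self.watermark = {}  # source name -> highest USN already pulled

    def write(self, key, value):
        self.usn += 1
        self.objects[key] = value
        self.log.append((self.usn, key, value))

    def pull_from(self, source):
        # Ask only for changes above the highest USN seen from this source.
        seen = self.watermark.get(source.name, 0)
        for usn, key, value in source.log:
            if usn > seen:
                self.objects[key] = value
                seen = usn
        self.watermark[source.name] = seen


def demo():
    a, b = DC("A"), DC("B")
    a.write("u1", "v1")
    a.write("u2", "v2")
    image = copy.deepcopy(a)  # disk image taken while A is at USN 2
    a.write("u3", "v3")       # USN 3
    b.pull_from(a)            # B now holds u1..u3, watermark for A is 3
    a = image                 # rollback: A is back at USN 2, same identity
    a.write("u4", "v4")       # reuses USN 3
    b.pull_from(a)            # nothing above 3, so B never asks for u4
    return a.objects, b.objects


if __name__ == "__main__":
    a_objects, b_objects = demo()
    print(sorted(a_objects))  # A has u4 but lost u3
    print(sorted(b_objects))  # B has u3 but never learns about u4
```

In this model neither side reports an error at any layer, yet the two copies never converge, which is the property the "distributed corruption" argument in this thread turns on.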
RE: [ActiveDir] Ntds.dit file corruption
Which is apples to oranges.

Ed Crowley MCSE+Internet MVP
Freelance E-Mail Philosopher
Protecting the world from PSTs and Bricked Backups!™

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of joe
Sent: Thursday, December 08, 2005 3:27 PM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

Yeah but the information store wasn't. :)

I think the comparisons going on are between the store and the AD DIT. Totally different uses of the same database engine - ESE.

joe

==
My ESE Engine can beat up your SQL Server. -Bratt Shirley
==

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Ed Crowley [MVP]
Sent: Thursday, December 08, 2005 2:57 PM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

Of course, that's comparing apples to oranges. Exchange 5.5's *directory* WAS distributed by design.

Ed Crowley MCSE+Internet MVP
Freelance E-Mail Philosopher
Protecting the world from PSTs and Bricked Backups!™

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of [EMAIL PROTECTED]
Sent: Thursday, December 08, 2005 6:22 AM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

That makes more sense - AD is resilient by its distributed design whereas Exch is not (or less so) due to its non-distributed design. A database 'corruption' in AD simply means we re-build the affected DC (since the corruption will not be replicated (we hope)) whereas in Exch, the same corruption means a lack of service and thus a much higher impact.

Apologies for pursuing this to the nth degree - I was surprised to hear of an OS/ESE change which was clearly put in place to fix issues caused by dodgy hardware :)

Brett alluded to memory bit flipping issues - will ESE be changed to cater for those issues, as well as disk related issues???
:)

neil

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Al Mulnick
Sent: 08 December 2005 13:55
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

Right. Different purposes for the technology dictate different answers.

A one-bit flip can happen from all sorts of things. We don't tend to notice it very much with newer hardware because a lot of it gets checked, scrubbed and then checked and scrubbed some more as it passes between subsystems. Everything from memory to disk subsystems checks and rechecks for integrity. It's not infallible, however. It's also not impervious to administrative error in terms of misconfiguration or other items that can cause issues (faulty hardware happens).

AD is a distributed fabric made up of layers (that last bit is for Brett) and, if thought of that way, you can withstand a hole in the fabric but still provide the service. Exchange is more personal in that it has a one-to-one relationship with the end user it provides service for. As such, if there's a hole in it, it cannot provide the service it's intended to provide.

I don't see this fix as hurting AD either, but I don't see it as being nearly as important because I can just replace the faulty hardware in most environments that follow the best practice of deploying more than one AD DC per domain. It's designed to operate that way and withstand failures under normal circumstances. Exchange is not as resilient by task.

My $0.04 anyway.

From: Michael B. Smith [EMAIL PROTECTED]
Reply-To: ActiveDir@mail.activedir.org
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption
Date: Thu, 8 Dec 2005 06:10:17 -0500

The existing mechanism in place in Exchange 2003 prior to sp1 was able to detect problems, and ensure that they didn't cause problems in the Exchange environment -- however, that could mean that a store was shut down when a -1018 was detected. And that's a real problem to the user environment!
Correcting a single bit error (which can be caused by hardware failure, firmware failure, cosmic rays, or mind control (I'm sure)) allows the store to continue operating about 40% (a significant number) of the time. This results in a noticeable reduction of support calls to PSS. :-)

I've got notes around here somewhere, but my memory vaguely says that the change was to take the 32-bit physical page number value in the database record header and turn it into an ECC value. The database is updated, record by record, as each record gets updated.

Could such a change benefit A/D? I don't see why not. It's probably not as dramatic an improvement though -- the reaction of Exchange to a one-bit error was to shut down the entire store. A/D apparently just fails the current request. Depending on the request, that could be a big deal - or not.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf
RE: [ActiveDir] Ntds.dit file corruption
Maybe I should flip the question around a little...

What are the changes made in exch2k3 sp1 (involving ECC corrections) and why were they deemed necessary (given what I have read from joe/Brett/Eric et al)?? The changes appear to be superfluous. We do not appear to need such a (further) check re: AD/ESE(?)

What am I missing guys?

neil

PS Great thread so far :)

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of joe
Sent: 07 December 2005 01:55
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

Good post ~Eric, thanks for chiming in.

I see where you are coming from with the corruption at the distributed level. In terms of corruption at that level I see it as corruption but just can't get myself to see it as AD corruption. I am not sure if I can put it down in words why. I just don't. :)

joe
Re: [ActiveDir] Ntds.dit file corruption
I thought it was Windows 2003 sp1 that had additional database correction stuff?
RE: [ActiveDir] Ntds.dit file corruption
I was referring to this KB:

http://support.microsoft.com/?kbid=867626

neil
RE: [ActiveDir] Ntds.dit file corruption
The existing mechanism in place in Exchange 2003 prior to sp1 was able to detect problems and ensure that they didn't cause problems in the Exchange environment -- however, that could mean that a store was shut down when a -1018 was detected. And that's a real problem for the user environment!

Correcting a single-bit error (which can be caused by hardware failure, firmware failure, cosmic rays, or mind control (I'm sure)) allows the store to continue operating about 40% (a significant number) of the time. This results in a noticeable reduction of support calls to PSS. :-)

I've got notes around here somewhere, but my memory vaguely says that the change was to take the 32-bit physical page number value in the database record header and turn it into an ECC value. The database is updated, record by record, as each record gets updated.

Could such a change benefit A/D? I don't see why not. It's probably not as dramatic an improvement though -- the reaction of Exchange to a one-bit error was to shut down the entire store. A/D apparently just fails the current request. Depending on the request, that could be a big deal - or not.
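To make the "detect which bit and fix it" idea concrete, here is a toy sketch in Python. It is NOT the actual ESE or Exchange page format; the function names and the (position XOR, parity) pair are assumptions made up for illustration. It only shows how a small amount of stored check data can be enough to locate and repair a single flipped bit, while still refusing to guess when more than one bit is damaged.

```python
def ecc_of(page: bytes) -> tuple[int, int]:
    """Return (XOR of the 1-based positions of all set bits, overall parity).

    Hypothetical check value for illustration only, not the ESE checksum.
    """
    pos_xor = 0
    parity = 0
    for byte_index, byte in enumerate(page):
        for bit in range(8):
            if (byte >> bit) & 1:
                pos_xor ^= byte_index * 8 + bit + 1  # 1-based so bit 0 counts
                parity ^= 1
    return pos_xor, parity


def verify_and_repair(page: bytes, stored: tuple[int, int]) -> tuple[bytes, str]:
    """Check a page against its stored check value and fix a single-bit flip."""
    current = ecc_of(page)
    if current == stored:
        return page, "ok"

    # A single flipped bit changes the parity and XORs its own position
    # into pos_xor, so the difference points straight at the damaged bit.
    syndrome = current[0] ^ stored[0]
    parity_changed = current[1] != stored[1]
    if parity_changed and 1 <= syndrome <= len(page) * 8:
        fixed = bytearray(page)
        index = syndrome - 1
        fixed[index // 8] ^= 1 << (index % 8)
        if ecc_of(bytes(fixed)) == stored:
            return bytes(fixed), "corrected"

    # Anything else (for example two flipped bits) is reported, not guessed at.
    return page, "uncorrectable"
```

A reader could try it by computing `ecc_of` for a small byte string, flipping one bit in a copy, and passing the damaged copy with the original check value to `verify_and_repair`.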
RE: [ActiveDir] Ntds.dit file corruption
Yep, any time you can correct for an error versus just failing out on detection (or even worse, not detecting it), it is a good thing. I expect someone was sitting around one day saying, "Hey, you know how we detect these problems, and you know how it is often a single bit... I bet we could find a way to detect which bit and fix it..." Or possibly someone just realized, "Hey, we have enough info to determine this, so we don't have to throw an error..." Either way... good job.

I wonder what the doubling of page sizes in E12 (to make them the same as AD page sizes) will do to the percentages of occurrence. Honestly, if it saves just one recovery a month, that would probably be worth it to Exchange and probably to SBS AD as well. For non-SBS AD deployments it shouldn't be as critical, because you absolutely should have multiple DCs and you just repromote. Heck, I don't even set up test environments without at least 2 DCs per domain (virtualization software rocks).
RE: [ActiveDir] Ntds.dit file corruption
Right. Different purposes for the technology dictate different answers. A one-bit flip can happen from all sorts of things. We don't tend to notice it very much with newer hardware because a lot of it gets checked, scrubbed, and then checked and scrubbed some more as it passes between subsystems. Everything from memory to disk subsystems checks and rechecks for integrity. It's not infallible, however. It's also not impervious to administrative error in terms of misconfiguration or other items that can cause issues (faulty hardware happens).

AD is a distributed fabric made up of layers (that last bit is for Brett), and if thought of that way, you can withstand a hole in the fabric but still provide the service. Exchange is more personal in that it has a one-to-one relationship with the end user it provides service for. As such, if there's a hole in it, it cannot provide the service it's intended to provide.

I don't see this fix as hurting AD either, but I don't see it as being nearly as important, because I can just replace the faulty hardware in most environments that follow the best practice of deploying more than one AD DC per domain. It's designed to operate that way and withstand failures under normal circumstances. Exchange is not as resilient by task.

My $0.04 anyway.
RE: [ActiveDir] Ntds.dit file corruption
That makes more sense - AD is resilient by its distributed design, whereas Exch is not (or less so) due to its non-distributed design. A database 'corruption' in AD simply means we rebuild the affected DC (since the corruption will not be replicated (we hope)), whereas in Exch the same corruption means a lack of service and thus a much higher impact.

Apologies for pursuing this to the nth degree - I was surprised to hear of an OS/ESE change which was clearly put in place to fix issues caused by dodgy hardware :)

Brett alluded to memory bit-flipping issues - will ESE be changed to cater for those issues, as well as disk-related issues? :)

neil
RE: [ActiveDir] Ntds.dit file corruption
I'd agree with that... but I know from doing small scripts that when I put them in someone else's hands and start adding in detection for this and that, plus error handling, eventually what was a 20-line script has grown into 1000+ lines and loads of subs and functions, without actually adding anything to the end result had the input been entered correctly in the first place. Here I'm referring to simply inputting an IP address, and then having to break it down, check it, and ensure that a valid address is put back in through WMI. Probably less than the size of the code for the welcome dialog for dcpromo :0

So while it's nice to detect all the scenarios that could create corruptions or irregularities or unexpected conditions, I think sometimes we need to be able to run the Active Directory Zamboni to go through the database when everyone's asleep and find, and fix and/or report on, these irregularities. A huge and better Zamboni wouldn't slow down the whole directory in real time, and while it wouldn't be the solution to every instance, perhaps it would help us be more proactive without having to know which tools to run when for detection. Not that there isn't a Zamboni, just that maybe here are some more things for it to do. Just some ideas...

Rich
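The "break it down and check it" step for an IP address might look something like the following Python sketch. The function name `parse_ipv4` and the specific rules (exactly four decimal octets, no leading zeros, each 0-255) are assumptions chosen for illustration; the original script and its WMI call are not shown in the thread, so nothing here reproduces them.

```python
def parse_ipv4(text: str) -> tuple[int, ...]:
    """Split a dotted-quad string into four validated octets.

    Raises ValueError with a specific message for each way the input
    can be wrong, which is where most of the extra lines come from.
    """
    parts = text.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected 4 octets, got {len(parts)}: {text!r}")

    octets = []
    for part in parts:
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"octet is not a decimal number: {part!r}")
        if len(part) > 1 and part[0] == "0":
            # Leading zeros are rejected because some tools read them as octal.
            raise ValueError(f"octet has a leading zero: {part!r}")
        value = int(part)
        if value > 255:
            raise ValueError(f"octet out of range 0-255: {part!r}")
        octets.append(value)
    return tuple(octets)
```

Even this small check is several times longer than the one line needed to pass a trusted address along, which is the growth being described above.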
RE: [ActiveDir] Ntds.dit file corruption
Error detection and correction seems to be about 60-90% of any program that does it well; in other words, most of the code is there to validate that you are working with what you should be working with. For user-facing apps that usually means a ton of code around detecting/correcting data entry issues. For apps that don't directly interact with a user, it is about the consistency and validity of anything passed into them from any external source (disk, another function, etc.).

I once made a very, very stupid comment around this. When I was in college, a high-level math teacher[1] once asked what I wanted to do with computers. I said I wanted to work on system software instead of user applications, because I hated wasting all of that time checking that the information I was getting was correct, since users always enter stupid things. That generated a 90-minute discussion where I got the crap beat out of me for saying something so obtuse. But realistically, at the time, and until about 5 years ago for a lot of MS software, my comment was accurate. System software didn't have a lot of checks for data validity and consistency. That conversation, although it melted my ego and made me crawl back to my dorm in the bushes so people couldn't see me, dramatically changed my outlook on how software should be written.

That error checking is one of the core pieces of secure code writing. If you only let through things you expect and know you can handle, it is a lot tougher to compromise a component.

If I apply this to joeware: I whip up joeware tools left and right all of the time that are great for me. I know the boundaries. When I have time to spend 10 times longer on a program than I did when I initially wrote it to do what I needed, then I can make it so others can use it. There is a ton of stuff in my dev\cpp folder that only I get to use and that probably never will make it to anyone else, simply because I don't have the time to put in all of the error correction, etc., to make it safely usable by others.

joe

[1] I think I was in Calc IV or something like that, where you have maybe 10 people in the class at Michigan State University versus the normal several hundred. It was definitely a math teacher instead of a CIS teacher though.
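The "only let through things you expect" point can be sketched as an allow-list check. Everything in this Python snippet is hypothetical: the action names, the name pattern, and the `validate_request` function are invented for illustration and do not come from any joeware tool.

```python
import re

# Hypothetical set of operations this component knows how to handle.
ALLOWED_ACTIONS = {"list", "show", "count"}

# Hypothetical rule for object names: short, and only characters we expect.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def validate_request(action: str, name: str) -> tuple[str, str]:
    """Accept only inputs that match what the code is known to handle.

    Anything not explicitly allowed is rejected, rather than trying to
    enumerate every bad input that might show up.
    """
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    if NAME_PATTERN.fullmatch(name) is None:
        raise ValueError(f"unexpected characters or length in name: {name!r}")
    return action, name
```

The design choice is to describe the small set of good inputs instead of chasing the unbounded set of bad ones, so an input nobody thought of is refused by default.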
RE: [ActiveDir] Ntds.dit file corruption
I meant to add that people complain about Windows being big and bulky but I bet a huge portion is the 60-90% you mention. And I imagine they (like me) get frustrated when they try to add just a little less of it in, and then the day after it's released, out comes Hotfix KB91 :) The difference between many of my scripts (like the 20 line version and the 1000+ lines) is the target user. I know how to enter an IP and make sure it's in the right subnet. And with 1000+ lines someone could still probably crash it. (and the difference between my 1000+ line scripts and joeware is I'm passably good at piecing together the results of hours of Google searches, but Joe knows what he's doing :) Rich -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of joe Sent: Thursday, December 08, 2005 10:58 AM To: ActiveDir@mail.activedir.org Subject: RE: [ActiveDir] Ntds.dit file corruption Error detection and correction seems to be about 60-90% of any program that does it well, in other words most of the code is to validate you are working with what you should be working with. For user facing apps that usually means a ton of code around detecting/correcting data entry issues. For apps that don't directly interact with a user it is about data consistency and validity of anything based into it from any external source (disk, another function, etc). I once made a very very stupid comment around this. When I was in college a high level math teacher[1] once asked what I wanted to do with computers. I said I wanted to work on system software instead of user applications because I hated wasting all of the time on checking to make sure the information was correct that I was getting because users always enter stupid things. That generated a 90 minute discussion where I got the crap beat out of me for saying something so obtuse. But realistically, at the time, and until about 5 years ago for a lot of MS software, my comment was accurate. 
System software didn't have a lot of checks for data validity and consistency. That conversation, although it melted my ego and made me crawl back to my dorm in the bushes so people couldn't see me, dramatically changed my outlook on how software should be written. That error checking is one of the core pieces of writing secure code. If you only let through things you expect and know you can handle, it is a lot tougher to compromise a component.

If I apply this to joeware, I whip up joeware tools left and right all of the time that are great for me. I know the boundaries. When I have time to spend 10 times longer on a program than I did when I initially wrote it to do what I needed, then I can make it so others can use it. There is a ton of stuff in my dev\cpp folder that only I get to use and that probably never will make it to anyone else, simply because I don't have the time to put in all of the error correction, etc. to make it safely usable by others.

joe

[1] I think I was in Calc IV or something like that, where you have maybe 10 people in the class at Michigan State University versus the normal several hundred. It was definitely a math teacher instead of a CIS teacher though.

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Rich Milburn
Sent: Thursday, December 08, 2005 10:46 AM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

I'd agree with that... but I know from doing small scripts that when I put them in someone else's hands, and I start adding in detection for this, and that, and error handling, eventually what was a 20 line script has grown into 1000+ lines and loads of subs and functions, without actually adding anything to the end result if the input had been entered correctly in the first place. Here I'm referring to simply inputting an IP address, and then having to break it down and check it and ensure that a valid address is put back in through WMI.
Probably less than the size of the code for the welcome dialog for dcpromo :0

So while it's nice to detect all the scenarios that could create corruptions or irregularities or unexpected conditions, I think sometimes we need to be able to run the Active Directory Zamboni to go through the database when everyone's asleep and find, and fix and/or report on, these irregularities. A huge and better Zamboni wouldn't slow down the whole directory in real time, and while it wouldn't be the solution to every instance, perhaps it would help us be more proactive without having to know which tools to run when for detection. Not that there isn't a Zamboni, just that maybe there are some more things for it to do.

Just some ideas...

Rich

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of joe
Sent: Thursday, December 08, 2005 7:53 AM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

Yep, any time you can correct for an error versus just fail out on detection (or even worse
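As a rough illustration of the kind of input checking Rich describes (taking an IP address, breaking it down, and making sure it belongs to the expected subnet), here is a minimal Python sketch. It is not Rich's script, which isn't shown in the thread and which wrote the address back through WMI. The function name `parse_host_address` and the example subnet are assumptions made purely for illustration.

```python
import ipaddress


def parse_host_address(text, subnet):
    """Return an IPv4Address if `text` is a usable host address in `subnet`.

    Raises ValueError with a readable message otherwise.
    (Hypothetical helper; name and behaviour are assumptions.)
    """
    network = ipaddress.ip_network(subnet, strict=True)
    try:
        address = ipaddress.IPv4Address(text.strip())
    except ipaddress.AddressValueError as exc:
        raise ValueError(f"not a valid IPv4 address: {text!r}") from exc
    if address not in network:
        raise ValueError(f"{address} is outside {network}")
    # The first and last addresses of a subnet are not assignable to a host.
    if address in (network.network_address, network.broadcast_address):
        raise ValueError(f"{address} is reserved in {network}")
    return address
```

Even this small check already outweighs the one line it protects, which is the ratio joe and Rich are talking about: most of the code exists to reject input the "real" logic cannot handle.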
RE: [ActiveDir] Ntds.dit file corruption
Net share joesdevfolder

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of joe
Sent: Thursday, December 08, 2005 7:53 AM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

Yep, any time you can correct for an error versus just fail out on detection (or even worse not detect) it is a good thing. I expect someone was sitting around one day saying, hey you know how we detect these problems and you know how it is often a single bit... I bet we could find a way to detect which bit and fix it... Or possibly someone just realized, hey we have enough info to determine this so we don't have to throw an error... Either way... Good job.

I wonder what the doubling of page sizes in E12 (to make it the same as AD page sizes) will do to the percentages of occurrence. Honestly, if it saves just one recovery a month that would probably be worth it to Exchange and probably to SBS AD as well. For non-SBS AD deployments it shouldn't
RE: [ActiveDir] Ntds.dit file corruption
Of course, that's comparing apples to oranges. Exchange 5.5's *directory* WAS distributed by design.

Ed Crowley MCSE+Internet MVP
Freelance E-Mail Philosopher
Protecting the world from PSTs and Bricked Backups!™

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of [EMAIL PROTECTED]
Sent: Thursday, December 08, 2005 6:22 AM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

That makes more sense - AD is resilient by its distributed design whereas Exch is not (or less so) due to its non-distributed design. A database 'corruption' in AD simply means we re-build the affected DC (since the corruption will not be replicated (we hope)) whereas in Exch, the same corruption means a lack of service and thus a much higher impact.

Apologies for pursuing this to the nth degree - I was surprised to hear of an OS/ESE change which was clearly put in place to fix issues caused by dodgy hardware :)

Brett alluded to memory bit flipping issues - will ESE be changed to cater for those issues, as well as disk related issues??? :)

neil

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Al Mulnick
Sent: 08 December 2005 13:55
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

Right. Different purposes for the technology dictate different answers. A one-bit flip can happen from all sorts of things. We don't tend to notice it very much with newer hardware because a lot of it gets checked, scrubbed and then checked and scrubbed some more as it passes between subsystems. Everything from memory to disk subsystems checks and rechecks for integrity. It's not infallible however. It's also not impervious to administrative error in terms of misconfiguration or other items that can cause issues (faulty hardware happens.)
AD is a distributed fabric made up of layers (that last bit is for Brett) and if thought of that way, you can withstand a hole in the fabric but still provide the service. Exchange is more personal in that it has a one to one relationship with the end user it provides service for. As such, if there's a hole in it, it cannot provide the service it's intended to provide. I don't see this fix as hurting AD either, but I don't see it as being nearly as important because I can just replace the faulty hardware in most environments that follow the best practice of deploying more than one AD DC per domain. It's designed to operate that way and withstand failures under normal circumstances. Exchange is not as resilient by task.

My $0.04 anyway.

From: Michael B. Smith [EMAIL PROTECTED]
Reply-To: ActiveDir@mail.activedir.org
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption
Date: Thu, 8 Dec 2005 06:10:17 -0500

The existing mechanism in place in Exchange 2003 prior to sp1 was able to detect problems, and ensure that they didn't cause problems in the Exchange environment -- however that could mean that a store was shut down when a -1018 was detected. And that's a real problem to the user environment!

Correcting a single bit error (which can be caused by hardware failure, firmware failure, cosmic rays, or mind control (I'm sure)) allows the store to continue operating about 40% (a significant number) of the time. This results in a noticeable reduction of support calls to PSS. :-)

I've got notes around here somewhere, but my memory vaguely says that the change was to take the physical page number 32-bit value in the database record header and turn it into an ECC value. The database is updated, record by record, as each record gets updated.

Could such a change benefit A/D? I don't see why not. It's probably not as dramatic an improvement though -- the reaction of Exchange to a one-bit error was to shut down the entire store.
A/D apparently just fails the current request. Depending on the request, that could be a big deal - or not.

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of [EMAIL PROTECTED]
Sent: Thursday, December 08, 2005 4:38 AM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

I was referring to this KB: http://support.microsoft.com/?kbid=867626

neil

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP]
Sent: 08 December 2005 09:28
To: ActiveDir@mail.activedir.org
Subject: Re: [ActiveDir] Ntds.dit file corruption

I thought it was Windows 2003 sp1 that had additional database correction stuff?

[EMAIL PROTECTED] wrote:

Maybe I should flip the question around a little... What are the changes made in exch2k3 sp1 (involving ECC corrections) and why were they deemed necessary (given what I have read from joe/Brett/Eric et al)?? The changes appear to be superfluous. We do not appear to need such a (further) check re: AD/ESE(?) What am I missing
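To make the difference between detecting a damaged page and correcting it more concrete, here is a small Python sketch of a checksum that can locate and repair one flipped bit. This is NOT the ESE / Exchange 2003 sp1 algorithm, which the thread only recalls vaguely. It is a generic Hamming-style scheme, and the names `page_checksum` and `verify_page` are assumptions made for this example. It assumes the stored checksum itself is intact and that at most two bits of the page have flipped.

```python
from typing import Tuple


def page_checksum(page: bytes) -> Tuple[int, int]:
    """Return (XOR of the positions of all set bits, parity of set bits)."""
    xor_of_positions = 0
    parity = 0
    for byte_index, byte in enumerate(page):
        for bit in range(8):
            if (byte >> bit) & 1:
                xor_of_positions ^= byte_index * 8 + bit
                parity ^= 1
    return xor_of_positions, parity


def verify_page(page: bytes, stored: Tuple[int, int]) -> Tuple[str, bytes]:
    """Classify a page as 'ok', 'corrected' or 'uncorrectable'."""
    current = page_checksum(page)
    if current == stored:
        return "ok", page
    if current[1] != stored[1]:
        # Parity changed, so an odd number of bits flipped. Assume exactly
        # one: the XOR of the two position sums is then that bit's index.
        position = current[0] ^ stored[0]
        if position < len(page) * 8:
            repaired = bytearray(page)
            repaired[position // 8] ^= 1 << (position % 8)
            if page_checksum(bytes(repaired)) == stored:
                return "corrected", bytes(repaired)
    # Same parity but different position sum: two bits flipped, report only.
    return "uncorrectable", page
```

A plain checksum would stop at "this page is bad", which is the pre-sp1 behaviour Michael describes (detect, then shut the store down). Carrying enough extra information to name the damaged bit is what lets the store keep running in the single-bit case.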
RE: [ActiveDir] Ntds.dit file corruption
Yeah but the information store wasn't. :) I think the comparisons going on are between the store and the AD DIT. Totally different uses of the same database engine - ESE.

joe

== My ESE Engine can beat up your SQL Server. -Bratt Shirley ==
RE: [ActiveDir] Ntds.dit file corruption
Distributed systems hurt the head in that it is not clear *where* the problem is. It is hard to point a finger at something/someone and say "there's the issue!" when the issue lies in the state in which some number of servers exist relative to one another. However, in a system which aims to provide convergence (in mission and in assumption by clients), such divergence is, I think, corruption.

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of joe
Sent: Tuesday, December 06, 2005 5:55 PM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

Good post ~Eric, thanks for chiming in. I see where you are coming from with the corruption at the distributed level. In terms of corruption at that level I see it as corruption but just can't get myself to see it as AD corruption. I am not sure if I can put it down in words why. I just don't. :)

joe

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Eric Fleischman
Sent: Tuesday, December 06, 2005 5:42 PM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

<snip>
I would generally not call USN rollback a corruption either, but I think Dean makes a fair and quasi-valid point that if you consider the distributed system, yes such a thing is a corruption. Feel free to shim in an AD Distributed System Logical Layer in the above stack, between AD Logical Layer and App Logical Layer. I'm waffling on this point though, as something smells different than other types of corruption. I'm going to think about that for a long time ... in fact Eric (yes, the ~Eric) is at my door and says he would consider it corruption, so there is a long debate in my future as well ...
</snip>

Over lunch, Brett and I discussed this some more. My contention is that USN rollback would be a form of corruption under a somewhat broad definition.
The reality is that there is a layer that Brett mentioned which actually has two parts when looked at from a high level. Namely, this layer:

AD Logical Layer

The first piece could be thought of as the local logical layer. That is, data hierarchy, conforming to the code assumptions of how it should be, data conforming to the schema as defined, etc. This is a layer of data that clearly need be proper (leaving the definition of proper to another day), else we are in some sort of corrupt state. Brett and I both agree on this, I'm pretty sure.

However, there is then distributed systems corruption. In AD, one of the services we aim to provide is convergence. If we do not converge, we define this divergence as at a minimum bad, perhaps corrupt. USN rollback breaks our convergence guarantees: it breaks replication such that you will not attain convergence in the system. I would as such consider it a form of corruption.

Over Teriyaki a few minutes ago, Brett posited the question "well, if USN rollback is corruption, what else?" Valid question. I would concede that if USN rollback is considered distributed systems corruption, so too would be other conditions which yield divergence. Perhaps this is a slippery slope that goes too far. I need to think about this some more.

I would also toss out there that corruption should not be confused with forever broken. There are many states in which the directory can exist where it is functional, but in some way broken. Such divergences can typically be repaired with administrative action, so long as it is a savvy administrator. :) If we are willing to assume that divergence is corruption, I'd tend to believe that most people on this list have recovered from some form of corruption before. The worse the corruption, the more help you likely want to recover from it. :)

Anyway, we'll likely debate this for a few months, as we usually do on such points. More thoughts to come as we debate further.
~Eric

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Brett Shirley
Sent: Tuesday, December 06, 2005 12:04 PM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

I wouldn't say that, joe ...

Let's take another hypothetical real quick: let's say you have a column for the RDN of an AD object (well, we do) and that value is NULL. From AD's perspective this object is, well, not really an object; it would be corrupt, and might even crash lsass.exe (I don't know, it might). From ESE's perspective, however, the table/row/column is valid; it has a particular column that doesn't have a value. A column which, I might add, is declared optional (real term is tagged) in the ESE layer schema (real term is catalog). ESE is simply a store of data; it passes no judgement on the data as long as it fits the schema guidelines for the column.

Joe, is the DB corrupt? An AD object without an RDN?

I have a tendency to think in layers and sources of corruption.

App Logical Layer
AD Logical Layer
ESE Logical Layer
[ESE] Physical
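Eric's argument that USN rollback breaks convergence can be sketched with a toy model in Python. This is a deliberately simplified illustration, not how AD replication is implemented: it keeps only a per-partner high watermark and ignores invocation IDs, up-to-dateness vectors and conflict resolution. The class `DomainController` and all of its methods are assumptions invented for this example.

```python
import copy


class DomainController:
    """Toy DC: originates changes and pulls partner changes by USN."""

    def __init__(self, name):
        self.name = name
        self.usn = 0          # highest local update sequence number
        self.changes = []     # (usn, key, value) changes originated here
        self.data = {}        # this DC's view of the directory
        self.watermarks = {}  # partner name -> highest USN already pulled

    def write(self, key, value):
        self.usn += 1
        self.changes.append((self.usn, key, value))
        self.data[key] = value

    def pull_from(self, partner):
        """Apply only partner changes above the remembered watermark."""
        seen = self.watermarks.get(partner.name, 0)
        for usn, key, value in partner.changes:
            if usn > seen:
                self.data[key] = value
                seen = usn
        self.watermarks[partner.name] = seen

    def snapshot(self):
        return copy.deepcopy((self.usn, self.changes, self.data))

    def restore(self, image):
        # Unsupported restore: local state goes back in time, but partners
        # are not told, so their watermarks for this DC stay where they were.
        self.usn, self.changes, self.data = copy.deepcopy(image)


def demo():
    a, b = DomainController("A"), DomainController("B")
    a.write("alice", "v1")
    image = a.snapshot()   # image taken at USN 1
    a.write("bob", "v1")
    a.write("carol", "v1")
    b.pull_from(a)         # B now remembers USN 3 for A
    a.restore(image)       # A is back at USN 1
    a.write("dave", "v1")  # A reuses USN 2
    b.pull_from(a)         # nothing is above 3, so B never sees "dave"
    return a, b
```

After `demo()`, A holds "dave" but not "bob" or "carol", B holds the opposite, and neither side reports an error. Every local database is internally valid in Brett's layered sense, yet the system as a whole will not converge, which is the distinction the thread is arguing over.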
RE: [ActiveDir] Ntds.dit file corruption
Replication is at an attribute level and the corruption is usually a bit flip - which isn't replicated. The data itself (a table or an index) is checked and if found to be invalid, I *believe* (joe, ~Eric, brettsh) is marked as such and is no longer replicated.

-r

--Posting is provided "AS IS", and confers no rights or warranties ...

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of [EMAIL PROTECTED]
Sent: Tuesday, December 06, 2005 2:49 AM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

Is this guaranteed? How can we/you be sure that the system will recognise the corruptions and therefore not replicate them? Surely this is akin to the new feature added to e2k3 sp1, but which is (sadly) missing from AD(?)

I must be missing a subtle point - please show me the light :)

neil

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Steve Linehan
Sent: 05 December 2005 19:26
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

We do not replicate corruption so if you have local corruption as noted below there is no worry that it would replicate around to other servers in the environment.

Thanks,
-Steve

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Phil Renouf
Sent: Monday, December 05, 2005 1:04 PM
To: ActiveDir@mail.activedir.org
Subject: Re: [ActiveDir] Ntds.dit file corruption

Will Read Only DC's take care of this? I don't know much about them yet, but it makes sense that if the copy of the dit that a DC has is RO that it won't try to replicate that anywhere and would only be the recipient of replication. Anyone with more knowledge about how RO DC's will work to comment on that?

Phil

On 12/5/05, Medeiros, Jose [EMAIL PROTECTED] wrote:

Well at least the corruption occurred on just a single DC.
One thing that has bugged me about Active Directory is not being able to select if you want a DC in a remote office to not have the ability to replicate back in a large enterprise environment. Since most remote offices only have a few people at the location and a DC is usually placed for improved logon and authentication time, many companies will either use a very low end server or a very old decommissioned one from their production data center (which is probably close to the end of its usable life). I am always concerned that once the NTDS.DIT file becomes corrupt it will replicate the corruption to the other DC's in the forest.

Maybe I am just being a worrywart and this really is not an issue.

Sincerely,
Jose Medeiros
ADP | National Account Services
ProBusiness Division | Information Services
925.737.7967 | 408-449-6621 CELL

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP]
Sent: Monday, December 05, 2005 8:53 AM
To: ActiveDir@mail.activedir.org
Subject: Re: [ActiveDir] Ntds.dit file corruption

I did? :-)

I think I still said all I know is what the poster said :-)

I think I need a course in event log reading because even with the logs, and the default size of the logs, I still don't see a smoking gun. The directory services one is filled with events 'post' blow up.

What is interesting is that it seems to me big server land goes .. oh yeah... ntds.dit corruption... and sbsland freaks out. Either we do indeed need to ensure we have a secondary DC or we need to park a second copy of a system state offsite [say at the vap/var]

Brett Shirley wrote:

She replied offline, very likely a single bit flip, tragedy, they aren't one release later (Longhorn), where this would've probably been non-disruptively handled, logged, and possibly self-healed: http://blogs.technet.com/efleis/archive/2005/01.aspx

Anyway, this kind of thing is usually hardware ...
While there are much better disk sub-system testers, one that is freely available to any box with Exchange is jetstress. You might give that a try. If you can reproduce the event / error with jetstress I would not use that box in production.

If you do reproduce the issue several times (several times is key, as you want a trend before you start playing the variable game), some things you might vary (one at a time):

- Try making sure you have the latest driver and motherboard / controller firmware. Then see if you can reproduce.
- Try a different RAID configuration, such as RAID1/RAID1+0 if you're on RAID5.
- Try swapping out the hard drives, one at a time.
- Adding the jetstress files to the exclude list in the Anti-Virus software. (A low probability; I've never heard of Anti-Virus causing this particular type of error, and I can't imagine the mistake an anti-virus product would have to make to cause this side effect.)
- If you can reproduce it several times, you could follow up with Dell.

Good luck. I'm not sure if I answered you
RE: [ActiveDir] Ntds.dit file corruption
I've been informed that I'm wrong on this. Please ignore, and listen to joe/~Eric/Dean/Brett/Anyone else.

Cheers!

-r

--Posting is provided "AS IS", and confers no rights or warranties ...
RE: [ActiveDir] Ntds.dit file corruption
I'm tempted to open up the 'Novell were doing this back in '93' debate again, but won't ... and as for "comparing" what Novell did with the PDC/BDC model ... that just doesn't deserve a comment at all :))

neil

From: [EMAIL PROTECTED] On Behalf Of Sullivan Tim
Sent: 06 December 2005 03:38
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

BDC

From: [EMAIL PROTECTED] On Behalf Of Carpenter Robert A Contr WROCI/Enterprise IT
Sent: Monday, December 05, 2005 5:33 PM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

Novell.

From: [EMAIL PROTECTED] On Behalf Of Medeiros, Jose
Sent: Monday, December 05, 2005 11:24 AM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

I was not aware that Microsoft had incorporated such a feature in AD 2003. I know for a fact that Microsoft did not have this feature when AD 2000 was first released because I mentioned it to several Microsoft AD premier support specialists and they each confirmed it was not available (however, it may have been added in a service pack). I would love to know how to enable a read only DC. I think that is a great idea, I wonder who thought of it. :-)

Sincerely,
Jose Medeiros
ADP | National Account Services
ProBusiness Division | Information Services
925.737.7967 | 408-449-6621 CELL

-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of Phil Renouf
Sent: Monday, December 05, 2005 11:04 AM
To: ActiveDir@mail.activedir.org
Subject: Re: [ActiveDir] Ntds.dit file corruption

Will Read Only DCs take care of this? I don't know much about them yet, but it makes sense that if the copy of the DIT that a DC has is RO, it won't try to replicate that anywhere and would only be the recipient of replication. Can anyone with more knowledge about how RO DCs will work comment on that?
RE: [ActiveDir] Ntds.dit file corruption
Is this guaranteed? How can we/you be sure that the system will recognise the corruptions and therefore not replicate them? Surely this is akin to the new feature added to E2K3 SP1, but which is (sadly) missing from AD(?)

I must be missing a subtle point - please show me the light :)

neil

From: [EMAIL PROTECTED] On Behalf Of Steve Linehan
Sent: 05 December 2005 19:26
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

We do not replicate corruption, so if you have local corruption as noted below there is no worry that it would replicate around to other servers in the environment.

Thanks,

-Steve

Brett Shirley wrote:

She replied offline, very likely a single bit flip. The tragedy is that they aren't one release later (Longhorn), where this would've probably been non-disruptively handled, logged, and possibly self-healed:
http://blogs.technet.com/efleis/archive/2005/01.aspx

Anyway, this kind of thing is usually hardware ...

While there are much better disk sub-system testers, one that is freely available to any box with Exchange is jetstress. You might give that a try. If you can reproduce the event / error with jetstress I would not use that box in production.

If you do reproduce the issue several times (several times is key, as you want a trend before you start playing the variable game), some things you might vary (one at a time):
- Try making sure you have the latest driver and motherboard / controller firmware. Then see if you can reproduce.
- Try a different RAID configuration, such as RAID1/RAID1+0 if you're on RAID5.
- Try swapping out the hard drives, one at a time.
- Try adding the jetstress files to the exclude list in the anti-virus software. (A low probability: I've never heard of anti-virus causing this particular type of error, and I can't imagine the mistake an anti-virus product would have to make to cause this side effect.)
- If you can reproduce it several times, you could follow up with Dell.

Good luck. I'm not sure if I answered your question ...

Cheers,
BrettSh

On Sun, 4 Dec 2005, Eric Fleischman wrote:

Going back to the original post, I'm not sure I fully understand the problem yet.

Susan, can you define "ntds.dit file corruption" for us? What sort of corruption? What errors/events lead you to believe this? Specifically, I'm interested in errors from NTDS ISAM or ESE if you have any.

From: [EMAIL PROTECTED] on behalf of Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP]
Sent: Sat 12/3/2005 10:58 PM
To:
RE: [ActiveDir] Ntds.dit file corruption
Ack, you left Alliance. Well, crap.

From: [EMAIL PROTECTED] On Behalf Of Steve Linehan
Sent: Tuesday, December 06, 2005 12:49 AM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

For full disclosure, I am no longer in the Microsoft Services organization. I was the last time Joe talked to me, when I was an Advisory Support Engineer (AKA Alliance Support). I am now a Product Technology Specialist for Directories and Identities in Microsoft's technical pre-sales organization. Not that it changes the answer below. :-)

Thanks,

-Steve

Steve Linehan | Technology Specialist Directories Identities | South Central District | Microsoft Corporation

From: [EMAIL PROTECTED] On Behalf Of joe
Sent: Monday, December 05, 2005 2:38 PM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

RODCs are a Longhorn feature. It will be one-way replication to the RODCs. They will not replicate out anything. If you are on the Longhorn beta you should be able to test this right now.

But as Steve (one of the really good PSS guys) said, and I can concur as I have seen my share of corrupted DITs, the corruption doesn't replicate. In every case I have seen, the problem has been hardware failure or a firmware/driver mismatch in the disk subsystem. Fixing them is easy: wipe the machine and do hardware tests. If it passes, do it again. If it passes, do it a third time. If it passes, reload and repromo. If it fails one of the tests, get the hardware fixed, reload, and repromo.

If SBS, well, you have all sorts of issues in that basket as your eggs leak.

joe
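For anyone who wants joe's rebuild routine spelled out step by step, here is a minimal sketch in Python. It only builds the list of steps and runs nothing against a real server; `rebuild_plan`, `run_hardware_test`, and the step names are made up for illustration and do not correspond to any real tool.

```python
from typing import Callable, List


def rebuild_plan(run_hardware_test: Callable[[], bool],
                 required_passes: int = 3) -> List[str]:
    """Return the ordered steps for rebuilding a DC with a corrupt DIT.

    run_hardware_test is a hypothetical callback that returns True when
    one full pass of the hardware diagnostics comes back clean.
    """
    steps = ["wipe machine"]
    for attempt in range(1, required_passes + 1):
        steps.append(f"hardware test {attempt}")
        if not run_hardware_test():
            # Any failed pass means the hardware gets fixed before reloading.
            steps.append("fix hardware")
            break
    # Either three clean passes or a repair: then reload and repromote.
    steps += ["reload OS", "repromote DC"]
    return steps
```

Calling `rebuild_plan(lambda: True)` yields three test passes followed by the reload and repromo; a callback that fails on the second pass inserts "fix hardware" at that point and skips the remaining passes.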
RE: [ActiveDir] Ntds.dit file corruption
I may get into trouble with this post as Brett/Eric/Dean/Steve correct me... But that will be good.

I will start by trying to differentiate between types of corruption...

My idea of AD corruption is underlying table corruption. However, some people may consider bad (really unexpected) values in AD to be corruption. The latter isn't corruption: AD is simply a store of data, and it passes no judgement on the data as long as it fits the schema guidelines for the attribute. If you have the DN of a user in the siteObject attribute, that isn't corruption. It isn't good, but it is valid for the schema. Or if you have binary data in a unicode string, again, not corruption (a unicode string IS binary data). That being said, if apps (including parts of AD itself) hit unexpected data, you will have some issues; even if it isn't truly "corruption", it may as well be in some cases. In fact, table corruption is probably better than unexpected data in many cases.

You might be able to argue that a USN rollback is corruption but I still don't consider it so. Valid data, just out of step.

Again, corruption to me is in the underlying tables. Since AD doesn't replicate the table structures, you can't pass that table corruption around. AD would realize that some portion of the database is corrupt, probably because ESE says "that isn't right" and passes an error back up to the higher levels instead of the data.

joe

From: [EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Tuesday, December 06, 2005 3:49 AM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

Is this guaranteed? How can we/you be sure that the system will recognise the corruptions and therefore not replicate them? Surely this is akin to the new feature added to E2K3 SP1, but which is (sadly) missing from AD(?)

I must be missing a subtle point - please show me the light :)

neil
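The argument above (table corruption stays local because replication ships logical values read through the database engine, and the engine returns an error rather than bad data) can be illustrated with a toy model. This is only a sketch under loud assumptions: `ToyStore`, `CorruptPageError`, and `replicate` are invented names, CRC32 stands in for whatever checksum ESE really uses, and nothing here reflects the actual AD or ESE code.

```python
import zlib


class CorruptPageError(Exception):
    """Raised when a stored page no longer matches its checksum."""


class ToyStore:
    """Hypothetical page store: each page keeps a CRC32 of its payload."""

    def __init__(self):
        self._pages = {}

    def write(self, key, value):
        data = value.encode("utf-8")
        self._pages[key] = (bytearray(data), zlib.crc32(data))

    def flip_bit(self, key, byte_index=0):
        """Simulate a disk fault by flipping one bit of the stored payload."""
        payload, _ = self._pages[key]
        payload[byte_index] ^= 0x01

    def read(self, key):
        payload, stored_crc = self._pages[key]
        if zlib.crc32(bytes(payload)) != stored_crc:
            # The engine reports an error instead of handing back bad data.
            raise CorruptPageError(key)
        return payload.decode("utf-8")


def replicate(source, keys):
    """Build an outbound batch of logical values, skipping unreadable ones."""
    batch, errors = {}, []
    for key in keys:
        try:
            batch[key] = source.read(key)
        except CorruptPageError:
            errors.append(key)
    return batch, errors
```

In this model a flipped bit on one object means that object lands in `errors` and never reaches a partner, while healthy objects still replicate. It says nothing about bad-but-valid values, which pass every checksum and replicate like any other data.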
RE: [ActiveDir] Ntds.dit file corruption
BDC.. Yes and no.. Yes, it is a read-only copy of the PDC's database, but no, you do not have an option to choose.

Sincerely,
Jose Medeiros
ADP | National Account Services
ProBusiness Division | Information Services
925.737.7967 | 408-449-6621 CELL

-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of Sullivan Tim
Sent: Monday, December 05, 2005 7:38 PM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

BDC
RE: [ActiveDir] Ntds.dit file corruption
Well, you have the option to choose which DCs will be RODCs and which will be normal; you just don't have the ability to switch on the fly. Also, the replication mechanism isn't the same as the NT4 PDC/BDC relationship. It is AD replication, but nothing can pull from an RODC. Also, you will probably be able to make someone an admin on an RODC for local server stuff who doesn't have admin rights on other DCs.

From: [EMAIL PROTECTED] On Behalf Of Medeiros, Jose
Sent: Tuesday, December 06, 2005 11:57 AM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

BDC.. Yes and no.. Yes, it is a read-only copy of the PDC's database, but no, you do not have an option to choose.

Sincerely,
Jose Medeiros
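To make the "nothing can pull from an RODC" point concrete, here is a toy Python model of one-way replication. It is a sketch under assumptions, not how Longhorn implements it: `DomainController`, `local_write`, and `pull_from` are hypothetical names, and the model simply rejects writes on a read-only DC.

```python
class DomainController:
    """Toy DC holding a dict of objects; read_only marks it as an RODC."""

    def __init__(self, name, read_only=False):
        self.name = name
        self.read_only = read_only
        self.data = {}

    def local_write(self, key, value):
        # In this model an RODC simply refuses originating writes.
        if self.read_only:
            raise PermissionError(f"{self.name} is read-only")
        self.data[key] = value

    def pull_from(self, source):
        # Replication is pull-based here; an RODC is never a valid source.
        if source.read_only:
            raise PermissionError(f"nothing can pull from RODC {source.name}")
        self.data.update(source.data)
```

With a writable `hub` and a read-only `branch`, `branch.pull_from(hub)` succeeds and `hub.pull_from(branch)` raises, so whatever goes wrong on the branch copy has no path back to the rest of the forest in this model.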
Re: [ActiveDir] Ntds.dit file corruption
Additional domain controller. BDC is an NT4 concept, and in my book NT4 is dead ;-)

Medeiros, Jose wrote:

BDC.. Yes and no.. Yes, it is a read-only copy of the PDC's database, but no, you do not have an option to choose.

Sincerely,
Jose Medeiros
RE: [ActiveDir] Ntds.dit file corruption
In the Microsoft book it is dead too.

-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP]
Sent: Tuesday, December 06, 2005 12:28 PM
To: ActiveDir@mail.activedir.org
Subject: Re: [ActiveDir] Ntds.dit file corruption

Additional Domain controller
BDC is an NT4 concept and in my book NT4 is dead ;-)
RE: [ActiveDir] Ntds.dit file corruption
Hi Susan,

With all due respect, I think you missed the point. The concept of a read-only DC is similar to a BDC, since a BDC only has a read-only copy of the PDC's database. In some situations you may want a read-only DC at a small remote office, which would help reduce replication traffic. Also, most technologies are built on past concepts and are hierarchical. Understanding one concept helps you to understand the logic in another. Peace!

Sincerely,
Jose Medeiros

-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP]
Sent: Tuesday, December 06, 2005 9:28 AM
To: ActiveDir@mail.activedir.org
Subject: Re: [ActiveDir] Ntds.dit file corruption

Additional Domain controller
BDC is an NT4 concept and in my book NT4 is dead ;-)
Re: [ActiveDir] Ntds.dit file corruption
True, but right now, today, we have what we have. From what I'm hearing the corruption won't be replicated, but a longer-term solution won't be in play until Longhorn/Vista.

Medeiros, Jose wrote:

Hi Susan, With all due respect, I think you missed the point. The concept of a read-only DC is similar to a BDC, since a BDC only has a read-only copy of the PDC's database. In some situations you may want a read-only DC at a small remote office, which would help reduce replication traffic. Also, most technologies are built on past concepts and are hierarchical. Understanding one concept helps you to understand the logic in another. Peace!
RE: [ActiveDir] Ntds.dit file corruption
True. But by bringing it up (which is what you did when your SBS server's NTDS.DIT file became corrupt), we can hopefully encourage the Microsoft team that monitors this list to incorporate such features in the next release.

Sincerely,
Jose Medeiros

-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP]
Sent: Tuesday, December 06, 2005 10:08 AM
To: ActiveDir@mail.activedir.org
Subject: Re: [ActiveDir] Ntds.dit file corruption

True, but right now, today, we have what we have. From what I'm hearing the corruption won't be replicated, but a longer-term solution won't be in play until Longhorn/Vista.
RE: [ActiveDir] Ntds.dit file corruption
I think the topic shifted a little; specifically, it shifted from the corruption aspect to the concept of read-only DCs. Read-only DCs really have no bearing on directory corruption. I haven't seen details on what kind of corruption it was or how it was detected, but if it is real corruption, that is at the ESE level and there is not much AD can do about it. ESE, however, can do things about it, like the single-bit correction he pointed out.

Anyway, I don't expect RODCs to be a big hit for SBS deployments. ;o)

-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP]
Sent: Tuesday, December 06, 2005 1:08 PM
To: ActiveDir@mail.activedir.org
Subject: Re: [ActiveDir] Ntds.dit file corruption

True, but right now, today, we have what we have. From what I'm hearing the corruption won't be replicated, but a longer-term solution won't be in play until Longhorn/Vista.
RE: [ActiveDir] Ntds.dit file corruption
Great topic and, IMO, great answer ... I've only a few comments in addition to Joe's reply (inline).

--
Dean Wells
MSEtechnology
Email: dwells@msetechnology.com
http://msetechnology.com

From: [EMAIL PROTECTED] On Behalf Of joe
Sent: Tuesday, December 06, 2005 8:56 AM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

I may get into trouble with this post as Brett/Eric/Dean/Steve correct me... But that will be good.

[DAW] I'm fairly certain Brett will have something to say on this one (in his shoes, I know I would). [/DAW]

I will start with trying to differentiate between types of corruption... My idea of AD corruption is underlying table corruption. However, some people may consider bad (really, unexpected) values in AD to be corruption. The latter isn't corruption: AD is simply a store of data, and it passes no judgement on the data as long as it fits the schema guidelines for the attribute. If you have the DN of a user in the siteObject attribute, that isn't corruption; it isn't good, but it is valid for the schema. Or if you have binary data in a unicode string, again, not corruption (a unicode string IS binary data).

That being said, if apps (including parts of AD itself) hit unexpected data, you will have some issues. Even if it isn't truly "corruption", it may as well be in some cases. In fact, table corruption is probably better than unexpected data in many cases.

You might be able to argue that a USN rollback is corruption, but I still don't consider it so. Valid data, just out of step.

[DAW] That's an interesting one. If you treat the distributed database as a whole, then USN rollback is indeed a form of corruption even though each instance may deem itself consistent and intact. [/DAW]

Again, corruption to me is in the underlying tables. Since AD doesn't replicate the table structures, you can't pass that table corruption around. Once AD realizes that some portion of the database is corrupt which would probably be recognized by ESE saying, "that isn't right" and not passing info back up to higher levels, but instead passing an error.

joe

From: [EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Tuesday, December 06, 2005 3:49 AM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

Is this guaranteed? How can we/you be sure that the system will recognise the corruptions and therefore not replicate them? Surely this is akin to the new feature added to e2k3 sp1, but which is (sadly) missing from AD(?)

I must be missing a subtle point - please show me the light :)

neil

From: [EMAIL PROTECTED] On Behalf Of Steve Linehan
Sent: 05 December 2005 19:26
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

We do not replicate corruption, so if you have local corruption as noted below, there is no worry that it would replicate around to other servers in the environment.

Thanks,
-Steve

From: [EMAIL PROTECTED] On Behalf Of Phil Renouf
Sent: Monday, December 05, 2005 1:04 PM
To: ActiveDir@mail.activedir.org
Subject: Re: [ActiveDir] Ntds.dit file corruption

Will Read Only DC's take care of this? I don't know much about them yet, but it makes sense that if the copy of the dit that a DC has is RO, it won't try to replicate that anywhere and would only be the recipient of replication. Anyone with more knowledge about how RO DC's will work care to comment on that?

Phil
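To make the "valid data, just out of step" point concrete, here is a minimal sketch of how a USN rollback can leave every replica internally consistent while the system as a whole stops converging. It is a toy model under stated assumptions, not how AD actually replicates: the `DC` class, `pull_from`, and the per-partner watermark are hypothetical names made up for illustration, and the model deliberately leaves out the invocation ID reset that a supported restore performs.

```python
import copy


class DC:
    """Toy domain controller (hypothetical): a key/value store stamped with local USNs."""

    def __init__(self, name):
        self.name = name
        self.usn = 0          # highest USN this DC has issued so far
        self.data = {}        # key -> (value, USN at which it was written here)
        self.watermark = {}   # partner name -> highest partner USN already pulled

    def write(self, key, value):
        self.usn += 1
        self.data[key] = (value, self.usn)

    def pull_from(self, source):
        """Pull only the changes the source stamped above our watermark for it."""
        seen = self.watermark.get(source.name, 0)
        changes = sorted(source.data.items(), key=lambda item: item[1][1])
        for key, (value, usn) in changes:
            if usn > seen:
                self.write(key, value)
        self.watermark[source.name] = max(seen, source.usn)

    def values(self):
        return {key: value for key, (value, _) in self.data.items()}


def demo():
    dc1, dc2 = DC("DC1"), DC("DC2")

    dc1.write("alice", "v1")          # DC1 USN 1
    snapshot = copy.deepcopy(dc1)     # image-style copy taken at USN 1
    dc1.write("bob", "v1")            # DC1 USN 2
    dc1.write("carol", "v1")          # DC1 USN 3
    dc2.pull_from(dc1)                # DC2 now remembers "DC1 up to USN 3"
    assert dc1.values() == dc2.values()

    # Rollback: DC1 comes back from the old image and reuses USNs 2 and 3.
    dc1 = copy.deepcopy(snapshot)
    dc1.write("dave", "v1")           # DC1 USN 2 again
    dc1.write("erin", "v1")           # DC1 USN 3 again
    dc2.pull_from(dc1)                # nothing is above the watermark, so nothing moves

    return dc1.values(), dc2.values()


if __name__ == "__main__":
    dc1_view, dc2_view = demo()
    print("DC1:", dc1_view)
    print("DC2:", dc2_view)
```

In this sketch neither store contains anything malformed, yet DC2 never learns about the post-rollback writes and DC1 no longer holds the ones it issued before the rollback, which is the divergence being debated here.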
RE: [ActiveDir] Ntds.dit file corruption
My apologies to the list members for taking this issue slightly off topic. I hope that no one is offended by such remarks or the additional email. Peace! :-)

Sincerely,
Jose Medeiros

-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of joe
Sent: Tuesday, December 06, 2005 10:58 AM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

I think the topic shifted a little, specifically it shifted from the corruption aspect and into the concept of read only DCs. The read only DCs really have no bearing on directory corruption. I haven't seen details on what kind of corruption and how it was detected but if it is real corruption that is ESE level and not much AD can do about it but ESE can do things about it like the single bit correction he pointed out.

Anyway, I don't expect RODCs to be a big hit for SBS deployments. ;o)
RE: [ActiveDir] Ntds.dit file corruption
LOL. I enjoyed it which means it is all good as you all exist for my personal entertainment. ;o) Well except for Laura, she exists to hound me to the end of my existence on commas.

very glad that you can't throw virtual vegetables at list posters

joe

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Medeiros, Jose Sent: Tuesday, December 06, 2005 2:26 PM To: ActiveDir@mail.activedir.org Subject: RE: [ActiveDir] Ntds.dit file corruption

My apologies to the list members for taking this issue slightly off topic. I hope that no one is offended by such remarks or the additional email. Peace! :-)

Sincerely, Jose Medeiros ADP | National Account Services ProBusiness Division | Information Services 925.737.7967 | 408-449-6621 CELL

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of joe Sent: Tuesday, December 06, 2005 10:58 AM To: ActiveDir@mail.activedir.org Subject: RE: [ActiveDir] Ntds.dit file corruption

I think the topic shifted a little; specifically, it shifted from the corruption aspect into the concept of read-only DCs. Read-only DCs really have no bearing on directory corruption. I haven't seen details on what kind of corruption it was or how it was detected, but if it is real corruption, that is ESE level; there is not much AD can do about it, but ESE can do things about it, like the single-bit correction he pointed out. Anyway, I don't expect RODCs to be a big hit for SBS deployments. ;o)

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP] Sent: Tuesday, December 06, 2005 1:08 PM To: ActiveDir@mail.activedir.org Subject: Re: [ActiveDir] Ntds.dit file corruption

True, but right now, today, we have what we have. From what I'm hearing the corruption won't be replicated, but a longer-term solution won't be in play until Longhorn/Vista.

Medeiros, Jose wrote:

Hi Susan, with all due respect, I think you missed the point. The concept of having a read-only DC is similar to a BDC, since a BDC only has a read-only copy of the PDC's database. In some situations you may want a read-only DC at a small remote office, which would help reduce replication traffic. Also, most technologies are built on past concepts and are hierarchical. Understanding one concept helps you to understand the logic in another. Peace!

Sincerely, Jose Medeiros

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP] Sent: Tuesday, December 06, 2005 9:28 AM To: ActiveDir@mail.activedir.org Subject: Re: [ActiveDir] Ntds.dit file corruption

Additional Domain controller. BDC is an NT4 concept, and in my book NT4 is dead ;-)

Medeiros, Jose wrote:

BDC.. Yes and no.. Yes, it is a read-only copy of the PDC's database, but no, you do not have an option to choose.

Sincerely, Jose Medeiros

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sullivan Tim Sent: Monday, December 05, 2005 7:38 PM To: ActiveDir@mail.activedir.org Subject: RE: [ActiveDir] Ntds.dit file corruption

BDC

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Carpenter Robert A Contr WROCI/Enterprise IT Sent: Monday, December 05, 2005 5:33 PM To: ActiveDir@mail.activedir.org Subject: RE: [ActiveDir] Ntds.dit file corruption

Novell.

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Medeiros, Jose Sent: Monday, December 05, 2005 11:24 AM To: ActiveDir@mail.activedir.org Subject: RE: [ActiveDir] Ntds.dit file corruption

I was not aware that Microsoft had incorporated such a feature in AD 2003. I know for a fact that Microsoft did not have this feature when AD 2000 was first released, because I mentioned it to several Microsoft AD premier support specialists and they each confirmed it was not available (however, it may have been added in a service pack). I would love to know how to enable a read-only DC. I think that is a great idea; I wonder who thought of it. :-)

Sincerely, Jose Medeiros

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Phil Renouf Sent: Monday
RE: [ActiveDir] Ntds.dit file corruption
I stopped reading after "great answer"... :)

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dean Wells Sent: Tuesday, December 06, 2005 2:14 PM To: Send - AD mailing list Subject: RE: [ActiveDir] Ntds.dit file corruption

Great topic and, IMO, great answer ... I've only a few comments in addition to Joe's reply (inline).

-- Dean Wells, MSEtechnology Email: dwells@msetechnology.com http://msetechnology.com

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of joe Sent: Tuesday, December 06, 2005 8:56 AM To: ActiveDir@mail.activedir.org Subject: RE: [ActiveDir] Ntds.dit file corruption

I may get into trouble with this post as Brett/Eric/Dean/Steve correct me... But that will be good.

[DAW] I'm fairly certain Brett will have something to say on this one (in his shoes, I know I would). [/DAW]

I will start with trying to differentiate between types of corruption... My idea of AD corruption is underlying table corruption. However, some people may consider bad (really, unexpected) values in AD to be corruption. The latter isn't corruption: AD is simply a store of data, and it passes no judgement on the data as long as it fits the schema guidelines for the attribute. If you have the DN of a user in the siteObject attribute, that isn't corruption; it isn't good, but it is valid for the schema. Or if you have binary data in a unicode string, again, not corruption (a unicode string IS binary data). That being said, if apps (including parts of AD itself) hit unexpected data, you will have some issues; even if it isn't truly "corruption", it may as well be in some cases. In fact, table corruption is probably better than unexpected data in many cases. You might be able to argue that a USN rollback is corruption, but I still don't consider it so. Valid data, just out of step.

[DAW] That's an interesting one. If you treat the distributed database as a whole, then USN rollback is indeed a form of corruption, even though each instance may deem itself consistent and intact. [/DAW]

Again, corruption to me is in the underlying tables. Since AD doesn't replicate the table structures, you can't pass that table corruption around. AD realizes that some portion of the database is corrupt, which would probably be recognized by ESE saying, "that isn't right" and not passing info back up to higher levels, but instead passing an error.

joe

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED] Sent: Tuesday, December 06, 2005 3:49 AM To: ActiveDir@mail.activedir.org Subject: RE: [ActiveDir] Ntds.dit file corruption

Is this guaranteed? How can we/you be sure that the system will recognise the corruptions and therefore not replicate them? Surely this is akin to the new feature added to e2k3 sp1, but which is (sadly) missing from AD(?) I must be missing a subtle point - please show me the light :)

neil

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Steve Linehan Sent: 05 December 2005 19:26 To: ActiveDir@mail.activedir.org Subject: RE: [ActiveDir] Ntds.dit file corruption

We do not replicate corruption, so if you have local corruption as noted below, there is no worry that it would replicate around to other servers in the environment.

Thanks, -Steve

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Phil Renouf Sent: Monday, December 05, 2005 1:04 PM To: ActiveDir@mail.activedir.org Subject: Re: [ActiveDir] Ntds.dit file corruption

Will Read Only DCs take care of this? I don't know much about them yet, but it makes sense that if the copy of the dit that a DC has is RO, it won't try to replicate that anywhere and would only be the recipient of replication. Anyone with more knowledge about how RO DCs will work care to comment on that?

Phil

On 12/5/05, Medeiros, Jose [EMAIL PROTECTED] wrote:

Well, at least the corruption occurred on just a single DC. One thing that has bugged me about Active Directory is not being able to select, in a large enterprise environment, that a DC in a remote office should not have the ability to replicate back. Since most remote offices only have a few people at the location and a DC is usually placed there for improved logon and authentication time, many companies will either use a very low-end server or a very old decommissioned one from their production data center (which is probably close to the end of its usable life). I am always concerned that once the NTDS.DIT file becomes corrupt, it will replicate the corruption to the other DCs in the forest. Maybe I am just being a worrywart and this really is not an issue.

Sincerely, Jose Medeiros ADP | National Account Services ProBusiness Division | Information Services 925.737.7967 | 408-449-6621 CELL

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP] Sent: Monday, December
RE: [ActiveDir] Ntds.dit file corruption
I wouldn't say that, joe ...

Let's take another hypothetical real quick: let's say you have a column for the RDN of an AD object (well, we do) and that value is NULL. From AD's perspective this object is, well, not really an object; it would be corrupt, and might even crash lsass.exe (I don't know, it might). From ESE's perspective, though, the table/row/column is valid; it has a particular column that doesn't have a value. A column which, I might add, is declared optional (real term is "tagged") in the ESE layer schema (real term is "catalog"). ESE is simply a store of data; it passes no judgement on the data as long as it fits the schema guidelines for the column.

Joe, is the DB corrupt? An AD object without an RDN?

I have a tendency to think in layers and sources of corruption.

    App Logical Layer
    AD Logical Layer
    ESE Logical Layer
    [ESE] Physical Layer

Corruptions coming top down through that stack are guarded against by the schema configuration/constraints of that layer (as joe astutely pointed out). Corruptions coming bottom up, from disk subsystem hardware, are guarded against by whatever mechanisms those layers have.

Dropping back to the above hypothetical: as an ESE dev, I can say to the AD devs that until they can prove that ESE actually lost their column, it's most likely some sort of AD transactional problem, and the source is an AD bug. If I am feeling unbusy I will debug at the AD logical layer, because I know what it's supposed to look like.

Coming back to the original issue of replicating _this kind_ of corruption: a normal corruption coming bottom up, where the bits we (ESE) sent down to the disk subsystem were not the exact bits we got back later, is almost always detected, because ESE checksums _every byte_ of its database pages ... and at this point everyone should be very thankful Win2k3 AD isn't on SQL 2000, because it has few such protections, though SQL 2005 finally caught up, 10 years after the fact, it's such a legacy DB, really ... anyway.

When the corruption comes up from the bottom, what happens is that ESE detects the data is not checksumming, logs an event, returns a -1018 error (in this case), and starts rejecting DB operations (such as JetSeek() / JetRetrieveColumn()) that involve that corrupt database page. AD then responds to these failed DB ops with: AD can't authenticate a user, AD can't return the results of a search, or AD can't read or apply data during replication (those 3 at least probably being the most common). In short, the system starts limping, without affecting the rest of the distributed system.

Coming back to jose's worry of old hardware injecting bad data into the distributed system: fortunately, when the disk subsystem goes bad, ESE does a pretty good job of protecting you, but there are other sources of corruption besides the disk. An especially insidious one is the bit flip in memory (and yes, I see these too), which injects itself in the middle of the above stack. This kind of corruption can end up making its way both down to the disk subsystem (with a valid ESE checksum) and up and out to the distributed system. From the perspective of older hardware, though, I would _hypothesize_ that if you're going to have something go bad over time, the disk or the memory, keep in mind the disk is the only part of the computer that has a moving part. I would expect disks to go bad first.

I would generally not call USN rollback a corruption either, but I think Dean makes a fair and quasi-valid point that if you consider the distributed system, yes, such a thing is a corruption. Feel free to shim in an AD Distributed System Logical Layer in the above stack, between AD Logical Layer and App Logical Layer. I'm waffling on this point though, as something smells different than other types of corruption. I'm going to think about that for a long time ... in fact Eric (yes, the ~Eric) is at my door and says he would consider it corruption, so there is a long debate in my future as well ...

From a storage developer's perspective, what someone usually calls corruption is when the data layer they own, or one lower, returns the wrong result. From a non-storage developer's perspective, what someone usually calls corruption is when the data layer below them returns the wrong result.

I'll wax more philosophical on it later.

Cheers, BrettSh
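A rough way to picture the two bottom-up cases described above (bad bits coming back from disk versus a bit flip in memory before the write) is the toy page store below. This is only a sketch under stated assumptions: `ToyPageStore`, `flip_bit`, and the use of CRC32 are made up for illustration and are not the ESE API or ESE's actual checksum algorithm; the only detail taken from the post is the -1018 error code.

```python
import zlib

CHECKSUM_MISMATCH = -1018  # error code quoted in the post; everything else here is hypothetical


def flip_bit(data, byte_index):
    """Return a copy of data with the lowest bit of one byte inverted."""
    damaged = bytearray(data)
    damaged[byte_index] ^= 0x01
    return bytes(damaged)


class ToyPageStore:
    """Hypothetical page store (not ESE): a checksum is stored alongside every page."""

    def __init__(self):
        self._disk = {}  # page number -> (checksum computed at write time, page bytes)

    def write_page(self, page_no, data):
        # The checksum covers whatever bytes the caller hands over, good or bad.
        self._disk[page_no] = (zlib.crc32(data), bytes(data))

    def read_page(self, page_no):
        """Return (error_code, data); data is None when the page fails verification."""
        stored, data = self._disk[page_no]
        if zlib.crc32(data) != stored:
            return CHECKSUM_MISMATCH, None
        return 0, data

    def damage_on_disk(self, page_no, byte_index):
        """Simulate the disk handing back different bits than were written."""
        stored, data = self._disk[page_no]
        self._disk[page_no] = (stored, flip_bit(data, byte_index))


store = ToyPageStore()

# Case 1: corruption coming up from the disk is caught when the page is read.
store.write_page(1, b"rdn=alice")
store.damage_on_disk(1, 4)
disk_result = store.read_page(1)

# Case 2: the bit flips in memory before the write, so the checksum is computed
# over the already-damaged bytes and the page verifies cleanly.
store.write_page(2, flip_bit(b"rdn=alice", 4))
memory_result = store.read_page(2)

print(disk_result)
print(memory_result)
```

In the first case the caller gets an error instead of data, which matches the "system starts limping" behaviour; in the second the damaged bytes come back with a clean status, which is why the in-memory flip is the insidious one.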
Re: [ActiveDir] Ntds.dit file corruption
On 12/6/05, joe [EMAIL PROTECTED] wrote: LOL. I enjoyed it which means it is all good as you all exist for my personal entertainment. ;o) Well except for Laura, she exists to hound me to the end of my existence on commas. very glad that you can't throw virtual vegetables at list posters Keep it up, joe, and I'll start proofreading your activedir posts as well. (Note the appropriate comma usage.) :-) - Laura List info : http://www.activedir.org/List.aspx List FAQ: http://www.activedir.org/ListFAQ.aspx List archive: http://www.mail-archive.com/activedir%40mail.activedir.org/
RE: [ActiveDir] Ntds.dit file corruption
<snip>

I would generally not call USN rollback a corruption either, but I think Dean makes a fair and quasi-valid point that if you consider the distributed system, yes, such a thing is a corruption. Feel free to shim in an AD Distributed System Logical Layer in the above stack, between AD Logical Layer and App Logical Layer. I'm waffling on this point though, as something smells different than other types of corruption. I'm going to think about that for a long time ... in fact Eric (yes, the ~Eric) is at my door and says he would consider it corruption, so there is a long debate in my future as well ...

</snip>

Over lunch, Brett and I discussed this some more. My contention is that USN rollback would be a form of corruption under a somewhat broad definition. The reality is that there is a layer that Brett mentioned which actually has two parts when looked at from a high level. Namely, this layer:

    AD Logical Layer

The first piece could be thought of as the local logical layer. That is, data hierarchy, conforming to the code assumptions of how it should be, data conforming to the schema as defined, etc. This is a layer of data that clearly need be proper (leaving the definition of "proper" to another day), else we are in some sort of corrupt state. Brett and I both agree on this, I'm pretty sure.

However, there is then distributed systems corruption. In AD, one of the services we aim to provide is convergence. If we do not converge, we define this divergence as at a minimum bad, perhaps corrupt. USN rollback breaks our convergence guarantees; it breaks replication such that you will not attain convergence in the system. I would as such consider it a form of corruption.

Over Teriyaki a few minutes ago, Brett posited the question, "well, if USN rollback is corruption, what else?" Valid question. I would concede that if USN rollback is considered distributed systems corruption, so too would be other conditions which yield divergence. Perhaps this is a slippery slope that goes too far. I need to think about this some more.

I would also toss out there that corruption should not be confused with "forever broken". There are many states in which the directory can exist where it is functional, but in some way broken. Such divergences can typically be repaired with administrative action, so long as it is a savvy administrator. :) If we are willing to assume that divergence is corruption, I'd tend to believe that most people on this list have recovered from some form of corruption before. The worse the corruption, the more help you likely want to recover from it. :)

Anyway, we'll likely debate this for a few months, as we usually do on such points. More thoughts to come as we debate further.

~Eric
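To make the "USN rollback breaks convergence" argument concrete, here is a deliberately tiny model of pull replication with a per-partner high-water mark. It is a sketch under assumptions, not AD's replication code: `ToyDC`, `snapshot`, `restore`, and `rollback_demo` are hypothetical names, and real replication tracks far more state than a single counter per partner.

```python
class ToyDC:
    """Hypothetical domain controller: a change log keyed by local USN, plus pull replication."""

    def __init__(self, name):
        self.name = name
        self.next_usn = 1
        self.log = []             # (usn, key, value) for changes originated on this DC
        self.data = {}
        self.high_watermark = {}  # partner name -> highest USN already pulled from it

    def originate(self, key, value):
        usn = self.next_usn
        self.next_usn += 1
        self.log.append((usn, key, value))
        self.data[key] = value

    def pull_from(self, source):
        # Only ask for changes above the highest USN we believe we already have.
        already_seen = self.high_watermark.get(source.name, 0)
        for usn, key, value in source.log:
            if usn > already_seen:
                self.data[key] = value
        if source.log:
            self.high_watermark[source.name] = max(already_seen, source.log[-1][0])

    def snapshot(self):
        return self.next_usn, list(self.log), dict(self.data)

    def restore(self, image):
        # Restoring an old image also rewinds the USN counter; partners are not told.
        self.next_usn, log, data = image
        self.log, self.data = list(log), dict(data)


def rollback_demo():
    dc1, dc2 = ToyDC("dc1"), ToyDC("dc2")
    dc1.originate("a", 1)   # USN 1
    image = dc1.snapshot()
    dc1.originate("b", 2)   # USN 2
    dc1.originate("c", 3)   # USN 3
    dc2.pull_from(dc1)      # dc2 now believes it has dc1's changes through USN 3
    dc1.restore(image)      # dc1's counter goes back to 2
    dc1.originate("d", 4)   # reuses USN 2
    dc1.originate("e", 5)   # reuses USN 3
    dc2.pull_from(dc1)      # nothing is above USN 3, so "d" and "e" are skipped
    return dc1.data, dc2.data


dc1_data, dc2_data = rollback_demo()
print(dc1_data)
print(dc2_data)
```

Each toy DC is internally consistent, yet the two never agree again without outside help, which is the divergence being argued about: valid data on every instance, broken as a distributed whole.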
RE: [ActiveDir] Ntds.dit file corruption
I like, to fly in; the face of: convention...
RE: [ActiveDir] Ntds.dit file corruption
Good post ~Eric, thanks for chiming in. I see where you are coming from with the corruption at the distributed level. In terms of corruption at that level I see it as corruption but just can't get myself to see it as AD corruption. I am not sure if I can put it down in words why. I just don't. :)

joe
RE: [ActiveDir] Ntds.dit file corruption
Cool, got Brett to sit up and type... Crap, now I have to read it. j/k, I like long answers from people like Brett, it gives insight into the person as well as into the technology. When people ask, how do you know so much about , it is because I piss off the people to make them teach me how it really works. That is how I learned most of the Exchange stuff back when I first started working on it. ;o) Joe, is the DB corrupt? An AD object without an RDN? Good example, I would have to say maybe in that case. I expect it would either be a normal occurrence or take a serious failure of the AD App layer to allow that to occur unless ESE for some reason decided not to write or retrieve it properly. Even though it isn't required at the ESE Layer, I expect at some level of AD there is something enforcing the setting of that column. I don't know enough about the mechanics to say if it bad or not. be very thankful Win2k3 AD isn't on SQL 2000, because it has few such protections, though SQL 2005 finally caught up, 10 years after the fact, it's such a legacy DB, really ... anyway. I am. Thank you Brett. Even though I want triggers and business rules, I would rather see them make it into ESE than move AD to SQL. In fact, I tell everyone who will listen that I will likely not willingly get very serious with MIIS while it is sitting on SQL. I would prefer to see ESE under it. I like ESE. I would even wear a Brett says ESE rocks T-Shirt if I had one with that ugly mug of yours on it. joe -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Brett Shirley Sent: Tuesday, December 06, 2005 3:04 PM To: ActiveDir@mail.activedir.org Subject: RE: [ActiveDir] Ntds.dit file corruption I wouldn't say that, joe ... Lets take another hypothetical real quick, lets say you have a column for the RDN of an AD object (well we do) and that value is NULL. 
From AD's perspective this object is well not really an object, it would be corrupt, and might even crash lsass.exe (I don't know, it might). However, from ESE's persepctive though, the table/row/column is valid, it has a particular column that doesn't have a value. A column which I might add is declared optional (real term is tagged) in the ESE layer schema (real term is catalog). ESE is simply a store of data, it passes no judgement on the data as long as it fits the schema guidelines for the column. Joe, is the DB corrupt? An AD object without an RDN? I have tendency to think in layers and sources of corruption. App Logical Layer AD Logical Layer ESE Logical Layer [ESE] Physical Layer Corruptions coming top down through that stack are protected by the schema configuration/constraints of that layer (as joe astutely pointed out). Corruptions coming bottom up, from disk sub-system hardware, are protected by whatever mechanisms those layers have. Dropping back to the above hypothetical as an ESE dev I can say to the AD devs that until they can prove that ESE actually lost thier column, that it's most likely some sort of AD transactional problem, and the source is an AD bug. If I am feeling unbusy I will debug at the AD logical layer, because I know what it's supposed to look like. Coming back to the original issue of replicating _this kind_ of corruption a normal corruption coming bottom up, because the bits we (ESE) sent down the disk subsystem, were not the exact bits we got back later from the sub-systems is almost always detected by the fact that ESE checksums _every byte_ of it's database pages ... and at this point everyone should be very thankful Win2k3 AD isn't on SQL 2000, because it has few such protections, though SQL 2005 finally caught up, 10 years after the fact, it's such a legacy DB, really ... anyway. 
When the corruption comes up from the bottom, what happens is ESE detects the data is not checksumming, logs an event, returns a -1018 error (in this case), and starts rejecting DB operations (such as JetSeek() / JetRetrieveColumn()) that involve that corrupt database page. AD then responds to these failed DB ops with: AD can't authenticate a user, AD can't return the results of a search, or AD can't read or apply data during replication (those 3 at least probably being the most common). In short, the system starts limping, without affecting the rest of the distributed system.

Coming back to Jose's worry of old hardware injecting bad data into the distributed system: fortunately, when the disk subsystem goes bad, ESE does a pretty good job of protecting you, but there are other sources of corruption besides disk corruption. An especially insidious one is the bit flip in memory (and yes, I see these too), which injects itself in the middle of the above stack. This kind of corruption can both end up making its way down to the disk subsystem (with a valid ESE checksum), and up and out to the distributed system. From the perspective of older hardware though, I would
RE: [ActiveDir] Ntds.dit file corruption
Absolutely a great way to learn. I haven't tried the "piss off people smarter than me" approach, but I'll have to put that in the bag of tricks ;)

I have to disagree, Joe. I'd say that if the column were missing the data and that was allowed at that layer, then it's not corruption; it's just unexpected at the other layers. In fact, I'd have to question whether or not it's really an object at all (any longer) because it's a DN, but that's neither here nor there.

I suppose the counter to that is that it's still broken. To that, I would say I agree, but it's not corruption, and that distinction is often very important in the recovery process (diagnosis and prevention).

As you mentioned, it's just a storage mechanism - similar to an intelligent shoebox. If I put a rock in there, it remains a rock. If my dog takes the rock out, then when I go to get it, it's not corrupt, it's just not there. But it's still a shoebox, it still operates as expected, and if the rock were there it wouldn't change the rock in any unexpected way. It's just that something else took my rock from me.

This is only important when it comes to diagnosing and preventing the symptoms you experience when your rock is taken unexpectedly. The end result may be the same regardless.

Aside: I don't know as I'd wear a powder blue shirt with Brett's mug on it, but I might carry a mug with his picture on it. Maybe similar to http://www.cafepress.com/ehlo.10124219 with some snazzy saying on there?

Also, I'd love to know how a memory bit flip was diagnosed. If you ever get the time, Brett.

From: joe [EMAIL PROTECTED]
Reply-To: ActiveDir@mail.activedir.org
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption
Date: Tue, 6 Dec 2005 20:54:56 -0500

[snip]
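The disagreement over the NULL RDN comes down to which layer is doing the judging. A minimal sketch, assuming a made-up two-column table: the column names, the schema dictionary, and both check functions are hypothetical and are not the real ntds.dit layout or any ESE or AD API.

```python
from typing import Dict, List, Optional

# Hypothetical storage-layer schema: column name -> is a value required?
# "rdn" is optional here, mirroring the "tagged" column in the thread.
STORE_SCHEMA: Dict[str, bool] = {"row_id": True, "rdn": False}

Row = Dict[str, Optional[str]]


def store_accepts(row: Row) -> bool:
    """Storage-layer view: only required columns must carry a value."""
    return all(
        row.get(column) is not None
        for column, required in STORE_SCHEMA.items()
        if required
    )


def directory_problems(row: Row) -> List[str]:
    """Directory-layer view: an object with no RDN is not a usable object."""
    problems = []
    if not row.get("rdn"):
        problems.append("object has no RDN")
    return problems


row: Row = {"row_id": "1234", "rdn": None}

print(store_accepts(row))        # the store sees a valid row
print(directory_problems(row))   # the directory sees a broken object
```

The same row passes one check and fails the other, which is why the diagnosis (a store that lost data versus an application that never wrote it) matters even when the symptom is identical.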
RE: [ActiveDir] Ntds.dit file corruption
Ok, a mug with Brett's mug on it and with him saying "My ESE can beat up your SQL Server."

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Al Mulnick
Sent: Tuesday, December 06, 2005 9:38 PM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

[snip]
RE: [ActiveDir] Ntds.dit file corruption
Correction. I meant to say: the Esentutl utility with the /d switch, not Eseutil /d.

Sincerely,
Jose Medeiros
ADP | National Account Services
ProBusiness Division | Information Services
925.737.7967 | 408-449-6621 CELL

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Jose Medeiros
Sent: Sunday, December 04, 2005 12:42 AM
To: ActiveDir@mail.activedir.org
Subject: Re: [ActiveDir] Ntds.dit file corruption

Even if it's SCSI on a RAID 5 array, you can still have corrupt clusters. A power outage or a hard reboot could have damaged the clusters on the drives. Try running Chkdsk /r. And I have an idea, but have not tried it yet: try running Eseutil /d after the chkdsk completes. Since it creates a new database, it may repair the problem.

http://www.mcpmag.com/columns/article.asp?EditorialsID=330

Jose

----- Original Message -----
From: Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP] [EMAIL PROTECTED]
To: ActiveDir@mail.activedir.org
Sent: Sunday, December 04, 2005 12:13 AM
Subject: Re: [ActiveDir] Ntds.dit file corruption

Nope, just confirmed SCSI ... but there's still Dell hardware to lay blame on here ;-)

Brian Desmond wrote:

I think those are SATA only?

Thanks,
Brian Desmond
[EMAIL PROTECTED]
c - 312.731.3132

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP]
Sent: Sunday, December 04, 2005 2:21 AM
To: ActiveDir@mail.activedir.org
Subject: Re: [ActiveDir] Ntds.dit file corruption

http://www.dell.com/downloads/global/products/pedge/en/sc1420_specs.pdf

Well, he said it's a Dell [ugh] 1420, but I do not know if it's SATA or SCSI.

Jose Medeiros wrote:

Hmm.. I have never experienced this with either McAfee or Symantec AV on any of the DCs that I have built and/or maintained. Have you had a chance to run chkdsk /r yet? More than likely the problem is bad clusters on the drive, which caused the NTDS.DIT file to become corrupt. Was this server built using IDE/ATA/SATA drives?
Jose

----- Original Message -----
From: Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP] [EMAIL PROTECTED]
To: ActiveDir@mail.activedir.org
Sent: Saturday, December 03, 2005 10:58 PM
Subject: [ActiveDir] Ntds.dit file corruption

SBS box [with Windows 2003 sp1 since September]

RE: [ActiveDir] Database Corruption:
http://www.mail-archive.com/activedir@mail.activedir.org/msg32676.html

We have an SBS 2003 sp1 box with a corrupt ntds.dit that the consultant and PSS have been banging on. Could not get the services back running; changed the RPC service to local system and some services came back up [I don't have all the details, but the consultant opened a support case, SRX051202605433]. Bottom line: they are about to give up and start a restore, but before they do that I'd like to get the view of the AD gods and goddesses around here.

From all that I've seen, read, and seen in the SBS newsgroup, corruption of ntds.dit is rare to nil and the underlying cause is hardware issues [RAID, disk subsystem]. This doesn't just happen.

The VAP asked if not properly excluding the AD databases from the a/v would cause or trigger this, and my expectation is 'no', given that I doubt the majority of us in SBSland properly set up exclusions.

Virus scanning recommendations on a Windows 2000 or on a Windows Server 2003 domain controller:
http://support.microsoft.com/default.aspx?scid=kb;en-us;822158

If this were my hardware and box, I'd be putting this sucker on the operating table and getting an autopsy before putting it back online. Are we right in being paranoid now about this hardware? For you guys in big server land, you'd just slide another box over into that server role.

--- Stupid question alert

Okay, so we know that having a secondary/additional domain controller is a good thing even in SBSland... but question: many times the second server in SBSland is a terminal server box, because we do not support TS in app mode on our PDCs.
So we've established that having a domain controller and a terminal server on the same box is a security issue [see Windows Security resource kit, NIST Terminal Services hardening guide, etc.]. If our second server is a member server handing out TS externally, should that be a candidate for the additional DC? Are the issues of TS on a DC ... true for 'any' DC? Would it be better then to Vserver/VPC a Win2k3 inside a workstation in the network if a third server box was not feasible?

List info   : http://www.activedir.org/List.aspx
List FAQ    : http://www.activedir.org/ListFAQ.aspx
List archive: http://www.mail-archive.com/activedir%40mail.activedir.org/
RE: [ActiveDir] Ntds.dit file corruption
She replied offline; very likely a single bit flip. The tragedy is they aren't one release later (Longhorn), where this would probably have been non-disruptively handled, logged, and possibly self-healed:
http://blogs.technet.com/efleis/archive/2005/01.aspx

Anyway, this kind of thing is usually hardware ...

While there are much better disk sub-system testers, one that is freely available to any box with Exchange is jetstress. You might give that a try. If you can reproduce the event / error with jetstress, I would not use that box in production.

If you do reproduce the issue several times (several times is key, as you want a trend before you start playing the variable game), some things you might vary (one at a time):

- Try making sure you have the latest driver and motherboard / controller firmware. Then see if you can reproduce.
- Try a different RAID configuration, such as RAID1/RAID1+0 if you're on RAID5.
- Try swapping out the hard drives, one at a time.
- Try adding the jetstress files to the exclude list in the anti-virus software. (A low probability; I've never heard of anti-virus causing this particular type of error, and I can't imagine the mistake an anti-virus product would have to make to cause this side effect.)
- If you can reproduce it several times, you could follow up with Dell.

Good luck. I'm not sure if I answered your question ...

Cheers,
BrettSh

On Sun, 4 Dec 2005, Eric Fleischman wrote:

Going back to the original post, I'm not sure I fully understand the problem yet. Susan, can you define "ntds.dit file corruption" for us? What sort of corruption? What errors/events lead you to believe this? Specifically, I'm interested in errors from NTDS ISAM or ESE if you have any.
From: [EMAIL PROTECTED] on behalf of Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP]
Sent: Sat 12/3/2005 10:58 PM
To: ActiveDir@mail.activedir.org
Subject: [ActiveDir] Ntds.dit file corruption

[snip]
Re: [ActiveDir] Ntds.dit file corruption
I did? :-) I think I still said all I know is what the poster said :-)

I think I need a course in event log reading, because even with the logs, and the default size of the logs, I still don't see a smoking gun. The directory services one is filled with events 'post' blow up.

What is interesting is that it seems to me big server land goes "oh yeah... ntds.dit corruption..." and SBSland freaks out. Either we do indeed need to ensure we have a secondary DC, or we need to park a second copy of the system state offsite [say at the VAP/VAR].

Brett Shirley wrote:

[snip]
Re: [ActiveDir] Ntds.dit file corruption
Those are fine ideas. You may want to have a closer look at that hardware. Whichever the vendor, they usually have their own diagnostics. It's time-consuming, but often worth checking, along with checking for known issues with drivers, firmware, etc.

In my experience, I've mostly seen this type of corruption with faulty hardware. Sometimes drive cache can hurt (not the battery-backed array controller cache, but the cache on the disk), as can a bad run of hardware or cracked motherboards. Giving the machine the once-over is a great idea. And if you can't spot it, I might still consider the machine suspect and not worth reinstalling on. A vote of no confidence, so to speak.

Keeping good backups (by good, I mean tested) is always recommended regardless of the size of the company. Keep with that any and all information needed to recover the machine if it were to become a smoking puddle of goo in the wiring closet. Unless the data is not worth recovering. :)

From: Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP] [EMAIL PROTECTED]
Reply-To: ActiveDir@mail.activedir.org
To: ActiveDir@mail.activedir.org
Subject: Re: [ActiveDir] Ntds.dit file corruption
Date: Mon, 05 Dec 2005 08:52:48 -0800

[snip]
RE: [ActiveDir] Ntds.dit file corruption
Well, at least the corruption occurred on just a single DC. One thing that has bugged me about Active Directory is that, in a large enterprise environment, you cannot choose to prevent a DC in a remote office from replicating back. Since most remote offices have only a few people on site, and a DC is usually placed there to improve logon and authentication time, many companies will use either a very low-end server or a very old one decommissioned from their production data center (which is probably close to the end of its usable life). I am always concerned that once the NTDS.DIT file becomes corrupt, it will replicate the corruption to the other DCs in the forest. Maybe I am just being a worrywart and this really is not an issue.

Sincerely,
Jose Medeiros
ADP | National Account Services
ProBusiness Division | Information Services
925.737.7967 | 408-449-6621 CELL

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP]
Sent: Monday, December 05, 2005 8:53 AM
To: ActiveDir@mail.activedir.org
Subject: Re: [ActiveDir] Ntds.dit file corruption

I did? :-) I think I still said all I know is what the poster said :-)

I think I need a course in event log reading, because even with the logs, and the default size of the logs, I still don't see a smoking gun. The directory services one is filled with events 'post' blow up.

What is interesting is that it seems to me big server land goes .. oh yeah... ntds.dit corruption... and sbsland freaks out. Either we do indeed need to ensure we have a secondary DC, or we need to park a second copy of a system state offsite [say at the vap/var].

Brett Shirley wrote:

She replied offline, very likely a single bit flip, tragedy, they aren't one release later (Longhorn), where this would've probably been non-disruptively handled, logged, and possibly self-healed: http://blogs.technet.com/efleis/archive/2005/01.aspx

Anyway, this kind of thing is usually hardware ... While there are much better disk sub-system testers, one that is freely available to any box with Exchange is jetstress. You might give that a try. If you can reproduce the event / error with jetstress, I would not use that box in production.

If you do reproduce the issue several times (several times is key, as you want a trend before you start playing the variable game), some things you might vary (one at a time):

- Try making sure you have the latest driver and motherboard / controller firmware. Then see if you can reproduce.
- Try a different RAID configuration, such as RAID1/RAID1+0 if you're on RAID5.
- Try swapping out the hard drives, one at a time.
- Add the jetstress files to the exclude list in the anti-virus software. (A low probability; I've never heard of anti-virus causing this particular type of error, and I can't imagine the mistake an anti-virus product would have to make to cause this side effect.)
- If you can reproduce it several times, you could follow up with Dell.

Good luck. I'm not sure if I answered your question ...

Cheers,
BrettSh

On Sun, 4 Dec 2005, Eric Fleischman wrote:

Going back to the original post, I'm not sure I fully understand the problem yet. Susan, can you define "ntds.dit file corruption" for us? What sort of corruption? What errors/events lead you to believe this? Specifically, I'm interested in errors from NTDS ISAM or ESE if you have any.

From: [EMAIL PROTECTED] on behalf of Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP]
Sent: Sat 12/3/2005 10:58 PM
To: ActiveDir@mail.activedir.org
Subject: [ActiveDir] Ntds.dit file corruption

SBS box [with Windows 2003 sp1 since September]

RE: [ActiveDir] Database Corruption: http://www.mail-archive.com/activedir@mail.activedir.org/msg32676.html

We have a SBS 2003 sp1 box with a corrupt ntds.dit that the Consultant and PSS have been banging on. Could not get the services back running, changed the RPC service to local system and some service came back up [I don't have all the details but the consultant opened a support case of SRX051202605433].

Bottom line, they are about to give up and start a restore, but before they do that I'd like to get the view of the AD gods and goddesses around here. From all that I've seen, read, and seen in the SBS newsgroup, corruption of ntds.dit is rare to nil and the underlying cause is hardware issues [raid, disk subsystem]. This doesn't just happen.

The VAP asked if not properly excluding the AD databases from the a/v would cause or trigger this, and my expectation is 'no', given that I doubt the majority of us in SBSland properly set up exclusions.

Virus scanning recommendations on a Windows 2000 or on a Windows Server 2003 domain controller: http://support.microsoft.com/default.aspx?scid=kb;en-us;822158

If this were my hardware and box, I'd be putting
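Brett's advice in the quoted reply (get a repeatable failure first, then change one thing at a time) can be sketched roughly as follows. This is only an illustration of the method: `fake_stress`, the setting names, and the three-run threshold are hypothetical stand-ins chosen for the example, and nothing here calls jetstress itself.

```python
from typing import Callable, Dict, List

Config = Dict[str, str]


def failure_count(run_stress: Callable[[Config], bool], config: Config, runs: int) -> int:
    """Run the stress pass `runs` times and count how many passes hit the error."""
    return sum(1 for _ in range(runs) if run_stress(config))


def isolate(run_stress: Callable[[Config], bool], baseline: Config,
            candidates: Config, runs: int = 3) -> List[str]:
    """Return the settings whose change, on its own, stops the error reproducing."""
    # Require the error on every baseline run: a trend, not a one-off.
    if failure_count(run_stress, baseline, runs) < runs:
        return []
    cleared = []
    for name, new_value in candidates.items():
        trial = dict(baseline)
        trial[name] = new_value  # change exactly one variable per trial
        if failure_count(run_stress, trial, runs) == 0:
            cleared.append(name)
    return cleared


def fake_stress(config: Config) -> bool:
    """Hypothetical stand-in for a real stress pass; here the fault follows RAID5."""
    return config["raid"] == "RAID5"


baseline = {"firmware": "old", "raid": "RAID5", "antivirus_exclusion": "off"}
candidates = {"firmware": "latest", "raid": "RAID1", "antivirus_exclusion": "on"}
print(isolate(fake_stress, baseline, candidates))
```

Because each trial starts again from the baseline, a setting is only reported when changing it alone makes the failure go away, which is the point of varying one thing at a time.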
Re: [ActiveDir] Ntds.dit file corruption
Will Read Only DCs take care of this? I don't know much about them yet, but it makes sense that if the copy of the dit that a DC has is RO, it won't try to replicate that anywhere and would only be the recipient of replication. Can anyone with more knowledge about how RO DCs will work comment on that?

Phil

On 12/5/05, Medeiros, Jose [EMAIL PROTECTED] wrote:

Well, at least the corruption occurred on just a single DC. One thing that has bugged me about Active Directory is that, in a large enterprise environment, you cannot choose to prevent a DC in a remote office from replicating back. Since most remote offices have only a few people on site, and a DC is usually placed there to improve logon and authentication time, many companies will use either a very low-end server or a very old one decommissioned from their production data center (which is probably close to the end of its usable life). I am always concerned that once the NTDS.DIT file becomes corrupt, it will replicate the corruption to the other DCs in the forest. Maybe I am just being a worrywart and this really is not an issue.
RE: [ActiveDir] Ntds.dit file corruption
I was not aware that Microsoft had incorporated such a feature in AD 2003. I know for a fact that Microsoft did not have this feature when AD 2000 was first released, because I mentioned it to several Microsoft AD premier support specialists and they each confirmed it was not available (however, it may have been added in a service pack). I would love to know how to enable a read-only DC. I think that is a great idea; I wonder who thought of it. :-)

Sincerely,
Jose Medeiros

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Phil Renouf
Sent: Monday, December 05, 2005 11:04 AM
To: ActiveDir@mail.activedir.org
Subject: Re: [ActiveDir] Ntds.dit file corruption

Will Read Only DCs take care of this? I don't know much about them yet, but it makes sense that if the copy of the dit that a DC has is RO, it won't try to replicate that anywhere and would only be the recipient of replication. Can anyone with more knowledge about how RO DCs will work comment on that?

Phil
RE: [ActiveDir] Ntds.dit file corruption
We do not replicate corruption, so if you have local corruption as noted below, there is no worry that it would replicate to other servers in the environment.

Thanks,
-Steve

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Phil Renouf
Sent: Monday, December 05, 2005 1:04 PM
To: ActiveDir@mail.activedir.org
Subject: Re: [ActiveDir] Ntds.dit file corruption

Will Read Only DCs take care of this? I don't know much about them yet, but it makes sense that if the copy of the dit that a DC has is RO, it won't try to replicate that anywhere and would only be the recipient of replication. Can anyone with more knowledge about how RO DCs will work comment on that?

Phil

On 12/5/05, Medeiros, Jose [EMAIL PROTECTED] wrote:

Well, at least the corruption occurred on just a single DC. [...] I am always concerned that once the NTDS.DIT file becomes corrupt, it will replicate the corruption to the other DCs in the forest. Maybe I am just being a worrywart and this really is not an issue.
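Steve does not spell out the mechanism in his reply. One way to picture why a physically damaged page need not travel to a partner is that a database can verify a per-page checksum on every read and refuse to hand out a page that fails. The sketch below is a toy illustration of that general idea only: the page layout, the use of CRC32, and the function names are assumptions made for the example, not a description of how ESE or AD replication is actually implemented.

```python
import zlib


def write_page(payload: bytes) -> bytes:
    """Prefix the payload with a 4-byte CRC32 so damage can be detected on read."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def read_page(page: bytes) -> bytes:
    """Return the payload, or raise if the stored checksum no longer matches."""
    stored = int.from_bytes(page[:4], "big")
    payload = page[4:]
    if zlib.crc32(payload) != stored:
        raise IOError("page failed checksum verification")
    return payload


def flip_bit(page: bytes, byte_index: int, bit: int) -> bytes:
    """Simulate a single bit flip, such as a disk subsystem fault might cause."""
    damaged = bytearray(page)
    damaged[byte_index] ^= 1 << bit
    return bytes(damaged)


def readable(page: bytes) -> bool:
    """True if the page passes verification and could be served to a caller."""
    try:
        read_page(page)
        return True
    except IOError:
        return False


good = write_page(b"cn=Some User,dc=example,dc=com")
bad = flip_bit(good, 10, 0)
print(readable(good), readable(bad))
```

In this toy model the damaged page produces an error on the local machine instead of data, so there is nothing corrupt to pass along to anyone reading through that interface.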
Re: [ActiveDir] Ntds.dit file corruption
I was thinking about Longhorn :) It has been brought up here as a possible Longhorn feature a couple of times, but yeah, that doesn't help much for the immediate future.

Phil

On 12/5/05, Medeiros, Jose [EMAIL PROTECTED] wrote:

I was not aware that Microsoft had incorporated such a feature in AD 2003. I know for a fact that Microsoft did not have this feature when AD 2000 was first released, because I mentioned it to several Microsoft AD premier support specialists and they each confirmed it was not available (however, it may have been added in a service pack). I would love to know how to enable a read-only DC. I think that is a great idea; I wonder who thought of it. :-)
RE: [ActiveDir] Ntds.dit file corruption
RODCs are a Longhorn feature. Replication will be one-way, to the RODCs; they will not replicate anything out. If you are on the Longhorn beta, you should be able to test this right now.

But as Steve (one of the really good PSS guys) said, and I can concur as I have seen my share of corrupted DITs, the corruption doesn't replicate. In every case I have seen, the problem has been hardware failure or a firmware/driver matchup issue in the disk subsystem.

Fixing them is easy: wipe the machine and do hardware tests. If it passes, do it again. If it passes, do it a third time. If it passes, reload and repromo. If it fails one of the tests, get the hardware fixed, reload, and repromo.

If SBS, well, you have all sorts of issues in that basket as your eggs leak.

joe

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Medeiros, Jose
Sent: Monday, December 05, 2005 2:24 PM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

I was not aware that Microsoft had incorporated such a feature in AD 2003. I know for a fact that Microsoft did not have this feature when AD 2000 was first released, because I mentioned it to several Microsoft AD premier support specialists and they each confirmed it was not available (however, it may have been added in a service pack). I would love to know how to enable a read-only DC. I think that is a great idea; I wonder who thought of it. :-)
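The one-way behaviour joe describes for RODCs (they receive changes but send nothing out) can be pictured with a small toy model. This is a sketch under simplifying assumptions: the `DomainController` class, its method names, and the push-style propagation are invented for illustration and do not reflect how AD replication is actually scheduled or implemented.

```python
class DomainController:
    """Toy DC: writable DCs forward changes to partners, read-only ones never do."""

    def __init__(self, name: str, read_only: bool = False):
        self.name = name
        self.read_only = read_only
        self.objects = {}
        self.outbound = []  # partners that receive this DC's changes

    def replicate_to(self, partner: "DomainController") -> None:
        if self.read_only:
            raise ValueError(f"{self.name} is read-only and has no outbound replication")
        self.outbound.append(partner)

    def write(self, key: str, value: str) -> None:
        if self.read_only:
            raise PermissionError(f"{self.name} does not accept originating writes")
        self._apply(key, value)

    def _apply(self, key: str, value: str) -> None:
        if self.objects.get(key) == value:
            return  # already up to date; also stops loops between writable DCs
        self.objects[key] = value
        for partner in self.outbound:
            partner._apply(key, value)


hub = DomainController("HUB-DC")
branch = DomainController("BRANCH-RODC", read_only=True)
hub.replicate_to(branch)

hub.write("user1", "phone=1234")
print(branch.objects["user1"])       # the branch received the change

branch.objects["user1"] = "garbage"  # simulate local damage on the branch copy
print(hub.objects["user1"])          # the hub copy is unaffected
```

In this model the branch copy can go bad without any path for that state to reach the hub, which is the property Phil was asking about.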
RE: [ActiveDir] Ntds.dit file corruption
If that failsafe is built in, then I am just being a worrywart, and I have to admit I have yet to experience this particular problem.

Sincerely,
Jose Medeiros

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Steve Linehan
Sent: Monday, December 05, 2005 11:26 AM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

We do not replicate corruption, so if you have local corruption as noted below, there is no worry that it would replicate to other servers in the environment.

Thanks,
-Steve
RE: [ActiveDir] Ntds.dit file corruption
Novell.

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Medeiros, Jose
Sent: Monday, December 05, 2005 11:24 AM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

I was not aware that Microsoft had incorporated such a feature in AD 2003. I know for a fact that Microsoft did not have this feature when AD 2000 was first released, because I mentioned it to several Microsoft AD premier support specialists and they each confirmed it was not available (however, it may have been added in a service pack). I would love to know how to enable a read-only DC. I think that is a great idea, I wonder who thought of it. :-)

Sincerely,
Jose Medeiros
ADP | National Account Services
ProBusiness Division | Information Services
925.737.7967 | 408-449-6621 CELL

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Phil Renouf
Sent: Monday, December 05, 2005 11:04 AM
To: ActiveDir@mail.activedir.org
Subject: Re: [ActiveDir] Ntds.dit file corruption

Will read-only DCs take care of this? I don't know much about them yet, but it makes sense that if the copy of the DIT that a DC has is read-only, it won't try to replicate that anywhere and would only be the recipient of replication. Anyone with more knowledge about how RODCs will work care to comment on that?

Phil

On 12/5/05, Medeiros, Jose [EMAIL PROTECTED] wrote:

Well, at least the corruption occurred on just a single DC. One thing that has bugged me about Active Directory is not being able to choose whether a DC in a remote office can replicate back in a large enterprise environment. Since most remote offices only have a few people at the location and a DC is usually placed there for improved logon and authentication time, many companies will use either a very low-end server or a very old one decommissioned from their production data center (which is probably close to the end of its usable life). I am always concerned that once the NTDS.DIT file becomes corrupt, it will replicate the corruption to the other DCs in the forest. Maybe I am just being a worrywart and this really is not an issue.

Sincerely,
Jose Medeiros

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP]
Sent: Monday, December 05, 2005 8:53 AM
To: ActiveDir@mail.activedir.org
Subject: Re: [ActiveDir] Ntds.dit file corruption

I did? :-) I think I still said all I know is what the poster said :-) I think I need a course in event log reading because even with the logs, and the default size of the logs, I still don't see a smoking gun. The directory services one is filled with events 'post' blow up.

What is interesting is that it seems to me big server land goes .. oh yeah... ntds.dit corruption... and sbsland freaks out. Either we do indeed need to ensure we have a secondary DC or we need to park a second copy of a system state offsite [say at the vap/var].

Brett Shirley wrote:

She replied offline, very likely a single bit flip. Tragedy: they aren't one release later (Longhorn), where this would've probably been non-disruptively handled, logged, and possibly self-healed: http://blogs.technet.com/efleis/archive/2005/01.aspx

Anyway, this kind of thing is usually hardware ... While there are much better disk sub-system testers, one that is freely available to any box with Exchange is jetstress. You might give that a try. If you can reproduce the event / error with jetstress I would not use that box in production.

If you do reproduce the issue several times (several times is key, as you want a trend before you start playing the variable game), some things you might vary (one at a time):
- Try making sure you have the latest driver and motherboard / controller firmware. Then see if you can reproduce.
- Try a different RAID configuration, such as RAID1/RAID1+0 if you're on RAID5.
- Try swapping out the hard drives, one at a time.
- Add the jetstress files to the exclude list in the anti-virus software. (A low probability; I've never heard of anti-virus causing this particular type of error, and I can't imagine the mistake an anti-virus product would have to make to cause this side effect.)
- If you can reproduce it several times, you could follow up with Dell.

Good luck. I'm not sure if I answered your question ...

Cheers,
BrettSh

On Sun, 4 Dec 2005, Eric Fleischman wrote:

Going back to the original post, I'm not sure I fully understand the problem yet. Susan, can you define "ntds.dit file corruption" for us? What sort of
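Brett's advice quoted above is to establish a trend first and then change one variable at a time. As a rough illustration only, the following Python sketch generates such a test plan. The variable names and values (firmware, raid_level, av_exclusion) are assumptions made up for the example; they are not settings taken from the thread or from jetstress.

```python
def one_at_a_time(baseline, alternatives):
    """Yield (changed_variable, config) pairs.

    Each config differs from the baseline in exactly one variable,
    so a change in test outcome can be attributed to that variable.
    """
    for name, values in alternatives.items():
        for value in values:
            if value == baseline[name]:
                continue  # identical to the baseline, nothing to test
            config = dict(baseline)
            config[name] = value
            yield name, config


# Hypothetical baseline: the box as it was when the error was reproduced.
baseline = {"firmware": "current", "raid_level": "RAID5", "av_exclusion": False}

# Hypothetical alternatives, loosely following the list in the message.
alternatives = {
    "firmware": ["latest"],
    "raid_level": ["RAID1", "RAID1+0"],
    "av_exclusion": [True],
}

plan = list(one_at_a_time(baseline, alternatives))
for changed, config in plan:
    print(changed, config)
```

Each entry in the plan would be run several times before drawing a conclusion, since the point of the message is to look for a trend rather than a single result.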
RE: [ActiveDir] Ntds.dit file corruption
BDC

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Carpenter Robert A Contr WROCI/Enterprise IT
Sent: Monday, December 05, 2005 5:33 PM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

Novell.
RE: [ActiveDir] Ntds.dit file corruption
For full disclosure, I am no longer in the Microsoft Services organization; I was the last time Joe talked to me, when I was an Advisory Support Engineer (AKA Alliance Support). I am now a Product Technology Specialist for Directories and Identities in Microsoft's technical pre-sales organization. Not that it changes the answer below. :-)

Thanks,
-Steve

Steve Linehan | Technology Specialist Directories Identities | South Central District | Microsoft Corporation

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of joe
Sent: Monday, December 05, 2005 2:38 PM
To: ActiveDir@mail.activedir.org
Subject: RE: [ActiveDir] Ntds.dit file corruption

RODCs are a Longhorn feature. It will be one-way replication to the RODCs. They will not replicate out anything. If you are on the Longhorn beta you should be able to test this right now.

But as Steve (one of the really good PSS guys) said, and I can concur as I have seen my share of corrupted DITs, the corruption doesn't replicate. In every case I have seen, the problem has been hardware failure or a firmware/driver matchup issue in the disk subsystem. Fixing them is easy: wipe the machine and do hardware tests. If it passes, do it again. If it passes, do it a third time. If it passes, reload and repromo. If it fails one of the tests, get the hardware fixed, reload, and repromo. If SBS, well, you have all sorts of issues in that basket as your eggs leak.

joe
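joe's recovery procedure quoted above (wipe, test the hardware three times, then reload and repromote, or fix the hardware first if any test fails) can be written down as a small decision function. This is only a sketch of the rule as stated in the message; the function name, the return strings, and the idea of recording each test run as a boolean are assumptions for illustration.

```python
def next_step(test_results, required_passes=3):
    """Decide what to do next for a wiped DC.

    test_results: list of booleans, one per completed hardware test run,
    oldest first (True means the run passed).
    """
    if not all(test_results):
        # Any failed run means the hardware gets fixed before a rebuild.
        return "fix hardware, then reload and repromote"
    if len(test_results) < required_passes:
        # Not enough clean runs yet to trust the box.
        return "run the hardware tests again"
    return "reload and repromote"


print(next_step([True]))
print(next_step([True, True, True]))
print(next_step([True, False]))
```

The default of three passes mirrors the message; whether three is enough for a given disk subsystem is a judgment call the thread does not settle.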
RE: [ActiveDir] Ntds.dit file corruption
I think those are SATA only?

Thanks,
Brian Desmond
[EMAIL PROTECTED]
c - 312.731.3132

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP]
Sent: Sunday, December 04, 2005 2:21 AM
To: ActiveDir@mail.activedir.org
Subject: Re: [ActiveDir] Ntds.dit file corruption

http://www.dell.com/downloads/global/products/pedge/en/sc1420_specs.pdf

Well he said it's a Dell [ugh] 1420 but do not know if SATA or SCSI.

Jose Medeiros wrote:

Hmm.. I have never experienced this with either McAfee or Symantec AV on any of the DCs that I have built and/or maintained. Have you had a chance to run chkdsk /r yet? More than likely the problem is bad clusters on the drive, which caused the NTDS.DIT file to become corrupt. Was this server built using IDE/ATA/SATA drives?

Jose

- Original Message -
From: Susan Bradley, CPA aka Ebitz - SBS Rocks [MVP] [EMAIL PROTECTED]
To: ActiveDir@mail.activedir.org
Sent: Saturday, December 03, 2005 10:58 PM
Subject: [ActiveDir] Ntds.dit file corruption

SBS box [with Windows 2003 sp1 since September]

RE: [ActiveDir] Database Corruption: http://www.mail-archive.com/activedir@mail.activedir.org/msg32676.html

We have a SBS 2003 sp1 box with a corrupt ntds.dit that the Consultant and PSS have been banging on. Could not get the services back running, changed the RPC service to local system and some service came back up [I don't have all the details but the consultant opened a support case of SRX051202605433].

Bottom line, they are about to give up and start a restore, but before they do that I'd like to get the view of the AD gods and goddesses around here. From all that I've seen, read, seen in the SBS newsgroup, the corruption of ntds.dit is rare to nil and an underlying cause is hardware issues [raid, disk subsystem]. This doesn't just happen.

The VAP asked if not properly excluding the AD databases from the a/v would cause this/trigger this and my expectation is 'no', given that I doubt the majority of us in SBSland properly set up exclusions.

Virus scanning recommendations on a Windows 2000 or on a Windows Server 2003 domain controller: http://support.microsoft.com/default.aspx?scid=kb;en-us;822158

If this were my hardware and box, I'd be putting this sucker on the operating table and getting an autopsy before putting it back online. Are we right in being paranoid now about this hardware? For you guys in big server land you'd just slide over another box into that server role.

--- Stupid question alert

Okay, so we know that having a secondary/additional domain controller is a good thing even in SBSland... but question: many times the second server in SBSland is a terminal server box because we do not support TS in app mode on our PDCs. So we've established that having a domain controller and a terminal server is a security issue [see Windows Security resource kit, NIST Terminal services hardening guide, etc etc]. If our second server is a member server handing out TS externally, should that be a candidate for the additional DC? Are the issues of TS on a DC ... true for 'any' DC? Would it be better, then, to Vserver/VPC a Win2k3 inside a workstation in the network if a third server box was not feasible?

List info : http://www.activedir.org/List.aspx
List FAQ: http://www.activedir.org/ListFAQ.aspx
List archive: http://www.mail-archive.com/activedir%40mail.activedir.org/
Re: [ActiveDir] Ntds.dit file corruption
SCSI RAID 5 ( 3 x 36 GB DISKS 10K ) PERC CONTROLLER, DELL SC1420 SERVER

Okay so not SATA
Re: [ActiveDir] Ntds.dit file corruption
Nope just confirmed SCSI ...but there's still Dell hardware to lay blame on here ;-)

Brian Desmond wrote:

I think those are SATA only?
Re: [ActiveDir] Ntds.dit file corruption
Even if it's SCSI on a RAID 5 array, you can still have corrupt clusters. A power outage or a hard reboot could have damaged the clusters on the drives. Try running chkdsk /r. And I have an idea, but have not tried it yet: try running Eseutil /d after the chkdsk completes. Since it creates a new database, it may repair the problem.

http://www.mcpmag.com/columns/article.asp?EditorialsID=330

Jose
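Jose's suggestion above (a disk scan followed by an offline database pass) can be sketched as an ordered command list. The Python below only builds and prints the commands; it does not run them. Several things here are assumptions rather than statements from the thread: that the DC is booted into Directory Services Restore Mode so the database is offline, that ntds.dit sits in the default C:\Windows\NTDS folder, and that esentutl (the Windows counterpart of Exchange's eseutil) is the tool used. The sketch also swaps in esentutl /g, a read-only integrity check, in place of the untried /d defragmentation idea, since the message itself says the /d step was not tested.

```python
def offline_check_commands(volume, db_path):
    """Return the commands to run, in order, as argument lists.

    volume:  drive holding the database, e.g. "C:" (assumed)
    db_path: full path to ntds.dit (assumed default location below)
    """
    return [
        # Scan the volume for bad sectors and recover readable data.
        ["chkdsk", volume, "/r"],
        # Read-only integrity check of the offline database.
        ["esentutl", "/g", db_path],
    ]


# Print the commands for review instead of executing them.
for cmd in offline_check_commands("C:", r"C:\Windows\NTDS\ntds.dit"):
    print(" ".join(cmd))
```

Keeping this as a dry run is deliberate: on a box with suspect hardware, the thread's consensus is to test the hardware first, so the commands are best reviewed and run by hand.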
RE: [ActiveDir] Ntds.dit file corruption
Going back to the original post, I'm not sure I fully understand the problem yet. Susan, can you define "ntds.dit file corruption" for us? What sort of corruption? What errors/events lead you to believe this? Specifically, I'm interested in errors from NTDS ISAM or ESE if you have any.
Re: [ActiveDir] Ntds.dit file corruption
Given that in SBSland our AD is wizardly built for us and just works, unfortunately I didn't think to dig deeper into that relay of a statement from the SBSer. I'll check.

This is one of those "Oh Sh_t" moments when we go .. "you know, those folks who say a second DC were right" ... events.
Re: [ActiveDir] Ntds.dit file corruption
Hmm... I have never experienced this with either McAfee or Symantec AV on any of the DCs that I have built and/or maintained. Have you had a chance to run chkdsk /r yet? More than likely the problem is bad clusters on the drive, which caused the NTDS.DIT file to become corrupt. Was this server built using IDE/ATA/SATA drives?

Jose
Re: [ActiveDir] Ntds.dit file corruption
http://www.dell.com/downloads/global/products/pedge/en/sc1420_specs.pdf

Well, he said it's a Dell [ugh] 1420, but I do not know if it's SATA or SCSI.

Jose Medeiros wrote:

Hmm... I have never experienced this with either McAfee or Symantec AV on any of the DCs that I have built and/or maintained. Have you had a chance to run chkdsk /r yet? More than likely the problem is bad clusters on the drive, which caused the NTDS.DIT file to become corrupt. Was this server built using IDE/ATA/SATA drives?

Jose