I absolutely agree that if replication is working you are probably in decent shape. As for how to figure that out, I am more of the drop-something-in-and-see-if-it-gets-around kind. I have been doing AD too long though, and maybe my previous experiences with repadmin and other tools that allegedly check replication haven't been the best. In a one-off setting they tend to be OK, and if you know you have an issue they are the best place to start. However, I have seen cases (and I can't go into specifics off the top of my head, nor even what year I saw the issues) where repadmin dumps of an entire domain seemed perfectly fine because there were no errors, but then you notice that the last success was quite a while back. After all, repadmin is simply a program; it can have issues like any program. If you create an object and it gets to all DCs, replication is certainly working regardless of what any tools say. Ditto in reverse: if an object doesn't get to all DCs, replication is not working, whatever the tools say.
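A minimal sketch of catching that "no errors, but the last success was a while back" case, assuming you have saved the output of repadmin /showrepl * /csv to text. The column names ("Destination DSA", "Source DSA", "Naming Context", "Last Success Time") and the timestamp format are assumptions about that output, and the sample rows are made up for illustration; check both against a real dump from your own forest before trusting it.

```python
import csv
import io
from datetime import datetime, timedelta

# Assumed timestamp format in the CSV; adjust if your output differs.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

# Made-up sample in the assumed layout of `repadmin /showrepl * /csv`.
SAMPLE = '''showrepl_COLUMNS,Destination DSA Site,Destination DSA,Naming Context,Source DSA Site,Source DSA,Transport Type,Number of Failures,Last Failure Time,Last Success Time,Last Failure Status
showrepl_INFO,SiteA,DC1,"DC=example,DC=com",SiteB,DC2,RPC,0,0,2004-06-05 00:10:00,0
showrepl_INFO,SiteB,DC2,"DC=example,DC=com",SiteA,DC1,RPC,0,0,2004-05-20 03:00:00,0
'''


def stale_links(csv_text, now, max_age):
    """Return (destination, source, naming context, age) for every link whose
    last success is older than max_age, even if it reports zero failures.
    Rows with a missing or unparseable timestamp are returned with age None."""
    stale = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        link = (row.get("Destination DSA"), row.get("Source DSA"),
                row.get("Naming Context"))
        raw = (row.get("Last Success Time") or "").strip()
        try:
            last_success = datetime.strptime(raw, TIME_FORMAT)
        except ValueError:
            stale.append(link + (None,))
            continue
        age = now - last_success
        if age > max_age:
            stale.append(link + (age,))
    return stale


if __name__ == "__main__":
    now = datetime(2004, 6, 5, 0, 14)
    for dest, src, nc, age in stale_links(SAMPLE, now, timedelta(hours=3)):
        print(dest, "<-", src, nc, "last success age:", age)
```

The point of the design is that it ignores the failure count entirely and looks only at the age of the last success, which is the thing that is easy to miss when eyeballing a dump with no errors in it.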
Dropping in a pebble and watching the ripple spread out is a much more positive feedback system in my book. Any place that new object doesn't get to is a good place to start looking with repadmin.

joe

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Eric Fleischman
Sent: Saturday, June 05, 2004 12:14 AM
To: [EMAIL PROTECTED]
Subject: RE: [ActiveDir] AD Health check

I intentionally waited for the traffic on this thread to die down. :)

If you're one who says "look I just want one test...the best one possible...to measure health" I'd argue that is replication. With one command you can understand replication in the forest:
repadmin /showrepl * /csv
And once you go forest functional level >=1 I also like repadmin /showutdvec a lot.

What's cool about replication is the # of dependencies. If replication is working, it's probably the case that a lot of stuff is working. It's not 100% by any means, I don't mean to imply it is, nor is it a replacement for a comprehensive monitoring solution. But if you're looking for the 15 minutes a day health check, repadmin /showrepl is where it's at IMHO. :)

~Eric

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of joe
Sent: Friday, June 04, 2004 10:28 PM
To: [EMAIL PROTECTED]
Subject: RE: [ActiveDir] AD Health check

LOL. I hope I wasn't too bad on my response, I tried to trim it down. There is enough chatter from me in the archives of GG vs DLG vs UG.
joe

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Free, Bob
Sent: Friday, June 04, 2004 5:54 PM
To: [EMAIL PROTECTED]
Subject: RE: [ActiveDir] AD Health check

déjà vu

Set aside a fair amount of time to read the response and hang on to your hat Tom, joe has probably been typing the reply for quite a while now :-]

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Kern, Tom
Sent: Friday, June 04, 2004 10:22 AM
To: [EMAIL PROTECTED]
Subject: RE: [ActiveDir] AD Health check

joe, near the bottom of your email, you cite heavy use of GGs. Why would heavy use of GGs as opposed to LGs pose a problem? What issues arise from having a lot of global groups?
thanks

-----Original Message-----
From: joe [mailto:[EMAIL PROTECTED]
Sent: Friday, June 04, 2004 12:47 PM
To: [EMAIL PROTECTED]
Subject: RE: [ActiveDir] AD Health check

> when you've inherited a forest with few domains, what would you check
> in the first place to make sure, things are running as they should?

I must be weird. Given the circumstance (walking in the door on an unknown), I would tackle this completely differently than the other posters have mentioned.

I would send an email to a couple of the systems people who were already running it and ask what issues they are seeing or have seen. Get all the documentation they have for the environment configuration they were aiming for.

I would create a new object in every partition (excluding schema) and in each sysvol and in WINS (dynamic entry please, not static) and then let it all replicate.

I would look at the configuration container and the sites / site link layout to ascertain what the replication topology and theoretical latencies should be, looking for any oddball things like weird replication timing, bad schedules, site link bridges, etc.
The schedules would probably be the biggest pain, as I currently have nothing to decode those from the command line, but I could probably tie perl and adfind together pretty quickly to do so. If it was small enough I would just eyeball the outputs or actually use the GUI.

Then later (not real later, it depends on the site topology) go looking for all of the objects I created and, for the AD objects, the whenChanged attribute so I can compare against the theoretical latencies. This will also test the LDAP access to every DC, making sure they are responding properly. Initially I assumed that, but then figured I should document it so it is obvious this is checking another aspect of the health. Also query every WINS server for the record I added with both netsh and NBLOOKUP/NMBLOOKUP (one is a Samba port to Win32 and one is a Microsoft supplied tool) to test functionality on both WINS interfaces.

I would run a little tool I call OOR against all DCs. It initially stood for "out of resources", which was a huge issue I walked into once when I walked in on an unknown environment. A good 80+ DCs were all reporting out of resources when trying to do NET API type calls against them. The oor tool is a simple perl script that loops through a domain and does a GetUserInfo for the guest account against all DCs.

This gets you a good solid baseline on how things are really working versus grabbing a ton of info and munging through reports. For any DC that didn't get the information, I focus on the replication info and, if necessary, chase into FRS or DNS.

That would be my first place stuff...

After that, I would dive into the rest of this. Once you know everything is basically functioning as expected, you have time to look at the more detailed stuff that tells you about things that aren't stopping functionality but could be impacting performance, etc. This would be the intensive check that the diag tools do: every DC for every error, all of DNS, all of FRS, etc.
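The canary-object check described above (create an object, let it replicate, then compare whenChanged on each DC against the theoretical latency) can be sketched roughly as below. This is only the comparison step, under stated assumptions: the per-DC whenChanged values have already been collected by whatever tool you prefer, the DC names and times are made up, and the DC clocks are close enough to each other that subtracting timestamps is meaningful.

```python
from datetime import datetime, timedelta


def check_canary(created_at, seen, expected_latency):
    """Classify each DC by how the canary object arrived.

    created_at       -- when the object was created on the originating DC
    seen             -- dict of DC name -> whenChanged read from that DC,
                        or None if the object was not found there
    expected_latency -- theoretical worst-case replication latency
    """
    report = {}
    for dc, when_changed in sorted(seen.items()):
        if when_changed is None:
            report[dc] = "MISSING"   # good place to start with repadmin
        elif when_changed - created_at > expected_latency:
            report[dc] = "LATE"      # arrived, but slower than the topology says
        else:
            report[dc] = "OK"
    return report


if __name__ == "__main__":
    created = datetime(2004, 6, 4, 12, 0)
    observed = {
        "DC1": created,                          # originating DC
        "DC2": created + timedelta(minutes=20),
        "DC3": created + timedelta(hours=3),
        "DC4": None,                             # object never showed up
    }
    for dc, status in check_canary(created, observed, timedelta(hours=1)).items():
        print(dc, status)
```

A DC that could not be queried at all would also show up as missing here, which is fine for a first pass since it needs looking at either way.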
I would also start monitoring the DRA Pending queue of each DC on a 2-3 minute swing to see if I have any serious bottlenecks there that could be slowing things down.

What SP and hotfix levels are the DCs at? What functionality modes are you in? Do you have enough GC coverage for the mode? I.e., in mixed mode you can have one level of coverage; for native mode you may need more coverage.

Now the real hard work begins...

Find out what AD permissions have been delegated and to whom, and whether they make sense. Do you have any serious security holes because of it?

Audit all of the computer accounts and remove stale ones. Ditto for user accounts (maybe look at using oldcmp for BOTH of those things; yes, it will do user accounts too if you use the -f option and specify a filter that picks out users).

Audit the WINS records: any statics, are they still needed? If not, remove them. Ditto for DNS.

What is the group strategy? How are they using them? For DLs? For security? The old legacy UGLY method or something a little more updated? Heavy use of GGs? Why? Doing role based stuff? Heavy use of UNIs? Do you have the proper GC coverage for them?

Try to figure out which groups are and aren't being used. Try to clean that up; if there aren't owners for all of the groups, find someone to own them. Every object in AD should have an owner; if you can't find one, you as Enterprise Admin now own it. Do you personally need it? No? Disable it. You can disable a security group by making it a DL; it keeps its SID in case you need it back, it just won't work as a security group anymore. You can reenable it as a security group if you find it is indeed needed and you have a new owner for it.

Look at the naming standards. There better be some. If not, set some. If there are some, do they make sense or are they ad hoc? Naming standards should exist for servers, clients, groups, OUs, sites, and site links.
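For the stale account audit mentioned above, here is a minimal sketch of the filtering step. It is not how oldcmp works internally; it simply assumes you have already exported account names together with the time the password was last set (converted to a datetime, or None if never set), and the 90-day threshold and account names are hypothetical.

```python
from datetime import datetime, timedelta


def stale_accounts(accounts, now, max_age_days=90):
    """Return the sorted names of accounts whose password is older than
    max_age_days, or was never set at all.

    accounts -- iterable of (name, pwd_last_set) pairs, where pwd_last_set
                is a datetime or None
    """
    cutoff = now - timedelta(days=max_age_days)
    return sorted(
        name
        for name, pwd_last_set in accounts
        if pwd_last_set is None or pwd_last_set < cutoff
    )


if __name__ == "__main__":
    exported = [
        ("WKSTN-001$", datetime(2004, 5, 30)),
        ("WKSTN-002$", datetime(2003, 11, 2)),
        ("OLD-SERVER$", None),
    ]
    for name in stale_accounts(exported, datetime(2004, 6, 4)):
        print("candidate for cleanup:", name)
```

Treat the result as a list of candidates to review and disable first, not to delete outright.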
I have outlined 1-24 months of work here, depending on how large the environment is, what tools are available, what problems exist, and what workload there is outside of this.

I would make sure there was netmon or ethereal available on every DC and any other server I supported, and start doing basic network traces of each of the subnets that the machines are on.

Oh, if there is Exchange in there, that adds a bunch more and needs to be kept in mind through the whole thing.

Now if in my next job I start doing hot drops into sites like the one this describes, I would probably write a bunch of tools to help this process out, because it would seriously be a mishmash of different things at the moment. Heck, just having a tool to run on a network and get a good understanding of what is there would be nice, though difficult.

joe

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Svetlana Kouznetsova
Sent: Friday, June 04, 2004 10:53 AM
To: [EMAIL PROTECTED]
Subject: [ActiveDir] AD Health check

Hi,

In my quest to solve various problems in our forest while promoting a W2K3 DC, I've now come to the point where I want to ascertain the overall current situation in my AD, and I need more general advice on what kind of tests one should do for checking the health of AD (W2K native mode).

As far as I can see, there are no certain compulsory things you need to run in your AD from time to time - it all depends on time, skills and perhaps one's wish as well. But maybe people can share their experience - when you've inherited a forest with few domains, what would you check in the first place to make sure things are running as they should?

I can think of the basics, like
Obvious event logs, dcdiag and netdiag
netdiag /debug /v - for basically, everything ?
dcdiag /test:fsmocheck - to test that all global role-holders are known and responding
dcdiag /test:frssysvol - to test FRS
dcdiag /test:registerindns /dnsdomain:domain - to test if the DC can register DC Locator DNS records
nltest /dclist:domain_name - to see if the DC can see the rest of the forest
nltest /dsgetdc:domain_name /gc - to see if the DC can see GC servers in the forest
nslookup -d - for testing DNS queries
repadmin /bind servername.domain - to test if the DC can bind to others for replication.

Perhaps some of them are overkill, but I'm looking for a bit more than just a routine checkup.
Can you comment, please?

Thanks in advance
Lana.

List info : http://www.activedir.org/mail_list.htm
List FAQ : http://www.activedir.org/list_faq.htm
List archive: http://www.mail-archive.com/activedir%40mail.activedir.org/
