Thanks. I'll continue to poke around, and ask a few more questions.

Kurt

On Mon, Feb 13, 2012 at 16:18, Michael B. Smith <[email protected]> wrote:
> Well, the kernel mode, paged pool, and interrupt time are items that will be
> specifically reduced with an x64 OS.
>
> The I/O situation is indicative of disk queuing which is "hypervisor
> related". Dunno how you optimize that in VMware, there are a number of
> potentials in Hyper-V.
>
> Regards,
>
> Michael B. Smith
> Consultant and Exchange MVP
> http://TheEssentialExchange.com
>
>
> -----Original Message-----
> From: Kurt Buff [mailto:[email protected]]
> Sent: Monday, February 13, 2012 5:33 PM
> To: NT System Admin Issues
> Subject: Re: Picking up file server tuning again
>
> It *is* a busy box, and migrating the iSCSI LUNs to a 64bit server is
> something I've definitely considered. I have a Dell R310 with 16gb RAM
> that I could use, but it's already got 9 active VMs, although they're
> not heavy hitters. AFAICT, probably the highest-use machines on the
> ESXi 4.1 box are the secondary DC (no FSMO roles, but does do DNS and
> WINS) and the issuing CA box.
>
> It's currently a VM on what I believe to be an underpowered ESX 3.5
> box - I think it's possible that it's simply starved for resources on
> that ESX box.
>
> I'm sure there's something out there like perfmon for VMware that I
> can use to capture performance over time - I'd like to measure and
> analyze the performance of the ESX 3.5 box while the backups are
> happening against the file server.
>
> I'm also considering moving the Win2k3 file server VM to the ESX box
> and seeing if the situation improves.
>
> Kurt
>
> On Mon, Feb 13, 2012 at 12:08, Michael B. Smith <[email protected]> wrote:
>> That's a busy box. I'd suggest moving to a 64-bit OS.
>>
>> Regards,
>>
>> Michael B. Smith
>> Consultant and Exchange MVP
>> http://TheEssentialExchange.com
>>
>> -----Original Message-----
>> From: Kurt Buff [mailto:[email protected]]
>> Sent: Monday, February 13, 2012 3:00 PM
>> To: NT System Admin Issues
>> Subject: Re: Picking up file server tuning again
>>
>> Ran PAL against the log.
>>
>> Um, wow. It's a freaking Christmas tree - red and yellow all over the
>> place in CPU and disk.
>>
>> Who should I be talking with to analyze this?
>>
>> A sample of the issues shown - all of which show up in more than one
>> time slice - some in every or almost every slice:
>> o- More than 50% Processor Utilization
>> o- More than 30% privileged (kernel) mode CPU usage
>> o- More than 2 packets are waiting in the output queue
>> o- Greater than 25ms physical disk READ response times
>> o- Greater than 25ms physical disk WRITE response times
>> o- More than 80% of Pool Paged Kernel Memory Used
>> o- More than 2 I/O's are waiting on the physical disk
>> o- 20 (Processor(_Total)\DPC Rate)
>> o- More than 30% Interrupt Time
>> o- Greater than 1000 page inputs per second (Memory\Pages Input/sec)
>>
>> Some things that showed no alerts:
>> o- Memory\Available MBytes
>> o- Memory\Free System Page Table Entries
>> o- Memory\Pages/sec
>> o- Memory\System Cache Resident Bytes
>> o- Memory\Cache Bytes
>> o- Memory\% Committed Bytes In Use
>> o- Network Interface(*)\% Network Utilization
>>      MS TCP Loopback interface
>>      VMware Accelerated AMD PCNet Adapter
>>      VMware Accelerated AMD PCNet Adapter#1
>> o- Network Interface(*)\Packets Outbound Errors
>>      MS TCP Loopback interface
>>      VMware Accelerated AMD PCNet Adapter
>>      VMware Accelerated AMD PCNet Adapter#1
>>
>>
>> Kurt
>>
>> On Fri, Feb 10, 2012 at 16:04, Brian Desmond <[email protected]> wrote:
>>> Rather than trying to do this yourself, check out PAL -
>>> http://pal.codeplex.com/. It will set up all the right counters for you and
>>> crunch the data.
>>>
>>> Thanks,
>>> Brian Desmond
>>> [email protected]
>>>
>>> w – 312.625.1438 | c – 312.731.3132
>>>
>>> -----Original Message-----
>>> From: Kurt Buff [mailto:[email protected]]
>>> Sent: Friday, February 10, 2012 4:43 PM
>>> To: NT System Admin Issues
>>> Subject: Picking up file server tuning again
>>>
>>> I'm getting back to monitoring my situation with the file server again, and
>>> just finished a perfmon session covering the 3rd through the 7th of this
>>> month. Simultaneously, I set up perfmon on the same workstation to monitor
>>> the backup server.
>>>
>>> If anyone cares to help, I'd be deeply appreciative.
>>>
>>> I set up perfmon on a Win7 VM on an ESXi 4.1 host to take measurements at
>>> 60 second intervals of a whole bunch of counters, many of them probably
>>> just noise.
>>>
>>> I'll describe the history of the configuration first, however:
>>>
>>> The file server is a Win2k3 R2 VM running on a ESX 3.5 host with 16g of RAM
>>> - it's one of 10 VMs, and is definitely the heaviest hitter in terms of
>>> disk I/O. About 2.5-3 months ago we noticed that the time to completion for
>>> the weekly full backups spiked dramatically.
>>>
>>> Prior to that time, the fulls would start around 7pm on a Friday, and
>>> finish by about 7pm on Sunday.
>>>
>>> Now they take until Thursday or Friday to complete.
>>>
>>> This coincided with some changes to the environment: I had to move the VM
>>> to a new host (it was a manual copy - we don't have vmotion licensed and
>>> configured for these hosts) and at about that time I also had to expand 2
>>> of the 4 LUNS. Finally, the OS drive for the VM on the old host was on a
>>> LUN on our Lefthand unit - I had to migrate it to the local disk storage on
>>> the new home for the VM. The 4 data drives for this VM are attached via the
>>> MSFT iSCSI client running on the VM, not through VMWare's iSCSI client.
>>> So, at that point, all of the LUNS were on the Lefthand SAN, which is a
>>> 3-node cluster, and we use 2-way replication for all LUNS. The 2 LUNS that
>>> were expanded went to 2tb or slightly beyond. The Lefthand has two NSM
>>> 2060s and a P4300G2, with 6 and 8 disks each, respectively - a total of 20
>>> disks.
>>>
>>> Since that time, I've also added in our EMC VNXe 3100 with 6 disks in it in
>>> a RAID6 array. I mention this because this means that all of the file
>>> systems on the VNXe are clean and defragged.
>>>
>>> Currently, I've migrated 3 of the 4 data LUNs for the VM to the EMC. I made
>>> sure to align the partitions on the EMC to a megabyte boundary.
>>>
>>> So, to make this simpler to visualize, a little table:
>>>
>>> c: - local disk on ESX 3.5, 40gb, 23.6gb free
>>> j: - iSCSI LUN on Lefthand, 2.5tb, 900gb free
>>> k: - iSCSI LUN on VNXe, 1.98tb, 336gb free
>>> l: - iSCSI LUN on VNXe, 1tb, 79gb free
>>> m: - iSCSI LUN on VNXe 750gb, 425gb free
>>>
>>> I tried to capture separate disk queue stats for each LUN, but in spite of
>>> selecting and adding each drive letter separately in the perfmon interface,
>>> all I got was _Total.
>>>
>>> Selected stats are as follows:
>>>
>>> PhysicalDisk counters
>>> Current disk queue length - average 0.483, maximum 33.000
>>> Average disk read queue length - 0.037, maximum 1.294
>>> %disk time - average 34.068, maximum 153.877
>>> Average disk write queue length - average 0.645, maximum 2.828
>>> Average disk queue length - average 0.681, maximum 3.078
>>>
>>> I have more data on PhysicalDisk, and data on other objects, including
>>> Memory, NetworkInterface, Paging File, Processor and Server Work Queues.
>>>
>>> If anyone has thoughts, I'd surely like to hear them.
>>>
>>> Thanks,
>>>
>>> Kurt
>>>
>>> ~ Finally, powerful endpoint security that ISN'T a resource hog! ~
>>> ~ <http://www.sunbeltsoftware.com/Business/VIPRE-Enterprise/> ~
>>>
>>> ---
>>> To manage subscriptions click here:
>>> http://lyris.sunbelt-software.com/read/my_forums/
>>> or send an email to [email protected]
>>> with the body: unsubscribe ntsysadmin
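
For anyone who wants to sanity-check a perfmon log by hand alongside PAL, here is a minimal Python sketch. It is not how PAL works internally; it only reads a perfmon CSV export (for example one produced with relog) and reports average, maximum, and the number of samples over a limit. The limits are the ones quoted in the PAL alerts in this thread, but pairing each limit with a specific counter path is my assumption, and the host name FILESRV and all sample values are made up for illustration.

```python
import csv
import io

# Limits quoted from the PAL alerts in this thread. Which perfmon counter each
# alert maps to is an assumption on my part, not something PAL told us.
THRESHOLDS = {
    r"\Processor(_Total)\% Processor Time": 50.0,         # "More than 50% Processor Utilization"
    r"\Processor(_Total)\% Privileged Time": 30.0,        # "More than 30% privileged (kernel) mode"
    r"\PhysicalDisk(_Total)\Avg. Disk sec/Read": 0.025,   # "Greater than 25ms ... READ"
    r"\PhysicalDisk(_Total)\Avg. Disk sec/Write": 0.025,  # "Greater than 25ms ... WRITE"
}

# Made-up sample in the shape of a perfmon CSV export (hypothetical host FILESRV).
SAMPLE = r'''"(PDH-CSV 4.0)","\\FILESRV\Processor(_Total)\% Processor Time","\\FILESRV\PhysicalDisk(_Total)\Avg. Disk sec/Read"
"02/03/2012 19:00:00.000","42.0","0.010"
"02/03/2012 19:01:00.000","68.0","0.040"
"02/03/2012 19:02:00.000"," ","0.031"
'''


def summarize(csv_text):
    """Return {counter: (average, maximum, samples_over_limit)} for known counters."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)

    # Column headers look like \\HOST\Object(Instance)\Counter; drop the host part.
    names = []
    for col in header[1:]:
        parts = col.split("\\", 3)  # ['', '', 'HOST', 'Object(Instance)\Counter']
        names.append("\\" + parts[3] if len(parts) == 4 else col)

    values = {name: [] for name in names}
    for row in reader:
        for name, cell in zip(names, row[1:]):
            try:
                values[name].append(float(cell))
            except ValueError:
                pass  # a blank cell means the sample was missing; skip it

    summary = {}
    for name, samples in values.items():
        if not samples or name not in THRESHOLDS:
            continue
        limit = THRESHOLDS[name]
        over = sum(1 for v in samples if v > limit)
        summary[name] = (sum(samples) / len(samples), max(samples), over)
    return summary


if __name__ == "__main__":
    for counter, (avg, peak, over) in summarize(SAMPLE).items():
        print(f"{counter}: average {avg:.3f}, maximum {peak:.3f}, over limit {over}x")
```

Counting samples over the limit, rather than looking only at the average, is the point here: the averages in the original post look mild while the maximums do not.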
