+1 for PSTs on file shares, or running applications from file shares that
leak handles, which will run your servers out of memory (usually non-paged
pool) after a while. I always run Iometer against my disk subsystems and
take a baseline, so I know what I can throw at them before they choke, and
keep an eye on the disk reads and writes.
Specifically, I look at Avg. Disk sec/Read and Avg. Disk sec/Write (crucial
on SQL servers to make sure that I/O is keeping up; if you go much above
30 ms, you're going to have a latency issue). Also, on servers it's always
good to have additional memory (don't starve the server; usually keep 1 GB
free for x86 and 2 GB free for x64).

Z

Edward E. Ziots
Senior Information Security Engineer
CISSP, Security+, Network+

> From: [email protected]
> To: [email protected]
> Subject: RE: Picking up file server tuning again
> Date: Tue, 14 Feb 2012 06:03:35 +0000
>
> Yes. Security tokens are stored in Paged Pool. When you get the token bloat
> issue (well, if you start approaching it), you will start seeing issues on
> x86 application servers where they are running out of paged pool. If you
> look at a report of paged pool consumers, you'll find the Toke tag at the
> top.
>
> # of spindles is going to directly correlate to disk queue lengths and
> latency. If you have 2 spindles which can do 100 IOPS each, and you are
> throwing 225 IOPS at them, you will have a problem. If you add a third
> spindle, now you have 75 IOPS of head room.
>
> Thanks,
> Brian Desmond
> [email protected]
>
> w – 312.625.1438 | c – 312.731.3132
>
>
> -----Original Message-----
> From: Kurt Buff [mailto:[email protected]]
> Sent: Monday, February 13, 2012 11:13 PM
> To: NT System Admin Issues
> Subject: Re: Picking up file server tuning again
>
> PSTs on file shares - it's been a while since I looked at that issue.
>
> Crappy drivers are a small possibility - it is a P2V of an old machine.
>
> I'm not sure that the number of spindles has anything to do with it, and in
> any case there isn't anything I can do about that for a while.
>
> Can you explain what you mean by "large tokens"? Is that related to token
> bloat in AD, or is it something else?
> > Thanks,
> > Kurt
> >
> > On Mon, Feb 13, 2012 at 19:25, Brian Desmond <[email protected]> wrote:
> > Well, the % Interrupts/DPC Time/Kernel Mode CPU time isn't necessarily
> > going to be fixed by x64. It may very well mean you've got some crappy
> > drivers in play.
> >
> > The disk stuff indicates the disk is not fast enough to keep up with
> > demand. You can solve that with more spindles or faster spindles.
> >
> > Paged Pool utilization will be resolved by x64 (or even x86 on 2008).
> > That's indicative of crappy drivers, large tokens, and/or people doing
> > things like using PSTs off file shares.
> >
> > Thanks,
> > Brian Desmond
> > [email protected]
> >
> > w – 312.625.1438 | c – 312.731.3132
> >
> >
> > -----Original Message-----
> > From: Michael B. Smith [mailto:[email protected]]
> > Sent: Monday, February 13, 2012 6:18 PM
> > To: NT System Admin Issues
> > Subject: RE: Picking up file server tuning again
> >
> > Well, the kernel mode, paged pool, and interrupt time are items that will
> > be specifically reduced with an x64 OS.
> >
> > The I/O situation is indicative of disk queuing, which is "hypervisor
> > related". Dunno how you optimize that in VMware; there are a number of
> > potentials in Hyper-V.
> >
> > Regards,
> >
> > Michael B. Smith
> > Consultant and Exchange MVP
> > http://TheEssentialExchange.com
> >
> >
> > -----Original Message-----
> > From: Kurt Buff [mailto:[email protected]]
> > Sent: Monday, February 13, 2012 5:33 PM
> > To: NT System Admin Issues
> > Subject: Re: Picking up file server tuning again
> >
> > It *is* a busy box, and migrating the iSCSI LUNs to a 64-bit server is
> > something I've definitely considered. I have a Dell R310 with 16 GB of
> > RAM that I could use, but it's already got 9 active VMs, although they're
> > not heavy hitters. AFAICT, probably the highest-use machines on the
> > ESXi 4.1 box are the secondary DC (no FSMO roles, but it does do DNS and
> > WINS) and the issuing CA box.
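[Editor's note: Brian's spindle arithmetic above reduces to a one-line head-room calculation. A minimal sketch, using the thread's own example figure of 100 IOPS per spindle (an illustration, not a vendor spec; real spindles vary by rotational speed and workload):

```python
# Sketch of the IOPS head-room arithmetic from Brian's reply above.
# 100 IOPS per spindle is the example figure from the thread, not a
# measured value for any particular disk.

def iops_headroom(spindles, iops_per_spindle, demand_iops):
    """Spare IOPS after demand; negative means the disks can't keep up."""
    return spindles * iops_per_spindle - demand_iops

print(iops_headroom(2, 100, 225))  # -25: two spindles fall behind, queues grow
print(iops_headroom(3, 100, 225))  # 75: a third spindle restores head room
```

Any sustained negative head room shows up in perfmon as growing disk queue lengths and rising Avg. Disk sec/Read and sec/Write.]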
> >
> > It's currently a VM on what I believe to be an underpowered ESX 3.5 box -
> > I think it's possible that it's simply starved for resources on that ESX
> > box.
> >
> > I'm sure there's something out there like perfmon for VMware that I can
> > use to capture performance over time - I'd like to measure and analyze
> > the performance of the ESX 3.5 box while the backups are happening
> > against the file server.
> >
> > I'm also considering moving the Win2k3 file server VM to the ESX box and
> > seeing if the situation improves.
> >
> > Kurt
> >
> > On Mon, Feb 13, 2012 at 12:08, Michael B. Smith <[email protected]> wrote:
> >> That's a busy box. I'd suggest moving to a 64-bit OS.
> >>
> >> Regards,
> >>
> >> Michael B. Smith
> >> Consultant and Exchange MVP
> >> http://TheEssentialExchange.com
> >>
> >> -----Original Message-----
> >> From: Kurt Buff [mailto:[email protected]]
> >> Sent: Monday, February 13, 2012 3:00 PM
> >> To: NT System Admin Issues
> >> Subject: Re: Picking up file server tuning again
> >>
> >> Ran PAL against the log.
> >>
> >> Um, wow. It's a freaking Christmas tree - red and yellow all over the
> >> place in CPU and disk.
> >>
> >> Who should I be talking with to analyze this?
> >>
> >> A sample of the issues shown - all of which show up in more than one
> >> time slice, some in every or almost every slice:
> >> o- More than 50% Processor Utilization
> >> o- More than 30% privileged (kernel) mode CPU usage
> >> o- More than 2 packets waiting in the output queue
> >> o- Greater than 25 ms physical disk READ response times
> >> o- Greater than 25 ms physical disk WRITE response times
> >> o- More than 80% of Pool Paged Kernel Memory used
> >> o- More than 2 I/Os waiting on the physical disk
> >> o- 20 (Processor(_Total)\DPC Rate)
> >> o- More than 30% Interrupt Time
> >> o- Greater than 1000 page inputs per second (Memory\Pages Input/sec)
> >>
> >> Some things that showed no alerts:
> >> o- Memory\Available MBytes
> >> o- Memory\Free System Page Table Entries
> >> o- Memory\Pages/sec
> >> o- Memory\System Cache Resident Bytes
> >> o- Memory\Cache Bytes
> >> o- Memory\% Committed Bytes In Use
> >> o- Network Interface(*)\% Network Utilization
> >>      MS TCP Loopback interface
> >>      VMware Accelerated AMD PCNet Adapter
> >>      VMware Accelerated AMD PCNet Adapter#1
> >> o- Network Interface(*)\Packets Outbound Errors
> >>      MS TCP Loopback interface
> >>      VMware Accelerated AMD PCNet Adapter
> >>      VMware Accelerated AMD PCNet Adapter#1
> >>
> >> Kurt
> >>
> >> On Fri, Feb 10, 2012 at 16:04, Brian Desmond <[email protected]> wrote:
> >>> Rather than trying to do this yourself, check out PAL -
> >>> http://pal.codeplex.com/. It will set up all the right counters for you
> >>> and crunch the data.
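[Editor's note: the disk-latency alerts in the list above are simple threshold checks over the logged samples. A minimal sketch of that logic, assuming hypothetical sample values; note perfmon reports Avg. Disk sec/Read in seconds, so the 25 ms alert level is 0.025:

```python
# Minimal sketch of the threshold test behind the 25 ms disk-latency
# alerts listed above. The sample list is hypothetical data, one value
# per 60-second collection interval, in seconds (perfmon's native unit
# for "Avg. Disk sec/Read").

ALERT_SEC = 0.025  # the "greater than 25ms" alert level

def flag_slow_samples(samples_sec, threshold=ALERT_SEC):
    """Return the indices of samples that breach the latency threshold."""
    return [i for i, v in enumerate(samples_sec) if v > threshold]

samples = [0.004, 0.031, 0.012, 0.060]  # hypothetical logged values
print(flag_slow_samples(samples))  # [1, 3]
```

PAL does the same kind of comparison per time slice, which is why a slice with even a few slow samples lights up red or yellow in its report.]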
> >>>
> >>> Thanks,
> >>> Brian Desmond
> >>> [email protected]
> >>>
> >>> w – 312.625.1438 | c – 312.731.3132
> >>>
> >>> -----Original Message-----
> >>> From: Kurt Buff [mailto:[email protected]]
> >>> Sent: Friday, February 10, 2012 4:43 PM
> >>> To: NT System Admin Issues
> >>> Subject: Picking up file server tuning again
> >>>
> >>> I'm getting back to monitoring my situation with the file server again,
> >>> and just finished a perfmon session covering the 3rd through the 7th of
> >>> this month. Simultaneously, I set up perfmon on the same workstation to
> >>> monitor the backup server.
> >>>
> >>> If anyone cares to help, I'd be deeply appreciative.
> >>>
> >>> I set up perfmon on a Win7 VM on an ESXi 4.1 host to take measurements
> >>> at 60-second intervals of a whole bunch of counters, many of them
> >>> probably just noise.
> >>>
> >>> I'll describe the history of the configuration first, however:
> >>>
> >>> The file server is a Win2k3 R2 VM running on an ESX 3.5 host with 16 GB
> >>> of RAM - it's one of 10 VMs, and is definitely the heaviest hitter in
> >>> terms of disk I/O. About 2.5-3 months ago we noticed that the time to
> >>> completion for the weekly full backups spiked dramatically.
> >>>
> >>> Prior to that time, the fulls would start around 7pm on a Friday and
> >>> finish by about 7pm on Sunday.
> >>>
> >>> Now they take until Thursday or Friday to complete.
> >>>
> >>> This coincided with some changes to the environment: I had to move
> >>> the VM to a new host (it was a manual copy - we don't have vMotion
> >>> licensed and configured for these hosts), and at about that time I
> >>> also had to expand 2 of the 4 LUNs. Finally, the OS drive for the
> >>> VM on the old host was on a LUN on our Lefthand unit - I had to
> >>> migrate it to the local disk storage on the new home for the VM. The
> >>> 4 data drives for this VM are attached via the MSFT iSCSI initiator
> >>> running in the VM, not through VMware's iSCSI client.
> >>> So, at that point, all of the LUNs were on the Lefthand SAN, which is
> >>> a 3-node cluster, and we use 2-way replication for all LUNs. The 2 LUNs
> >>> that were expanded went to 2 TB or slightly beyond. The Lefthand has
> >>> two NSM 2060s and a P4300G2, with 6 and 8 disks each, respectively - a
> >>> total of 20 disks.
> >>>
> >>> Since that time, I've also added in our EMC VNXe 3100 with 6 disks in
> >>> it in a RAID6 array. I mention this because it means that all of the
> >>> file systems on the VNXe are clean and defragged.
> >>>
> >>> Currently, I've migrated 3 of the 4 data LUNs for the VM to the EMC. I
> >>> made sure to align the partitions on the EMC to a megabyte boundary.
> >>>
> >>> So, to make this simpler to visualize, a little table:
> >>>
> >>> c: - local disk on ESX 3.5, 40 GB, 23.6 GB free
> >>> j: - iSCSI LUN on Lefthand, 2.5 TB, 900 GB free
> >>> k: - iSCSI LUN on VNXe, 1.98 TB, 336 GB free
> >>> l: - iSCSI LUN on VNXe, 1 TB, 79 GB free
> >>> m: - iSCSI LUN on VNXe, 750 GB, 425 GB free
> >>>
> >>> I tried to capture separate disk queue stats for each LUN, but in spite
> >>> of selecting and adding each drive letter separately in the perfmon
> >>> interface, all I got was _Total.
> >>>
> >>> Selected stats are as follows:
> >>>
> >>> PhysicalDisk counters:
> >>> Current disk queue length - average 0.483, maximum 33.000
> >>> Average disk read queue length - average 0.037, maximum 1.294
> >>> % disk time - average 34.068, maximum 153.877
> >>> Average disk write queue length - average 0.645, maximum 2.828
> >>> Average disk queue length - average 0.681, maximum 3.078
> >>>
> >>> I have more data on PhysicalDisk, and data on other objects, including
> >>> Memory, Network Interface, Paging File, Processor and Server Work
> >>> Queues.
> >>>
> >>> If anyone has thoughts, I'd surely like to hear them.
> >>>
> >>> Thanks,
> >>>
> >>> Kurt
> >>>
> >>> ~ Finally, powerful endpoint security that ISN'T a resource hog!
~ ~
> >>> <http://www.sunbeltsoftware.com/Business/VIPRE-Enterprise/> ~
> >>>
> >>> ---
> >>> To manage subscriptions click here:
> >>> http://lyris.sunbelt-software.com/read/my_forums/
> >>> or send an email to [email protected]
> >>> with the body: unsubscribe ntsysadmin
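[Editor's postscript: on Kurt's note that he only captured _Total despite adding each drive letter, collecting with a wildcard instance path (e.g. typeperf "\PhysicalDisk(*)\Avg. Disk Queue Length" on the server itself) logs every disk instance as its own CSV column. Once that log exists, a short script can pull a per-LUN average out of it. A hedged sketch; the file name and layout (timestamp in column 0, one counter per remaining column) are assumptions about a typical typeperf/relog CSV export, not anything from the thread:

```python
import csv
from collections import defaultdict

# Sketch: average each per-instance PhysicalDisk column from a typeperf-style
# CSV, so every LUN is visible instead of only _Total. Assumes the timestamp
# is in column 0 and each remaining column is one counter instance.

def per_column_averages(path):
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    header, data = rows[0], rows[1:]
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in data:
        for name, value in zip(header[1:], row[1:]):
            try:
                totals[name] += float(value)
                counts[name] += 1
            except ValueError:
                pass  # skip blank or non-numeric samples
    return {name: totals[name] / counts[name] for name in totals}
```

Against a per-instance log this yields one average queue length (or latency) per drive letter, which is what you'd want in order to compare the Lefthand LUN against the VNXe LUNs directly.]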
