>From: Matthew Nuzum <[EMAIL PROTECTED]>
>Sent: Sep 28, 2005 4:02 PM
>Subject: [PERFORM] Logarithmic change (decrease) in performance
>
Small nit-pick: A "logarithmic decrease" in performance would be a
relatively good thing, being better than either a linear or
exponential decrease in performance. What you are describing is the
worst kind: an _exponential_ decrease in performance.

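To make the difference between these growth shapes concrete, here is a minimal sketch that tabulates logarithmic, linear, quadratic, and exponential growth side by side. The function name growth_table and the sample points are illustrative assumptions, not anything measured on the server in question. Strictly speaking, the y=x^2 curve mentioned in the quoted message below is quadratic rather than exponential, but both are far worse than logarithmic.

```python
import math

def growth_table(xs):
    """Return rows of (x, log2(x), x, x**2, 2**x) for each x in xs."""
    return [(x, math.log2(x), x, x ** 2, 2 ** x) for x in xs]

# Hypothetical sample points; think of x as "dataset size" in arbitrary units.
print(f"{'x':>3} {'log2':>6} {'lin':>4} {'x^2':>6} {'2^x':>12}")
for x, log_x, lin, quad, expo in growth_table([2, 4, 8, 16, 32]):
    print(f"{x:>3} {log_x:>6.1f} {lin:>4} {quad:>6} {expo:>12}")
```

The logarithmic column barely moves while the last two columns run away, which is why a truly logarithmic slowdown would be good news.
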
>Something interesting is going on. I wish I could show you the graphs,
>but I'm sure this will not be a surprise to the seasoned veterans.
>
>A particular application server I have has been running for over a
>year now. I've been logging CPU load since mid-April.
>
>It took 8 months or more to fall from excellent performance to
>"acceptable." Then, over the course of about 5 weeks it fell from
>"acceptable" to "so-so." Then, in the last four weeks it's gone from
>"so-so" to alarming.
>
>I've been working on this performance drop since Friday but it wasn't
>until I replied to Arnau's post earlier today that I remembered I'd
>been logging the server load. I grabbed the data and charted it in
>Excel and to my surprise, the graph of the server's load average looks
>kind of like the graph of y=x^2.
>
>I've got to make a recommendation for a solution to the PHB and my
>analysis is showing that as the dataset becomes larger, the amount of
>time the disk spends seeking is increasing. This causes processes to
>take longer to finish, which causes more processes to pile up, which
>causes processes to take longer to finish, which causes more processes
>to pile up etc. It is this growing dataset that seems to be the source
>of the sharp decrease in performance.
>
>I knew this day would come, but I'm actually quite surprised that when
>it came, there was little time between the warning and the grande
>finale. I guess this message is being sent to the list to serve as a
>warning to other data warehouse admins that when you reach your
>capacity, the downward spiral happens rather quickly.
>
Yep, definitely been where you are. Bottom line: you have to reduce
the sequential seeking behavior of the system to within an acceptable
window and then keep it there.

1= keep more of the data set in RAM
2= increase the size of your HD IO buffers
3= make your RAID sets wider (more parallel vs sequential IO)
4= reduce the atomic latency of your RAID sets
   (time for Fibre Channel 15Krpm HD's vs 7.2Krpm SATA ones?)
5= make sure your data is as unfragmented as possible
6= change your DB schema to minimize the problem
   a= overall good schema design
   b= partitioning the data so that the system only has to manipulate
      a reasonable chunk of it at a time.

In many cases, there are a number of ways to accomplish the above.
Unfortunately, most of them require CapEx.

Also, ITRW such systems tend to have this as a chronic problem.
This is not a "fix it once and it goes away forever" situation. It is
a part of the regular maintenance and upgrade plan(s).

Good Luck,
Ron
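
P.S. The "pile up" spiral in the quoted message, and the short gap between warning and collapse, can be illustrated with a textbook single-server queue. This is only a sketch under stated assumptions: it uses the standard M/M/1 formula (mean response time = S / (1 - rho), with rho = arrival rate * S) as a stand-in for a disk-bound server, and the arrival rate and service times are hypothetical numbers, not measurements from the system being discussed.

```python
def mm1_response_time(arrival_rate, service_time):
    """Mean response time of an M/M/1 queue: S / (1 - rho), rho = lambda * S."""
    utilization = arrival_rate * service_time
    if utilization >= 1.0:
        return float("inf")  # demand exceeds capacity: the queue grows without bound
    return service_time / (1.0 - utilization)

ARRIVAL_RATE = 50.0  # hypothetical: requests per second, held constant

# Hypothetical per-request service times, creeping up as seeks get longer.
for service_ms in (5, 10, 15, 18, 19, 19.8):
    s = service_ms / 1000.0
    r = mm1_response_time(ARRIVAL_RATE, s)
    print(f"service {service_ms:>5} ms  util {ARRIVAL_RATE * s:4.0%}  "
          f"response {r * 1000:8.1f} ms")
```

In this model, going from 5 ms to 15 ms of service time costs little, while the last few milliseconds before saturation multiply the response time many times over. That is one plausible reading of why the decline looks gentle for months and then falls off a cliff, and why the suggestions above all aim at keeping per-request IO time well below the saturation point.
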