I'm thinking I look at performance testing differently than a lot of people...
For me, the objective of performance testing is to establish what your system _can_ do, not whether it meets some predefined requirement. So when you set up your tests, you try to drive your systems at maximum capacity for some extended period of time. Then you measure that capacity as your 'baseline'.
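
As a rough illustration, here is the kind of thing you can do to turn a JMeter run into a 'capacity' number. This is only a sketch: it assumes the default CSV JTL format (a header row with 'timeStamp' and 'elapsed' columns), does a naive comma split, and the 'results.jtl' file name is just a placeholder:

    import java.nio.file.*;
    import java.util.*;

    // Sketch: compute throughput and a 90th-percentile response time from a
    // JMeter CSV results file. Assumes the default CSV JTL header; the file
    // name is made up.
    public class BaselineStats {
        public static void main(String[] args) throws Exception {
            List<String> lines = Files.readAllLines(Paths.get("results.jtl"));
            String[] header = lines.get(0).split(",");
            int tsCol = Arrays.asList(header).indexOf("timeStamp");
            int elapsedCol = Arrays.asList(header).indexOf("elapsed");

            List<Long> elapsed = new ArrayList<>();
            long minTs = Long.MAX_VALUE, maxTs = Long.MIN_VALUE;
            for (String line : lines.subList(1, lines.size())) {
                String[] f = line.split(",");   // naive split; fine for a sketch
                long ts = Long.parseLong(f[tsCol]);
                elapsed.add(Long.parseLong(f[elapsedCol]));
                minTs = Math.min(minTs, ts);
                maxTs = Math.max(maxTs, ts);
            }
            Collections.sort(elapsed);
            double seconds = (maxTs - minTs) / 1000.0;
            System.out.printf("throughput=%.1f req/s, p90=%d ms%n",
                    elapsed.size() / seconds,
                    elapsed.get((int) (elapsed.size() * 0.9)));
        }
    }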

For every subsequent release of your code, you measure it against the 'baseline' and determine whether the code got faster or slower. If you determine that the slower (or faster) response is still acceptable to your end users (because you were nowhere near the user's acceptable standard), you can reset your baseline to that new measurement. If the slower result is starting to encroach on the usability of the system, you can declare that baseline as the minimum spec, and then fail any release that drops below that standard.
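
In practice that baseline check can be a simple tolerance test. A minimal Java sketch - the 10% drift allowance and the millisecond figures are invented for illustration, not numbers we actually use:

    // Fail the build when the new run's 90th percentile is more than some
    // tolerance worse than the stored baseline. Tolerance and values are
    // illustrative only.
    public class BaselineCheck {
        static final double TOLERANCE = 0.10;   // allow 10% drift before failing

        static boolean acceptable(long baselineP90Ms, long currentP90Ms) {
            return currentP90Ms <= baselineP90Ms * (1.0 + TOLERANCE);
        }

        public static void main(String[] args) {
            long baseline = 420;   // ms, from the last accepted release (made-up number)
            long current  = 455;   // ms, from this release's run (made-up number)
            if (acceptable(baseline, current)) {
                System.out.println("OK - within tolerance; consider resetting the baseline");
            } else {
                System.out.println("FAIL - performance regressed past the allowed drift");
            }
        }
    }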

As for how you determine what is acceptable to a 'user', that can be handled in a number of ways - without actually improving the 'real' performance of the system. Consider a web page that loads a bunch of rows of data in a big table. For most users, if they can start reading the table within 1-2 seconds, that is acceptable performance. But if there are hundreds of rows of data, you would not need to load _all_ the rows within 1-2 seconds to meet their performance criteria. You only need to load enough rows that the table fills the browser - so they can start reading - within the 1-2 second period. JMeter cannot really measure this timing; it can only measure the 'overall response time' (indeed, I don't know of any testing tool that can do it). So trying to define a performance benchmark in terms of what 'users' experience is really difficult, and nearly useless (to me anyway).
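
If you do want to chase that 'fill the browser first' behavior in the application itself, the usual approach is to page the query instead of fetching every row up front. A rough JDBC sketch - the table name, columns, page size, and the LIMIT/OFFSET syntax (MySQL/PostgreSQL style) are all assumptions, not anything from our system:

    import java.sql.*;

    // Load only enough rows to fill one screen; remaining pages load on demand.
    // Table/column names and the page size are made up for illustration.
    public class FirstPageLoader {
        static final int PAGE_SIZE = 50;   // roughly one browser screen of rows

        static void loadPage(Connection conn, int pageNumber) throws SQLException {
            String sql = "SELECT id, name, created_at FROM orders "
                       + "ORDER BY created_at DESC LIMIT ? OFFSET ?";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setInt(1, PAGE_SIZE);
                ps.setInt(2, pageNumber * PAGE_SIZE);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        // render the row; the user can start reading right away
                    }
                }
            }
        }
    }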

I look at performance testing as a way to cross-check my development team against the perpetual tendency to gum up the code and slow things down. So in order to make the testing effective for the developers, I need to perf test _very_ specific things. Trying to performance test the "system" as a whole is nearly an impossible task - not only because there are so many variables that influence the tests, but precisely because all of those variables make it impossible to debug which one causes the bottleneck when there is a change in performance from one release to the next. (Have you ever sent your programmers off to 'fix' a performance problem that turned out to be caused by an O/S update on your server? I have...)

Instead, we create performance tests that test specific functional systems. That is, the "login" perf test. The "registration" perf test. The "..." perf test. Each one of these tests is run independently, so that when we encounter a slower benchmark, we can tell the developers immediately where to concentrate their efforts in fixing the problem. (We also monitor all parts of the system - CPU, IO, database transactions (reads, writes, full table scans, etc.) - from all servers involved in the test.) The goal is not to simulate 'real user activity'; it is to max out the capacity of at least 1 of the servers in the test (specifically the one executing the 'application logic'). If we max out that one server, we know that our 'benchmark' is the most we can expect of a single member of our cluster of machines. (We also test a cluster of 2 machines and measure the fall-off in capacity between a 1-member cluster and a 2-member cluster; this gives us an idea of how much impact our 'clustering' system has on performance as well.) I suppose you could say I look at it this way: we measure the 'maximum capacity', and so long as the number of users doesn't exceed that, we will perform OK.
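
The fall-off number itself is just arithmetic: compare the measured 2-member throughput with double the 1-member capacity. The figures here are made up purely to show the calculation:

    // Back-of-the-envelope cluster fall-off: measured 2-node capacity vs.
    // double the 1-node baseline. Numbers are invented for illustration.
    public class ClusterFallOff {
        public static void main(String[] args) {
            double oneNodeReqPerSec = 900.0;    // measured single-server capacity (made up)
            double twoNodeReqPerSec = 1600.0;   // measured 2-member-cluster capacity (made up)

            double ideal = 2 * oneNodeReqPerSec;
            double scalingEfficiency = twoNodeReqPerSec / ideal;   // about 89% here
            double clusteringOverhead = 1.0 - scalingEfficiency;   // about 11% fall-off

            System.out.printf("scaling efficiency: %.0f%%, clustering overhead: %.0f%%%n",
                    scalingEfficiency * 100, clusteringOverhead * 100);
        }
    }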

We do run some 'all-encompassing' system tests as well, but those are more for 'stress' testing than for performance benchmarking. We are specifically looking for things that start to break down after hours of continuous operation at peak capacity. So we monitor error logs and look to make sure that we aren't throwing errors while under stress.
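
The log check during those runs can be as simple as counting suspicious lines. In this sketch the log path and the 'ERROR'/'Exception' markers are assumptions about your own logging format:

    import java.io.IOException;
    import java.nio.file.*;
    import java.util.stream.Stream;

    // Count error-looking lines in an application log after a soak run.
    // Path and markers are assumptions; adjust to your own log format.
    public class SoakLogCheck {
        public static void main(String[] args) throws IOException {
            Path log = Paths.get("/var/log/myapp/application.log");  // hypothetical path
            long errorCount;
            try (Stream<String> lines = Files.lines(log)) {
                errorCount = lines
                        .filter(l -> l.contains("ERROR") || l.contains("Exception"))
                        .count();
            }
            System.out.println("error-ish lines during the soak run: " + errorCount);
            if (errorCount > 0) {
                System.exit(1);   // flag the run for a human to look at
            }
        }
    }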

The number one thing to keep in mind about performance testing is that you have to use 'real data'. We actually download our production database every weekend and strip out any 'personal information' (stuff that we protect in our production environment) by either nulling it out or replacing it with bogus data. This allows us to run our performance tests against a database that has 100s of millions of rows of data. Nearly all of our performance 'bugs' have been caused by poor data handling in the code: SQL requests that don't use indices (causing a full table scan), badly formed joins, fetching a few rows of data and then looping through them in the code (when the 'few rows of data' from your 'dev' environment become 100,000 rows with the production data, this tends to bog the code down a lot), etc. So if you are testing with 'faked' data, odds are good you will miss a lot of performance issues - no matter what form of performance testing you use.
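
For what it's worth, the scrub step can be as plain as a handful of UPDATE statements run against the restored copy. Everything here - the JDBC URL, tables, and columns - is invented; the point is just to null out or overwrite the sensitive fields while keeping the data volume and index shapes realistic:

    import java.sql.*;

    // Scrub personally identifying columns in the restored production copy.
    // Connection string, table and column names are hypothetical.
    public class ScrubPersonalData {
        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:mysql://perf-db/perf_copy", "perf", "secret");  // hypothetical DSN
                 Statement st = conn.createStatement()) {
                // Overwrite with bogus-but-plausible values so row sizes stay
                // close to production.
                st.executeUpdate("UPDATE users SET email = CONCAT('user', id, '@example.com')");
                st.executeUpdate("UPDATE users SET phone = NULL, last_name = 'Test'");
                st.executeUpdate("UPDATE addresses SET street = '123 Test St'");
            }
        }
    }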

I will say that we have served over 130M web pages in 1 month, using only 5 servers (4 Tomcats and 1 DB server)... Those pages represented about 10x that many "GET" requests to our servers...

--
Robin D. Wilson
Sr. Director of Web Development
KingsIsle Entertainment, Inc.
http://www.kingsisle.com


-----Original Message-----
From: nmq [mailto:nmq0...@gmail.com] 
Sent: Monday, July 01, 2013 2:33 PM
To: JMeter Users List
Subject: Establishing baseline metrics

Hi all

This is not a JMeter-specific question, but since this user list comprises experts in performance testing, I figured it would be a good place to ask.

My question is: how do you establish baselines for a website's performance if you do not have any historical data? Let's say this is a new website and it's for a limited number of customers.

How do you determine the number of concurrent users you should simulate?

Let's say the executives say, off the top of their heads, that the maximum number of concurrent users would be 50 at peak times. Does that mean I should not go beyond 50, or should I still do tests with a higher number?

How can I go about establishing baselines for page load times if I do not have any historical data and have no industry benchmarks or competitor data?

Would it make sense to just see how the website does throughout the development phase and establish our baseline using the current response times?

I would appreciate any input.


Regards
Sam


