The things you are discussing below, Reuti, are all valid and will have varying degrees of pertinence for different people. That also means they will not be easy to integrate into a single benchmark. One could build a benchmark for each of them, of course.

I can only say what I would probably do if I was tasked to build a benchmark in this context:

- I'd take an application profile, probably something well understood like SPEC. I might only pick a few of its apps for simplicity, though. (This part would need precise definition, really, but I don't want to get into this here.)

- I'd then say: "Use whatever you want, but run as many runs through that application portfolio as you can within N hours, and make sure you do the following:
  * Within a percentage X (say 5%), each application in the portfolio needs to consume the same amount of wallclock time * core count (so you can't just run only the app you prefer).
  * Count the number of jobs you've been able to run (individual runs of each app from the portfolio).
  * Report that number - that is your benchmark result."

(Again, you'd need to define that in much more detail, because cheating on benchmarks is as old as benchmarking itself, but you get my idea, I hope.)
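
For illustration, a rough sketch (in Python) of what such a throughput count could look like on an SGE-style cluster. The portfolio scripts, the polling interval and the job-id parsing are placeholders/assumptions, not a finished harness - a real one would keep the cluster saturated and enforce the per-application wallclock*cores balance:

#!/usr/bin/env python
# Minimal sketch of the "count completed portfolio runs in N hours" idea.
# Assumptions: qsub/qstat are on PATH and each portfolio entry is a
# self-contained job script; the script names below are placeholders.
import itertools
import subprocess
import time

PORTFOLIO = ["run_app_a.sh", "run_app_b.sh", "run_app_c.sh"]  # placeholders
WINDOW_HOURS = 4                                              # the "N hours"

def submit(script):
    # qsub normally answers with "Your job <id> (...) has been submitted"
    out = subprocess.check_output(["qsub", script]).decode()
    return out.split()[2]

def jobs_in_system():
    # Non-empty qstat output means jobs are still queued or running
    # (the exact default scope of qstat may depend on the configuration).
    return bool(subprocess.check_output(["qstat"]).decode().strip())

deadline = time.time() + WINDOW_HOURS * 3600
completed = 0
apps = itertools.cycle(PORTFOLIO)   # round-robin keeps per-app counts balanced

while time.time() < deadline:
    job_id = submit(next(apps))
    # A real benchmark would keep many jobs in flight and check the
    # "within X% of wallclock * cores per application" rule; this sketch
    # only illustrates the counting.
    while jobs_in_system() and time.time() < deadline:
        time.sleep(30)
    if not jobs_in_system():
        completed += 1
        print("job %s done, %d runs so far" % (job_id, completed))

print("Benchmark result: %d portfolio runs in %d hours" % (completed, WINDOW_HOURS))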

Granted, this does not benchmark a scheduler per se. So it's kind of off-topic here, but it would sort of benchmark the efficiency of a data center. You'd basically be able to say "I'm getting 20% more out of my DC at the same cost as the other guy", or "to get to the same result I'd need to spend $Y at Amazon". Stuff like that.

The DRM would only be one component in the picture, but that's fine. So is everything else, and that is the reality.

Cheers,

Fritz


On 17.02.11 12:25, Reuti wrote:
On 17.02.2011 at 09:07, Fritz Ferstl wrote:

My 2 cents here (and I'm aware they will not help Eric ... and apologies in 
advance for the rant, it's a long-term heartfelt topic  ...):

A DRM benchmark would be nice to have. Benchmarking in general is an almost futile exercise: you have to be very prescriptive about the boundary conditions to achieve comparable results, and such narrow boundary conditions can almost never reflect reality. So all benchmarks are open to interpretation and debate.

But, benchmarks are still a useful means to provide at least some orientation. As Chris 
has stated, the variability in the use case scenarios of workload managers is certainly 
even bigger than in classical performance benchmarks such as SPEC or Linpack. You also 
have to be careful what you are measuring: the underlying HW, network & storage
performance? Or the efficiency of the SW? Or the ability to tune the workload 
management system - in itself and in combination with HW & SW underneath? Or the
suitability of the workload management system for a specific application case?

So I guess that probably a suite of benchmarks would be needed, maybe akin to SPEC, to 
provide at least a roughly representative picture. And you'd have to either standardize 
on the HW, e.g. take 100 Amazon dedicated servers and run with that, or you'd have to do 
it like for Linpack and say: "I don't care what you use and how much of it but 
report the resulting throughput vs time numbers on these use cases." I.e. how fast 
can you possibly get. In other words something like the Top500 for workload management 
environments.

For many companies and institutions the workload manager has become the central workhorse - the conveyor belt of a data center. If it stops, everything stops. If you can make it run quicker, you get your results sooner. And if it enables you to, you can be much more flexible in responding to changing demands. So it's almost ironic that large computing centers benchmark individual server performance, run something like Linpack to advertise their peak performance, and create their own site-specific application benchmark suites for selecting new HW, but often do not benchmark with the workload management system in the picture - the component which later, in combination with tuning and the rest of the environment, will define the efficiency of the data center.

So a benchmark for DRMs would be a highly useful tool. I've always wondered how 
to get an initiative started which would lead to such a benchmark ...

The question is: what to measure in a queuing system? This could be:

a) - The load on the qmaster machine from handling the submitted jobs (disk, memory, and CPU time used).
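
To make a) a bit more tangible, a small sketch that samples the master daemon's memory and CPU use while a burst of trivial jobs is submitted. It assumes a Linux-style ps, a master process named sge_qmaster, and that binary jobs can be submitted with "qsub -b y" - adjust to taste:

#!/usr/bin/env python
# Rough sketch for a): watch sge_qmaster resource usage during a submission burst.
# Assumes a Linux procps "ps" and a master process named "sge_qmaster".
import subprocess
import time

def qmaster_stats():
    # RSS in kB and %CPU of the master daemon; (0, 0.0) if it isn't found.
    try:
        out = subprocess.check_output(
            ["ps", "-C", "sge_qmaster", "-o", "rss=,pcpu="]).decode().split()
        return int(out[0]), float(out[1])
    except (subprocess.CalledProcessError, IndexError):
        return 0, 0.0

N_JOBS = 1000
samples = []
for i in range(N_JOBS):
    # trivial dummy job; "-b y" submits the command directly as a binary job
    subprocess.check_output(["qsub", "-b", "y", "sleep", "60"])
    if i % 100 == 0:
        samples.append(qmaster_stats())
        time.sleep(1)

for rss_kb, cpu in samples:
    print("qmaster RSS %8d kB, CPU %5.1f%%" % (rss_kb, cpu))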

b) - Time wasted switching from one job to the next on an exec host (maybe by already sending the next job there in a "ready" state beforehand and releasing it as soon as the former job finishes). This would be interesting for workflows where many short jobs are submitted.
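
As a sketch of how b) could be measured without touching scheduler internals: let each short job log its own start and end time to a shared file, then look at the gap between one job's end and the next job's start. The log path, the host pinning and the assumption that the jobs run strictly one after another are all placeholders/assumptions:

#!/usr/bin/env python
# Sketch for b): estimate the dead time between consecutive short jobs.
# Assumes the jobs serialize on a single slot so their log lines don't interleave.
import subprocess
import tempfile

LOG = "/shared/turnaround.log"   # placeholder: must be visible on the exec host
N_JOBS = 50

script = """#!/bin/sh
echo "start $(date +%s)" >> {log}
sleep 5                          # deliberately short payload
echo "end $(date +%s)" >> {log}
""".format(log=LOG)

job = tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False)
job.write(script)
job.close()

for _ in range(N_JOBS):
    # pin all jobs to one host so they compete for the same slot;
    # "hostname=node001" is a placeholder resource request
    subprocess.check_output(["qsub", "-l", "hostname=node001", job.name])

def gaps(path):
    # log lines alternate "start <t>" / "end <t>" per job
    times = [int(line.split()[1]) for line in open(path)]
    return [times[i + 1] - times[i] for i in range(1, len(times) - 1, 2)]

# once all jobs have drained:
#   print(gaps(LOG))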

c) - Time wasted by resource reservation. It came up a couple of times on the former list to have some kind of real-time feature in SGE, so that you know exactly when a job starts and will end (assuming you know the necessary execution times beforehand). In this context, solving a cutting-stock-like problem would also fit (to minimize the overall wallclock time): given a bunch of jobs, is the queuing system capable of reordering them in such a way that the time wasted on resource reservation is at its minimum (possibly zero) and the complete bunch of jobs finishes in the shortest wallclock time possible?
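
To make the reordering question in c) concrete, here is a tiny baseline sketch: the classic "longest processing time first" greedy for packing jobs with known runtimes onto identical hosts so the whole bunch finishes early. It is not what any particular scheduler does - just the kind of reference number a benchmark could compare a scheduler's actual makespan against:

#!/usr/bin/env python
# Sketch for c): LPT (longest processing time first) greedy as a baseline
# for the minimum-makespan question, assuming known runtimes and identical
# single-slot hosts. Real schedulers and real resource requests are messier.
import heapq

def lpt_makespan(runtimes, n_hosts):
    hosts = [0.0] * n_hosts            # min-heap of "host becomes free at"
    heapq.heapify(hosts)
    for r in sorted(runtimes, reverse=True):   # longest jobs first
        free_at = heapq.heappop(hosts)         # earliest-available host
        heapq.heappush(hosts, free_at + r)
    return max(hosts)

jobs = [7, 7, 6, 6, 5, 4, 4, 3, 2, 2]          # made-up runtimes in hours
print("ideal lower bound (sum/hosts):", sum(jobs) / 4.0)        # 11.5 h
print("LPT makespan on 4 hosts:      ", lpt_makespan(jobs, 4))  # 12.0 h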

d) - Can I tell the scheduler that I have varying resource requirements over the runtime of the job, to lower the amount of wasted resources? Can the queuing system move my job around in the cluster depending on the resources needed in certain steps of the job? All to minimize the unused resources.
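
A quick worked example of the waste d) is about, with made-up phase numbers: if a job needs 8 cores only for short pre- and post-processing phases but a single core for a long middle phase, a flat 8-core reservation idles 7 cores for the whole middle phase:

# Made-up phases: (cores actually needed, hours). A flat reservation has to
# hold the peak core count for the entire runtime.
phases = [(8, 1), (1, 10), (8, 1)]
peak = max(cores for cores, _ in phases)
used = sum(cores * hours for cores, hours in phases)   # 8 + 10 + 8 = 26 core-hours
reserved = peak * sum(hours for _, hours in phases)    # 8 * 12    = 96 core-hours
print("core-hours wasted by a flat reservation:", reserved - used)   # 70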

e) - Can c) be combined with job dependencies in a directed graph (a job that fails shouldn't trigger the start of the next job waiting on a job-hold) and with decisions about which of two alternative jobs should be executed at all? (Sadly, the project http://wildfire.bii.a-star.edu.sg/ has stopped.)
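
For e), a minimal sketch of how a two-step dependency could be wired up today. The script names and the status file are placeholders, and it assumes qsub's -hold_jid option acts purely as an ordering constraint, so the "don't run after a failed predecessor" check is pushed into the dependent script itself:

#!/usr/bin/env python
# Sketch for e): a two-step chain using qsub -hold_jid.
# step_a.sh is assumed to record its exit status somewhere shared;
# step_b.sh is assumed to check that status and bail out on failure.
import subprocess

def qsub(*args):
    # qsub normally answers with "Your job <id> (...) has been submitted"
    out = subprocess.check_output(["qsub"] + list(args)).decode()
    return out.split()[2]

job_a = qsub("step_a.sh")
job_b = qsub("-hold_jid", job_a, "step_b.sh")   # held until job_a has finished
print("submitted %s -> %s" % (job_a, job_b))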

Somehow I have the impression that no real benchmark is possible in the sense of "always faster" and, next year, "faster again". That would end up with the criticism you state about a plain Linpack test.

It's more like a conformance test according to a certain set of rules. Imagine you have queuing system A, which can reorder jobs and minimize wasted resources (but puts a high load on the qmaster machine), and queuing system B, which puts nearly no load on the qmaster machine (but handles jobs with just a FIFO). Which one would you buy?

NB: for us in quantum chemistry, where jobs run for weeks or even months, a) and b) aren't much of a concern. c) would be interesting, though. d) would be hard to phrase in exact times.

-- Reuti


Any ideas?

Cheers,

Fritz


On 16.02.11 22:38, Chris Dagdigian wrote:

What exactly are you trying to benchmark? Job types and workflows are far too variable to produce a usable generic reference.

The real benchmark is "does it do what I need?" and there are many
people on this list who can help you zero in on answering that question.

SGE is used on anything from single-node servers to the 60,000+ CPU
cores on the RANGER cluster over at TACC.

The devil is in the details of what you are trying to do of course!

-Chris



Eric Kaufmann wrote:
I am fairly new to SGE. I am interested in getting some benchmark
information from SGE.

Are there any tools for this etc?

Thanks,

Eric


_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users





--
Fritz Ferstl   --   CTO and Business Development, EMEA
Univa   --   The Data Center Optimization Company
E-Mail: [email protected]
Web: http://www.univa.com
Phone: +49.9471.200.195
Mobile: +49.170.819.7390



