The things you are discussing below, Reuti, are all valid and will have varying degrees of pertinence for different people. That also means they will not be easy to integrate into a single benchmark. One could build a benchmark for each of them, of course.

I can only say what I would probably do if I was tasked to build a benchmark in this context:

- I'd take an application profile, probably something well understood like SPEC. I might only pick a few of its apps for simplicity, though. (This part would need precise definition, really, but I don't want to get into this here.)

- I'd then say: "Use whatever you want, but run as many runs through that application portfolio as you can within N hours, and make sure you do the following:
  * Within a percentage X (say 5%), each application in the portfolio needs to consume the same amount of wallclock time * core count (so you can't just run only the app you prefer).
  * Count the number of jobs you've been able to run (individual runs of each app from the portfolio).
  * Report that number - that is your benchmark result."

(Again, you'd need to define that in much more detail, because cheating on benchmarks is as old as benchmarking itself, but you get my idea, I hope.)
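
For illustration, a rough sketch (in Python) of what such a throughput count could look like on an SGE-style cluster. The portfolio scripts, the polling interval and the job-id parsing are placeholders/assumptions, not a finished harness - a real one would keep the cluster saturated and enforce the per-application wallclock*cores balance:

#!/usr/bin/env python
# Minimal sketch of the "count completed portfolio runs in N hours" idea.
# Assumptions: qsub/qstat are on PATH and each portfolio entry is a
# self-contained job script; the script names below are placeholders.
import itertools
import subprocess
import time

PORTFOLIO = ["run_app_a.sh", "run_app_b.sh", "run_app_c.sh"]  # placeholders
WINDOW_HOURS = 4                                              # the "N hours"

def submit(script):
    # qsub normally answers with "Your job <id> (...) has been submitted"
    out = subprocess.check_output(["qsub", script]).decode()
    return out.split()[2]

def jobs_in_system():
    # Non-empty qstat output means jobs are still queued or running
    # (the exact default scope of qstat may depend on the configuration).
    return bool(subprocess.check_output(["qstat"]).decode().strip())

deadline = time.time() + WINDOW_HOURS * 3600
completed = 0
apps = itertools.cycle(PORTFOLIO)   # round-robin keeps per-app counts balanced

while time.time() < deadline:
    job_id = submit(next(apps))
    # A real benchmark would keep many jobs in flight and check the
    # "within X% of wallclock * cores per application" rule; this sketch
    # only illustrates the counting.
    while jobs_in_system() and time.time() < deadline:
        time.sleep(30)
    if not jobs_in_system():
        completed += 1
        print("job %s done, %d runs so far" % (job_id, completed))

print("Benchmark result: %d portfolio runs in %d hours" % (completed, WINDOW_HOURS))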

Granted, this does not benchmark a scheduler per se. So it's kind of off-topic here, but it would sort of benchmark the efficiency of a data center. You'd basically be able to say "I'm getting 20% more out of my DC at the same cost as the other guy", or "to get to the same result I'd need to spend $Y at Amazon". Stuff like that.

The DRM would only be one component in the picture, but that's fine. So is everything else, and that is the reality.

Cheers,

Fritz


On 17.02.11 12:25, Reuti wrote:
On 17.02.2011 at 09:07, Fritz Ferstl wrote:

My 2 cents here (and I'm aware they will not help Eric ... and apologies in 
advance for the rant, it's a long-term heartfelt topic  ...):

A DRM benchmark would be nice to have. Benchmarking in general is an almost futile exercise: you have to be very prescriptive about the boundary conditions to achieve comparable results, and such narrow boundary conditions can almost never reflect reality. So all benchmarks are open to interpretation and debate.

But, benchmarks are still a useful means to provide at least some orientation. As Chris 
has stated, the variability in the use case scenarios of workload managers is certainly 
even bigger than in classical performance benchmarks such as SPEC or Linpack. You also 
have to be careful what you are measuring: the underlying HW, network & storage
performance? Or the efficiency of the SW? Or the ability to tune the workload 
management system - in itself and in combination with HW & SW underneath? Or the
suitability of the workload management system for a specific application case?

So I guess that probably a suite of benchmarks would be needed, maybe akin to SPEC, to 
provide at least a roughly representative picture. And you'd have to either standardize 
on the HW, e.g. take 100 Amazon dedicated servers and run with that, or you'd have to do 
it like for Linpack and say: "I don't care what you use and how much of it but 
report the resulting throughput vs time numbers on these use cases." I.e. how fast 
can you possibly get. In other words something like the Top500 for workload management 
environments.

For many companies and institutions the workload manager has become the central workhorse - the conveyor belt of a data center. If it stops, everything stops. If you can make it run quicker, you get your results sooner. And if it enables you to, you can be much more flexible in responding to changing demands. So it's almost ironic that large computing centers benchmark individual server performance, run something like Linpack to advertise their peak performance, and create their own site-specific application benchmark suites for selecting new HW, but often do not benchmark with the workload management system in the picture - the component which later, in combination with tuning and the rest of the environment, will define the efficiency of the data center.

So a benchmark for DRMs would be a highly useful tool. I've always wondered how 
to get an initiative started which would lead to such a benchmark ...

The question is: what to measure in a queuing system? This could be:

a) - The load on the qmaster machine from handling the submitted jobs (disk, memory, and CPU time used).
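
To make a) a bit more tangible, a small sketch that samples the master daemon's memory and CPU use while a burst of trivial jobs is submitted. It assumes a Linux-style ps, a master process named sge_qmaster, and that binary jobs can be submitted with "qsub -b y" - adjust to taste:

#!/usr/bin/env python
# Rough sketch for a): watch sge_qmaster resource usage during a submission burst.
# Assumes a Linux procps "ps" and a master process named "sge_qmaster".
import subprocess
import time

def qmaster_stats():
    # RSS in kB and %CPU of the master daemon; (0, 0.0) if it isn't found.
    try:
        out = subprocess.check_output(
            ["ps", "-C", "sge_qmaster", "-o", "rss=,pcpu="]).decode().split()
        return int(out[0]), float(out[1])
    except (subprocess.CalledProcessError, IndexError):
        return 0, 0.0

N_JOBS = 1000
samples = []
for i in range(N_JOBS):
    # trivial dummy job; "-b y" submits the command directly as a binary job
    subprocess.check_output(["qsub", "-b", "y", "sleep", "60"])
    if i % 100 == 0:
        samples.append(qmaster_stats())
        time.sleep(1)

for rss_kb, cpu in samples:
    print("qmaster RSS %8d kB, CPU %5.1f%%" % (rss_kb, cpu))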

b) - Time wasted switching from one job to the next on an exec host (maybe by already sending the next job there in a "ready" state beforehand and releasing it as soon as the former job finishes). This would be interesting for workflows where many short jobs are submitted.
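
As a sketch of how b) could be measured without touching scheduler internals: let each short job log its own start and end time to a shared file, then look at the gap between one job's end and the next job's start. The log path, the host pinning and the assumption that the jobs run strictly one after another are all placeholders/assumptions:

#!/usr/bin/env python
# Sketch for b): estimate the dead time between consecutive short jobs.
# Assumes the jobs serialize on a single slot so their log lines don't interleave.
import subprocess
import tempfile

LOG = "/shared/turnaround.log"   # placeholder: must be visible on the exec host
N_JOBS = 50

script = """#!/bin/sh
echo "start $(date +%s)" >> {log}
sleep 5                          # deliberately short payload
echo "end $(date +%s)" >> {log}
""".format(log=LOG)

job = tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False)
job.write(script)
job.close()

for _ in range(N_JOBS):
    # pin all jobs to one host so they compete for the same slot;
    # "hostname=node001" is a placeholder resource request
    subprocess.check_output(["qsub", "-l", "hostname=node001", job.name])

def gaps(path):
    # log lines alternate "start <t>" / "end <t>" per job
    times = [int(line.split()[1]) for line in open(path)]
    return [times[i + 1] - times[i] for i in range(1, len(times) - 1, 2)]

# once all jobs have drained:
#   print(gaps(LOG))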

c) - Time wasted by resource reservation. It came up a couple of times on the former list to have some kind of real-time feature in SGE, so that you know exactly when a job starts and will end (assuming you know the necessary execution times beforehand). In this context, solving a cutting-stock-like problem would also fit (to minimize the overall wallclock time): given a bunch of jobs, is the queuing system capable of reordering them in such a way that the time wasted on resource reservation is at its minimum (possibly zero) and the complete bunch of jobs finishes in the shortest wallclock time possible?
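
To make the reordering question in c) concrete, here is a tiny baseline sketch: the classic "longest processing time first" greedy for packing jobs with known runtimes onto identical hosts so the whole bunch finishes early. It is not what any particular scheduler does - just the kind of reference number a benchmark could compare a scheduler's actual makespan against:

#!/usr/bin/env python
# Sketch for c): LPT (longest processing time first) greedy as a baseline
# for the minimum-makespan question, assuming known runtimes and identical
# single-slot hosts. Real schedulers and real resource requests are messier.
import heapq

def lpt_makespan(runtimes, n_hosts):
    hosts = [0.0] * n_hosts            # min-heap of "host becomes free at"
    heapq.heapify(hosts)
    for r in sorted(runtimes, reverse=True):   # longest jobs first
        free_at = heapq.heappop(hosts)         # earliest-available host
        heapq.heappush(hosts, free_at + r)
    return max(hosts)

jobs = [7, 7, 6, 6, 5, 4, 4, 3, 2, 2]          # made-up runtimes in hours
print("ideal lower bound (sum/hosts):", sum(jobs) / 4.0)        # 11.5 h
print("LPT makespan on 4 hosts:      ", lpt_makespan(jobs, 4))  # 12.0 h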

d) - Can I tell the scheduler that I have varying resource requirements over the runtime of the job, to lower the amount of wasted resources? Can the queuing system move my job around in the cluster depending on the resources needed in certain steps of the job? All to minimize the unused resources.
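
A quick worked example of the waste d) is about, with made-up phase numbers: if a job needs 8 cores only for short pre- and post-processing phases but a single core for a long middle phase, a flat 8-core reservation idles 7 cores for the whole middle phase:

# Made-up phases: (cores actually needed, hours). A flat reservation has to
# hold the peak core count for the entire runtime.
phases = [(8, 1), (1, 10), (8, 1)]
peak = max(cores for cores, _ in phases)
used = sum(cores * hours for cores, hours in phases)   # 8 + 10 + 8 = 26 core-hours
reserved = peak * sum(hours for _, hours in phases)    # 8 * 12    = 96 core-hours
print("core-hours wasted by a flat reservation:", reserved - used)   # 70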

e) - Can c) be combined with job dependencies in a directed graph (a job that fails shouldn't trigger the start of the next job waiting on a job-hold) and with decisions about which of two alternative jobs should be executed at all? (Sadly, the project http://wildfire.bii.a-star.edu.sg/ has stopped.)
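
For e), a minimal sketch of how a two-step dependency could be wired up today. The script names and the status file are placeholders, and it assumes qsub's -hold_jid option acts purely as an ordering constraint, so the "don't run after a failed predecessor" check is pushed into the dependent script itself:

#!/usr/bin/env python
# Sketch for e): a two-step chain using qsub -hold_jid.
# step_a.sh is assumed to record its exit status somewhere shared;
# step_b.sh is assumed to check that status and bail out on failure.
import subprocess

def qsub(*args):
    # qsub normally answers with "Your job <id> (...) has been submitted"
    out = subprocess.check_output(["qsub"] + list(args)).decode()
    return out.split()[2]

job_a = qsub("step_a.sh")
job_b = qsub("-hold_jid", job_a, "step_b.sh")   # held until job_a has finished
print("submitted %s -> %s" % (job_a, job_b))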

Somehow I have the impression that no real benchmark is possible in the sense of "always faster" and, next year, "faster again". That would end up with the criticism you state about a plain Linpack test.

It's more like a conformance test according to a certain set of rules. Imagine you have queuing system A, which can reorder jobs and minimize wasted resources (but puts a high load on the qmaster machine), and queuing system B, which puts nearly no load on the qmaster machine (but handles jobs with just a FIFO). Which one would you buy?

NB: for us in quantum chemistry, where jobs run for weeks or even months, a) and b) aren't much of a concern. c) would be interesting, though. d) would be hard to phrase in exact times.

-- Reuti


Any ideas?

Cheers,

Fritz


On 16.02.11 22:38, Chris Dagdigian wrote:

What exactly are you trying to benchmark? Job types and workflows are far too variable to produce a usable generic reference.

The real benchmark is "does it do what I need?" and there are many
people on this list who can help you zero in on answering that question.

SGE is used on anything from single-node servers to the 60,000+ CPU
cores on the RANGER cluster over at TACC.

The devil is in the details of what you are trying to do of course!

-Chris



Eric Kaufmann wrote:
I am fairly new to SGE. I am interested in getting some benchmark
information from SGE.

Are there any tools for this etc?

Thanks,

Eric


_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users





--
Fritz Ferstl   --   CTO and Business Development, EMEA
Univa   --   The Data Center Optimization Company
E-Mail: [email protected]
Web: http://www.univa.com
Phone: +49.9471.200.195
Mobile: +49.170.819.7390



