Hi,

On 06.07.2011, at 21:24, Peskin, Eric wrote:
> We have a cluster managed by SGE. Specifically, this is the version from the
> rpm sge-V62u4-1 installed on a Linux cluster running Rocks.
>
> We want to start charging users for use of the cluster. Our plan is to
> charge by the core hour. In other words, using one CPU core for 12 hours
> should cost the same as using 12 CPU cores for 1 hour.
>
> I think I should be able to gather this information using qacct. For
> example, if I understand correctly,
>
> qacct -o -d 30
>
> should output usage for each user over the last 30 days.

Not directly: it will look into the accounting file for jobs which started in the last 30 days and have already finished. For jobs still running it can't generate any output, and jobs which started before "now - 30 days" aren't taken into account either. To start with, you can look at the individual entry for a job with "-j <job_id>".

> However, I am still unclear (after having read the man page and some web
> sites) on the interpretation of the various columns of output from qacct
> (especially WALLCLOCK, UTIME, STIME, and CPU).

While UTIME and STIME are values computed by the kernel, CPU and MEM (IO too) are computed by SGE's shepherd. If no process of a serial or OpenMP job jumps out of the process tree, the values computed by the kernel should be almost the same as the ones computed by SGE. As SGE is using an additional group ID to keep track of a job's processes, the kernel can only generate the correct values in case of a normal end of a job; if you used `qdel`, these values (the kernel-reported ones) are often wrong.

> Sometimes the value in the WALLCLOCK column is greater than that in the CPU
> column and sometimes it is the other way around. Also, sometimes the CPU
> column is the sum of UTIME + STIME, but sometimes it is quite different.
>
> I want to make sure we get this right in the face of parallelism. SGE has
> multiple ways to run jobs in parallel (e.g., qmake, array jobs, and parallel
> environments).
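A few invocations along the lines discussed above (the job id 1234 and the task id 7 are placeholders; check qacct(1) of your release for the exact options):

```shell
# Per-owner usage summary for jobs that finished within the last 30 days
# (jobs still running, or started before the window, are not covered):
qacct -o -d 30

# Full accounting record of one job (1234 is a placeholder job id):
qacct -j 1234

# A single task of an array job (placeholder task id 7):
qacct -j 1234 -t 7
```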
If you run a parallel job, it's important to have a proper tight integration of all slave processes into SGE, i.e. slave tasks are started by `qrsh -inherit ...`, and not by `ssh` or `rsh` from one node to another. One way to spot a wrong setup is to disable `ssh` and `rsh` inside the cluster (I usually limit them to admin staff) and force users to use `qrsh -l hostname=node22` or alike in case they want to check the state of a job on the nodes (here you could also limit h_cpu to 60 to avoid abuse of this granted feature).

If you have this, you will get one accounting entry per job plus one for each `qrsh -inherit ...`, and the summarized output (like the one you used above) will add these up accordingly, as long as you set "accounting_summary false" in the PE definition. If you set "accounting_summary true" instead, there will be only one entry per job, and the values for CPU, MEM and IO are summarized over all slave tasks (unfortunately not the kernel-reported values, but it's an RFE to add them too).

Array jobs are not so special: you will have one entry per task (for serial array jobs), and you can use "-t <task_id>" to get a single entry out of the accounting.

> What is the most reliable way to track core-hours, such that occupying 100
> cores for a day costs 100 times as much as occupying just 1 core for a day?

Occupying or using?

-- Reuti

> Any advice would be greatly appreciated.
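For the "occupying" interpretation, a minimal sketch of turning the raw accounting file into per-owner core-hours (slots * wallclock). The field positions are assumed from accounting(5) of the 6.2 series — owner is field 4, ru_wallclock field 14, slots field 35 — verify them against your own man page before billing anyone:

```shell
# Sketch only: sum occupied core-hours (slots * ru_wallclock / 3600) per
# owner from a raw SGE accounting file passed as $1. Assumed field order
# per accounting(5) of 6.2: owner=$4, ru_wallclock=$14, slots=$35.
core_hours() {
    awk -F: '!/^#/ { h[$4] += $35 * $14 / 3600 }
             END { for (u in h) printf "%s %.2f\n", u, h[u] }' "$1"
}

# Typical call against the default spool location:
# core_hours "$SGE_ROOT/default/common/accounting"
```

Note that with "accounting_summary false" a tightly integrated parallel job produces one record per `qrsh -inherit` task in addition to the master record, so a naive sum like this can double-count; filtering those records out, or using "accounting_summary true", avoids that.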
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
