Hi everyone,
Currently I am trying to upgrade an old SGE cluster from a 6.2 release to SGE 8.1.3. The upgrade itself is no problem; however, the sge_execd of 8.1.3 consumes a lot of CPU on the exec host while a job is running, sometimes up to 30% of a CPU. For a 40-minute test job (Abaqus/Standard FEA) the execd consumed around 10 minutes of CPU time. After the job finished, the CPU consumption of the sge_execd dropped as well. This was on a RHEL 6.4 system, but was observed on a RHEL 5.9 system as well.

After some digging with strace, which showed sge_execd opening /proc/<pid>/... files every second, the root of the problem seems to be the function linux_read_status() in daemons/common/procfs.c, which tries to gather process statistics for each SGE task. This seems to be done every second rather than at load_report_time intervals. As the smaps file for the Abaqus process gets pretty big, sge_execd also spends significant time parsing it. From my small test job:

    # wc -c /proc/11479/smaps
    3108194 /proc/11479/smaps

From what I can see in the source, the only things really parsed from /proc/<pid>/smaps are the process resident size and swap usage, used to calculate the total size of the process (which is initialized from /proc/<pid>/stat). I think this could be retrieved far more easily from /proc/<pid>/status, where I think the VmSwap line was added in kernel 2.6.31. The funny thing is that procfs.c already does this, but only as a fallback if there is no "Swap:" data in /proc/<pid>/smaps, not as the default.

My personal point of view is that it's quite unnecessary to burn all these CPU cycles every second just to have overly accurate data on process size (RSS + swap). If the system doesn't give it to you directly, as in /proc/<pid>/status, maybe it's not worth obsessing over.
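
To illustrate the idea, here is a rough sketch in Python (not the actual procfs.c code; parse_status and rss_plus_swap_kb are made-up names) of how RSS and swap could be taken from a single small read of /proc/<pid>/status. It assumes the VmRSS and VmSwap lines have the usual "<name>: <number> kB" form and treats a missing line as 0.

```python
import re

# Lines of interest look like "VmRSS:\t    1234 kB" (assumed format).
_FIELD = re.compile(r"^(VmRSS|VmSwap):\s*(\d+)\s*kB\s*$", re.MULTILINE)


def parse_status(text):
    """Extract VmRSS and VmSwap (in kB) from the text of /proc/<pid>/status.

    A field that is absent (for example on a kernel without VmSwap)
    is reported as 0.
    """
    sizes = {"VmRSS": 0, "VmSwap": 0}
    for name, value in _FIELD.findall(text):
        sizes[name] = int(value)
    return sizes


def rss_plus_swap_kb(pid):
    """Read /proc/<pid>/status once and return RSS + swap in kB."""
    with open("/proc/%d/status" % pid) as f:
        return sum(parse_status(f.read()).values())
```

For a live process this would be called as rss_plus_swap_kb(11479). The status file is typically around a kilobyte, compared with roughly 3 MB for the smaps file above.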
So, wouldn't it be better to change the default to parse /proc/<pid>/status, and enable the detailed parsing of the complete memory map (which is basically what /proc/<pid>/smaps is) only via a configurable option that is off by default?

Regards,
Thomas Mainka
