On 26.09.2012 at 21:16, Brendan Moloney wrote:

> Virtual memory includes things like shared libraries (even though these are
> only loaded into memory once for all processes that use them).

Only if they were compiled with -fPIC (which is usually the case); otherwise they will be loaded separately for each application, and the relocations of their addresses will be computed before execution.

-- Reuti

> -Brendan
> ________________________________________
> From: [email protected] [[email protected]] On Behalf
> Of Jérémie Dubois-Lacoste [[email protected]]
> Sent: Wednesday, September 26, 2012 3:10 AM
> To: [email protected]
> Subject: Re: [gridengine users] Memory values reported by SGE too high
>
> Oh! Thanks, my mistake.
> So it seems SGE is correct with the memory measurement: it reports
> the same values as what we see if we launch things directly on the
> nodes. However, these values are still surprisingly high.
> We'll investigate further whether something is wrong with our kernel.
>
> Thanks,
>
> Jérémie
>
>
> 2012/9/25 Reuti <[email protected]>:
>> On 25.09.2012 at 14:26, Jérémie Dubois-Lacoste wrote:
>>
>>> Hi All,
>>>
>>> We recently reinstalled our cluster and we have some serious issues.
>>> Contrary to our previous installation, we now installed a fully 64-bit
>>> system. We use Rocks cluster 6/CentOS 6.3,
>>> and SGE 6.2u5.
>>>
>>> The memory values reported by SGE are very high compared
>>> to the actual need of every job, and many get killed because
>>> they exceed the limit, while they should not.
>>> I found this thread about too low memory reports:
>>> http://comments.gmane.org/gmane.comp.clustering.gridengine.users/19303
>>>
>>> But I didn't find anything about too high memory reports...
>>>
>>>
>>> Here is a simple test to make it clear:
>>>
>>> I submit a trivial Python script "minimal.py", which is just:
>>> -----
>>> import time
>>>
>>> time.sleep(30)
>>> print("done")
>>> -----
>>>
>>> * I tried to run it directly to check the memory consumption with:
>>> $ /usr/bin/time -v python minimal.py
>>> And I get: Maximum resident set size (kbytes): 15376
>>>
>>>
>>> * Then, when submitting the job with:
>>> qsub -m ase -M <my_mail> -b y -N memTest -o test.out -e test.err -cwd
>>> "python minimal.py"
>>> I check on the compute node where it gets scheduled and run "top":
>>>   PID USER    PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+  COMMAND
>>> 20240 myName  23   3  114m 3844 1832 S  0.0  0.0 0:00.14 python minimal.py
>>
>> The virtual size is listed here as 114m as well.
>>
>> -- Reuti
>>
>>
>>> So I understand it uses 3.8 MB of RAM.
>>>
>>>
>>> * But from the e-mail I get when the job terminates:
>>> Job 1879536 (memTest) Complete
>>> User = myName
>>> Queue = [email protected]
>>> Host = compute-3-0.local
>>> Start Time = 09/25/2012 13:46:45
>>> End Time = 09/25/2012 13:47:15
>>> User Time = 00:00:00
>>> System Time = 00:00:00
>>> Wallclock Time = 00:00:30
>>> CPU = 00:00:00
>>> Max vmem = 114.441M
>>> Exit Status = 0
>>>
>>>
>>> It says 114 MB; I don't understand this huge difference.
>>>
>>>
>>> The consequence is that most of the jobs get killed for "falsely" (I presume)
>>> exceeding the hard memory limit. Any clue is welcome!
>>>
>>>
>>> Sincerely,
>>>
>>> Jérémie
>>>
>>> _______________________________________________
>>> users mailing list
>>> [email protected]
>>> https://gridengine.org/mailman/listinfo/users
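
The gap discussed in this thread (114m VIRT versus 3844 kB RES in "top") can be reproduced from inside a process. The following is only a minimal sketch, assuming a Linux node where /proc/self/status is available; the helper names parse_vm_fields and read_own_memory are made up for illustration and are not part of SGE or any library. It prints the virtual size (VmSize, which is what "Max vmem" appears to track according to the replies above) next to the resident set size (VmRSS).

```python
import os
import re


def parse_vm_fields(status_text):
    """Collect the 'Vm*' lines of /proc/<pid>/status as {name: kilobytes}."""
    fields = {}
    for line in status_text.splitlines():
        # Lines look like "VmSize:\t  117188 kB"
        match = re.match(r"(Vm\w+):\s+(\d+)\s+kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def read_own_memory(path="/proc/self/status"):
    """Read the current process's memory fields (Linux only)."""
    with open(path) as handle:
        return parse_vm_fields(handle.read())


if __name__ == "__main__":
    if os.path.exists("/proc/self/status"):
        usage = read_own_memory()
        # VmSize counts all mapped address space, including shared libraries;
        # VmRSS counts only the pages currently held in RAM.
        print("virtual  (VmSize): %d kB" % usage["VmSize"])
        print("resident (VmRSS):  %d kB" % usage["VmRSS"])
    else:
        print("/proc/self/status not found; this sketch needs Linux")
```

Adding these few lines to the end of minimal.py and submitting it through qsub would show both figures in test.out, which makes it easier to see which of the two a memory limit is being compared against.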
