Hi,

On Apr 28, 2010, at 18:58, Stephan Wiesand <[email protected]> wrote:

> On Apr 27, 2010, at 00:15, Brett Viren wrote:
>> We recently started running our C++ analysis code on 64-bit SL5.3 and
>> have been surprised to find that the memory usage is about 2x what we
>> are used to when running it on 32 bits. Comparing a few basic
>> applications like sleep(1) shows similar memory usage. Others, like
>> sshd, show only a 30% size increase (maybe that is due to
>> configuration differences between the two hosts).
>>
>> I understand that pointers must double in size, but the bulk of our
>> objects are made of ints and floats, and these are 32/64-bit
>> invariant. I found[1] that poorly laid-out structs containing pointers
>> can bloat even on non-pointer data members, due to the padding needed
>> to keep everything properly aligned. It would somewhat surprise me if
>> this is what is behind what we see.
>>
>> Does anyone have experience in understanding, or maybe even combating,
>> this increase in a program's memory footprint when going to 64 bits?
>
> Is it real or virtual memory usage that's increasing beyond expectations?
>
> Example: glibc's locale handling code behaves quite differently in the
> 64-bit case. In 32-bit mode, even virtual address space is a scarce
> resource, while in 64-bit mode it isn't. So in the latter case, glibc
> simply mmaps the whole file providing the info for the locale in use,
> while in the former it uses a small address window that it slides to
> the appropriate position. The 64-bit case is simpler and thus probably
> less code, more robust, and easier to maintain. And it's probably
> faster. The 32-bit case uses less *virtual* memory, but *real* memory
> usage is about the same, since only those pages actually read will ever
> be paged in. This has a dramatic effect on the VSZ of "hello world in
> Python". It does not on anything that really matters; in particular,
> checking the memory footprints of sleep & co. is not very useful,
> because they're really small compared to typical HEP analysis apps
> anyway.
You can work around the locale thing for any batch application (for which
that usually should not matter) by setting the LANG envvar to "C". For a
single process this will only be about 50MB, though.

The big difference most of us saw was due to the linker forcing shared
library text/data segments to align to 2MB boundaries, while we have very
many very small (<<2MB) libraries. You should see this explicitly if you
do a 'pmap' of your process once it is running and has loaded all its
libraries: you'll see memory sections with no permissions next to those
corresponding to libraries. Assuming you aren't using huge memory pages in
your application, there is a linker option (I don't recall its name off
the top of my head) in the SL5 binutils ld which allows you to reduce
this.

What both of these things say, though, is that VSIZE at 64 bits is not a
very good measure of how much memory an app really needs. Taking out fake
accounting like the two effects above, our estimate is that our (CMS)
applications typically need only 20-25% more memory at 64 bits relative to
32 bits (from the small code size increase, the larger pointer types, and
some additional overhead/alignment for live objects in the heap).

We are actually preparing some proposals/recommendations about measuring
memory use, since in addition to this VSIZE/64-bit confusion, the
introduction of "multicore" applications which share memory also misleads
people...

Pete
