Hi Nicolas - I agree - there's nothing wrong at all with your app consuming 60GB of RAM. It wasn't clear to me from the original post that the memory consumer was known and understood. Now I know that it is. Cool.
Given all that, and Steve's information on segmap, I have to agree that the
memory you save by tweaking segmap down buys you the headroom you need to run
your workload without hitting a memory deficit and starting to page.
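For reference, the tweak itself is a one-line /etc/system entry plus a reboot.
The checks below are only a rough sketch for verifying and watching it
afterwards -- they assume segmap_percent is readable as a kernel symbol and
that the segmap kstat is present on your release, so treat them as a starting
point rather than a recipe:

    # /etc/system -- cap segmap at ~1% of physical memory (takes effect at boot)
    set segmap_percent=1

    # after the reboot, confirm the value the kernel actually picked up
    # (assumes the segmap_percent symbol is visible to mdb on this release)
    echo "segmap_percent/D" | mdb -k

    # watch segmap activity counters every 5 seconds; little movement while
    # the application is under load supports the case that a large segmap
    # wasn't buying you much anyway (assumes the unix:0:segmap kstat exists)
    kstat -n segmap 5

If those counters barely move during a testrun, that's one more data point
that the 12% default was headroom you couldn't afford rather than cache you
were actually using.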
Thanks - I think we're all on the same page now! :^)
/jim

Nicolas Michael wrote:
> Hi Jim,
>
> thanks for your quick reply. My comments inline.
>
> Jim Mauro wrote:
>
>>> - SPARC, 64 GB memory
>>> - UFS, PxFS file systems
>>>
>>> Our application is writing some logs to disk (4 GB / hour), flushing
>>> some mmapped files from time to time (4 GB each 15 min), but is not
>>> doing much disk I/O.
>>> Once our application is started and "warm", it doesn't allocate any
>>> further memory. At this point, we have 3-4 GB of free memory
>>> (vmstat) and nothing paged out to disk (swap -l).
>> Well, something is certainly consuming memory, because you indicate
>> this is a 64GB system, and you show 3-4GB free. Who/what is consuming
>> 60GB of RAM?
>
> Our application! ;-)
> I don't want to go into the details here, but there's nothing wrong
> about that. We know where all this memory is coming from (there are
> some processes with large heaps, some large shm segments and so on).
>
> Steve has some slides on our application, in case you're really
> interested...
>
>>> Since those memory requests are not coming from our application, I
>>> assume that those 5 GB (3 GB less free memory plus 2 GB paged-out
>>> data) are used for the file system cache. I always thought the fs
>>> cache would never grow any more once memory gets short, so it should
>>> never cause paging activity (since the cache list is part of the
>>> free list). Reading Solaris Internals, I just learned that there's
>>> not only a cache list, but also a segmap cache. As I understand
>>> this, the segmap cache may very well grow up to 12% of main memory
>>> and may even cause application pages to be paged out, correct? So,
>>> this might be what's happening here. Can I somehow monitor the
>>> segmap cache (since it is kernel memory, is it reported as "Kernel"
>>> in ::memstat?)?
>>>
>> Think of UFS as having an L1 and L2 cache (like processors). segmap
>> is the L1 cache; when segmap fills up, pages get pushed out to the
>> cache list (the "L2" cache), where they can be reclaimed back into
>> segmap if they are referenced again via read/write before they are
>> recycled.
>
> Ok, thanks.
>
>> The 12% of memory being consumed by segmap is not what's hurting you
>> here (at least, I would be very surprised if it is).
>
> We easily consume ~ 60 GB of memory just with our application
> (including kernel, libs etc.). That doesn't allow us to spend 12% of
> total memory on the segmap cache in addition. If we really used all
> the segmap cache that is possible (7.68 GB), we would exceed our
> physical memory -- and I think this is what's happening.
>
> We can't reduce our application's demand for memory (in fact, we
> already reduced it by something like 20 GB to fit into 64 GB of
> memory), so we need to reduce the max segmap cache size. Otherwise we
> would need to install more memory in the system (which we don't want).
>
>>> My idea is now to set segmap_percent=1 to decrease the max size of
>>> the segmap cache and this way avoid having pages paged out due to a
>>> growing fs cache. In a testrun with this configuration, my free
>>> memory doesn't fall below 3.5 GB any more and nothing is being paged
>>> out -- saving me 4.5 GB of memory!
>>>
>> Does this machine really have 64GB of RAM (as indicated above)?
>
> Yep!
>
>>> Since we don't do much disk I/O, I would assume that we don't gain
>>> much from the segmap cache anyway, so I would like to configure it
>>> to 1%. File system pages will still be cached in the cache list as
>>> long as memory is available, right? With the advantage that the
>>> cache list is basically "free" memory and would never cause other
>>> pages to be paged out.
>> Generally, yes.
>
> Ok.
>
>>> I'm not sure, but as I understand it the segmap cache is still used
>>> during read and write operations, right? So, every time we write a
>>> file, we always write into the segmap cache. If this cache is small
>>> (let's say: 1% = 640 MB), we might be slowed down when writing more
>>> than 640 MB all at once. However, if we only write 64 MB every
>>> minute, pages from the segmap cache would migrate to the cache list
>>> and make room for more pages in the segmap cache, so next time we
>>> write 64 MB, would there again be enough space in the segmap cache
>>> for the write operation?
>>>
>> Generally, yes, assuming the writes are not to files with
>> O_SYNC/O_DSYNC, in which case every write must go through
>> the cache anyway.
>
> Thanks.
>
>>> Also, just to be sure: memory mapped files are never read or written
>>> through the segmap cache, so shrinking that cache has no effect on
>>> memory mapped files, right?
>>>
>> That is correct. mmap()'d files are not cached in segmap.
>
> Ok, that's good to know.
>
>> Something is missing here, or the 64GB value is wrong.
>>
>> You need to figure out who/what is consuming 60GB of RAM.
>> Use 'echo "::memstat" | mdb -k' for a high-order profile.
>
> As I said above, there's really nothing wrong with our application
> consuming 60 GB... ;-)
>
> But here it is:
>
> Page Summary                Pages                MB  %Tot
> ------------     ----------------  ----------------  ----
> Kernel                     291570              2277    4%
> Anon                      6892465             53847   83%
> Exec and libs               45966               359    1%
> Page cache                 751264              5869    9%
> Free (cachelist)           137777              1076    2%
> Free (freelist)            179173              1399    2%
>
> Total                     8298215             64829
> Physical                  8166070             63797
>
> This snapshot was taken before I reconfigured the system, so it is
> with segmap_percent=12. It was taken 2 hours after a long testrun. As
> I wrote above, free memory jumped from 1 GB to 2.5 GB one hour after
> we stopped the load. The only explanation I have for this is pages
> being freed from the segmap cache.
>
> Steve wrote that the segmap cache is part of "Page cache". Assuming
> there was 1.5 GB more data in the segmap cache during the testrun,
> that would make 7.4 GB of Page cache. 4 GB of it is memory mapped
> files, which leaves 3.4 GB for the segmap cache. That's only about
> half of its possible max size, but still too much for our system.
>
> I believe we don't need that much for segmap. All we are doing on the
> file system (except for the mmapped files) is write a large logfile
> sequentially, close it, copy it to a different location, and later on
> ftp it somewhere. This shouldn't require much segmap cache...
>
> Thanks a lot,
> Nick.
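One more suggestion for the next testrun: rather than inferring the segmap
contribution from deltas in free memory, you can watch the per-type paging
columns directly while the load runs. The commands below are standard, but
the 5-second interval and the interpretation are just a suggested starting
point:

    # paging broken down by page type, every 5 seconds:
    #   api/apo/apf = anonymous (heap, shm) page-ins/outs/frees
    #   fpi/fpo/fpf = file system page-ins/outs/frees
    vmstat -p 5

    # swap device usage before and after the run; if the 'free' column
    # shrinks relative to 'blocks', anonymous pages really were pushed out
    swap -l

    # and the same high-level breakdown as the ::memstat output quoted above
    echo "::memstat" | mdb -k

If apo stays at or near zero with segmap_percent=1 while fpo carries whatever
file traffic there is, that's a good confirmation that the paging you saw
before came from file cache pressure rather than from the application itself.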