Hi Eugene
I too am interested - I think we need to do something about the sm
backing file situation, as machines with larger core counts are slated
to become more prevalent shortly.
I appreciate your info on the sizes and controls. One other question:
what happens when there isn't enough memory to support all this? Are
we smart enough to detect this situation? Does the sm subsystem
quietly shut down? Warn and shut down? Segfault?
I have two examples so far:
1. Using a ramdisk, /tmp was set to 10MB. OMPI was run on a single
node, 2ppn, with btl=openib,sm,self. The program started, but
segfaulted on the first MPI_Send. No warnings were printed.
2. Again with a ramdisk, /tmp was reportedly set to 16MB (unverified -
there is some uncertainty; it could have been much larger). OMPI was
run on multiple nodes, 16ppn, with btl=openib,sm,self. The program ran
to completion without errors or warnings. I don't know the
communication pattern - it could be that no local communication was
performed, though that seems doubtful.
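For reference, here is a rough sketch of the kind of run and ramdisk
check involved (./a.out is just a stand-in for the test program, and
the openmpi-sessions-* path is from memory - your install may lay the
session directory out differently):

  # how much space does the tmpfs backing /tmp actually have?
  df -h /tmp

  # 2 processes on one node, forcing the openib/sm/self BTLs
  mpirun -np 2 --mca btl openib,sm,self ./a.out

  # while the job runs, watch what lands in the session directory
  ls -lh /tmp/openmpi-sessions-*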
If no one knows, I'll have to dig into the code and figure out the
behavior myself - just hoping someone can spare me the pain.
Thanks
Ralph
On Nov 13, 2008, at 3:21 PM, Eugene Loh wrote:
Ralph Castain wrote:
As has frequently been commented upon, the shared memory backing file
can be quite huge. There used to be a param for controlling this size,
but I can't find it in 1.3 - or at least, the name or method for
controlling file size has morphed into something I don't recognize.
Can someone more familiar with that subsystem point me to one or
more params that will allow us to control the size of that file?
It is swamping our systems and causing OMPI to segfault.
Sounds like you've already gotten your answers, but I'll add my
$0.02 anyhow.
The file size is the number of local processes (call it n) times
mpool_sm_per_peer_size (default 32M), but with a minimum of
mpool_sm_min_size (default 128M) and a maximum of mpool_sm_max_size
(default 2G? 256M?). So, you can tweak those parameters to control
file size.
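To make that concrete, here is a rough sketch of how one might inspect
and override those values. The 64MB cap below is just an illustrative
number, ./a.out stands in for the application, and I believe the values
are in bytes - double-check the units and defaults that ompi_info
reports for your build:

  # list the sm mpool parameters and their defaults
  ompi_info --param mpool sm

  # example: 16 local processes x 32MB per peer = 512MB requested,
  # which then gets clipped to [mpool_sm_min_size, mpool_sm_max_size];
  # here we cap the backing file at 64MB (67108864 bytes, illustrative)
  mpirun -np 16 --mca btl openib,sm,self \
      --mca mpool_sm_max_size 67108864 ./a.out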
Another issue is how small a backing file you can get away with. That
is, just forcing the file to be smaller may not be enough, since your
job may no longer run. The backing file seems to be used mainly by the
following (a rough sketch of the corresponding settings appears after
the list):
*) eager-fragment free lists: We start with enough eager fragments
so that we could have two per connection. So, you could bump the sm
eager size down if you need to shoehorn a job into a very small
backing file.
*) large-fragment free lists: We start with 8*n large fragments.
If this term plagues you, you can bump the sm chunk size down or
reduce the value of 8 (using btl_sm_free_list_num, I think).
*) FIFOs: The code tries to align a number of things on pagesize
boundaries, so you end up with about 3*n*n*pagesize overhead here.
If this term is causing you problems, you're stuck (unless you
modify OMPI).
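For what it's worth, here is a hedged sketch of knocking down the first
two terms. I'm writing the fragment-size parameters as
btl_sm_eager_limit and btl_sm_max_frag_size from memory - they may be
named slightly differently in 1.3, so confirm against ompi_info, and
the values below are purely illustrative:

  # see the actual sm BTL parameter names and defaults in your build
  ompi_info --param btl sm

  # shrink the eager fragments, the large ("chunk") fragments, and the
  # initial number of large fragments per free list (names/values are
  # illustrative - verify them against the ompi_info output above)
  mpirun -np 16 --mca btl openib,sm,self \
      --mca btl_sm_eager_limit 2048 \
      --mca btl_sm_max_frag_size 16384 \
      --mca btl_sm_free_list_num 4 ./a.out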
I'm interested in this subject! :^)