I am running into similar issues with both Mellanox and IBM HCAs. On a node installed with RHEL6.2 and MLNX_OFED-1.5.3-3.0.0, there is a significant hit to locked memory when using the device's max_cqe. Here, for comparison's sake, is the memory utilization for a simple MPI process when using the new cq_size default (the device's max_cqe) and when restricting it to 1500:
cq_size = max_cqe:
  VmPeak:  348736 kB
  VmSize:  348352 kB
  VmLck:   292096 kB
  VmHWM:   304896 kB
  VmRSS:   304896 kB
  VmData:  333504 kB

cq_size = 1500:
  VmPeak:   86720 kB
  VmSize:   86336 kB
  VmLck:    30080 kB
  VmHWM:    42880 kB
  VmRSS:    42880 kB
  VmData:   71488 kB

For our Power systems using the IBM eHCA, the default value exhausts memory and we can't even run.

--Brad

On Fri, Jul 6, 2012 at 5:21 AM, TERRY DONTJE <terry.don...@oracle.com> wrote:

> On 7/5/2012 5:47 PM, Shamis, Pavel wrote:
>
> I mentioned on the call that for Mellanox devices (+OFA verbs) this
> resource is really cheap. Do you run a Mellanox HCA + OFA verbs?
>
> (I'll reply because I know Terry is offline for the rest of the day)
>
> Yes, he does.
>
> I asked because Sun used to have its own verbs driver.
>
> I noticed this on a Solaris machine. I am not sure I have the same setup
> for Linux, but I'll look and see if I can reproduce the same issue on
> Linux.
>
> --td
>
> The heart of the question: is it incorrect to assume that we'll consume
> (num CQE * CQE size) registered memory for each QP opened?
>
> QP or CQ? I think you don't want to assume anything there. Verbs (user
> and kernel) do their own magic there. I think Mellanox should address
> this question.
>
> Regards,
> Pasha
>
> _______________________________________________
> devel mailing list
> devel@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
> --
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
> Oracle - Performance Technologies
> 95 Network Drive, Burlington, MA 01803
> Email terry.don...@oracle.com
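As a rough sanity check on the (num CQE * CQE size) model Pasha questions above, here is a back-of-the-envelope sketch. Note that both the max_cqe value and the 64-byte CQE size below are illustrative placeholders, not values reported anywhere in this thread (the real limits would come from ibv_query_device on the actual HCA):

```python
def cq_buffer_kb(num_cqe, cqe_size_bytes=64):
    """Estimated registered memory for one CQ buffer, in kB,
    under the simple (num CQE * CQE size) model."""
    return num_cqe * cqe_size_bytes // 1024

# Hypothetical device maximum vs. the 1500-entry cap from the post.
max_cqe = 4194303  # placeholder only; query the real value from the device

print(cq_buffer_kb(max_cqe))  # roughly 262143 kB with a 64-byte CQE
print(cq_buffer_kb(1500))     # roughly 93 kB
```

Under those (assumed) parameters the estimate lands in the same ballpark as the ~262000 kB VmLck difference shown above, but since the actual CQE size and max_cqe for the setup aren't given here, this is suggestive rather than conclusive.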