Re: [O-MPI devel] Linux processor affinity
Just recently finished checking. For the collection of Linux hosts I have access to, the probe results are the same regardless of the choice of set or get. I agree 100% that "get" is a safer probe.

-Paul

Jeff Squyres wrote:
> On Dec 9, 2005, at 3:06 PM, Bogdan Costescu wrote:
>
>>> rc = sched_setaffinity(0, sizeof(mask), mask);
>>
>> This changes whatever affinity might have been set before this check, for example by a (smart, don't know if such exists now) batch system. I haven't checked if it's possible, but I think that a similar solution based on sched_getaffinity would be much better, as this would not disturb the current settings.
>
> Paul and I were discussing this earlier (off list). He was investigating doing the same check with sched_getaffinity() -- I don't know if he has finished checking into that already.
>
> --
> {+} Jeff Squyres
> {+} The Open MPI Project
> {+} http://www.open-mpi.org/

--
Paul H. Hargrove                          phhargr...@lbl.gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
Re: [O-MPI devel] Linux processor affinity
On Thu, 8 Dec 2005, Jeff Squyres wrote:

> This is friggen' amazing.

Let me disagree with you here... and not because I proposed a different solution. ;-)

> rc = sched_setaffinity(0, sizeof(mask), mask);

This changes whatever affinity might have been set before this check, for example by a (smart, don't know if such exists now) batch system. I haven't checked if it's possible, but I think that a similar solution based on sched_getaffinity would be much better, as this would not disturb the current settings.

--
Bogdan Costescu
IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: bogdan.coste...@iwr.uni-heidelberg.de
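For readers following the thread, a read-only probe along the lines suggested above might look like the sketch below. It assumes a current glibc, where the wrapper has the three-argument form taking a cpu_set_t (the thread itself is about systems where that was not guaranteed), and the helper name probe_affinity_readonly is made up for illustration. Because it only calls sched_getaffinity, it leaves any mask installed by a batch system untouched.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Hypothetical helper (not from Open MPI): report how many CPUs are in the
 * calling process's current affinity mask, or -1 if the call fails.
 * Only reads the mask; nothing is changed. Assumes the modern glibc
 * 3-argument wrapper and a machine that fits in a fixed-size cpu_set_t. */
int probe_affinity_readonly(void)
{
    cpu_set_t mask;

    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
        return -1;  /* e.g. EINVAL if the kernel mask is larger than cpu_set_t */
    }

    int n = CPU_COUNT(&mask);
    assert(n >= 1);  /* a running process is allowed on at least one CPU */
    return n;
}
```

A caller would treat a return value of 1 or more as "affinity is usable here" and -1 as "probe failed"; on very large machines the fixed-size cpu_set_t may be too small, in which case a dynamically sized mask would be needed.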
Re: [O-MPI devel] Linux processor affinity
If one looks through enough kernel versions, one finds that some of them differ in what they will accept for the len. Some produce EINVAL if len != sizeof(long); others (especially Altix) produce EINVAL if len is too short to cover all the machine's CPUs. I think I recall finding one that was even happy with len == 0.

So, even if one ignores the 2-argument version in some 2.5.x kernels, the caller needs to know whether the len to pass should always be sizeof(long), or whether it should reflect the true number of CPUs present (as one must on an Altix).

-Paul

Bogdan Costescu wrote:
> On Thu, 8 Dec 2005, Jeff Squyres wrote:
>
>> Check out http://svn.open-mpi.org/svn/ompi/trunk/opal/mca/paffinity/linux/paffinity_linux.h -- there's a big comment in that file about the problem, to include descriptions of the 3 APIs.
>
> I'm sorry, but that is not quite what I wrote about in my message. The comments refer to the _glibc_ view of the functions, at least I couldn't see how they map to my reading of the _kernel_ source code. Let's take one that is specifically mentioned there: Mandrake 10.0, kernel based on 2.6.3, in file kernel/sched.c there is the function:
>
>     /**
>      * sys_sched_setaffinity - set the cpu affinity of a process
>      * @pid: pid of the process
>      * @len: length in bytes of the bitmask pointed to by user_mask_ptr
>      * @user_mask_ptr: user-space pointer to the new cpu mask
>      */
>     asmlinkage long sys_sched_setaffinity(pid_t pid, unsigned int len,
>                                           unsigned long __user *user_mask_ptr)
>
> which again has 3 arguments that look exactly like the ones that I mentioned previously. I don't have access to the source code of the SGI Altix kernel, so I can't check the other one mentioned there as a 2-args function. But so far all _kernel_ prototypes of the function that I have looked at are exactly the same with 3 arguments.
>
> The solution that I proposed works much like a statically linked binary - it calls via a syscall the _kernel_ function that has a constant (so far) prototype. It doesn't call the _glibc_ function that changes prototype.

--
Paul H. Hargrove                          phhargr...@lbl.gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
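One way to cope with the varying len requirement described above is to probe for an acceptable length at run time. The sketch below is only an illustration under stated assumptions: the helper name probe_affinity_len, the doubling strategy, and the 64 KiB cap are all invented here, and it uses the raw sched_getaffinity syscall so that nothing is modified. It would not help on a kernel that insists on len == sizeof(long) and yet has more CPUs than fit in one long, nor with the 2-argument variant.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper (not from Open MPI): find a mask length, in bytes,
 * that the running kernel accepts for the affinity syscalls.
 * Starts at sizeof(long) and doubles while the kernel answers EINVAL.
 * Returns the accepted length, or -1 if none was found. */
long probe_affinity_len(void)
{
    const size_t max_len = 64 * 1024;  /* arbitrary cap, an assumption */

    for (size_t len = sizeof(unsigned long); len <= max_len; len *= 2) {
        /* every candidate is a whole number of longs */
        assert(len % sizeof(unsigned long) == 0);

        unsigned long *mask = calloc(1, len);
        if (mask == NULL) {
            return -1;
        }

        /* Raw syscall: read the current mask, changing nothing. */
        long rc = syscall(SYS_sched_getaffinity, 0, len, mask);
        int err = errno;
        free(mask);

        if (rc >= 0) {
            return (long)len;  /* kernel accepted this length */
        }
        if (err != EINVAL) {
            return -1;         /* some other failure, e.g. ENOSYS */
        }
    }
    return -1;
}
```

The length found this way could then be reused for later get or set calls in the same process.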
Re: [O-MPI devel] Linux processor affinity
On Thu, 8 Dec 2005, Jeff Squyres wrote:

> Check out http://svn.open-mpi.org/svn/ompi/trunk/opal/mca/paffinity/linux/paffinity_linux.h -- there's a big comment in that file about the problem, to include descriptions of the 3 APIs.

I'm sorry, but that is not quite what I wrote about in my message. The comments refer to the _glibc_ view of the functions, at least I couldn't see how they map to my reading of the _kernel_ source code. Let's take one that is specifically mentioned there: Mandrake 10.0, kernel based on 2.6.3, in file kernel/sched.c there is the function:

    /**
     * sys_sched_setaffinity - set the cpu affinity of a process
     * @pid: pid of the process
     * @len: length in bytes of the bitmask pointed to by user_mask_ptr
     * @user_mask_ptr: user-space pointer to the new cpu mask
     */
    asmlinkage long sys_sched_setaffinity(pid_t pid, unsigned int len,
                                          unsigned long __user *user_mask_ptr)

which again has 3 arguments that look exactly like the ones that I mentioned previously. I don't have access to the source code of the SGI Altix kernel, so I can't check the other one mentioned there as a 2-args function. But so far all _kernel_ prototypes of the function that I have looked at are exactly the same with 3 arguments.

The solution that I proposed works much like a statically linked binary - it calls via a syscall the _kernel_ function that has a constant (so far) prototype. It doesn't call the _glibc_ function that changes prototype.

--
Bogdan Costescu
IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: bogdan.coste...@iwr.uni-heidelberg.de
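The approach described above, going straight to the kernel entry point instead of through glibc, could be sketched roughly as follows. The wrapper names raw_sched_getaffinity and raw_sched_setaffinity are hypothetical; the argument list mirrors the 3-argument kernel prototype quoted in the message, and the sketch assumes a kernel and libc that expose SYS_sched_getaffinity and SYS_sched_setaffinity through syscall().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrappers (names invented for this sketch) that bypass the
 * glibc sched_*affinity functions and invoke the kernel syscalls directly,
 * using the (pid, len, mask pointer) argument order of the kernel prototype. */

/* Returns the number of bytes the kernel wrote into mask, or -1 with errno set. */
long raw_sched_getaffinity(pid_t pid, unsigned int len, unsigned long *mask)
{
    assert(mask != NULL);
    return syscall(SYS_sched_getaffinity, pid, len, mask);
}

/* Returns 0 on success, or -1 with errno set. This one does change the
 * process's affinity, so a caller should save the old mask first. */
long raw_sched_setaffinity(pid_t pid, unsigned int len, const unsigned long *mask)
{
    assert(mask != NULL);
    return syscall(SYS_sched_setaffinity, pid, len, mask);
}
```

A caller might read the current mask with raw_sched_getaffinity(0, sizeof(buf), buf) into an array of unsigned long and only call the set wrapper when it actually intends to change the binding.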