Re: [OMPI users] [EXT] Re: [EXT] Re: Error handling

2023-07-19 Thread Alexander Stadik via users
Hey Jeff,

George Bosilca already cleared it up in a previous answer. I tested everything 
again, and once the modulo-256 truncation of the exit status is taken into 
account, everything behaves as expected.

BR Alex
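
A minimal sketch of the modulo-256 mapping in question (an illustration, not code
from the original thread; the helper name is made up): the status reported by the
shell or by mpirun is only the low 8 bits of the value handed to exit().

#include <stdio.h>

/* illustrative helper: predict what "echo $?" will report for a given code */
static int observed_exit_code(int code)
{
    return (unsigned char)code;   /* low 8 bits, i.e. code modulo 256 */
}

int main(void)
{
    printf("%d -> %d\n", -8, observed_exit_code(-8));   /* -8 -> 248 */
    printf("%d -> %d\n", 79, observed_exit_code(79));   /* 79 -> 79  */
    return 0;
}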

From: Jeff Squyres (jsquyres) 
Sent: Wednesday, July 19, 2023 5:09 PM
To: George Bosilca ; Open MPI Users 

Cc: Alexander Stadik 
Subject: [EXT] Re: [OMPI users] [EXT] Re: Error handling

External: Check sender address and use caution opening links or attachments

MPI_Allreduce should work just fine, even with negative numbers.  If you are 
seeing something different, can you provide a small reproducer program that 
shows the problem?  We can dig deeper into it if we can reproduce the problem.

mpirun's exit status can't distinguish between MPI processes that call 
MPI_Finalize and then return a non-zero exit status and those that invoke 
MPI_Abort.  But if you have 1 process that invokes MPI_Abort with an exit 
status <255, it should be reflected in mpirun's exit status.  For example:


$ cat abort.c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == size - 1) {
        int err_code = 79;
        fprintf(stderr, "I am rank %d and am aborting with error code %d\n",
                rank, err_code);
        MPI_Abort(MPI_COMM_WORLD, err_code);
    }

    fprintf(stderr, "I am rank %d and am exiting with 0\n", rank);
    MPI_Finalize();
    return 0;
}

$ mpicc abort.c -o abort

$ mpirun --host mpi004:2,mpi005:2 -np 4 ./abort
I am rank 0 and am exiting with 0
I am rank 1 and am exiting with 0
I am rank 2 and am exiting with 0
I am rank 3 and am aborting with error code 79
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD
with errorcode 79.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------

$ echo $?
79


From: users <users-boun...@lists.open-mpi.org> on behalf of Alexander Stadik via users <users@lists.open-mpi.org>
Sent: Wednesday, July 19, 2023 12:45 AM
To: George Bosilca <bosi...@icl.utk.edu>; Open MPI Users <users@lists.open-mpi.org>
Cc: Alexander Stadik <alexander.sta...@essteyr.com>
Subject: Re: [OMPI users] [EXT] Re: Error handling

Hey George,

I said "random" only because I did not see the method behind it: when I do an 
allreduce by MIN and return a negative number, I usually get 248, 253, 11, or 6, 
so the value seems to come purely from the MPI side.

The problem with MPI_Abort is that it shows the correct number in its log 
output, but it does not communicate that value to the other processes or 
forward it as the exit code. So one always sees these "random" values as well.

When using positive numbers in range it seems to work, so my question is how 
this works and how one can do it properly. Is there a way to let MPI_Abort 
communicate the value as the exit code? Why do negative numbers not work, or 
does one simply always have to use positive numbers? The reason I would prefer 
Abort is that it seems safer.

BR Alex


From: George Bosilca <bosi...@icl.utk.edu>
Sent: Tuesday, July 18, 2023 18:47
To: Open MPI Users <users@lists.open-mpi.org>
Cc: Alexander Stadik <alexander.sta...@essteyr.com>
Subject: [EXT] Re: [OMPI users] Error handling

External: Check sender address and use caution opening links or attachments

Alex,

How are your values "random" if you provide correct values ? Even for negative 
values you could use MIN to pick one value and return it. What is the problem 
with `MPI_Abort` ? it does seem to do what you want.

  George.


On Tue, Jul 18, 2023 at 4:38 AM Alexander Stadik via users 
<users@lists.open-mpi.org> wrote:
Hey everyone,

I have been working with CUDA-aware Open MPI for quite a while now, and some 
time ago I developed a small exception-handling framework covering MPI and CUDA 
exceptions. Currently I am using MPI_Abort with custom error numbers to 
terminate everything cleanly, which works well as long as one just reads the 
logfile in case of a crash.

Now I was wondering how one can handle return / exit codes properly between 
processes, since we would like to filter non-zero exits by return code.

One way is a simple Allreduce (in my case) + exit instead of Abort. But the 
problem seems to be that the values are always "random" (since I was using 
negative codes); only when using MPI error codes does it seem to work 
correctly, and the usage of those is limited.

Any suggestions on how to do this / how it can work properly?

BR Alex



Re: [OMPI users] [EXT] Re: [EXT] Re: Error handling

2023-07-19 Thread Alexander Stadik via users
Hey George,

thanks, of course, this fully explains it; I had simply assumed it was a problem 
of the child process. In that case there is also no issue with negative values 
once the modulo-256 truncation is taken into account.

BR Alex

From: George Bosilca 
Sent: Wednesday, July 19, 2023 4:45 PM
To: Alexander Stadik 
Cc: Open MPI Users 
Subject: [EXT] Re: [EXT] Re: [OMPI users] Error handling

External: Check sender address and use caution opening links or attachments

Alex,

exit(status) does not make the full status available to the parent process's 
wait(); instead it makes only the low 8 bits available to the parent, as an 
unsigned value. This explains why small positive values seem to work correctly 
while negative values do not (because of the 32-bit two's-complement 
representation of negative values, which is truncated to 8 bits).

  George.


On Wed, Jul 19, 2023 at 12:45 AM Alexander Stadik 
<alexander.sta...@essteyr.com> wrote:
Hey George,

I said "random" only because I did not see the method behind it: when I do an 
allreduce by MIN and return a negative number, I usually get 248, 253, 11, or 6, 
so the value seems to come purely from the MPI side.

The problem with MPI_Abort is that it shows the correct number in its log 
output, but it does not communicate that value to the other processes or 
forward it as the exit code. So one always sees these "random" values as well.

When using positive numbers in range it seems to work, so my question is how 
this works and how one can do it properly. Is there a way to let MPI_Abort 
communicate the value as the exit code? Why do negative numbers not work, or 
does one simply always have to use positive numbers? The reason I would prefer 
Abort is that it seems safer.

BR Alex


From: George Bosilca <bosi...@icl.utk.edu>
Sent: Tuesday, July 18, 2023 18:47
To: Open MPI Users <users@lists.open-mpi.org>
Cc: Alexander Stadik <alexander.sta...@essteyr.com>
Subject: [EXT] Re: [OMPI users] Error handling

External: Check sender address and use caution opening links or attachments

Alex,

How are your values "random" if you provide correct values ? Even for negative 
values you could use MIN to pick one value and return it. What is the problem 
with `MPI_Abort` ? it does seem to do what you want.

  George.


On Tue, Jul 18, 2023 at 4:38 AM Alexander Stadik via users 
<users@lists.open-mpi.org> wrote:
Hey everyone,

I have been working with CUDA-aware Open MPI for quite a while now, and some 
time ago I developed a small exception-handling framework covering MPI and CUDA 
exceptions. Currently I am using MPI_Abort with custom error numbers to 
terminate everything cleanly, which works well as long as one just reads the 
logfile in case of a crash.

Now I was wondering how one can handle return / exit codes properly between 
processes, since we would like to filter non-zero exits by return code.

One way is a simple Allreduce (in my case) + exit instead of Abort. But the 
problem seems to be that the values are always "random" (since I was using 
negative codes); only when using MPI error codes does it seem to work 
correctly, and the usage of those is limited.

Any suggestions on how to do this / how it can work properly?

BR Alex



Re: [OMPI users] How to use hugetlbfs with openmpi and ucx

2023-07-19 Thread Chandran, Arun via users
Hi,

I am trying to use static huge pages, not transparent huge pages.

UCX is allowed to allocate via hugetlbfs.

$ ./bin/ucx_info -c | grep -i huge
UCX_SELF_ALLOC=huge,thp,md,mmap,heap
UCX_TCP_ALLOC=huge,thp,md,mmap,heap
UCX_SYSV_HUGETLB_MODE=try --->It is trying this and failing
UCX_SYSV_FIFO_HUGETLB=n
UCX_POSIX_HUGETLB_MODE=try---> it is trying this and failing
UCX_POSIX_FIFO_HUGETLB=n
UCX_ALLOC_PRIO=md:sysv,md:posix,huge,thp,md:*,mmap,heap
UCX_CMA_ALLOC=huge,thp,mmap,heap

It is failing even though I have static hugepages available in my system.

$ cat /proc/meminfo | grep HugePages_Total
HugePages_Total:  20

THP is also enabled:
$ cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never

--Arun

-Original Message-
From: Florent GERMAIN  
Sent: Wednesday, July 19, 2023 7:51 PM
To: Open MPI Users ; Chandran, Arun 

Subject: RE: How to use hugetlbfs with openmpi and ucx

Hi,
You can check if there are dedicated huge pages on your system or if 
transparent huge pages are allowed.

Transparent huge pages on rhel systems :
$cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never
-> this means that transparent huge pages are selected through mmap + madvise
   always = always try to aggregate pages on thp (for large enough allocations
            with good alignment)
   never  = never try to aggregate pages on thp

Dedicated huge pages on rhel systems :
$ cat /proc/meminfo | grep HugePages_Total
HugePages_Total:   0
-> no dedicated huge pages here

It seems that ucx tries to use dedicated huge pages (mmap(addr=(nil), 
length=6291456, flags= HUGETLB, fd=29)).
If there are no dedicated huge pages available, mmap fails.

Huge pages can accelerate virtual address to physical address translation and 
reduce TLB consumption.
It may be useful for large and frequently used buffers.

Regards,
Florent
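
A minimal sketch of the kind of allocation that is failing in the log quoted
below (an illustration, not UCX code; it uses an anonymous mapping, whereas UCX
maps a POSIX shared-memory file descriptor, which has extra requirements of its
own): an mmap with MAP_HUGETLB only succeeds if dedicated huge pages have been
reserved on the node.

#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/mman.h>

int main(void)
{
    size_t len = 6291456;   /* 6 MiB, as in the UCX debug output below */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED) {
        /* With no dedicated huge pages this typically fails with ENOMEM
         * (or EINVAL if len is not a multiple of the huge page size). */
        fprintf(stderr, "mmap(MAP_HUGETLB) failed: %s\n", strerror(errno));
        return 1;
    }
    printf("got %zu bytes backed by huge pages\n", len);
    munmap(p, len);
    return 0;
}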

-----Original Message-----
From: users  on behalf of Chandran, Arun via users
Sent: Wednesday, July 19, 2023 15:44
To: users@lists.open-mpi.org
Cc: Chandran, Arun
Subject: [OMPI users] How to use hugetlbfs with openmpi and ucx

Hi All,

I am trying to see whether hugetlbfs is improving the latency of communication 
with a small send receive program.

mpirun -np 2 --map-by core --bind-to core --mca pml ucx --mca opal_common_ucx_tls any \
    --mca opal_common_ucx_devices any -mca pml_base_verbose 10 --mca mtl_base_verbose 10 \
    -x OMPI_MCA_pml_ucx_verbose=10 -x UCX_LOG_LEVEL=debug -x UCX_PROTO_INFO=y send_recv 1000 1


But the internal buffer allocation in ucx is unable to select the hugetlbfs.

[1688297246.205092] [lib-ssp-04:4022755:0] ucp_context.c:1979 UCX  DEBUG 
allocation method[2] is 'huge'
[1688297246.208660] [lib-ssp-04:4022755:0] mm_sysv.c:97   UCX  DEBUG   
mm failed to allocate 8447 bytes with hugetlb-> I checked the code, 
this is a valid failure as the size is small compared to huge page size of 2MB
[1688297246.208704] [lib-ssp-04:4022755:0] mm_sysv.c:97   UCX  DEBUG   
mm failed to allocate 4292720 bytes with hugetlb
[1688297246.210048] [lib-ssp-04:4022755:0]mm_posix.c:332  UCX  DEBUG   
shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=29) failed: 
Invalid argument
[1688297246.211451] [lib-ssp-04:4022754:0] ucp_context.c:1979 UCX  DEBUG 
allocation method[2] is 'huge'
[1688297246.214849] [lib-ssp-04:4022754:0] mm_sysv.c:97   UCX  DEBUG   
mm failed to allocate 8447 bytes with hugetlb
[1688297246.214888] [lib-ssp-04:4022754:0] mm_sysv.c:97   UCX  DEBUG   
mm failed to allocate 4292720 bytes with hugetlb
[1688297246.216235] [lib-ssp-04:4022754:0]mm_posix.c:332  UCX  DEBUG   
shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=29) failed: 
Invalid argument

Can someone suggest what steps are needed to enable hugetlbfs [I cannot run my 
application as root]? Is using hugetlbfs for the internal buffers recommended?

--Arun


Re: [OMPI users] libnuma.so error

2023-07-19 Thread Gus Correa via users
If it is installed, libnuma should be in:
/usr/lib64/libnuma.so
as a symlink to the actual version-numbered library.
In general the loader is configured to search for shared libraries
in /usr/lib64 (running ldd on your binary may shed some light here).

You can check if the numa packages are installed with:
yum list | grep numa (CentOS 7, RHEL 7)
dnf list | grep numa (CentOS 8, RHEL 8, RockyLinux 8, Fedora, etc)
apt list | grep numa (Debian, Ubuntu)

If not, you can install (or ask the system administrator to do it).

I hope this helps,
Gus Correa


On Wed, Jul 19, 2023 at 11:55 AM Jeff Squyres (jsquyres) via users <
users@lists.open-mpi.org> wrote:

> It's not clear if that message is being emitted by Open MPI.
>
> It does say it's falling back to a different behavior if libnuma.so is not
> found, so it appears that it's treating it as a warning, not an error.
> --
> *From:* users  on behalf of Luis
> Cebamanos via users 
> *Sent:* Wednesday, July 19, 2023 10:09 AM
> *To:* users@lists.open-mpi.org 
> *Cc:* Luis Cebamanos 
> *Subject:* [OMPI users] libnuma.so error
>
> Hello,
>
> I was wondering if anyone has ever seen the following runtime error:
>
> mpirun -np 32 ./hello
> .
> [LOG_CAT_SBGP] libnuma.so: cannot open shared object file: No such file
> or directory
> [LOG_CAT_SBGP] Failed to dlopen libnuma.so. Fallback to GROUP_BY_SOCKET
> manual.
> .
>
> The funny thing is that the binary is executed despite the errors.
> What could be causing it?
>
> Regards,
> Lusi
>


Re: [OMPI users] [EXT] Re: Error handling

2023-07-19 Thread George Bosilca via users
I think the root cause was that he expected the negative integer resulting
from the reduction to be the exit code of the application, and as I
explained in my prior email that's not how exit() works.

The exit() issue aside, MPI_Abort seems to be the right function for this
usage.

  George.
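
A minimal sketch of that approach (an illustration, not code from this thread),
using positive error codes below 256 so that the value survives the 8-bit
truncation of the exit status, and MPI_MAX instead of the MPI_MIN-on-negative-codes
variant discussed above:

#include <mpi.h>
#include <stdio.h>

/* Agree on one error code across all ranks and abort with it.
 * local_code is 0 on success, otherwise a positive code in 1..255. */
static void abort_on_error(MPI_Comm comm, int local_code)
{
    int worst = 0;
    MPI_Allreduce(&local_code, &worst, 1, MPI_INT, MPI_MAX, comm);
    if (worst != 0) {
        int rank;
        MPI_Comm_rank(comm, &rank);
        if (rank == 0)
            fprintf(stderr, "aborting with error code %d\n", worst);
        MPI_Abort(comm, worst);   /* mpirun's exit status will be 'worst' */
    }
}

With codes kept in 1..255, mpirun's exit status can then be filtered directly,
e.g. in a job script, which was the original goal.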


On Wed, Jul 19, 2023 at 11:08 AM Jeff Squyres (jsquyres) 
wrote:

> MPI_Allreduce should work just fine, even with negative numbers.  If you
> are seeing something different, can you provide a small reproducer program
> that shows the problem?  We can dig deeper into it if we can reproduce the
> problem.
>
> mpirun's exit status can't distinguish between MPI processes who call
> MPI_Finalize and then return a non-zero exit status and those who invoked
> MPI_Abort.  But if you have 1 process that invokes MPI_Abort with an exit
> status <255, it should be reflected in mpirun's exit status.  For example:
>
> $ cat abort.c
> #include <stdio.h>
> #include <mpi.h>
>
> int main(int argc, char *argv[])
> {
>     int rank, size;
>
>     MPI_Init(NULL, NULL);
>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>     MPI_Comm_size(MPI_COMM_WORLD, &size);
>
>     if (rank == size - 1) {
>         int err_code = 79;
>         fprintf(stderr, "I am rank %d and am aborting with error code %d\n",
>                 rank, err_code);
>         MPI_Abort(MPI_COMM_WORLD, err_code);
>     }
>
>     fprintf(stderr, "I am rank %d and am exiting with 0\n", rank);
>     MPI_Finalize();
>     return 0;
> }
>
> $ mpicc abort.c -o abort
>
> $ mpirun --host mpi004:2,mpi005:2 -np 4 ./abort
> I am rank 0 and am exiting with 0
> I am rank 1 and am exiting with 0
> I am rank 2 and am exiting with 0
> I am rank 3 and am aborting with error code 79
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD
> with errorcode 79.
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
>
> $ echo $?
>
> 79
>
> --
> *From:* users  on behalf of Alexander
> Stadik via users 
> *Sent:* Wednesday, July 19, 2023 12:45 AM
> *To:* George Bosilca ; Open MPI Users <
> users@lists.open-mpi.org>
> *Cc:* Alexander Stadik 
> *Subject:* Re: [OMPI users] [EXT] Re: Error handling
>
> Hey George,
>
> I said "random" only because I did not see the method behind it: when I do an
> allreduce by MIN and return a negative number, I usually get 248, 253, 11, or
> 6, so the value seems to come purely from the MPI side.
>
> The problem with MPI_Abort is that it shows the correct number in its log
> output, but it does not communicate that value to the other processes or
> forward it as the exit code. So one always sees these "random" values as well.
>
> When using positive numbers in range it seems to work, so my question is how
> this works and how one can do it properly. Is there a way to let MPI_Abort
> communicate the value as the exit code? Why do negative numbers not work, or
> does one simply always have to use positive numbers? The reason I would prefer
> Abort is that it seems safer.
>
> BR Alex
>
>
> --
> *From:* George Bosilca 
> *Sent:* Tuesday, July 18, 2023 18:47
> *To:* Open MPI Users 
> *Cc:* Alexander Stadik 
> *Subject:* [EXT] Re: [OMPI users] Error handling
>
> External: Check sender address and use caution opening links or
> attachments
>
> Alex,
>
> How are your values "random" if you provide correct values ? Even for
> negative values you could use MIN to pick one value and return it. What is
> the problem with `MPI_Abort` ? it does seem to do what you want.
>
>   George.
>
>
> On Tue, Jul 18, 2023 at 4:38 AM Alexander Stadik via users <
> users@lists.open-mpi.org> wrote:
>
> Hey everyone,
>
> I have been working with CUDA-aware Open MPI for quite a while now, and some
> time ago I developed a small exception-handling framework covering MPI and
> CUDA exceptions. Currently I am using MPI_Abort with custom error numbers to
> terminate everything cleanly, which works well as long as one just reads the
> logfile in case of a crash.
>
> Now I was wondering how one can handle return / exit codes properly between
> processes, since we would like to filter non-zero exits by return code.
>
> One way is a simple Allreduce (in my case) + exit instead of Abort. But the
> problem seems to be that the values are always "random" (since I was using
> negative codes); only when using MPI error codes does it seem to work
> correctly, and the usage of those is limited.
>
> Any suggestions on how to do this / how it can work properly?
>
> BR Alex
>
>

Re: [OMPI users] [EXT] Re: Error handling

2023-07-19 Thread Jeff Squyres (jsquyres) via users
MPI_Allreduce should work just fine, even with negative numbers.  If you are 
seeing something different, can you provide a small reproducer program that 
shows the problem?  We can dig deeper into it if we can reproduce the problem.

mpirun's exit status can't distinguish between MPI processes who call 
MPI_Finalize and then return a non-zero exit status and those who invoked 
MPI_Abort.  But if you have 1 process that invokes MPI_Abort with an exit 
status <255, it should be reflected in mpirun's exit status.  For example:


$ cat abort.c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == size - 1) {
        int err_code = 79;
        fprintf(stderr, "I am rank %d and am aborting with error code %d\n",
                rank, err_code);
        MPI_Abort(MPI_COMM_WORLD, err_code);
    }

    fprintf(stderr, "I am rank %d and am exiting with 0\n", rank);
    MPI_Finalize();
    return 0;
}

$ mpicc abort.c -o abort

$ mpirun --host mpi004:2,mpi005:2 -np 4 ./abort
I am rank 0 and am exiting with 0
I am rank 1 and am exiting with 0
I am rank 2 and am exiting with 0
I am rank 3 and am aborting with error code 79
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD
with errorcode 79.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------

$ echo $?
79


From: users  on behalf of Alexander Stadik 
via users 
Sent: Wednesday, July 19, 2023 12:45 AM
To: George Bosilca ; Open MPI Users 

Cc: Alexander Stadik 
Subject: Re: [OMPI users] [EXT] Re: Error handling

Hey George,

I said "random" only because I did not see the method behind it: when I do an 
allreduce by MIN and return a negative number, I usually get 248, 253, 11, or 6, 
so the value seems to come purely from the MPI side.

The problem with MPI_Abort is that it shows the correct number in its log 
output, but it does not communicate that value to the other processes or 
forward it as the exit code. So one always sees these "random" values as well.

When using positive numbers in range it seems to work, so my question is how 
this works and how one can do it properly. Is there a way to let MPI_Abort 
communicate the value as the exit code? Why do negative numbers not work, or 
does one simply always have to use positive numbers? The reason I would prefer 
Abort is that it seems safer.

BR Alex



From: George Bosilca 
Sent: Tuesday, July 18, 2023 18:47
To: Open MPI Users 
Cc: Alexander Stadik 
Subject: [EXT] Re: [OMPI users] Error handling

External: Check sender address and use caution opening links or attachments

Alex,

How are your values "random" if you provide correct values ? Even for negative 
values you could use MIN to pick one value and return it. What is the problem 
with `MPI_Abort` ? it does seem to do what you want.

  George.


On Tue, Jul 18, 2023 at 4:38 AM Alexander Stadik via users 
<users@lists.open-mpi.org> wrote:
Hey everyone,

I have been working with CUDA-aware Open MPI for quite a while now, and some 
time ago I developed a small exception-handling framework covering MPI and CUDA 
exceptions. Currently I am using MPI_Abort with custom error numbers to 
terminate everything cleanly, which works well as long as one just reads the 
logfile in case of a crash.

Now I was wondering how one can handle return / exit codes properly between 
processes, since we would like to filter non-zero exits by return code.

One way is a simple Allreduce (in my case) + exit instead of Abort. But the 
problem seems to be that the values are always "random" (since I was using 
negative codes); only when using MPI error codes does it seem to work 
correctly, and the usage of those is limited.

Any suggestions on how to do this / how it can work properly?

BR Alex




Re: [OMPI users] [EXT] Re: Error handling

2023-07-19 Thread George Bosilca via users
Alex,

exit(status) does not make the full status available to the parent process's
wait(); instead it makes only the low 8 bits available to the parent, as an
unsigned value. This explains why small positive values seem to work correctly
while negative values do not (because of the 32-bit two's-complement
representation of negative values, which is truncated to 8 bits).

  George.
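
A minimal demonstration of that truncation (an illustration, not code from the
original thread): the parent's wait() only ever sees the low 8 bits of the value
the child passes to exit().

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid == 0)
        exit(-8);                 /* child tries to "return" a negative code */

    int status = 0;
    waitpid(pid, &status, 0);
    if (WIFEXITED(status))
        printf("exit status seen by the parent: %d\n",
               WEXITSTATUS(status));   /* prints 248, i.e. -8 modulo 256 */
    return 0;
}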


On Wed, Jul 19, 2023 at 12:45 AM Alexander Stadik <
alexander.sta...@essteyr.com> wrote:

> Hey George,
>
> I said "random" only because I did not see the method behind it: when I do an
> allreduce by MIN and return a negative number, I usually get 248, 253, 11, or
> 6, so the value seems to come purely from the MPI side.
>
> The problem with MPI_Abort is that it shows the correct number in its log
> output, but it does not communicate that value to the other processes or
> forward it as the exit code. So one always sees these "random" values as well.
>
> When using positive numbers in range it seems to work, so my question is how
> this works and how one can do it properly. Is there a way to let MPI_Abort
> communicate the value as the exit code? Why do negative numbers not work, or
> does one simply always have to use positive numbers? The reason I would prefer
> Abort is that it seems safer.
>
> BR Alex
>
>
> --
> *From:* George Bosilca 
> *Sent:* Tuesday, July 18, 2023 18:47
> *To:* Open MPI Users 
> *Cc:* Alexander Stadik 
> *Subject:* [EXT] Re: [OMPI users] Error handling
>
> External: Check sender address and use caution opening links or
> attachments
>
> Alex,
>
> How are your values "random" if you provide correct values ? Even for
> negative values you could use MIN to pick one value and return it. What is
> the problem with `MPI_Abort` ? it does seem to do what you want.
>
>   George.
>
>
> On Tue, Jul 18, 2023 at 4:38 AM Alexander Stadik via users <
> users@lists.open-mpi.org> wrote:
>
> Hey everyone,
>
> I have been working with CUDA-aware Open MPI for quite a while now, and some
> time ago I developed a small exception-handling framework covering MPI and
> CUDA exceptions. Currently I am using MPI_Abort with custom error numbers to
> terminate everything cleanly, which works well as long as one just reads the
> logfile in case of a crash.
>
> Now I was wondering how one can handle return / exit codes properly between
> processes, since we would like to filter non-zero exits by return code.
>
> One way is a simple Allreduce (in my case) + exit instead of Abort. But the
> problem seems to be that the values are always "random" (since I was using
> negative codes); only when using MPI error codes does it seem to work
> correctly, and the usage of those is limited.
>
> Any suggestions on how to do this / how it can work properly?
>
> BR Alex
>
>


Re: [OMPI users] libnuma.so error

2023-07-19 Thread Jeff Squyres (jsquyres) via users
It's not clear if that message is being emitted by Open MPI.

It does say it's falling back to a different behavior if libnuma.so is not 
found, so it appears that it's treating it as a warning, not an error.

From: users  on behalf of Luis Cebamanos via 
users 
Sent: Wednesday, July 19, 2023 10:09 AM
To: users@lists.open-mpi.org 
Cc: Luis Cebamanos 
Subject: [OMPI users] libnuma.so error

Hello,

I was wondering if anyone has ever seen the following runtime error:

mpirun -np 32 ./hello
.
[LOG_CAT_SBGP] libnuma.so: cannot open shared object file: No such file
or directory
[LOG_CAT_SBGP] Failed to dlopen libnuma.so. Fallback to GROUP_BY_SOCKET
manual.
.

The funny thing is that the binary is executed despite the errors.
What could be causing it?

Regards,
Lusi


Re: [OMPI users] libnuma.so error

2023-07-19 Thread Gilles Gouaillardet via users
Luis,

That can happen if a component is linked with libnuma.so:
Open MPI will fail to open it and try to fall back on another one.

You can run ldd on the mca_*.so components in the /.../lib/openmpi directory
to figure out which is using libnuma.so and assess if it is needed or not.


Cheers,

Gilles
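
As a quick check, a small diagnostic along these lines (an illustration, not an
Open MPI tool) shows what the dynamic loader reports for libnuma; many
distributions ship only libnuma.so.1 with the runtime package, while the
unversioned libnuma.so symlink comes with the -devel/-dev package. Compile with:
cc check_numa.c -ldl

#include <stdio.h>
#include <dlfcn.h>

int main(void)
{
    const char *names[] = { "libnuma.so", "libnuma.so.1" };
    for (int i = 0; i < 2; i++) {
        void *h = dlopen(names[i], RTLD_LAZY);
        if (h) {
            printf("%s: found\n", names[i]);
            dlclose(h);
        } else {
            printf("%s: %s\n", names[i], dlerror());   /* loader's error text */
        }
    }
    return 0;
}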

On Wed, Jul 19, 2023 at 11:36 PM Luis Cebamanos via users <
users@lists.open-mpi.org> wrote:

> Hello,
>
> I was wondering if anyone has ever seen the following runtime error:
>
> mpirun -np 32 ./hello
> .
> [LOG_CAT_SBGP] libnuma.so: cannot open shared object file: No such file
> or directory
> [LOG_CAT_SBGP] Failed to dlopen libnuma.so. Fallback to GROUP_BY_SOCKET
> manual.
> .
>
> The funny thing is that the binary is executed despite the errors.
> What could be causing it?
>
> Regards,
> Lusi
>


[OMPI users] libnuma.so error

2023-07-19 Thread Luis Cebamanos via users

Hello,

I was wondering if anyone has ever seen the following runtime error:

mpirun -np 32 ./hello
.
[LOG_CAT_SBGP] libnuma.so: cannot open shared object file: No such file 
or directory
[LOG_CAT_SBGP] Failed to dlopen libnuma.so. Fallback to GROUP_BY_SOCKET 
manual.

.

The funny thing is that the binary is executed despite the errors.
What could be causing it?

Regards,
Lusi


[OMPI users] How to use hugetlbfs with openmpi and ucx

2023-07-19 Thread Chandran, Arun via users
Hi All,

I am trying to see whether hugetlbfs is improving the latency of communication 
with a small send receive program.

mpirun -np 2 --map-by core --bind-to core --mca pml ucx --mca opal_common_ucx_tls any \
    --mca opal_common_ucx_devices any -mca pml_base_verbose 10 --mca mtl_base_verbose 10 \
    -x OMPI_MCA_pml_ucx_verbose=10 -x UCX_LOG_LEVEL=debug -x UCX_PROTO_INFO=y send_recv 1000 1


But the internal buffer allocation in ucx is unable to select the hugetlbfs.

[1688297246.205092] [lib-ssp-04:4022755:0] ucp_context.c:1979 UCX  DEBUG 
allocation method[2] is 'huge'
[1688297246.208660] [lib-ssp-04:4022755:0] mm_sysv.c:97   UCX  DEBUG   
mm failed to allocate 8447 bytes with hugetlb-> I checked the code, 
this is a valid failure as the size is small compared to huge page size of 2MB
[1688297246.208704] [lib-ssp-04:4022755:0] mm_sysv.c:97   UCX  DEBUG   
mm failed to allocate 4292720 bytes with hugetlb
[1688297246.210048] [lib-ssp-04:4022755:0]mm_posix.c:332  UCX  DEBUG   
shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=29) failed: 
Invalid argument
[1688297246.211451] [lib-ssp-04:4022754:0] ucp_context.c:1979 UCX  DEBUG 
allocation method[2] is 'huge'
[1688297246.214849] [lib-ssp-04:4022754:0] mm_sysv.c:97   UCX  DEBUG   
mm failed to allocate 8447 bytes with hugetlb
[1688297246.214888] [lib-ssp-04:4022754:0] mm_sysv.c:97   UCX  DEBUG   
mm failed to allocate 4292720 bytes with hugetlb
[1688297246.216235] [lib-ssp-04:4022754:0]mm_posix.c:332  UCX  DEBUG   
shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=29) failed: 
Invalid argument

Can someone suggest what steps are needed to enable hugetlbfs [I cannot run my 
application as root]? Is using hugetlbfs for the internal buffers recommended?

--Arun