Hi,

I am trying to use static huge pages, not transparent huge pages.

UCX is allowed to allocate via hugetlbfs:

$ ./bin/ucx_info -c | grep -i huge
UCX_SELF_ALLOC=huge,thp,md,mmap,heap
UCX_TCP_ALLOC=huge,thp,md,mmap,heap
UCX_SYSV_HUGETLB_MODE=try    ---> it is trying this and failing
UCX_SYSV_FIFO_HUGETLB=n
UCX_POSIX_HUGETLB_MODE=try   ---> it is trying this and failing
UCX_POSIX_FIFO_HUGETLB=n
UCX_ALLOC_PRIO=md:sysv,md:posix,huge,thp,md:*,mmap,heap
UCX_CMA_ALLOC=huge,thp,mmap,heap
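
Since both HUGETLB_MODE options are only set to 'try', the fallback is silent. If I understand the UCX ternary options correctly, forcing them to 'yes' should turn this into a hard error and make the root cause visible (assuming 'yes' is an accepted value here):

$ mpirun -np 2 --mca pml ucx -x UCX_SYSV_HUGETLB_MODE=yes -x UCX_POSIX_HUGETLB_MODE=yes send_recv 1000 1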

It is failing even though static huge pages are available on my system.

$ cat /proc/meminfo | grep HugePages_Total
HugePages_Total:      20

THP is also enabled:
$ cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
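
The free huge pages and the configured huge page size can be checked as well (I am assuming the default 2 MB huge page size here):

$ grep -E 'HugePages_Free|Hugepagesize' /proc/meminfo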

--Arun

-----Original Message-----
From: Florent GERMAIN <florent.germ...@eviden.com> 
Sent: Wednesday, July 19, 2023 7:51 PM
To: Open MPI Users <users@lists.open-mpi.org>; Chandran, Arun 
<arun.chand...@amd.com>
Subject: RE: How to use hugetlbfs with openmpi and ucx

Hi,
You can check if there are dedicated huge pages on your system or if 
transparent huge pages are allowed.

Transparent huge pages on rhel systems :
$cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never
-> this means that transparent huge pages are selected through mmap + madvise
-> always = always try to aggregate pages on thp (for large enough allocations with good alignment)
-> never = never try to aggregate pages on thp
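
The policy can also be changed at runtime if needed (root only; this assumes the usual sysfs interface):
$ echo madvise | sudo tee /sys/kernel/mm/transparent_hugepage/enabled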

Dedicated huge pages on rhel systems :
$ cat /proc/meminfo | grep HugePages_Total
HugePages_Total:       0
-> no dedicated huge pages here
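
Dedicated huge pages are usually reserved by an administrator, for example (root only; the count of 128 is only an illustration):
$ sudo sysctl -w vm.nr_hugepages=128
$ grep HugePages_Total /proc/meminfo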

It seems that UCX tries to use dedicated huge pages (mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=29)).
If no dedicated huge pages are available, that mmap fails.

Huge pages can accelerate virtual-to-physical address translation and reduce TLB consumption.
They may be useful for large, frequently used buffers.
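
If you want to check whether TLB pressure actually matters for this benchmark, a rough way is to run each rank under perf (assuming perf is installed and the generic dTLB events are exposed on your CPUs):
$ mpirun -np 2 perf stat -e dTLB-loads,dTLB-load-misses send_recv 1000 1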

Regards,
Florent

-----Original Message-----
From: users <users-boun...@lists.open-mpi.org> On Behalf Of Chandran, Arun via users
Sent: Wednesday, July 19, 2023 3:44 PM
To: users@lists.open-mpi.org
Cc: Chandran, Arun <arun.chand...@amd.com>
Subject: [OMPI users] How to use hugetlbfs with openmpi and ucx

Hi All,

I am trying to see whether hugetlbfs improves the communication latency of a small send/receive program.

mpirun -np 2 --map-by core --bind-to core --mca pml ucx --mca opal_common_ucx_tls any \
  --mca opal_common_ucx_devices any -mca pml_base_verbose 10 --mca mtl_base_verbose 10 \
  -x OMPI_MCA_pml_ucx_verbose=10 -x UCX_LOG_LEVEL=debug -x UCX_PROTO_INFO=y send_recv 1000 1


But the internal buffer allocation in UCX is unable to select hugetlbfs.

[1688297246.205092] [lib-ssp-04:4022755:0]     ucp_context.c:1979 UCX  DEBUG allocation method[2] is 'huge'
[1688297246.208660] [lib-ssp-04:4022755:0]         mm_sysv.c:97   UCX  DEBUG   mm failed to allocate 8447 bytes with hugetlb   ---> I checked the code, this is a valid failure as the size is small compared to the huge page size of 2MB
[1688297246.208704] [lib-ssp-04:4022755:0]         mm_sysv.c:97   UCX  DEBUG   mm failed to allocate 4292720 bytes with hugetlb
[1688297246.210048] [lib-ssp-04:4022755:0]        mm_posix.c:332  UCX  DEBUG   shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=29) failed: Invalid argument
[1688297246.211451] [lib-ssp-04:4022754:0]     ucp_context.c:1979 UCX  DEBUG allocation method[2] is 'huge'
[1688297246.214849] [lib-ssp-04:4022754:0]         mm_sysv.c:97   UCX  DEBUG   mm failed to allocate 8447 bytes with hugetlb
[1688297246.214888] [lib-ssp-04:4022754:0]         mm_sysv.c:97   UCX  DEBUG   mm failed to allocate 4292720 bytes with hugetlb
[1688297246.216235] [lib-ssp-04:4022754:0]        mm_posix.c:332  UCX  DEBUG   shared memory mmap(addr=(nil), length=6291456, flags= HUGETLB, fd=29) failed: Invalid argument

Can someone suggest the steps needed to enable hugetlbfs [I cannot run my application as root]? Is using hugetlbfs for the internal buffers recommended?
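
From what I have read so far, the usual prerequisites seem to be (I have not verified all of these, hence the question): an administrator reserves the pages via vm.nr_hugepages, a hugetlbfs mount point such as /dev/hugepages exists, and for SysV shared memory the user's group is listed in vm.hugetlb_shm_group. Roughly:

$ sudo sysctl -w vm.nr_hugepages=128
$ sudo sysctl -w vm.hugetlb_shm_group=<gid of my group>
$ mount | grep hugetlbfs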

--Arun
