[dpdk-dev] [PATCH v5] eal: fix allocating all free hugepages

2016-06-30 Thread Thomas Monjalon
> > EAL memory init allocates all free hugepages of the whole system,
> > as seen from sysfs, even when the application does not ask for so
> > many. When there is a limit on how many hugepages an application
> > can use (such as cgroup.hugetlb), or hugetlbfs is mounted with a
> > size option (and the request exceeds the quota of the fs), it fails
> > to start even if enough hugepages are allocated.
> >
> > To fix the above issue, this patch:
> >   - Changes the logic to continue memory init and check whether the
> >     hugetlb requirement of the application can be met by the
> >     hugepages already allocated.
> >   - Adds a recovery mechanism to make sure each hugepage is
> >     allocated successfully: it relies on a memory access to fault in
> >     each hugepage and, if that fails with SIGBUS, recovers to the
> >     previously saved stack environment with siglongjmp().
> >
> > For the case of CONFIG_RTE_EAL_SINGLE_FILE_SEGMENTS (enabled by
> > default when compiling the IVSHMEM target), it is indispensable to
> > map all free hugepages in the system. In this case, startup fails
> > when allocation fails.
[...]
> > Signed-off-by: Jianfeng Tan
> > Acked-by: Neil Horman
> 
> Acked-by: Sergio Gonzalez Monroy 

Applied, thanks


[dpdk-dev] [PATCH v5] eal: fix allocating all free hugepages

2016-06-08 Thread Sergio Gonzalez Monroy
On 31/05/2016 04:37, Jianfeng Tan wrote:
> EAL memory init allocates all free hugepages of the whole system,
> as seen from sysfs, even when the application does not ask for so
> many. When there is a limit on how many hugepages an application
> can use (such as cgroup.hugetlb), or hugetlbfs is mounted with a
> size option (and the request exceeds the quota of the fs), it fails
> to start even if enough hugepages are allocated.
>
> To fix the above issue, this patch:
>   - Changes the logic to continue memory init and check whether the
>     hugetlb requirement of the application can be met by the
>     hugepages already allocated.
>   - Adds a recovery mechanism to make sure each hugepage is
>     allocated successfully: it relies on a memory access to fault in
>     each hugepage and, if that fails with SIGBUS, recovers to the
>     previously saved stack environment with siglongjmp().
>
> For the case of CONFIG_RTE_EAL_SINGLE_FILE_SEGMENTS (enabled by
> default when compiling the IVSHMEM target), it is indispensable to
> map all free hugepages in the system. In this case, startup fails
> when allocation fails.
>
> Test example:
>   a. cgcreate -g hugetlb:/test-subgroup
>   b. cgset -r hugetlb.1GB.limit_in_bytes=2147483648 test-subgroup
>   c. cgexec -g hugetlb:test-subgroup \
>      ./examples/helloworld/build/helloworld -c 0x2 -n 4
>
> 
> Fixes: af75078fece ("first public release")
>
> Signed-off-by: Jianfeng Tan
> Acked-by: Neil Horman
> ---
> v5:
>   - Make this method the default instead of using an option.
>   - When SIGBUS is triggered in the case of RTE_EAL_SINGLE_FILE_SEGMENTS,
>     just return an error.
>   - Add the prefix "huge_" to the newly added functions and static
>     variables.
>   - Move the internal_config.memory assignment after the page allocations.
> v4:
>   - Change map_all_hugepages to return unsigned instead of int.
> v3:
>   - Reword the commit message to state that it fixes the hugetlbfs
>     quota issue.
>   - setjmp -> sigsetjmp.
>   - Change the RTE_LOG level from ERR to DEBUG, as it does not mean an
>     init error so far.
>   - Fix the return value check of the second map_all_hugepages call.
> v2:
>   - Address the compile error by moving setjmp into a wrapper function.
>
>   lib/librte_eal/linuxapp/eal/eal.c        |  20 -
>   lib/librte_eal/linuxapp/eal/eal_memory.c | 138 ---
>   2 files changed, 125 insertions(+), 33 deletions(-)
>

Acked-by: Sergio Gonzalez Monroy 


[dpdk-dev] [PATCH v5] eal: fix allocating all free hugepages

2016-06-06 Thread Pei, Yulong
Tested-by: Yulong Pei 

1. Ran a DPDK app with multiple mount points; it works as expected.
2. Created a new cgroup with limited hugepages as shown below and ran a DPDK
app in the newly created cgroup; it works as expected.

# cgcreate -g hugetlb:/test-subgroup
# cgset -r hugetlb.1GB.limit_in_bytes=2147483648 test-subgroup
# cgexec -g hugetlb:test-subgroup ./x86_64-native-linuxapp-gcc/app/testpmd -c 0x3 -n 4 -- -i

Best Regards
Yulong Pei

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Jianfeng Tan
Sent: Tuesday, May 31, 2016 11:37 AM
To: dev at dpdk.org
Cc: Gonzalez Monroy, Sergio ; nhorman at 
tuxdriver.com; david.marchand at 6wind.com; thomas.monjalon at 6wind.com; Tan, 
Jianfeng 
Subject: [dpdk-dev] [PATCH v5] eal: fix allocating all free hugepages

EAL memory init allocates all free hugepages of the whole system, as seen
from sysfs, even when the application does not ask for so many.
When there is a limit on how many hugepages an application can use (such as
cgroup.hugetlb), or hugetlbfs is mounted with a size option (and the request
exceeds the quota of the fs), it fails to start even if enough hugepages are
allocated.

To fix the above issue, this patch:
 - Changes the logic to continue memory init and check whether the
   hugetlb requirement of the application can be met by the
   hugepages already allocated.
 - Adds a recovery mechanism to make sure each hugepage is
   allocated successfully: it relies on a memory access to fault in
   each hugepage and, if that fails with SIGBUS, recovers to the
   previously saved stack environment with siglongjmp().

For the case of CONFIG_RTE_EAL_SINGLE_FILE_SEGMENTS (enabled by default when
compiling the IVSHMEM target), it is indispensable to map all free hugepages
in the system. In this case, startup fails when allocation fails.

Test example:
  a. cgcreate -g hugetlb:/test-subgroup
  b. cgset -r hugetlb.1GB.limit_in_bytes=2147483648 test-subgroup
  c. cgexec -g hugetlb:test-subgroup \
  ./examples/helloworld/build/helloworld -c 0x2 -n 4


Fixes: af75078fece ("first public release")

Signed-off-by: Jianfeng Tan 
Acked-by: Neil Horman 
---
v5:
 - Make this method the default instead of using an option.
 - When SIGBUS is triggered in the case of RTE_EAL_SINGLE_FILE_SEGMENTS,
   just return an error.
 - Add the prefix "huge_" to the newly added functions and static variables.
 - Move the internal_config.memory assignment after the page allocations.
v4:
 - Change map_all_hugepages to return unsigned instead of int.
v3:
 - Reword the commit message to state that it fixes the hugetlbfs quota
   issue.
 - setjmp -> sigsetjmp.
 - Change the RTE_LOG level from ERR to DEBUG, as it does not mean an init
   error so far.
 - Fix the return value check of the second map_all_hugepages call.
v2:
 - Address the compile error by moving setjmp into a wrapper function.

 lib/librte_eal/linuxapp/eal/eal.c        |  20 -
 lib/librte_eal/linuxapp/eal/eal_memory.c | 138 ---
 2 files changed, 125 insertions(+), 33 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c 
b/lib/librte_eal/linuxapp/eal/eal.c
index 8aafd51..4a8dfbd 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -465,24 +465,6 @@ eal_parse_vfio_intr(const char *mode)
 	return -1;
 }

-static inline size_t
-eal_get_hugepage_mem_size(void)
-{
-	uint64_t size = 0;
-	unsigned i, j;
-
-	for (i = 0; i < internal_config.num_hugepage_sizes; i++) {
-		struct hugepage_info *hpi = &internal_config.hugepage_info[i];
-		if (hpi->hugedir != NULL) {
-			for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
-				size += hpi->hugepage_sz * hpi->num_pages[j];
-			}
-		}
-	}
-
-	return (size < SIZE_MAX) ? (size_t)(size) : SIZE_MAX;
-}
-
 /* Parse the arguments for --log-level only */
 static void
 eal_log_level_parse(int argc, char **argv)
@@ -766,8 +748,6 @@ rte_eal_init(int argc, char **argv)
 	if (internal_config.memory == 0 && internal_config.force_sockets == 0) {
 		if (internal_config.no_hugetlbfs)
 			internal_config.memory = MEMSIZE_IF_NO_HUGE_PAGE;
-		else
-			internal_config.memory = eal_get_hugepage_mem_size();
 	}

 	if (internal_config.vmware_tsc_map == 1) {
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 5b9132c..dc6f49b 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -80,6 +80,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 

 #include 
 #include 
@@ -309,6 +311,21 @@ get_virtual_area(size_t *size, size_t hugepage_sz)
 	return addr;
 }

+static sigjmp_buf huge_jmpenv;
+
+static void huge_sigbus_handler(int signo __rte_unused)
+{
+	siglongjmp(huge_jmpenv, 1);
+}
+
+/* Put setjmp

[dpdk-dev] [PATCH v5] eal: fix allocating all free hugepages

2016-05-31 Thread Jianfeng Tan
EAL memory init allocates all free hugepages of the whole system,
as seen from sysfs, even when the application does not ask for so
many. When there is a limit on how many hugepages an application
can use (such as cgroup.hugetlb), or hugetlbfs is mounted with a
size option (and the request exceeds the quota of the fs), it fails
to start even if enough hugepages are allocated.
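
For reference, the system-wide count mentioned above is what the kernel
exposes under sysfs. A minimal sketch of reading it, assuming the standard
Linux layout /sys/kernel/mm/hugepages/hugepages-<size>kB/free_hugepages.
The helper names read_counter_file() and free_2mb_hugepages() are
hypothetical and are not part of EAL:

```c
#include <stdio.h>

/* Read a single non-negative integer from a sysfs-style text file.
 * Returns the value, or -1 if the file cannot be opened or parsed.
 * (Hypothetical helper, for illustration only.) */
static long read_counter_file(const char *path)
{
	FILE *f = fopen(path, "r");
	long value;

	if (f == NULL)
		return -1;
	if (fscanf(f, "%ld", &value) != 1 || value < 0)
		value = -1;
	fclose(f);
	return value;
}

/* Free 2 MB hugepages of the whole system, as reported by the kernel.
 * This number ignores any cgroup limit or hugetlbfs quota that applies
 * to the calling process. */
static long free_2mb_hugepages(void)
{
	return read_counter_file(
		"/sys/kernel/mm/hugepages/hugepages-2048kB/free_hugepages");
}
```

A process confined by a hugetlb cgroup or a size-limited hugetlbfs mount
sees the same number here as an unconfined one, which is why it cannot be
taken as the amount the application is able to map.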

To fix the above issue, this patch:
 - Changes the logic to continue memory init and check whether the
   hugetlb requirement of the application can be met by the
   hugepages already allocated.
 - Adds a recovery mechanism to make sure each hugepage is
   allocated successfully: it relies on a memory access to fault in
   each hugepage and, if that fails with SIGBUS, recovers to the
   previously saved stack environment with siglongjmp().

For the case of CONFIG_RTE_EAL_SINGLE_FILE_SEGMENTS (enabled by
default when compiling the IVSHMEM target), it is indispensable to
map all free hugepages in the system. In this case, startup fails
when allocation fails.

Test example:
  a. cgcreate -g hugetlb:/test-subgroup
  b. cgset -r hugetlb.1GB.limit_in_bytes=2147483648 test-subgroup
  c. cgexec -g hugetlb:test-subgroup \
  ./examples/helloworld/build/helloworld -c 0x2 -n 4
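
As a quick check of the numbers in step b: 2147483648 bytes is 2 * 2^30,
so the cgroup may use at most two 1 GB hugepages. A small sketch of that
arithmetic (hugepages_within_limit() is a hypothetical helper, not part
of the patch):

```c
#include <stdint.h>

/* Number of whole hugepages of size hugepage_sz that fit under
 * limit_bytes. Returns 0 for a zero page size. */
static uint64_t hugepages_within_limit(uint64_t limit_bytes,
				       uint64_t hugepage_sz)
{
	return hugepage_sz == 0 ? 0 : limit_bytes / hugepage_sz;
}
```

For the limit above, hugepages_within_limit(2147483648ULL, 1ULL << 30)
evaluates to 2.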


Fixes: af75078fece ("first public release")

Signed-off-by: Jianfeng Tan 
Acked-by: Neil Horman 
---
v5:
 - Make this method the default instead of using an option.
 - When SIGBUS is triggered in the case of RTE_EAL_SINGLE_FILE_SEGMENTS,
   just return an error.
 - Add the prefix "huge_" to the newly added functions and static variables.
 - Move the internal_config.memory assignment after the page allocations.
v4:
 - Change map_all_hugepages to return unsigned instead of int.
v3:
 - Reword the commit message to state that it fixes the hugetlbfs quota
   issue.
 - setjmp -> sigsetjmp.
 - Change the RTE_LOG level from ERR to DEBUG, as it does not mean an init
   error so far.
 - Fix the return value check of the second map_all_hugepages call.
v2:
 - Address the compile error by moving setjmp into a wrapper function.

 lib/librte_eal/linuxapp/eal/eal.c        |  20 -
 lib/librte_eal/linuxapp/eal/eal_memory.c | 138 ---
 2 files changed, 125 insertions(+), 33 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c 
b/lib/librte_eal/linuxapp/eal/eal.c
index 8aafd51..4a8dfbd 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -465,24 +465,6 @@ eal_parse_vfio_intr(const char *mode)
 	return -1;
 }

-static inline size_t
-eal_get_hugepage_mem_size(void)
-{
-	uint64_t size = 0;
-	unsigned i, j;
-
-	for (i = 0; i < internal_config.num_hugepage_sizes; i++) {
-		struct hugepage_info *hpi = &internal_config.hugepage_info[i];
-		if (hpi->hugedir != NULL) {
-			for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
-				size += hpi->hugepage_sz * hpi->num_pages[j];
-			}
-		}
-	}
-
-	return (size < SIZE_MAX) ? (size_t)(size) : SIZE_MAX;
-}
-
 /* Parse the arguments for --log-level only */
 static void
 eal_log_level_parse(int argc, char **argv)
@@ -766,8 +748,6 @@ rte_eal_init(int argc, char **argv)
 	if (internal_config.memory == 0 && internal_config.force_sockets == 0) {
 		if (internal_config.no_hugetlbfs)
 			internal_config.memory = MEMSIZE_IF_NO_HUGE_PAGE;
-		else
-			internal_config.memory = eal_get_hugepage_mem_size();
 	}

 	if (internal_config.vmware_tsc_map == 1) {
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 5b9132c..dc6f49b 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -80,6 +80,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 

 #include 
 #include 
@@ -309,6 +311,21 @@ get_virtual_area(size_t *size, size_t hugepage_sz)
 	return addr;
 }

+static sigjmp_buf huge_jmpenv;
+
+static void huge_sigbus_handler(int signo __rte_unused)
+{
+	siglongjmp(huge_jmpenv, 1);
+}
+
+/* Put setjmp into a wrap method to avoid compiling error. Any non-volatile,
+ * non-static local variable in the stack frame calling sigsetjmp might be
+ * clobbered by a call to longjmp.
+ */
+static int huge_wrap_sigsetjmp(void)
+{
+	return sigsetjmp(huge_jmpenv, 1);
+}
 /*
  * Mmap all hugepages of hugepage table: it first open a file in
  * hugetlbfs, then mmap() hugepage_sz data in it. If orig is set, the
@@ -316,7 +333,7 @@ get_virtual_area(size_t *size, size_t hugepage_sz)
  * in hugepg_tbl[i].final_va. The second mapping (when orig is 0) tries to
  * map continguous physical blocks in contiguous virtual blocks.
  */
-static int
+static unsigned
 map_all_hugepages(struct hugepage_file *hugepg_tbl,
 		struct hugepage_info *hpi, int orig)
 {
@@ -394,9 +411,9 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl,
/* try