[dpdk-dev] [PATCH v3] eal: make hugetlb initialization more robust

2016-05-10 Thread Sergio Gonzalez Monroy

Hi Jianfeng,

On 09/05/2016 11:48, Jianfeng Tan wrote:

>   /* find physical addresses and sockets for each hugepage */
> @@ -1172,8 +1255,9 @@ rte_eal_hugepage_init(void)
>   hp_offset += new_pages_count[i];
>   #else
>   /* remap all hugepages */
> - if (map_all_hugepages(&tmp_hp[hp_offset], hpi, 0) < 0){
> - RTE_LOG(DEBUG, EAL, "Failed to remap %u MB pages\n",
> + if ((uint32_t)map_all_hugepages(&tmp_hp[hp_offset], hpi, 0) !=
> + hpi->num_pages[0]) {

It probably makes more sense to have map_all_hugepages return uint32_t 
instead.
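
To illustrate the suggestion, here is a minimal sketch of a mapping helper that returns the number of pages it mapped as a uint32_t, so the caller can compare it against the requested count without a cast. The names map_pages_sketch() and remap_ok() and the `available` parameter are hypothetical stand-ins for illustration, not the real map_all_hugepages() signature.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for map_all_hugepages(): tries to map `wanted`
 * pages and reports how many were actually mapped. `available` models
 * the number of pages the system can really provide.
 */
static uint32_t
map_pages_sketch(uint32_t wanted, uint32_t available)
{
	uint32_t mapped = 0;

	while (mapped < wanted && mapped < available)
		mapped++;	/* real code would mmap() and touch a page here */

	return mapped;
}

/* Caller-side check: both sides are uint32_t, so no cast is needed. */
static int
remap_ok(uint32_t wanted, uint32_t available)
{
	return map_pages_sketch(wanted, available) == wanted;
}
```

With an unsigned count as the return type, a short mapping is detected by comparing against the requested number of pages rather than by testing for a negative value.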

Sergio



[dpdk-dev] [PATCH v3] eal: make hugetlb initialization more robust

2016-05-10 Thread Tan, Jianfeng
Hi Sergio,

> -----Original Message-----
> From: Gonzalez Monroy, Sergio
> Sent: Tuesday, May 10, 2016 4:55 PM
> To: Tan, Jianfeng; dev at dpdk.org
> Cc: david.marchand at 6wind.com; nhorman at tuxdriver.com
> Subject: Re: [PATCH v3] eal: make hugetlb initialization more robust
> 
> 
> Hi Jianfeng,
> 
> On 09/05/2016 11:48, Jianfeng Tan wrote:
> 
> > /* find physical addresses and sockets for each hugepage */
> > @@ -1172,8 +1255,9 @@ rte_eal_hugepage_init(void)
> > hp_offset += new_pages_count[i];
> >   #else
> > /* remap all hugepages */
> > -   if (map_all_hugepages(&tmp_hp[hp_offset], hpi, 0) < 0){
> > -   RTE_LOG(DEBUG, EAL, "Failed to remap %u MB pages\n",
> > +   if ((uint32_t)map_all_hugepages(&tmp_hp[hp_offset], hpi, 0) !=
> > +   hpi->num_pages[0]) {
> 
> It probably makes more sense to have map_all_hugepages return uint32_t
> instead.

Yes, I agree. I wrongly assumed there was a FreeBSD version of
map_all_hugepages with the same function signature.

I'll fix this in the next version.

Thanks,
Jianfeng

> 
> Sergio



[dpdk-dev] [PATCH v3] eal: make hugetlb initialization more robust

2016-05-09 Thread Jianfeng Tan
This patch adds an option, --huge-trybest, that enables a recovery
mechanism for the case where fewer hugepages can actually be used than are
declared in sysfs. It relies on a memory access to fault in each hugepage;
if that access fails with SIGBUS, it recovers to the previously saved stack
environment with siglongjmp().
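
The recovery idea can be sketched outside of DPDK as follows. This is an illustration under stated assumptions, not the patch's code: instead of a hugetlbfs mount it maps a temporary file that is shorter than the mapping, so touching a page beyond the end of the file raises SIGBUS on Linux in much the same way as touching a hugepage that cannot be backed. The names count_touchable_pages(), sigbus_handler() and fault_env are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

static sigjmp_buf fault_env;	/* context saved before touching pages */

static void
sigbus_handler(int signo)
{
	(void)signo;
	siglongjmp(fault_env, 1);	/* jump back to the sigsetjmp() call */
}

/*
 * Map `map_pages` pages of a temporary file that is only `file_pages`
 * long, touch them one by one, and return how many could be faulted in.
 * Returns -1 on setup failure.
 */
static int
count_touchable_pages(int file_pages, int map_pages)
{
	long psz = sysconf(_SC_PAGESIZE);
	struct sigaction sa, old_sa;
	volatile int touched = 0;	/* volatile: must survive siglongjmp() */
	volatile char *va;
	FILE *fp = tmpfile();

	if (fp == NULL)
		return -1;
	if (ftruncate(fileno(fp), (off_t)file_pages * psz) < 0) {
		fclose(fp);
		return -1;
	}
	va = mmap(NULL, (size_t)map_pages * psz, PROT_READ | PROT_WRITE,
		  MAP_SHARED, fileno(fp), 0);
	if (va == MAP_FAILED) {
		fclose(fp);
		return -1;
	}

	sa.sa_handler = sigbus_handler;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;
	sigaction(SIGBUS, &sa, &old_sa);

	/* savemask = 1 so the signal mask is restored after siglongjmp() */
	if (sigsetjmp(fault_env, 1) == 0) {
		for (; touched < map_pages; touched++)
			va[(size_t)touched * psz] = 0;	/* may raise SIGBUS */
	}
	/* on the siglongjmp() path, `touched` holds the last good count */

	sigaction(SIGBUS, &old_sa, NULL);
	munmap((void *)va, (size_t)map_pages * psz);
	fclose(fp);
	return touched;
}
```

For example, count_touchable_pages(2, 4) maps four pages over a two-page file and is expected to report two usable pages on Linux. The counter is declared volatile because it is modified between sigsetjmp() and siglongjmp(); without that, its value after the jump would be indeterminate.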

In addition, this solution fixes an issue that occurs when hugetlbfs is
mounted with a size option. Currently DPDK does not respect the quota of a
hugetlbfs mount. It fails to init the EAL because it tries to map the
number of free hugepages in the system rather than the number specified in
the quota for that mount.

This remains an open issue with CONFIG_RTE_EAL_SINGLE_FILE_SEGMENTS. In
that configuration (for example, the IVSHMEM target), remapping hugepages
fails on hugetlbfs mounts with a quota, because the remap relies on having
mapped all free hugepages in the system.

Test example:
  a. cgcreate -g hugetlb:/test-subgroup
  b. cgset -r hugetlb.1GB.limit_in_bytes=2147483648 test-subgroup
  c. cgexec -g hugetlb:test-subgroup \
  ./examples/helloworld/build/helloworld -c 0x2 -n 4 --huge-trybest

Signed-off-by: Jianfeng Tan 
Acked-by: Neil Horman 
---
v3:
 - Reword commit message to mention that it fixes the hugetlbfs quota issue.
 - setjmp -> sigsetjmp.
 - Change RTE_LOG level from ERR to DEBUG, as the message does not indicate
   an init error at that point.
 - Fix the second map_all_hugepages's return value check.
v2:
 - Address the compile error by moving setjmp into a wrapper function.

 lib/librte_eal/common/eal_common_options.c |   4 +
 lib/librte_eal/common/eal_internal_cfg.h   |   1 +
 lib/librte_eal/common/eal_options.h        |   2 +
 lib/librte_eal/linuxapp/eal/eal.c          |   1 +
 lib/librte_eal/linuxapp/eal/eal_memory.c   | 115 +
 5 files changed, 110 insertions(+), 13 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 3efc90f..e9a111d 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -95,6 +95,7 @@ eal_long_options[] = {
{OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
{OPT_VMWARE_TSC_MAP,0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
{OPT_XEN_DOM0,  0, NULL, OPT_XEN_DOM0_NUM },
+   {OPT_HUGE_TRYBEST,  0, NULL, OPT_HUGE_TRYBEST_NUM },
{0, 0, NULL, 0}
 };

@@ -899,6 +900,9 @@ eal_parse_common_option(int opt, const char *optarg,
return -1;
}
break;
+   case OPT_HUGE_TRYBEST_NUM:
+   internal_config.huge_trybest = 1;
+   break;

/* don't know what to do, leave this to caller */
default:
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index 5f1367e..90a3533 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -64,6 +64,7 @@ struct internal_config {
volatile unsigned force_nchannel; /**< force number of channels */
volatile unsigned force_nrank;/**< force number of ranks */
volatile unsigned no_hugetlbfs;   /**< true to disable hugetlbfs */
+   volatile unsigned huge_trybest;   /**< try best to allocate hugepages */
unsigned hugepage_unlink; /**< true to unlink backing files */
volatile unsigned xen_dom0_support; /**< support app running on Xen Dom0*/
volatile unsigned no_pci; /**< true to disable PCI */
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index a881c62..02397c5 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -83,6 +83,8 @@ enum {
OPT_VMWARE_TSC_MAP_NUM,
 #define OPT_XEN_DOM0  "xen-dom0"
OPT_XEN_DOM0_NUM,
+#define OPT_HUGE_TRYBEST  "huge-trybest"
+   OPT_HUGE_TRYBEST_NUM,
OPT_LONG_MAX_NUM
 };

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 8aafd51..eeb1d4e 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -343,6 +343,7 @@ eal_usage(const char *prgname)
   "  --"OPT_CREATE_UIO_DEV"Create /dev/uioX (usually done by hotplug)\n"
   "  --"OPT_VFIO_INTR" Interrupt mode for VFIO (legacy|msi|msix)\n"
   "  --"OPT_XEN_DOM0"  Support running on Xen dom0 without hugetlbfs\n"
+  "  --"OPT_HUGE_TRYBEST"  Try best to accommodate hugepages\n"
   "\n");
/* Allow the application to print its usage message too if hook is set */
if ( rte_application_usage_hook ) {
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 5b9132c..cb0df76 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -80,6 +80,8 @@
 #include 
 #include