On 08/03/2016 01:42, Jianfeng Tan wrote: > This patch adds an option, --huge-trybest, to use a recover mechanism to > the case that there are not so many hugepages (declared in sysfs), which > can be used. It relys on a mem access to fault-in hugepages, and if fails > with SIGBUS, recover to previously saved stack environment with > siglongjmp(). > > Test example: > a. cgcreate -g hugetlb:/test-subgroup > b. cgset -r hugetlb.1GB.limit_in_bytes=2147483648 test-subgroup > c. cgexec -g hugetlb:test-subgroup \ > ./examples/helloworld/build/helloworld -c 0x2 -n 4 --huge-trybest
I think you should mention in the commit message that this option also covers the case of hugetlbfs mount with quota. > > +static sigjmp_buf jmpenv; > + > +static void sigbus_handler(int signo __rte_unused) > +{ > + siglongjmp(jmpenv, 1); > +} > + > +/* Put setjmp into a wrap method to avoid compiling error. Any non-volatile, > + * non-static local variable in the stack frame calling setjmp might be > + * clobbered by a call to longjmp. > + */ > +static int wrap_setjmp(void) > +{ > + return setjmp(jmpenv); > +} Use sigsetjmp instead of setjmp and restore the signal masks. > /* > * Mmap all hugepages of hugepage table: it first open a file in > * hugetlbfs, then mmap() hugepage_sz data in it. If orig is set, the > @@ -396,7 +413,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, > if (fd < 0) { > RTE_LOG(ERR, EAL, "%s(): open failed: %s\n", __func__, > strerror(errno)); > - return -1; > + return i; When using --try-best, we could get an error and still work as expected. It can be confusing for users to see an error when it is expected behavior. Any thoughts? > } > > /* map the segment, and populate page tables, > @@ -407,7 +424,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, > RTE_LOG(ERR, EAL, "%s(): mmap failed: %s\n", __func__, > strerror(errno)); > close(fd); > - return -1; > + return i; > } > Same comment as above > /* set shared flock on the file. */ > if (flock(fd, LOCK_SH | LOCK_NB) == -1) { > RTE_LOG(ERR, EAL, "%s(): Locking file failed:%s \n", > __func__, strerror(errno)); > close(fd); > - return -1; > + return i; Same comment as above > @@ -1137,10 +1206,24 @@ rte_eal_hugepage_init(void) > continue; > > /* map all hugepages available */ > - if (map_all_hugepages(&tmp_hp[hp_offset], hpi, 1) < 0){ > - RTE_LOG(DEBUG, EAL, "Failed to mmap %u MB hugepages\n", > - (unsigned)(hpi->hugepage_sz / > 0x100000)); > - goto fail; > + pages_old = hpi->num_pages[0]; > + pages_new = map_all_hugepages(&tmp_hp[hp_offset], hpi, 1); > + if (pages_new < pages_old) { > + RTE_LOG(DEBUG, EAL, > + "%d not %d hugepages of size %u MB allocated\n", > + pages_new, pages_old, > + (unsigned)(hpi->hugepage_sz / 0x100000)); > + if (internal_config.huge_trybest) { > + int pages = pages_old - pages_new; > + > + internal_config.memory -= > + hpi->hugepage_sz * pages; > + nr_hugepages -= pages; > + hpi->num_pages[0] = pages_new; > + if (pages_new == 0) > + continue; > + } else > + goto fail; > } There is another call to map_all_hugepages that you are not updating the check of the return value. Sergio