Roland Dreier Wrote:
> > ibv_reg_mr() fails to register a memory region allocated on huge page and
> > not the default page size. This happens because ibv_madvise_range() aligns
> > memory region to the default system page size before calling to madvise()
> > which fails with EINVAL error. madvise() fails because it expects that the
> > start and end pointer of the memory range be huge page aligned.
>
> Seems unfortunate. I wonder if there's a way the kernel madvise could
> help us here?
>
> > +/*
> > + * Get the kernel default huge page size.
> > + */
> > +static int get_huge_page_size()
> > +{
> > +	int fd;
> > +	char buf[MEMINFO_SIZE];
> > +	int mem_file_len;
> > +	char *p_hpage_val = NULL;
> > +	char *end_pointer = NULL;
> > +	char file_name[] = "/proc/meminfo";
> > +	const char label[] = "Hugepagesize:";
> > +	int ret_val = 0;
> > +
> > +	fd = open(file_name, O_RDONLY);
> > +	if (fd < 0)
> > +		return fd;
> > +
> > +	mem_file_len = read(fd, buf, sizeof(buf) - 1);
> > +
> > +	close(fd);
> > +	if (mem_file_len < 0)
> > +		return mem_file_len;
> > +
> > +	buf[mem_file_len] = '\0';
> > +
> > +	p_hpage_val = strstr(buf, label);
> > +	if (!p_hpage_val) {
> > +		errno = EINVAL;
> > +		return -1;
> > +	}
> > +	p_hpage_val += strlen(label);
> > +
> > +	errno = 0;
> > +	ret_val = strtol(p_hpage_val, &end_pointer, 0);
> > +
> > +	if (errno != 0)
> > +		return -1;
> > +
> > +	return ret_val * 1024;
> > +}
>
> This seems to duplicate but only partially a similar function from
> libhugetlbfs. Is there any way we can just use that directly? eg
> libhugetlbfs handles the case where there are multiple huge page sizes
> (and that exists even on mainstream x86 with 2MB and 1GB pages possible
> on the same system).
>
> - R.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
Hi Roland,
Once the patches that handle madvise() failure are applied (they were
submitted under the subject "libibverbs: Undo changes in memory range tree
when madvise() fails"), I would like to renew the discussion about this
patch. It depends on those patches, since it may cause madvise() to fail.
>This seems to duplicate but only partially a similar function from
>libhugetlbfs. Is there any way we can just use that directly? eg
>libhugetlbfs handles the case where there are multiple huge page sizes
>(and that exists even on mainstream x86 with 2MB and 1GB pages possible
>on the same system).
To avoid adding another dependency to libibverbs, maybe we should just
enhance get_huge_page_size() so that it supports multiple huge page sizes?
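
As a rough sketch of what that could look like (not a tested patch): instead
of reading only "Hugepagesize:" from /proc/meminfo, we could walk
/sys/kernel/mm/hugepages, where the kernel exposes one hugepages-<size>kB
directory per supported size. The helper names parse_hugepage_dirname() and
get_huge_page_sizes() are hypothetical and are not part of libibverbs or of
the patch above.

```c
#include <ctype.h>
#include <dirent.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a sysfs entry name such as "hugepages-2048kB" into a size in bytes.
 * Returns 0 if the name does not follow that pattern.
 */
static size_t parse_hugepage_dirname(const char *name)
{
	static const char prefix[] = "hugepages-";
	const char *num;
	char *end = NULL;
	unsigned long kb;

	if (strncmp(name, prefix, sizeof(prefix) - 1) != 0)
		return 0;

	num = name + sizeof(prefix) - 1;
	if (!isdigit((unsigned char)*num))
		return 0;

	errno = 0;
	kb = strtoul(num, &end, 10);
	if (errno != 0 || strcmp(end, "kB") != 0)
		return 0;

	return (size_t)kb * 1024;
}

/*
 * Hypothetical helper: store up to max huge page sizes (in bytes) in sizes[].
 * Returns the number of sizes found, or -1 (errno set by opendir()) if the
 * sysfs directory cannot be opened.
 */
static int get_huge_page_sizes(size_t *sizes, int max)
{
	DIR *dir = opendir("/sys/kernel/mm/hugepages");
	struct dirent *de;
	int n = 0;

	if (!dir)
		return -1;

	while (n < max && (de = readdir(dir)) != NULL) {
		size_t sz = parse_hugepage_dirname(de->d_name);

		if (sz)
			sizes[n++] = sz;
	}

	closedir(dir);
	return n;
}
```

ibv_madvise_range() would still need some way to decide which of the returned
sizes applies to a given range; the sketch only covers enumerating them.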
-Alex