Nishanth,

  This really helped a lot! I am going through the libhugetlbfs code
and will start porting once I understand the complete flow.

  The current stable version for MIPS Linux is 2.6.26, so I will
probably migrate to that.

Thanks,
Mehul.

There are only 10 kinds of people in the world... those who know binary
and those who do not.

--- On Tue, 12/30/08, Nishanth Aravamudan <n...@us.ibm.com> wrote:
From: Nishanth Aravamudan <n...@us.ibm.com>
Subject: Re: [Libhugetlbfs-devel] Lib hugetlbfs
To: "mehul vora" <mehu...@yahoo.com>
Cc: libhugetlbfs-devel@lists.sourceforge.net, a...@us.ibm.com
Date: Tuesday, December 30, 2008, 11:17 AM

On 29.12.2008 [21:53:03 -0800], mehul vora wrote:
> Nishanth,
> 
>   Thanks, I will probably migrate to 2.6.27 then.

Well, 2.6.28 is out -- I'd probably go with that, in case there are
kernel fixes since then. I'm not sure whether there were; you could
check the changelog.

>   Another doubt I had was: how does libhugetlbfs allocate hugepages
> for an application's data/text segments without any kernel component
> in the "fork/execv/elfload" path? Could you help me understand the
> flow?

Hugepage allocation for program segments does not involve the kernel for
anything other than the fault path.

Basically, for each segment you want to back with hugepages, copy the
data from the normal small-page-backed segment into a hugetlbfs file
(normally unlinked, so not visible to snooping applications). Then
unmap the program segments (this is the "black hole" part of the code,
where we no longer have a text segment to rely on, for instance) and
map the hugetlbfs files in their place.
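
To make that concrete, here is a rough sketch of the sequence in C.
This is not the actual libhugetlbfs code: the mount point, the
remap_segment() helper, and the RWX protections are illustrative
assumptions, and seg_len is assumed to already be rounded up to a
hugepage multiple.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Copy an existing segment into an unlinked hugetlbfs file, then map
 * the file back over the segment's address range. */
static void *remap_segment(void *seg_addr, size_t seg_len)
{
	/* Hypothetical mount point; real code discovers it at runtime. */
	char path[] = "/mnt/hugetlbfs/seg.XXXXXX";
	int fd = mkstemp(path);
	if (fd < 0) { perror("mkstemp"); return NULL; }
	unlink(path);	/* file stays invisible to snooping applications */

	/* Stage the segment contents through a temporary mapping of the
	 * file (hugetlbfs does not support write(2)). */
	void *tmp = mmap(NULL, seg_len, PROT_READ | PROT_WRITE,
			 MAP_SHARED, fd, 0);
	if (tmp == MAP_FAILED) { perror("mmap"); close(fd); return NULL; }
	memcpy(tmp, seg_addr, seg_len);
	munmap(tmp, seg_len);

	/* The "black hole" step: MAP_FIXED atomically unmaps the
	 * small-page segment and puts the hugetlbfs mapping in its
	 * place.  Real code would apply the segment's own protections
	 * rather than RWX. */
	void *huge = mmap(seg_addr, seg_len,
			  PROT_READ | PROT_WRITE | PROT_EXEC,
			  MAP_SHARED | MAP_FIXED, fd, 0);
	close(fd);
	return (huge == MAP_FAILED) ? NULL : huge;
}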

On fork, the kernel will just perform COW on the hugepages (I think)
when faults are taken. On exec, you are rerunning the original program,
so the steps above are done again.

For a very heavy fork/exec workload with locality, it might make sense
to specify HUGETLB_SHARE=1 (hugectl --share, I think) to share the
hugetlbfs file that contains the text segments. That should reduce copy
times (to zero for all but the first process), fault times (because
every process has it mapped read-only as text) and hugepage consumption
(as there will only be one copy of the text segment in hugepages, no
matter how many instances of the process are running, as long as they
all specify SHARE=1). Experimenting with that will show the benefit for
your application, of course.
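
For illustration, a launcher that enables sharing for a child process
might look like this minimal C sketch (./worker is a placeholder
binary; HUGETLB_SHARE is the variable mentioned above):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
	/* Make HUGETLB_SHARE=1 visible to the exec'd program, where
	 * the libhugetlbfs preload will read it. */
	if (setenv("HUGETLB_SHARE", "1", 1) != 0) {
		perror("setenv");
		return 1;
	}
	execl("./worker", "worker", (char *)NULL);
	perror("execl");	/* reached only if exec fails */
	return 1;
}

Every process started this way sees HUGETLB_SHARE=1 in its environment,
so they should all end up sharing the single hugepage copy of the text.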

>   I checked the "hugeedit" file; it modifies the program header of
> the text/data segments and sets the "PF_LINUX_HUGETLB" bit in
> p_flags. But who reads this in the end? Does the kernel architecture
> port of hugetlbfs have to read this field and allocate a page
> accordingly?

hugeedit (I think this is covered in the HOWTO, but I'm not sure) is
only used to change the default behavior for applications. With
old-style relinking (using linker scripts), the default was to remap
any appropriately marked segments into hugepages. But with new-style
relinking there is no marking of segments, so an environment variable
would need to be specified, or the program run under hugectl (I think),
for hugepages to be used to back the segments. hugeedit marks the
segments we want to remap as "remap by default" by setting that
PF_LINUX_HUGETLB flag in the binary; it can also unset it. The flag is
only used by libhugetlbfs, not the kernel.
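
If you want to see the marking for yourself, something like the
following would walk the program headers (a sketch: 64-bit ELF only,
and the 0x100000 value for PF_LINUX_HUGETLB is quoted from memory, so
check the libhugetlbfs sources):

#include <elf.h>
#include <stdio.h>

/* Assumed value; the real definition lives in libhugetlbfs. */
#define PF_LINUX_HUGETLB 0x100000

int main(int argc, char **argv)
{
	if (argc != 2) {
		fprintf(stderr, "usage: %s <elf-binary>\n", argv[0]);
		return 1;
	}
	FILE *f = fopen(argv[1], "rb");
	if (!f) { perror("fopen"); return 1; }

	Elf64_Ehdr eh;
	if (fread(&eh, sizeof(eh), 1, f) != 1) { perror("fread"); return 1; }

	for (int i = 0; i < eh.e_phnum; i++) {
		Elf64_Phdr ph;
		fseek(f, eh.e_phoff + (long)i * eh.e_phentsize, SEEK_SET);
		if (fread(&ph, sizeof(ph), 1, f) != 1)
			break;
		if (ph.p_type == PT_LOAD)
			printf("PT_LOAD at 0x%lx: %s\n",
			       (unsigned long)ph.p_vaddr,
			       (ph.p_flags & PF_LINUX_HUGETLB) ?
					"remap by default" : "not marked");
	}
	fclose(f);
	return 0;
}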

Thanks,
Nish

-- 
Nishanth Aravamudan <n...@us.ibm.com>
IBM Linux Technology Center


