On 07.02.2007 [14:27:10 -0600], Adam Litke wrote:
> On Wed, 2007-02-07 at 11:37 -0800, Nishanth Aravamudan wrote:
> > @@ -831,6 +832,57 @@ static void remap_segments(struct seg_info *seg, int num)
> >     /* The segments are all back at this point.
> >      * and it should be safe to reference static data
> >      */
> > +
> > +   /*
> > +    * This pagecache dropping code should not be used for shared
> > +    * segments.  But we currently only share read-only segments, so
> > +    * the below check for PROT_WRITE is implicitly sufficient.
> > +    *
> > +    * Note: if minimal_copy is enabled, it is overkill to try and
> > +    * save huge pages here, as we will end up using more than
> > +    * normal anyways. Also, due to limitations on certain
> > +    * architectures, we would need to avoid prefaulting in the
> > +    * extracopy area so as to not use an inordinate number of huge
> > +    * pages.
> > +    */
> > +   if (minimal_copy) {
> 
> Now the added indent is really making me want to say this is getting
> large enough to want its own function.  I think the previous patch to
> store the extracopy info in the seg array allows you to pass only one
> param (the seg array) into the drop_cache() function.

heh, yep, updated again:

Author: Nishanth Aravamudan <[EMAIL PROTECTED]>
Date:   Mon Feb 5 14:22:02 2007 -0800

elflink: drop hugepage cached pages for writable segments

We currently use extra hugepages for writable segments because of the
first MAP_SHARED mmap(), which stays resident in the page cache.  Use a
helper function to force a COW for the filesz portion, as well as the
extracopy area, of writable segments, and then fadvise() to drop the
page cache pages while keeping our PRIVATE mapping.  This is mutually
exclusive with segment sharing, but the code stays orthogonal because
we only allow sharing of read-only segments.  Also, if the minimal_copy
algorithm is disabled, we disable this optimization.

Signed-off-by: Nishanth Aravamudan <[EMAIL PROTECTED]>
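
To make the mechanism easier to follow outside the diff, here is a
minimal standalone sketch of the same touch-then-fadvise() sequence.
cow_and_drop() and its parameters are made up for illustration; it
assumes fd is a file on a hugetlbfs mount, mapped MAP_PRIVATE with
PROT_READ|PROT_WRITE at addr:

#define _XOPEN_SOURCE 600       /* for posix_fadvise() */
#include <fcntl.h>
#include <string.h>

static void cow_and_drop(int fd, char *addr, size_t len, long hpage_size)
{
        char *p, c;

        /*
         * Read then write one byte per hugepage: the write takes a
         * COW fault, so this mapping stops sharing pages with the
         * pagecache.
         */
        for (p = addr; p < addr + len; p += hpage_size) {
                memcpy(&c, p, 1);
                memcpy(p, &c, 1);
        }

        /*
         * Ask the kernel to drop the pagecache copy; if this fails,
         * we simply keep using the extra hugepages.
         */
        posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}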

diff --git a/elflink.c b/elflink.c
index 780d87c..9e4f6a0 100644
--- a/elflink.c
+++ b/elflink.c
@@ -788,6 +788,40 @@ static int obtain_prepared_file(struct seg_info *htlb_seg_info)
        return 0;
 }
 
+static void drop_cache(struct seg_info seg)
+{
+       long hpage_size = gethugepagesize();
+       void *p;
+       char c;
+
+       /*
+        * take a COW fault on each hugepage in
+        * the segment's file data ...
+        */
+       for (p = seg.vaddr; p < seg.vaddr + seg.filesz;
+                                               p += hpage_size) {
+               memcpy(&c, p, 1);
+               memcpy(p, &c, 1);
+       }
+       /*
+        * ... as well as on each hugepage in
+        * the extracopy area, if the segment
+        * has one
+        */
+       for (p = seg.extra_vaddr; p && p < seg.extra_vaddr +
+                                       seg.extrasz; p += hpage_size) {
+               memcpy(&c, p, 1);
+               memcpy(p, &c, 1);
+       }
+       /*
+        * Note: fadvise() failing is not
+        * actually an error, as we'll just use
+        * an extra set of hugepages (in the
+        * pagecache).
+        */
+       posix_fadvise(seg.fd, 0, 0, POSIX_FADV_DONTNEED);
+}
+
 static void remap_segments(struct seg_info *seg, int num)
 {
        long hpage_size = gethugepagesize();
@@ -831,6 +865,23 @@ static void remap_segments(struct seg_info *seg, int num)
        /* The segments are all back at this point.
         * and it should be safe to reference static data
         */
+
+       /*
+        * This pagecache dropping code should not be used for shared segments.
+        * But we currently only share read-only segments, so the below check
+        * for PROT_WRITE is implicitly sufficient.
+        *
+        * Note: if minimal_copy is disabled, it is overkill to try to
+        * save huge pages here by dropping them out of the cache, as we
+        * will end up using more than normal anyway. Also, due to
+        * limitations on certain architectures, we would need to avoid
+        * prefaulting in the extracopy area so as not to use an
+        * inordinate number of huge pages.
+        */
+       if (minimal_copy)
+               for (i = 0; i < num; i++)
+                       if (seg[i].prot & PROT_WRITE)
+                               drop_cache(seg[i]);
 }
 
 static int check_env(void)
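
FWIW, a rough way to check that the drop actually released the
pagecache copies is to compare HugePages_Free from /proc/meminfo before
and after remap_segments() returns. hugepages_free() below is a
hypothetical test helper, not part of libhugetlbfs:

#include <stdio.h>

/* Returns HugePages_Free from /proc/meminfo, or -1 on error. */
static long hugepages_free(void)
{
        FILE *f = fopen("/proc/meminfo", "r");
        char line[128];
        long val = -1;

        if (!f)
                return -1;
        while (fgets(line, sizeof(line), f)) {
                if (sscanf(line, "HugePages_Free: %ld", &val) == 1)
                        break;
        }
        fclose(f);
        return val;
}

Sampled before and after, the difference should roughly match the
number of pagecache hugepages the writable segments had been holding.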

-- 
Nishanth Aravamudan <[EMAIL PROTECTED]>
IBM Linux Technology Center
