Hi Greg,

Upon further reflection (and after running your script), I am
puzzled that you are getting anywhere near 1000 segvn_fault
calls for the cp.  When I run your script and cp a 1000-page file,
I get about 128 segvn_fault calls for the cp.
That is a much more reasonable number, since cp
mmaps the file, and the file system code
should be faulting in multiple pages at a time.
For instance, if the file system brings in 56k (14 pages
on my amd64 box) and my disk is reasonably fast,
then by the time cp gets a bit into the first 56k, I suspect
that all of the data is in memory and there is no
trapping into the kernel at all until the next 56k
needs to be read in.  (I am assuming the hat
layer sets up PTEs as the pages are brought in,
not as cp accesses them.)
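
One rough way to check that assumption is to count the segvn_fault
calls cp makes alongside the HAT calls that load translations. This
is an untested sketch: I am assuming hat_memload and hat_memload_array
are the relevant entry points and that both are visible as fbt probes
on your build, and the script name is made up.

-----hatload.d-----
#!/usr/sbin/dtrace -s

#pragma D option quiet

/* Faults cp takes on segvn segments. */
fbt::segvn_fault:entry
/execname == "cp"/
{
        @calls["segvn_fault"] = count();
}

/*
 * Assumption: these two HAT functions are where translations get
 * loaded. Calls made outside cp's context are not counted.
 */
fbt::hat_memload:entry,
fbt::hat_memload_array:entry
/execname == "cp"/
{
        @calls[probefunc] = count();
}
-----hatload.d-----

The aggregation is printed when you hit Ctrl-C. If the load counts
come out well above the fault count, mappings are being set up for
more pages than cp actually trapped on. Note that one
hat_memload_array call can cover several pages, so the call counts
are only a lower bound on pages mapped.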

What file system is your file in, and what hardware
are you running on?

max

Quoting Brendan Gregg <[EMAIL PROTECTED]>:

G'Day Folks,

I'm revisiting segvn activity analysis to see its hit rate from the page
cache. A while ago (before I had source code access) I tried writing this
from Kstat. Now with DTrace and source I want to nail a solution, and
then go back and rework the Kstat version.

However I've run into a problem - I'm hoping someone with experience in
this corner of the kernel can suggest what I may be doing wrong. (It's
quite possible I've assumed the wrong thing)...

I want segvn hit rate. There must be a function in the segvn code
somewhere that checks whether a page is already in the page cache or not,
which would be ideal for DTrace to trace. I would expect this to be
related to when a page fault occurs from a segvn mapping.

I've looked at a number of functions, including the following,

        segvn_fault
        page_exists
        page_lookup
        page_reclaim
        ufs_getpage
        ufs_getpage_ra
        seg_plookup

To test them I'm running cp on a large file (1000 pages). I would
expect that something would be called once for every page fault, so 1000
times, before we determine whether the page is in the cache or not
(assuming cp isn't using MPSS).

segvn_fault looks promising but is called between 850 and 1000 times.
Since this is before we go to the page cache, I was expecting to always
see 1000.

All of the functions I've tried trace 95% of the expected segvn activity,
but never reliably 100%. I'm missing segvn_fault()s somehow. Or
page_exists(), or page_lookup()s, or whatever I try.

I currently suspect that read ahead is interfering with my expected
activity - populating mappings before they have faulted... Which sounds a
little far fetched, so I thought I'd better post about this.
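
A quick way to see whether read ahead is active during the cp,
sketched on the assumption that the file lives on UFS, is to count
the two UFS getpage functions from the list above per process. The
script name here is hypothetical and the script is untested.

-----ufsra.d-----
#!/usr/sbin/dtrace -s

#pragma D option quiet

/* Count UFS getpage and read-ahead calls, keyed by process name. */
fbt::ufs_getpage:entry,
fbt::ufs_getpage_ra:entry
{
        @calls[execname, probefunc] = count();
}

dtrace:::END
{
        printf("%-16s %20s %8s\n", "CMD", "FUNCTION", "COUNT");
        printa("%-16s %20s %@8d\n", @calls);
}
-----ufsra.d-----

A non-zero ufs_getpage_ra count for cp would at least confirm that
read ahead is being triggered while the file is copied. It would not
by itself show that read ahead is what populates the mappings.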


The following output is where I'm at,

# ./segvn.d
Sampling...
^C
segvn_fault
-----------
CMD                                                  FILE    COUNT
bash                                      /lib/libdl.so.1        1
bash                                  /lib/libsocket.so.1        1
bash                                  /lib/libcurses.so.1        2
bash                                          /usr/bin/ln        2
dtrace                            /usr/lib/libdtrace.so.1        3
bash                                     /lib/libnsl.so.1        7
bash                                         /lib/ld.so.1        9
bash                                      /./usr/bin/bash       24
bash                       /usr/lib/libc/libc_hwcap1.so.1       24
cp                         /usr/lib/libc/libc_hwcap1.so.1       38
cp                                           /extra1/1000      992

CMD                                                  FILE        BYTES
bash                                      /lib/libdl.so.1         4096
bash                                  /lib/libsocket.so.1         4096
bash                                  /lib/libcurses.so.1         8192
dtrace                            /usr/lib/libdtrace.so.1        12288
bash                                          /usr/bin/ln        28672
bash                                     /lib/libnsl.so.1        28672
bash                       /usr/lib/libc/libc_hwcap1.so.1        98304
bash                                      /./usr/bin/bash        98304
cp                         /usr/lib/libc/libc_hwcap1.so.1       155648
bash                                         /lib/ld.so.1       163840
cp                                           /extra1/1000      4063232

So cp has faulted on 992 pages of the "/extra1/1000" file, but not the
full 1000. The byte count agrees: 4063232 bytes is 992 pages of 4096
bytes (this is x86).
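
Dividing the two columns gives 4063232 / 992 = 4096 bytes per call,
that is, one page per segvn_fault. A variant of segvn.d (listed
below) that reports the average directly might look like this. It is
an untested sketch, and the name segvnavg.d is made up.

-----segvnavg.d-----
#!/usr/sbin/dtrace -s

#pragma D option quiet

/* Only segments backed by a vnode, as in segvn.d. */
fbt::segvn_fault:entry
/((struct segvn_data *)args[1]->s_data)->vp != NULL/
{
        this->vp = (struct vnode *)((struct segvn_data *)args[1]->s_data)->vp;
        /* args[3] is the len argument: bytes covered by this fault. */
        @avglen[execname, stringof(this->vp->v_path)] = avg(args[3]);
}

dtrace:::END
{
        printf("%-16s %40s %12s\n", "CMD", "FILE", "AVG BYTES");
        printa("%-16s %40s %@12d\n", @avglen);
}
-----segvnavg.d-----

An average above the page size for a file would mean some faults
were requested for more than one page at a time.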


The script is,

-----segvn.d-----
#!/usr/sbin/dtrace -s

#pragma D option quiet

dtrace:::BEGIN
{
        trace("Sampling...\n");
}

/* Only count faults on segments that are backed by a vnode (file). */
fbt::segvn_fault:entry
/((struct segvn_data *)args[1]->s_data)->vp != NULL/
{
        self->vn =  (struct vnode *)((struct segvn_data *)args[1]->s_data)->vp;
        /* args[3] is the len argument: bytes covered by this fault. */
        @faults[execname, stringof(self->vn->v_path)] = count();
        @bytes[execname, stringof(self->vn->v_path)] = sum(args[3]);
}

dtrace:::END
{
        printf("segvn_fault\n-----------\n");
        printf("%-16s %40s %8s\n", "CMD", "FILE", "COUNT");
        printa("%-16s %40s %@8d\n", @faults);
        printf("\n%-16s %40s %12s\n", "CMD", "FILE", "BYTES");
        printa("%-16s %40s %@12d\n", @bytes);
}
-----segvn.d-----


Suggestions are most appreciated... I'm still reading through source...

thanks,

Brendan

[Sydney, Australia]

_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org



