>> The first set of fetches starts at 11:38:07.890567 and continues until
>> 11:41:08.004459.  That's 180.113892 seconds.
>
>This just means that that is how long the entire write of the file took,
>does it not? What is the accumulated time across all of the FetchData RPCs?

I didn't calculate that, actually.

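For what it's worth, the accumulated time could be pulled out of the trace
fairly easily.  Here is a rough sketch (Python, purely for illustration) that
assumes the tcpdump rx output format quoted below and is meant to be run
against the raw trace rather than the quoted excerpt: it pairs each
"fs call fetch-data" line with the next "fs reply fetch-data" line and sums
the call-to-reply deltas.

import sys
from datetime import datetime

def ts(line):
    # The first field of each trace line is an HH:MM:SS.microseconds timestamp.
    return datetime.strptime(line.split()[0], "%H:%M:%S.%f")

total = 0.0
pending = None   # timestamp of the last fetch-data call awaiting a reply
for line in sys.stdin:
    if "fs call fetch-data" in line:
        pending = ts(line)
    elif "fs reply fetch-data" in line and pending is not None:
        total += (ts(line) - pending).total_seconds()
        pending = None

print("accumulated fetch-data RPC time: %.6f seconds" % total)
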
>11:41:07.986469 janeway.afscb > q.afsfs: rx data fs call fetch-data fid
>536872319/16262/19711 offset 409468928 length 16384 (52)
>11:41:07.987155 q.afsfs > janeway.afscb: rx data fs reply fetch-data (152)
>11:41:08.003740 janeway.afscb > q.afsfs: rx data fs call fetch-data fid
>536872319/16262/19711 offset 409534464 length 16384 (52)
>11:41:08.004459 q.afsfs > janeway.afscb: rx data fs reply fetch-data (152)
>11:41:08.746114 janeway.afscb > q.afsfs: rx ack (65)
>
>These are two of the earlier FetchData RPCs. The first takes .000686 seconds
>and the next takes .000719 seconds, so for the sake of argument say each
>takes .001 seconds. With a 64K chunk size and a 700Meg file, a bogus fetch
>of every chunk would be 11,200 bogus fetches. At .001 seconds each, that is
>11.2 seconds of FetchData RPCs.

Well, that's probably close to accurate.  However, the reason I calculated
the start-to-end time of the whole set of FetchData RPCs was that, in my
way of thinking, if you omitted the FetchData RPCs completely you'd gain
back the entire time you spent doing them.  It's entirely possible that my
thinking is wrong and the client would still be doing useful work during
that time anyway.
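
Just to put the two numbers side by side, here is that back-of-the-envelope
arithmetic written out (again, just a sketch); the file size, chunk size,
and per-RPC cost are the figures quoted above, and the span is computed from
the two timestamps at the top of this message:

from datetime import datetime

fmt = "%H:%M:%S.%f"
# Start and end of the whole FetchData sequence, from the trace above.
span = (datetime.strptime("11:41:08.004459", fmt)
        - datetime.strptime("11:38:07.890567", fmt)).total_seconds()

file_size  = 700 * 1024 * 1024     # ~700Meg file
chunk_size = 64 * 1024             # 64K chunk size
per_rpc    = 0.001                 # ~.001 seconds per FetchData RPC

fetches  = file_size // chunk_size   # 11,200 chunks
overhead = fetches * per_rpc         # 11.2 seconds of bogus FetchData RPCs

print("measured span of the FetchData RPCs: %.6f s" % span)   # ~180.1 s
print("estimated bogus-fetch overhead:      %.1f s" % overhead)
print("overhead as a fraction of the span:  %.1f%%" % (100 * overhead / span))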

>I'd suggest that the overhead is elsewhere, but not having any idea
>how the 700Meg file is generated or the environment of the test, I couldn't
>even begin to guess.

Well, the environment is a bit strange, but the test is simply doing
a "mv bigfile /afs/...".

One thing I didn't mention is that the reason I started looking at this in
the first place was that, while copying these files into AFS, I noticed
places where the copy would "pause" for a while; the file on the client
wouldn't get any bigger, and then it would eventually unfreeze and start
growing again.  The "freezes" seemed to correspond to the FetchData RPCs,
but maybe that analysis is incorrect.

>But I agree that we shouldn't be doing the extra RPCs if possible. One 
>problem is that the original hack didn't work once part of the file was
>written to the server.

Well, if a cache manager came our way that had been changed not to do the
extra RPCs, I'd be willing to run this test again.

(I once thought about hacking up a version of cp that stored the file
directly using AFS calls so I could bypass the cache manager completely;
I sometimes wonder what kind of performance I'd get out of that.)

--Ken
