On Tue, Mar 13, 2018 at 2:38 AM, William Allen Simpson wrote:
> On 3/12/18 6:25 PM, Matt Benjamin wrote:
>> If I understand correctly, we always insert records in xid order, and
>> xid is monotonically increasing by 1. I guess pings might come back
>> in any order,
> No, they always come back in order. This is TCP. I've gone to some
> lengths to fix the problem that operations were being executed in
> arbitrary order. (As was reported in the past.)
We're aware of the issues with former req queuing. It was one of my
top priorities to fix in napalm, and we did it.
> For UDP, there is always the possibility of loss or re-ordering of
> datagrams, one of the reasons for switching to TCP in NFSv3 (and
> eliminating UDP in NFSv4).
> Threads can still block in apparently random order, because of
> timing variances inside FSAL calls. Should not be an issue here.
>> but if we assume xids retire in xid order also,
> They do. Should be no variance. Eliminating the dupreq caching --
> also using the rbtree -- significantly improved the timing.
It's certainly correct not to cache, but it's also a special case that
arises from...benchmarking with rpcping, not NFS.
Same goes for retire order. Who said, let's assume the rpcping
requests retire in order? Oh yes, me above. Do you think NFS
requests in general are required to retire in arrival order? No, of
course not. What workload is the general case for the DRC? NFS.
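To make the retire-order point concrete, here is a minimal sketch (Python, not ganesha code) that counts how many entries a linear, head-first scan of a FIFO list examines when entries retire in arrival order versus in a shuffled order. The function name, the window size, and the simplification that all requests are queued before any retire are assumptions made only for illustration.

```python
from collections import deque
import random


def tailq_retire_cost(xids_in_arrival_order, retire_order):
    """Count entries examined when each retire scans a FIFO list from the head."""
    queue = deque(xids_in_arrival_order)
    examined = 0
    for xid in retire_order:
        for pos, candidate in enumerate(queue):
            examined += 1
            if candidate == xid:
                del queue[pos]  # unlink the retired entry, then stop scanning
                break
    return examined


WINDOW = 1000  # hypothetical number of outstanding requests
xids = list(range(WINDOW))

# Retire in arrival order: every lookup finds its entry at the head.
in_order = tailq_retire_cost(xids, xids)

# Retire in a shuffled order: lookups walk past entries still outstanding.
rng = random.Random(0)
shuffled = xids[:]
rng.shuffle(shuffled)
out_of_order = tailq_retire_cost(xids, shuffled)

print("entries examined, in-order retire:    ", in_order)
print("entries examined, out-of-order retire:", out_of_order)
```

With in-order retire the scan cost is one entry per request; once the retire order diverges from arrival order, the linear scan cost grows with the number of outstanding entries.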
> Apparently picked the worst tree choice for this data, according to
> computer science. If all you have is a hammer....
What motivates you to write this stuff?
Here are two facts you may have overlooked:
1. The DRC has a constant insert-delete workload, and for this
application, IIRC, I put the last inserted entries directly into the
cache. This both applies standard art on trees (rbtree vs. avl
performance on insert/delete-heavy workloads) and ostensibly avoids
searching the tree in the common case; I measured the hit rate
informally, and it looked to be working.
2. The key in the DRC caches is hk, not xid.
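A rough sketch of the two facts above, under stated assumptions: a sorted list searched with bisect stands in for the rbtree, a small direct-mapped array stands in for the cache of recently inserted entries, and hash_key() is a hypothetical 64-bit hash of (xid, client). None of these names or choices are taken from the ganesha source; they only illustrate the shape of the idea.

```python
import bisect
import hashlib


def hash_key(xid, client):
    """Hypothetical 64-bit hash key; the real hk derivation is not shown here."""
    data = f"{client}:{xid}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


class HashedWindow:
    """Ordered store keyed by hk, with a small cache checked before the search."""

    def __init__(self, cache_slots=8):
        self.keys = []      # sorted hash keys (stand-in for the tree)
        self.entries = []   # entries, parallel to self.keys
        self.cache = [None] * cache_slots
        self.cache_hits = 0
        self.tree_searches = 0

    def insert(self, hk, entry):
        i = bisect.bisect_left(self.keys, hk)
        self.keys.insert(i, hk)
        self.entries.insert(i, entry)
        # The newest entry for this slot goes straight into the cache.
        self.cache[hk % len(self.cache)] = (hk, entry)

    def lookup(self, hk):
        slot = self.cache[hk % len(self.cache)]
        if slot is not None and slot[0] == hk:
            self.cache_hits += 1
            return slot[1]
        self.tree_searches += 1
        i = bisect.bisect_left(self.keys, hk)
        if i < len(self.keys) and self.keys[i] == hk:
            return self.entries[i]
        return None

    def remove(self, hk):
        i = bisect.bisect_left(self.keys, hk)
        if i < len(self.keys) and self.keys[i] == hk:
            del self.keys[i]
            del self.entries[i]
        slot_index = hk % len(self.cache)
        slot = self.cache[slot_index]
        if slot is not None and slot[0] == hk:
            self.cache[slot_index] = None  # do not serve a removed entry


store = HashedWindow()
keys = [hash_key(xid, "client-a") for xid in range(100)]
for xid, hk in enumerate(keys):
    store.insert(hk, f"reply-{xid}")
    store.lookup(hk)  # a lookup right after insert is served from the cache

print("cache hits:", store.cache_hits, "tree searches:", store.tree_searches)
print("keys arrive in sorted order:", keys == sorted(keys))
```

Because the key is a hash rather than the xid, consecutive xids do not arrive as sorted keys, so insertion order in the ordered structure is effectively scattered even when xids increase by 1.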
>> and keep
>> a window of 10000 records in-tree, that seems maybe like a reasonable
>> starting point for measuring this?
> I've not tried 10,000 or 100,000 recently. (The original code
> default sent 100,000.)
> I've not recorded how many remain in-tree during the run.
> In my measurements, using the new CLNT_CALL_BACK(), the client thread
> starts sending a stream of pings. In every case, it peaks at a
> relatively stable rate.
> For 1,000, <4,000/s. For 100, 40,000/s. Fairly linear relationship.
> By running multiple threads, I showed that each individual thread ran
> roughly the same (on average). But there is some variance per run.
> I only posted the 5 thread results, lowest and highest achieved.
> My original message had up to 200 threads and 4 results, but I decided
> such a long series was overkill, so removed them before sending.
> That 4,000 and 40,000 per client thread was stable across all runs.
>> I wrote a gtest program (gerrit) that I think does the above in a
>> single thread, no locks, for 1M cycles (search, remove, insert). On
>> lemon, compiled at O2, the gtest profiling says the test finishes in
>> less than 150ms (I saw as low as 124). That's over 6M cycles/s, I
> What have you compared it to? Need a gtest of avl and tailq with the
> same data. That's what the papers I looked at do....
The point is, that is very low latency, a lot less than I expected.
It's probably minimized by CPU caching and so forth, but the test tries
to address the more basic question: is expected or unexpected latency
from searching the rb tree a likely contributor to overall latency?
If we get 2M retires per sec (let alone 6-7), is that a likely
The rb tree either is, or isn't a major contributor to latency. We'll
ditch it if it is. Substituting a tailq (linear search) seems an
unlikely choice, but if you can prove your case with the numbers, no
one's going to object.
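For anyone who wants numbers of their own, here is a minimal single-threaded sketch of the kind of measurement discussed in this thread: a fixed window of monotonically increasing keys, with each cycle doing a search, a remove, and an insert. It is Python with a sorted list as a stand-in for the tree, so its absolute rates say nothing about the C rbtree; run_window(), cycles_per_second(), and the cycle count are assumptions for illustration. The last two lines just redo the arithmetic on the figures quoted above (1M cycles in 150 ms and in 124 ms).

```python
import bisect
import time


def cycles_per_second(cycles, elapsed_seconds):
    """Convert a cycle count and an elapsed time into a rate."""
    return cycles / elapsed_seconds


def run_window(cycles, window):
    """Keep `window` keys in a sorted list; each cycle searches for the
    oldest key, removes it, and inserts the next key in sequence."""
    keys = list(range(window))  # monotonically increasing ids
    next_key = window
    start = time.perf_counter()
    for oldest in range(cycles):
        i = bisect.bisect_left(keys, oldest)      # search
        if i == len(keys) or keys[i] != oldest:
            raise LookupError(oldest)
        del keys[i]                               # remove
        bisect.insort(keys, next_key)             # insert
        next_key += 1
    elapsed = time.perf_counter() - start
    return elapsed, len(keys)


elapsed, remaining = run_window(cycles=100_000, window=10_000)
print(f"{cycles_per_second(100_000, elapsed):,.0f} cycles/s, "
      f"{remaining} keys left in the window")

# Arithmetic on the quoted gtest figures: 1M cycles in 150 ms and in 124 ms.
print(f"{cycles_per_second(1_000_000, 0.150):,.0f} cycles/s at 150 ms")
print(f"{cycles_per_second(1_000_000, 0.124):,.0f} cycles/s at 124 ms")
```

A fair comparison would run the same key sequence and retire order through each candidate structure (rbtree, avl, tailq), including a run where retire order differs from arrival order.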
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103