On 3/13/18 8:27 AM, Matt Benjamin wrote:
On Tue, Mar 13, 2018 at 2:38 AM, William Allen Simpson wrote:
but if we assume xids retire in xid order also,
They do. Should be no variance. Eliminating the dupreq caching --
also using the rbtree -- significantly improved the timing.
It's certainly correct not to cache, but it's also a special case that
arises from...benchmarking with rpcping, not NFS.
Nevertheless, "significantly improved the timing".
Duplicates are rare. The DRC needs to be able to get out of the way,
and shouldn't add significant overhead.
Same goes for retire order. Who said, let's assume the rpcping
requests retire in order? Oh yes, me above.
Actually, me in an earlier part of the thread.
Do you think NFS
requests in general are required to retire in arrival order? No, of
course not. What workload is the general case for the DRC? NFS.
The question is not whether (RPC CALL) NFS requests retire in arrival order.
The question in this thread is how far out of order the RPC REPLYs retire,
and which data structure(s) best fit that workload.
Apparently picked the worst tree choice for this data, according to
computer science. If all you have is a hammer....
What motivates you to write this stuff?
Here are two facts you may have overlooked:
1. The DRC has a constant insert-delete workload, and for this
application, IIRC, I put the last inserted entries directly into the
cache. This both applies standard art on trees (rbtree vs avl
performance on insert/delete heavy workloads) and ostensibly avoids
searching the tree in the common case; I measured the hit rate informally,
and it looked to be working.
I have no idea why we are off on this tangent here. The subject is
rpcping, not the DRC.
As to the DRC, we know that in fact the ntirpc "citihash" was of the
wrong data in GSS (the always changing ciphertext instead of the
plaintext), so in that case there was *no* hit rate at all.
In ntirpc v1.6, we now have a formal call to checksum, instead of an
ad hoc addition to the decode. So we should be getting a better hit
rate. I look forward to publication of your hit rate results.
2. The key in the DRC caches is hk, not xid.
That should improve the results for DRC RB-trees.
As I've mentioned before, I've never really examined the DRC code.
In person yesterday afternoon, you agreed that the repeated mallocs
in that code provide contention during concurrent thread processing
in the main path.
I've promised to take a look during my zero-copy efforts.
But this thread is about rpcping data structures.
What have you compared it to? Need a gtest of avl and tailq with the
same data. That's what the papers I looked at do....
The rb tree either is, or isn't a major contributor to latency. We'll
ditch it if it is. Substituting a tailq (linear search) seems an
unlikely choice, but if you can prove your case with the numbers, no
one's going to object.
Thank you. I'll probably try that in a week or so.
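As a rough sketch of what the tailq side of such a comparison could
measure (this is not rpcping or ntirpc code, and pending_call, pending_add
and pending_retire are hypothetical names): append each outstanding xid at
the tail, search from the head when a reply arrives, and count how many
entries were examined. If replies retire in xid order, every search should
stop at the first entry.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical record for one outstanding call. */
struct pending_call {
    uint32_t xid;
    TAILQ_ENTRY(pending_call) link;
};

TAILQ_HEAD(pending_list, pending_call);

/* Queue a newly sent call at the tail, so the list stays in send order. */
static void pending_add(struct pending_list *q, uint32_t xid)
{
    struct pending_call *pc = malloc(sizeof(*pc));
    assert(pc != NULL);
    pc->xid = xid;
    TAILQ_INSERT_TAIL(q, pc, link);
}

/*
 * Retire the call matching xid, searching linearly from the head.
 * Returns the number of entries examined, or 0 if the xid was not found.
 */
static unsigned pending_retire(struct pending_list *q, uint32_t xid)
{
    struct pending_call *pc;
    unsigned examined = 0;

    TAILQ_FOREACH(pc, q, link) {
        examined++;
        if (pc->xid == xid) {
            TAILQ_REMOVE(q, pc, link);
            free(pc);
            return examined;   /* return at once: pc is no longer valid */
        }
    }
    return 0;
}
```

Summing the return values over a run gives the total search cost, and the
largest value seen shows how far out of order the replies actually retire.
The same xid sequence can then be fed to the rbtree and avl versions for
the comparison.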
Right now, as mentioned on the conference call, I need some help
diagnosing why the rpcping code crashes. Some assumptions about
threading seem to be wrong. DanG is helping immensely!
Nfs-ganesha-devel mailing list