Change subject: gtest, dupreq:  profile and test DRC

gtest, dupreq:  profile and test DRC

Single- and multi-threaded correctness and profiling support,
e.g., DRC request/retire rates with NFSv3/4.0 TCP DRC.

Contains the following profile-guided rework:

0. testing w/1-entry cache (0 entries would be optimal; will send later)
1. simplify locking patterns, lift lanes into drc_t
2. introduce per-DRC free lists (remove most allocations)
3. s/mutex/spinlock/
4. lower refcnt into lanes, move cost into teardown
5. add benchmarking defines (price out dv->sp, all rbt ops)
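
For illustration only, items 1-3 might be sketched as below. This is a
minimal sketch under stated assumptions, not the code in this change: every
sk_* name and SK_NLANES is hypothetical, a C11 atomic_flag spinlock stands in
for the real lock type, and a linked list stands in for the per-lane rbt.
It shows lanes embedded in the DRC object, a spinlock per lane, and a
per-DRC free list that recycles retired entries instead of freeing them.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define SK_NLANES 7 /* hypothetical lane count, not a tuned value */

/* Test-and-set spinlock; a stand-in for the real lock type. */
typedef struct { atomic_flag f; } sk_spin_t;

static void sk_lock(sk_spin_t *s)
{
	while (atomic_flag_test_and_set_explicit(&s->f, memory_order_acquire))
		; /* spin until the holder clears the flag */
}

static void sk_unlock(sk_spin_t *s)
{
	atomic_flag_clear_explicit(&s->f, memory_order_release);
}

struct sk_entry { uint32_t xid; struct sk_entry *next; };

/* A list stands in for the per-lane rbt to keep the sketch short. */
struct sk_lane { sk_spin_t sp; struct sk_entry *head; };

struct sk_drc {
	sk_spin_t free_sp;               /* guards free_list */
	struct sk_entry *free_list;      /* retired entries kept for reuse */
	struct sk_lane lanes[SK_NLANES]; /* lanes live inside the DRC */
};

static void sk_drc_init(struct sk_drc *drc)
{
	assert(drc != NULL);
	atomic_flag_clear(&drc->free_sp.f);
	drc->free_list = NULL;
	for (int i = 0; i < SK_NLANES; i++) {
		atomic_flag_clear(&drc->lanes[i].sp.f);
		drc->lanes[i].head = NULL;
	}
}

/* Pop a recycled entry if one exists, else fall back to malloc. */
static struct sk_entry *sk_entry_get(struct sk_drc *drc)
{
	sk_lock(&drc->free_sp);
	struct sk_entry *e = drc->free_list;
	if (e)
		drc->free_list = e->next;
	sk_unlock(&drc->free_sp);
	return e ? e : malloc(sizeof(*e));
}

/* Returns 1 for a new request, 0 for a duplicate, -1 if malloc fails. */
static int sk_start(struct sk_drc *drc, uint32_t xid)
{
	struct sk_lane *lane = &drc->lanes[xid % SK_NLANES];
	int rc = 1;

	sk_lock(&lane->sp);
	for (struct sk_entry *e = lane->head; e; e = e->next)
		if (e->xid == xid) { rc = 0; break; }
	if (rc == 1) {
		struct sk_entry *e = sk_entry_get(drc);
		if (e) { e->xid = xid; e->next = lane->head; lane->head = e; }
		else rc = -1;
	}
	sk_unlock(&lane->sp);
	return rc;
}

/* Unlink xid from its lane and recycle the entry; returns 1 if found. */
static int sk_retire(struct sk_drc *drc, uint32_t xid)
{
	struct sk_lane *lane = &drc->lanes[xid % SK_NLANES];
	struct sk_entry *e, **pp;

	sk_lock(&lane->sp);
	for (pp = &lane->head; (e = *pp) != NULL; pp = &e->next)
		if (e->xid == xid) { *pp = e->next; break; }
	sk_unlock(&lane->sp);
	if (!e)
		return 0;
	sk_lock(&drc->free_sp); /* taken after lane->sp is dropped */
	e->next = drc->free_list;
	drc->free_list = e;
	sk_unlock(&drc->free_sp);
	return 1;
}
```

In this sketch a retire followed by a start reuses the retired entry, so
steady-state traffic stays out of malloc/free. Entries left on the free list
would be released when the DRC is torn down (not shown).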

On 32-core E5-2630, observed >2.5M DRC retires/s w/2 threads (4.5M w/1 thread)


sudo chrt -rr 1 perf record -g --call-graph dwarf ./gtest/test_drc --nlanes=117 \
    --nthreads=2 --dhiwat=32

sudo chrt -rr 1 perf record -g --call-graph dwarf ./gtest/test_drc --nlanes=117 \
    --nthreads=19 --dhiwat=32 --per_thread_xprt

1. cost of rbt operations at n <= 64 less than 20% of runtime, worst case
2. cost of per-req spinlock negligible
3. contention for lane->sp dominates retire rate
3.3 scalability +varies geometrically w/nlanes, -varies faster w/nthreads
4. nlanes scale curve falls off above nlanes=117 (flat at 743)

