on 10/06/2014 03:14 Matthew Ahrens said the following:
> I tested performance of these rrw lock changes by reading cached data with 28
> CPUs on illumos. I found no improvement, because lock contention from ARC's
> add_reference()/remove_reference() slows things down so much that the rrw
> lock is not noticeable. However, after removing this source of contention,
> the rrw lock changes provided a 20% performance improvement.
By the way, in FreeBSD arcs_list and arcs_mtx were split into multiple lists
and locks quite a while ago: there are 16 sub-lists for metadata buffers and 16
for data buffers. This is how the FreeBSD definition of arc_state looks:
#define ARC_BUFC_NUMDATALISTS		16
#define ARC_BUFC_NUMMETADATALISTS	16
#define ARC_BUFC_NUMLISTS	(ARC_BUFC_NUMMETADATALISTS + ARC_BUFC_NUMDATALISTS)
typedef struct arc_state {
	uint64_t arcs_lsize[ARC_BUFC_NUMTYPES];	/* amount of evictable data */
	uint64_t arcs_size;			/* total amount of data in this state */
	list_t arcs_lists[ARC_BUFC_NUMLISTS];	/* lists of evictable buffers */
	struct arcs_lock arcs_locks[ARC_BUFC_NUMLISTS]
	    __aligned(CACHE_LINE_SIZE);
} arc_state_t;
The appropriate sub-list and its lock are chosen based on buf_hash().
> details, including flame graphs
> <http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html> of kernel CPU
> time:
>
> reads of 8KB of cached data
> recordsize=8k; all other properties default
> 28 vCPUs on VMware ESX, 4 hardware sockets, each with 8 cores, each with 2
> hyperthreads
> 28 processes
> 28 files (each process reads from its own file)
> see fio config below.
>
> Stock illumos: ~350,000 IOPS
> http://mahrens.org/flame/cached_reads_28_cpu_stock.svg
> primary problem is arc_buf_add_ref() / arc_buf_remove_ref() (>75% of all CPU)
>
> Applying Alexander's changes to break up the rrw lock: ~350,000 IOPS
> http://mahrens.org/flame/cached_reads_28_cpu_rrw.svg
> primary problem is still arc_buf_add_ref() / arc_buf_remove_ref() (>75% of
> all CPU)
>
> Fixing arc_buf_add/remove_ref() by creating per-cpu arcs_lists: 1,000,000 IOPS
> http://mahrens.org/flame/cached_reads_28_cpu_agg.svg
> rrw_enter/exit taking ~8% of all CPU
>
> Both the arc_buf_add/remove_ref() and rrw lock fixes: 1,200,000 IOPS
> http://mahrens.org/flame/cached_reads_28_cpu_agg_rrw.svg
> rrw_enter/exit taking ~1.5% of all CPU
>
> --matt
>
> fio config file:
> [global]
>
> fallocate=none
> ioengine=psync
> numjobs=28
> iodepth=1
> time_based
> runtime=10m
> bs=8k
> rw=randread
> filesize=1g
> size=1g
> randrepeat=0
> use_os_rand=1
>
> [test]
> directory=/test/fs
>
>
>
> On Mon, Jun 9, 2014 at 11:51 AM, Alexander Motin <[email protected]> wrote:
>
> On 09.06.2014 21:33, Matthew Ahrens wrote:
>
> Changes look good to me. I've ported them to illumos:
>
> http://reviews.csiden.org/r/46/diff/#index_header
>
> I'll do some more testing and then put them up for a "real" illumos
> code
> review.
>
>
> Thank you! I hope to see both patches upstream soon -- I hate to drag many
> uncommitted patches after me. :)
>
>
> Alexander, do you want to add any copyright notice to your patch
> before
> it goes to illumos? (it is not required)
>
>
> Do as you see better. I don't put too much importance to it -- I already
> have enough of my name spread over FreeBSD sources. ;) If that is expected
> -- you may put there one line like: "Alexander Motin <[email protected]>".
>
> On Sun, Jun 8, 2014 at 1:53 PM, Alexander Motin <[email protected]> wrote:
>
> On 08.06.2014 21:32, Alexander Motin wrote:
>
> On 07.06.2014 22:35, Matthew Ahrens wrote:
>
> code review:
>
> zfsvfs_hold/rele need to pass the correct tags through to rrw_enter().
> The existing code will not work (if reference_tracking_enable), because
> you will be holding with tag "zfsvfs_hold" and releasing with tag
> "zfsvfs_rele". The existing code holds and releases with a tag such as
> "zfs_prop_set_userquota"; that needs to remain the case.
>
>
> Indeed there was a bug, thank you.
>
> I think it would be cleaner to encapsulate the multitude of locks
> within the rrw code, but I can live with it as is.
>
>
> I had such a thought initially, but then decided it would be too big a
> mess to do it inside. Also, it may not always be needed. But now that
> you've asked, I've made an intermediate solution: let me introduce
> Recursable Read-Mostly locks, as a wrapper on top of the existing
> Recursable Read-Write locks. :)
>
> The attached patch implements rrmlock and uses it for the teardown
> lock. It indeed looks much better. Thanks for pushing me to do it. :)
> What do you think?
>
>
> I'm sorry, I missed a bug until I built a kernel with debugging.
> A fixed version is attached.
>
> --
> Alexander Motin
>
> --
> Alexander Motin
>
> _______________________________________________
> developer mailing list
> [email protected]
> http://lists.open-zfs.org/mailman/listinfo/developer
>
--
Andriy Gapon