On Thu, Jun 5, 2014 at 12:50 PM, Alexander Motin <[email protected]> wrote:
> On 05.06.2014 22:37, Matthew Ahrens wrote:
>> Interesting, what platform are you testing on? I have not seen
>> substantial contention on this lock on illumos, testing with up to ~1
>> million IOPS (reads of cached 8k blocks).
>
> Now I am testing this on a dual IvyBridge Xeon E5-2690 v2 system (40
> (2x10x2) logical cores).
>
> The SPEC NFS test mentioned earlier was run on a dual Westmere Xeon
> E5645 system (24 (2x6x2) logical cores). There the problem was much
> less noticeable, but IOPS in that test were much lower too.
>
> I'd like to note that I've already seen quite a few cases where
> contention barely measurable on 24 Westmere cores just explodes on 40
> IvyBridge cores. This looks like one of them.

Ah, interesting. I have been testing with up to 24 CPUs. Will have to
find a larger machine. :)

Have you seen contention on arcs_mtx when manipulating the ARC lists
from add_reference() and remove_reference()? I typically see contention
there before I see contention on, e.g., the godfather zio. I am working
on a fix for arcs_mtx, but it is much more involved, because we need to
split the list into per-CPU lists.

--matt

> On Thu, Jun 5, 2014 at 12:34 PM, Alexander Motin <[email protected]
> <mailto:[email protected]>> wrote:
>>
>> Hi again! Another day, another patch. :)
>>
>> Testing the same setup as in the earlier "Godfather ZIO lock
>> congestion" thread (small strided reads from 256 threads concurrently
>> on a 40-core machine), but with an ARC size sufficient to fit all of
>> the test data, I hit another lock contention point, on the RRW lock
>> zfsvfs->z_teardown_lock. Earlier I had seen the same contention even
>> on a smaller machine while profiling the SPEC NFS benchmark.
>>
>> To avoid the contention, the attached patch replaces the single
>> teardown RRW lock per struct zfsvfs with a bunch (17) of them. Read
>> acquisitions are distributed among them based on the curthread
>> pointer, to avoid any measurable contention in the hot path. Write
>> acquisitions take all of the locks, but they should be rare enough
>> not to matter.
>>
>> As a result, performance on this test setup increased from ~475K IOPS
>> to ~1.3M IOPS.
>>
>> Any comments?
>>
>> --
>> Alexander Motin
>>
>> _______________________________________________
>> developer mailing list
>> [email protected] <mailto:[email protected]>
>> http://lists.open-zfs.org/mailman/listinfo/developer
>
> --
> Alexander Motin
