On Fri, Oct 31, 2014 at 2:24 AM, Arne Jansen <[email protected]> wrote:
> Ok, we're getting closer. Next version still at: > > http://cr.illumos.org/~webrev/sensille/find_parallel_dp/ > > On 10/29/2014 03:24 AM, Matthew Ahrens wrote: > > dmu_objset_find_parallel_impl - it might make sense to call the callback > before > > doing taskq_dispatch, that way we can say that the traversal is pre-order > > (callback on parent is executed before callbacks on children). > > This would put the callback in the hot path and could slow down the > process a > bit. I'd rather only do it if someone needs it. > > > > > It would be really great to extend this to be a replacement for > > dmu_objset_find_dp(), perhaps with a flag for single-threaded callbacks > (i.e. > > create the taskq with just one thread). I took a look at all the > consumers of > > dmu_objset_find_dp(), and I think they would all work fine with pre-order > > iteration (as you're doing) rather than post-order. > > Done. Again, I left out the single-thread flag until someone needs it. > Unfortunately I have no good way to test this. In my previous > implementation > I was quite confident that it works fine, but I didn't test the other > consumers > of dmu_objset_find_dp. Maybe I have to take a look at the testsuite... > We will definitely need to test all of the callers. Unless they are modified, some of the callers will require single-threaded execution of their callbacks (i.e. create taskq with one thread). --matt > > -Arne > > > > > > vdev_count_leaves_impl - can this be "static"? > > > > zil.c lines 639, 698: incorrect indentation (should be 2 tabs + 4 spaces) > > > > It's conventional to use %llu and then cast the argument to "unsigned > long > > long", as opposed to using PRIu64. > > > > --matt > > > > > > > > On Tue, Oct 28, 2014 at 1:59 AM, Arne Jansen <[email protected] > > <mailto:[email protected]>> wrote: > > > > Like this? > > > > http://cr.illumos.org/~webrev/sensille/find_parallel_dp/ > > > > -Arne > > > > On 10/16/2014 10:12 PM, Matthew Ahrens via illumos-developer wrote: > > > (resend to cc a few more lists -- sorry) > > > > > > I think the overall idea of this change is sound -- and that's a > great > > > performance improvement. > > > > > > Can you change dmu_objset_find_parallel() to work like > dmu_objset_find_dp(), and > > > then change zil_check_log_chain(0 and zil_claim() to work off of > the > > > dsl_dataset_t*, rather than the char*name? Then I think the other > threads won't > > > need to grab the namespace lock. > > > > > > --matt > > > > > > On Thu, Oct 16, 2014 at 12:05 PM, Arne Jansen via illumos-developer > > > <[email protected] <mailto:[email protected]> > > <mailto:[email protected] <mailto: > [email protected]>>> > > wrote: > > > > > > In our setup zpool import takes a very long time (about 15 > minutes) due > > > to a large count of filesystems (> 5000). > > > During this time, you can see the disks at only a few percent > busy in > > > iostat, while the pool is at 100%, which indicates a > single-threaded > > > import process. > > > What's causing this is checking of the zil chains for each > filesystem. > > > Each filesystem is visisted twice, once to check the chain and > once > > > to claim it. This is done in a sequential manner. > > > I've created a proof-of-concept patch that switches this > process to a > > > parallel enumeration using a taskq. > > > In my setup this speeds up the import process by a factor of > 20, > > > bringing all disks (30 in my case) to nearly 100% busy. > > > There's only one problem: locking. spa_open is called with > > > spa_namespace_lock held. With this, later on > zil_check_log_chain and > > > zil_claim are called. These call in turn spa_open, which tries > to claim > > > spa_namespace_lock again. This problem is not new. The current > solution > > > looks like this: > > > > > > /* > > > * As disgusting as this is, we need to support recursive > calls to this > > > * function because dsl_dir_open() is called during > spa_load(), and ends > > > * up calling spa_open() again. The real fix is to figure out > how to > > > * avoid dsl_dir_open() calling this in the first place. > > > */ > > > if (mutex_owner(&spa_namespace___lock) != curthread) { > > > mutex_enter(&spa_namespace___lock); > > > locked = B_TRUE; > > > } > > > > > > This doesn't work anymore when I call > zil_claim/check_log_chain through > > > a taskq and thus from a different thread. My current hacky > solution is > > > to pass a flag through the call chain to spa_open to signal > that the > > > lock isn't needed. It works, but it doesn't look too nice... > > > > > > My question now is how to build a cleaner solution. The > snippet above > > > is already ugly in itself, so it would be good to get rid of > it at all. > > > The functions involved in the call chains are all very > commonly used > > > functions. Changing the signature of those will probably give > a quite > > > bulky patch, so I'd prefer to find a nice small unintrusive > way. > > > > > > The patch currently looks like this: > > > > > > http://cr.illumos.org/~webrev/__sensille/find_parallel/ > > > <http://cr.illumos.org/~webrev/sensille/find_parallel/> > > > > > > Hopefully you can give me some hints on how to solve this > recursive > > > locking riddle... > > > > > > -Arne > > > > > > > > > > > > ------------------------------__------------- > > > illumos-developer > > > Archives: https://www.listbox.com/__member/archive/182179/=now > > > <https://www.listbox.com/member/archive/182179/=now> > > > RSS Feed: > > > > https://www.listbox.com/__member/archive/rss/182179/__21175174-cd73734d > > > < > https://www.listbox.com/member/archive/rss/182179/21175174-cd73734d> > > > Modify Your Subscription: > > > > https://www.listbox.com/__member/?&id___secret=21175174-792643f6 > > > <https://www.listbox.com/member/?&> > > > Powered by Listbox: http://www.listbox.com > > > > > > > > > *illumos-developer* | Archives > > > <https://www.listbox.com/member/archive/182179/=now> > > > < > https://www.listbox.com/member/archive/rss/182179/21501168-10807f51> | > Modify > > > > > < > https://www.listbox.com/member/?member_id=21501168&id_secret=21501168-469bb5e1 > > > > > Your Subscription [Powered by Listbox] <http://www.listbox.com > > > > > > > > > > >
_______________________________________________ developer mailing list [email protected] http://lists.open-zfs.org/mailman/listinfo/developer
