On Sat, Jun 13, 2015 at 3:37 PM, Arne Jansen <[email protected]> wrote:

> On 13/06/15 15:51, Matthew Ahrens wrote:
>
>> rrw_enter_read_impl() should be static.
>> The 'prio' argument to it should be boolean_t (and the values passed
>> should be B_TRUE, B_FALSE).
>>
>> Perhaps dmu_objset_find_dp_cb() should have a comment explaining why it
>> needs to use the _prio() routine.
>>
>> Otherwise, looks good.
>>
>>
> Updated in place.
>

Looks good.  Thanks for fixing this!

--matt


>
> -arne
>
>  --matt
>>
>> On Fri, Jun 12, 2015 at 9:55 PM, Arne Jansen <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>>     On 13/06/15 06:47, Arne Jansen wrote:
>>
>>         On 13/06/15 04:18, Matthew Ahrens wrote:
>>
>>
>>
>>             On Thu, Jun 4, 2015 at 1:31 AM, Arne Jansen
>>             <[email protected] <mailto:[email protected]>
>>             <mailto:[email protected] <mailto:[email protected]>>> wrote:
>>
>>                  On 06/03/2015 10:40 AM, Arne Jansen wrote:
>>                  > Unfortunately we just discovered a deadlock with this
>>             code:
>>                  >
>>                  > When dmu_objset_find_dp gets called with a read lock
>>             held, it fans
>>                  > out the work to the task queue. Each task in turn
>>             acquires its own
>>                  > read lock before calling the callback. If during this
>>             process
>>             anyone
>>                  > tries to a acquire a write lock, it will stall all
>>             read lock
>>             requests.
>>                  > Thus the tasks will never finish, the read lock of
>>             the caller will
>>                  > never get freed and the write lock never acquired.
>>             deadlock.
>>                  >
>>                  > I'm still thinking about how to fix this. Probably
>>             the best way
>>             would
>>                  > be that the caller doesn't hold the read lock in the
>>             first
>>             place, but
>>                  > I'm not sure if that's feasible for all callers.
>>                  >
>>
>>                  I think it is not. My current plan for fixing this is
>>             basically to
>>                  implement
>>                  the plan sketched out in the first comment of
>>             dmu_objset_find_dp:
>>
>>                    * In the future it might be possible to get some
>>             magic into
>>                    * dsl_pool_config_held in a way that it returns true
>> for
>>                    * the worker threads so that a single lock held from
>> this
>>                    * thread suffices.
>>
>>                  So instead of getting the lock in dmu_objset_find_dp_cb
>>             with
>>                  dsl_pool_config_enter, I'd call a to-be-implemented
>>             function
>>                  dsl_pool_config_inherit, that marks the thread as a
>>             co-owner of
>>             the lock
>>                  in the thread's tsd. dsl_pool_config_held can than
>>             check for this
>>                  co-ownership.
>>
>>                  This would also make it possible to parallelize the
>>             write lock case,
>>                  although
>>                  this would need extra testing.
>>
>>                  What do you think?
>>
>>
>>             I think we need to fix this sooner rather than later, either
>>             along the
>>             lines of what you described, or by reverting the change.
>>
>>             As we discussed on IRC, I think it would be cleanest to add
>> this
>>             lock-inheriting logic to the rrw code.  But in the interest
>>             of getting
>>             it fixed expeditiously, I'm open to whatever code you have.
>>
>>
>>         I took the chicken way out and implemented the priority read_lock
>>         you proposed, as it is the least intrusive patch and minimizes the
>>         need for testing. It is currently under test here. I was going to
>>         send it out on monday, but can also do so later this day for a
>>         quick review.
>>
>>
>>     http://cr.illumos.org/~webrev/sensille/5981_deadlock/
>>
>>
>>         -Arne
>>
>>
>>             --matt
>>             *illumos-zfs* | Archives
>>             <https://www.listbox.com/member/archive/182191/=now>
>>             <
>> https://www.listbox.com/member/archive/rss/182191/23449240-486d00d3>
>>             |
>>             Modify
>>             <https://www.listbox.com/member/?&;>
>>             Your Subscription    [Powered by Listbox]
>>             <http://www.listbox.com>
>>
>>
>>
>>
>>         -------------------------------------------
>>         illumos-zfs
>>         Archives: https://www.listbox.com/member/archive/182191/=now
>>         RSS Feed:
>>
>> https://www.listbox.com/member/archive/rss/182191/23449240-486d00d3
>>         Modify Your Subscription:
>>         https://www.listbox.com/member/?&;
>>
>>         Powered by Listbox: http://www.listbox.com
>>
>>
>>
>>
>>     -------------------------------------------
>>     illumos-zfs
>>     Archives: https://www.listbox.com/member/archive/182191/=now
>>     RSS Feed:
>>     https://www.listbox.com/member/archive/rss/182191/21635000-ebd1d460
>>     Modify Your Subscription: https://www.listbox.com/member/?&;
>>     Powered by Listbox: http://www.listbox.com
>>
>>
>> *illumos-zfs* | Archives
>> <https://www.listbox.com/member/archive/182191/=now>
>> <https://www.listbox.com/member/archive/rss/182191/23449240-486d00d3> |
>> Modify
>> <https://www.listbox.com/member/?&;>
>> Your Subscription       [Powered by Listbox] <http://www.listbox.com>
>>
>>
>
>
> -------------------------------------------
> illumos-zfs
> Archives: https://www.listbox.com/member/archive/182191/=now
> RSS Feed:
> https://www.listbox.com/member/archive/rss/182191/21635000-ebd1d460
> Modify Your Subscription:
> https://www.listbox.com/member/?member_id=21635000&id_secret=21635000-73dc201a
> Powered by Listbox: http://www.listbox.com
>
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer

Reply via email to