Just to be clear, this is the change with which we are running:
--- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c
+++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c
@@ -1381,6 +1381,7 @@ sa_handle_get_from_db(objset_t *os, dmu_buf_t *db, void
*userp,
/* if one doesn't exist then create a new one, and initialize it */
handle = (hdl_type == SA_HDL_SHARED) ? dmu_buf_get_user(db) : NULL;
+ VERIFY3P(handle, ==, NULL);
if (handle == NULL) {
sa_handle_t *newhandle;
handle = kmem_cache_alloc(sa_cache, KM_SLEEP);
on 22/10/2013 10:30 Andriy Gapon said the following:
> on 18/10/2013 20:43 Matthew Ahrens said the following:
>> It's shared among everyone who has a hold on the dbuf. When there are no
>> more
>> holds, the dbuf may be evicted, and call sa_evict() to get rid of the shared
>> sa
>> handle. Saving the sa_handle_t in the dbufs' user pointer allows e.g.
>> zfs_zget() to find it. Although I guess it does use "immediate evict", so
>> you'd
>> probably only find it if 2 threads are concurrently accessing the same
>> object?
>> I would presume that is possible.
>
> So now I see it. Essentially SA_HDL_SHARED allows to find sa from dbuf using
> dmu_buf_get_user. But that does not imply sharing by itself.
>
> Some observations:
> - immediate evict is used indeed
> - sa_evict consists only of a call to panic, so I guess that it is not
> expected
> to be evicted before sa_handle_destroy is called
> - all calls to sa_handle_get_from_db(SA_HDL_SHARED) happen within ZFS_OBJ_HOLD
>
> I guess that the implication is that a znode normally[*] maintains a hold on
> the
> dbuf that backs the sa handle and thus there can be no eviction.
> [*] - this is not true only between zfs_suspend_fs and zfs_resume_fs.
>
>> Have you tried adding an assertion (or dtracing) that there is no sharing
>> (i.e.
>> that we never find an existing sa handle attached to the dbuf, from
>> zfs_zget()
>> or sa_handle_get_from_db())? You would probably need a workload that
>> accesses
>> the same object concurrently, perhaps over NFS (so it has to use zfs_zget()
>> rather than having the vnode in hand). That should show you if / how the
>> sharing code is exercised.
>
> Yes, I have tried that and could not detect any sharing occurring in
> sa_handle_get_from_db.
>
> Looking at zfs_zget specifically it does the following (many details omitted):
> ZFS_OBJ_HOLD_ENTER
> sa_buf_hold
> hdl = dmu_buf_get_user
> if hdl != NULL
> ...
> else
> ...
> sa_handle_get_from_db(SA_HDL_SHARED)
> ...
> ZFS_OBJ_HOLD_EXIT
>
> So no sharing should happen in sa_handle_get_from_db, because of the locking
> and
> checking in zfs_zget.
>
> To summarize. A sa handle can not be shared between two different znodes.
> So,
> sharing logic in sa_handle_get_from_db seems to be redundant given the logic
> of
> zfs_zget. In more strong words, I believe that allowing sharing in
> sa_handle_get_from_db is a bug because it could mask other bugs.
>
>> On Fri, Oct 18, 2013 at 3:10 AM, Andriy Gapon <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>>
>> sa_handle_get_from_db can allocate sa_handle_t in two modes:
>> SA_HDL_PRIVATE and
>> SA_HDL_SHARED.
>>
>> SA_HDL_PRIVATE case is easy -- we just allocate a new handle object with
>> fields
>> set to appropriate values. sa_bonus points to a buffer, but the buffer
>> has no
>> knowledge of the SA handle.
>>
>> In the case of SA_HDL_SHARED we allocate a new handle and associate it
>> with the
>> buffer via dmu_buf_set_user_ie, but only if the buffer does not have a SA
>> already associated with it. In the latter case, sa_handle_get_from_db
>> simply
>> returns the associated handle. So, in this sense the handle gets shared.
>>
>> What concerns me is that there is no corresponding "unsharing" mechanism
>> in
>> sa_handle_destroy. It would destroy the handle regardless of any
>> possible
>> sharing.
>>
>> So I suspect that if any actual sharing would ever occur then there
>> could be a
>> trouble.
>> As far as I could see in the code, no actual sharing ever happens
>> (modulo bugs).
>> So in the end the only actual purpose of SA_HDL_SHARED seems to be
>> debugging
>> via sa_evict -> panic.
>>
>> I would like to ask you if my understanding of that code is correct.
>> Thank you.
>> --
>> Andriy Gapon
>>
>> _______________________________________________
>> developer mailing list
>> [email protected] <mailto:[email protected]>
>> http://lists.open-zfs.org/mailman/listinfo/developer
>>
>>
>
>
--
Andriy Gapon
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer