Hello list,

(I sure hope this maintains some legibility)

So we have cause to add another SA_ZPL_ entry under OSX (because its fun,
and someone gave me root).

In our case, we have added SA_ZPL_ADDTIME, which keeps the time when an
entry was added to a directory. (So it is updated if you move the entry to
a new directory, diverging from CRTIME - used by Spotlight for indexing).

It seemed fairly straight forward, diff looks like:

include//sys/zfs_sa.h     ZPL_DXATTR,
include//sys/zfs_sa.h+#ifdef __APPLE__
include//sys/zfs_sa.h+  ZPL_ADDTIME,
include//sys/zfs_sa.h+#endif
include//sys/zfs_sa.h   ZPL_END

include//sys/zfs_znode.h #define SA_ZPL_PAD(z) z->z_attr_table[ZPL_PAD]
include//sys/zfs_znode.h+#ifdef __APPLE__
include//sys/zfs_znode.h+#define SA_ZPL_ADDTIME(z) z->z_attr_table[ZPL_ADDTIME]
include//sys/zfs_znode.h+#endif

module//zfs/zfs_sa.c     {"ZPL_DXATTR", 0, SA_UINT8_ARRAY, 0},
module//zfs/zfs_sa.c+#ifdef __APPLE__
module//zfs/zfs_sa.c+ {"ZPL_ADDTIME",sizeof(uint64_t)*2,SA_UINT64_ARRAY,0},
module//zfs/zfs_sa.c+#endif
module//zfs/zfs_sa.c    {NULL, 0, 0, 0}


Which takes care of the initialisation. Then we go update it when needed.
Currently, we have two places that could happen. First one is in zfs_dir.c,
when it is already working on a bulk update:

        if (!(flag & ZRENAMING) &&
                zfsvfs->z_use_sa == B_TRUE) {
                timestruc_t     now;
                gethrestime(&now);
                ZFS_TIME_ENCODE(&now, addtime);
                SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_ADDTIME(zfsvfs), NULL,
                    addtime, sizeof (addtime));
        }

and I bumped the total number of "bulk" entries by one.



Second case is an update outside of a dmu_tx, so I do it manually with:

        /* Write the new ADDTIME to SA */
        if ((zfsvfs->z_use_sa == B_TRUE) &&
                !vfs_isrdonly(zfsvfs->z_vfs) &&
                spa_writeable(dmu_objset_spa(zfsvfs->z_os))) {

                tx = dmu_tx_create(zfsvfs->z_os);
                dmu_tx_hold_sa_create(tx, sizeof(addtime));
                dmu_tx_hold_sa(tx, zp->z_sa_hdl, B_TRUE);

                error = dmu_tx_assign(tx, TXG_WAIT);
                if (error) {
                        dmu_tx_abort(tx);
                } else {
      error = sa_update(zp->z_sa_hdl, SA_ZPL_ADDTIME(zfsvfs),
                                   &addtime, sizeof(addtime), tx);
                        if (error)
                                dmu_tx_abort(tx);
                        else
                                dmu_tx_commit(tx);
                }

                if (error)


To read the value in vnop_getattr, I use the simple:

                sa_lookup(zp->z_sa_hdl, SA_ZPL_ADDTIME(zfsvfs),
                                  &addtime, sizeof(addtime));



That is all well so far. This does indeed appear to work, and ADDTIME is
updated, and fetches as expected. Hurrah!




But, the timing is a bit suspicious. Soon after adding this code, lets say
48 hours or so, we have come across SA corruption. Since we got tired of SA
panics from previous problem we shared with ZOL, we still have some
additional sanity testing.



One of them has tripped on my test pools which was previously free of troubles.


kernel[0]: ZFS: NULL SA: SA_COPY_DATA(0, 0, 0xffffff80b60d32b8, 16) i=15


from sa.c's sa_build_layouts()

if (!attr_desc[i].sa_data) {
        printf("ZFS: NULL SA: SA_COPY_DATA(%p, %p, %p, %u) i=%u\n",
                   attr_desc[i].sa_data_func, attr_desc[i].sa_data,
                   data_start, length, i);
} else {
        SA_COPY_DATA(attr_desc[i].sa_data_func, attr_desc[i].sa_data,
                                 data_start, length);
}


Generally followed by a
kernel[0]: kernel memory allocator: redzone violation: write past end of buffer
Jul 28 14:27:58 icgi-vip kernel[0]: buffer=0xffffff80b6dd1b00
bufctl=0xffffff80b6cba7b8  cache: kmem_alloc_640

but I suspect that is simply from us skipping a copy (but still
incrementing counters), and is directly from the sa_data conditional we added.



So, are we using the SA_ZPL_ wrong? Can we not dynamically add attributes
we like, when we like? All other SA entries are always set/read from
zfs_znode after all. 99% of the time everything does work correctly, but it
seems eventually something happens to the SA.

Any pointers are appreciated :)

Lund

-- 
Jorgen Lundman       | <[email protected]>
Unix Administrator   | +81 (0)90-5578-8500          (work)
Shibuya-ku, Tokyo    | +81 (0)80-2090-5800          (cell)
Japan                | +81 (0)3 -3375-1767          (home)
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer

Reply via email to