Hi,

Last week we were finally able to reproduce and diagnose a very old ZFS
issue (confirmed on FreeBSD, CentOS/ZoL and SmartOS): the replication code
in receive_object() falsely assumes that if the received object's block
size differs from the local one, it must be a new object, and calls
dmu_object_reclaim() to wipe it out.  In most cases this is not a problem,
since the dnode, bonus buffer and data block(s) are all immediately
rewritten anyway, but the spill block (if used) is not.
This means loss of ACLs, extended attributes, etc.

This issue can be triggered in a very simple way:
1. create a 4KB file with 10+ ACL entries (FreeBSD/SmartOS)
2. on Linux, you need to zfs set acltype=posixacl and xattr=sa
3. on Linux, you need to use setfattr to set extended attributes on the
file to ensure the spill_blkptr is used
4. take a snapshot and send it to a different dataset (with equal
settings on Linux)
5. append another 4KB to the file
6. take another snapshot and send an incremental between the first
snapshot and this one to the other dataset
7. witness the corruption (note that on Linux you need to run
`getfattr -d -m - file` to see that the extended attributes were lost)

I've made an experimental patch below that changes the object's block
size in place on the receiving side, instead of reclaiming the object,
when the difference looks like a possibly valid object growth.  That
fixes the problem in my tests.  I can still imagine scenarios in which it
could be fooled, but those are much less straightforward.
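
To make the intended decision easier to review, here is a minimal
standalone sketch of the check as I read it from the patch.  The structs,
the enum and classify() are hypothetical stand-ins invented for
illustration; only the field names echo the drr_* and doi_* fields used
in receive_object(), and none of this is the real ZFS code.

```c
#include <stdint.h>

/* Hypothetical stand-in for the drr_* fields of the received record. */
struct rec_obj {
	int		type;
	uint32_t	blksz;
	int		bonustype;
	uint32_t	bonuslen;
};

/* Hypothetical stand-in for the doi_* fields of the local object. */
struct local_obj {
	int		type;
	uint32_t	data_block_size;
	int		bonus_type;
	uint32_t	bonus_size;
	uint64_t	max_offset;
};

/* Illustrative names for the three outcomes of the check. */
enum recv_action {
	RECV_KEEP,		/* properties match, nothing to do */
	RECV_SET_BLOCKSIZE,	/* single-block object grew: resize in place */
	RECV_RECLAIM		/* treat as a different object: reclaim it */
};

static enum recv_action
classify(const struct rec_obj *r, const struct local_obj *l)
{
	if (r->type != l->type ||
	    /* a shrinking block size is never treated as growth */
	    r->blksz < l->data_block_size ||
	    /* a larger block size is only plausible for a one-block object */
	    (r->blksz > l->data_block_size &&
	    l->max_offset > l->data_block_size) ||
	    r->bonustype != l->bonus_type ||
	    r->bonuslen != l->bonus_size)
		return (RECV_RECLAIM);
	if (r->blksz != l->data_block_size)
		return (RECV_SET_BLOCKSIZE);
	return (RECV_KEEP);
}
```

In the reproduction above, the local object has a single 4KB block and
the incremental stream carries an 8KB block size, so this sketch picks
the in-place resize rather than the reclaim, which is what leaves the
spill block alone.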

Any comments or better ideas?

--- dmu_send.c  (revision 339883)
+++ dmu_send.c  (working copy)
@@ -2190,6 +2190,7 @@ receive_object(struct receive_writer_arg *rwa, str

        tx = dmu_tx_create(rwa->os);
        dmu_tx_hold_bonus(tx, object);
+       dmu_tx_hold_write(tx, object, 0, 0);
        err = dmu_tx_assign(tx, TXG_WAIT);
        if (err != 0) {
                dmu_tx_abort(tx);
@@ -2203,7 +2204,9 @@ receive_object(struct receive_writer_arg *rwa, str
                    drro->drr_bonustype, drro->drr_bonuslen,
                    drro->drr_dn_slots << DNODE_SHIFT, tx);
        } else if (drro->drr_type != doi.doi_type ||
-           drro->drr_blksz != doi.doi_data_block_size ||
+           drro->drr_blksz < doi.doi_data_block_size ||
+           (drro->drr_blksz > doi.doi_data_block_size &&
+            doi.doi_max_offset > doi.doi_data_block_size) ||
            drro->drr_bonustype != doi.doi_bonus_type ||
            drro->drr_bonuslen != doi.doi_bonus_size) {
                /* currently allocated, but with different properties */
@@ -2210,6 +2213,10 @@ receive_object(struct receive_writer_arg *rwa, str
                err = dmu_object_reclaim(rwa->os, drro->drr_object,
                    drro->drr_type, drro->drr_blksz,
                    drro->drr_bonustype, drro->drr_bonuslen, tx);
+       } else if (drro->drr_blksz != doi.doi_data_block_size) {
+               /* currently allocated, just with one block of different size */
+               err = dmu_object_set_blocksize(rwa->os, drro->drr_object,
+                   drro->drr_blksz, 0, tx);
        }
        if (err != 0) {
                dmu_tx_commit(tx);


-- 
Alexander Motin

------------------------------------------
openzfs: openzfs-developer
Permalink: 
https://openzfs.topicbox.com/groups/developer/T1be15fbaaa263b00-M8a12865ba14f30126f6200ca