Hi,

Last week we were finally able to reproduce and diagnose a very old ZFS issue (confirmed on FreeBSD, CentOS/ZoL and SmartOS): the replication code in receive_object() falsely assumes that if the received object's block size differs from the local one, it must be a new object, and calls dmu_object_reclaim() to wipe it out. In most cases this is not a problem, since the dnode, bonus buffer and data block(s) are all immediately rewritten anyway, but the spill block (if used) is not. This means loss of ACLs, extended attributes, etc.
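
To make the faulty assumption concrete, here is a minimal standalone model of the decision, not the actual dmu_send.c code. It compares the existing check with the relaxed check from the experimental patch further down in this message. The struct obj_props, the enum recv_action and both function names are made up for illustration; only the compared fields mirror drr_type/drr_blksz/drr_bonustype/drr_bonuslen and their doi_* counterparts, and local_max_offset stands in for doi.doi_max_offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the properties being compared. */
struct obj_props {
	int      type;       /* models drr_type / doi_type */
	uint32_t blksz;      /* models drr_blksz / doi_data_block_size */
	int      bonustype;  /* models drr_bonustype / doi_bonus_type */
	uint32_t bonuslen;   /* models drr_bonuslen / doi_bonus_size */
};

/* What the receive side decides to do with an already allocated object. */
enum recv_action { RECV_KEEP, RECV_SET_BLOCKSIZE, RECV_RECLAIM };

/* Existing behaviour: any difference, block size included, means reclaim. */
static enum recv_action
current_check(const struct obj_props *sent, const struct obj_props *local)
{
	assert(sent != NULL && local != NULL);
	if (sent->type != local->type ||
	    sent->blksz != local->blksz ||
	    sent->bonustype != local->bonustype ||
	    sent->bonuslen != local->bonuslen)
		return (RECV_RECLAIM);
	return (RECV_KEEP);
}

/*
 * Patched behaviour: a larger block size on an object that still fits in
 * one local block is treated as growth, so only the block size is changed.
 */
static enum recv_action
patched_check(const struct obj_props *sent, const struct obj_props *local,
    uint64_t local_max_offset)
{
	assert(sent != NULL && local != NULL);
	if (sent->type != local->type ||
	    sent->blksz < local->blksz ||
	    (sent->blksz > local->blksz &&
	    local_max_offset > local->blksz) ||
	    sent->bonustype != local->bonustype ||
	    sent->bonuslen != local->bonuslen)
		return (RECV_RECLAIM);
	if (sent->blksz != local->blksz)
		return (RECV_SET_BLOCKSIZE);
	return (RECV_KEEP);
}
```

In this model, a file that grew from one 4KB block to one 8KB block is reclaimed by current_check(), while patched_check() only asks for a block size change as long as the local object does not extend past its single block.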

This issue can be triggered in a very simple way:

1. Create a 4KB file with 10+ ACL entries (FreeBSD/SmartOS).
2. On Linux, you need to zfs set acltype=posixacl and xattr=sa.
3. On Linux, you need to use setfattr to set extended attributes on the file to ensure the spill_blkptr is used.
4. Take a snapshot and send it to a different dataset (with equal settings on Linux).
5. Append another 4KB to the file.
6. Take another snapshot and send the incremental between the first snapshot and this one to the other dataset.
7. Witness the corruption (note that on Linux you need to run `getfattr -d -m - file` to see that the extended attributes were lost).

I've made the experimental patch below, which changes only the object's block size on the receiving side, instead of reclaiming the object, when the change looks like a plausibly valid object growth. That fixes the problem in my tests. I can still imagine scenarios in which it could be fooled, but those are much less straightforward.

Any comments or better ideas?

--- dmu_send.c	(revision 339883)
+++ dmu_send.c	(working copy)
@@ -2190,6 +2190,7 @@ receive_object(struct receive_writer_arg *rwa, str
 
 	tx = dmu_tx_create(rwa->os);
 	dmu_tx_hold_bonus(tx, object);
+	dmu_tx_hold_write(tx, object, 0, 0);
 	err = dmu_tx_assign(tx, TXG_WAIT);
 	if (err != 0) {
 		dmu_tx_abort(tx);
@@ -2203,7 +2204,9 @@ receive_object(struct receive_writer_arg *rwa, str
 		    drro->drr_bonustype, drro->drr_bonuslen,
 		    drro->drr_dn_slots << DNODE_SHIFT, tx);
 	} else if (drro->drr_type != doi.doi_type ||
-	    drro->drr_blksz != doi.doi_data_block_size ||
+	    drro->drr_blksz < doi.doi_data_block_size ||
+	    (drro->drr_blksz > doi.doi_data_block_size &&
+	    doi.doi_max_offset > doi.doi_data_block_size) ||
 	    drro->drr_bonustype != doi.doi_bonus_type ||
 	    drro->drr_bonuslen != doi.doi_bonus_size) {
 		/* currently allocated, but with different properties */
@@ -2210,6 +2213,10 @@ receive_object(struct receive_writer_arg *rwa, str
 		err = dmu_object_reclaim(rwa->os, drro->drr_object,
 		    drro->drr_type, drro->drr_blksz,
 		    drro->drr_bonustype, drro->drr_bonuslen, tx);
+	} else if (drro->drr_blksz != doi.doi_data_block_size) {
+		/* currently allocated, just with one block of different size */
+		err = dmu_object_set_blocksize(rwa->os, drro->drr_object,
+		    drro->drr_blksz, 0, tx);
 	}
 	if (err != 0) {
 		dmu_tx_commit(tx);

-- 
Alexander Motin

------------------------------------------
openzfs: openzfs-developer
Permalink: https://openzfs.topicbox.com/groups/developer/T1be15fbaaa263b00-M8a12865ba14f30126f6200ca