Łukasz wrote:
How it got that way, I couldn't really say without looking at your code.

It works like this:
...
we set max_txg
  ba.max_txg = (spa_get_dsl(filesystem->os->os_spa))->dp_tx.tx_synced_txg;

So, how do you send the initial stream? Presumably you need to do it with ba.max_txg = 0? If, say, the first 320MB were written before your first ba.max_txg, then you wouldn't be sending that data, thus explaining the behavior you're seeing.

It seems to me that your algorithm is fundamentally flawed -- if the filesystem is changing, it will not result in a consistent (from the ZPL's point of view) filesystem. For example:

There are two directories, A and B.  You last sent txg 10.

In txg 13, a file is renamed from directory A to directory B.

It is now txg 15, and you begin traversing to do a send, from txg 10 -> 15.

While that's in progress, a new file is created in directory A, and synced out in txg 16.

When you visit directory A, you see that its birth time is 16 > 15, so you don't send it. When you visit directory B, you see that its birth time is 13 <= 15 so you send it.

Now the other side has two links to the file, when it should have one.
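
The scenario above can be replayed with a toy model. This is only a sketch, not ZFS code: toy_dir_t, send_dir() and links_on_receiver() are hypothetical names made up for illustration, a "directory" is just a birth txg plus a list of names, and the send rule is assumed to be "send if last_sent < birth <= max_txg". The file is called "f" and the newly created file "g".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_ENTRIES 4

/* Toy directory (hypothetical, not a ZFS structure). */
typedef struct {
    uint64_t birth_txg;              /* txg in which it was last rewritten */
    int nentries;
    const char *entries[MAX_ENTRIES];
} toy_dir_t;

static void dir_add(toy_dir_t *d, const char *name, uint64_t txg)
{
    assert(d->nentries < MAX_ENTRIES);
    d->entries[d->nentries++] = name;
    d->birth_txg = txg;              /* any change rewrites the directory */
}

static void dir_remove(toy_dir_t *d, const char *name, uint64_t txg)
{
    for (int i = 0; i < d->nentries; i++) {
        if (strcmp(d->entries[i], name) == 0) {
            d->entries[i] = d->entries[--d->nentries];
            break;
        }
    }
    d->birth_txg = txg;
}

static int dir_count(const toy_dir_t *d, const char *name)
{
    int n = 0;
    for (int i = 0; i < d->nentries; i++)
        if (strcmp(d->entries[i], name) == 0)
            n++;
    return n;
}

/* Assumed send rule: copy the directory only if min_txg < birth <= max_txg. */
static void send_dir(const toy_dir_t *src, toy_dir_t *dst,
    uint64_t min_txg, uint64_t max_txg)
{
    if (src->birth_txg > min_txg && src->birth_txg <= max_txg)
        *dst = *src;
}

/* Returns how many links to "f" the receiver ends up with. */
int links_on_receiver(bool use_snapshot)
{
    toy_dir_t a = {0}, b = {0};
    dir_add(&a, "f", 5);             /* f starts out in directory A */
    b.birth_txg = 5;
    toy_dir_t recv_a = a, recv_b = b;    /* receiver state as of txg 10 */

    dir_remove(&a, "f", 13);         /* txg 13: rename f from A ... */
    dir_add(&b, "f", 13);            /* ... to B */

    toy_dir_t snap_a = a, snap_b = b;    /* txg 15: frozen view */

    dir_add(&a, "g", 16);            /* txg 16: g created during traversal */

    send_dir(use_snapshot ? &snap_a : &a, &recv_a, 10, 15);
    send_dir(use_snapshot ? &snap_b : &b, &recv_b, 10, 15);

    return dir_count(&recv_a, "f") + dir_count(&recv_b, "f");
}
```

In this model links_on_receiver(false), the live traversal, returns 2: A is skipped because its birth txg is now 16, so the receiver keeps its stale copy of A that still lists f. links_on_receiver(true), which traverses the frozen txg 15 view, returns 1.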

Given that you don't actually have the data from txg 15 (because you didn't take a snapshot), I don't see how you could make this work.

(Also FYI, traversing changing filesystems in this way will almost certainly break once we rewrite as part of the pool space reduction work.)

--matt
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
