Re: [zfs-discuss] zfs send/receive script

2012-03-09 Thread Ian Collins

On 03/10/12 02:48 AM, Cameron Hanover wrote:

On Mar 6, 2012, at 8:26 AM, Carsten John wrote:


Hello everybody,

I set up a script to replicate all zfs filesystems (some 300 user home directories in
this case) within a given pool to a "mirror" machine. The basic idea is to send
the snapshots incrementally if the corresponding snapshot exists on the remote side, or send
a complete snapshot if no corresponding previous snapshot is available.

The setup basically works, but from time to time (within a run over all
filesystems) I get error messages like:

"cannot receive new filesystem stream: dataset is busy" or

"cannot receive incremental filesystem stream: dataset is busy"

I've seen similar error messages from a script I've written, as well.  Mine 
does create a lock file and won't run if a `zfs send` is already in progress.
My only guess is that the second (or third, or...) filesystem starts sending to 
the receiving host before the latter has fully finished the `zfs recv` process. 
 I've considered putting a 5 second pause between successive processes, but the 
errors are intermittent enough that it's pretty low on my to-do list.


I have also seen the same issue (a long time ago) and the application I 
use for replication still has a one second pause between sends to "fix"  
the problem.
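
For illustration only, a minimal Python sketch (not from the thread) of the pause-between-sends approach described above; the dataset names, snapshot label and target host are placeholders, not taken from the original script:

    #!/usr/bin/env python3
    """Serialize zfs send/receive across datasets, sleeping briefly between
    them so each remote receive can settle before the next stream starts."""
    import subprocess
    import time

    DATASETS = ["tank/home/alice", "tank/home/bob"]  # placeholder dataset names
    SNAP = "backup-2012-03-09"                       # placeholder snapshot label
    TARGET = "mirrorhost"                            # placeholder receive host

    for ds in DATASETS:
        # Equivalent of: zfs send ds@SNAP | ssh TARGET zfs receive -F ds
        send = subprocess.Popen(["zfs", "send", f"{ds}@{SNAP}"],
                                stdout=subprocess.PIPE)
        subprocess.run(["ssh", TARGET, "zfs", "receive", "-F", ds],
                       stdin=send.stdout, check=True)
        send.stdout.close()
        send.wait()
        time.sleep(1)  # brief pause between successive sends, as described above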


--
Ian.



Re: [zfs-discuss] zfs send/receive script

2012-03-09 Thread Cameron Hanover
I've seen similar error messages from a script I've written, as well.  Mine 
does create a lock file and won't run if a `zfs send` is already in progress.
My only guess is that the second (or third, or...) filesystem starts sending to 
the receiving host before the latter has fully finished the `zfs recv` process. 
 I've considered putting a 5 second pause between successive processes, but the 
errors are intermittent enough that it's pretty low on my to-do list.
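
For illustration, a minimal Python sketch (not from the thread) of the lock-file guard described above, so a run started while another is still in progress exits immediately; the lock file path is an arbitrary placeholder:

    #!/usr/bin/env python3
    """Run guard: exit early if another replication run already holds the lock."""
    import fcntl
    import sys

    LOCKFILE = "/var/tmp/zfs-replicate.lock"  # placeholder path

    lock = open(LOCKFILE, "w")
    try:
        # Non-blocking exclusive lock; raises immediately if another run holds it.
        fcntl.flock(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        sys.exit("another replication run is still in progress")

    # ... perform the zfs send/receive loop here ...
    # The lock is released automatically when this process exits.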

-
Cameron Hanover
chano...@umich.edu

"They that can give up essential liberty to obtain a little temporary safety 
deserve neither liberty nor safety." 
--Benjamin Franklin

On Mar 6, 2012, at 8:26 AM, Carsten John wrote:

> Hello everybody,
> 
> I set up a script to replicate all zfs filesystems (some 300 user home
> directories in this case) within a given pool to a "mirror" machine. The
> basic idea is to send the snapshots incrementally if the corresponding snapshot
> exists on the remote side, or send a complete snapshot if no corresponding
> previous snapshot is available.
> 
> The setup basically works, but from time to time (within a run over all
> filesystems) I get error messages like:
> 
> "cannot receive new filesystem stream: dataset is busy" or
> 
> "cannot receive incremental filesystem stream: dataset is busy"
> 
> The complete script is available under:
> 
> http://pastebin.com/AWevkGAd
> 
> 
> Does anybody have a suggestion what might cause the dataset to be busy?
> 
> 
> 
> thx
> 
> 
> Carsten


Re: [zfs-discuss] zfs send/receive script

2012-03-06 Thread Paul Kraus
On Tue, Mar 6, 2012 at 10:19 AM, Edward Ned Harvey
 wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Carsten John



>> "cannot receive new filesystem stream: dataset is busy" or
>>
>> "cannot receive incremental filesystem stream: dataset is busy"
>>
>> Does anybody have a suggestion what might cause the dataset to be busy?



> What else could cause it to be busy?  A receive is definitely one.  A scrub
> or a resilver - maybe, I'm not sure.  But my best guess is that only a
> receive would do this to you.

I have NOT seen issues receiving a zfs send while a scrub was running.
I am at zpool version 22.

Always implement locking on tasks that should be single threaded (like
zfs send / zfs recv on a given dataset).
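
Locking handles overlap between runs; for the intermittent "dataset is busy" failures within a single run, here is a hedged Python sketch (not from the thread) that retries a receive a few times before giving up. The function, retry counts and ssh invocation are illustrative choices, not taken from Carsten's script:

    #!/usr/bin/env python3
    """Retry an incremental send/receive when the target reports 'dataset is busy'."""
    import subprocess
    import time

    def replicate(dataset, prev_snap, new_snap, host, retries=3, delay=5):
        for attempt in range(retries):
            send = subprocess.Popen(["zfs", "send", "-i", prev_snap, new_snap],
                                    stdout=subprocess.PIPE)
            recv = subprocess.run(["ssh", host, "zfs", "receive", "-F", dataset],
                                  stdin=send.stdout,
                                  stderr=subprocess.PIPE, text=True)
            send.stdout.close()
            send.wait()
            if recv.returncode == 0:
                return
            if "dataset is busy" in recv.stderr and attempt < retries - 1:
                time.sleep(delay)  # give the previous receive time to finish
                continue
            raise RuntimeError(recv.stderr.strip())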

-- 
{1-2-3-4-5-6-7-}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company (
http://www.sloctheater.org/ )
-> Technical Advisor, Troy Civic Theatre Company
-> Technical Advisor, RPI Players


Re: [zfs-discuss] zfs send/receive script

2012-03-06 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Carsten John
> 
> I set up a script to replicate all zfs filesystems (some 300 user home
> directories in this case) within a given pool to a "mirror" machine. The
> basic idea is to send the snapshots incrementally if the corresponding snapshot
> exists on the remote side, or send a complete snapshot if no corresponding
> previous snapshot is available.
> 
> The setup basically works, but from time to time (within a run over all
> filesystems) I get error messages like:
> 
> "cannot receive new filesystem stream: dataset is busy" or
> 
> "cannot receive incremental filesystem stream: dataset is busy"
> 
> Does anybody have a suggestion what might cause the dataset to be busy?

Usually a dataset is "busy" when it's in the middle of receiving another
stream, or it thinks it is.  I haven't read your script, but I bet you don't
set a flag to indicate you're already running, and I bet you're running your
script via cron, and sometimes it takes longer to complete than the amount
of time between cron tasks, right?  Just an educated guess.  But even if I'm
wrong about your cron schedule, it's still a really good guess about the
root cause of your problem.

What else could cause it to be busy?  A receive is definitely one.  A scrub
or a resilver - maybe, I'm not sure.  But my best guess is that only a
receive would do this to you.

On some versions, there was a bug.  If the system crashed mid-receive, it
would keep the partially received clone indefinitely, breaking all future
receives, until you destroy that faulty clone.  This problem has been fixed,
assuming you've applied updates in the last year or so.  This does not match
the behavior you're seeing, does it?  If so, we'll tell you how to destroy
the hidden clone.  (And you should apply updates.)


