Re: running mksnap_ffs

2007-01-17 Thread Willem Jan Withagen

Kris Kennaway wrote:

Or waiting until the snapshot operation finishes.  You (still) haven't
determined that it's actually hanging as opposed to just waiting for
the snapshot operation to finish.


Just upgraded to 6.2-STABLE, and I must say that things are a LOT better:

 - It did return a prompt
 - and ran for about 30 minutes on 1.5T
 - Performance gets a little sluggish.
 - Playing an MP3 off a Samba share missed only one beat.

All in all, it is usable for taking snapshots overnight.

So that makes me a happy camper...

--WjW



___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: running mksnap_ffs

2007-01-17 Thread Kris Kennaway
On Wed, Jan 17, 2007 at 02:48:18PM +0100, Willem Jan Withagen wrote:
 Kris Kennaway wrote:
 Or waiting until the snapshot operation finishes.  You (still) haven't
 determined that it's actually hanging as opposed to just waiting for
 the snapshot operation to finish.
 
 Just upgraded to 6.2-STABLE, and I must say that things are a LOT better:
 
  - It did return a prompt
  - and ran for about 30 minutes on 1,5T
  - Performance gets a little slugish.
  - Playing MP3 of a samba share missed only one beat.
 
 In all it is usable to take snapshots over night.
 
 So that makes me a happy camper...

That's good - although I don't remember you mentioning you weren't
running 6.2, or I would have told you to do this immediately to pick
up the various fixes made over the past N months.

Kris




Re: running mksnap_ffs

2007-01-17 Thread Willem Jan Withagen

Kris Kennaway wrote:

On Wed, Jan 17, 2007 at 02:48:18PM +0100, Willem Jan Withagen wrote:

Kris Kennaway wrote:

Or waiting until the snapshot operation finishes.  You (still) haven't
determined that it's actually hanging as opposed to just waiting for
the snapshot operation to finish.

Just upgraded to 6.2-STABLE, and I must say that things are a LOT better:

 - It did return a prompt
 - and ran for about 30 minutes on 1,5T
 - Performance gets a little slugish.
 - Playing MP3 of a samba share missed only one beat.

In all it is usable to take snapshots over night.

So that makes me a happy camper...


That's good - although I don't remember you mentioning you weren't
running 6.2, or I would have told you to do this immediately to pick
up the various fixes made over the past N months.


Actually I was running 6.2-something, where something was probably from around 
mid-September. So I've now caught up with that. I did indicate this when 
the discussion started, but that was already a long time ago.


--WjW




Re: running mksnap_ffs

2007-01-16 Thread Doug Ambrisko
Scott Oertel writes:
| Kris Kennaway wrote:
|  On Tue, Jan 02, 2007 at 09:06:24PM +0100, Willem Jan Withagen wrote:
|
|  Hi,
| 
|  I got the following Filesystem:
|  Filesystem  Size  Used  Avail  Capacity  iused   ifree      %iused
|  /dev/da0a   1.3T  422G  823G   34%       565952  182833470  0%
| 
|  Running off a 3ware 9550, on a dual core Opteron 242 with 1GB.
|  The system is used as an SMB/NFS server for my other systems here.
| 
|  I would like to make weekly snapshots, but manually running mksnap_ffs 
|  freezes access to the disk (I sort of expected that) and the process 
|  never terminates. So I let it sit overnight, but looking at gstat did not 
|  reveal any activity whatsoever...
|  The disk was not released, and mksnap_ffs could not be terminated.
|  In the end I had to reboot the system.
| 
|  So:
|   - How long should I expect making a snapshot to take:
| 5, 15, 30min, 1, 2 hour or even more???
| 
|  Yes :) Snapshots were not designed for use in this way (they were
|  designed to support background fsck and allow faster system recovery
|  after power failure), so they don't scale as well as you might like on
|  large filesystems.
| 
| If snapshots were designed to support background fsck, then why did they 
| not make it more scalable? If you can't create a snapshot without the 
| system locking up, that means fsck won't be able to either, making 
| background fsck worthless for systems with large storage.

FWIW, with this patch I find making snapshots a lot more reliable:

--- sys/ufs/ffs/ffs_snapshot.c.orig Wed Mar 22 09:42:31 2006
+++ sys/ufs/ffs/ffs_snapshot.c  Mon Nov 20 14:59:13 2006
@@ -282,6 +282,8 @@ restart:
         if (error)
             goto out;
         bawrite(nbp);
+        if (cg % 10 == 0)
+            ffs_syncvnode(vp, MNT_WAIT);
     }
     /*
      * Copy all the cylinder group maps. Although the
@@ -303,6 +305,8 @@ restart:
             goto out;
         error = cgaccount(cg, vp, nbp, 1);
         bawrite(nbp);
+        if (cg % 10 == 0)
+            ffs_syncvnode(vp, MNT_WAIT);
         if (error)
             goto out;
     }

Without it, things can get wedged.  We have some other patches as well that
might be required.  As a hack on a local server we have been using snapshots
to do a hot backup of a database each morning.  This is based on
6.x.

Doug A.


Re: running mksnap_ffs

2007-01-16 Thread Kris Kennaway
On Tue, Jan 16, 2007 at 10:13:57AM -0800, Doug Ambrisko wrote:

 FWIW, with this patch I find making snap-shots a lot more reliable:
 
 --- sys/ufs/ffs/ffs_snapshot.c.orig   Wed Mar 22 09:42:31 2006
 +++ sys/ufs/ffs/ffs_snapshot.cMon Nov 20 14:59:13 2006
 @@ -282,6 +282,8 @@ restart:
   if (error)
   goto out;
   bawrite(nbp);
 + if (cg % 10 == 0)
 + ffs_syncvnode(vp, MNT_WAIT);
   }
   /*
* Copy all the cylinder group maps. Although the
 @@ -303,6 +305,8 @@ restart:
   goto out;
   error = cgaccount(cg, vp, nbp, 1);
   bawrite(nbp);
 + if (cg % 10 == 0)
 + ffs_syncvnode(vp, MNT_WAIT);
   if (error)
   goto out;
   }
 
 or things can get wedged.  We have some other patches as well that might
 be required.  As a hack on a local server we have been using snap shots
 to do a hot back-up of a data base each morning.  This is based on
 6.x.

What do you mean by get wedged?  Are you seeing a deadlock, and if
so then what are the details?  When you say 6.x, do you mean
up-to-date RELENG_6?  There were various snapshot deadlock fixes
committed over the past year including some in the past few months.

Kris




Re: running mksnap_ffs

2007-01-16 Thread Doug Ambrisko
Kris Kennaway writes:
| On Tue, Jan 16, 2007 at 10:13:57AM -0800, Doug Ambrisko wrote:
|
|  FWIW, with this patch I find making snap-shots a lot more reliable:
| 
|  --- sys/ufs/ffs/ffs_snapshot.c.orig Wed Mar 22 09:42:31 2006
|  +++ sys/ufs/ffs/ffs_snapshot.c  Mon Nov 20 14:59:13 2006
|  @@ -282,6 +282,8 @@ restart:
|  if (error)
|  goto out;
|  bawrite(nbp);
|  +   if (cg % 10 == 0)
|  +   ffs_syncvnode(vp, MNT_WAIT);
|  }
|  /*
|   * Copy all the cylinder group maps. Although the
|  @@ -303,6 +305,8 @@ restart:
|  goto out;
|  error = cgaccount(cg, vp, nbp, 1);
|  bawrite(nbp);
|  +   if (cg % 10 == 0)
|  +   ffs_syncvnode(vp, MNT_WAIT);
|  if (error)
|  goto out;
|  }
| 
|  or things can get wedged.  We have some other patches as well that might
|  be required.  As a hack on a local server we have been using snap shots
|  to do a hot back-up of a data base each morning.  This is based on
|  6.x.
|
| What do you mean by get wedged?  Are you seeing a deadlock, and if
| so then what are the details?  When you say 6.x, do you mean
| up-to-date RELENG_6?  There were various snapshot deadlock fixes
| committed over the past year including some in the past few months.

The file system would come to a stop: processes stuck on bio, snapshots
not finishing, etc.  This was caused by the system running out of usable
buffers.  The change forces them to be flushed every so often.  This is
independent of locking.  Flushing every 10 cylinder groups might be too
aggressive.  Some scaling by nbuf would probably be better.

Doug A.


Re: running mksnap_ffs

2007-01-16 Thread Kris Kennaway
On Tue, Jan 16, 2007 at 09:26:47PM +0100, Willem Jan Withagen wrote:
 Doug Ambrisko wrote:
 |  or things can get wedged.  We have some other patches as well that 
 might
 |  be required.  As a hack on a local server we have been using snap shots
 |  to do a hot back-up of a data base each morning.  This is based on
 |  6.x.
 |
 | What do you mean by get wedged?  Are you seeing a deadlock, and if
 | so then what are the details?  When you say 6.x, do you mean
 | up-to-date RELENG_6?  There were various snapshot deadlock fixes
 | committed over the past year including some in the past few months.
 
 The file-system would come to a stop, processes stuck on bio, snap-shots
 not finishing etc.  This was caused by the system running out of usable
 buffers.  The change forces them to be flushed every so often.  This is
 independant of locking.  10 might be to aggresive.  Some scaling of
 nbuf would probably be better.
 
 When I run mksnap_ffs it runs to the point where ANY access to the 
 filesystem gives that process a lockup.

Yes, that is expected.  Actually it begins when something accesses the
directory in which the snapshot is being made, since that causes the
parent directory to be locked...then something tries to access the
parent directory, which eventually cascades back to the root.

 Getting the file system back is only thru hard reboot. Trying to do it 
 the gentle way locks the whole system.

Or waiting until the snapshot operation finishes.  You (still) haven't
determined that it's actually hanging as opposed to just waiting for
the snapshot operation to finish.

Kris



Re: running mksnap_ffs

2007-01-16 Thread Willem Jan Withagen

Kris Kennaway wrote:
..


The file-system would come to a stop, processes stuck on bio, snap-shots
not finishing etc.  This was caused by the system running out of usable
buffers.  The change forces them to be flushed every so often.  This is
independant of locking.  10 might be to aggresive.  Some scaling of
nbuf would probably be better.
When I run mksnap_ffs it runs to the point where ANY access to the 
filesystem gives that process a lockup.


Yes, that is expected.  Actually it begins when something accesses the
directory in which the snapshot is being made, since that causes the
parent directory to be locked...then something tries to access the
parent directory, which eventually cascades back to the root.

Getting the file system back is only thru hard reboot. Trying to do it 
the gentle way locks the whole system.


Or waiting until the snapshot operation finishes.  You (still) haven't
determined that it's actually hanging as opposed to just waiting for
the snapshot operation to finish.


True, and that is what I was referring to.

* I've let it run for 12 hours on 1.5T (that's why I asked for other
experiences)
* I looked at disk stats with gstat:
it turned out that everything was idle for > 5 minutes

Then I concluded that it was locked.

If you can give me a fair estimate of time < 1 day, I'll be willing to let it 
sit for that long. But I'm not going to wait forever. :)


--WjW


Re: running mksnap_ffs

2007-01-16 Thread Willem Jan Withagen

Doug Ambrisko wrote:

|  or things can get wedged.  We have some other patches as well that might
|  be required.  As a hack on a local server we have been using snap shots
|  to do a hot back-up of a data base each morning.  This is based on
|  6.x.
|
| What do you mean by get wedged?  Are you seeing a deadlock, and if
| so then what are the details?  When you say 6.x, do you mean
| up-to-date RELENG_6?  There were various snapshot deadlock fixes
| committed over the past year including some in the past few months.

The file-system would come to a stop, processes stuck on bio, snap-shots
not finishing etc.  This was caused by the system running out of usable
buffers.  The change forces them to be flushed every so often.  This is
independant of locking.  10 might be to aggresive.  Some scaling of
nbuf would probably be better.


When I run mksnap_ffs it runs to the point where ANY access to the filesystem 
locks up the accessing process.
The only way to get the file system back is a hard reboot. Trying to do it the 
gentle way locks the whole system.


I'm deferring further testing and experimenting until I have more time to upgrade to 
6.2-RELEASE and put some of the debug options in the kernel.


On the other hand, this is my main file server, so testing too much is sort of 
dangerous, and running an fsck on 1.5T is very tedious.


--WjW



Re: running mksnap_ffs

2007-01-16 Thread Doug Ambrisko
Kris Kennaway writes:
| On Tue, Jan 16, 2007 at 09:26:47PM +0100, Willem Jan Withagen wrote:
|  Doug Ambrisko wrote:
|  |  or things can get wedged.  We have some other patches as well that 
|  might
|  |  be required.  As a hack on a local server we have been using snap shots
|  |  to do a hot back-up of a data base each morning.  This is based on
|  |  6.x.
|  |
|  | What do you mean by get wedged?  Are you seeing a deadlock, and if
|  | so then what are the details?  When you say 6.x, do you mean
|  | up-to-date RELENG_6?  There were various snapshot deadlock fixes
|  | committed over the past year including some in the past few months.
|  
|  The file-system would come to a stop, processes stuck on bio, snap-shots
|  not finishing etc.  This was caused by the system running out of usable
|  buffers.  The change forces them to be flushed every so often.  This is
|  independant of locking.  10 might be to aggresive.  Some scaling of
|  nbuf would probably be better.
|  
|  When I run mksnap_ffs it runs to the point where ANY access to the 
|  filesystem gives that process a lockup.
| 
| Yes, that is expected.  Actually it begins when something accesses the
| directory in which the snapshot is being made, since that causes the
| parent directory to be locked...then something tries to access the
| parent directory, which eventually cascades back to the root.
| 
|  Getting the file system back is only thru hard reboot. Trying to do it 
|  the gentle way locks the whole system.
| 
| Or waiting until the snapshot operation finishes.  You (still) haven't
| determined that it's actually hanging as opposed to just waiting for
| the snapshot operation to finish.

In my case it was easy to see that all the buffers were exhausted and
the system was churning waiting for some to become available.  Since they
were all used up it never recovered.  By sync'ing the buffers they got
cleaned up and then the system never ran out.  The snapshot was then
able to finish.  I traced this problem in the debugger, where you can
see it happen.  There are other issues with the buffer
daemon as well.  We hit these since we run with a relatively low
nbuf.  The buffers can get so badly fragmented that it can't flush
things since it can't get a full-size buffer.  Another problem is that
it can end up waiting on itself since the current code can't use
its emergency space to flush stuff.  You can see this via ps etc.
It's not a good thing if the buffer daemon is waiting on itself :-(

We have patches for this as well but they need some more work.  I was
working with Tor on this but then I got swamped at work with our 4.X to 6.X
and platform transition.  All I can say is that we don't suffer from
these problems now :-)  I have printfs that log this stuff when some of
these bugs are hit.  Now the system survives those lock-up points.

Doug A.


Re: running mksnap_ffs

2007-01-16 Thread Kris Kennaway
On Tue, Jan 16, 2007 at 01:17:33PM -0800, Doug Ambrisko wrote:
 Kris Kennaway writes:
 | On Tue, Jan 16, 2007 at 09:26:47PM +0100, Willem Jan Withagen wrote:
 |  Doug Ambrisko wrote:
 |  |  or things can get wedged.  We have some other patches as well that 
 |  might
 |  |  be required.  As a hack on a local server we have been using snap 
 shots
 |  |  to do a hot back-up of a data base each morning.  This is based on
 |  |  6.x.
 |  |
 |  | What do you mean by get wedged?  Are you seeing a deadlock, and if
 |  | so then what are the details?  When you say 6.x, do you mean
 |  | up-to-date RELENG_6?  There were various snapshot deadlock fixes
 |  | committed over the past year including some in the past few months.
 |  
 |  The file-system would come to a stop, processes stuck on bio, snap-shots
 |  not finishing etc.  This was caused by the system running out of usable
 |  buffers.  The change forces them to be flushed every so often.  This is
 |  independant of locking.  10 might be to aggresive.  Some scaling of
 |  nbuf would probably be better.
 |  
 |  When I run mksnap_ffs it runs to the point where ANY access to the 
 |  filesystem gives that process a lockup.
 | 
 | Yes, that is expected.  Actually it begins when something accesses the
 | directory in which the snapshot is being made, since that causes the
 | parent directory to be locked...then something tries to access the
 | parent directory, which eventually cascades back to the root.
 | 
 |  Getting the file system back is only thru hard reboot. Trying to do it 
 |  the gentle way locks the whole system.
 | 
 | Or waiting until the snapshot operation finishes.  You (still) haven't
 | determined that it's actually hanging as opposed to just waiting for
 | the snapshot operation to finish.
 
 In my case is was easy to see that all the buffers were exhausted and
 the system was churning waiting for some to become available.  Since they
 were all used up it never recovered.  By sync'ing the buffers they got
 cleaned up and then the system never ran out.  The snap shot was then
 able to finish.  Via the debugger you can see this happen.  I traced
 this problem in the debugger.  There are other issues with the buffer
 deamon as well.  We hit these since we run with a relatively low
 nbuf.  The buffers can be get frag'ed so bad that it can't flush
 things since it can't get a full-size buffer.  Another problem is that
 it can end up waiting on itself since the current code can't use
 it's emergency space to flush stuff.  You can see this via ps etc.
 It's not a good thing if the buffer daemon is waiting on itself :-(
 
 We have patches to this as well but they need some more work.  I was
 working with Tor, on this but then I got swamped at work with our 4.X - 6.X
 and platform transition.  All I can say is that we don't suffer from
 these problems now :-)  I have printf's the log this stuff when some of
 these bugs are hit.  Now the system survives those lock-up points.

Thanks for clarifying.  Hopefully you and Tor can get something
committed soon!

Kris




Re: running mksnap_ffs

2007-01-16 Thread Kris Kennaway
On Tue, Jan 16, 2007 at 09:55:00PM +0100, Willem Jan Withagen wrote:
 Kris Kennaway wrote:
 ..
 
 The file-system would come to a stop, processes stuck on bio, snap-shots
 not finishing etc.  This was caused by the system running out of usable
 buffers.  The change forces them to be flushed every so often.  This is
 independant of locking.  10 might be to aggresive.  Some scaling of
 nbuf would probably be better.
 When I run mksnap_ffs it runs to the point where ANY access to the 
 filesystem gives that process a lockup.
 
 Yes, that is expected.  Actually it begins when something accesses the
 directory in which the snapshot is being made, since that causes the
 parent directory to be locked...then something tries to access the
 parent directory, which eventually cascades back to the root.
 
 Getting the file system back is only thru hard reboot. Trying to do it 
 the gentle way locks the whole system.
 
 Or waiting until the snapshot operation finishes.  You (still) haven't
 determined that it's actually hanging as opposed to just waiting for
 the snapshot operation to finish.
 
 True, and that is what I was refering to.
 
 * I've let it run for 12 hours on 1,5T (that's why I asked for other
   experiences)
 * I looked at diskstats with gstat:
   that turned out that everything was idle for  5 minutes
 
 Then I concluded that it was locked.

OK, that does sound like it's deadlocked.  You could try Doug's patch,
or it might be another (unknown) condition.  If so, you'll need to do
some additional debugging with a serial console to figure out what is
wrong.

Kris




Re: running mksnap_ffs

2007-01-16 Thread Doug Ambrisko
Kris Kennaway writes:
| Thanks for clarifying.  Hopefully you and Tor can get something
| committed soon!

I'm not sure about that.  I have to see what has changed since then.
That was ... uhm a year ago when I dropped the ball.

It's probably a good task for me to look at in the context of -current
again.  I should have disks to build a 1.5T file system to play with.

Doug A.


Re: running mksnap_ffs

2007-01-11 Thread Kris Kennaway
On Tue, Jan 02, 2007 at 09:06:24PM +0100, Willem Jan Withagen wrote:
 Hi,
 
 I got the following Filesystem:
 Filesystem  Size  Used  Avail  Capacity  iused   ifree      %iused
 /dev/da0a   1.3T  422G  823G   34%       565952  182833470  0%
 
 Running off a 3ware 9550, on a dual core Opteron 242 with 1GB.
 The system is used as an SMB/NFS server for my other systems here.
 
 I would like to make weekly snapshots, but manually running mksnap_ffs 
 freezes access to the disk (I sort of expected that) and the process 
 never terminates. So I let it sit overnight, but looking at gstat did not 
 reveal any activity whatsoever...
 The disk was not released, and mksnap_ffs could not be terminated.
 In the end I had to reboot the system.
 
 So:
  - How long should I expect making a snapshot to take:
   5, 15, 30min, 1, 2 hour or even more???

Yes :) Snapshots were not designed for use in this way (they were
designed to support background fsck and allow faster system recovery
after power failure), so they don't scale as well as you might like on
large filesystems.

Kris




Re: running mksnap_ffs

2007-01-11 Thread Scott Oertel

Kris Kennaway wrote:

On Tue, Jan 02, 2007 at 09:06:24PM +0100, Willem Jan Withagen wrote:
  

Hi,

I got the following Filesystem:
Filesystem  Size  Used  Avail  Capacity  iused   ifree      %iused
/dev/da0a   1.3T  422G  823G   34%       565952  182833470  0%


Running off a 3ware 9550, on a dual core Opteron 242 with 1GB.
The system is used as an SMB/NFS server for my other systems here.

I would like to make weekly snapshots, but manually running mksnap_ffs 
freezes access to the disk (I sort of expected that) and the process 
never terminates. So I let it sit overnight, but looking at gstat did not 
reveal any activity whatsoever...

The disk was not released, and mksnap_ffs could not be terminated.
In the end I had to reboot the system.

So:
 - How long should I expect making a snapshot to take:
5, 15, 30min, 1, 2 hour or even more???



Yes :) Snapshots were not designed for use in this way (they were
designed to support background fsck and allow faster system recovery
after power failure), so they don't scale as well as you might like on
large filesystems.

Kris
  



If snapshots were designed to support background fsck, then why did they 
not make it more scalable? If you can't create a snapshot without the 
system locking up, that means fsck won't be able to either, making 
background fsck worthless for systems with large storage.





Re: running mksnap_ffs

2007-01-11 Thread Kris Kennaway
On Thu, Jan 11, 2007 at 11:25:34AM -0800, Scott Oertel wrote:
 Kris Kennaway wrote:
 On Tue, Jan 02, 2007 at 09:06:24PM +0100, Willem Jan Withagen wrote:
   
 Hi,
 
 I got the following Filesystem:
 Filesystem  Size  Used  Avail  Capacity  iused   ifree      %iused
 /dev/da0a   1.3T  422G  823G   34%       565952  182833470  0%
 
 Running off a 3ware 9550, on a dual core Opteron 242 with 1GB.
 The system is used as an SMB/NFS server for my other systems here.
 
 I would like to make weekly snapshots, but manually running mksnap_ffs 
 freezes access to the disk (I sort of expected that) and the process 
 never terminates. So I let it sit overnight, but looking at gstat did not 
 reveal any activity whatsoever...
 The disk was not released, and mksnap_ffs could not be terminated.
 In the end I had to reboot the system.
 
 So:
  - How long should I expect making a snapshot to take:
 5, 15, 30min, 1, 2 hour or even more???
 
 
 Yes :) Snapshots were not designed for use in this way (they were
 designed to support background fsck and allow faster system recovery
 after power failure), so they don't scale as well as you might like on
 large filesystems.
 
 Kris
   
 
 
 If snapshots were designed to support background fsck, then why did they 
 not make it more scalable? If you can't create a snapshot without the 
 system locking up, that means fsck won't be able to either, making 
 background fsck worthless for systems with large storage.

locking up != taking a long time to complete.  You haven't
differentiated between those two situations yet.

Kris




Re: running mksnap_ffs

2007-01-11 Thread Scott Oertel

Kris Kennaway wrote:

On Thu, Jan 11, 2007 at 11:25:34AM -0800, Scott Oertel wrote:
  

Kris Kennaway wrote:


On Tue, Jan 02, 2007 at 09:06:24PM +0100, Willem Jan Withagen wrote:
 
  

Hi,

I got the following Filesystem:
Filesystem  Size  Used  Avail  Capacity  iused   ifree      %iused
/dev/da0a   1.3T  422G  823G   34%       565952  182833470  0%


Running off a 3ware 9550, on a dual core Opteron 242 with 1GB.
The system is used as an SMB/NFS server for my other systems here.

I would like to make weekly snapshots, but manually running mksnap_ffs 
freezes access to the disk (I sort of expected that) and the process 
never terminates. So I let it sit overnight, but looking at gstat did not 
reveal any activity whatsoever...

The disk was not released, and mksnap_ffs could not be terminated.
In the end I had to reboot the system.

So:
- How long should I expect making a snapshot to take:
5, 15, 30min, 1, 2 hour or even more???
   


Yes :) Snapshots were not designed for use in this way (they were
designed to support background fsck and allow faster system recovery
after power failure), so they don't scale as well as you might like on
large filesystems.

Kris
 
  
If snapshots were designed to support background fsck, then why did they 
not make it more scalable? If you can't create a snapshot without the 
system locking up, that means fsck won't be able to either, making 
background fsck worthless for systems with large storage.



locking up != taking a long time to complete.  You haven't
differentiated between those two situations yet.

Kris
  



It depends. Sometimes it just takes a really long time, during which the 
system is unresponsive and unstable, and sometimes it completely locks up. 
Does it make that much of a difference? In either case, snapshotting 
large drives is not very efficient, and can't be considered for 
background fsck or daily backup, which are the two main purposes of 
snapshots.



--Scott


Re: running mksnap_ffs

2007-01-11 Thread Kris Kennaway
On Thu, Jan 11, 2007 at 01:40:12PM -0800, Scott Oertel wrote:
 Kris Kennaway wrote:
 On Thu, Jan 11, 2007 at 11:25:34AM -0800, Scott Oertel wrote:
   
 Kris Kennaway wrote:
 
 On Tue, Jan 02, 2007 at 09:06:24PM +0100, Willem Jan Withagen wrote:
  
   
 Hi,
 
 I got the following Filesystem:
 Filesystem  Size  Used  Avail  Capacity  iused   ifree      %iused
 /dev/da0a   1.3T  422G  823G   34%       565952  182833470  0%
 
 Running off a 3ware 9550, on a dual core Opteron 242 with 1GB.
 The system is used as an SMB/NFS server for my other systems here.
 
 I would like to make weekly snapshots, but manually running mksnap_ffs 
 freezes access to the disk (I sort of expected that) and the process 
 never terminates. So I let it sit overnight, but looking at gstat did 
 not reveal any activity whatsoever...
 The disk was not released, and mksnap_ffs could not be terminated.
 In the end I had to reboot the system.
 
 So:
 - How long should I expect making a snapshot to take:
   5, 15, 30min, 1, 2 hour or even more???

 
 Yes :) Snapshots were not designed for use in this way (they were
 designed to support background fsck and allow faster system recovery
 after power failure), so they don't scale as well as you might like on
 large filesystems.
 
 Kris
  
   
 If snapshots were designed to support background fsck, then why did they 
 not make it more scalable? If you can't create a snapshot without the 
 system locking up, that means fsck won't be able to either, making 
 background fsck worthless for systems with large storage.
 
 
 locking up != taking a long time to complete.  You haven't
 differentiated between those two situations yet.
 
 Kris
   
 
 
 It depends, sometimes it just takes a really long time during which the 
 system is unresponsive and unstable, or it just completely locks up. 
 Does it make that much of a difference? in either case, snapshotting 
 large drives is not very efficient, and can't be considered for 
 background fsck, or daily backup. Which are the two main purposes of 
 snapshots.

Those are completely different situations, as I have tried to
emphasize.  If you are interested in proceeding to debug the deadlock
issue, please follow the directions in the developers handbook chapter
on kernel debugging (in particular obtain 'show lockedvnods' output
with DEBUG_VFS_LOCKS and DEBUG_LOCKS enabled).

kris




Re: running mksnap_ffs

2007-01-04 Thread Willem Jan Withagen

LI Xin wrote:

Willem Jan Withagen wrote:

Hi,

I got the following Filesystem:
Filesystem  Size  Used  Avail  Capacity  iused   ifree      %iused
/dev/da0a   1.3T  422G  823G   34%       565952  182833470  0%

Running off a 3ware 9550, on a dual core Opteron 242 with 1GB.
The system is used as an SMB/NFS server for my other systems here.

I would like to make weekly snapshots, but manually running mksnap_ffs
freezes access to the disk (I sort of expected that) and the process
never terminates. So I let it sit overnight, but looking at gstat did not
reveal any activity whatsoever...
The disk was not released, and mksnap_ffs could not be terminated.
In the end I had to reboot the system.

So:
 - How long should I expect making a snapshot to take:
5, 15, 30min, 1, 2 hour or even more???


This depends on how many cylinder groups you have.  If you have a lot of
large files, using newfs -b 32768 instead of the default settings
would speed up the process drastically.  Note that this might be
infeasible because you already have data on the disk.


Well, the disk is loaded with very different types of files.
It is my home file server and contains 10 years of email in Mailbox/ format, 
all types of development work, my complete ripped CD collection (and more), 
and next to that I've started to look at how I can network-stream my DVD collection.

So it depends on what you call a large file. :)


Another suggestion is to separate the volume into smaller slices; this
would reduce the impact.


I always seem to make the wrong sizes, run out of space, and start to symlink,
which drives me completely crazy. So this time I went for one big slice, but
making backups is now becoming a serious point of attention. :~}



BTW, our experience with a semi-full 1.3T volume is that the snapshot
would take about 1 hour on FreeBSD 5.x, but I doubt that it is
really comparable to your situation as the hardware is very different.


Can you give me an idea of what type of HW you're running, so I can guesstimate
from there? And does this mean that you don't have access to the volume for
about 1 hour?



 - How do I diagnose the reason why it is not terminating?


This might be somewhat complicated.  Check out the developers' handbook.


Done that, but mucking about on a system that important makes me hesitate.
Although not being able to make backups unnerves me too.


--WjW



Re: running mksnap_ffs

2007-01-03 Thread Kostik Belousov
On Wed, Jan 03, 2007 at 12:05:26AM +0100, Willem Jan Withagen wrote:
 Gary Palmer wrote:
 On Tue, Jan 02, 2007 at 09:06:24PM +0100, Willem Jan Withagen wrote:
 Hi,
 
 I got the following Filesystem:
 Filesystem    Size   Used  Avail Capacity   iused     ifree %iused
 /dev/da0a     1.3T   422G   823G      34%  565952 182833470     0%
 
 Running off a 3ware 9550, on a dual core Opteron 242 with 1GB.
 The system is used as SMB/NFS server for my other systems here.
 
 I would like to make weekly snapshots, but manually running mksnap_ffs
 freezes access to the disk (I sort of expected that) but the process
 never terminates. So I let it sit overnight, but looking at gstat did not
 reveal any activity whatsoever...
 The disk was not released, mksnap_ffs could not be terminated.
 And things resulted in me rebooting the system.
 
 So:
  - How long should I expect making a snapshot to take:
 5, 15, 30min, 1, 2 hour or even more???
  - How do I diagnose the reason why it is not terminating?
 
 You forgot to mention what revision of FreeBSD you are running, and
 if you are using quotas or anything else on the filesystem that
 could impact this.
 
 Yes, I pressed send somewhat too fast:
 
 [~] [EMAIL PROTECTED] uname -a
 FreeBSD bigsurf.digiware.nl 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #3: Wed Sep 27 15:57:20 CEST 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/BIGSURF amd64

See
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
for instructions on how to gather the information needed to debug the problem.




Re: running mksnap_ffs

2007-01-03 Thread LI Xin
Willem Jan Withagen wrote:
 Hi,
 
 I got the following Filesystem:
 Filesystem    Size   Used  Avail Capacity   iused     ifree %iused
 /dev/da0a     1.3T   422G   823G      34%  565952 182833470     0%
 
 Running off a 3ware 9550, on a dual core Opteron 242 with 1GB.
 The system is used as SMB/NFS server for my other systems here.
 
 I would like to make weekly snapshots, but manually running mksnap_ffs
 freezes access to the disk (I sort of expected that) but the process
 never terminates. So I let it sit overnight, but looking at gstat did not
 reveal any activity whatsoever...
 The disk was not released, mksnap_ffs could not be terminated.
 And things resulted in me rebooting the system.
 
 So:
  - How long should I expect making a snapshot to take:
 5, 15, 30min, 1, 2 hour or even more???

This depends on how many cylinder groups you have.  If you have a lot of
large files, using newfs -b 32768 instead of the default settings
would speed up the process drastically.  Note that this might be
infeasible because you already have data on the disk.

Another suggestion is to separate the volume into smaller slices; this
would reduce the impact.

BTW, our experience with a semi-full 1.3T volume is that the snapshot
would take about 1 hour on FreeBSD 5.x, but I doubt that it is
really comparable to your situation as the hardware is very different.

  - How do I diagnose the reason why it is not terminating?

This might be somewhat complicated.  Check out the developers' handbook.

Cheers,
-- 
Xin LI [EMAIL PROTECTED]  http://www.delphij.net/
FreeBSD - The Power to Serve!
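
To make the block-size suggestion above concrete, here is a rough sketch of how the number of cylinder groups might change with newfs -b. It rests on two assumptions that are not stated in the thread: that the per-group fragment bitmap has to fit inside one file system block (so a group can hold at most 8 * block_size fragments), and that the fragment size is one eighth of the block size. Real newfs output will show smaller groups than this bound because of other per-group bookkeeping, so the numbers are lower bounds on the group count, not predictions.

```python
# Rough lower bound on the number of cylinder groups for a volume.
# Assumption: a group holds at most 8 * block_size fragments, because
# its fragment bitmap must fit in a single file system block.

def max_group_bytes(block_size, frag_size):
    """Upper bound on the bytes covered by one cylinder group."""
    return 8 * block_size * frag_size

def min_cylinder_groups(volume_bytes, block_size, frag_size=None):
    """Lower bound on the number of groups needed to cover the volume."""
    if frag_size is None:
        frag_size = block_size // 8  # assumed 8:1 block-to-fragment ratio
    group = max_group_bytes(block_size, frag_size)
    return -(-volume_bytes // group)  # ceiling division

# The 1.3T volume from the question, taken as 1.3 * 2**40 bytes.
VOLUME = int(1.3 * 2**40)

for bsize in (16384, 32768):
    print(bsize, min_cylinder_groups(VOLUME, bsize))
```

Under these assumptions, doubling the block size cuts the lower bound on the group count to about a quarter, which is consistent with the claim above that the snapshot would have fewer groups to process.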





running mksnap_ffs

2007-01-02 Thread Willem Jan Withagen

Hi,

I got the following Filesystem:
Filesystem    Size   Used  Avail Capacity   iused     ifree %iused
/dev/da0a     1.3T   422G   823G      34%  565952 182833470     0%


Running off a 3ware 9550, on a dual core Opteron 242 with 1GB.
The system is used as SMB/NFS server for my other systems here.

I would like to make weekly snapshots, but manually running mksnap_ffs
freezes access to the disk (I sort of expected that) but the process
never terminates. So I let it sit overnight, but looking at gstat did not
reveal any activity whatsoever...

The disk was not released, mksnap_ffs could not be terminated.
And things resulted in me rebooting the system.

So:
 - How long should I expect making a snapshot to take:
5, 15, 30min, 1, 2 hour or even more???
 - How do I diagnose the reason why it is not terminating?

--WjW
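
As an illustration of the weekly snapshot described above, here is a small sketch that builds a dated snapshot name and the matching mksnap_ffs command line. Several things here are assumptions rather than details from the message: the mount point /data is hypothetical, the snapshot is placed in a .snap directory under the mount point, and mksnap_ffs is invoked in the two-argument form (mount point, then snapshot file). The local mksnap_ffs(8) manual page should be checked before relying on that form. The script only prints the command unless dry_run is turned off.

```python
import datetime
import subprocess

def snapshot_command(mountpoint, when):
    """Return (snapshot_path, argv) for one dated snapshot."""
    name = when.strftime("weekly-%Y%m%d")
    # Assumed convention: snapshots live in <mountpoint>/.snap/
    path = f"{mountpoint.rstrip('/')}/.snap/{name}"
    # Assumed invocation: mksnap_ffs <mountpoint> <snapshot file>
    return path, ["mksnap_ffs", mountpoint, path]

def take_snapshot(mountpoint, dry_run=True):
    """Print the command, or run it when dry_run is False."""
    path, argv = snapshot_command(mountpoint, datetime.date.today())
    if dry_run:
        print(" ".join(argv))
    else:
        subprocess.run(argv, check=True)
    return path

if __name__ == "__main__":
    take_snapshot("/data")  # hypothetical mount point
```

Run from a weekly cron job during the night, something along these lines would keep the freeze described above away from the hours when the SMB/NFS shares are in use.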


Re: running mksnap_ffs

2007-01-02 Thread Michael Proto
Willem Jan Withagen wrote:
 Hi,
 
 I got the following Filesystem:
 Filesystem    Size   Used  Avail Capacity   iused     ifree %iused
 /dev/da0a     1.3T   422G   823G      34%  565952 182833470     0%
 
 Running off a 3ware 9550, on a dual core Opteron 242 with 1GB.
 The system is used as SMB/NFS server for my other systems here.
 
 I would like to make weekly snapshots, but manually running mksnap_ffs
 freezes access to the disk (I sort of expected that) but the process
 never terminates. So I let it sit overnight, but looking at gstat did not
 reveal any activity whatsoever...
 The disk was not released, mksnap_ffs could not be terminated.
 And things resulted in me rebooting the system.
 
 So:
  - How long should I expect making a snapshot to take:
 5, 15, 30min, 1, 2 hour or even more???
  - How do I diagnose the reason why it is not terminating?
 
 --WjW

For a point of reference, I have two 300GB Serial ATA disks in a RAID1
config that I take daily snapshots of.

df info:
Filesystem   1K-blocks      Used     Avail Capacity  Mounted on
/dev/ar0s1d  283810134 160945668 117188264      58%  /r1

As of last night, this snapshot took 18m59.77s to complete.


-Proto
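
The data point above can be turned into a back-of-the-envelope estimate for the 1.3T volume in the original question. The sketch below assumes, purely for illustration, that snapshot time scales linearly with volume size and that the hardware is comparable. Neither assumption is established in the thread, so the result is an order-of-magnitude guess at best.

```python
# Back-of-the-envelope extrapolation of snapshot time from one data point.
# Assumption (not from the thread): time scales linearly with volume size.

KIB_PER_GIB = 1024 * 1024

def seconds_per_gib(size_kib, elapsed_seconds):
    """Snapshot cost per GiB of volume, from one measured run."""
    return elapsed_seconds / (size_kib / KIB_PER_GIB)

def estimate_minutes(target_gib, size_kib, elapsed_seconds):
    """Linear estimate, in minutes, for a volume of target_gib GiB."""
    return target_gib * seconds_per_gib(size_kib, elapsed_seconds) / 60

# Figures reported above: 283810134 1K-blocks, 18m59.77s.
measured_kib = 283810134
measured_seconds = 18 * 60 + 59.77

# The 1.3T volume from the original question, taken as 1.3 * 1024 GiB.
estimate = estimate_minutes(1.3 * 1024, measured_kib, measured_seconds)
print(f"estimated snapshot time: {estimate:.0f} minutes")
```

The estimate comes out at roughly an hour and a half, which is the same order of magnitude as the figure of about one hour reported elsewhere in this thread for a 1.3T volume on different hardware.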


Re: running mksnap_ffs

2007-01-02 Thread Gary Palmer
On Tue, Jan 02, 2007 at 09:06:24PM +0100, Willem Jan Withagen wrote:
 Hi,
 
 I got the following Filesystem:
 Filesystem    Size   Used  Avail Capacity   iused     ifree %iused
 /dev/da0a     1.3T   422G   823G      34%  565952 182833470     0%
 
 Running off a 3ware 9550, on a dual core Opteron 242 with 1GB.
 The system is used as SMB/NFS server for my other systems here.
 
 I would like to make weekly snapshots, but manually running mksnap_ffs
 freezes access to the disk (I sort of expected that) but the process
 never terminates. So I let it sit overnight, but looking at gstat did not
 reveal any activity whatsoever...
 The disk was not released, mksnap_ffs could not be terminated.
 And things resulted in me rebooting the system.
 
 So:
  - How long should I expect making a snapshot to take:
   5, 15, 30min, 1, 2 hour or even more???
  - How do I diagnose the reason why it is not terminating?

You forgot to mention what revision of FreeBSD you are running, and
if you are using quotas or anything else on the filesystem that
could impact this.


Re: running mksnap_ffs

2007-01-02 Thread Willem Jan Withagen

Gary Palmer wrote:

On Tue, Jan 02, 2007 at 09:06:24PM +0100, Willem Jan Withagen wrote:

Hi,

I got the following Filesystem:
Filesystem    Size   Used  Avail Capacity   iused     ifree %iused
/dev/da0a     1.3T   422G   823G      34%  565952 182833470     0%


Running off a 3ware 9550, on a dual core Opteron 242 with 1GB.
The system is used as SMB/NFS server for my other systems here.

I would like to make weekly snapshots, but manually running mksnap_ffs
freezes access to the disk (I sort of expected that) but the process
never terminates. So I let it sit overnight, but looking at gstat did not
reveal any activity whatsoever...

The disk was not released, mksnap_ffs could not be terminated.
And things resulted in me rebooting the system.

So:
 - How long should I expect making a snapshot to take:
5, 15, 30min, 1, 2 hour or even more???
 - How do I diagnose the reason why it is not terminating?


You forgot to mention what revision of FreeBSD you are running, and
if you are using quotas or anything else on the filesystem that
could impact this.


Yes, I pressed send somewhat too fast:

[~] [EMAIL PROTECTED] uname -a
FreeBSD bigsurf.digiware.nl 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #3: Wed Sep 27 15:57:20 CEST 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/BIGSURF amd64


--WjW

