Re: amplot interpretation problem

2005-03-03 Thread Kevin Dalley
I'm in the middle of my first backup using

estimate server

for all of my samba backups.

Most of the DLEs completed nicely, but a few are still running with
ludicrously large percent complete listed.  Upon further reading, it
isn't quite as bad as it first appears.

condor://cardinal/g$  0    724687k dumping  7863520k (1085.09%) (4:31:54)
condor://kinka/m$     2    310043k dumping  8292320k (2674.57%) (3:49:41)
condor://pigeon/e$    1      4555k dumping  7151360k (157000.22%) (0:41:47)
condor://redbird/i$   0   9600581k dumping 10174464k (105.98%) (0:52:32)
nyinyi:/nyinyi        0  21106878k flushing to tape (10:19:18)


The 157,000% complete has a very small initial estimate, so it isn't
quite as bad as it first appears.
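
For anyone checking the arithmetic, the percentages above look like the amount dumped so far divided by the estimate. Here is a minimal sketch under that assumption; the function name percent_complete is made up for illustration and is not part of amanda.

```python
def percent_complete(dumped_kb: int, estimate_kb: int) -> float:
    """Percent complete as dumped size over estimated size."""
    return 100.0 * dumped_kb / estimate_kb


# (DLE, estimate in KB, dumped so far in KB), copied from the status lines above
ROWS = [
    ("condor://cardinal/g$", 724687, 7863520),
    ("condor://kinka/m$", 310043, 8292320),
    ("condor://pigeon/e$", 4555, 7151360),
    ("condor://redbird/i$", 9600581, 10174464),
]

for name, estimate_kb, dumped_kb in ROWS:
    print(f"{name:22} {percent_complete(dumped_kb, estimate_kb):12.2f}%")
```

A 4555k estimate against roughly 7 GB actually dumped is what produces the 157,000% figure: the dump is not misbehaving, the denominator is just tiny.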

I'm running with a patched version of amanda which probably runs
correctly, but eats up memory.  Using "estimate server" allowed me to
finish the estimate without killing any of the large smbclients which
I previously had.  I only killed one tar smbclient during my nightly
run. This is probably an improvement over losing files in the backup,
if I can force the missing DLE to be backed up through manual
intervention. 

I set runtapes to 3, while only using a single tape each day.
autoflush is set to on.  This method isn't as upset by inaccurate
estimates.  If the backup is too large, it is saved until the next
day.  If there is a big backlog, as at present, I don't write to tape
until at least one day later, which means that I have accurate
knowledge of the size of the backup before writing.

Kevin Dalley <[EMAIL PROTECTED]> writes:

> I'm just starting to use it, due to samba failures.  What happens if
> the DLE does not have a previous run?  Does it revert to the old
> standard?

-- 
Kevin Dalley
[EMAIL PROTECTED]


Re: amplot interpretation problem

2005-03-02 Thread Paul Bijnens
FM wrote:
> thanks again :-)
> here my updated config  :
> inparallel 8
> netusage  3
> maxdumps 10

maxdumps 1 is normal; you may increase maxdumps if you have
a very powerful box.

> dumporder 
> I read about dump at :
> http://dump.sourceforge.net/isdumpdeprecated.html
> so I'll try it again.
> last question ... for now ;-), what do you mean by :
>  "set the correct "spindle" for each DLE"

A disklist entry can have a number as its 4th parameter, indicating
which physical disk (spindle) the DLE is on.  Amanda will not run two
dumps on the same spindle at the same time.  This avoids heavy arm
movement and the resulting slowdown of disk and dump or estimate.
With maxdumps set to 1 you won't notice this, but when allowing
more dumps on one host (maxdumps > 1), you should make sure that
you have assigned spindle numbers.
Like this:

the.host.tv   /disk1/dir1  comp-user-tar  1
the.host.tv   /disk1/dir2  comp-user-tar  1
the.host.tv   /disk2/dir1  comp-user-tar  2

The first two entries have the same spindle number: amanda will
not start a dump of the second while the first is still busy.
With maxdumps set to 2, amanda may schedule a dump of the
third disk at the same time as one of the other two.
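
The spindle rule described above can be sketched in a few lines of Python. This is a simplified model for one host, not amanda's actual scheduler: the names DLE, parse_disklist and pick_concurrent are invented for illustration, and the parser assumes every line has exactly the four fields shown in the example.

```python
from typing import List, NamedTuple


class DLE(NamedTuple):
    host: str
    disk: str
    dumptype: str
    spindle: int


def parse_disklist(text: str) -> List[DLE]:
    """Parse 4-field disklist lines: host, disk, dumptype, spindle."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        host, disk, dumptype, spindle = line.split()
        entries.append(DLE(host, disk, dumptype, int(spindle)))
    return entries


def pick_concurrent(entries: List[DLE], maxdumps: int) -> List[DLE]:
    """Greedily pick DLEs that may dump at once: at most maxdumps, one per spindle."""
    running: List[DLE] = []
    busy_spindles = set()
    for dle in entries:
        if len(running) >= maxdumps:
            break
        if dle.spindle in busy_spindles:
            continue  # another dump is already using this spindle
        running.append(dle)
        busy_spindles.add(dle.spindle)
    return running


DISKLIST = """\
the.host.tv   /disk1/dir1  comp-user-tar  1
the.host.tv   /disk1/dir2  comp-user-tar  1
the.host.tv   /disk2/dir1  comp-user-tar  2
"""

entries = parse_disklist(DISKLIST)
print([d.disk for d in pick_concurrent(entries, maxdumps=2)])
print([d.disk for d in pick_concurrent(entries, maxdumps=1)])
```

With maxdumps 2 the sketch skips /disk1/dir2 because spindle 1 is already busy and picks /disk2/dir1 instead; with maxdumps 1 the spindle numbers never come into play.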
--
Paul Bijnens, Xplanation                            Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/  email:  [EMAIL PROTECTED]
***
* I think I've got the hang of it now:  exit, ^D, ^C, ^\, ^Z, ^Q, F6, *
* quit,  ZZ, :q, :q!,  M-Z, ^X^C,  logoff, logout, close, bye,  /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt,  abort,  hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e,  kill -1 $$,  shutdown, *
* kill -9 1,  Alt-F4,  Ctrl-Alt-Del,  AltGr-NumLock,  Stop-A,  ...*
* ...  "Are you sure?"  ...   YES   ...   Phew ...   I'm out  *
***


Re: amplot interpretation problem

2005-03-02 Thread Kevin Dalley
I'm just starting to use it, due to samba failures.  What happens if
the DLE does not have a previous run?  Does it revert to the old
standard?

Paul Bijnens <[EMAIL PROTECTED]> writes:

> FM wrote:
>
>> Thanks for all those great comments !
>> Is there a way to tune (or bypass) the estimated time ?
>
> In amanda 2.4.5 (currently in beta) you can get very fast
> (but less accurate) estimates, taking less than a second.
> They are based on statistics from the previous runs.
>
> I have not used it myself (yet).

-- 
Kevin Dalley
[EMAIL PROTECTED]


Re: amplot interpretation problem

2005-03-02 Thread Brian Cuttler

No bandwidth utilized ?

Seems to have plenty of holding disk.

I think I missed the start of the thread, how many client
systems ? What are maxdumps and inparallel set to ?

What is your compression algorithm ? client/server/hw/none ?
Or some mixture of client/server/none for different clients ?

---
   Brian R Cuttler                 [EMAIL PROTECTED]
   Computer Systems Support        (v) 518 486-1697
   Wadsworth Center                (f) 518 473-6384
   NYS Department of Health        Help Desk 518 473-0773



Re: amplot interpretation problem

2005-03-02 Thread FM
thanks again :-)
here my updated config  :
inparallel 8
netusage  3
maxdumps 10
dumporder 
I read about dump at :
http://dump.sourceforge.net/isdumpdeprecated.html
so I'll try it again.
last question ... for now ;-), what do you mean by :
 "set the correct "spindle" for each DLE"
thanks again

Paul Bijnens wrote:
> FM wrote:
>> here is some info :
>> We have a performance problem with a 47 GB partition (on the backup 
>> server).
>> OS : Linux Redhat Enterprise AS 3 V4
>> Version : amanda-2.4.4p1
>>
>> /dev/cciss/c1d0p2 160G   56G   97G  37% /amanda
>> inparallel 50
>
> 50 dumpers!! and amplot shows only a few active.  Probably overkill.
>
>> netusage  3
>> maxdumps 5
>
> So you do allow 5 dumps on the same host?
> Only do this when you have set the correct "spindle" for each DLE,
> otherwise you're guaranteed to step on your own toes by thrashing
> the disk, resulting in an even slower estimate/backup.
>
>> holdingdisk hd1 {
>> comment "main holding disk"
>> directory "/amanda/daily"
>> use -50 Mb
>> chunksize 4 Gb
>> (...)
>> }
>> Can I force the backup order so this partition is the first one to be 
>> backed up ?
>
> In amanda.conf add:
> dumporder "T..."
> # fill the dots with other letters, as many
> # as you have "inparallel" (best set that to a reasonable number like 8)
> # See man page for possible letters to choose
> # I have: "" in my config


Re: amplot interpretation problem

2005-03-02 Thread Paul Bijnens
FM wrote:
> Thanks for all those great comments !
> Is there a way to tune (or bypass) the estimated time ?

In amanda 2.4.5 (currently in beta) you can get very fast
(but less accurate) estimates, taking less than a second.
They are based on statistics from the previous runs.
I have not used it myself (yet).
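
To make the idea of a statistics-based estimate concrete, here is a rough Python sketch. It is not amanda's algorithm: averaging the last few dump sizes at the same level, the function name estimate_from_history and the keep parameter are all assumptions made for illustration.

```python
from statistics import mean
from typing import Optional, Sequence


def estimate_from_history(past_sizes_kb: Sequence[int], keep: int = 3) -> Optional[int]:
    """Guess the next dump size (KB) from earlier dumps at the same level.

    Returns None when there is no history, so the caller has to choose
    a fallback itself.
    """
    recent = list(past_sizes_kb)[-keep:]  # only the most recent runs
    if not recent:
        return None
    return round(mean(recent))


print(estimate_from_history([700_000, 720_000, 740_000]))  # three earlier runs
print(estimate_from_history([]))                            # new DLE, no history
```

A lookup like this needs no pass over the filesystem, which is why it can answer in well under a second; the price is that the guess can be far off when a DLE changes size quickly or has no history yet.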


Re: amplot interpretation problem

2005-03-02 Thread Paul Bijnens
FM wrote:
> here is some info :
> We have a performance problem with a 47 GB partition (on the backup server).
> OS : Linux Redhat Enterprise AS 3 V4
> Version : amanda-2.4.4p1
> /dev/cciss/c1d0p2 160G   56G   97G  37% /amanda
> inparallel 50

50 dumpers!! and amplot shows only a few active.  Probably overkill.

> netusage  3
> maxdumps 5

So you do allow 5 dumps on the same host?
Only do this when you have set the correct "spindle" for each DLE,
otherwise you're guaranteed to step on your own toes by thrashing
the disk, resulting in an even slower estimate/backup.

> holdingdisk hd1 {
> comment "main holding disk"
> directory "/amanda/daily"
> use -50 Mb
> chunksize 4 Gb
> (...)
> }
> Can I force the backup order so this partition is the first one to be 
> backed up ?

In amanda.conf add:
dumporder "T..."
# fill the dots with other letters, as many
# as you have "inparallel" (best set that to a reasonable number like 8)
# See man page for possible letters to choose
# I have: "" in my config
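
As a rough illustration of what the dumporder letters mean: as far as I recall from the amanda.conf man page, s/S select by smallest/largest size, t/T by smallest/largest time and b/B by smallest/largest bandwidth, with one letter per dumper. Please check the man page for your version. The Python sketch below models only that choice; the queue contents and the names QUEUE, KEYS and pick_next are made up, and this is not amanda's driver code.

```python
from typing import Callable, Dict, List

# Made-up queue: estimated size (KB) and estimated dump time (s) per DLE.
QUEUE: List[Dict] = [
    {"disk": "/home", "size_kb": 2_000_000, "time_s": 1_200},
    {"disk": "/www", "size_kb": 47_000_000, "time_s": 28_800},
    {"disk": "/etc", "size_kb": 20_000, "time_s": 30},
]

# Lowercase letter selects the metric; the case decides smallest vs largest.
KEYS: Dict[str, Callable[[Dict], float]] = {
    "s": lambda d: d["size_kb"],
    "t": lambda d: d["time_s"],
    "b": lambda d: d["size_kb"] / d["time_s"],
}


def pick_next(queue: List[Dict], letter: str) -> Dict:
    """Return the DLE a free dumper with this dumporder letter would prefer."""
    key = KEYS[letter.lower()]
    choose = max if letter.isupper() else min
    return choose(queue, key=key)


for letter in "Tts":
    print(letter, pick_next(QUEUE, letter)["disk"])
```

Under this model a dumper assigned "T" goes for the DLE expected to take longest, which is why putting a "T" first in dumporder should get the slow 47 GB partition started early.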


Re: amplot interpretation problem

2005-03-02 Thread FM
Thanks for all those great comments !
Is there a way to tune (or bypass) the estimated time ?

Paul Bijnens wrote:
> FM wrote:
>> Some partitions have millions of html files
>> I do not use dump because of this L Torvalds email :
>> http://lwn.net/2001/0503/a/lt-dump.php3
>
> That was 2001.
> At that time ext2dump had no maintainer.
> That has changed, and dump has caught up again.
> And the main reason (but that counts for Solaris too),
> is that using dump on active filesystems can result
> in worthless backups.  On a quiet filesystem it is
> normally safe.  (But I'm a gnutar user, personally.)
> Another possibility is to mount with the "noatime" option.
> It will not reduce the 10 hours to 1, but will gain
> maybe 10-20%.
> The reason that the estimate takes longer is probably because
> the estimate needs to be done for level 0, level N, and maybe
> level N+1 for each filesystem.



Re: amplot interpretation problem

2005-03-02 Thread Paul Bijnens
FM wrote:
> Some partitions have millions of html files
> I do not use dump because of this L Torvalds email :
> http://lwn.net/2001/0503/a/lt-dump.php3

That was 2001.
At that time ext2dump had no maintainer.
That has changed, and dump has caught up again.
And the main reason (but that counts for Solaris too),
is that using dump on active filesystems can result
in worthless backups.  On a quiet filesystem it is
normally safe.  (But I'm a gnutar user, personally.)
Another possibility is to mount with the "noatime" option.
It will not reduce the 10 hours to 1, but will gain
maybe 10-20%.
The reason that the estimate takes longer is probably because
the estimate needs to be done for level 0, level N, and maybe
level N+1 for each filesystem.


Re: amplot interpretation problem

2005-03-02 Thread Brian Cuttler

You have enough amanda work area to allow more than 5 concurrent
dumps to run. This is a separate issue and concerns the second half
of the graph; for the first half, the estimate phase, you already
have far more capable help than I can give.

On Wed, Mar 02, 2005 at 03:38:29PM -0500, FM wrote:
> here is some info :
> 
> We have a performance problem with a 47 GB partition (on the backup server).
> OS : Linux Redhat Enterprise AS 3 V4
> Version : amanda-2.4.4p1
> 
> /dev/cciss/c1d0p2 160G   56G   97G  37% /amanda
> 
> inparallel 50
> netusage  3
> maxdumps 5
> 
> holdingdisk hd1 {
>  comment "main holding disk"
>  directory "/amanda/daily"
>  use -50 Mb
>  chunksize 4 Gb
> (...)
> }
> Can I force the backup order so this partition is the first one to be backed up ?
> 
> thanks !
> 
> 
> Brian Cuttler wrote:
> > No bandwidth utilized ?
> > 
> > Seems to have plenty of holding disk.
> > 
> > I think I missed the start of the thread, how many client
> > systems ? What are maxdumps and inparallel set to ?
> > 
> > What is your compression algorithm ? client/server/hw/none ?
> > Or some mixture of client/server/none for different clients ?
> > 
> > 



Re: amplot interpretation problem

2005-03-02 Thread FM
here is some info :
We have a performance problem with a 47 GB partition (on the backup server).
OS : Linux Redhat Enterprise AS 3 V4
Version : amanda-2.4.4p1
/dev/cciss/c1d0p2 160G   56G   97G  37% /amanda
inparallel 50
netusage  3
maxdumps 5
holdingdisk hd1 {
comment "main holding disk"
directory "/amanda/daily"
use -50 Mb
chunksize 4 Gb
(...)
}
Can I force the backup order so this partition is the first one to be backed up ?
thanks !
Brian Cuttler wrote:
> No bandwidth utilized ?
> Seems to have plenty of holding disk.
> I think I missed the start of the thread, how many client
> systems ? What are maxdumps and inparallel set to ?
> What is your compression algorithm ? client/server/hw/none ?
> Or some mixture of client/server/none for different clients ?


Re: amplot interpretation problem

2005-03-02 Thread Paul Bijnens
FM wrote:
> Thanks :)
> 4 h to estimate !

Even 10 hours!  + 8 hours to dump the filesystem.

> here is the result of amplot -l -e -p amdump.1

You clearly have an estimate problem.
Using gnutar on a filesystem with a zillion small files?
But why would the estimate take even longer than the real dump?
Maybe you have some unresponsive nfs mount from some dead host?
You're the perfect test candidate for the faster, statistics-based
estimate that is currently in 2.4.5beta.
Or switch to dump, if that is possible on the filesystem.



Re: amplot interpretation problem

2005-03-02 Thread FM
Thanks :)
4 h to estimate !
here is the result of amplot -l -e -p amdump.1

Paul Bijnens wrote:
> FM wrote:
>> I add the amplot result file of : amplot -l -p amdump.1
>> What am I supposed to understand in this graph :)
>
> You forgot the "-e" option to amplot to extend the graph beyond the
> default of 4 hours.
> The dump took 18 hours, and apparently the first 4+ hours were entirely
> spent waiting for the estimates:  all flat lines on the server.



20050301.pdf
Description: Adobe PDF document


Re: amplot interpretation problem

2005-03-02 Thread Paul Bijnens
FM wrote:
> I add the amplot result file of : amplot -l -p amdump.1
> What am I supposed to understand in this graph :)

You forgot the "-e" option to amplot to extend the graph beyond the
default of 4 hours.
The dump took 18 hours, and apparently the first 4+ hours were entirely
spent waiting for the estimates:  all flat lines on the server.