Thank you, Jean-Louis,
Given the tradeoffs, I guess it's best for me to turn off interactivity. We run amcheck on cron in
the mid afternoon, when people are around. So the correct tape should be in place. In an unusual
case where the tape library becomes unavailable after that, I would rather just have a fallback
backup with incrementals.
From a development perspective, is there any way to have the processes communicate and do some sort
of optimization? For example, maybe go ahead and get started on the DLEs that are scheduled for
incremental. Then, if there is no resolution of the tape issue, start balancing the remaining
holding disk space by doing smaller fulls and dropping larger fulls back to incremental. (There are
probably better ways of optimizing and delaying commitment to fallbacks versus fulls.) Finally, if
that is completed, and the tape issue is still not resolved, send the report and terminate, or flush
everything to tape if the tape issue is resolved before amanda decides to terminate. Then, if people
come in in the morning and resolve the tape issue, they could call amflush. As it stands, they can't
do anything, because you don't want to initiate the backup process just as the work day is beginning.
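A minimal sketch of the staged idea above, in Python. This is not Amanda's driver logic; the `DLE` class, the `plan_without_tape` function, and the size fields are hypothetical names used only for illustration. It assumes every DLE can fall back to an incremental and that sizes are estimates in MB. Space for the incrementals is committed first, then fulls are promoted smallest first while the extra space still fits on the holding disk.

```python
from dataclasses import dataclass


@dataclass
class DLE:
    name: str
    full_size: int    # estimated size of a full backup, in MB
    incr_size: int    # estimated size of an incremental, in MB
    wants_full: bool  # True if a full was scheduled for this run


def plan_without_tape(dles, holding_free):
    """Return (plan, remaining): plan maps DLE name to 'full', 'incr' or 'skip'."""
    plan = {}
    remaining = holding_free

    # Step 1: commit an incremental for every DLE that fits (smallest first),
    # so that the minimum fallback is secured before any full is considered.
    for d in sorted(dles, key=lambda d: d.incr_size):
        if d.incr_size <= remaining:
            plan[d.name] = "incr"
            remaining -= d.incr_size
        else:
            plan[d.name] = "skip"

    # Step 2: promote scheduled fulls, smallest first, while the extra space
    # (full minus the already committed incremental) still fits.
    scheduled = [d for d in dles if d.wants_full and plan[d.name] == "incr"]
    for d in sorted(scheduled, key=lambda d: d.full_size):
        extra = d.full_size - d.incr_size
        if extra <= remaining:
            plan[d.name] = "full"
            remaining -= extra

    return plan, remaining
```

With 400 MB free and three DLEs (fulls of 800, 200 and 10 MB, the last one scheduled for an incremental), this keeps the 200 MB full and drops the 800 MB full back to an incremental. Everything that lands on the holding disk this way could later be written out with amflush once the tape issue is resolved.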
On 7/10/15 7:30 AM, Jean-Louis Martineau wrote:
Chris,
You are right, Amanda waits to find a tape before it starts the backup.
At that point, the driver process doesn't know whether the taper process is still searching for a tape
(scanning a large library can take a long time) or whether it is waiting for user input (interactivity).
The problem arises if we start the backup immediately and the taper then replies that there is no
tape available.
It might already have started a large FULL backup that will mostly fill the holding disk, not leaving
enough space for the other DLEs.
I agree it should immediately start the backup if 'reserve' is set to 0.
Jean-Louis
On 09/07/15 03:16 PM, Chris Hoogendyk wrote:
amanda@wahoo:~/daily/log$ cat amdump.20150708224002
--------
driver: flush size 0
amanda@wahoo:~/daily/log$
On 7/9/15 3:10 PM, Jean-Louis Martineau wrote:
On 09/07/15 03:06 PM, Chris Hoogendyk wrote:
But, amanda did not continue. After the estimates, there was nothing. Zero backups. And, when I
came in in the morning, amanda was basically dead and gone, aside from the bazillion defunct
amanda mail processes. Is it possible that the mail issue created too many processes and
exceeded some limit?
Maybe, can you send me the amdump.X log file?
As far as a timeout is concerned, I couldn't find a timeout option in
http://wiki.zmanda.com/man/amanda-interactivity.7.html.
It's not there, it is a proposed enhancement.
Jean-Louis
On 7/9/15 2:28 PM, Jean-Louis Martineau wrote:
Chris,
Amanda should continue to do the backup to holding disk while it waits for a new tape, this is
constructive work.
You could remove the interactivity setting if you don't want its behaviour.
What can be done is to add a timeout to the interactivity plugin to abort if there is no tape
available in that time.
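A rough sketch of what such a timeout could look like, written in Python for illustration only. It is not the plugin's code, and the timeout does not exist in Amanda 3.3.6. It assumes the plugin polls for an operator reply file at a fixed interval; `wait_for_reply` and its parameters are hypothetical names. The `sleep` and `clock` arguments are injectable so the loop can be exercised without actually waiting.

```python
import os
import time


def wait_for_reply(path, check_delay, timeout, sleep=time.sleep, clock=time.monotonic):
    """Poll for `path` every `check_delay` seconds.

    Returns True as soon as the file exists, or False once `timeout`
    seconds have passed without it appearing.
    """
    deadline = clock() + timeout
    while True:
        if os.path.exists(path):
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(check_delay, remaining))
```

On a False return, the caller would stop waiting for a tape and let the run proceed in degraded mode (or abort) instead of hanging indefinitely.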
Jean-Louis
On 09/07/15 01:43 PM, Chris Hoogendyk wrote:
I have an Ubuntu 14.04 LTS server with Amanda 3.3.6 using an NEO200 LTO6
connected by SAS.
I ran aptitude updates yesterday evening and rebooted.
This morning, I had no email report from Amanda, and the mail admin said that the backup
server had been sending email every 10 seconds to unknown users admin1 and admin2.
After killing 5600 defunct processes from amanda trying to send email every 10 seconds
overnight, I began tracking down the causes.
First off, for some reason the reboot decided to reconfigure the /dev/sg devices, and the
tape library changed from /dev/sg10 to /dev/sg8. So, I changed the specification in
amanda.conf to /dev/tape/by-id/scsi-1IBM_3573-TL_00L2U78AN152_LL0 (which currently links to
/dev/sg8). I had done that on another backup server, because we were changing and adding
hardware. I hadn't thought of this one, because it was stable and no changes were planned.
OK. Done with that.
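For reference, a small Python sketch of checking which device node a stable by-id link currently resolves to, for example from a pre-backup check script. The helper name `resolve_device` is an assumption for illustration; `os.path.realpath` is the standard library call that follows the symlink.

```python
import os


def resolve_device(path):
    """Return the real device node that a (possibly symlinked) path points to."""
    return os.path.realpath(path)


if __name__ == "__main__":
    # Path taken from the amanda.conf change described above.
    link = "/dev/tape/by-id/scsi-1IBM_3573-TL_00L2U78AN152_LL0"
    print(link, "->", resolve_device(link))
```

Because the by-id name is derived from the hardware identity rather than from probe order, the link keeps pointing at the library even if the kernel renumbers the /dev/sg nodes after a reboot.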
Then, in looking for reference to admin1, I came upon the interactivity. Why would anyone
ever choose to set the default resend-delay for email to 10 seconds? That's nuts. I changed
it to 0, which means send only one email. With that out of the way, I changed the admin1 to
amanda, which has an alias including our admin group. Then I changed the check-file-delay to
1800, or 30 minutes, which makes some kind of sense for an overnight run when admins aren't
checking things all that often.
So, that pretty well wraps it up, except for one kind of important thing. I would prefer for
amanda to proceed with backups and keep them on the holding disk. If a device and/or a tape
becomes available, then proceed to put things on tape. But, do something constructive rather
than just hanging and waiting. In our case it is more likely to be morning before anyone
responds anyway. I'd rather have fallback incrementals than nothing at all. In versions of
amanda before the interactivity plugin, that is what happened.
Is this something that can be configured around?
Or does it require tweaking code by someone who knows the code and the
tradeoffs?
--
---------------
Chris Hoogendyk
-
O__ ---- Systems Administrator
c/ /'_ --- Biology & Geology Departments
(*) \(*) -- 317 Morrill Science Center
~~~~~~~~~~ - University of Massachusetts, Amherst
<[email protected]>
---------------
Erdös 4