Thank you, Jean-Louis,

Given the tradeoffs, I guess it's best for me to turn off interactivity. We run amcheck from cron in the mid-afternoon, when people are around, so the correct tape should be in place. In the unusual case where the tape library becomes unavailable after that, I would rather just have a fallback backup with incrementals.

From a development perspective, is there any way to have the processes communicate and do some sort of optimization? For example, maybe go ahead and get started on the DLEs that are scheduled for incremental. Then, if there is no resolution of the tape issue, start balancing the remaining holding disk space by doing smaller fulls and dropping larger fulls back to incremental. (There are probably better ways of optimizing and delaying commitment to fallbacks versus fulls.) Finally, if that is completed and the tape issue is still not resolved, send the report and terminate, or flush everything to tape if the tape issue is resolved before amanda decides to terminate. Then, if people come in in the morning and resolve the tape issue, they could call amflush. As it stands, they can't do anything, because you don't want to initiate the backup process just as the work day is beginning.
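
The staged fallback described above could be sketched roughly as follows. This is only an illustration of the idea, not how Amanda's planner or driver actually works: the `DLE` class, the `plan_without_tape` function, and the size fields are hypothetical names, and the sizes are assumed to be estimates in MB.

```python
from dataclasses import dataclass


@dataclass
class DLE:
    # Hypothetical record for one disk list entry; not an Amanda data structure.
    name: str
    full_size: int    # estimated size of a full dump, in MB
    incr_size: int    # estimated size of an incremental dump, in MB
    wants_full: bool  # True if a full was scheduled for tonight


def plan_without_tape(dles, holding_free):
    """Pick dump levels when no tape is available.

    Returns (plan, remaining): plan maps DLE name to "full", "incr" or
    "skip"; remaining is the holding disk space (MB) left unallocated.
    """
    plan = {}
    remaining = holding_free

    # Step 1: secure an incremental for every DLE, smallest first.
    for d in sorted(dles, key=lambda d: d.incr_size):
        if d.incr_size <= remaining:
            plan[d.name] = "incr"
            remaining -= d.incr_size
        else:
            plan[d.name] = "skip"

    # Step 2: promote scheduled fulls, smallest first, while space lasts.
    # Larger fulls that do not fit stay at incremental.
    candidates = [d for d in dles if d.wants_full and plan[d.name] == "incr"]
    for d in sorted(candidates, key=lambda d: d.full_size):
        extra = d.full_size - d.incr_size
        if extra <= remaining:
            plan[d.name] = "full"
            remaining -= extra

    return plan, remaining
```

Incrementals are committed first so that every DLE gets at least something, and the decision about each full is delayed until the cheaper work is accounted for.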


On 7/10/15 7:30 AM, Jean-Louis Martineau wrote:
Chris,

You are right, Amanda waits to find a tape before it starts the backup.
At that point, the driver process doesn't know whether the taper process is still searching for a tape (scanning a large library can take a long time) or is waiting for user input (interactivity).

The problem arises if we start the backup immediately and the taper then replies that there is no tape available.
It might have already started large FULL backups that will mostly fill the holding disk, not leaving enough space for the other DLEs.

I agree it should start the backup immediately if 'reserve' is set to 0.

Jean-louis

On 09/07/15 03:16 PM, Chris Hoogendyk wrote:
amanda@wahoo:~/daily/log$ cat amdump.20150708224002

--------
driver: flush size 0
amanda@wahoo:~/daily/log$

On 7/9/15 3:10 PM, Jean-Louis Martineau wrote:
On 09/07/15 03:06 PM, Chris Hoogendyk wrote:
But, amanda did not continue. After the estimates, there was nothing. Zero backups. And, when I came in in the morning, amanda was basically dead and gone, aside from the bazillion defunct amanda mail processes. Is it possible that the mail issue created too many processes and exceeded some limit?

Maybe. Can you send me the amdump.X log file?

As far as a timeout is concerned, I couldn't find a timeout option in http://wiki.zmanda.com/man/amanda-interactivity.7.html.
It's not there, it is a proposed enhancement.

Jean-Louis


On 7/9/15 2:28 PM, Jean-Louis Martineau wrote:
Chris,

Amanda should continue to do the backup to holding disk while it waits for a new tape; this is constructive work.
You could remove the interactivity setting if you don't want its behaviour.

What can be done is to add a timeout to the interactivity plugin so that it aborts if no tape becomes available in that time.
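
As a rough illustration of what such a timeout might look like (a generic polling sketch in Python, not Amanda's plugin code; `wait_for_tape` and its parameters are hypothetical names), the wait could be bounded by a deadline:

```python
import time


def wait_for_tape(tape_available, timeout, poll_interval=1.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Poll tape_available() until it returns True or the timeout expires.

    Returns True if a tape showed up in time, False if the wait was
    abandoned. `tape_available` is any zero-argument callable supplied
    by the caller (hypothetical; stands in for the real tape check).
    """
    deadline = clock() + timeout
    while True:
        if tape_available():
            return True
        if clock() >= deadline:
            return False
        # Never sleep past the deadline.
        sleep(min(poll_interval, max(0.0, deadline - clock())))
```

On a `False` result the caller could abort the wait and fall back to holding disk only, rather than waiting indefinitely.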

Jean-Louis

On 09/07/15 01:43 PM, Chris Hoogendyk wrote:
I have an Ubuntu 14.04 LTS server with Amanda 3.3.6 using an NEO200 LTO6 connected by SAS.

I ran aptitude updates yesterday evening and rebooted.

This morning, I had no email report from Amanda, and the mail admin said that the backup server had been sending email every 10 seconds to unknown users admin1 and admin2.

After killing 5600 defunct processes from amanda trying to send email every 10 seconds overnight, I began tracking down the causes.

First off, for some reason the reboot decided to reconfigure the /dev/sg devices, and the tape library changed from /dev/sg10 to /dev/sg8. So, I changed the specification in amanda.conf to /dev/tape/by-id/scsi-1IBM_3573-TL_00L2U78AN152_LL0 (which currently links to /dev/sg8). I had done that on another backup server, because we were changing and adding hardware. I hadn't thought of this one, because it was stable and no changes were planned. OK. Done with that.
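
A quick sanity check after a reboot is to confirm which device node the stable path currently resolves to. The following is a minimal sketch using only the Python standard library; `resolve_changer_device` is a hypothetical helper name, not part of Amanda.

```python
import os


def resolve_changer_device(path):
    """Return the real device node behind a (possibly symlinked) path.

    Returns None if the path does not resolve to an existing file,
    for example when the by-id link is missing after a reboot.
    """
    real = os.path.realpath(path)
    return real if os.path.exists(real) else None


if __name__ == "__main__":
    # Path taken from the message above; adjust for your own library.
    link = "/dev/tape/by-id/scsi-1IBM_3573-TL_00L2U78AN152_LL0"
    print(link, "->", resolve_changer_device(link))
```

Run before amcheck, this would show whether the link exists and which /dev/sg node it points at today.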

Then, in looking for reference to admin1, I came upon the interactivity. Why would anyone ever choose to set the default resend-delay for email to 10 seconds? That's nuts. I changed it to 0, which means send only one email. With that out of the way, I changed the admin1 to amanda, which has an alias including our admin group. Then I changed the check-file-delay to 1800, or 30 minutes, which makes some kind of sense for an overnight run when admins aren't checking things all that often.

So, that pretty well wraps it up, except for one kind of important thing. I would prefer for amanda to proceed with backups and keep them on the holding disk. If a device and/or a tape becomes available, then proceed to put things on tape. But, do something constructive rather than just hanging and waiting. In our case it is more likely to be morning before anyone responds anyway. I'd rather have fallback incrementals than nothing at all. In versions of amanda before the interactivity plugin, that is what happened.

Is this something that can be configured around?

Or does it require tweaking code by someone who knows the code and the tradeoffs?

--
---------------

Chris Hoogendyk

-
   O__  ---- Systems Administrator
  c/ /'_ --- Biology & Geology Departments
 (*) \(*) -- 317 Morrill Science Center
~~~~~~~~~~ - University of Massachusetts, Amherst

<[email protected]>

---------------

Erdös 4
