Guy,
The reporting issue in amstatus (reporting negative value) is already
fix, check the svn tree.
I'm surprised that you complain about a small amstatus reporting issue
but not about the fact that your backup failed, you should have at least
posted the error from the email report.
I found the following in the amdump.1 file you sent me privately:
GLib-CRITICAL **: g_main_context_find_source_by_id: assertion `source_id
> 0' failed
It's the first time I see it, I think it is related to amanda exceeding
some glib capacity because your dump are really large.
Are you using a 32 bits amanda?
Can you try the attached patch?
Jean-Louis
On 10/01/2012 09:21 AM, Guy Sisalli wrote:
We're noticing peculiar behavior running Amanda 3.3.2 under FreeBSD 9.
Backups appear to work fine until dealing with a sizable DLE
consisting of a single zfs snapshot. We ran a backup without a holding
area and dumped directly to tape. Things went okay, and then the 10 TB
DLE backup suddenly went wrong at the 6-7 TB mark.
At somewhere between 6 and 7 TB, while writing to tape 6, the readout
on amstatus displayed negative values (see below). At this point, the
10 TB DLE was still listed as being sent to tape, but no further
progress was being registered on that DLE.
Instead, amanda had moved on to dump a second DLE to tape. We only
have one tape drive, but amstatus was listing two DLEs as being sent
to tape, and only one was making progress.
After all the other DLEs had finished, work resumed on the 10 TB DLE,
but amanda claimed to be writing more than 100% of the DLE to tape.
The same negative write symptom appeared on this last tape. Is this
some sort of known issue?
Using /usr/local/etc/amanda/dailyset/state/log/amdump.1
From Fri Sep 28 17:16:47 EDT 2012
localhost:/transport/alpha 0 46g finished (14:58:32)
localhost:/transport/bravo 0 0g finished (17:20:20)
localhost:/transport/charlie 0 278g finished (16:37:19)
localhost:/transport/delta 0 202g finished (15:51:05)
localhost:/transport/echo 0 1674g finished (1+21:17:28)
localhost:/transport/foxtrot 0 0g finished (14:48:18)
localhost:/transport/golf 0 0g finished (14:47:45)
localhost:/transport/hotel 0 2453g finished (1+4:01:32)
localhost:/transport/india 0 63g finished (15:17:03)
localhost:/transport/juliet 0 16g finished (14:51:02)
localhost:/transport/kilo 0 49g finished (15:06:35)
localhost:/transport/lima 0 0g finished (14:48:00)
localhost:/transport/mike 0 10052g failed: killed while
dumping to tape (10578g done (105.24%)) (1+12:14:40)
SUMMARY part real estimated
size size
partition : 13
estimated : 13 14838g
flush : 0 0g
failed : 0 0g ( 0.00%)
wait for dumping: 0 0g ( 0.00%)
dumping to tape : 1 10578g 10052g (105.24%) ( 67.74%)
dumping : 0 0g 0g ( 0.00%) ( 0.00%)
dumped : 12 4786g 4786g (100.00%) ( 32.26%)
wait for writing: 0 0g 0g ( 0.00%) ( 0.00%)
wait to flush : 0 0g 0g (100.00%) ( 0.00%)
writing to tape : 0 0g 0g ( 0.00%) ( 0.00%)
failed to tape : 0 0g 0g ( 0.00%) ( 0.00%)
taped : 12 4786g 4786g (100.00%) ( 32.26%)
tape 1 : 1 1524g 1524g ( 69.32%) Aarc00054 (46
chunks)
tape 2 : 0 1508g 1508g ( 68.58%) Aarc00056 (44
chunks)
tape 3 : 0 1504g 1504g ( 68.37%) Aarc00057 (43
chunks)
tape 4 : 0 1506g 1506g ( 68.47%) Aarc00058 (44
chunks)
tape 5 : 0 1473g 1473g ( 66.97%) Aarc00044 (43
chunks)
tape 6 : 10 -5978g 4073g (-271.75%) Aarc00043 (65
chunks)
tape 7 : 1 2395g 2395g (108.89%) Aarc00041 (70
chunks)
tape 8 : 1 1950g 1950g ( 88.65%) Aarc00042 (58
chunks)
tape 9 : 0 1489g 1489g ( 67.70%) Aarc00049 (43
chunks)
tape 10 : 1 -2588g 7463g (-117.64%) Aarc00050 (11
chunks)
3 dumpers idle : no-dumpers
taper status: Idle
taper qlen: 0
network free kps: 158000
holding space : 0g ( 0.00%)
dumper0 busy : 8:13:08 ( 19.13%)
dumper1 busy : 21:26:17 ( 49.90%)
taper busy : 1+18:54:12 ( 99.86%)
0 dumpers busy : 0:09:11 ( 0.36%) no-dumpers: 0:05:53 (
64.13%)
not-idle: 0:03:17 (
35.86%)
1 dumper busy : 22:10:24 ( 51.61%) no-dumpers: 22:10:24
(100.00%)
2 dumpers busy : 20:38:18 ( 48.04%) no-dumpers: 20:38:18
(100.00%)
diff --git a/server-src/dumper.c b/server-src/dumper.c
index 7712887..adb90ea 100644
--- a/server-src/dumper.c
+++ b/server-src/dumper.c
@@ -1867,25 +1867,33 @@ handle_filter_stderr(
* Startup a timeout in the event handler. If the arg is 0,
* then remove the timeout.
*/
+static event_handle_t *ev_timeout = NULL;
+static time_t timeout_time;
+
static void
timeout(
time_t seconds)
{
- static event_handle_t *ev_timeout = NULL;
+ timeout_time = time(NULL) + seconds;
/*
- * First, remove a timeout if one is active.
+ * remove a timeout if seconds is 0
*/
- if (ev_timeout != NULL) {
- event_release(ev_timeout);
- ev_timeout = NULL;
+ if (seconds == 0) {
+ if (ev_timeout != NULL) {
+ event_release(ev_timeout);
+ ev_timeout = NULL;
+ }
+ return;
}
/*
- * Now, schedule a new one if 'seconds' is greater than 0
+ * schedule a timeout if it not already scheduled
*/
- if (seconds > 0)
- ev_timeout = event_register((event_id_t)seconds, EV_TIME, timeout_callback, NULL);
+ if (ev_timeout == NULL) {
+ ev_timeout = event_register((event_id_t)seconds, EV_TIME,
+ timeout_callback, NULL);
+ }
}
/*
@@ -1896,8 +1904,20 @@ static void
timeout_callback(
void * unused)
{
+ time_t now = time(NULL);
(void)unused; /* Quiet unused parameter warning */
+ if (ev_timeout != NULL) {
+ event_release(ev_timeout);
+ ev_timeout = NULL;
+ }
+
+ if (timeout_time > now) { /* not a data timeout yet */
+ ev_timeout = event_register((event_id_t)(timeout_time-now), EV_TIME,
+ timeout_callback, NULL);
+ return;
+ }
+
assert(unused == NULL);
g_free(errstr);
errstr = g_strdup(_("data timeout"));