Gene,

Try this newer patch; remove the previous patch before applying this one.
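
(For the record: with a git-style diff like this one, "patch -p1 -R <
previous.diff" run from the top of the source tree should back the old
patch out, and "patch -p1 < newer.diff" applies this one -- adjust the
filenames to whatever you saved the attachments as.)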

Jean-Louis

On 03/28/2012 06:54 AM, gene heskett wrote:
On Wednesday, March 28, 2012 06:45:20 AM gene heskett did opine:

On Tuesday, March 27, 2012 06:01:21 AM gene heskett did opine:
On Saturday, March 24, 2012 02:43:58 AM gene heskett did opine:
On Friday, March 23, 2012 09:45:12 AM Jean-Louis Martineau did opine:
Hi Gene,

Can you try the attached patch? (it is lightly tested and
uncommitted).

Jean-Louis
Building now; I applied it to 4.0-4613.

Last night it promoted about 20 DLEs, some by as much as 4 days, but
the balance report stubbornly maintains it's going to do a 45GB
backup Sunday morning.  It is not promoting or adjusting anything
over about 4GB; those are apparently untouchable.

After the build/install,
[root@coyote amanda]# su amanda -c "amadmin Daily balance"

  due-date  #fs    orig MB     out MB   balance
----------------------------------------------
  3/23 Fri    0          0          0      ---
  3/24 Sat    4      20596      11128    -24.4%
  3/25 Sun    5      48037      44776   +204.4%
  3/26 Mon    8       7023       3605    -75.5%
  3/27 Tue   19      30582      14054     -4.5%
----------------------------------------------
TOTAL       36     106238      73563     14712
   (estimated 5 runs per dumpcycle)
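
(For reference, the balance column appears to be each day's "out MB"
measured against the balanced size, i.e. the TOTAL out divided by the
runs per dumpcycle; the 14712 in the TOTAL row is that balanced size.
A minimal C sketch of that arithmetic, using the figures above -- my
reading of the report, not the planner's actual code:

    #include <stdio.h>

    int main(void)
    {
        int    nfs[] = { 0, 4, 5, 8, 19 };
        double out[] = { 0, 11128, 44776, 3605, 14054 };
        double balanced = 73563.0 / 5;   /* TOTAL out / runs per cycle */

        for (int i = 0; i < 5; i++) {
            if (nfs[i] == 0)
                printf("   ---\n");      /* empty days get no percentage */
            else
                printf("%+6.1f%%\n", 100.0 * (out[i] - balanced) / balanced);
        }
        return 0;
    }

That reproduces the -24.4%, +204.4%, -75.5% and -4.5% figures to
within rounding.)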

So it remains to be seen if this patch works.  I've no clue if
amadmin /config/ balance uses the code just patched; I'll find out
in the morning, I guess.

I just parked this on another screen and checked my emails from
amanda, scanning the planner's output clear back to Feb 24th.  It is
as if /usr/movies at 16.6GB does not exist; it never attempts to
promote it to a less busy run.  Ditto for /usr/pix at 7.5GB; it is
never mentioned by the planner.  Those 2 have been stuck on the same
level 0 day together for a very long time.  I would do a one-time
force on /usr/movies, but that would only hide the problem for me.

Could this be a >4GB problem in the math someplace?
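
(For what it's worth, the classic shape of a >4GB failure is an
unsigned 32-bit byte counter wrapping at 2^32.  A minimal
illustration -- purely hypothetical, not anything taken from
planner.c, which does its size math in gint64 units:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        /* 16.6GB, roughly the size of /usr/movies, in bytes */
        uint64_t real_size = 16600ULL * 1024 * 1024;
        uint32_t wrapped   = (uint32_t)real_size;  /* truncated mod 2^32 */

        printf("64-bit size: %llu bytes\n", (unsigned long long)real_size);
        printf("32-bit size: %u bytes (about 216MB)\n", wrapped);
        return 0;
    }

If anything in the chain saw /usr/movies as ~216MB instead of 16.6GB,
the planner would never consider it worth promoting.)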

Thanks Jean-Louis

Cheers, Gene
While it didn't solve the problem in one run, the amadmin Daily
balance report now shows a non-zero figure for the next day, and the
maximum is now 144%, still on Sunday.

  due-date  #fs    orig MB     out MB   balance
----------------------------------------------
  3/24 Sat    1      17730      10132    -31.2%
  3/25 Sun    4      39335      36074   +144.9%
  3/26 Mon    8       7023       3605    -75.5%
  3/27 Tue    8      21789       9613    -34.8%
  3/28 Wed   15      20464      14241     -3.3%
----------------------------------------------
TOTAL       36     106341      73665     14733
   (estimated 5 runs per dumpcycle)

Apparently there are 4 DLEs that somehow exceed the planner's ability
to move them.  Actually, if I go back to the email from the last
level 0 of /usr/movies, I find there are 5 DLEs that exceed 4GB:
/usr/movies     @ 16GB
/root           @ 8GB
/usr/pix        @ 7.6GB
/var            @ 8.7GB
/usr/share      @ 7.6GB

But 2 days later:
/root was promoted 2 days
/usr/share was promoted 2 days
So there goes the 4GB theory.  I think.

Maybe this will do it, given enough time.  It does seem to be
drifting in the right direction, according to the run that just
finished.

Status updates by the day.

Cheers, Gene
After this night's run, the balance is further improved.

  due-date  #fs    orig MB     out MB   balance
----------------------------------------------
  3/27 Tue    1      21597       9471    -35.2%
  3/28 Wed    1       8796       8796    -39.8%
  3/29 Thu    2      24685      13791     -5.6%
  3/30 Fri    2      23673      23673    +62.0%
  3/31 Sat   30      27020      17321    +18.6%
----------------------------------------------
TOTAL       36     105771      73052     14610
   (estimated 5 runs per dumpcycle)

I am thinking this patch is a keeper.

Cheers, Gene
  due-date  #fs    orig MB     out MB   balance
----------------------------------------------
  3/28 Wed    1       8796       8796    -39.8%
  3/29 Thu    2      24685      13791     -5.6%
  3/30 Fri    2      23673      23673    +62.0%
  3/31 Sat   13      19011      13411     -8.2%
  4/01 Sun   18      29614      13387     -8.4%
----------------------------------------------
TOTAL       36     105779      73058     14611
   (estimated 5 runs per dumpcycle)

Things are improving, but it has yet to move the offenders; instead
it is trying to make the rest fit around them.  It will be
interesting to see what the planner does on the next run, as it
moved a bunch up 4 days last night:

   planner: Incremental of coyote:/home bumped to level 3.
   planner: Full dump of coyote:/GenesAmandaHelper-0.6 promoted from 4 days ahead.
   planner: Full dump of coyote:/usr/local promoted from 4 days ahead.
   planner: Full dump of coyote:/lib promoted from 4 days ahead.
   planner: Full dump of shop:/usr/src promoted from 4 days ahead.
   planner: Full dump of shop:/etc promoted from 4 days ahead.
   planner: Full dump of coyote:/usr/include promoted from 4 days ahead.
   planner: Full dump of shop:/var/lib/amanda promoted from 4 days ahead.
   planner: Full dump of shop:/root promoted from 4 days ahead.
   planner: Full dump of shop:/usr/local promoted from 4 days ahead.
   planner: Full dump of shop:/usr/lib/amanda promoted from 4 days ahead.
   planner: Full dump of coyote:/etc promoted from 4 days ahead.
   planner: Full dump of coyote:/sbin promoted from 4 days ahead.
   planner: Full dump of coyote:/bin promoted from 4 days ahead.
   planner: Full dump of coyote:/usr/uclibc promoted from 4 days ahead.
   planner: Full dump of coyote:/usr/games promoted from 4 days ahead.
   planner: Full dump of coyote:/usr/X11R6 promoted from 4 days ahead.
   planner: Full dump of coyote:/usr/libexec promoted from 4 days ahead.

It's shuffling stuff around at a much higher rate, but still hasn't
touched 2 of the biggest ones.

Thanks Jean-Louis.

Cheers, Gene

diff --git a/server-src/planner.c b/server-src/planner.c
index 5b61c97..d609332 100644
--- a/server-src/planner.c
+++ b/server-src/planner.c
@@ -107,6 +107,7 @@ typedef struct est_s {
     double fullcomp, incrcomp;
     char *errstr;
     char *degr_mesg;
+    gboolean large;
 } est_t;
 
 #define est(dp)	((est_t *)(dp)->up)
@@ -161,6 +162,7 @@ static void get_estimates(void);
 static void analyze_estimate(disk_t *dp);
 static void handle_failed(disk_t *dp);
 static void delay_dumps(void);
+static void promote_largest(void);
 static int promote_highest_priority_incremental(void);
 static int promote_hills(void);
 static void output_scheduleline(disk_t *dp);
@@ -655,6 +657,7 @@ main(
 	    total_lev0, balanced_size);
 
     balance_threshold = balanced_size * PROMOTE_THRESHOLD;
+    promote_largest();
     moved_one = 1;
     while((balanced_size - total_lev0) > balance_threshold && moved_one)
 	moved_one = promote_highest_priority_incremental();
@@ -2526,8 +2529,6 @@ static one_est_t *pick_inclevel(
 */
 
 static void delay_one_dump(disk_t *dp, int delete, ...);
-static int promote_highest_priority_incremental(void);
-static int promote_hills(void);
 
 /* delay any dumps that will not fit */
 static void delay_dumps(void)
@@ -2900,6 +2901,85 @@ static void delay_one_dump(disk_t *dp, int delete, ...)
 }
 
 
+static void promote_largest(void)
+{
+    disk_t    *dp;
+    int        i;
+    int        check_days;
+    one_est_t *level0_est;
+    char      *qname;
+
+    /* Tag the runs_per_cycle largest dles */
+    for (i=0; i < runs_per_cycle; i++) {
+	gint64 level0_size = 0;
+	disk_t *level0_dp = NULL;
+    	for (dp = schedq.head; dp != NULL; dp = dp->next) {
+	    if (est(dp)->large)
+		continue;
+	    level0_est = est_for_level(dp, 0);
+	    if ((!level0_dp && level0_est->nsize > 0) ||
+		(level0_est->csize > level0_size)) {
+		level0_size = level0_est->csize;
+		level0_dp = dp;
+	    }
+	}
+	if (level0_dp) {
+	    est(level0_dp)->large = TRUE;
+	}
+    }
+
+    /* Is a large dle already scheduled today? */
+    for (dp = schedq.head; dp != NULL; dp = dp->next) {
+	if (est(dp)->next_level0 <= 0 && est(dp)->large) {
+	    return;
+	}
+    }
+
+    /* find a day with two largest and promote one */
+    for (check_days=1; check_days < conf_dumpcycle; check_days++) {
+	gint64 level0_size = 0;
+	disk_t *level0_dp = NULL;
+    	for (dp = schedq.head; dp != NULL; dp = dp->next) {
+	    if (!est(dp)->large)
+		continue;
+	    if (est(dp)->next_level0 != check_days)
+		continue;
+	    level0_est = est_for_level(dp, 0);
+	    if (!level0_dp) {
+		level0_size = level0_est->csize;
+		level0_dp = dp;
+	    } else {
+		if (level0_est->csize < level0_size) {
+		    level0_dp = dp;
+		}
+		dp = level0_dp;	/* promote level0_dp */
+		level0_est = est_for_level(dp, 0);
+
+		qname = quote_string(dp->name);
+		total_size = total_size - est(dp)->dump_est->csize + level0_est->csize;
+		total_lev0 = (gint64)total_lev0 + level0_est->csize;
+
+		est(dp)->degr_est = est(dp)->dump_est;
+		est(dp)->dump_est = level0_est;
+		est(dp)->next_level0 = 0;
+
+		g_fprintf(stderr,
+			_("   promote largest: moving %s:%s up, total_lev0 %1.0lf, total_size %lld\n"),
+			dp->host->hostname, qname,
+			total_lev0, (long long)total_size);
+
+		log_add(L_INFO,
+			plural(_("Full dump of %s:%s promoted from %d day ahead."),
+			       _("Full dump of %s:%s promoted from %d days ahead."),
+				check_days),
+			dp->host->hostname, qname, check_days);
+		amfree(qname);
+		return;
+	    }
+	}
+    }
+}
+
 static int promote_highest_priority_incremental(void)
 {
     disk_t *dp, *dp1, *dp_promote;
@@ -2907,6 +2987,7 @@ static int promote_highest_priority_incremental(void)
     int check_days;
     int nb_today, nb_same_day, nb_today2;
     int nb_disk_today, nb_disk_same_day;
+    gint64 same_day_lev0;
     char *qname;
 
     /*
@@ -2928,6 +3009,9 @@ static int promote_highest_priority_incremental(void)
 	if(est(dp)->next_level0 > dp->maxpromoteday)
 	    continue;
 
+	if (est(dp)->large)
+	    continue;
+
 	new_total = total_size - est(dp)->dump_est->csize + level0_est->csize;
 	new_lev0 = (gint64)total_lev0 + level0_est->csize;
 
@@ -2935,11 +3019,14 @@ static int promote_highest_priority_incremental(void)
 	nb_same_day = 0;
 	nb_disk_today = 0;
 	nb_disk_same_day = 0;
+	same_day_lev0 = 0;
 	for(dp1 = schedq.head; dp1 != NULL; dp1 = dp1->next) {
-	    if(est(dp1)->dump_est->level == 0)
+	    if (est(dp1)->dump_est->level == 0) {
 		nb_disk_today++;
-	    else if(est(dp1)->next_level0 == est(dp)->next_level0)
+	    } else if (est(dp1)->next_level0 == est(dp)->next_level0) {
 		nb_disk_same_day++;
+		same_day_lev0 += est(dp1)->last_lev0size;
+	    }
 	    if(g_str_equal(dp->host->hostname, dp1->host->hostname)) {
 		if(est(dp1)->dump_est->level == 0)
 		    nb_today++;
@@ -2952,10 +3039,23 @@ static int promote_highest_priority_incremental(void)
 	if(new_total > tape_length)
 	    continue;
 
-	/* do not promote if overflow balanced size and something today */
+	/* do not promote if today's full is larger than that day's full */
+	if (new_lev0 > same_day_lev0) {
+	    /* TODO: It might be good to promote in some case  */
+	    /*     6 1 1 1 1 1 1                               */
+	    /*     9 5                                         */
+	    /* promoting the 5 on the first day and            */
+	    /* promoting 3x1 the next day                      */
+	    continue;
+	}
+
+	/* do not promote if not overflowing the balanced size, something is */
+	/* scheduled today, and today's full would be larger than that day's full */
 	/* promote if nothing today */
-	if((new_lev0 > (gint64)(balanced_size + balance_threshold)) &&
-		(nb_disk_today > 0))
+	if ((new_lev0 < (gint64)(balanced_size + balance_threshold)) &&
+	    (same_day_lev0 < (gint64)(balanced_size + balance_threshold)) &&
+	    (new_lev0 > same_day_lev0 - level0_est->csize) &&
+	    (nb_disk_today > 0))
 	    continue;
 
 	/* do not promote if only one disk due that day and nothing today */
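
For anyone reading the patch without the rest of planner.c handy: the
first loop in promote_largest() is a selection pass that tags the
runs_per_cycle biggest DLEs by level-0 size, so that
promote_highest_priority_incremental() now skips them, and the later
loops move one tagged full onto a day of its own.  A simplified
standalone sketch of that tagging pass, with plain C types standing in
for the planner's disk_t/est_t pair (names and sizes here are made up
for illustration):

    #include <stdio.h>
    #include <stdint.h>
    #include <stdbool.h>

    typedef struct dle_s {
        const char *name;
        int64_t     csize;   /* level-0 compressed size estimate, in MB */
        bool        large;
    } dle_t;

    /* tag the n largest untagged entries, one per outer pass,
     * mirroring the patch's "Tag the ... largest" loop */
    static void tag_largest(dle_t *dles, int ndles, int n)
    {
        for (int i = 0; i < n; i++) {
            dle_t *best = NULL;
            for (int j = 0; j < ndles; j++) {
                if (dles[j].large)
                    continue;
                if (!best || dles[j].csize > best->csize)
                    best = &dles[j];
            }
            if (best)
                best->large = true;
        }
    }

    int main(void)
    {
        dle_t dles[] = {
            { "/usr/movies", 16600, false },
            { "/var",         8700, false },
            { "/usr/pix",     7600, false },
            { "/etc",          120, false },
        };
        tag_largest(dles, 4, 2);    /* pretend runs_per_cycle == 2 */
        for (int i = 0; i < 4; i++)
            printf("%-12s %6lld %s\n", dles[i].name,
                   (long long)dles[i].csize, dles[i].large ? "large" : "");
        return 0;
    }

With runs_per_cycle == 2 this tags /usr/movies and /var, which is the
point of the patch: the biggest fulls each get pinned to their own run
instead of being shuffled by the hill-levelling heuristics.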
