Gene,
Try this newer patch; remove the previous patch before applying this one.
Jean-Louis
On 03/28/2012 06:54 AM, gene heskett wrote:
On Wednesday, March 28, 2012 06:45:20 AM gene heskett did opine:
On Tuesday, March 27, 2012 06:01:21 AM gene heskett did opine:
On Saturday, March 24, 2012 02:43:58 AM gene heskett did opine:
On Friday, March 23, 2012 09:45:12 AM Jean-Louis Martineau did opine:
Hi Gene,
Can you try the attached patch? (it is lightly tested and
uncommitted).
Jean-Louis
Building now; I applied it to 4.0-4613.
Last night it promoted about 20 DLEs, some by as much as 4 days, but
the balance report stubbornly maintains it's going to do a 45 GB
backup Sunday morning. It is not promoting or adjusting anything
over about 4 GB; those are apparently untouchable.
After the build/install,
[root@coyote amanda]# su amanda -c "amadmin Daily balance"
 due-date  #fs    orig MB     out MB   balance
----------------------------------------------
 3/23 Fri    0          0          0      ---
 3/24 Sat    4      20596      11128   -24.4%
 3/25 Sun    5      48037      44776  +204.4%
 3/26 Mon    8       7023       3605   -75.5%
 3/27 Tue   19      30582      14054    -4.5%
----------------------------------------------
TOTAL       36     106238      73563    14712
  (estimated 5 runs per dumpcycle)
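For reference, the figures in the report above are consistent with the balance column being each day's "out MB" compared against the per-run average, that is, the TOTAL "out MB" divided by the estimated runs per dumpcycle (the last number on the TOTAL line). A minimal sketch of that arithmetic follows; the function names are hypothetical and this is not amadmin's actual code, which presumably works on unrounded sizes.

```c
#include <assert.h>

/* Per-run target size: the total output spread evenly over the runs.
 * Integer division, as in the report: 73563 / 5 = 14712. */
long balanced_size_mb(long total_out_mb, int runs_per_cycle)
{
    assert(runs_per_cycle > 0);
    return total_out_mb / runs_per_cycle;
}

/* Deviation of one day's output from the per-run target, in percent.
 * Negative means the day is lighter than average, positive heavier. */
double balance_percent(long day_out_mb, long total_out_mb, int runs_per_cycle)
{
    double target;

    assert(runs_per_cycle > 0 && total_out_mb > 0);
    target = (double)total_out_mb / runs_per_cycle;
    return ((double)day_out_mb - target) / target * 100.0;
}
```

With the numbers from the report, balance_percent(11128, 73563, 5) comes out at roughly -24.4, matching the 3/24 row. Small differences in the last digit on other rows are to be expected, since the MB values shown are already rounded.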
So it remains to be seen if this patch works. I've no clue if the
amadmin /config/ balance uses the code just patched. I'll find out
in the morning I guess.
I just parked this on another screen and checked my emails from
amanda, scanning the planner's output clear back to Feb 24th. It is
as if /usr/movies at 16.6 GB does not exist; it never attempts to
promote it to a less busy run. Ditto for /usr/pix at 7.5 GB; it is
never mentioned by the planner. Those 2 are stuck on the same level
0 day together and have been for a very long time. I would do a
one-time force on /usr/movies, but that would hide the problem only
for me.
Could this be a >4 GB problem in the math someplace?
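To make the question concrete: if a size were ever converted to bytes in a 32-bit unsigned variable, it would wrap around at 4 GiB, and anything larger would look much smaller than it is. The sketch below only illustrates that general failure mode. The function names are hypothetical, and nothing here is taken from planner.c or shown to happen there.

```c
#include <stdint.h>

/* Convert a size in KB to bytes using only 32-bit arithmetic.
 * The product wraps modulo 2^32, so results at or above 4 GiB are wrong. */
uint32_t kb_to_bytes_32(uint32_t size_kb)
{
    return size_kb * 1024u;
}

/* Same conversion, widened to 64 bits before multiplying, so no wrap. */
uint64_t kb_to_bytes_64(uint32_t size_kb)
{
    return (uint64_t)size_kb * 1024u;
}
```

A 16 GiB DLE (16777216 KB) comes out as 0 bytes from the 32-bit version, while a 3 GiB one (3145728 KB) survives intact. That is the kind of threshold behaviour a ">4 GB problem" would produce.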
Thanks Jean-Louis
Cheers, Gene
While it didn't solve the problem in one run, the amadmin Daily
balance report now shows a non-zero entry for the next day, and the
maximum is now about 144%, still on Sunday.
 due-date  #fs    orig MB     out MB   balance
----------------------------------------------
 3/24 Sat    1      17730      10132   -31.2%
 3/25 Sun    4      39335      36074  +144.9%
 3/26 Mon    8       7023       3605   -75.5%
 3/27 Tue    8      21789       9613   -34.8%
 3/28 Wed   15      20464      14241    -3.3%
----------------------------------------------
TOTAL       36     106341      73665    14733
  (estimated 5 runs per dumpcycle)
Apparently there are 4 DLEs that somehow exceed the planner's ability
to move them. Actually, if I back up to the last level 0 on
/usr/movies, I find there are 5 DLEs that exceed 4 GB in that day's
email.
/usr/movies @ 16 GB
/root @ 8 GB
/usr/pix @ 7.6 GB
/var @ 8.7 GB
/usr/share @ 7.6 GB
But 2 days later:
/root was promoted 2 days
/usr/share was promoted 2 days
So there goes the 4 GB theory. I think.
Maybe this will do it, given enough time. It does seem to be drifting
in the right direction, according to the run that just finished.
Status updates by the day.
Cheers, Gene
After this night's run, the balance is further improved.
 due-date  #fs    orig MB     out MB   balance
----------------------------------------------
 3/27 Tue    1      21597       9471   -35.2%
 3/28 Wed    1       8796       8796   -39.8%
 3/29 Thu    2      24685      13791    -5.6%
 3/30 Fri    2      23673      23673   +62.0%
 3/31 Sat   30      27020      17321   +18.6%
----------------------------------------------
TOTAL       36     105771      73052    14610
  (estimated 5 runs per dumpcycle)
I am thinking this patch is a keeper.
Cheers, Gene
 due-date  #fs    orig MB     out MB   balance
----------------------------------------------
 3/28 Wed    1       8796       8796   -39.8%
 3/29 Thu    2      24685      13791    -5.6%
 3/30 Fri    2      23673      23673   +62.0%
 3/31 Sat   13      19011      13411    -8.2%
 4/01 Sun   18      29614      13387    -8.4%
----------------------------------------------
TOTAL       36     105779      73058    14611
  (estimated 5 runs per dumpcycle)
Things are improving, but it has yet to move the offenders; instead it
is trying to make the rest fit around them. It will be interesting to
see what the planner does on the next run, as it moved a bunch up 4 days
last night:
planner: Incremental of coyote:/home bumped to level 3.
planner: Full dump of coyote:/GenesAmandaHelper-0.6 promoted from 4 days ahead.
planner: Full dump of coyote:/usr/local promoted from 4 days ahead.
planner: Full dump of coyote:/lib promoted from 4 days ahead.
planner: Full dump of shop:/usr/src promoted from 4 days ahead.
planner: Full dump of shop:/etc promoted from 4 days ahead.
planner: Full dump of coyote:/usr/include promoted from 4 days ahead.
planner: Full dump of shop:/var/lib/amanda promoted from 4 days ahead.
planner: Full dump of shop:/root promoted from 4 days ahead.
planner: Full dump of shop:/usr/local promoted from 4 days ahead.
planner: Full dump of shop:/usr/lib/amanda promoted from 4 days ahead.
planner: Full dump of coyote:/etc promoted from 4 days ahead.
planner: Full dump of coyote:/sbin promoted from 4 days ahead.
planner: Full dump of coyote:/bin promoted from 4 days ahead.
planner: Full dump of coyote:/usr/uclibc promoted from 4 days ahead.
planner: Full dump of coyote:/usr/games promoted from 4 days ahead.
planner: Full dump of coyote:/usr/X11R6 promoted from 4 days ahead.
planner: Full dump of coyote:/usr/libexec promoted from 4 days ahead.
It's shuffling stuff around at a much higher rate, but it still hasn't
touched 2 of the biggest ones.
Thanks Jean-Louis.
Cheers, Gene
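For readers following along: the first step of promote_largest() in the patch below tags the runs_per_cycle largest DLEs by repeatedly scanning for the biggest untagged level 0 estimate. Here is a simplified, self-contained sketch of that selection. It uses a hypothetical dle_t array in place of Amanda's schedq list and est_t, and it skips entries without a positive estimate, so treat it as an illustration of the idea rather than the patch's exact logic.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a DLE; not Amanda's disk_t or est_t. */
typedef struct {
    const char *name;
    int64_t     level0_kb;  /* estimated compressed level 0 size */
    int         large;      /* set to 1 once tagged as "large" */
} dle_t;

/* Tag up to `count` of the largest entries in dles[0..n-1].
 * Returns how many entries were actually tagged. */
size_t tag_largest(dle_t *dles, size_t n, size_t count)
{
    size_t tagged = 0;
    size_t i, j;

    for (i = 0; i < count; i++) {
        dle_t *best = NULL;

        for (j = 0; j < n; j++) {
            if (dles[j].large || dles[j].level0_kb <= 0)
                continue;           /* already tagged, or no estimate */
            if (best == NULL || dles[j].level0_kb > best->level0_kb)
                best = &dles[j];
        }
        if (best == NULL)
            break;                  /* nothing left to tag */
        best->large = 1;
        tagged++;
    }
    return tagged;
}
```

In the patch, DLEs tagged this way are skipped by promote_highest_priority_incremental(). promote_largest() itself moves at most one of them per run, and only when no tagged DLE is already due today and two tagged DLEs fall due on the same future day.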
diff --git a/server-src/planner.c b/server-src/planner.c
index 5b61c97..d609332 100644
--- a/server-src/planner.c
+++ b/server-src/planner.c
@@ -107,6 +107,7 @@ typedef struct est_s {
double fullcomp, incrcomp;
char *errstr;
char *degr_mesg;
+ gboolean large;
} est_t;
#define est(dp) ((est_t *)(dp)->up)
@@ -161,6 +162,7 @@ static void get_estimates(void);
static void analyze_estimate(disk_t *dp);
static void handle_failed(disk_t *dp);
static void delay_dumps(void);
+static void promote_largest(void);
static int promote_highest_priority_incremental(void);
static int promote_hills(void);
static void output_scheduleline(disk_t *dp);
@@ -655,6 +657,7 @@ main(
total_lev0, balanced_size);
balance_threshold = balanced_size * PROMOTE_THRESHOLD;
+ promote_largest();
moved_one = 1;
while((balanced_size - total_lev0) > balance_threshold && moved_one)
moved_one = promote_highest_priority_incremental();
@@ -2526,8 +2529,6 @@ static one_est_t *pick_inclevel(
*/
static void delay_one_dump(disk_t *dp, int delete, ...);
-static int promote_highest_priority_incremental(void);
-static int promote_hills(void);
/* delay any dumps that will not fit */
static void delay_dumps(void)
@@ -2900,6 +2901,85 @@ static void delay_one_dump(disk_t *dp, int delete, ...)
}
+static void promote_largest(void)
+{
+ disk_t *dp;
+ int i;
+ int check_days;
+ one_est_t *level0_est;
+ char *qname;
+
+ /* Tag the largest runspercycle dles */
+ for (i=0; i < runs_per_cycle; i++) {
+ gint64 level0_size = 0;
+ disk_t *level0_dp = NULL;
+ for (dp = schedq.head; dp != NULL; dp = dp->next) {
+ if (est(dp)->large)
+ continue;
+ level0_est = est_for_level(dp, 0);
+ if ((!level0_dp && level0_est->nsize > 0) ||
+ (level0_est->csize > level0_size)) {
+ level0_size = level0_est->csize;
+ level0_dp = dp;
+ }
+ }
+ if (level0_dp) {
+ est(level0_dp)->large = TRUE;
+ }
+ }
+
+ /* Is a large DLE already scheduled today? */
+ for (dp = schedq.head; dp != NULL; dp = dp->next) {
+ if (est(dp)->next_level0 <= 0 && est(dp)->large) {
+ return;
+ }
+ }
+
+ /* find a day with two largest and promote one */
+ for (check_days=1; check_days < conf_dumpcycle; check_days++) {
+ gint64 level0_size = 0;
+ disk_t *level0_dp = NULL;
+ for (dp = schedq.head; dp != NULL; dp = dp->next) {
+ if (!est(dp)->large)
+ continue;
+ if (est(dp)->next_level0 != check_days)
+ continue;
+ level0_est = est_for_level(dp, 0);
+ if (!level0_dp) {
+ level0_size = level0_est->csize;
+ level0_dp = dp;
+ } else {
+ if (level0_est->csize < level0_size) {
+ level0_dp = dp;
+ }
+ /* promote level0_dp; re-fetch its own level 0 estimate */
+ dp = level0_dp;
+ level0_est = est_for_level(dp, 0);
+ qname = quote_string(dp->name);
+ total_size = total_size - est(dp)->dump_est->csize + level0_est->csize;
+ total_lev0 = (gint64)total_lev0 + level0_est->csize;
+
+ est(dp)->degr_est = est(dp)->dump_est;
+ est(dp)->dump_est = level0_est;
+ est(dp)->next_level0 = 0;
+
+ g_fprintf(stderr,
+ _(" promote largest: moving %s:%s up, total_lev0 %1.0lf, total_size %lld\n"),
+ dp->host->hostname, qname,
+ total_lev0, (long long)total_size);
+
+ log_add(L_INFO,
+ plural(_("Full dump of %s:%s promoted from %d day ahead."),
+ ("Full dump of %s:%s promoted from %d days ahead."),
+ check_days),
+ dp->host->hostname, qname, check_days);
+ amfree(qname);
+ return;
+ }
+ }
+ }
+}
+
static int promote_highest_priority_incremental(void)
{
disk_t *dp, *dp1, *dp_promote;
@@ -2907,6 +2987,7 @@ static int promote_highest_priority_incremental(void)
int check_days;
int nb_today, nb_same_day, nb_today2;
int nb_disk_today, nb_disk_same_day;
+ gint64 same_day_lev0;
char *qname;
/*
@@ -2928,6 +3009,9 @@ static int promote_highest_priority_incremental(void)
if(est(dp)->next_level0 > dp->maxpromoteday)
continue;
+ if (est(dp)->large)
+ continue;
+
new_total = total_size - est(dp)->dump_est->csize + level0_est->csize;
new_lev0 = (gint64)total_lev0 + level0_est->csize;
@@ -2935,11 +3019,14 @@ static int promote_highest_priority_incremental(void)
nb_same_day = 0;
nb_disk_today = 0;
nb_disk_same_day = 0;
+ same_day_lev0 = 0;
for(dp1 = schedq.head; dp1 != NULL; dp1 = dp1->next) {
- if(est(dp1)->dump_est->level == 0)
+ if (est(dp1)->dump_est->level == 0) {
nb_disk_today++;
- else if(est(dp1)->next_level0 == est(dp)->next_level0)
+ } else if (est(dp1)->next_level0 == est(dp)->next_level0) {
nb_disk_same_day++;
+ same_day_lev0 += est(dp1)->last_lev0size;
+ }
if(g_str_equal(dp->host->hostname, dp1->host->hostname)) {
if(est(dp1)->dump_est->level == 0)
nb_today++;
@@ -2952,10 +3039,23 @@ static int promote_highest_priority_incremental(void)
if(new_total > tape_length)
continue;
- /* do not promote if overflow balanced size and something today */
+ /* do not promote if today's full is larger than that day's full */
+ if (new_lev0 > same_day_lev0) {
+ /* TODO: It might be good to promote in some case */
+ /* 6 1 1 1 1 1 1 */
+ /* 9 5 */
+ /* promoting the 5 on the first day and */
+ /* promoting 3x1 the next day */
+ continue;
+ }
+
+ /* do not promote if not overflow balanced size and something today */
+ /* and if today full will be larger than that day full */
/* promote if nothing today */
- if((new_lev0 > (gint64)(balanced_size + balance_threshold)) &&
- (nb_disk_today > 0))
+ if ((new_lev0 < (gint64)(balanced_size + balance_threshold)) &&
+ (same_day_lev0 < (gint64)(balanced_size + balance_threshold)) &&
+ (new_lev0 > same_day_lev0 - level0_est->csize) &&
+ (nb_disk_today > 0))
continue;
/* do not promote if only one disk due that day and nothing today */