Elizabeth Schwartz wrote:
> Fixed-in-stone parameters:
>
> 1) We outsource backup to central IT.
> 2) We are charged by the gigabyte, per backup run
> 3) The backup provider uses Legato
> 4) We would like to minimize backup cost.
> 5) Our data tends to be large and "clumpy" - in some random 1-to-4-day
> period, someone will write 10 gigs of data which will then sit there
> for months unchanged.
> 6) The data is on a very very reliable SAN so full restores are
> unlikely. We are willing to risk slow restores.
> 7) We do individual file restores occasionally. No more than once a month
> max.
>
>
> The current backup strategy, which we didn't design and *** initially
> were unable to negotiate changes in *** is to do a full every 59 days,
> then a weekly rotation of 7,6,4,5,3,2,1 (which is the same as doing
> 2,2,2,2,2,2,1, I know). This strategy minimizes tapes needed to do a
> full restore, but for data like ours it tends to maximize *cost*.
> It's time to negotiate a change.
>
> The obvious thing would be to go to full every 59 days and a
> 1,2,3,4,5,6,7 rotation, but I'm thinking we should get a little
> creative. If we do a full on Sunday and someone dumps 10G on Monday,
> that 10G will get backed up eight times.  I'm thinking the first week
> should start with a 1, second with a 2, third with a 3, etc ...
> (Legato has 9 levels, I believe...?)
>
> any thoughts, suggestions, clue bonks?

So you are not doing daily backups, only weekly?

If so, then running levels 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 in successive 
weeks would get you the incremental changes each week (for 10 weeks) 
with no duplication of what had already been backed up. I'm not a fan of 
such approaches. I do daily backups and keep them going back 6 weeks, as 
well as periodic archives. However, as you said, you need to minimize 
cost and aren't so concerned about fulls, because you have faith in your 
SAN. I won't get into that. It is what it is.

Your comment about the 10G getting backed up eight times doesn't make 
sense to me. The Unix definition of the levels 0 through 9 is that 0 is 
a full, and each higher level backs up anything that has changed since 
the last backup of a lower level. If the definition were "anything that 
has changed since the last level 0", then there would be no difference 
between the levels and no point in having them.
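
To make that rule concrete, here is a small Python sketch. It is only an illustration of the Unix dump-style definition above, not anything taken from Legato; the function name reference_run is made up for this example. Given the levels in the order the runs happen, it reports which earlier run each backup is incremental to.

```python
def reference_run(levels):
    """For each run, return the index of the earlier run it is incremental to.

    Rule (Unix dump-style levels): a level-N run copies everything changed
    since the most recent earlier run whose level is strictly lower than N.
    None means there is no such run, so everything is copied (a level 0
    always falls in this case).
    """
    refs = []
    for i, level in enumerate(levels):
        ref = None
        # Walk backwards to find the most recent run with a lower level.
        for j in range(i - 1, -1, -1):
            if levels[j] < level:
                ref = j
                break
        refs.append(ref)
    return refs


# Increasing levels: each run is relative to the run just before it.
print(reference_run([0, 1, 2, 3]))     # [None, 0, 1, 2]
# Repeating a level: each level 2 goes back to the same lower-level run.
print(reference_run([0, 2, 2, 2, 1]))  # [None, 0, 0, 0, 0]
```

The second call shows why repeating the same level re-copies data: every one of those runs reaches back to the same lower-level backup.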

If you wanted to run daily backups and still minimize duplication (and 
thus costs), you could run something like:

week 1: level 0,1,2,3,4 (running the 0 on Friday night and skipping 
Saturday and Sunday night)
week 2: level 1,2,3,4,5 (running the 1 on Friday . . .)
week 3: level 2,3,4,5,6
week 4: level 3,4,5,6,7
week 5: level 4,5,6,7,8
week 6: level 5,6,7,8,9

That is a slightly odd definition of a week, since I'm taking the full 
on a Friday and calling the week Friday through Thursday. In effect, 
each week's starting point is taken at the end of the previous week, 
followed by an incremental for each day of the week.

Then you would have any 10G file backed up at most twice, since items 
done by the daily incremental backups from Monday through Thursday would 
be included in the weekly incremental on Friday. After 6 weeks you would 
run out of levels and might as well start out again with a full.
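
If you want to check the "at most twice" claim rather than take my word for it, here is a rough Python simulation. It assumes the Unix dump-style level rule (a run copies whatever changed since the most recent run of a strictly lower level) and that a file is written once and never touched again; the function name copies_per_write_slot is invented for this sketch, and it says nothing about how Legato actually bills.

```python
def copies_per_write_slot(levels):
    """Count how many runs copy data written just before each run.

    levels: backup levels in the order the runs happen.
    Returns a list where entry d is the number of runs that include data
    written after run d-1 and before run d, and never modified again.
    """
    # Index of the reference run for each run; -1 means "copy everything".
    refs = []
    for i, level in enumerate(levels):
        ref = -1
        for j in range(i - 1, -1, -1):
            if levels[j] < level:
                ref = j
                break
        refs.append(ref)

    copies = []
    for d in range(len(levels)):
        # Run i (at or after slot d) includes the data only if its
        # reference run happened before the data was written: ref < d.
        copies.append(sum(1 for i in range(d, len(levels)) if refs[i] < d))
    return copies


# Six weeks of five runs each: week w (0-based) uses levels w .. w+4,
# i.e. 0,1,2,3,4 then 1,2,3,4,5 and so on up to 5,6,7,8,9.
schedule = [w + k for w in range(6) for k in range(5)]
copies = copies_per_write_slot(schedule)
print(max(copies))  # 2
```

Data written just before a Friday run is copied once; data written before a Monday through Thursday run is copied that night and again on the following Friday.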

These approaches are aimed, of course, at minimizing cost. As you said, 
IT based their scheme on minimizing the number of tapes required for a 
full recovery. Oftentimes, people come up with schemes that compromise 
between the two goals. I really like Amanda's intelligent planner, 
because it minimizes the variation in resource usage, smoothing server 
demand, tape use, and network use over the dump cycle (say, a week). I 
don't think there is any other product that does that. But that doesn't 
solve your problem either. It works for me, because we run our own 
backups.


-- 
---------------

Chris Hoogendyk

-
   O__  ---- Systems Administrator
  c/ /'_ --- Biology & Geology Departments
 (*) \(*) -- 140 Morrill Science Center
~~~~~~~~~~ - University of Massachusetts, Amherst 

<[email protected]>

--------------- 

Erdös 4


_______________________________________________
Tech mailing list
[email protected]
http://lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
 http://lopsa.org/
