On Sat, Nov 28, 2009 at 7:57 AM, Ralf Gross <ralf-li...@ralfgross.de> wrote:
> Arno Lehmann schrieb:
> > 27.11.2009 13:23, Ralf Gross wrote:
> > > [crosspost to -users and -devel list]
> > >
> > > Hi,
> > >
> > > we have been happily using Bacula for a few years now and are already
> > > backing up some dozens of TB (large video files) to tape.
> > >
> > > In the next 2-3 years the amount of data will grow to 300+ TB.
> > > We are looking at some very pricey solutions for the primary storage
> > > at the moment (NetApp etc.). But we (I) are also checking whether it
> > > is possible to carry on with the way we store the data right now,
> > > which is just some large RAID arrays and backup2tape.
> >
> > Good luck... while I agree that SAN/NAS appliances tend to look
> > expensive, they've got their advantages when your space has to grow to
> > really big sizes. Managing only one setup when several physical disk
> > arrays work together is one of these advantages.
>
> I fully agree. But this comes at a price that is 5-10 times higher
> than a setup with simple RAID arrays and a large changer. In the end
> I'll present 2 or 3 concepts and others will decide how valuable the
> data is.
>
>
> > Also, if you're running a number of big RAID arrays, reliable vendor
> > support is surely beneficial.
> >
> > > I've no problem with the primary storage and 10 very big RAID boxes
> > > (high availability is not needed). What frightens me is backing up all
> > > the data. Most files will be written once and maybe never be
> > > accessed again.
> >
> > Yearly full backups, and then only incrementals (using accurate backup
> > mode) should be a usable approach. Depending on how often you expect
> > to need recovery, you may want your boss to spend some money on a
> > bigger tape library :-) to make sure most data can be restored without
> > manually loading tapes.
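> >
> > A minimal sketch of that scheme in bacula-dir.conf (resource names are
> > made up, and Client/FileSet/Storage/Pool are elided):
> >
> >   Schedule {
> >     Name = "YearlyFullCycle"
> >     # One Full per year, Incrementals the rest of the time.
> >     # (On the day the Full runs, the extra tiny Incremental is harmless.)
> >     Run = Level=Full jan 1st sun at 02:05
> >     Run = Level=Incremental daily at 02:05
> >   }
> >
> >   Job {
> >     Name = "ArchiveBackup"
> >     Type = Backup
> >     Schedule = "YearlyFullCycle"
> >     # Accurate mode makes Incrementals notice deletes/renames, which
> >     # matters when the last Full can be nearly a year old.
> >     Accurate = yes
> >     ...
> >   }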
>
> A big 500-slot library is already part of the plan.
>
>
> > > But the data needs to be online, and there is a
> > > requirement for backups and the ability to restore deleted files
> > > (retention times can differ, ranging from 6 months to a couple of
> > > years).
> >
> > A 6-month retention time, with data actually being pruned, would
> > probably be easier with full backups more often than once a year.
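> >
> > The pruning side of that is driven by retention in the Pool resource;
> > a rough sketch (pool name made up):
> >
> >   Pool {
> >     Name = "SixMonthPool"
> >     Pool Type = Backup
> >     # Records are pruned and volumes become recyclable after 6 months.
> >     Volume Retention = 6 months
> >     AutoPrune = yes
> >     Recycle = yes
> >   }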
> >
> > I think you should start by defining how long you want to keep your
> > data, and how to do full backups when those jobs will surely run longer
> > than your regular backup windows: either split the jobs into
> > smaller parts, or make sure you can run backups over a
> > non-production network while measuring the impact of backups on other
> > file system accesses.
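> >
> > Splitting could simply mean one Job per subtree, each with its own
> > FileSet, so the parts can run on different nights (paths are invented):
> >
> >   FileSet {
> >     Name = "Video-Part1"
> >     Include {
> >       Options {
> >         signature = MD5
> >       }
> >       File = /data/video/projects-a-m
> >     }
> >   }
> >
> >   FileSet {
> >     Name = "Video-Part2"
> >     Include {
> >       Options {
> >         signature = MD5
> >       }
> >       File = /data/video/projects-n-z
> >     }
> >   }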
>
> Some of the data will only be for a couple of months on the filer,
> some for a couple of years. The filer(s) won't be that busy, and there
> is a dedicated LAN for backups.
>
>
> > > The snapshot feature of some commercial products is a very nice
> > > feature for taking backups, and it's a huge benefit that only the
> > > deltas are stored.
> >
> > You can build on the snapshot capability of SAN filers with Bacula.
> > You'll still get normal file backups, but that's an advantage IMO... the
> > most useful aspect of those snapshots is that you get a consistent state
> > of the file system, and don't affect production access more than
> > necessary.
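> >
> > Wiring a snapshot into a job is just a pair of run scripts; the script
> > names here are hypothetical placeholders for whatever your filer's
> > snapshot commands are:
> >
> >   Job {
> >     Name = "FilerBackup"
> >     Type = Backup
> >     # Create the snapshot before the backup, drop it afterwards.
> >     ClientRunBeforeJob = "/usr/local/sbin/snap-create.sh"
> >     ClientRunAfterJob  = "/usr/local/sbin/snap-delete.sh"
> >     # The FileSet should point at the snapshot mount, not the live fs.
> >     FileSet = "SnapshotFS"
> >     ...
> >   }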
> >
> > > Backup2tape means that with the classic Full/Diff/Incr setup we'll
> > > need many tapes, even if the data on the primary storage won't change.
> >
> > Sure - a backup is reliable if you've got at least two copies of your
> > files, so for 300 TB, you'll need some tapes. But tapes are probably
> > cheaper than the required disk capacity for a NetApp filer :-)
>
> Compared to a NetApp filer, tapes are definitely cheaper. Using cheaper
> RAID arrays, it might be a bit different.
>
We are in a similar situation, where we will possibly be expanding our
storage much faster than we ever anticipated. We recently attended a small
vendor show and I was impressed by the F5 ARX product. Our data would mostly
be read/written for a short while and then sit for a long time, and this
data would be intermixed. We will also be presenting this data over CIFS.
F5 ARX seems to be a great fit for the following reasons:

1. We can put policies on the data and tier it transparently: when data has
   not been accessed for 90 days, move it to tier-2 disk, and after another
   predefined time move it to tier 3 (a dedup box or something).

2. It cuts backup times down significantly. We would do GFS on tier 1 (only
   files that have recently changed), then do monthlies on tier 2 right
   after the policy moves data there, and then do something like once every
   6 months on tier 3 (see the sketch below). With tier 3 being on a dedup
   box, we may even be able to write backups to the dedup box and get
   backups for free.

We have a Neo8000, and we are going to try to wait until LTO5 comes out to
upgrade its drives so it can serve as our archival tier. I was looking for a
transparent tiering solution; I didn't realize how much it could reduce
backup time as well. The only thing I'm not sure F5 can handle is Shadow
Copy. I'm not sure if there is anything else out there that does the same
thing as F5, but we will be looking into it before we purchase. That may
give you an idea that you may not have had previously.
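
In Bacula terms, the per-tier schedules might look roughly like this
(all names invented, pools and jobs elided):

  # Tier 1: classic GFS on the data that still changes.
  Schedule {
    Name = "Tier1-GFS"
    Run = Level=Full 1st sun at 01:05
    Run = Level=Differential 2nd-5th sun at 01:05
    Run = Level=Incremental mon-sat at 01:05
  }

  # Tier 2: a monthly Full, timed right after the 90-day policy
  # has migrated data down.
  Schedule {
    Name = "Tier2-Monthly"
    Run = Level=Full 1st sat at 01:05
  }

  # Tier 3: a Full sweep twice a year on the dedup box.
  Schedule {
    Name = "Tier3-Semiannual"
    Run = Level=Full jan 1st sun at 01:05
    Run = Level=Full jul 1st sun at 01:05
  }
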
Robert LeBlanc
Life Sciences & Undergraduate Education Computer Support
Brigham Young University