>
>
> You seem to have the illusion that sql can magically avoid the head motions
> that
> make backuppc slow while still getting the same things on the same disks.
>  While
> it is possible to tune most sql servers to put different tables on
> different
> drives, there's a fair chance that in a default configuration it will
> perform
> worse than using links.
>
>
I don't think so.  MySQL, like any database, is designed to handle small chunks
of data very efficiently.  It will cache and reorder writes much better than
a filesystem for pieces of data that are consistently the same size.
Remember, I don't think for one second that the files themselves should be
stored in the database.  We are talking about the difference between head
movement over a large disk vs. head movement over a tiny file (relative to the
cpool).   We are also talking about entries in the database that are just
attributes: filename, hash, disk location, dates.  Not big data.
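To make the idea concrete, here is a minimal sketch of the kind of metadata-only table described above, using SQLite in place of MySQL.  The `backup_files` table and all its column names are hypothetical -- BackupPC has no such schema today; the point is that each row is a small, fixed-shape record of attributes while the file bodies stay in the pool on disk.

```python
import sqlite3

# Hypothetical metadata-only table: attributes, filename, hash,
# disk location, dates -- never the file contents themselves.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE backup_files (
        id        INTEGER PRIMARY KEY,
        host      TEXT NOT NULL,
        backup_no INTEGER NOT NULL,
        path      TEXT NOT NULL,   -- filename and path as backed up
        hash      TEXT NOT NULL,   -- pool hash; the file body stays on disk
        pool_path TEXT NOT NULL,   -- where the content lives in the cpool
        mtime     INTEGER,         -- attribute data only
        size      INTEGER
    )
""")
# One row replaces one hardlink; the row is tiny and uniformly sized.
conn.execute(
    "INSERT INTO backup_files (host, backup_no, path, hash, pool_path, mtime, size)"
    " VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("web01", 42, "/etc/passwd", "d41d8cd9", "/cpool/d/4/1/d41d8cd9",
     1249000000, 1824),
)
# Looking up where a backed-up file's content lives is a single indexed read:
row = conn.execute(
    "SELECT pool_path FROM backup_files WHERE host = ? AND path = ?",
    ("web01", "/etc/passwd"),
).fetchone()
print(row[0])  # /cpool/d/4/1/d41d8cd9
```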


> I'm not sure being platform-limited matters that much.  Why wouldn't I be
> able
> to install OpenSolaris/zfs on anything where I'd be likely to run a
> database?
>
I guess we looked at this from different angles.  What if you are a Linux
admin and not a Solaris admin, or what if you think Sun is not a terribly
honest company and might switch teams?  There are plenty of
political/religious/whatever reasons to stay away from any one platform.  I
also think Linux is more stable and performant than Solaris, and that btrfs
may be as good a solution or better.



> >     - Allows for more granular security and access controls to backups
> >
> > How about much much easier PHP work getting backup info and a faster
> > backuppc interface only having to hit the database for all its info and
> > not having to touch the filesystem.
>
> I don't have a lot of use for backuppc info other than knowing that it
> completed.
>
I like to access the data quickly.  My servers have so much data that the
CGI cannot be used to pull data.



> > One of the biggest concerns with backuppc that is constantly discussed
> > on this list is syncing the backup data between two or more servers.
> > Simply reducing the file count by eliminating the hardlinks would allow
> > rsync to be used reliably and effectively.  SQL replication can keep
> > metadata updated constantly and a watchdog that monitors the SQL for
> > changes could keep the filesystems that store data synced easily as
> > well.
>
> Maybe, but again it won't do what you want by default.  Most sql
> replication
> schemes work more or less in real time which probably isn't what you want
> at all.
>
>
Because MySQL master->slave or master-master replication uses transaction
logs, you don't have to have the slave online.  When you start it up, it
will sync up from the transaction log.
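For reference, the catch-up behavior comes from MySQL's binary-log replication: the master records every change in its binary log, and a slave pulls from wherever it left off.  A minimal sketch of the relevant settings (exact option names and defaults vary by MySQL version):

```ini
# master my.cnf -- minimal sketch
[mysqld]
server-id = 1
log-bin   = mysql-bin    # record every change in the binary log

# slave my.cnf -- minimal sketch
[mysqld]
server-id = 2
# The slave resumes reading the master's binary log from its last
# recorded position, so it can be offline and sync up later.
```

Note this also means replication is not forced to be real-time: a slave can be brought up on a schedule and replay everything it missed.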


>  >  Once the metadata and config moves to a database, so many things
> > become very easy.
>
> No, they just become different. You now have to use database-specific tools
> to
> touch anything and if the file contents aren't included in the database
> you've
> made it impossible to do atomic operations.
>

I would say they become easier.  Simply speaking, you can do a select on the
database for files ending in *.xls or *.doc with a date more recent than last
Friday, show filepath and filename, show only the newest of each file, and
break it out by host/backup/date/etc.  You can then execute a script that
takes the filepath and filename, copies the file from the pool, names it
properly, and applies permissions or ACLs to it, and you can do this in
PHP in the browser or on the command line with a single piece of code.
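The query described above can be sketched in a few lines.  This uses SQLite in place of MySQL, and the `backup_files` table, its columns, and the sample rows are all hypothetical stand-ins for illustration; the date cutoff is an integer stand-in for "last Friday".

```python
import sqlite3

# Hypothetical metadata table and sample backup rows.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE backup_files (
    host TEXT, backup_no INTEGER, path TEXT, mtime INTEGER, pool_path TEXT)""")
conn.executemany("INSERT INTO backup_files VALUES (?,?,?,?,?)", [
    ("web01", 41, "/home/bob/budget.xls", 100, "/cpool/a/a1"),
    ("web01", 42, "/home/bob/budget.xls", 200, "/cpool/b/b2"),  # newer copy
    ("web01", 42, "/home/bob/notes.doc",  150, "/cpool/c/c3"),
    ("web01", 42, "/home/bob/photo.jpg",  300, "/cpool/d/d4"),  # wrong type
    ("web01", 40, "/home/bob/old.doc",     50, "/cpool/e/e5"),  # too old
])

last_friday = 90  # stand-in for a real date cutoff

# Newest copy of each *.xls / *.doc modified since the cutoff.
# (In SQLite, a bare column alongside MAX() comes from the max row.)
result = conn.execute("""
    SELECT path, MAX(mtime) AS newest, pool_path
      FROM backup_files
     WHERE (path LIKE '%.xls' OR path LIKE '%.doc') AND mtime > ?
     GROUP BY path
""", (last_friday,)).fetchall()

for path, newest, pool_path in sorted(result):
    print(path, newest)
# /home/bob/budget.xls 200
# /home/bob/notes.doc 150
```

From there a script (PHP or otherwise) only has to copy each `pool_path` out to the proper filename and apply permissions -- no filesystem walk is ever needed to find the files.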


> > A single backuppc server could handle many more
> > concurrent backups because multple data storage devices can seperate IO
> > and relieve the pressure on the IO system of the OS.
>
> That has yet to be proven.
>

I can't argue that directly, but we can make SOME assumptions that will hold
true.  Writing 2 files to 2 different disks will be faster than writing 2
files to one disk, all things being equal.  Separating the most IO-intensive
tasks onto devices that are better at IO will improve direct IO performance
(this is true now if you run backuppc on fast SSD drives; it's just too
expensive in $$).  There are a lot of unknowns.  How much performance would
be gained by pushing metadata off to a separate disk?  Would that
improvement show up in a real backup, or can it only be realized in
synthetic benchmarks?  I have zero doubt that some parts will be
significantly faster, but will concede that those parts may only make up 5%
of the total backup time, which could save a whopping 3%.  I still feel this
would be significantly faster, because eliminating the hardlink writes saves
head travel time, which binds up IO.
_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/
