2011/1/1 Kern Sibbald <k...@sibbald.com>

Hello Kern and others,


> The first thing that one must do is specify what problem of deduplication
> one is trying to resolve:
>
> 1. Deduplication by the Bacula Storage daemon
>
> 2. Deduplication in the Bacula Client (File daemon)
>
> 3. Deduplication by the underlying filesystem where the SD writes data
> (e.g. ZFS).
>
>
4. Global deduplication performed on the File Daemon, with the dictionary
maintained on the Bacula Director/Storage Daemon
- a particular data block is not backed up when the SD already has such a
block, no matter which client originally owned the block
- reduces data stored on the SD like approaches 1 and 3 AND reduces network
traffic like approach 2

Use case: A company has one production database (or VM image file) and
multiple test/development environments, all with backups. In most cases the
difference between all of those databases (VM images) is less than 1% of
data blocks. During backup only that 1% of data blocks is backed up and sent
over the network.
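
To make approach 4 more concrete, here is a minimal sketch of the idea in
Python. It is only an illustration under my own assumptions, not Bacula code:
the names StorageDaemon, has_block, store_block, backup and restore are
hypothetical, the fixed 4096-byte block size and SHA-256 digests are arbitrary
choices, and the "network" is just a function call. A real implementation
would also have to send the digests themselves and handle batching, hash
collisions and dictionary persistence.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size, chosen for illustration


class StorageDaemon:
    """Hypothetical stand-in for the SD side holding the global dictionary."""

    def __init__(self):
        # digest -> block data, shared by all clients
        self.blocks = {}

    def has_block(self, digest):
        return digest in self.blocks

    def store_block(self, digest, data):
        self.blocks[digest] = data


def backup(data, sd):
    """Hypothetical FD side: back up `data`, return (recipe, bytes_sent).

    The recipe is the ordered list of block digests needed for a restore.
    A block's payload is sent only if the SD does not already hold it,
    regardless of which client stored it first.
    """
    recipe = []
    bytes_sent = 0
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        if not sd.has_block(digest):
            sd.store_block(digest, block)
            bytes_sent += len(block)
        recipe.append(digest)
    return recipe, bytes_sent


def restore(recipe, sd):
    """Rebuild the original data from its recipe."""
    return b"".join(sd.blocks[digest] for digest in recipe)
```

With a 100-block "production" image backed up first, a "test" image that
differs in a single block sends only that one block's payload, which is the
1% case from the use case above.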


> (...)
> Item 1 is probably something that will never be needed due to the fact that
> there are more and more very good filesystems that already do the job
> especially if a new (additional) Volume format were to be implemented.
>
>
As Howard mentioned earlier, there are currently no serious dedup-enabled
filesystems at production stage (excluding Solaris/ZFS, which is not open
source any more). You can use a dedicated appliance like Data Domain's
products, but that is a different kind of solution.


> I've noticed that a few months after we discussed various features, the
> same thing was implemented by Zmanda, so I am a bit reluctant to give any
> details.
>

Wow, sounds like some kind of conspiracy :)


> However, if there are programmers that want do development, we would be
> happy to discuss off list.  Please keep in mind that we sometimes receive
> patches that programmers have made without discussing it with us, and often
> such patches are not appropriate for Bacula for lots of reasons: limited to
> a particular OS, doesn't respect coding standards, is not scalable, doesn't
> fit Bacula way of doing things, doesn't use Bacula "infrastructure" (mostly
> libbac.so), ...
>

Is it possible to publish those patches somewhere? They could be useful to
others.

regards

-- 
Radosław Korzeniewski
rados...@korzeniewski.net
_______________________________________________
Bacula-devel mailing list
Bacula-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-devel
