Re: Tarsnap GUI shows 0 data archived or backed up

2017-09-07 Thread Garance AE Drosehn

> On Sep 7, 2017, at 2:54 PM, Scott Robison wrote:
> 
> On Thu, Sep 7, 2017 at 11:55 AM, Garance AE Drosehn wrote:
>> 
>> De-duplication works great in some other situations because the other 
>> service is de-duplicating files across (say) 20 different Windows machines.  
>> So the first machine may use up 10-gig of space, but all of the rest of the 
>> machines might have less than a gig of additional unique files per machine.  
>> Tarsnap cannot do that kind of optimization across multiple machines.
> 
> Depending on just how paranoid you are, and how disciplined you are,
> you can have multiple machines deduplicate to a common store, as long
> as you only have one running at a time.
> 
> What you wrote does not preclude that from happening, I just wanted to
> point it out in case others might be interested in something like
> that.

Hmm. Yes, that's an interesting point.  I can see how that could be set up and 
work okay.  I hadn't thought about doing things that way, but I see how it 
could work as long as it was set up with care.

But I'm not going to do it for my own machines!  :)

-- 
Garance Alistair Drosehn    =   gadc...@earthlink.net
Senior Systems Programmer   or   g...@freebsd.org



Re: Tarsnap GUI shows 0 data archived or backed up

2017-09-07 Thread Scott Robison
On Thu, Sep 7, 2017 at 11:55 AM, Garance AE Drosehn
 wrote:
>
>> On Sep 7, 2017, at 4:21 AM, a...@sdf.org wrote:
>>
>>> So, can I make it smaller than 1.07 GB? Am I doing something wrong? I can
>>> explain what kind of data is part of those 1.45 GB (can give a breakdown
>>> too if needed).
>>>
>>> Can I compress it more, make de-duplication more aggressive? I have used
>>> the default config that came with the app.
>>
>> Or should I ask it in a separate thread? Or rather shoot a mail to Tarsnap
>> support? (But I see Graham is CCed in this mail thread).
>>
>> Thanks.
>
> There is one thing to note about de-duplication in tarsnap, compared to some 
> other services.  Tarsnap can only de-duplicate across the files *you* are 
> backing up, on a single specific machine you are backing up.  This is a 
> side-effect of all the privacy and encryption that tarsnap guarantees.
>
> De-duplication works great in some other situations because the other service 
> is de-duplicating files across (say) 20 different Windows machines.  So the 
> first machine may use up 10-gig of space, but all of the rest of the machines 
> might have less than a gig of additional unique files per machine.  Tarsnap 
> cannot do that kind of optimization across multiple machines.

Depending on just how paranoid you are, and how disciplined you are,
you can have multiple machines deduplicate to a common store, as long
as you only have one running at a time.

What you wrote does not preclude that from happening, I just wanted to
point it out in case others might be interested in something like
that.
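
As a rough illustration of why the key matters (a sketch only, assuming
block identifiers are derived from a per-machine secret key; the function
and key names below are made up for the example and are not Tarsnap's API):
two machines with different keys produce unrelated identifiers for the same
data, so nothing can be matched up between them, while machines sharing one
key produce the same identifier.

```python
import hashlib
import hmac


def block_id(key: bytes, block: bytes) -> str:
    """Hypothetical block identifier: a keyed hash (HMAC-SHA256) of the block."""
    return hmac.new(key, block, hashlib.sha256).hexdigest()


block = b"identical file contents on both machines"

# Assumption: each machine normally has its own secret key.
key_a = b"machine-A-secret"
key_b = b"machine-B-secret"

# Different keys: same data, different identifiers, so no cross-machine match.
separate_keys_match = block_id(key_a, block) == block_id(key_b, block)

# Shared key (the "common store" setup): identifiers agree, so the block
# only needs to be stored once.
shared_key_match = block_id(key_a, block) == block_id(key_a, block)

print(separate_keys_match, shared_key_match)
```

Sharing a key is what makes the common store possible, and it is also why
that setup needs the care mentioned above.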

-- 
Scott Robison


Re: Tarsnap GUI shows 0 data archived or backed up

2017-09-07 Thread Graham Percival
On Thu, Sep 07, 2017 at 08:21:42AM -, a...@sdf.org wrote:
> I believe too much feedback in my last mail might have been off-putting
> and judgemental (and kinda unsolicited advice) :-) Or not.

Hi Amar,

No, your previous mail wasn't too judgemental at all!  Sorry for the delay.

> > I am backing up 1.45 GB and I see the usage is 1.07 GB. Yes, there are
> > other archives but they are in KBs. Even after the first archive I believe
> > it was either 1.07 GB or something very close to it - it was > 1 GB.
> >
> > So, can I make it smaller than 1.07 GB? Am I doing something wrong? I can
> > explain what kind of data is part of those 1.45 GB (can give a breakdown
> > too if needed).
> >
> > Can I compress it more, make de-duplication more aggressive? I have used
> > the default config that came with the app.

The short answer is no, you cannot alter any parameters of Tarsnap's
deduplication and compression.

At first glance, going from 1.45 GB to 1.07 GB looks reasonable.  It very much
depends on what data you're backing up, of course!  If you'd like to give me
more details about what you're backing up (either publicly or privately), I
could try to make a better guess.

Tarsnap's deduplication is primarily useful when you archive the same
directories later on -- it will only upload data that's changed (plus a little
bit of metadata).  For example, one of the tarsnap.com servers generates an
archive every hour; the sum of all those archives is 96000 GB, but thanks to
deduplication that's reduced to 56 GB, which (for that particular data)
compresses down to 16 GB:
http://www.tarsnap.com/deduplication.html
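
The "only upload what changed" behaviour can be sketched roughly as follows.
This is a toy model, not Tarsnap's actual algorithm: the fixed block size,
the plain SHA-256 identifiers, and the in-memory dict standing in for the
server are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumption: fixed-size blocks, chosen for illustration


def archive(data: bytes, store: dict) -> int:
    """Add data to the store; return how many new bytes had to be stored."""
    new_bytes = 0
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        key = hashlib.sha256(block).hexdigest()
        if key not in store:  # only blocks not seen before cost anything
            store[key] = block
            new_bytes += len(block)
    return new_bytes


store = {}

# First archive: ten distinct blocks.
first = b"".join(bytes([i]) * BLOCK_SIZE for i in range(10))

# Second archive: the same data with a single block (index 3) changed.
second = first[:3 * BLOCK_SIZE] + bytes([99]) * BLOCK_SIZE + first[4 * BLOCK_SIZE:]

first_upload = archive(first, store)
second_upload = archive(second, store)
print(first_upload, second_upload)  # 40960 4096
```

The second archive is logically just as large as the first, but only the
one changed block adds to the stored total.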

Note that deduplication works best if your data is not already compressed:
http://www.tarsnap.com/tips.html#compression

I'm guessing that your data includes some photos?  Most image formats are
already compressed, so Tarsnap's deduplication and compression won't do much to
their file size.
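
A quick way to see the effect with zlib (a sketch only; random bytes are
used here as a stand-in for already-compressed data such as JPEGs, which is
an approximation):

```python
import os
import zlib

# Highly repetitive, uncompressed input.
text_like = b"the quick brown fox jumps over the lazy dog\n" * 1000

# Assumption: random bytes approximate data that is already compressed.
random_like = os.urandom(len(text_like))

text_ratio = len(zlib.compress(text_like)) / len(text_like)
random_ratio = len(zlib.compress(random_like)) / len(random_like)

print(f"repetitive text:   {text_ratio:.3f} of original size")
print(f"random-like bytes: {random_ratio:.3f} of original size")
```

The repetitive input shrinks to a small fraction of its size, while the
random-like input stays at roughly its original size.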


If you'd like a technical discussion about the details of deduplication, see
https://www.tarsnap.com/download/EuroBSDCon13.pdf
One detail that isn't covered there is that after deduplication, each block is
compressed with zlib.  There's technical info about zlib and the DEFLATE method 
here:
https://zlib.net/zlib_tech.html

Cheers,
- Graham


Re: Tarsnap GUI shows 0 data archived or backed up

2017-09-07 Thread Graham Percival
On Thu, Sep 07, 2017 at 03:55:07PM -0400, Garance AE Drosehn wrote:
> > Depending on just how paranoid you are, and how disciplined you are,
> > you can have multiple machines deduplicate to a common store, as long
> > as you only have one running at a time.
> 
> Hmm. Yes, that's an interesting point.  I can see how that could be set up and
> work okay.  I hadn't thought about doing things that way, but I see how it
> could work as long as it was set up with care.

We have a few notes about the process:
http://www.tarsnap.com/multiple-machines.html
but it's almost certainly not worth it for general-purpose desktop
computers.

Cheers,
- Graham Percival