Re: Tarsnap GUI shows 0 data archived or backed up

2017-09-21 Thread amar
Thank you for the detailed explanation, Graham.

> could be DICOM images (which include compression)

Yes, they are from a CT Scan.

> text form or "Training Center XML",

Yes.

> reduction of 1.45 GB to 1.07 GB for the initial archive, on the data you
> provided, looks quite reasonable".

This makes sense to me now, after you explained it so clearly. Thank you
again.

--
Amar



Re: Tarsnap GUI shows 0 data archived or backed up

2017-09-21 Thread Graham Percival
On Thu, Sep 21, 2017 at 06:58:43PM -0000, a...@sdf.org wrote:
> Sorry for the delay. Here's the file type distribution (some are unknown;
> I just shared as powershell printed):

Thanks!  This looks quite reasonable to me.  Just to keep the data together, in
another email you wrote:

> I am backing up 1.45 GB and I see the usage is 1.07 GB.

Disregarding the blank extensions in your list, and only looking at extensions
which are more than 50 MB, we have:

> Extension  Size (MB) Count
> ---------  --------- -----
> jpg           391.85  4653
> pdf           180.03   456
> tcx            108.9   146
> mp4             80.2     7
> dcm            63.26    56

jpg files are typically compressed.  This can be disabled in most image
programs, but I'm willing to bet that these are compressed.  Tarsnap's
deduplication will be useful when you make additional archives, but Tarsnap's
compression stage will not save you a lot of space on those jpgs.

Same goes for mp4 and pdf -- they're already compressed.
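
To make the point about already-compressed files concrete, here is a minimal Python sketch. It is only an illustration, not how Tarsnap itself is implemented: it uses the standard `zlib` module, a made-up repetitive XML-like string to stand in for a text file, and random bytes to stand in for the payload of a jpg or mp4 (an assumption; real media files are not random, but they tend to behave similarly under DEFLATE).

```python
import os
import zlib


def deflate_ratio(data: bytes) -> float:
    """Return compressed size / original size after one zlib (DEFLATE) pass."""
    return len(zlib.compress(data, 9)) / len(data)


# Repetitive text, standing in for an XML or plain-text file (made-up sample).
text_like = b"<Trackpoint><HeartRateBpm>142</HeartRateBpm></Trackpoint>\n" * 2000

# Random bytes, standing in for data that has already been compressed.
compressed_like = os.urandom(len(text_like))

text_ratio = deflate_ratio(text_like)
random_ratio = deflate_ratio(compressed_like)

print(f"text-like data:       {text_ratio:.3f} of original size")
print(f"compressed-like data: {random_ratio:.3f} of original size")
```

The text-like input shrinks to a small fraction of its size, while the random input stays at roughly its original size.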

I'm not certain about "tcx" and "dcm" files.  Googling suggests that the latter
could be DICOM images (which include compression) or "DCM audio module" (I'm
not certain about the compression status there).  "tcx" could be TurboCAD in
text form or "Training Center XML", both of which *will* benefit from Tarsnap's
compression.

Assuming that the above guesses are plausible, and only looking at the small
table of file extensions, there's 823 MB total data, of which we could expect
to see significant reduction in 171 MB of that (due to compression).  This
value depends a huge amount on the actual content, so people are quite
reluctant to say anything like "assuming typical user data, DEFLATE will give
you a reduction of xy%".

That said, let's assume that we reduce 75% of that 171 MB.  That saves us 128
MB.

You saw a reduction of 1.45 GB to 1.07 GB, or a saving of 380 MB.  I was
looking at roughly half of your data, so let's double the estimated saving and
get 250 MB.  So you're seeing more compression than we expected.
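
The same back-of-the-napkin arithmetic can be written out as a short Python sketch. The sizes come from the table of extensions over 50 MB. Treating "tcx" and "dcm" as the compressible part is an assumption on my side (it is the only combination that lands near the 171 MB figure), and 75% is the guessed reduction from the text. The exact sums differ by a MB or two from the rounded numbers quoted above.

```python
# Sizes in MB for the extensions over 50 MB, taken from the table.
sizes_mb = {"jpg": 391.85, "pdf": 180.03, "tcx": 108.9, "mp4": 80.2, "dcm": 63.26}

# Assumption: only these extensions benefit noticeably from compression.
compressible = ("tcx", "dcm")
# The guessed reduction for compressible data (a complete guess, as stated).
assumed_reduction = 0.75

sampled_total = sum(sizes_mb.values())
compressible_total = sum(sizes_mb[ext] for ext in compressible)
saving_on_sample = compressible_total * assumed_reduction

# The sample covers roughly half of the data, so double the estimate.
estimated_saving = 2 * saving_on_sample

# Observed saving, treating 1 GB as 1000 MB.
observed_saving = (1.45 - 1.07) * 1000

print(f"sampled total:      {sampled_total:.2f} MB")
print(f"compressible part:  {compressible_total:.2f} MB")
print(f"estimated saving:   {estimated_saving:.2f} MB")
print(f"observed saving:    {observed_saving:.0f} MB")
```

With these inputs the estimate comes out around 258 MB against an observed 380 MB, which matches the conclusion that the real reduction is somewhat better than the rough guess.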


There's a huge amount of quibble room here.  Deduplication will probably have
/some/ kind of benefit to your initial data.  Your jpg files were probably
compressed by a further 0.2% or 0.5% by Tarsnap.  The data in file extensions
that total less than 50 MB each might be easier to compress.  And the figure of
"75%" was a complete guess on my part.

Still, in terms of a "back-of-the-napkin analysis", the answer is "yes, your
reduction of 1.45 GB to 1.07 GB for the initial archive, on the data you
provided, looks quite reasonable".

Cheers,
- Graham


Re: Tarsnap GUI shows 0 data archived or backed up

2017-09-21 Thread amar
Hi Graham,

Sorry for the delay. Here's the file type distribution (some are unknown;
I just shared as powershell printed):


Extension  Size (MB) Count
---------  --------- -----
txt             7.88   630
              236.45   846
plist           0.28    13
doentry         0.01     5
jpg           391.85  4653
xls             2.38     6
tcx            108.9   146
html           22.92   805
vcf             4.44    28
csv             3.18    16
aclcddb         0.03     1
abcddb          1.78     3
abcdmr          0.04     1
tbz             0.11     1
jpeg            2.38   208
abcdp           0.86   711
abcdi              0     3
lockfile           0     6
abcdg           0.01     3
backup          0.13     1
ics             0.01     1
vcs                0     1
ldif            0.07     1
xml              5.2   214
smsbackup       0.03     1
png             27.4   641
pdf           180.03   456
doc             5.34    19
properties      0.01    28
gif              0.6    47
json           18.61 20479
expense...      0.04     1
htm            12.11    31
css             2.76   156
zip             7.98    31
gpx             0.74     3
enex            0.23     2
js             10.93   306
webp            0.01     5
asc             0.07    20
txt~               0     3
sig                0     1
kdb                0     2
kdbx            0.43     4
tar             8.05     1
mp3             0.09     3
unknown         0.01     8
mp4             80.2     7
dat                0     3
im              0.01     1
lm                 0     1
torrent         0.12    11
pptx            1.91     2
accdb           3.38     1
odt             4.82    15
php             0.01     5
odp             2.85     4
ppt             3.94     5
dll             6.46    39
jar             0.02     2
bat                0     3
exe              4.8    27
stackdump          0     1
class           0.22    25
C               0.01    12
java            0.23    23
gz              0.14     3
mdb             0.36    24
userprefs       0.01     3
usertasks          0     1
cs               0.6    95
csproj          0.05    20
pidb            0.25    17
xsd             0.01     2
ico             0.02    10
resources       0.04     5
docx            2.17    32
sln             0.03    19
xsc                0     1
xss                0     1
pfx                0     1
user               0     4
resx            0.11     7
settings           0     2
rtf             0.03     5
as              0.01     5
swf            15.57    34
air             1.43     4
mxml             0.1    29
out             0.07     6
cpp             0.07    53
obj             0.11    13
csm             0.23     1
lock               0     1
cfs            22.03     1
stamp              0     1
swc             0.17     3
md                 0     1
vim             0.01     3
sh              0.07     8
makefile           0     2
vmb                0     1
markdown        0.01     2
sdf             1.98     2
suo             0.24    17
rc                 0     2
vcxproj         0.01     2
filters            0     2
h                  0     2
lastbui...         0     2
pch             2.44     1
unsucce...         0     2
tlog            0.01    10
pdb             0.38     9
res                0     1
manifest           0     1
idb             0.02     1
mov             0.16     2
xhtml           0.02     1
aspx            0.23    33
vb              0.17    66
config          0.03     9
xslt            0.01    10
vbproj          0.02     4
Cache           0.01     4
myapp              0     2
vsmacros        0.71     1
asmx               0     1
master             0     1
xsl                0     2
refresh            0     3
bmp             1.95     5
disco              0     1
discomap           0     1
wsdl               0     1
cnf                0     6
btr             0.04     3
lck                0     1
py                 0     4
url                0     1
flv             3.04     1
M4A             0.03     1
deb             0.61     4
rpm             0.16     1
ods             0.03     2
7z              0.94     9
pc                 0     1
stetic          0.02     2
test-cache      0.02     8
confi              0     1
c~                 0     1
o               0.04     1
8                  0     1
go                 0     1
go~                0     1
pl              0.02     2
tcl                0     1
aux                0     1
tex             0.14     2
xps             0.94     7
inf                0     1
dcm            63.26    56
RAW            34.11     3
mobi            1.84     1
3gp             0.89     1
archive            0     2
tif             3.84     2
nameman...         0     1
timesin...         0     1
usherli...         0     1
dmg             9.67     1
bckup              0     1



Re: Tarsnap GUI shows 0 data archived or backed up

2017-09-07 Thread Graham Percival
On Thu, Sep 07, 2017 at 03:55:07PM -0400, Garance AE Drosehn wrote:
> > Depending on just how paranoid you are, and how disciplined you are,
> > you can have multiple machines deduplicate to a common store, as long
> > as you only have one running at a time.
> 
> Hmm. Yes, that's an interesting point.  I can see how that could be set up and
> work okay.  I hadn't thought about doing things that way, but I see how it
> could work as long as it was set up with care.

We have a few notes about the process:
http://www.tarsnap.com/multiple-machines.html
but it's almost certainly not worth it for general-purpose desktop
computers.

Cheers,
- Graham Percival


Re: Tarsnap GUI shows 0 data archived or backed up

2017-09-07 Thread Garance AE Drosehn

> On Sep 7, 2017, at 2:54 PM, Scott Robison wrote:
> 
> On Thu, Sep 7, 2017 at 11:55 AM, Garance AE Drosehn wrote:
>> 
>> De-duplication works great in some other situations because the other 
>> service is de-duplicating files across (say) 20 different Windows machines.  
>> So the first machine may use up 10-gig of space, but all of the rest of the 
>> machines might have less than a gig of additional unique files per machine.  
>> Tarsnap cannot do that kind of optimization across multiple machines.
> 
> Depending on just how paranoid you are, and how disciplined you are,
> you can have multiple machines deduplicate to a common store, as long
> as you only have one running at a time.
> 
> What you wrote does not preclude that from happening, I just wanted to
> point it out in case others might be interested in something like
> that.

Hmm. Yes, that's an interesting point.  I can see how that could be set up and
work okay.  I hadn't thought about doing things that way, but I see how it
could work as long as it was set up with care.

But I'm not going to do it for my own machines!  :)

-- 
Garance Alistair Drosehn=  gadc...@earthlink.net
Senior Systems Programmer   or   g...@freebsd.org



Re: Tarsnap GUI shows 0 data archived or backed up

2017-09-07 Thread Scott Robison
On Thu, Sep 7, 2017 at 11:55 AM, Garance AE Drosehn wrote:
>
>> On Sep 7, 2017, at 4:21 AM, a...@sdf.org wrote:
>>
>>> So, can I make it smaller than 1.07 GB? Am I doing something wrong? I can
>>> explain what kind of data is part of those 1.45 GB (can give a breakdown
>>> too if needed).
>>>
>>> Can I compress it more, make de-duplication more aggressive? I have used
>>> the default config that came with the app.
>>
>> Or should I ask it in a separate thread? Or rather shoot a mail to Tarsnap
>> support? (But I see Graham is CCed in this mail thread).
>>
>> Thanks.
>
> There is one thing to note about de-duplication in tarsnap, compared to some 
> other services.  Tarsnap can only de-duplicate across the files *you* are 
> backing up, on a single specific machine you are backing up.  This is a 
> side-effect of all the privacy and encryption that tarsnap guarantees.
>
> De-duplication works great in some other situations because the other service 
> is de-duplicating files across (say) 20 different Windows machines.  So the 
> first machine may use up 10-gig of space, but all of the rest of the machines 
> might have less than a gig of additional unique files per machine.  Tarsnap 
> cannot do that kind of optimization across multiple machines.

Depending on just how paranoid you are, and how disciplined you are,
you can have multiple machines deduplicate to a common store, as long
as you only have one running at a time.

What you wrote does not preclude that from happening, I just wanted to
point it out in case others might be interested in something like
that.

-- 
Scott Robison


Re: Tarsnap GUI shows 0 data archived or backed up

2017-09-07 Thread Graham Percival
On Thu, Sep 07, 2017 at 08:21:42AM -0000, a...@sdf.org wrote:
> I believe too much feedback in my last mail might have been off-putting
> and judgemental (and kinda unsolicited advice) :-) Or not.

Hi Amar,

No, your previous mail wasn't too judgemental at all!  Sorry for the delay.

> > I am backing up 1.45 GB and I see the usage is 1.07 GB. Yes, there are
> > other archives but they are in KBs. Even after the first archive I believe
> > it was either 1.07 GB or something very close to it - it was > 1 GB.
> >
> > So, can I make it smaller than 1.07 GB? Am I doing something wrong? I can
> > explain what kind of data is part of those 1.45 GB (can give a breakdown
> > too if needed).
> >
> > Can I compress it more, make de-duplication more aggressive? I have used
> > the default config that came with the app.

The short answer is no, you cannot alter any parameters of Tarsnap's
deduplication and compression.

At first glance, going from 1.45 GB to 1.07 GB looks reasonable.  It very much
depends on what data you're backing up, of course!  If you'd like to give me
more details about what you're backing up (either publicly or privately), I
could try to make a better guess.

Tarsnap's deduplication is primarily useful when you archive the same
directories later on -- it will only upload data that's changed (plus a little
bit of metadata).  For example, one of the tarsnap.com servers generates an
archive every hour; the sum of all those archives is 96000 GB, but thanks to
deduplication that's reduced to 56 GB, and after compression (of that
particular data) it is 16 GB:
http://www.tarsnap.com/deduplication.html
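
As a rough sketch of the idea (not Tarsnap's actual implementation), the Python below splits each "archive" into blocks, identifies each block by its SHA-256 hash, and stores only blocks it has not seen before. The fixed 4096-byte block size, the in-memory `store` dictionary, and the `archive` helper are all assumptions made for illustration; Tarsnap's real chunking, hashing, and encryption are more sophisticated and are described in the linked pages.

```python
import hashlib
import os

BLOCK_SIZE = 4096  # illustrative only; not Tarsnap's real chunk size


def archive(data: bytes, store: dict) -> int:
    """Add data to the block store; return how many new bytes were stored."""
    new_bytes = 0
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:  # only previously unseen blocks are stored
            store[digest] = block
            new_bytes += len(block)
    return new_bytes


store = {}

# First archive: 100 blocks of data.
day1 = os.urandom(100 * BLOCK_SIZE)
# Second archive: the same data with a single block changed.
day2 = day1[:10 * BLOCK_SIZE] + os.urandom(BLOCK_SIZE) + day1[11 * BLOCK_SIZE:]

first = archive(day1, store)
second = archive(day2, store)

total_size = len(day1) + len(day2)                   # sum of all archives
unique_size = sum(len(b) for b in store.values())    # what is actually stored

print(f"first archive stored  {first} new bytes")
print(f"second archive stored {second} new bytes")
print(f"total size {total_size}, unique size {unique_size}")
```

The gap between `total_size` and `unique_size` is the same kind of gap as between the summed size of all archives and the amount of data that is actually stored.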

Note that deduplication works best if your data is not already compressed:
http://www.tarsnap.com/tips.html#compression

I'm guessing that your data includes some photos?  Most image formats are
already compressed, so Tarsnap's deduplication and compression won't do much to
their file size.


If you'd like a technical discussion about the details of deduplication, see
https://www.tarsnap.com/download/EuroBSDCon13.pdf
One detail that isn't covered there is that after deduplication, each block is
compressed with zlib.  There's technical info about zlib and the DEFLATE method 
here:
https://zlib.net/zlib_tech.html

Cheers,
- Graham


Re: Tarsnap GUI shows 0 data archived or backed up

2017-08-31 Thread amar
Hi,

Thanks again.

> 2. You started a backup for said Job at 00-57-08 and then another which
> failed with:
> [18/08/17 1:04 AM] Backup Job_weekly tarsnap backup_2017-08-18_01-04-30
> failed: tarsnap: Transaction already in progress

I was not aware of this. I didn't get any notifications/prompt either. I
thought it could run more than one backup in parallel. My bad. Maybe it's
in the docs which I didn't go through. Also, I started using Tarsnap only
because of this GUI. I have known about it since 2010 (iirc) but I waited
till the GUI arrived. So I kind of thought I would see what happens loud
and clear.

(off-topic - I am not CLI-averse, but for some reason I prefer a GUI for
backups etc.)

> Which is correct since you can't run multiple create/delete operations
> with tarsnap, these two operations are mutually exclusive.
>
> 3. You exited the app and ran it again, while the first backup ran in the
> background still. You tried a manual backup again which failed for the
> same reason.

Got it now.

> 4. You started a new session when you woke up, I see 8:40 AM. You queued a
> dropbox on-demand backup which started running immediately. I see no
> backup finished for the initial backup at 2017-08-18_00-57-08 which I
> assume you killed the process either by system shutdown or otherwise.

I had closed the lid of my Mac. That must be the reason. I didn't know
that the previous one was killed. I couldn't see it in the GUI. Actually I
couldn't see much (and I believe a lot of this may have to do with the
fact that I know little about this app - I downloaded it and started using it
with similar expectations as for some popular backup apps, which I shouldn't
have since this one is quite new)

> 5. Then you followed with more actions which failed with:
> [18/08/17 1:35 PM] Backup Job_weekly tarsnap backup_2017-08-18_10-03-42
> failed: tarsnap: Error looking up v1-0-0-server.tarsnap.com: nodename nor
> servname provided, or not known
> These are network issues, check your firewall if you have one or could
> have been just a temporary network problem on your side.

No. Not something I can remember. Maybe a few seconds or at most a few
minutes but definitely not more than that.

Firewall is off.

> I see you had better success with creating archives and jobs later on
> between 21-22. Can you confirm that you have those backups showing up in
> the GUI?

Yes, I have a job and it has been backing up daily. I think I will leave
it at that.

Yes, I still have that Job and a couple of archives related to it.

Naming the columns in the GUI would help.

Also, and this is just feedback, instead of showing the size of every
archive as the full 1.55 GB (in my case) as the prominent/visible size, it
would be better to show the actual size/diff of that archive and show the full
size elsewhere in info or in another column (if there are to be more columns).

Also when I look at Settings > Account I see "total backup size" as 18.4 GB ~
1.55 x 12 (12 is my number of archives). While in a way this is correct, it
doesn't give a clear picture. That is definitely not the total backup or
data size either on my machine or on Tarsnap's servers. Yes, if each archive
had been stored individually it would have been that, but I think it's not.

I have a doubt that I might be missing something here. Maybe Tarsnap
creates separate snapshots of all the data at points in time, and
non-unique (previously backed up) data/chunks/pieces are kind of
referenced from where they are. Or something like that. Even in that case,
showing the total data being backed up as actual data on client x no. of
archives kinda seems weird.

The log could either be ON by default or shown as an option in at least the
GUI setup steps. I believe this is very crucial in seeing what is
happening especially since the GUI is kinda new (not really new new though :P)

In fact I would like to see how much of my data is currently selected for
being backed up and I am not able to see that. It is 1.45 GB.

I have a question though, and instead of asking it in a separate thread
I thought I would ask it here itself:

I am backing up 1.45 GB and I see the usage is 1.07 GB. Yes, there are
other archives but they are in KBs. Even after the first archive I believe
it was either 1.07 GB or something very close to it - it was > 1 GB.

So, can I make it smaller than 1.07 GB? Am I doing something wrong? I can
explain what kind of data is part of those 1.45 GB (can give a breakdown too
if needed).

Can I compress it more, make de-duplication more aggressive? I have used
the default config that came with the app.

Regards.

--
Amar



Re: Tarsnap GUI shows 0 data archived or backed up

2017-08-18 Thread Graham Percival
On Fri, Aug 18, 2017 at 06:55:54AM -0000, a...@sdf.org wrote:
> Also, I see 0 archives. I am wondering why no archives were created.
> Because when I created a backup it goes to "Backup" tab and when I create
> a job it goes to "Jobs" tab. Nothing ever shows up in "Archives". Neither
> did anything show up in the task's Archive tab in the morning.
> 
> I had followed a combination of
> http://shinnok.com/rants/2016/02/19/using-tarsnap-gui-on-os-x/ and Getting
> Started on tarsnap.com.

I'm sorry that this isn't working well.  Let's find out where your data is.  I
suggest using the command-line interface for this, since it's easier to give
commands via email.

The first question is "how many key files do you have?":
- in "Getting Started", did you create a key by running tarsnap-keygen ?
  (that's step 5. Register the machine(s) on which you will be using Tarsnap)
- in the GUI setup wizard, did you use the "Create new keyfile" or "Use
  existing keyfile" option?  (note that unfortunately the docs on shinnok.com
  are 1.5 years old, so the screens don't match the current GUI)

If you have a single keyfile, we can see if there are any archives with:
  $ tarsnap --list-archives
(if you put it in /root/tarsnap.key), or else:
  $ tarsnap --list-archives --keyfile ~/location/of/tarsnap.key \
      --cachedir /tmp/temp-tarsnap-cachedir

If you have two (or more) keyfiles, you can run the above command multiple
times, once for every keyfile.
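
If there are several keyfiles to check, the repetition can be scripted. The following is a minimal Python sketch, assuming `tarsnap` is on the PATH and using only the `--list-archives`, `--keyfile` and `--cachedir` options shown above. The helper names and the keyfile paths in the usage comment are hypothetical, and giving each run its own throwaway cache directory is my assumption, mirroring the temporary `--cachedir` in the command above.

```python
import os
import subprocess
import tempfile


def list_archives_command(keyfile: str, cachedir: str) -> list:
    """Build the argument list for listing the archives of one keyfile."""
    return ["tarsnap", "--list-archives",
            "--keyfile", os.path.expanduser(keyfile),
            "--cachedir", cachedir]


def list_archives(keyfiles):
    """Run tarsnap once per keyfile; return {keyfile: [archive names]}."""
    results = {}
    for keyfile in keyfiles:
        # Assumption: a separate temporary cache directory for each keyfile.
        with tempfile.TemporaryDirectory(prefix="tarsnap-cache-") as cachedir:
            proc = subprocess.run(list_archives_command(keyfile, cachedir),
                                  capture_output=True, text=True, check=True)
            results[keyfile] = proc.stdout.splitlines()
    return results


# Example call (hypothetical keyfile locations, not run here):
#   for keyfile, names in list_archives(["~/tarsnap.key", "~/gui.key"]).items():
#       print(keyfile, len(names), "archives")
```

A keyfile whose list comes back empty has no archives associated with it, which helps narrow down which key the GUI should be pointed at.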


Once we know where your data is, we can work on getting the GUI to recognize
it.  (I suspect that you might have multiple keyfiles, which would explain the
GUI's confusion.)

Cheers,
- Graham Percival