Re: Tarsnap GUI shows 0 data archived or backed up
Thank you for the detailed explanation, Graham.

> could be DICOM images (which include compression)

Yes, they are from a CT scan.

> text form or "Training Center XML",

Yes.

> reduction of 1.45 GB to 1.07 GB for the initial archive, on the data you
> provided, looks quite reasonable".

This makes sense to me now, after you explained it so clearly. Thank you again.

--
Amar
Re: Tarsnap GUI shows 0 data archived or backed up
On Thu, Sep 21, 2017 at 06:58:43PM -, a...@sdf.org wrote:
> Sorry for the delay. Here's the file type distribution (some are unknown;
> I just shared as powershell printed):

Thanks! This looks quite reasonable to me. Just to keep the data together, in another email you wrote:

> I am backing up 1.45 GB and I see the usage are 1.07 GB.

Disregarding the blank extensions in your list, and only looking at extensions which are more than 50 MB, we have:

> Extension   Size (MB)  Count
> ---------   ---------  -----
> jpg         391.85     4653
> pdf         180.03     456
> tcx         108.9      146
> mp4         80.2       7
> dcm         63.26      56

jpg files are typically compressed. This can be disabled in most image programs, but I'm willing to bet that these are compressed. Tarsnap's deduplication will be useful when you make additional archives, but Tarsnap's compression stage will not save you a lot of space on those jpgs. Same goes for mp4 and pdf -- they're already compressed.

I'm not certain about "tcx" and "dcm" files. Googling suggests that the latter could be DICOM images (which include compression) or "DCM audio module" (I'm not certain about the compression status there). "tcx" could be TurboCAD in text form or "Training Center XML", both of which *will* benefit from Tarsnap's compression.

Assuming that the above guesses are plausible, and only looking at the small table of file extensions, there's 823 MB total data, of which we could expect to see significant reduction in 171 MB of that (due to compression). This value depends a huge amount on the actual content, so people are quite reluctant to say anything like "assuming typical user data, DEFLATE will give you a reduction of xy%". That said, let's assume that we reduce 75% of that 171 MB. That saves us 128 MB.

You saw a reduction of 1.45 GB to 1.07 GB, or a saving of 380 MB. I was looking at roughly half of your data, so let's double the estimated saving and get 250 MB. So you're seeing more compression than we expected.

There's a huge amount of quibble room here. Deduplication will probably have /some/ kind of benefit to your initial data. Your jpg files were probably compressed by a further 0.2% or 0.5% by Tarsnap. Your data in file extensions less than 50 MB might be easier to compress. And the figure of "75%" was a complete guess on my part.

Still, in terms of a "back-of-the-napkin analysis", the answer is "yes, your reduction of 1.45 GB to 1.07 GB for the initial archive, on the data you provided, looks quite reasonable".

Cheers,
- Graham
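The back-of-the-napkin arithmetic above can be reproduced with a few lines of shell. This is only a sketch: the function name `estimate_saving_mb` is made up for illustration, the inputs (171 MB of compressible data, a guessed 75% reduction, a doubling to cover the other half of the data) are the figures from the message, results are truncated to whole megabytes, and 1 GB is taken as 1000 MB.

```shell
# estimate_saving_mb COMPRESSIBLE_MB RATIO SCALE
# Hypothetical helper: prints COMPRESSIBLE_MB * RATIO * SCALE, truncated to whole MB.
estimate_saving_mb() {
    awk -v mb="$1" -v ratio="$2" -v scale="$3" \
        'BEGIN { printf "%d\n", int(mb * ratio * scale) }'
}

# Saving expected from the ~171 MB of tcx and dcm files alone (guessed 75% reduction).
estimate_saving_mb 171 0.75 1

# The table covered roughly half of the data, so double the estimate.
estimate_saving_mb 171 0.75 2

# Observed saving: 1.45 GB - 1.07 GB, rounded to whole MB.
awk 'BEGIN { printf "%d\n", (1.45 - 1.07) * 1000 + 0.5 }'
```

With these inputs the three commands print 128, 256 and 380, which line up with the rough 128 MB, 250 MB and 380 MB quoted in the message.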
Re: Tarsnap GUI shows 0 data archived or backed up
Hi Graham,

Sorry for the delay. Here's the file type distribution (some are unknown; I just shared as powershell printed):

Extension   Size (MB)  Count
---------   ---------  -----
txt         7.88       630
            236.45     846
plist       0.28       13
doentry     0.01       5
jpg         391.85     4653
xls         2.38       6
tcx         108.9      146
html        22.92      805
vcf         4.44       28
csv         3.18       16
aclcddb     0.03       1
abcddb      1.78       3
abcdmr      0.04       1
tbz         0.11       1
jpeg        2.38       208
abcdp       0.86       711
abcdi       0          3
lockfile    0          6
abcdg       0.01       3
backup      0.13       1
ics         0.01       1
vcs         0          1
ldif        0.07       1
xml         5.2        214
smsbackup   0.03       1
png         27.4       641
pdf         180.03     456
doc         5.34       19
properties  0.01       28
gif         0.6        47
json        18.61      20479
expense...  0.04       1
htm         12.11      31
css         2.76       156
zip         7.98       31
gpx         0.74       3
enex        0.23       2
js          10.93      306
webp        0.01       5
asc         0.07       20
txt~        0          3
sig         0          1
kdb         0          2
kdbx        0.43       4
tar         8.05       1
mp3         0.09       3
unknown     0.01       8
mp4         80.2       7
dat         0          3
im          0.01       1
lm          0          1
torrent     0.12       11
pptx        1.91       2
accdb       3.38       1
odt         4.82       15
php         0.01       5
odp         2.85       4
ppt         3.94       5
dll         6.46       39
jar         0.02       2
bat         0          3
exe         4.8        27
stackdump   0          1
class       0.22       25
C           0.01       12
java        0.23       23
gz          0.14       3
mdb         0.36       24
userprefs   0.01       3
usertasks   0          1
cs          0.6        95
csproj      0.05       20
pidb        0.25       17
xsd         0.01       2
ico         0.02       10
resources   0.04       5
docx        2.17       32
sln         0.03       19
xsc         0          1
xss         0          1
pfx         0          1
user        0          4
resx        0.11       7
settings    0          2
rtf         0.03       5
as          0.01       5
swf         15.57      34
air         1.43       4
mxml        0.1        29
out         0.07       6
cpp         0.07       53
obj         0.11       13
csm         0.23       1
lock        0          1
cfs         22.03      1
stamp       0          1
swc         0.17       3
md          0          1
vim         0.01       3
sh          0.07       8
makefile    0          2
vmb         0          1
markdown    0.01       2
sdf         1.98       2
suo         0.24       17
rc          0          2
vcxproj     0.01       2
filters     0          2
h           0          2
lastbui...  0          2
pch         2.44       1
unsucce...  0          2
tlog        0.01       10
pdb         0.38       9
res         0          1
manifest    0          1
idb         0.02       1
mov         0.16       2
xhtml       0.02       1
aspx        0.23       33
vb          0.17       66
config      0.03       9
xslt        0.01       10
vbproj      0.02       4
Cache       0.01       4
myapp       0          2
vsmacros    0.71       1
asmx        0          1
master      0          1
xsl         0          2
refresh     0          3
bmp         1.95       5
disco       0          1
discomap    0          1
wsdl        0          1
cnf         0          6
btr         0.04       3
lck         0          1
py          0          4
url         0          1
flv         3.04       1
M4A         0.03       1
deb         0.61       4
rpm         0.16       1
ods         0.03       2
7z          0.94       9
pc          0          1
stetic      0.02       2
test-cache  0.02       8
confi       0          1
c~          0          1
o           0.04       1
8           0          1
go          0          1
go~         0          1
pl          0.02       2
tcl         0          1
aux         0          1
tex         0.14       2
xps         0.94       7
inf         0          1
dcm         63.26      56
RAW         34.11      3
mobi        1.84       1
3gp         0.89       1
archive     0          2
tif         3.84       2
nameman...  0          1
timesin...  0          1
usherli...  0          1
dmg         9.67       1
bckup       0          1
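A listing like the one above can also be produced with a short POSIX shell function. The original was generated with PowerShell, and the exact command is not shown in the thread, so this is a rough equivalent rather than a reconstruction. The name `ext_distribution` is hypothetical, "MB" is taken to mean 1048576 bytes, and file names are assumed not to contain newlines or tabs.

```shell
# ext_distribution DIR
# Hypothetical helper: prints one tab-separated line per file extension found
# under DIR: extension, total size in MB (two decimals), number of files.
ext_distribution() {
    find "$1" -type f | while IFS= read -r path; do
        name=${path##*/}                      # strip the directory part
        case $name in
            ?*.?*) ext=${name##*.} ;;         # text after the last dot
            *)     ext="(none)" ;;            # no extension (or a dotfile)
        esac
        [ -n "$ext" ] || ext="(none)"         # names ending in a dot
        size=$(wc -c < "$path")               # size in bytes
        printf '%s\t%s\n' "$ext" "$size"
    done | awk -F '\t' '
        { bytes[$1] += $2; count[$1]++ }
        END {
            for (e in bytes)
                printf "%s\t%.2f\t%d\n", e, bytes[e] / 1048576, count[e]
        }
    ' | sort
}

# Example: ext_distribution "$HOME/Documents"
```

Sorting the output by the second column (for example with `sort -t "$(printf '\t')" -k2,2nr`) puts the largest extensions first, which is the part that matters for a compression estimate.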
Re: Tarsnap GUI shows 0 data archived or backed up
On Thu, Sep 07, 2017 at 03:55:07PM -0400, Garance AE Drosehn wrote:
> > Depending on just how paranoid you are, and how disciplined you are,
> > you can have multiple machines deduplicate to a common store, as long
> > as you only have one running at a time.
>
> Hmm. Yes, that's an interesting point. I can see how that could be setup and
> work okay. I hadn't thought about doing things that way, but I see how it
> could work as long as it was setup with care.

We have a few notes about the process:
http://www.tarsnap.com/multiple-machines.html
but it's almost certainly not worth it for general-purpose desktop computers.

Cheers,
- Graham Percival
Re: Tarsnap GUI shows 0 data archived or backed up
> On Sep 7, 2017, at 2:54 PM, Scott Robison wrote:
>
> On Thu, Sep 7, 2017 at 11:55 AM, Garance AE Drosehn wrote:
>>
>> De-duplication works great in some other situations because the other
>> service is de-duplicating files across (say) 20 different Windows machines.
>> So the first machine may use up 10-gig of space, but all of the rest of the
>> machines might have less than a gig of additional unique files per machine.
>> Tarsnap cannot do that kind of optimization across multiple machines.
>
> Depending on just how paranoid you are, and how disciplined you are,
> you can have multiple machines deduplicate to a common store, as long
> as you only have one running at a time.
>
> What you wrote does not preclude that from happening, I just wanted to
> point it out in case others might be interested in something like
> that.

Hmm. Yes, that's an interesting point. I can see how that could be setup and work okay. I hadn't thought about doing things that way, but I see how it could work as long as it was setup with care.

But I'm not going to do it for my own machines! :)

--
Garance Alistair Drosehn  =  gadc...@earthlink.net
Senior Systems Programmer or  g...@freebsd.org
Re: Tarsnap GUI shows 0 data archived or backed up
On Thu, Sep 7, 2017 at 11:55 AM, Garance AE Drosehn wrote:
>
>> On Sep 7, 2017, at 4:21 AM, a...@sdf.org wrote:
>>
>>> So, can I make it smaller than 1.07 GB? Am I doing something wrong? I can
>>> explain what kind data are part of those 1.45 GB (can give a breakup too
>>> if needed).
>>>
>>> Can I compress it more, make de-duplication more aggressive? I have used
>>> the default config that came with the app.
>>
>> Or should I ask it in a separate thread? Or rather shoot a mail to Tarsnap
>> support? (But I see Graham is CCed in this mail thread).
>>
>> Thanks.
>
> There is one thing to note about de-duplication in tarsnap, compared to some
> other services. Tarsnap can only de-duplicate across the files *you* are
> backing up, on a single specific machine you are backing up. This is a
> side-effect of all the privacy and encryption that tarsnap guarantees.
>
> De-duplication works great in some other situations because the other service
> is de-duplicating files across (say) 20 different Windows machines. So the
> first machine may use up 10-gig of space, but all of the rest of the machines
> might have less than a gig of additional unique files per machine. Tarsnap
> cannot do that kind of optimization across multiple machines.

Depending on just how paranoid you are, and how disciplined you are, you can have multiple machines deduplicate to a common store, as long as you only have one running at a time.

What you wrote does not preclude that from happening, I just wanted to point it out in case others might be interested in something like that.

--
Scott Robison
Re: Tarsnap GUI shows 0 data archived or backed up
On Thu, Sep 07, 2017 at 08:21:42AM -, a...@sdf.org wrote:
> I believe too much feedback in my last mail might have been off-putting
> and judgemental (and kinda unsolicited advice) :-) Or not.

Hi Amar,

No, your previous mail wasn't too judgemental at all! Sorry for the delay.

> > I am backing up 1.45 GB and I see the usage are 1.07 GB. Yes, there are
> > other archives but they are in KBs. Even after the first archive I believe
> > it was either 1.07 GB or something very close to it - it was > 1 GB.
> >
> > So, can I make it smaller than 1.07 GB? Am I doing something wrong? I can
> > explain what kind data are part of those 1.45 GB (can give a breakup too
> > if needed).
> >
> > Can I compress it more, make de-duplication more aggressive? I have used
> > the default config that came with the app.

The short answer is no, you cannot alter any parameters of Tarsnap's deduplication and compression.

At first glance, going from 1.45 GB to 1.07 GB looks reasonable. It very much depends on what data you're backing up, of course! If you'd like to give me more details about what you're backing up (either publicly or privately), I could try to make a better guess.

Tarsnap's deduplication is primarily useful when you archive the same directories later on -- it will only upload data that's changed (plus a little bit of metadata). For example, one of the tarsnap.com servers generates an archive every hour; the sum of all those archives is 96000 GB, but thanks to deduplication that's reduced to 56 GB, and after compression (that particular data) is 16 GB:
http://www.tarsnap.com/deduplication.html

Note that deduplication works best if your data is not already compressed:
http://www.tarsnap.com/tips.html#compression

I'm guessing that your data includes some photos? Most image formats are already compressed, so Tarsnap's deduplication and compression won't do much to their file size.

If you'd like a technical discussion about the details of deduplication, see
https://www.tarsnap.com/download/EuroBSDCon13.pdf

One detail that isn't covered there is that after deduplication, each block is compressed with zlib. There's technical info about zlib and the DEFLATE method here:
https://zlib.net/zlib_tech.html

Cheers,
- Graham
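One rough way to check whether a given file is already compressed is to run it through gzip, which uses the same DEFLATE method as zlib, and compare sizes. This is a sketch under stated assumptions: `deflate_percent` is a hypothetical helper name, and since Tarsnap compresses individual blocks after deduplication rather than whole files, the figure is only an approximation of what Tarsnap would achieve.

```shell
# deflate_percent FILE
# Hypothetical helper: prints the gzip-compressed size of FILE as a
# whole-number percentage of its original size (truncated), or "n/a"
# for an empty file.
deflate_percent() {
    orig=$(wc -c < "$1" | tr -d ' ')              # original size in bytes
    comp=$(gzip -c < "$1" | wc -c | tr -d ' ')    # size after DEFLATE (gzip)
    if [ "$orig" -eq 0 ]; then
        echo "n/a"
    else
        awk -v o="$orig" -v c="$comp" 'BEGIN { printf "%d\n", 100 * c / o }'
    fi
}

# Example: deflate_percent ~/Pictures/holiday.jpg
```

A result close to 100 suggests the file is already compressed, which is what one would expect for most jpg and mp4 files; a much smaller number suggests that the compression stage will help.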
Re: Tarsnap GUI shows 0 data archived or backed up
Hi,

Thanks again.

> 2. You started a backup for said Job at 00-57-08 and then another which
> failed with:
> [18/08/17 1:04 AM] Backup Job_weekly tarsnap backup_2017-08-18_01-04-30
> failed: tarsnap: Transaction already in progress

I was not aware of this. I didn't get any notification or prompt either. I thought it could run more than one backup in parallel. My bad. Maybe it's in the docs, which I didn't go through.

Also, I started using Tarsnap only because of this GUI. I have known about it since 2010 (iirc) but I waited till the GUI arrived. So I kind of thought I would see what happens loud and clear. (Off-topic: I am not CLI averse, but for some reason I prefer a GUI for backups etc.)

> Which is correct since you can't run multiple create/delete operations
> with tarsnap, these two operations are mutually exclusive.
>
> 3. You exited the app and ran it again, while the first backup ran in the
> background still. You tried a manual backup again which failed for the
> same reason.

Got it now.

> 4. You started a new session when you woke up, I see 8:40 AM. You queued a
> dropbox on-demand backup which started running immediately. I see no
> backup finished for the initial backup at 2017-08-18_00-57-08 which I
> assume you killed the process either by system shutdown or otherwise.

I had closed the lid of my Mac. That must be the reason. I didn't know that the previous one was killed. I couldn't see it in the GUI. Actually I couldn't see much (and I believe a lot of this has to do with the fact that I know little about this app: I downloaded it and started using it with the same expectations I have of some popular backup apps, and since this one is quite new I shouldn't have).

> 5. Then you followed with more actions which failed with:
> [18/08/17 1:35 PM] Backup Job_weekly tarsnap backup_2017-08-18_10-03-42
> failed: tarsnap: Error looking up v1-0-0-server.tarsnap.com: nodename nor
> servname provided, or not known
> These are network issues, check your firewall if you have one or could
> have been just a temporary network problem on your side.

No. Not something I can remember. Maybe a few seconds or at most a few minutes, but definitely not more than that. Firewall is off.

> I see you had better success with creating archives and jobs later on
> between 21-22. Can you confirm that you have those backups showing up in
> the GUI?

Yes, I have a job and it has been backing up daily. I think I will leave it at that. Yes, I still have that Job and a couple of archives related to it.

Naming the columns in the GUI would help. Also, and this is just feedback: instead of showing the size of every archive as the full 1.55 GB (in my case) as the prominent/visible size, it would be better to show the actual size/diff of that archive and show the full size elsewhere, in the info or in another column (if there are to be more columns).

Also, when I look at Settings > Account I see "total backup size" as 18.4 GB ~ 1.55 x 12 (12 is my number of archives). While in a way this is correct, it doesn't give a clear picture. That is definitely not the total backup or data size either on my machine or on the Tarsnap servers. Yes, if each archive were stored individually it would have been that, but I think it's not. I suspect I might be missing something here. Maybe Tarsnap creates separate snapshots of all the data at points in time, and non-unique (previously backed up) data/chunks/pieces are referenced from where they already are. Or something like that. Even in that case, showing the total data being backed up as actual data on the client x number of archives seems kinda weird.

Logging could either be ON by default or at least be shown as an option in the GUI setup steps. I believe this is crucial for seeing what is happening, especially since the GUI is kinda new (not really new new though :P)

In fact I would like to see how much of my data is currently selected for backup, and I am not able to see that. It is 1.45 GB.

I have a question, though, and instead of asking it in a separate thread I thought I would ask it here:

I am backing up 1.45 GB and I see the usage are 1.07 GB. Yes, there are other archives but they are in KBs. Even after the first archive I believe it was either 1.07 GB or something very close to it - it was > 1 GB.

So, can I make it smaller than 1.07 GB? Am I doing something wrong? I can explain what kind data are part of those 1.45 GB (can give a breakup too if needed).

Can I compress it more, make de-duplication more aggressive? I have used the default config that came with the app.

Regards.

--
Amar
Re: Tarsnap GUI shows 0 data archived or backed up
On Fri, Aug 18, 2017 at 06:55:54AM -, a...@sdf.org wrote:
> Also, I see 0 archives. I am wondering why no archives were created.
> Because when I created a backup it goes to "Backup" tab and when I create
> a job it goes to "Jobs" tab. Nothing ever shows up in "Archives". Neither
> did anything show up in the task's Archive tab in the morning.
>
> I had followed a combination of
> http://shinnok.com/rants/2016/02/19/using-tarsnap-gui-on-os-x/ and Getting
> Started on tarsnap.com.

I'm sorry that this isn't working well. Let's find out where your data is. I suggest using the command-line interface for this, since it's easier to give commands via email.

The first question is "how many key files do you have?":

- in "Getting Started", did you create a key by running tarsnap-keygen?
  (that's step 5. Register the machine(s) on which you will be using Tarsnap)

- in the GUI setup wizard, did you use the "Create new keyfile" or
  "Use existing keyfile" option?
  (note that unfortunately the docs on shinnok.com are 1.5 years old, so the
  screens don't match the current GUI)

If you have a single keyfile, we can see if there are any archives with:

    $ tarsnap --list-archives

(if you put it in /root/tarsnap.key), or else

    $ tarsnap --list-archives --keyfile ~/location/of/tarsnap.key --cachedir /tmp/temp-tarsnap-cachedir

If you have two (or more) keyfiles, you can run the above command multiple times, once for every keyfile.

Once we know where your data is, we can work on getting the GUI to recognize it. (I suspect that you might have multiple keyfiles, which would explain the GUI's confusion.)

Cheers,
- Graham Percival
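Running the listing once per keyfile can be wrapped in a small shell function. This is a minimal sketch, not an official Tarsnap tool: the name `list_archives_per_key` and the `TARSNAP` override variable are hypothetical, and it assumes, following the command in the message above, that a throwaway cache directory is acceptable for `--list-archives`. Only the `--list-archives`, `--keyfile` and `--cachedir` options shown in the message are used.

```shell
# list_archives_per_key KEYFILE...
# Hypothetical helper: runs `tarsnap --list-archives` once for every keyfile
# given on the command line, printing a header before each listing.
# TARSNAP can be set to point at a different tarsnap binary (an assumption
# made for this sketch; it is not a variable tarsnap itself reads).
list_archives_per_key() {
    for key in "$@"; do
        printf '== %s ==\n' "$key"
        # Separate temporary cache directory for each keyfile.
        cache=$(mktemp -d) || return 1
        "${TARSNAP:-tarsnap}" --list-archives --keyfile "$key" --cachedir "$cache"
        rm -rf "$cache"
    done
}

# Example: list_archives_per_key /root/tarsnap.key ~/location/of/tarsnap.key
```

If one keyfile lists the expected archives and another lists none, that would support the suspicion that the GUI and the command line were set up with different keys.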