Re: [fossil-users] Backups of deconstructed fossil repositories
On Sun, Jun 17, 2018, at 20:05, Warren Young wrote:
> However, I’ll also give a counterargument to the whole idea: you
> probably aren’t saving anything in the end. An intelligent deconstruct
> + backup probably saves no net I/O over just re-copying the Fossil repo
> DB to the destination unless the destination is *much* slower than the
> machine being backed up.
>
> (rsync was created for the common case where networks are much slower
> than the computers they connect. rsync within a single computer is
> generally no faster than cp -r, and sometimes slower, unless you take
> the mtime optimization mentioned above.)
>
> The VM/ZFS + snapshots case has a similar argument against it: if you’re
> using snapshots to back up a Fossil repo, deconstruction isn’t helpful.
> The snapshot/CoW mechanism will only clone the changed disk blocks in
> the repo.
>
> So, what problem are you solving? If it isn’t the slow-networks
> problem, I suspect you’ve got an instance of the premature optimization
> problem here. If you go ahead and implement it, measure before
> committing the change, and if you measure a meaningful difference,
> document the conditions to help guide expectations.

I want my approximately daily backups to be small. I currently version
the fossil SQLite files in borg, and I am considering versioning the
artefact dumps instead. I figure these will change less than the SQLite
files do and that they will also be smaller because they lack caches.
But the backups are already very small. I suppose I could test this.

___
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
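One way to run that test, sketched here on the assumption that both
fossil and borg are installed; the repository path, destination
directory, and archive names are illustrative, not from the thread:

```shell
# Deconstruct the repository into a directory of artefact files,
# then back up both forms into the same borg repository so their
# deduplicated sizes can be compared.
fossil deconstruct repo.fossil artefacts/

borg create /path/to/borg-repo::sqlite-test repo.fossil
borg create /path/to/borg-repo::artefacts-test artefacts/

# "borg info" reports the compressed, deduplicated size of an archive,
# which is the number that matters for daily incremental backups.
borg info /path/to/borg-repo::sqlite-test
borg info /path/to/borg-repo::artefacts-test
```

Repeating the pair of `borg create` runs after a few commits would show
which form produces the smaller daily increment.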
Re: [fossil-users] Backups of deconstructed fossil repositories
On 6/17/18, Thomas Levine <_...@thomaslevine.com> wrote:
> As content is added to a fossil repository, files in the corresponding
> deconstructed repository never change; they are only added. Most backup
> software will track changes to the deconstructed repository with great
> efficiency.
>
> I should thus take my backups of the deconstructed repositories, yes?

Fossil itself tracks changes with great efficiency. The best backup of
a fossil repository is a clone.

The self-hosting Fossil repo at https://fossil-scm.org/ is backed up by
two clones, one at https://www2.fossil-scm.org/ and the other at
https://www3.fossil-scm.org/site.cgi. Each of these clones is in a
separate data center in a different part of the world. The second clone
uses a different ISP (DigitalOcean instead of Linode). Both clones sync
to the master hourly via a cron job.

--
D. Richard Hipp
d...@sqlite.org
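An hourly sync like the one described can be a single crontab entry
running `fossil pull`; a minimal sketch, assuming the clone lives at an
illustrative local path:

```shell
# Crontab entry on the mirror host: at the top of every hour,
# pull new artifacts from the master into the local clone.
# m  h  dom mon dow  command
  0  *  *   *   *    /usr/local/bin/fossil pull https://fossil-scm.org/ -R /home/www/fossil.fossil
```

Because pull only transfers artifacts the clone does not already have,
each hourly run is cheap when nothing has changed.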
Re: [fossil-users] Backups of deconstructed fossil repositories
On Sun, Jun 17, 2018 at 10:08 PM Warren Young wrote:
> On Jun 17, 2018, at 2:05 PM, Warren Young wrote:
>>
>> If you’re willing to gamble that if the first test returns true that
>> the second will also return true, it buys you a big increase in speed.
>> The gamble is worth taking as long as the files’ modification
>> timestamps are trustworthy.
>
> I just remembered something: “fossil up” purposely does not modify the
> mtimes of the files it writes to match the mtime of the file in the
> repository because it can cause difficult-to-diagnose build system
> errors. Writing changed files out with the current wall time as the
> mtime is more likely to cause correct builds.

To that I'm going to add that fossil doesn't actually store any file
timestamps! It only records the time of a commit. When fossil is asked
"what's the timestamp for file X?", the answer is really the timestamp
of the last commit in which that file was modified.

--
- stephan beal
http://wanderinghorse.net/home/stephan/
"Freedom is sloppy. But since tyranny's the only guaranteed byproduct
of those who insist on a perfect world, freedom will have to do."
-- Bigby Wolf
Re: [fossil-users] Backups of deconstructed fossil repositories
On Jun 17, 2018, at 2:05 PM, Warren Young wrote:
>
> If you’re willing to gamble that if the first test returns true that
> the second will also return true, it buys you a big increase in speed.
> The gamble is worth taking as long as the files’ modification
> timestamps are trustworthy.

I just remembered something: “fossil up” purposely does not modify the
mtimes of the files it writes to match the mtime of the file in the
repository because it can cause difficult-to-diagnose build system
errors. Writing changed files out with the current wall time as the
mtime is more likely to cause correct builds.

I wonder if the fossil deconstruct mechanism also does the same thing?
If so, then you can’t take the rsync mtime optimization without
changing that behavior.
Re: [fossil-users] Backups of deconstructed fossil repositories
On Jun 17, 2018, at 12:16 PM, Thomas Levine <_...@thomaslevine.com> wrote:
>
> One inconvenience I noted is that the deconstruct command always writes
> artefacts to the filesystem, even if a file of the appropriate name and
> size and contents already exists.

You might want to split that observation into two, as rsync does:

- name, size, and modification date match
- contents also match

If you’re willing to gamble that if the first test returns true that
the second will also return true, it buys you a big increase in speed.
The gamble is worth taking as long as the files’ modification
timestamps are trustworthy.

When the timestamps aren’t trustworthy, you do the first test, then if
that returns true, also do the second as extra assurance.

> Would the developers welcome a flag
> to blob_write_to_file in src/blob.c to skip the writing of a new
> artefact file if the file already exists?

In addition to your backup case, it might also benefit snapshotting
mechanisms found in many virtual machine systems and in some of the
more advanced filesystems. (ZFS, btrfs, APFS…)

However, I’ll also give a counterargument to the whole idea: you
probably aren’t saving anything in the end. An intelligent deconstruct
+ backup probably saves no net I/O over just re-copying the Fossil repo
DB to the destination unless the destination is *much* slower than the
machine being backed up.

(rsync was created for the common case where networks are much slower
than the computers they connect. rsync within a single computer is
generally no faster than cp -r, and sometimes slower, unless you take
the mtime optimization mentioned above.)

The VM/ZFS + snapshots case has a similar argument against it: if
you’re using snapshots to back up a Fossil repo, deconstruction isn’t
helpful. The snapshot/CoW mechanism will only clone the changed disk
blocks in the repo.

So, what problem are you solving? If it isn’t the slow-networks
problem, I suspect you’ve got an instance of the premature optimization
problem here. If you go ahead and implement it, measure before
committing the change, and if you measure a meaningful difference,
document the conditions to help guide expectations.
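The two-stage test described above can be sketched in C. This is not
rsync or Fossil source; the function names and the decision to treat a
stage-1 mismatch as "changed" are illustrative:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Stage 1: the cheap metadata test -- compare size and mtime only. */
static int quick_match(const char *a, const char *b){
  struct stat sa, sb;
  if( stat(a, &sa) || stat(b, &sb) ) return 0;
  return sa.st_size == sb.st_size && sa.st_mtime == sb.st_mtime;
}

/* Stage 2: the expensive test -- byte-for-byte content comparison. */
static int content_match(const char *a, const char *b){
  FILE *fa = fopen(a, "rb");
  FILE *fb = fopen(b, "rb");
  int same = (fa != 0 && fb != 0);
  while( same ){
    int ca = fgetc(fa);
    int cb = fgetc(fb);
    same = (ca == cb);
    if( ca == EOF ) break;          /* both hit EOF together: files match */
  }
  if( fa ) fclose(fa);
  if( fb ) fclose(fb);
  return same;
}

/* The gamble from the message above: if stage 1 passes and mtimes are
** trustworthy, skip stage 2 entirely; if they are not trustworthy, run
** stage 2 as extra assurance.  A stage-1 failure is treated as changed. */
static int unchanged(const char *src, const char *dst, int trust_mtime){
  if( !quick_match(src, dst) ) return 0;
  return trust_mtime ? 1 : content_match(src, dst);
}
```

With trustworthy mtimes, `unchanged()` never opens either file, which
is where the speed increase comes from.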
[fossil-users] Backups of deconstructed fossil repositories
As content is added to a fossil repository, files in the corresponding
deconstructed repository never change; they are only added. Most backup
software will track changes to the deconstructed repository with great
efficiency.

I should thus take my backups of the deconstructed repositories, yes?
That is, should I back up the SQLite database format of the fossil
repository or the deconstructed directory format of the repository?

One inconvenience I noted is that the deconstruct command always writes
artefacts to the filesystem, even if a file of the appropriate name and
size and contents already exists. Would the developers welcome a flag
to blob_write_to_file in src/blob.c to skip the writing of a new
artefact file if the file already exists? That is, rebuild_step in
src/rebuild.c would check for the existence of the file corresponding
to the artefact's hash, and if such a file exists already (even if its
content is wrong), rebuild_step would skip writing this artefact.
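The proposed behavior could look something like the sketch below. This
is not Fossil source and the function name is hypothetical; it only
illustrates the semantics proposed above, including the deliberate
choice to trust an existing file without checking its content:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical skip-if-exists artefact writer.  Returns 1 if the file
** was written, 0 if an existing file was kept, -1 on error.  Note that,
** as proposed, an existing file is trusted even if its content is
** wrong, since artefact filenames are derived from content hashes. */
static int write_artefact_unless_present(const char *zPath,
                                         const void *pData, size_t nData){
  struct stat st;
  if( stat(zPath, &st) == 0 ) return 0;   /* file exists: skip the write */
  FILE *out = fopen(zPath, "wb");
  if( out == 0 ) return -1;
  fwrite(pData, 1, nData, out);
  fclose(out);
  return 1;
}
```

Skipping the write keeps the existing file's mtime intact, which is
exactly what lets mtime-based backup tools detect "no change" cheaply.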