On 2020-07-09 at 04:25 +0200, Daniel Jagszent wrote:
> > [...] This specific filesystem is used to store a borg repository.
> > All files are ~500 MiB in size. Consequently, I expect ~20-40k of
> > these files. [...]
>
> this should not be a problem regarding the S3QL database size. (I
> suspect the uncompressed size of the DB would be < 100 MiB)
>
> > I have not (yet) profiled the file access patterns exactly, but I
> > know that all new writes are strictly sequential and files are
> > never rewritten, but accessing a borg repository causes many small
> > random reads with no discernible pattern.
>
> I do not know the specifics of the content-defined chunking borg
> uses, but when it is similar to restic's implementation
> (https://godoc.org/github.com/restic/chunker), then chunks will be
> between 512 KiB and 8 MiB. Let's say that compression can reduce
> that by a factor of 2, so the chunks borg needs to access are
> between ~256 KiB and ~4 MiB. Then maybe a max S3QL block size of
> 5 MiB instead of the default of 10 MiB would be better. Since your
> file system has relatively few inodes (~40k inodes, ~40k names,
> 4M blocks) this should be OK for the S3QL database.
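As a quick sanity check of the block-count arithmetic above (a rough
sketch only, assuming the same ~40k files of ~500 MiB each; the
candidate block sizes are just examples):

    # Rough S3QL block counts for a few candidate max block sizes,
    # assuming ~40k files of ~500 MiB each (figures from above).
    FILE_SIZE_MIB = 500
    NUM_FILES = 40_000

    for max_block_mib in (10, 5, 1):
        blocks_per_file = -(-FILE_SIZE_MIB // max_block_mib)  # ceiling division
        total_blocks = blocks_per_file * NUM_FILES
        print(f"max block {max_block_mib:>2} MiB -> "
              f"{blocks_per_file:>3} blocks/file, {total_blocks:,} blocks total")

    # max block 10 MiB ->  50 blocks/file, 2,000,000 blocks total
    # max block  5 MiB -> 100 blocks/file, 4,000,000 blocks total  (the 4M above)
    # max block  1 MiB -> 500 blocks/file, 20,000,000 blocks total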
That estimate holds for data chunks, but metadata chunks are smaller.
I still have no hard profiling data (bpf doesn't like me), but plain
stracing shows that pruning/deleting old archives with borg (a
metadata-heavy operation) performs a scatter of 128 KiB reads every
1-10 MiB, which naturally results in huge read amplification. Pruning
two archives from a test 600 GiB repository has just crossed 100 GiB
of total I/O. I am not sure this can be solved by reducing the S3QL
block size...
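To put a rough number on that amplification (a back-of-the-envelope
sketch; it assumes the pessimistic case where every 128 KiB read hits
a cold cache and lands in a different block, so each read pulls one
whole block from the backend):

    # Worst-case read amplification for scattered 128 KiB reads with a
    # cold S3QL cache: every read fetches an entire block. The 128 KiB
    # figure is from the strace output above; "no two reads share a
    # block" is an assumption, i.e. the worst case.
    READ_KIB = 128

    for block_mib in (10, 5, 1):
        amplification = block_mib * 1024 / READ_KIB
        print(f"{block_mib:>2} MiB blocks -> up to {amplification:.0f}x amplification")

    # 10 MiB blocks -> up to 80x amplification
    #  5 MiB blocks -> up to 40x amplification
    #  1 MiB blocks -> up to 8x amplification

So halving the block size only halves the worst case; the scattered
reads themselves remain.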
> Besides the max block size, what really would improve the random
> read/write performance is a dedicated SSD for the S3QL cache.

Yes, I do have an SSD for the cache, though not a dedicated one.

-- 
Ivan Shapovalov / intelfx /