> Is it correct that s3ql will always download the whole block into the
> local cache upon a read that intersects with this block?

Yes.

> If true, then how scalable is s3ql with respect to number of blocks in
> the filesystem? That is, how far can I realistically reduce the block
> size if my dataset is, say, 10-20 TB?
>
> Basically, I'm trying to optimize for random reads.

How many files/inodes does your dataset have? What is the average file size?
You should also take your connection speed into account. If you are on a 1 GB/s internet connection, downloading 10 MiB blocks (the default max block size) is probably fine. If you only have a 1 MB/s connection, your max block size should be smaller.

But keep in mind that the raw download speed is not the only thing that matters; the setup cost of each download request counts too. The object storage backend needs to authenticate your request and look up your object, and you may also have DNS/TLS/TCP slow start overhead. You might also need to pay for each request.

S3QL uses a SQLite database to keep track of everything. A smaller block size means more blocks for S3QL to keep track of, and therefore a bigger database. That SQLite database can get big. A compressed version of the database is stored on the object storage as a backup. Currently there is a limitation that this backup can be at most 5 GiB (the maximum object size of almost all object storage providers). If you have many blocks and many inodes you can reach this limit (search this list, one or two folks have had this problem) and get into real trouble.

I probably would not choose a max block size below 10 MiB.

I have some S3QL file systems with a few big files (Bareos "tape" backups). For these file systems I have increased the max block size to 300 MiB, but these files are only accessed sequentially and the VMs running these file systems are on 1 GB/s+ internet connections.
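To get a feeling for the trade-off between block size, per-request overhead and the number of blocks the database has to track, a rough back-of-the-envelope sketch in Python might help. Everything here is an assumption for illustration: the function names are made up, the 100 MiB/s bandwidth and the 50 ms per-request overhead are placeholder values you should replace with your own measurements, and the block count assumes every block is full (no small files, no compression, no deduplication), so it is only a lower bound. It does not use or model any S3QL API.

```python
MiB = 1024 ** 2
TiB = 1024 ** 4


def block_count(dataset_bytes, max_block_bytes):
    """Lower bound on the number of blocks, assuming every block is full."""
    # Integer ceiling division avoids float rounding on large sizes.
    return -(-dataset_bytes // max_block_bytes)


def fetch_seconds(block_bytes, bandwidth_bytes_per_s, request_overhead_s):
    """Time to fetch one block: fixed per-request cost plus transfer time."""
    return request_overhead_s + block_bytes / bandwidth_bytes_per_s


if __name__ == "__main__":
    dataset = 15 * TiB       # assumed: roughly the middle of "10-20 TB"
    bandwidth = 100 * MiB    # assumed: 100 MiB/s effective download speed
    overhead = 0.05          # assumed: 50 ms setup cost per request

    for size_mib in (1, 10, 100, 300):
        size = size_mib * MiB
        blocks = block_count(dataset, size)
        seconds = fetch_seconds(size, bandwidth, overhead)
        print(f"{size_mib:>4} MiB blocks: at least {blocks:>10} blocks, "
              f"about {seconds:.3f} s per random read that misses the cache")
```

With small blocks the fixed per-request cost dominates the fetch time while the block count (and with it the database) grows; with large blocks each random read has to wait for a long transfer. The numbers say nothing about how large the SQLite database actually becomes per block, so treat them only as a way to compare candidate block sizes against each other.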
