> Is it correct that s3ql will always download the whole block into the
> local cache upon a read that intersects with this block?
Yes.
> If true, then how scalable is s3ql with respect to number of blocks in
> the filesystem? That is, how far can I realistically reduce the block
> size if my dataset is, say, 10-20 TB?
>
> Basically, I'm trying to optimize for random reads.
How many files/inodes does your dataset have? What is the average file size?

You should also take your connection speed into account. If you are on
a 1GB/s internet connection, downloading 10MiB blocks (the default max
block size) is probably fine. If you only have a 1MB/s connection, your
max block size should be smaller.
But keep in mind that raw download speed is not the only relevant
factor; each download request also has a setup cost. The object storage
backend needs to authenticate your request and look up your object, and
you may also have DNS, TLS, and TCP slow-start overhead. You might also
have to pay for each request.
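
To put rough numbers on this, here is a small back-of-the-envelope
sketch. It assumes the worst case for a random read (one full-size,
uncompressed block is fetched), reads "1MB/s" and "1GB/s" as 10^6 and
10^9 bytes per second, and uses a made-up 0.1 s per-request overhead;
the function name and all of these values are placeholders, so plug in
your own measurements.

```python
MIB = 1024 * 1024


def block_fetch_seconds(block_bytes, bandwidth_bytes_per_s, request_overhead_s=0.0):
    """Rough time to fetch one whole block: fixed per-request cost plus transfer time."""
    return request_overhead_s + block_bytes / bandwidth_bytes_per_s


if __name__ == "__main__":
    # Assumed values: 0.1 s request overhead, bandwidths in bytes per second.
    for size_mib in (1, 10, 300):
        for label, bandwidth in (("1 MB/s", 1e6), ("1 GB/s", 1e9)):
            seconds = block_fetch_seconds(size_mib * MIB, bandwidth, request_overhead_s=0.1)
            print(f"{size_mib:>4} MiB block at {label}: ~{seconds:.2f} s")
```

On a slow link the transfer time dominates, so smaller blocks help
random reads. On a fast link the fixed per-request cost dominates, so
shrinking the block size buys very little.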

S3QL uses a SQLite database to keep track of all file system metadata.
A smaller block size means more blocks for S3QL to keep track of, and
therefore a bigger database. That SQLite database can get big. A
compressed copy of the database is stored on the object storage as a
backup. Currently, this backup is limited to 5GiB (the maximum object
size of almost all object storage providers). If you have many blocks
and many inodes, you can reach this limit (search this list; one or two
people have had this problem) and get into real trouble.
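
As a rough illustration of how the block count grows, the sketch below
estimates the number of blocks for a given dataset size and max block
size. It assumes the data sits in large files that fill their blocks
completely and that nothing is deduplicated, and it reads "10-20 TB" as
10 and 20 TiB. The function name is made up for this example. It does
not try to predict the database size, which also depends on the number
of inodes.

```python
MIB = 1024 ** 2
TIB = 1024 ** 4


def estimated_block_count(dataset_bytes, max_block_bytes):
    """Blocks needed if every block is full (integer ceiling division)."""
    return -(-dataset_bytes // max_block_bytes)


if __name__ == "__main__":
    for dataset_tib in (10, 20):
        for block_mib in (1, 10, 300):
            count = estimated_block_count(dataset_tib * TIB, block_mib * MIB)
            print(f"{dataset_tib} TiB at {block_mib} MiB max block size: ~{count:,} blocks")
```

Going from 10MiB to 1MiB blocks multiplies the number of blocks the
database has to track by ten.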

I probably would not choose a max block size below 10MiB. I have some
S3QL file systems with a small number of big files (Bareos "tape"
backups). For these file systems I have increased the max block size to
300MiB, but these files are only accessed sequentially, and the VMs
running these file systems are on 1GB/s+ internet connections.
