On 10/06/24 at 13:31 +0200, Lucas Nussbaum wrote:
> It looks like you see the work on object-storage backend as a
> procurement/infrastructure question. I don't think that this is the main
> issue. Based on what I've done so far (and I still plan to continue
> working on this, but I have limited time for Debian nowadays), the code
> also needs deep changes because, if you want an object-storage-based
> backend to perform adequately, you need more parallelism for
> backend-related operations.
>
> This is true whether the storage service is AWS S3, OpenStack Swift,
> Azure Blob Storage, or Ceph Object Storage, or whatever. If you increase
> the latency between the importer/indexer and the storage service,
> you need parallelism to hide it and stay with a bandwidth-bound problem.
>
> To work on this, you need an object storage backend, but I suspect that
> once it works with one of them, porting it to another one will be
> trivial, as the S3-specific bits are really minimal. (and Swift is
> S3-compatible anyway)
>
> Help is welcomed -- my code is at
> https://salsa.debian.org/lucas/snapshot/-/commits/s3snap/?ref_type=heads
>
> Typically a good way to test this is to try to import a small archive
> (e.g. debian-security with one architecture only) and see if you can get
> an import time on object storage that is similar to the one on
> file-based storage.
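To illustrate the latency point in the quoted text, here is a minimal
sketch, not taken from the s3snap branch. It assumes a hypothetical
put_object() that stands in for a single PUT to an object store and
simulates the round trip with a sleep; LATENCY_S, upload_sequential(),
upload_parallel() and timed() are likewise made-up names. It only shows
that issuing requests concurrently lets the per-request latency overlap.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Assumption: per-request round-trip time to the object store, in seconds.
LATENCY_S = 0.02


def put_object(store, key, data):
    """Hypothetical stand-in for one PUT request to an object store."""
    time.sleep(LATENCY_S)  # simulated network latency
    store[key] = data      # a single dict assignment is atomic in CPython
    return key


def upload_sequential(store, items):
    """One request at a time: total time grows as len(items) * latency."""
    return [put_object(store, key, data) for key, data in items]


def upload_parallel(store, items, workers=16):
    """Up to `workers` requests in flight, so their latencies overlap."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(put_object, store, key, data)
                   for key, data in items]
        # Collect results in submission order.
        return [future.result() for future in futures]


def timed(fn, *args, **kwargs):
    """Return (result, elapsed_seconds) for a call to fn."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    items = [(f"obj-{i}", b"payload") for i in range(64)]
    _, t_seq = timed(upload_sequential, {}, items)
    _, t_par = timed(upload_parallel, {}, items)
    print(f"sequential: {t_seq:.2f}s  parallel: {t_par:.2f}s")
```

With a real backend the sleep would be replaced by the actual client
call, and the worker count would need tuning until the import becomes
bandwidth-bound rather than latency-bound.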
FYI, I stopped working on that, since the FS-backed service is back in a
good state. My work is pushed to the above git repo and I cleaned up the
infrastructure bits I set up on AWS.

Lucas
