On 05/07/2019 10:27 wea...@debian.org wrote,
[snip]
There are two parts to the snapshot thing, each with its own resource
constraints.
(a) One is everything that goes to the database, which is pretty much
every request except for those in (b). Things have gotten somewhat
better since we moved the DB for the secondary snapshot instance
to a new host, but it's probably still not happy to be hammered.
[snip]
These requests are bound by database latency, and also number of
concurrent requests to the DBMS. Further, since the pooling class
in use is not exactly great, once a certain number of requests are
in flight, things just fall over and everybody starts getting 503s.
Don't overload the DB :)
So I guess pretty much every request to the machine readable interface
hits the database. How about if I did something like this: make a request,
time how long it takes, then wait 4 times that long before making the next
request? Does that seem like a reasonable place to start to avoid
breaking/abusing the system? No parallel requests, just a single thread,
with appropriate retries and backoff in the event of failure.
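
The throttling idea above could be sketched roughly as follows. This is only an illustration under assumptions, not anything the snapshot service prescribes: the function names, the retry count, and the 30-second base backoff are hypothetical choices, and only the "wait 4 times the request duration" rule comes from the proposal itself.

```python
import time
import urllib.request


def fetch(url, timeout=60):
    """Fetch one URL and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def polite_fetch_all(urls, fetcher=fetch, wait_factor=4, max_retries=5,
                     base_backoff=30.0, sleep=time.sleep,
                     clock=time.monotonic):
    """Fetch urls one at a time in a single thread.

    After each successful request, pause for wait_factor times the time
    the request took. On failure, back off exponentially and retry.
    All default values here are assumptions for illustration.
    """
    results = {}
    for url in urls:
        for attempt in range(max_retries):
            start = clock()
            try:
                results[url] = fetcher(url)
            except OSError:
                # urllib's URLError/HTTPError (e.g. a 503) and timeouts are
                # OSError subclasses: back off 30s, 60s, 120s, ... and retry.
                sleep(base_backoff * 2 ** attempt)
                continue
            elapsed = clock() - start
            # Success: wait 4x the request duration before the next request.
            sleep(wait_factor * elapsed)
            break
        else:
            raise RuntimeError(
                f"giving up on {url} after {max_retries} attempts")
    return results
```

Because the pause scales with the measured latency, the client slows down automatically when the database is under load, and the fetcher never has more than one request in flight.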
(b) The only requests that do not hit the DB are requests to
https://snapshot.debian.org/file/<sha1sum of file>
Those are cheap(ish). They are static files and Apache fetches them
directly from disk (NFS, but still). I wouldn't worry too much
about making a lot of them. Maybe not concurrently, but fetching
them fast and sustained shouldn't cause too many issues. If things
fail, retry slowly?
This is very useful to know. I see you have some iptables limits on the
number of connections from each IP, but if I'm only downloading one file
at a time with no concurrency, hopefully I won't trigger them.
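
A one-file-at-a-time download with slow retries might look something like the sketch below. The URL pattern is the one quoted above; everything else (the function name, the number of attempts, the 60-second base delay, and checking the downloaded bytes against the SHA-1 in the URL) is an assumption made for illustration.

```python
import hashlib
import time
import urllib.request

# URL pattern quoted in the thread; the last path component is the
# SHA-1 of the file's contents.
FILE_URL = "https://snapshot.debian.org/file/{sha1}"


def download_by_sha1(sha1, opener=urllib.request.urlopen, max_attempts=4,
                     retry_delay=60.0, sleep=time.sleep):
    """Download one file, verify its SHA-1, and retry slowly on failure.

    max_attempts and retry_delay are hypothetical values, not limits
    published by the snapshot service.
    """
    url = FILE_URL.format(sha1=sha1)
    for attempt in range(1, max_attempts + 1):
        try:
            with opener(url, timeout=120) as resp:
                data = resp.read()  # whole file in memory; fine for a sketch
        except OSError:
            data = None  # network or HTTP error: treat as a failed attempt
        if data is not None and hashlib.sha1(data).hexdigest() == sha1:
            return data
        if attempt < max_attempts:
            # Retry slowly: 60s, 120s, 180s between attempts.
            sleep(retry_delay * attempt)
    raise RuntimeError(f"could not fetch {url} after {max_attempts} attempts")
```

Calling this in a plain `for` loop over a list of hashes keeps a single connection open at a time, which is the property that matters for per-IP connection limits.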
When I last tried accessing snapshot from AWS I found it was blocked.
I'd really like to do this in AWS. Maybe I could get around this by
using IPv6, but that probably counts as cheating, right?
Thanks for your reply and for forwarding this email to the list. You've
been very helpful already.
Cheers,
Paul