Hello Peter, nice to virtually meet you! I'm Luigi, a sysadmin who works for sysdig <http://www.sysdig.org/>. I saw that you are the developer and maintainer of snapshot.debian.org. I'm writing a crawler to fetch all the old Debian linux-image and linux-kernel deb packages so that I can pre-compile a kernel probe for the sysdig project.
I noticed that the crawler is really slow, so I did some profiling with cProfile (I'm using Python). Most of the time is spent in the open function that grabs the HTML from the website. I was wondering whether there is anything you can do on your side to improve the performance of the website, such as adding a CDN or a Varnish cache, or spotting a bottleneck you may have on your side?

Here is an example of the time it takes to grab a page from snapshot.debian.org from an AWS instance in the us-east-1 region (as you can see, it took 20s):

    [root@ip-10-10-1-128 ~]# curl -o /dev/null http://snapshot.debian.org/package/linux/4.6~rc3-1~exp1/
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100 1255k    0 1255k    0     0  61728      0 --:--:--  0:00:20 --:--:--  337k

Looking forward to your reply.

Regards
L.

--
Luigi
---
“The only way to get smarter is by playing a smarter opponent.”
