Hello Peter, nice to virtually meet you!
I'm Luigi, a sysadmin who works for sysdig <http://www.sysdig.org/>. I saw
that you are the developer and maintainer of snapshot.debian.org. I'm
writing a crawler to fetch all the old Debian linux-image and linux-kernel
deb packages, so that we can pre-compile a kernel probe for the sysdig
project.

I noticed that the crawler is really slow, so I did some profiling with
cProfile (I'm using Python).

Most of the time is spent in the open function that grabs the HTML from
the website.
I was wondering whether there are actions you can take on your side to
improve the performance of the website, such as adding a CDN or a Varnish
cache, or spotting a bottleneck that you may have on your side?
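
For reference, the kind of measurement described above can be sketched
roughly as follows. This is a simplified illustration and not the actual
crawler code: the names fetch and profile_fetch are hypothetical, and it
assumes the page is downloaded with urllib from the standard library.

```python
import cProfile
import io
import pstats
import urllib.request


def fetch(url, opener=urllib.request.urlopen):
    """Download a page and return its body as bytes (hypothetical helper)."""
    with opener(url) as response:
        return response.read()


def profile_fetch(url, opener=urllib.request.urlopen, top=10):
    """Run fetch() under cProfile; return the body and a text report."""
    profiler = cProfile.Profile()
    body = profiler.runcall(fetch, url, opener)

    # Collect the report in memory, sorted by cumulative time, so the
    # slowest call chains (here, the network open/read) appear first.
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return body, out.getvalue()
```

Calling profile_fetch() with a snapshot.debian.org package URL and printing
the returned report shows where the time goes for a single page.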

Here is an example of the time spent from an AWS instance in the us-east-1
region to grab a page from snapshot.debian.org (as you can see, it took 20s):
[root@ip-10-10-1-128 ~]# curl -o /dev/null http://snapshot.debian.org/package/linux/4.6~rc3-1~exp1/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1255k    0 1255k    0     0  61728      0 --:--:--  0:00:20 --:--:--  337k

Looking forward to your reply.
Regards
L.
-- 
Luigi
---
“The only way to get smarter is by playing a smarter opponent.”
