2013/7/25 Andre Klapper <[email protected]>

> Mozilla have recommendations for researchers at
> https://bugzilla.mozilla.org/page.cgi?id=researchers.html
> and offer a sanitized MySQL dump (without attachments and secret
> tickets) at http://people.mozilla.com/~mhoye/bugzilla/ .
> Would it be worth if I asked Mozilla for steps how to create such a
> dump?

Sure, we can ask them how to build a dump without secret tickets or private data. Will you go ahead and do that?

> For the time being that researchers crawl GNOME Bugzilla and that we
> don't have a dump:
> What would be acceptable latency values to *not* get IP addresses
> blocked, and UTC times of the day where there's less traffic anyway?
> (Actually I'm asking this on behalf of a university professor.)

Currently, any client that exceeds 1500 hits per hour gets banned (a hit is one entry in the relevant Apache log, in the format "IP date GET PATH"). We had a few cases of people not keeping a cache of the static HTML / CSS files, which resulted in a ban after a few minutes because their browser requested the same static files on every request. What we can do now is add a few exceptions to the htaccess file that gets populated by our banning script.

That said, most of the GNOME developers are either in the EU (mainly GMT+1) or on the east coast of the US (GMT-5), so I would say any time between 1-2 AM and 7-8 AM. We should probably ask these researchers not to crawl the website at the same time if they plan to do so in the future, maybe limiting them to one per night.

Someone else might have a better idea though ;)

-- 
Cheers,

Andrea

Debian Developer, Fedora / EPEL packager, GNOME Sysadmin,
GNOME Foundation Membership & Elections Committee Chairman

Homepage: http://www.gnome.org/~av
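
For anyone planning a crawl, the 1500 hits per hour limit mentioned above translates into a minimum delay between requests. The following is only a rough sketch: the function names, the `safety_factor` parameter, and the 02:00-07:00 window (read here as UTC, which the message does not state) are assumptions for illustration, not part of the banning script. Keep in mind that every logged request counts as a hit, including static files.

```python
# Rough sketch, not the GNOME banning script: helper names, the safety
# factor, and the UTC reading of the quiet window are all assumptions.

HITS_PER_HOUR_LIMIT = 1500  # threshold quoted in the message above


def min_delay_seconds(limit_per_hour=HITS_PER_HOUR_LIMIT, safety_factor=0.5):
    """Return the delay between requests that uses only `safety_factor`
    of the hourly limit (1.0 means running exactly at the limit)."""
    if not 0 < safety_factor <= 1:
        raise ValueError("safety_factor must be in (0, 1]")
    return 3600.0 / (limit_per_hour * safety_factor)


def in_quiet_window(hour_utc, start=2, end=7):
    """Return True if `hour_utc` (0-23) falls inside the assumed
    low-traffic window [start, end)."""
    return start <= hour_utc < end


if __name__ == "__main__":
    print("delay at the limit:", min_delay_seconds(safety_factor=1.0), "s")
    print("delay at half the limit:", min_delay_seconds(), "s")
```

A crawler would sleep for at least `min_delay_seconds()` between requests and only run while `in_quiet_window()` is true for the current hour.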

_______________________________________________
gnome-infrastructure mailing list
[email protected]
https://mail.gnome.org/mailman/listinfo/gnome-infrastructure
