On Mon, Dec 05, 2005 at 07:25:49AM -0700, Elijah Newren wrote:
> On 12/5/05, Olav Vitters <[EMAIL PROTECTED]> wrote:
> > Searching for a bug can produce lots of results. Some queries can
> > return all the bugs in the database. The script has a very idiotic way
> > to protect against such queries (nothing after the ? in the URL or just
> > buglist.cgi).
> >
> > A java program was requesting:
> >   http://bugzilla.gnome.org/buglist.cgi?bug_id=
> > This caused buglist.cgi to retrieve all bugs. I've blocked his IP &
> > changed buglist.cgi to reject above query, but the java program already
> > had 3 buglist.cgi processes running on window, each consuming lots of
> > processor time (20min) & memory (180MB+).
>
> Was their java program running core-bugs-today.cgi or another
> braindead script that we have? I've requested
>   http://bugzilla.gnome.org/buglist.cgi?bug_id=
> several times myself as well. I didn't do it intentionally, but
> rather because reports/core-bugs-today.cgi does its own little query
> to get a comma separated list of bugs, and then just appends them to
> that above URL and automatically redirects--if no bugs were found (as
> is the case just after 24:00 UTC), the appended list is merely an
> empty string. Taking a look at the code, this bug is still there
> although it appears you fixed it when you ported that script to 2.20.
> We may have other scripts that are similarly braindead. I think
> boogle had this problem at one point (causing me to request that same
> url...) before I added code to special case that and print an error
> that no bugs were found instead.

Ahh.. that may explain it. Still strange that the USER_AGENT only
contained Java/some_ver. I assumed it was a badly written bot, since
nobody was logged in from that IP (or its /24) either. I can't say
whether it accessed other URLs, as I do not have access to the server
logs (perhaps ask the sysadmins, but they are currently busy).

For the processes I just used /proc/$PID/environ to find out the
information. Using that file is also much better because it is more
precise.

> > Ideally buglist.cgi should contain a better detection of such queries.
>
> Well, I think special casing an empty query since it has happened so
> many times makes sense. Something more advanced would be welcome too,
> but this would probably be about as simple as you get and an empty
> query really ought to return an empty list.

Currently it rejects buglist.cgi?bug_id= (added today) and a bare
buglist.cgi. I know I sometimes use buglist.cgi?id=somebug (this will
still hang...). I do not see an easy way to detect such queries, which
is why I'm suggesting the limit.

> > Another way would be to limit the number of bugs in the SQL. This isn't
> > perfect as the java process would still return lots of results, but it
> > is easy to implement. This is what I want to do now.
>
> Sounds like it'd also be a good idea, but I really do think we should
> also special case an empty query and make it return
> nothing--especially since we have caused it so many times ourselves.

Yeah.. I'll look again.. seemed hard to do :-(

-- 
Regards,
Olav
_______________________________________________
Gnome-bugsquad mailing list
[email protected]
http://mail.gnome.org/mailman/listinfo/gnome-bugsquad
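
For reference, the three safeguards discussed in this thread (not redirecting when a report finds no bugs, treating a query with only blank parameters as empty, and capping the SQL result size) could be sketched roughly as follows. This is only an illustration in Python, not the actual buglist.cgi or core-bugs-today.cgi code: the function names, the MAX_BUGS value, and the example SELECT statement are all assumptions made up for the sketch.

```python
from urllib.parse import parse_qs

# Hypothetical cap; the thread does not say what limit would be used.
MAX_BUGS = 1000

BUGLIST_URL = "http://bugzilla.gnome.org/buglist.cgi?bug_id="


def buglist_url(bug_ids):
    """Build the redirect URL for a report, or None if no bugs were found.

    Returning None lets the caller print "no bugs found" instead of
    redirecting to a URL that ends in an empty bug_id= parameter.
    """
    if not bug_ids:
        return None
    return BUGLIST_URL + ",".join(str(bug_id) for bug_id in bug_ids)


def has_search_criteria(query_string):
    """Return True if at least one parameter carries a non-blank value.

    parse_qs drops blank values by default, so "bug_id=" and an empty
    query string both end up with no parameters at all.
    """
    params = parse_qs(query_string)
    return any(value.strip() for values in params.values() for value in values)


def cap_query(sql, limit=MAX_BUGS):
    """Append a LIMIT clause so a runaway search cannot return every bug."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return f"{sql.rstrip().rstrip(';')} LIMIT {int(limit)}"
```

Note that a parameter the script does not recognise, such as id=somebug, still passes has_search_criteria(), so the blank-parameter check alone would not catch that case. The cap is what bounds it, which matches the reasoning above for suggesting the limit.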
