On Tue, 10 Mar 2026 at 14:43, Nick Holland <[email protected]> wrote:
> On 3/10/26 11:37 AM, Constantine A. Murenin wrote:
> [... blah ... blah ... blah ...]
>
> you made arguments for why it is "easy".
> You did not make any consideration for the CONSEQUENCES of your "easy"
> solutions.
>
> The old cvsweb app made it trivial to scrape not only every single file,
> but every single version of every single file, every single incremental
> diff of every file and /every single diff between every two versions of
> every file/.
> So... for a file in CVS with N commits, there are N*(N-1) diffs possible.
> And the old version of cvsweb exposed all of those...for every single
> file in CVS. They already have a list of all these possible diffs, and
> they are attempting to use them.
>
> Every diff requested requires firing up external applications. That's a
> lot of load, even for a significantly more efficient application.
>
> Those requests are still coming in. Yesterday, well over 90% of the
> queries we got were URLs from the old application. So by returning a 404
> instead of firing up cvs, co, and/or rcsdiff every time a bot query comes
> in, we save a LOT of load on the system. That's a HUGE win.
>
> This win has enabled me to remove the IP filters, I've removed much of
> the malicious request handling which was all justifiably highly unpopular
> and (unfortunately) hurting some of the legitimate users. I hope to soon
> return the systems this application runs on to a fully redundant CARP
> pair (due to the load, I had to "split" the cvsweb off to its own
> machine, so lost the redundancy). Lots of "win" here.
>
> So...unless the OpenBSD developers request otherwise, I do think we will
> not be worrying about -- and in fact, actively discouraging -- the old
> URLs. I get it, less than optimal. But this whole problem has been a
> gigantic "less than optimal" that shouldn't be, but it is, and we deal
> with it as best we can with the resources that we have available.
> As far as Ken and I are concerned at this point, the discussion of
> supporting old URL is over.
>
> (And special thanks to my employer for laying me off at an opportune
> time where I could devote a fair chunk of time to work with Ken at
> getting his new solution up and running!)
>
> Nick.

You're literally just providing more inapplicable excuses for doing the
wrong thing.

If those broken bots continue requesting all the version-to-version
diffs, just block those N*(N-1) diffs; no one cares about those.

But why do you break http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/ and
the like, too?

You pretend that the new cvsweb is not actually "cvsweb", yet it's still
***cvsweb***.openbsd.org.

Do you seriously think it's reasonable to break everyone's links simply
because some backend somewhere has changed slightly? Because it is simply
not reasonable in my book.

A user visiting the site couldn't care less whether it's written in Go or
in Perl. They care about the CONTENT they came for.

This whole issue should have taken less time to fix than this entire
discussion we're having. Instead, somehow it's okay for everyone to deal
with broken links everywhere, simply because of invalid justifications
that do not make any sense in the grand scheme of things.

So far, you have not provided a single valid reason for breaking EVERY
SINGLE LINK on the website.

C.

