On Tue, 30 Oct 2012 06:17:05 -0400, Richard Hipp <d...@sqlite.org> wrote:

[...]

> Both sessions started out innocently.  The logs suggest that there really
> was a human operator initially.  But then after about 3 minutes of "normal"
> browsing, each session starts downloading every hyperlink in sight at a
> rate of about 5 to 10 pages per second. It is as if the user had pressed a
> "Download Entire Website" button on their browser.  Question:  Is there
> such a button in IE?

No, just "Save page as ...". It will not follow hyperlinks; it only
saves the HTML and embedded resources, like images.

> Another question:  Are significant numbers of people still using IE6 and
> IE7?  Could we simply change Fossil to consider IE prior to version 8 to be
> a bot, and hence not display any hyperlinks until the user has logged in?

I don't think it would help much. Newer versions will potentially run
the same add-ons.

By the way, over 5% of the population still use these older versions.
http://stats.wikimedia.org/archive/squid_reports/2012-09/SquidReportClients.htm

> Yet another question:  Is there any other software on Windows that I am not
> aware of that might be causing the above behaviors?  Are there plug-ins or
> other tools for IE that will walk a website and download all its content?

There are several browser add-ons that will try to walk complete
websites, e.g.:
http://www.winappslist.com/download_managers.htm
http://www.unixdaemon.net/ie-plugins.html

One can also think of validator tools.

Standalone programs usually will not run javascript.


> Finally: Do you have any further ideas on how to defend a Fossil website
> against runs such as the two we observed on SQLite last night?

Perhaps the href javascript should run "onfocus", rather than "onload"?
(untested)

Other defenses could borrow DoS defense techniques, like not honouring
(or aggressively delaying responses to) more than a certain number of
requests within a certain time. That is not nice, because the server
would have to maintain (more) session state.
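
A minimal, untested-in-production sketch of that idea follows. It is not
Fossil code (Fossil is written in C); the class name, the limits and the
use of the client IP address as the key are all assumptions made for
illustration. It keeps a short list of recent request times per client,
which is exactly the extra state mentioned above.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical helper: allow at most max_requests per client
    within any window of window_seconds."""

    def __init__(self, max_requests=20, window_seconds=10.0,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        # client key (e.g. IP address) -> timestamps of recent requests
        self.hits = defaultdict(deque)

    def allow(self, client):
        """Return True if the request should be served, False if it
        should be refused (or delayed) by the caller."""
        now = self.clock()
        recent = self.hits[client]
        # Forget requests that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True
```

A server would call allow() with the client address before rendering a
page and answer with an error or a delay when it returns False. With the
assumed defaults (20 requests per 10 seconds) a human reader is unlikely
to notice, while a crawl at 5 to 10 pages per second is cut off within a
few seconds. Entries for idle clients are never purged in this sketch,
so a real implementation would also need some cleanup.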

Sidenote:
As far as I can tell, several modern browsers have a "read ahead" option
that will try to load more pages of the site before a link is clicked.
https://developers.google.com/chrome/whitepapers/prerender
Those will not walk a whole site though.

-- 
Groet, Cordialement, Pozdrawiam, Regards,

Kees Nuyt

_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
