I've been meaning for months to summarize the patterns of use and abuse
of the Rails wiki. Today I decided to do this by reviewing all the
updates in a typical day - and yesterday (Friday 26th May 2006, GMT)
looked a good enough candidate.
Getting these figures has involved some hand counting, so they might be
a little bit off.
There were 373 changes, affecting 57 pages.
34 pages had valid updates (86 changes), and 25 pages were spammed (248
changes) or despammed (39 changes). Two pages had valid as well as
spam-related changes.
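Since the counts were done by hand, a quick cross-check that the totals agree with each other may be useful. This is only a sketch in Python; the variable names are mine, and the numbers are just the ones quoted above.

```python
# Figures copied from the hand count above (26th May 2006).
total_changes = 373
total_pages = 57

valid_changes = 86
spam_changes = 248
despam_changes = 39

valid_pages = 34
spam_related_pages = 25
pages_in_both = 2  # pages with valid as well as spam-related changes

# Every change is either valid, spam, or a rollback of spam.
changes_sum = valid_changes + spam_changes + despam_changes

# Pages counted in both groups must only be counted once.
pages_sum = valid_pages + spam_related_pages - pages_in_both

spam_share = spam_changes / total_changes

print("changes add up:", changes_sum == total_changes)
print("pages add up:", pages_sum == total_pages)
print(f"share of changes that were spam: {spam_share:.0%}")
```

Both totals agree with the headline figures, so any counting errors are at least self-consistent.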
Useful editing activity centred on the RealWorldUsage page. This page
started overflowing the wiki's 64K limit on 14th March 2006, and has
suffered from truncation on most updates since then. Yesterday Larry
Gilbert split the content between two new pages, RealWorldUsagePage1 and
RealWorldUsagePage2, and began bringing lost content back from past
versions. This accounted for 23 changes on 3 pages. The other useful
activity averaged about 2 changes each on 31 pages.
The most heavily spammed page, addentry.php, had 174 updates (and passed
version 10,000) yesterday. Nobody bothers to remove spam on that page -
the page has never had valid content. The other 74 spammings were spread
over 24 pages, and were countered by 39 rollbacks. Some pages are more
popular than others for spamming; the popular ones are spammed many
times more often than they are rolled back, so they show valid content
for only a small proportion of the time. Rolling these pages back is tedious,
as it involves looking back through many versions for a valid one to
roll back to (and the wiki is quite slow).
Useful pages suffering from heavy spamming include MySQL,
ActiveRecordAssociations, DeadPages (which would be useful if anyone
took any notice of it), Contributors, and OpenSourceProjects. Various
individuals' pages are badly hit too, e.g. DimitrySabanin, PabloFlores
and DanielVonFange.
IP addresses used by spammers are mostly faked, and the user names they
use are often randomly generated. I don't believe the present mechanisms
for controlling updates to the wiki are adequate. It's possible that
using a CAPTCHA would improve matters, but I'd rather see registration
required for wiki editing. Cookies could be used to remember registered
users.
As well as stopping bad content getting in, it is high time someone was
able to strip the existing junk pages out. I sent Dan a list of about
300 junk pages (pages which had never held any useful content) back in
February, and later added those to the DeadPages wiki page, but the
pages are still there. (I've noticed occasionally since then that
someone has, for the first time, put valid content into a page which has
been around for months, so trawling for junk pages would have to be done
again when someone is ready to start deleting.) Deletion could be
logical (like the Attic in CVS) - so long as it removes pages from
normal view, including the All Pages and Recently Revised lists.
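To make the idea of logical deletion concrete, here is a rough sketch in Python. Everything in it is hypothetical: the Page and Wiki classes, the in_attic flag and the method names are invented for illustration and are not taken from the wiki's actual code. The point is only that an "atticked" page keeps its history but drops out of the listings.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    name: str
    versions: list = field(default_factory=list)
    in_attic: bool = False  # hypothetical flag: True means logically deleted


class Wiki:
    """Toy in-memory wiki, assumed purely for illustration."""

    def __init__(self):
        self.pages = {}  # page name -> Page
        self.log = []    # page names in order of revision

    def add(self, name, content):
        page = self.pages.setdefault(name, Page(name))
        page.versions.append(content)
        self.log.append(name)

    def move_to_attic(self, name):
        # Logical deletion: nothing is destroyed, the page is only hidden.
        self.pages[name].in_attic = True

    def restore(self, name):
        self.pages[name].in_attic = False

    def all_pages(self):
        return sorted(n for n, p in self.pages.items() if not p.in_attic)

    def recently_revised(self):
        # Most recent first, each page once, attic pages left out.
        seen = []
        for name in reversed(self.log):
            if name not in seen and not self.pages[name].in_attic:
                seen.append(name)
        return seen


wiki = Wiki()
wiki.add("MySQL", "useful notes")
wiki.add("addentry.php", "spam")
wiki.add("Contributors", "names")
wiki.move_to_attic("addentry.php")

print(wiki.all_pages())
print(wiki.recently_revised())
```

Because the page and its versions are kept, a page moved to the attic by mistake could be brought back with restore.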
Once these things are sorted out it will be easier to focus on improving
the wiki content.
Remember the bit about the Broken Window Theory in The Pragmatic
Programmer?
"In the original experiment leading to the 'Broken Window Theory,' an
abandoned car sat for a week untouched. But once a single window was
broken, the car was stripped and turned upside down within hours."
The spam and junk pages on the wiki are a broken window in the Rails
neighbourhood. Please fix it - I'd rather be contributing value than
rolling back spam.
regards
Justin
_______________________________________________
Rails-core mailing list
Rails-core@lists.rubyonrails.org
http://lists.rubyonrails.org/mailman/listinfo/rails-core