I've been meaning for months to summarize the patterns of use and abuse of the Rails wiki. Today I decided to do this by reviewing all the updates in a typical day - and yesterday (Friday 26th May 2006, GMT) looked a good enough candidate.

Getting these figures has involved some hand counting, so they might be a little bit off.

There were 373 changes, affecting 57 pages.

34 pages had valid updates (86 changes), and 25 pages were spammed (248 changes) or despammed (39 changes). Two pages had valid as well as spam-related changes.

Useful editing activity centred on the RealWorldUsage page. This page started overflowing the wiki's 64K limit on 14th March 2006, and has suffered from truncation on most updates since then. Yesterday Larry Gilbert split the content between two new pages, RealWorldUsagePage1 and RealWorldUsagePage2, and began bringing lost content back from past versions. This accounted for 23 changes on 3 pages. The other useful activity averaged about 2 changes each on 31 pages.

The most heavily spammed page, addentry.php, had 174 updates (and passed version 10,000) yesterday. Nobody bothers to remove spam on that page - the page has never had valid content. The other 74 spammings were spread over 24 pages, and were countered by 39 rollbacks. Some pages are more popular than others for spamming; the popular ones are spammed many times more often than they are rolled back, so they show valid content for only a small proportion of the time. Rolling these pages back is tedious, as it involves looking back through many versions for a valid one to roll back to (and the wiki is quite slow).
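
As a cross-check on the hand counts, the figures quoted so far can be reconciled with a few lines of Python. This is only a sketch: the variable and function names are illustrative and do not come from any wiki tooling, and the inputs are simply the numbers stated above.

```python
# Figures quoted above for Friday 26th May 2006 (hand counted, so approximate).
total_changes = 373
total_pages = 57

valid_changes, spam_changes, rollback_changes = 86, 248, 39
valid_pages, spam_related_pages, pages_in_both = 34, 25, 2

addentry_spam, other_spam = 174, 74
real_world_usage_changes, real_world_usage_pages = 23, 3


def reconcile():
    """Derive the totals and ratios implied by the quoted figures."""
    other_valid_pages = valid_pages - real_world_usage_pages
    other_valid_changes = valid_changes - real_world_usage_changes
    return {
        # Valid + spam + rollback changes should add up to all changes.
        "changes_sum": valid_changes + spam_changes + rollback_changes,
        # Pages with valid updates plus spam-related pages, minus the overlap.
        "pages_union": valid_pages + spam_related_pages - pages_in_both,
        # addentry.php spam plus spam on the other pages.
        "spam_sum": addentry_spam + other_spam,
        "other_valid_pages": other_valid_pages,
        "other_valid_avg": other_valid_changes / other_valid_pages,
        # Share of the day's changes that were spam.
        "spam_share": spam_changes / total_changes,
        # Rollbacks per spamming, excluding addentry.php (never cleaned up).
        "rollbacks_per_spam": rollback_changes / other_spam,
    }


if __name__ == "__main__":
    for name, value in reconcile().items():
        print(f"{name}: {value}")
```

On these numbers, spam accounts for roughly two thirds of the day's changes, and there is about one rollback for every two spammings on the pages anyone bothers to clean.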

Useful pages suffering from heavy spamming include MySQL, ActiveRecordAssociations, DeadPages (which would be useful if anyone took any notice of it), Contributors, and OpenSourceProjects. Various individuals' pages are badly hit too, e.g. DimitrySabanin, PabloFlores and DanielVonFange.

IP addresses used by spammers are mostly faked, and the user names they use are often randomly generated. I don't believe the present mechanisms for controlling updates to the wiki are adequate. It's possible that using a CAPTCHA would improve matters, but I'd rather see registration required for wiki editing. Cookies could be used to remember registered users.

As well as stopping bad content getting in, it is high time someone was able to strip the existing junk pages out. I sent Dan a list of about 300 junk pages (pages which had never held any useful content) back in February, and later added those to the DeadPages wiki page, but the pages are still there. (I've noticed occasionally since then that someone has, for the first time, put valid content into a page which has been around for months, so trawling for junk pages would have to be done again when someone is ready to start deleting.) Deletion could be logical (like the Attic in CVS) - so long as it removes pages from normal view, including the All Pages and Recently Revised lists.
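
To make the idea of logical deletion concrete, here is a minimal sketch in Python. It is not how the wiki software is implemented; the `Page` and `Wiki` classes, the `deleted` flag and the method names are all hypothetical. The point is only that a deleted page keeps its content and history but is filtered out of the All Pages and Recently Revised lists, and can be restored if someone later puts valid content into it.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Page:
    name: str
    revised_at: datetime
    deleted: bool = False  # logical deletion flag; nothing is destroyed


class Wiki:
    """Hypothetical in-memory wiki, just enough to show the filtering."""

    def __init__(self):
        self._pages = {}

    def add(self, page):
        self._pages[page.name] = page

    def soft_delete(self, name):
        # Move the page to the "attic": hidden from listings, still stored.
        self._pages[name].deleted = True

    def restore(self, name):
        self._pages[name].deleted = False

    def _visible(self):
        return [p for p in self._pages.values() if not p.deleted]

    def all_pages(self):
        return sorted(p.name for p in self._visible())

    def recently_revised(self, limit=10):
        newest_first = sorted(
            self._visible(), key=lambda p: p.revised_at, reverse=True
        )
        return [p.name for p in newest_first[:limit]]
```

With something like this, a junk page such as addentry.php would vanish from both lists after `soft_delete("addentry.php")`, while `restore` would bring it back unchanged.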

Once these things are sorted out it will be easier to focus on improving the wiki content.

Remember the bit about the Broken Window Theory in the Pragmatic Programmer?

"In the original experiment leading to the 'Broken Window Theory,' an abandoned car sat for a week untouched. But once a single window was broken, the car was stripped and turned upside down within hours."

The spam and junk pages on the wiki are a broken window in the Rails neighbourhood. Please fix it - I'd rather be contributing value than rolling back spam.

regards

  Justin
_______________________________________________
Rails-core mailing list
Rails-core@lists.rubyonrails.org
http://lists.rubyonrails.org/mailman/listinfo/rails-core
