Hi all!

Today I noticed that the spam handling of our MediaWikis was broken in more than one way, so I had to fix it. Here is how it is handled now:

1. The path:

/iglu/Hosts/common/MediaWiki/mediawiki/extensions/SpamBlacklist

contains the MediaWiki SpamBlacklist extension:

http://meta.wikimedia.org/wiki/SpamBlacklist_extension

It was downloaded from the SourceForge CVS using the SourceForge web interface, and then uploaded to Eskimo.

2. The same directory also contains the file "wikimedia_blacklist". To fetch it, I have the following script under /home/shlomif/bin/load_wikimedia_lists.sh :

<<<<<<<
#!/bin/bash
(
wget -O /iglu/Hosts/common/MediaWiki/mediawiki/extensions/SpamBlacklist/wikimedia_blacklist \
    'http://vipe.technion.ac.il/~shlomif/wikimedia_blacklist'
) > /dev/null 2>&1
>>>>>>>

And I have this cronjob, which runs it four times a day (at 03:30, 09:30, 15:30 and 21:30):

<<<<<<<
30 3,9,15,21 * * * /home/shlomif/bin/load_wikimedia_lists.sh
>>>>>>>

This in turn fetches the file from vipe.

3. On vipe I'm fetching it using the following script:

<<<<<<<
#!/bin/bash
(
wget -O "$HOME"/public_html/wikimedia_blacklist \
    'http://meta.wikimedia.org/w/index.php?title=Spam_blacklist&action=raw&sb_ver=1'
) > /dev/null 2>&1
>>>>>>>

It also runs from a cronjob.

The reason I'm fetching it first to vipe and only then to Eskimo is that the old firewall rules did not allow outgoing connections to arbitrary IPs, and the wikimedia.org hostnames are served by a large number of IPs.

4. The central MediaWiki configuration file:

/iglu/Hosts/common/MediaWiki/mediawiki/LocalSettings.php

contains the following lines to load the SpamBlacklist extension and point it at the blacklist file from Wikimedia:

<<<<<<<
require_once( "$IP/extensions/SpamBlacklist/SpamBlacklist.php" );
$wgSpamBlacklistFiles = array(
    "$IP/extensions/SpamBlacklist/wikimedia_blacklist", // Wikimedia's list
);
>>>>>>>

---------------

That's it!

Regards,

    Shlomi Fish

---------------------------------------------------------------------
Shlomi Fish      [EMAIL PROTECTED]
Homepage:        http://www.shlomifish.org/

95% of the programmers consider 95% of the code they did not write, in the bottom 5%.
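
P.S. One thing to keep in mind about the two fetch scripts: "wget -O" creates or truncates its output file before the download has finished, so a failed fetch can leave an empty or partial blacklist in place. Below is a minimal sketch of a more defensive variant, which downloads to a temporary file and only replaces the live list if the result is non-empty. The function names install_list and update_list, the temporary-file naming and the 644 mode are assumptions made for illustration; they are not part of the current setup.

```bash
#!/bin/bash
# Hypothetical helpers (not part of the existing scripts).

# install_list TMP DEST
#   Move TMP over DEST only if TMP is non-empty; otherwise delete TMP,
#   keep the old DEST untouched and return a non-zero status.
install_list()
{
    local tmp="$1"
    local dest="$2"
    if [ -s "$tmp" ]; then
        # mktemp creates files readable by the owner only; assume the
        # web server needs read access to the blacklist.
        chmod 644 "$tmp"
        mv -f "$tmp" "$dest"
    else
        rm -f "$tmp"
        return 1
    fi
}

# update_list URL DEST
#   Download URL to a temporary file next to DEST, then install it.
update_list()
{
    local url="$1"
    local dest="$2"
    local tmp
    # Same directory as DEST, so the final mv is a rename on one filesystem.
    tmp="$(mktemp "${dest}.XXXXXX")" || return 1
    if wget -q -O "$tmp" "$url"; then
        install_list "$tmp" "$dest"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Example call (commented out so that sourcing this file fetches nothing):
# update_list 'http://vipe.technion.ac.il/~shlomif/wikimedia_blacklist' \
#     /iglu/Hosts/common/MediaWiki/mediawiki/extensions/SpamBlacklist/wikimedia_blacklist
```

With something like this, a network failure simply leaves the previous copy of the list in place until the next cron run succeeds.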
