https://bugs.contribs.org/show_bug.cgi?id=10857
--- Comment #1 from Stefan Schulz <[email protected]> ---
Unfortunately this does not work out of the box. The logs showed that no bad
bots had been banned.
Digging into https://github.com/fail2ban/fail2ban/pull/2259 showed that the
bad bot list is *very* old. As mentioned there, I pulled the recommended
updated list (big thanks to mitchellkrogza & thezoggy) and updated
apache-badbots.conf.
I also changed the regex, as recommended by thezoggy, to
failregex = (?i)<HOST> -.*"(GET|POST|HEAD|OPTIONS).*HTTP.*(?:%(badbotscustom)s|%(badbots)s).*"$
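For readers unfamiliar with the %(name)s placeholders: they are filled in from
other options of the filter file before the regex is used. The following is
only a rough sketch of that expansion. It uses Python's standard configparser
as a stand-in for fail2ban's own config reader (an assumption that holds only
for this simple case), and the shortened bot lists, including EvilBot and
OtherBot, are placeholders.

```python
import configparser

# Hypothetical, heavily shortened filter file; the bot names are placeholders.
FILTER_CONF = r"""
[Definition]
badbotscustom = EvilBot|OtherBot
badbots = HTTrack|Nikto
failregex = (?i)<HOST> -.*"(GET|POST|HEAD|OPTIONS).*HTTP.*(?:%(badbotscustom)s|%(badbots)s).*"$
"""

parser = configparser.ConfigParser()
parser.read_string(FILTER_CONF)

# get() applies the %(name)s interpolation before returning the value.
expanded = parser.get("Definition", "failregex")
print(expanded)
```

The printed value shows both lists merged into a single alternation, which is
the form fail2ban-regex reports in its results.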
Running a test now shows:
[root@saturn ~]# fail2ban-regex --print-all-missed /root/test_log /root/apache-badbots.conf
Running tests
=============
Use failregex file : /root/apache-badbots.conf
Use log file : /root/test_log
Use encoding : UTF-8
Results
=======
Failregex: 1441 total
|- #) [# of hits] regular expression
| 1) [1441] (?i)<HOST>
-.*"(CONNECT|GET|POST|HEAD|OPTIONS).*HTTP.*(?:|\bGo-http-client\b|\b360Spider\b|\b404checker\b|\b404enemy\b|\b80legs\b|\bAbonti\b|\bAboundex\b|\bAboundexbot\b|\bAcunetix\b|\bADmantX\b|\bAfD-Verbotsverfahren\b|\bAhrefsBot\b|\bAIBOT\b|\bAiHitBot\b|\bAipbot\b|\bAlexibot\b|\bAlligator\b|\bAllSubmitter\b|\bAlphaBot\b|\bAnarchie\b|\bApexoo\b|\barchive\.org_bot\b|\barquivo\.pt\b|\barquivo-web-crawler\b|\bASPSeek\b|\bAsterias\b|\bAttach\b|\bautoemailspider\b|\bAwarioRssBot\b|\bAwarioSmartBot\b|\bBackDoorBot\b|\bBacklink-Ceck\b|\bbacklink-check\b|\bBacklinkCrawler\b|\bBackStreet\b|\bBackWeb\b|\bBadass\b|\bBandit\b|\bBarkrowler\b|\bBatchFTP\b|\bBattleztar
Bazinga\b|\bBBBike\b|\bBDCbot\b|\bBDFetch\b|\bBetaBot\b|\bBigfoot\b|\bBitacle\b|\bBlackboard\b|\bBlack
Hole\b|\bBlackWidow\b|\bBLEXBot\b|\bBlow\b|\bBlowFish\b|\bBoardreader\b|\bBolt\b|\bBotALot\b|\bBrandprotect\b|\bBrandwatch\b|\bBuddy\b|\bBuiltBotTough\b|\bBuiltWith\b|\bBullseye\b|\bBunnySlippers\b|\bBuzzSumo\b|\bCalculon\b|\bCATExplorador\b|\bCazoodleBot\b|\bCCBot\b|\bCegbfeieh\b|\bCheeseBot\b|\bCherryPicker\b|\bCheTeam\b|\bChinaClaw\b|\bChlooe\b|\bClaritybot\b|\bCliqzbot\b|\bCloud
mapping\b|\bcoccocbot-web\b|\bCogentbot\b|\bcognitiveseo\b|\bCollector\b|\bcom\.plumanalytics\b|\bCopier\b|\bCopyRightCheck\b|\bCopyscape\b|\bCosmos\b|\bCraftbot\b|\bcrawler4j\b|\bcrawler\.feedback\b|\bcrawl\.sogou\.com\b|\bCrazyWebCrawler\b|\bCrescent\b|\bCrunchBot\b|\bCSHttp\b|\bCurious\b|\bCusto\b|\bDatabaseDriverMysqli\b|\bDataCha0s\b|\bDBLBot\b|\bdemandbase-bot\b|\bDemon\b|\bDeusu\b|\bDevil\b|\bDigincore\b|\bDigitalPebble\b|\bDIIbot\b|\bDirbuster\b|\bDisco\b|\bDiscobot\b|\bDiscoverybot\b|\bDispatch\b|\bDittoSpyder\b|\bDnyzBot\b|\bDomainAppender\b|\bDomainCrawler\b|\bDomainSigmaCrawler\b|\bDomainStatsBot\b|\bDotbot\b|\bDownload
Wonder\b|\bDragonfly\b|\bDrip\b|\bDSearch\b|\bDTS
Agent\b|\bEasyDL\b|\bEbingbong\b|\beCatch\b|\bECCP/1\.0\b|\bEcxi\b|\bEirGrabber\b|\bEMail
Siphon\b|\bEMail
Wolf\b|\bEroCrawler\b|\bevc-batch\b|\bEvil\b|\bExabot\b|\bExpress
WebPictures\b|\bExtLinksBot\b|\bExtractor\b|\bExtractorPro\b|\bExtreme Picture
Finder\b|\bEyeNetIE\b|\bEzooms\b|\bfacebookscraper\b|\bFDM\b|\bFemtosearchBot\b|\bFHscan\b|\bFimap\b|\bFirefox/7\.0\b|\bFlashGet\b|\bFlunky\b|\bFoobot\b|\bFreeuploader\b|\bFrontPage\b|\bFyberSpider\b|\bFyrebot\b|\bGalaxyBot\b|\bGenieo\b|\bGermCrawler\b|\bGetintent\b|\bGetRight\b|\bGetWeb\b|\bGigablast\b|\bGigabot\b|\bG-i-g-a-b-o-t\b|\bGo-Ahead-Got-It\b|\bGotit\b|\bGoZilla\b|\bGo\!Zilla\b|\bGrabber\b|\bGrabNet\b|\bGrafula\b|\bGrapeFX\b|\bGrapeshotCrawler\b|\bGridBot\b|\bGT\:\:WWW\b|\bHaansoft\b|\bHaosouSpider\b|\bHarvest\b|\bHavij\b|\bHEADMasterSEO\b|\bheritrix\b|\bHeritrix\b|\bHloader\b|\bHMView\b|\bHTMLparser\b|\bHTTP\:\:Lite\b|\bHTTrack\b|\bHumanlinks\b|\bHybridBot\b|\bIblog\b|\bIDBot\b|\bId-search\b|\bIlseBot\b|\bImage
Fetch\b|\bImage Sucker\b|\bIndeedBot\b|\bIndy
Library\b|\bInfoNaviRobot\b|\bInfoTekies\b|\binstabid\b|\bIntelliseek\b|\bInterGET\b|\bInternet
Ninja\b|\bInternetSeer\b|\binternetVista
monitor\b|\bips-agent\b|\bIria\b|\bIRLbot\b|\bIskanie\b|\bIstellaBot\b|\bJamesBOT\b|\bJbrofuzz\b|\bJennyBot\b|\bJetCar\b|\bJetty\b|\bJikeSpider\b|\bJOC
Web Spider\b|\bJoomla\b|\bJorgee\b|\bJustView\b|\bJyxobot\b|\bKenjin
Spider\b|\bKeyword
Density\b|\bKozmosbot\b|\bLanshanbot\b|\bLarbin\b|\bLeechFTP\b|\bLeechGet\b|\bLexiBot\b|\bLftp\b|\bLibWeb\b|\bLibwhisker\b|\bLightspeedsystems\b|\bLikse\b|\bLinkdexbot\b|\bLinkextractorPro\b|\bLinkpadBot\b|\bLinkScan\b|\bLinksManager\b|\bLinkWalker\b|\bLinqiaMetadataDownloaderBot\b|\bLinqiaRSSBot\b|\bLinqiaScrapeBot\b|\bLipperhey\b|\bLipperhey
Spider\b|\bLitemage_walker\b|\bLmspider\b|\bLNSpiderguy\b|\bLtx71\b|\blwp-request\b|\bLWP\:\:Simple\b|\blwp-trivial\b|\bMagnet\b|\bMag-Net\b|\bmagpie-crawler\b|\bMail\.RU_Bot\b|\bMajestic12\b|\bMajestic-SEO\b|\bMajestic
SEO\b|\bMarkMonitor\b|\bMarkWatch\b|\bMasscan\b|\bMass Downloader\b|\bMata
Hari\b|\bMauiBot\b|\bmeanpathbot\b|\bMeanpathbot\b|\bMeanPath
Bot\b|\bMediatoolkitbot\b|\bmediawords\b|\bMegaIndex\.ru\b|\bMetauri\b|\bMFC_Tear_Sample\b|\bMicrosoft
Data Access\b|\bMicrosoft URL Control\b|\bMIDown tool\b|\bMIIxpc\b|\bMister
PiX\b|\bMJ12bot\b|\bMojeek\b|\bMojolicious\b|\bMorfeus Fucking
Scanner\b|\bMr\.4x3\b|\bMSFrontPage\b|\bMSIECrawler\b|\bMsrabot\b|\bmuhstik-scan\b|\bMusobot\b|\bName
Intelligence\b|\bNameprotect\b|\bNavroad\b|\bNearSite\b|\bNeedle\b|\bNessus\b|\bNetAnts\b|\bNetcraft\b|\bnetEstate
NE Crawler\b|\bNetLyzer\b|\bNetMechanic\b|\bNetSpider\b|\bNettrack\b|\bNet
Vampire\b|\bNetvibes\b|\bNetZIP\b|\bNextGenSearchBot\b|\bNibbler\b|\bNICErsPRO\b|\bNiki-bot\b|\bNikto\b|\bNimbleCrawler\b|\bNimbostratus\b|\bNinja\b|\bNmap\b|\bNPbot\b|\bNutch\b|\boBot\b|\bOctopus\b|\bOffline
Explorer\b|\bOffline
Navigator\b|\bOnCrawl\b|\bOpenfind\b|\bOpenLinkProfiler\b|\bOpenvas\b|\bOpenVAS\b|\bOrangeBot\b|\bOrangeSpider\b|\bOutclicksBot\b|\bOutfoxBot\b|\bPageAnalyzer\b|\bPage
Analyzer\b|\bPageGrabber\b|\bpage
scorer\b|\bPageScorer\b|\bPandalytics\b|\bPanscient\b|\bPapa
Foto\b|\bPavuk\b|\bpcBrowser\b|\bPECL\:\:HTTP\b|\bPeoplePal\b|\bPHPCrawl\b|\bPicscout\b|\bPicsearch\b|\bPictureFinder\b|\bPimonster\b|\bPi-Monster\b|\bPixray\b|\bPleaseCrawl\b|\bplumanalytics\b|\bPockey\b|\bPOE-Component-Client-HTTP\b|\bProbethenet\b|\bProPowerBot\b|\bProWebWalker\b|\bPsbot\b|\bPump\b|\bPxBroker\b|\bPyCurl\b|\bQueryN
Metasearch\b|\bQuick-Crawler\b|\bRankActive\b|\bRankActiveLinkBot\b|\bRankFlex\b|\bRankingBot\b|\bRankingBot2\b|\bRankivabot\b|\bRankurBot\b|\bRealDownload\b|\bReaper\b|\bRebelMouse\b|\bRecorder\b|\bRedesScrapy\b|\bReGet\b|\bRepoMonkey\b|\bRipper\b|\bRocketCrawler\b|\bRogerbot\b|\bRSSingBot\b|\bs1z\.ru\b|\bSalesIntelligent\b|\bSBIder\b|\bScanAlert\b|\bScanbot\b|\bscan\.lol\b|\bScoutJet\b|\bScrapy\b|\bScreaming\b|\bScreenerBot\b|\bSearchestate\b|\bSearchmetricsBot\b|\bSemrush\b|\bSemrushBot\b|\bSEOkicks\b|\bSEOkicks-Robot\b|\bSEOlyticsCrawler\b|\bSeomoz\b|\bSEOprofiler\b|\bseoscanners\b|\bSeoSiteCheckup\b|\bSEOstats\b|\bserpstatbot\b|\bsexsearcher\b|\bShodan\b|\bSiphon\b|\bSISTRIX\b|\bSitebeam\b|\bSiteExplorer\b|\bSiteimprove\b|\bSiteLockSpider\b|\bSiteSnagger\b|\bSiteSucker\b|\bSite
Sucker\b|\bSitevigil\b|\bSlySearch\b|\bSmartDownload\b|\bSMTBot\b|\bSnake\b|\bSnapbot\b|\bSnoopy\b|\bSocialRankIOBot\b|\bSociscraper\b|\bsogouspider\b|\bSogou
web
spider\b|\bSosospider\b|\bSottopop\b|\bSpaceBison\b|\bSpammen\b|\bSpankBot\b|\bSpanner\b|\bsp_auditbot\b|\bSpbot\b|\bSpinn3r\b|\bSputnikBot\b|\bspyfu\b|\bSqlmap\b|\bSqlworm\b|\bSqworm\b|\bSteeler\b|\bStripper\b|\bSucker\b|\bSucuri\b|\bSuperBot\b|\bSuperHTTP\b|\bSurfbot\b|\bSurveyBot\b|\bSuzuran\b|\bSwiftbot\b|\bsysscan\b|\bSzukacz\b|\bT0PHackTeam\b|\bT8Abot\b|\btAkeOut\b|\bTeleport\b|\bTeleportPro\b|\bTelesoft\b|\bTelesphoreo\b|\bTelesphorep\b|\bThe
Intraformant\b|\bTheNomad\b|\bThumbor\b|\bTightTwatBot\b|\bTitan\b|\bToata\b|\bToweyabot\b|\bTracemyfile\b|\bTrendiction\b|\bTrendictionbot\b|\btrendiction\.com\b|\btrendiction\.de\b|\bTrue_Robot\b|\bTuringos\b|\bTurnitin\b|\bTurnitinBot\b|\bTwengaBot\b|\bTwice\b|\bTyphoeus\b|\bUnisterBot\b|\bUpflow\b|\bURLy\.Warning\b|\bURLy
Warning\b|\bVacuum\b|\bVagabondo\b|\bVB
Project\b|\bVCI\b|\bVeriCiteCrawler\b|\bVidibleScraper\b|\bVirusdie\b|\bVoidEYE\b|\bVoil\b|\bVoltron\b|\bWallpapers/3\.0\b|\bWallpapersHD\b|\bWASALive-Bot\b|\bWBSearchBot\b|\bWebalta\b|\bWebAuto\b|\bWeb
Auto\b|\bWebBandit\b|\bWebCollage\b|\bWeb
Collage\b|\bWebCopier\b|\bWEBDAV\b|\bWebEnhancer\b|\bWeb
Enhancer\b|\bWebFetch\b|\bWeb Fetch\b|\bWebFuck\b|\bWeb Fuck\b|\bWebGo
IS\b|\bWebImageCollector\b|\bWebLeacher\b|\bWebmasterWorldForumBot\b|\bwebmeup-crawler\b|\bWebPix\b|\bWeb
Pix\b|\bWebReaper\b|\bWebSauger\b|\bWeb
Sauger\b|\bWebshag\b|\bWebsiteExtractor\b|\bWebsiteQuester\b|\bWebsite
Quester\b|\bWebster\b|\bWebStripper\b|\bWebSucker\b|\bWeb
Sucker\b|\bWebWhacker\b|\bWebZIP\b|\bWeSEE\b|\bWhack\b|\bWhacker\b|\bWhatweb\b|\bWho\.is
Bot\b|\bWidow\b|\bWinHTTrack\b|\bWiseGuys
Robot\b|\bWISENutbot\b|\bWonderbot\b|\bWoobot\b|\bWotbox\b|\bWprecon\b|\bWPScan\b|\bWWW-Collector-E\b|\bWWW-Mechanize\b|\bWWW\:\:Mechanize\b|\bWWWOFFLE\b|\bx09Mozilla\b|\bx22Mozilla\b|\bXaldon_WebSpider\b|\bXaldon
WebSpider\b|\bXenu\b|\bxpymep1\.exe\b|\bYoudaoBot\b|\bZade\b|\bZauba\b|\bzauba\.io\b|\bZermelo\b|\bZeus\b|\bzgrab\b|\bZitebot\b|\bZmEu\b|\bZumBot\b|\bZyBorg\b|\bapplebot\b|\barchive\.org_bot\b|\bBaidu\b|\bBaiduspider\b|\bbingbot\b|\bCFNetwork\b|\bFirefox/21\.0\b|\bia_archiver\b|\bMozilla/4\.76\b|\bMSIE
5\.\b|\bMSIE 6\.0\b|\bQwantify\b|\bSafeDNSBot\b|\bUptimebot\b|\bYahoo\!
Slurp\b|\bYandex\b).*"$
`-
Ignoreregex: 0 total
Date template hits:
|- [# of hits] date format
| [1458] Day(?P<_sep>[-/])MON(?P=_sep)Year[ :]?24hour:Minute:Second(?:\.Microseconds)?(?: Zone offset)?
`-
Lines: 1458 lines, 0 ignored, 1441 matched, 17 missed
[processed in 0.20 sec]
|- Missed line(s):
| 138.68.219.40 - - [01/Jan/2020:01:29:19 +0100] "GET /" 400 456 "-" "-"
| 185.156.177.234 - - [01/Jan/2020:17:23:59 +0100] "\x03" 400 226 "-" "-"
| 5.188.206.50 - - [02/Jan/2020:05:39:20 +0100] "\x03" 400 226 "-" "-"
| 164.52.24.162 - - [03/Jan/2020:08:47:40 +0100] "GET /" 400 456 "-" "-"
| 185.156.177.234 - - [03/Jan/2020:10:50:20 +0100] "\x03" 400 226 "-" "-"
| 173.212.251.146 - - [03/Jan/2020:15:42:21 +0100] "\x16\x03\x01" 400 226 "-" "-"
| 193.188.22.187 - - [03/Jan/2020:20:55:56 +0100] "\x03" 400 226 "-" "-"
| 61.219.11.153 - - [04/Jan/2020:10:06:07 +0100] "-" 408 - "-" "-"
| 45.136.108.64 - - [04/Jan/2020:18:04:31 +0100] "\x03" 400 226 "-" "-"
| 211.157.175.118 - - [04/Jan/2020:23:34:21 +0100] "\x16\x03\x01" 400 226 "-" "-"
| 194.61.24.55 - - [05/Jan/2020:00:43:02 +0100] "\x03" 400 226 "-" "-"
| 54.72.39.203 - - [05/Jan/2020:14:14:25 +0100] "\x16\x03\x01" 400 226 "-" "-"
| 112.66.102.101 - - [05/Jan/2020:14:16:38 +0100] "GET /" 400 456 "-" "-"
| 220.191.249.136 - - [05/Jan/2020:23:36:24 +0100] "\x16\x03\x01" 400 226 "-" "-"
| 172.104.242.173 - - [05/Jan/2020:23:53:54 +0100] "-" 408 - "-" "-"
| 45.136.108.65 - - [06/Jan/2020:08:00:21 +0100] "\x03" 400 226 "-" "-"
| 61.219.11.153 - - [06/Jan/2020:14:48:54 +0100] "GET /" 400 456 "-" "-"
`-
This looks much better.
I added CONNECT to the regex because of the Go-http-client bot. I'm not quite
sure whether it makes sense in this place; maybe someone knows better. It would
also be nice to get rid of the \x03 attempts, but they don't use
CONNECT|GET|POST|HEAD|OPTIONS.
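One possible way to catch those attempts would be a separate pattern that
matches requests consisting only of \xNN escapes answered with status 400. The
sketch below is plain Python, not a tested fail2ban filter: the pattern is
hypothetical, the HOST group is a simplified IPv4-only stand-in for fail2ban's
<HOST> tag, and the sample lines are taken from the missed lines above.

```python
import re

# Simplified IPv4-only stand-in for fail2ban's <HOST> tag (assumption).
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"

# Hypothetical pattern: the quoted request is nothing but \xNN escapes
# and Apache answered with status 400.
PATTERN = r'^<HOST> -.*"(?:\\x[0-9A-Fa-f]{2})+" 400 '
BINARY_PROBE = re.compile(PATTERN.replace("<HOST>", HOST))

# Four of the missed lines from the test run.
lines = [
    r'185.156.177.234 - - [01/Jan/2020:17:23:59 +0100] "\x03" 400 226 "-" "-"',
    r'173.212.251.146 - - [03/Jan/2020:15:42:21 +0100] "\x16\x03\x01" 400 226 "-" "-"',
    r'138.68.219.40 - - [01/Jan/2020:01:29:19 +0100] "GET /" 400 456 "-" "-"',
    r'61.219.11.153 - - [04/Jan/2020:10:06:07 +0100] "-" 408 - "-" "-"',
]

hits = []
for line in lines:
    match = BINARY_PROBE.search(line)
    if match:
        hits.append(match.group("host"))

print(hits)
```

Only the two binary probes are reported; the bare "GET /" and the 408 timeout
lines are left alone. Whether such a rule belongs in apache-badbots.conf or in
a filter of its own is a separate question.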
I also couldn't get the badbotscustom section to work, no idea why. So I simply
added a few more unwanted bots to badbots.
It seems to work for now; I'll report back in a few days.
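One thing that may be worth checking: in the test output the alternation starts
with "(?:|\bGo-http-client\b", which suggests that badbotscustom expanded to an
empty string. An empty alternative matches anything, so the bot list would no
longer restrict the match, and any request containing a listed method and HTTP
would count as a hit. That would be consistent with 1441 of 1458 lines
matching. The sketch below uses plain Python re rather than fail2ban itself;
the simplified HOST group, the two-entry bot list, EvilBot and both sample log
lines are made up for illustration.

```python
import re

# Simplified IPv4-only stand-in for fail2ban's <HOST> tag (assumption).
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"
TEMPLATE = r'(?i)<HOST> -.*"(CONNECT|GET|POST|HEAD|OPTIONS).*HTTP.*(?:%s).*"$'

def build(badbotscustom, badbots):
    """Join both lists the way the failregex does and compile the result."""
    alternation = badbotscustom + "|" + badbots
    return re.compile((TEMPLATE % alternation).replace("<HOST>", HOST))

BADBOTS = r"\bHTTrack\b|\bNikto\b"  # shortened list

# Made-up log lines: an ordinary browser and a scanner.
browser = ('192.0.2.10 - - [01/Jan/2020:12:00:00 +0100] '
           '"GET /index.html HTTP/1.1" 200 1234 "-" '
           '"Mozilla/5.0 (X11; Linux x86_64) Firefox/72.0"')
scanner = ('192.0.2.20 - - [01/Jan/2020:12:00:05 +0100] '
           '"GET /admin HTTP/1.1" 404 196 "-" "Mozilla/5.00 (Nikto/2.1.6)"')

empty_custom = build("", BADBOTS)               # yields (?:|\bHTTrack\b|...)
filled_custom = build(r"\bEvilBot\b", BADBOTS)  # no empty alternative

print(bool(empty_custom.search(browser)))   # ordinary browser is matched
print(bool(filled_custom.search(browser)))  # ordinary browser is not matched
print(bool(filled_custom.search(scanner)))  # scanner is still matched
```

If this is what happens here, putting at least one entry into badbotscustom, or
dropping it from the failregex while it is empty, should make the match count
fall to the lines that really come from listed bots.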