On 11/11/19, Gene Heskett <ghesk...@shentel.net> wrote:
> On Monday 11 November 2019 08:33:13 Greg Wooledge wrote:
... snip ...
>> I *know* I told you to look at your log files, and to turn on
>> user-agent logging if necessary.
>>
>> I don't remember seeing you ever *post* your log files here, not even
>> a single line from a single instance of this bot.  Maybe I missed it.
>
> Only one log file seems to have useful data, the "other..." file, and I
> have posted several single lines here, but here's a few more:
... snip ...
> [11/Nov/2019:12:11:39 -0500] "GET
> /gene/nitros9/level1/coco1_6309/bootfiles/bootfile_covga_cocosdc
> HTTP/1.1" 200 16133 "-" "Mozilla/5.0 (compatible; Daum/4.1;
> +http://cs.daum.net/faq/15/4118.html?faqId=28966)"
>
> I did ask earlier if daum was a bot but no one answered.  They are
> becoming a mite pesky.
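Incidentally, if you want hard numbers on how pesky a bot really is, a
one-liner over the access log gives a per-user-agent hit count.  A quick
sketch, assuming Apache "combined" log format; the sample log path and
entries below are made up for illustration, not from Gene's server:

```shell
# Build a tiny sample log in combined format (two Daum hits, one curl hit).
# Replace /tmp/sample_access.log with your real "other..." log file.
cat > /tmp/sample_access.log <<'EOF'
1.2.3.4 - - [11/Nov/2019:12:11:39 -0500] "GET /a HTTP/1.1" 200 16133 "-" "Mozilla/5.0 (compatible; Daum/4.1; +http://cs.daum.net/faq/15/4118.html?faqId=28966)"
1.2.3.4 - - [11/Nov/2019:12:11:40 -0500] "GET /b HTTP/1.1" 200 100 "-" "Mozilla/5.0 (compatible; Daum/4.1; +http://cs.daum.net/faq/15/4118.html?faqId=28966)"
5.6.7.8 - - [11/Nov/2019:12:12:00 -0500] "GET /c HTTP/1.1" 200 100 "-" "curl/7.64"
EOF

# In combined format, splitting on '"' puts the user-agent in field 6.
awk -F'"' '{print $6}' /tmp/sample_access.log | sort | uniq -c | sort -rn
```

The heaviest-hitting user agents come out on top, which makes it easy to
see at a glance whether Daum is actually the worst offender.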
Google translate can be your friend:
https://translate.google.com/translate?hl=&sl=ko&tl=en&u=https%3A%2F%2Fcs.daum.net%2Ffaq%2F15%2F4118.html

Note they even tell you how to turn off collection:

  I want to automatically exclude documents from my site from web
  document search results.  [Exclusion using a robots.txt file]
  Write the following in Notepad and save it as a robots.txt file
  in the root directory of your site:

    User-agent: DAUM
    Disallow: /

  Using * instead of DAUM prevents the collection robots of all search
  services, not just Daum, from collecting documents.

So let's take a look at what you've got:

$ curl http://geneslinuxbox.net:6309/robots.txt
# $Id: robots.txt 410967 2009-08-06 19:44:54Z oden $
# $HeadURL: svn+ssh://svn.mandriva.com/svn/packages/cooker/apache-conf/current/SOURCES/robots.txt $

# exclude help system from robots
User-agent: googlebot-Image
Disallow: /

User-agent: googlebot
Disallow: /

User-agent: *
Disallow: /manual/

User-agent: *
Disallow: /manual-2.2/

User-agent: *
Disallow: /addon-modules/

User-0agent: *
Disallow: /doc/

User-agent: *
Disallow: /images/

# the next line is a spam bot trap, for grepping the logs. you should
# _really_ change this to something else...
#Disallow: /all_our_e-mail_addresses

# same idea here...
User-agent: *
Disallow: /admin/

# but allow htdig to index our doc-tree
# User-agent: htdig
# Disallow:

User-agent: *
Disallow: stress test

User-agent: stress-agent
Disallow: /

User-agent *
Disallow: /
$

You're missing a ':' on that last record - it should be

User-agent: *
Disallow: /

and I don't think "User-0agent: *" is going to do what you want.

Regards,
Lee
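For what it's worth, that missing-colon class of mistake is easy to catch
mechanically: every non-comment, non-blank robots.txt line should contain
a ':' separating the field from its value.  A quick sketch (the scratch
file paths are arbitrary, and this only checks the field/value syntax, not
the semantics of each rule):

```shell
# Save the corrected rules to a scratch file and syntax-check them.
cat > /tmp/robots.txt <<'EOF'
User-agent: *
Disallow: /
EOF

# check FILE: prints any non-comment, non-blank line lacking a ':'
check() {
    grep -v '^#' "$1" | grep -v '^[[:space:]]*$' | grep -v ':'
}

check /tmp/robots.txt && echo "found malformed lines" \
                      || echo "robots.txt looks OK"

# The colon-less line from the original file fails the same check:
printf 'User-agent *\n' > /tmp/broken.txt
check /tmp/broken.txt    # prints the offending "User-agent *" line
```

Running the same check over the whole original file would also have
flagged nothing for "User-0agent: *" - the syntax is fine there, it's the
field name that's wrong, which is why robots can silently ignore it.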