[ https://jira.duraspace.org/browse/DS-919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=24505#comment-24505 ]
Bram Luyten (@mire) commented on DS-919:
----------------------------------------

2 new observations:
- Not all bots follow all the links on the page they hit, so they might not follow the bot trap link.
- Almost no human user hides their user agent, but many bots do. A simple way to catch more bots would be to flag all hits that don't have a user agent.

> SOLR Statistics: Better detection & avoidance of abusive traffic (including a bot trap)
> ----------------------------------------------------------------------------------------
>
>                 Key: DS-919
>                 URL: https://jira.duraspace.org/browse/DS-919
>             Project: DSpace
>          Issue Type: New Feature
>          Components: Solr
>            Reporter: Bram Luyten (@mire)
>
> The current implementation of bot traffic filtering relies on IP lists. Even
> though using hostnames (as suggested here:
> https://jira.duraspace.org/browse/DS-790 ) could improve the situation, there
> are still forms of abusive traffic we might want to detect and exclude from
> stats.
> The most obvious example here would be repeated hits or downloads coming from
> the same unique source. Another example could be traffic from spiders that
> aren't included in the lists. A way to do this would be to create a bot trap:
> a link hidden behind one pixel, that a human user would never click, but that
> bots might follow. The agents getting to the resource at this link, could be
> listed and dynamically removed from the hit/download counts.
> Some related links:
> http://www.affiliatebeginnersguide.com/sitelogs/bots_hunt.html
> http://www.elxsy.com/2009/06/how-to-identify-and-ban-bots-spiders-crawlers/

-- 
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://jira.duraspace.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

_______________________________________________
Dspace-devel mailing list
Dspace-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-devel
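
To make the heuristics discussed in this thread concrete, here is a rough sketch of how they could be combined. It is not DSpace code. It is written in Python purely for illustration, and the Hit record, the /bot-trap path and the 100-hit threshold are all assumptions made up for the example. A source is treated as suspect if it sends a hit without a user agent, requests the hidden trap link, or produces more hits than the threshold.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set

# Hypothetical values for illustration only; neither comes from DSpace.
TRAP_PATH = "/bot-trap"
MAX_HITS_PER_SOURCE = 100


@dataclass(frozen=True)
class Hit:
    """One logged request (an assumed shape, not the real Solr record)."""
    ip: str
    user_agent: Optional[str]
    path: str


def suspect_sources(hits: Iterable[Hit],
                    trap_path: str = TRAP_PATH,
                    max_hits: int = MAX_HITS_PER_SOURCE) -> Set[str]:
    """Return the IPs that match at least one of the three heuristics."""
    hits = list(hits)
    per_source = Counter(hit.ip for hit in hits)
    suspects: Set[str] = set()
    for hit in hits:
        if not (hit.user_agent or "").strip():
            suspects.add(hit.ip)      # no user agent was sent
        elif hit.path == trap_path:
            suspects.add(hit.ip)      # followed the hidden trap link
        elif per_source[hit.ip] > max_hits:
            suspects.add(hit.ip)      # too many hits from one source
    return suspects


def filter_hits(hits: Iterable[Hit], **kwargs) -> List[Hit]:
    """Drop every hit that comes from a suspect source."""
    hits = list(hits)
    bad = suspect_sources(hits, **kwargs)
    return [hit for hit in hits if hit.ip not in bad]
```

Excluding the whole source rather than only the offending hit is a design choice in this sketch. The comment above only proposes flagging the hits that lack a user agent, so a stricter per-hit filter would be equally consistent with it.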