[ 
https://jira.duraspace.org/browse/DS-919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=24505#comment-24505
 ] 

Bram Luyten (@mire) commented on DS-919:
----------------------------------------

Two new observations:
- Not all bots follow every link on the pages they hit, so they might not 
follow the bot trap link.
- Almost no human users hide their user agent, but many bots do. A simple way to 
catch more bots would be to flag all hits that don't have a user agent.
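
The second observation could be sketched roughly as follows. This is only an
illustration, not DSpace code: the `is_suspect_hit` helper and the shape of the
hit records (`ip`, `user_agent`) are assumptions made for the example.

```python
def is_suspect_hit(user_agent):
    """Return True when a hit carries no usable User-Agent value.

    Hypothetical helper: a missing header (None) and a blank or
    whitespace-only header are both treated as "no user agent".
    """
    return user_agent is None or user_agent.strip() == ""


# Assumed record shape, for illustration only.
hits = [
    {"ip": "192.0.2.10", "user_agent": "Mozilla/5.0"},
    {"ip": "192.0.2.11", "user_agent": ""},
    {"ip": "192.0.2.12", "user_agent": None},
]

# Collect the sources whose hits should be flagged as likely bot traffic.
flagged = [h["ip"] for h in hits if is_suspect_hit(h["user_agent"])]
```

Flagging rather than silently dropping such hits would leave room to review
them later if a legitimate client turns out to send no user agent.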
                
> SOLR Statistics: Better detection & avoidance of abusive traffic (including a 
> bot trap) 
> ----------------------------------------------------------------------------------------
>
>                 Key: DS-919
>                 URL: https://jira.duraspace.org/browse/DS-919
>             Project: DSpace
>          Issue Type: New Feature
>          Components: Solr
>            Reporter: Bram Luyten (@mire)
>
> The current implementation of bot traffic filtering relies on IP lists. Even 
> though using hostnames (as suggested here: 
> https://jira.duraspace.org/browse/DS-790 ) could improve the situation, there 
> are still forms of abusive traffic we might want to detect and exclude from 
> stats.
> The most obvious example here would be repeated hits or downloads coming from 
> the same unique source. Another example could be traffic from spiders that 
> aren't included in the lists. A way to do this would be to create a bot trap: 
> a link hidden behind one pixel that a human user would never click but that 
> bots might follow. The agents that reach the resource at this link could be 
> listed and dynamically removed from the hit/download counts.
> Some related links:
> http://www.affiliatebeginnersguide.com/sitelogs/bots_hunt.html
> http://www.elxsy.com/2009/06/how-to-identify-and-ban-bots-spiders-crawlers/
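
The bot trap idea from the issue description could look something like the
sketch below. It is a minimal illustration under stated assumptions, not the
DSpace implementation: the trap path `/bot-trap`, the record fields (`ip`,
`path`), and both function names are hypothetical.

```python
from collections import Counter

# Hypothetical URL of the hidden trap link; not an actual DSpace path.
TRAP_PATH = "/bot-trap"


def find_trapped_sources(hits):
    """Return the set of source IPs that requested the trap link."""
    return {h["ip"] for h in hits if h["path"] == TRAP_PATH}


def count_hits(hits):
    """Count hits per path, leaving out every hit from a trapped source."""
    trapped = find_trapped_sources(hits)
    return Counter(h["path"] for h in hits if h["ip"] not in trapped)


# Assumed record shape, for illustration only.
hits = [
    {"ip": "192.0.2.1", "path": "/item/1"},
    {"ip": "192.0.2.2", "path": "/item/1"},
    {"ip": "192.0.2.2", "path": "/bot-trap"},
    {"ip": "192.0.2.3", "path": "/item/2"},
]

counts = count_hits(hits)
```

Because the trapped set is computed over all recorded hits, a source is
removed from the counts retroactively, including the hits it made before it
followed the trap link.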

