Hi Tint,

> Our repository is running on 1.6.2 and we have been using Solr for a few
> months now. There seems to be some problem with Solr statistics. Bitstreams
> for some items were downloaded more than a few thousand times within a month
> from the same place. How can I filter out such systematic access (by
> bots/spiders etc.)?
Take a look at the following tool:

  /dspace/bin/dspace stats-util -h
  usage: StatisticsClient
   -b,--reindex-bitstreams          Reindex the bitstreams to ensure we have
                                    the bundle name
   -r,--remove-deleted-bitstreams   While indexing the bundle names, remove
                                    the statistics about deleted bitstreams
   -u,--update-spider-files         Update Spider IP Files from internet into
                                    /dspace/config/spiders
   -f,--delete-spiders-by-flag      Delete Spiders in Solr By isBot Flag
   -i,--delete-spiders-by-ip        Delete Spiders in Solr By IP Address
   -m,--mark-spiders                Update isBot Flag in Solr
   -h,--help                        help
   -o,--optimize                    Run maintenance on the SOLR index

You might need to first register the IP address of the bots in /dspace/config/spiders/.

I hope that helps,

Stuart Lewis
Digital Development Manager
Te Tumu Herenga The University of Auckland Library
Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
Ph: +64 (0)9 373 7599 x81928
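To make that concrete, here is a minimal sketch of the workflow: register the offending IP, flag the matching Solr records as bots, delete them, and then optimize the index. The file name `local-bots.txt` and the example IP are assumptions for illustration; stats-util reads the IP files placed under /dspace/config/spiders.

```shell
# Register the offending IP address in a local spider file
# (the file name "local-bots.txt" is a hypothetical example).
echo "192.0.2.1" >> /dspace/config/spiders/local-bots.txt

# Set isBot=true on existing Solr statistics records that match a spider IP.
/dspace/bin/dspace stats-util -m

# Delete the records now flagged as bots.
/dspace/bin/dspace stats-util -f

# Run maintenance on the Solr index afterwards.
/dspace/bin/dspace stats-util -o
```

Alternatively, `-i` deletes matching records directly by IP address without the intermediate flagging step.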