Hi Steve,

On 18/12/12 19:15, Steve Swinsburg wrote:
> We have identified a number of new spider IP addresses from Google and
> other indexers being responsible for vastly inflating our stats. I've
> created a local spider filter list with the IP addresses and I am
> running the stats updater:
> dspace stats-util -m
>
> to reprocess the stats and mark them appropriately, then will remove
> them via:
> dspace stats-util -f
>
> However the mark is taking hours. Likewise if I go ahead and just
> delete them based on the new rules, via:
> dspace stats-util -i
>
> Is that normal? We only have about 200,000 views to process.

I use dspace stats-util -i every now and again. I went with -i because
the code for -m looked a bit strange to me: the Solr stats don't use IDs
for the documents, which means an update essentially has to be done as
deleting the old document and posting a new one.

With 2.5 million downloads+views, a run takes quite a while, but not
hours. This is on 1.8.2.

I'm not sure whether this is still of interest for you, but I figured
I'd share my experience, just in case.

cheers,
Andrea

-- 
Dr Andrea Schweer
IRR Technical Specialist, ITS Information Systems
The University of Waikato, Hamilton, New Zealand
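
P.S. To illustrate the delete-plus-repost point above, here is a minimal
Python sketch. It is not DSpace or Solr code: the Hit record, the is_bot
field and both function names are hypothetical, and the "index" is just an
in-memory list. It only shows why marking records that have no unique ID
does more work than deleting them.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Hit:
    """Hypothetical usage record; note that it carries no unique ID."""
    ip: str
    item: str
    is_bot: bool = False


def delete_spiders(hits, spider_ips):
    """Delete-only pass: drop every record whose IP is on the spider list."""
    return [h for h in hits if h.ip not in spider_ips]


def mark_spiders(hits, spider_ips):
    """Mark pass without IDs: remove the matching records, then re-add
    modified copies, because there is no key to address an in-place update."""
    matching = [h for h in hits if h.ip in spider_ips]
    remaining = [h for h in hits if h.ip not in spider_ips]  # delete step
    remaining.extend(replace(h, is_bot=True) for h in matching)  # repost step
    return remaining
```

In this toy version the mark pass touches each matching record twice
(once to remove it, once to add the flagged copy), while the delete pass
touches it once. Whether that fully explains the runtimes you are seeing
is a guess on my part.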

