Dear list,

The DSpace 5.x (and presumably 6.x) documentation[0] suggests that it is
possible to mark existing Solr statistics records as being bots or spiders
using the following command:

$ dspace stats-util -m

After spending a while trying to test this with an updated list of user
agents[1], I realized that the feature is only implemented for IPs. As it
stands, the code in StatisticsClient.java marks robots based only on their
IPs, not on their user agents or domains:

else if (line.hasOption('m'))
{
    SolrLogger.markRobotsByIP();
}

Strangely enough, SolrLogger has a markRobotByUserAgent() function that is
never called anywhere in the Java code base. It also seems to be only
partially implemented, as it does not iterate over agents.
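
To illustrate what I mean by iterating over agents, here is a minimal
standalone sketch. It is not DSpace code: the class name, method names, and
sample patterns are all hypothetical, and it assumes the agent list is a
plain-text file with one regular expression per line. It only shows the
matching loop, not how the matching Solr records would be updated.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch, not part of DSpace.
public class AgentMatchSketch {

    // Compile every non-empty line into a case-insensitive regex
    // (assumption: one pattern per line in the agent list).
    static List<Pattern> compilePatterns(List<String> lines) {
        List<Pattern> patterns = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                patterns.add(Pattern.compile(trimmed, Pattern.CASE_INSENSITIVE));
            }
        }
        return patterns;
    }

    // Iterate over all patterns; true if any matches part of the user agent.
    static boolean isSpiderAgent(String userAgent, List<Pattern> patterns) {
        if (userAgent == null) {
            return false;
        }
        for (Pattern pattern : patterns) {
            if (pattern.matcher(userAgent).find()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Made-up sample patterns, for illustration only.
        List<Pattern> patterns =
                compilePatterns(Arrays.asList("bot", "spider", "^curl"));

        String[] agents = {
            "Mozilla/5.0 (compatible; Googlebot/2.1)",
            "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0"
        };
        for (String agent : agents) {
            System.out.println(isSpiderAgent(agent, patterns) + "  " + agent);
        }
    }
}
```

I would expect stats-util -m to need a loop along these lines over the
configured agent patterns, with each match then used to flag the
corresponding statistics records.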

Should I file a bug? This issue affects DSpace 5.x and 6.x for sure.

Regards,

[0]
https://wiki.duraspace.org/display/DSDOC5x/SOLR+Statistics+Maintenance#SOLRStatisticsMaintenance-FilteringandPruningSpiders
[1] https://github.com/atmire/COUNTER-Robots
-- 
Alan Orth
[email protected]
https://picturingjordan.com
https://englishbulgaria.net
https://mjanja.ch
"In heaven all the interesting people are missing." ―Friedrich Nietzsche
