Thank you, Mark. For now I'll just settle for an updated list of spider
agents from COUNTER-Robots¹ (dropping the text file into
dspace/config/spiders/agents seems to work).
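
For anyone wondering what such an agents file amounts to, here is a minimal Java sketch of matching a user agent against a list of patterns. It is not DSpace code. It assumes the file holds one regular expression per line, that blank lines and lines starting with "#" can be skipped, and that matching is case-insensitive; the class name AgentPatternSketch and the sample patterns are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class AgentPatternSketch {

    // Compile one case-insensitive pattern per usable line.
    // Assumption: blank lines and lines starting with '#' carry no pattern.
    public static List<Pattern> compile(List<String> lines) {
        List<Pattern> patterns = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            patterns.add(Pattern.compile(trimmed, Pattern.CASE_INSENSITIVE));
        }
        return patterns;
    }

    // Return true if any pattern is found anywhere in the user agent string.
    public static boolean isSpider(List<Pattern> patterns, String userAgent) {
        if (userAgent == null) {
            return false;
        }
        for (Pattern pattern : patterns) {
            if (pattern.matcher(userAgent).find()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical sample patterns, not taken from the COUNTER-Robots list.
        List<Pattern> patterns = compile(Arrays.asList("# sample", "bot", "^curl", "spider"));
        System.out.println(isSpider(patterns, "Mozilla/5.0 (compatible; Googlebot/2.1)"));
        System.out.println(isSpider(patterns, "Mozilla/5.0 (X11; Linux x86_64) Firefox/70.0"));
    }
}
```

In a real deployment the lines would be read from the file under dspace/config/spiders/agents rather than from an in-memory list.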

Regards,

¹ https://github.com/atmire/COUNTER-Robots

On Tue, Nov 5, 2019 at 4:02 PM Mark H. Wood <[email protected]> wrote:

> On Mon, Nov 04, 2019 at 11:10:25PM +0200, Alan Orth wrote:
> > The DSpace 5.x (and presumably 6.x) documentation[0] suggests that it is
> > possible to mark existing Solr statistics records as being bots or
> > spiders using the following command:
> >
> > $ dspace stats-util -m
> >
> > After trying to test this with an updated list of user agents[1] for a
> > while I realized that the feature is only implemented for IPs. As it
> > stands right now the code in StatisticsClient.java only marks robots
> > based on their IPs, but not on their user agents or domains:
> >
> > else if (line.hasOption('m'))
> > {
> >     SolrLogger.markRobotsByIP();
> > }
> >
> > Strangely enough, SolrLogger has a markRobotByUserAgent() function that
> > is never called anywhere in the Java code base (also it seems to only be
> > partially implemented, as it does not iterate over agents).
> >
> > Should I file a bug? This issue affects DSpace 5.x and 6.x for sure.
>
> https://jira.duraspace.org/browse/DS-2431
>
> There are several Issues related to completing the work on extended
> spider marking and filtering.
>
> --
> Mark H. Wood
> Lead Technology Analyst
>
> University Library
> Indiana University - Purdue University Indianapolis
> 755 W. Michigan Street
> Indianapolis, IN 46202
> 317-274-0749
> www.ulib.iupui.edu
>
> --
> All messages to this mailing list should adhere to the DuraSpace Code of
> Conduct: https://duraspace.org/about/policies/code-of-conduct/
> ---
> You received this message because you are subscribed to the Google Groups
> "DSpace Technical Support" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/dspace-tech/20191105140039.GA30402%40IUPUI.Edu
> .
>


-- 
Alan Orth
[email protected]
https://picturingjordan.com
https://englishbulgaria.net
https://mjanja.ch
"In heaven all the interesting people are missing." ―Friedrich Nietzsche

