https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=30614

--- Comment #8 from David Cook <[email protected]> ---
(In reply to Aleisha Amohia from comment #7)
> This enhancement makes a GET request as a fallback if HEAD returns an error
> status.
> 
> This doesn't fix all of the errors produced by the sample error MARC
> attached - any further support is welcomed.

It's a challenging one. On one hand, we're trying to stop people from using
bots against Koha. On the other hand, we'd like to use bots to do things like
link checking in Koha.

We've had a custom link checker for many years that sits adjacent to Koha, and
what I did was create a hash of configurable domains and check each URL against
that list before the link checker requests it. If the domain matches, we don't
bother checking the URL, because we know the site is just going to block our
link checker anyway.
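
Roughly, the idea looks like the sketch below. It's Python for illustration
rather than our actual code, and the domain names, SKIP_DOMAINS, should_skip
and partition_links are all made up here. I'm also assuming subdomains of a
listed domain should be skipped too, which may or may not suit everyone.

```python
from urllib.parse import urlsplit

# Hypothetical skip list: domains known to block automated link checkers.
# In practice this would come from configuration.
SKIP_DOMAINS = {"example.com", "blocked.example.org"}


def should_skip(url, skip_domains=SKIP_DOMAINS):
    """Return True if the URL's host is a listed domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # Test the host and each parent domain in turn:
    # a.b.example.com -> b.example.com -> example.com -> com
    return any(".".join(labels[i:]) in skip_domains for i in range(len(labels)))


def partition_links(urls, skip_domains=SKIP_DOMAINS):
    """Split URLs into (to_check, skipped) without making any requests."""
    to_check, skipped = [], []
    for url in urls:
        (skipped if should_skip(url, skip_domains) else to_check).append(url)
    return to_check, skipped
```

Only the URLs in to_check would then be handed to the HEAD/GET checks. The
skipped ones can be reported separately as "not checked" rather than as broken.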

Obviously, it means your link checking is never going to be perfect. But it
makes for fewer false positives. 

To get a more accurate result, we'd have to use a headless browser set up to
pretend to be a real human, but then we'd have become the very thing that we
were trying to protect ourselves against. Of course, I think we could argue our
purposes are more positive, but I don't know that the link checker targets
would necessarily agree.

Anyway, that's a lot of text for a small idea. That's what I ended up doing
locally (outside of Koha) to deal with this sort of situation. Not perfect but
it's been fairly practical.
