#853: oaiharvest: better handling of remote OAI sources timing out
-------------------------+-----------------
Reporter: jcaffaro | Owner:
Type: enhancement | Status: new
Priority: major | Milestone:
Component: BibHarvest | Version:
Keywords: oaiharvest |
-------------------------+-----------------
Task #511 improved the handling of exceptions thrown when remote sources
are not available. It even added ''retries'' so that harvesting can still
succeed when remote sources reply with error messages. However, sources
can still time out without being retried:
{{{
2011-11-23 18:01:26 --> source arXiv is going to be updated
2011-11-23 18:02:06 --> an error occurred while harvesting from source
arXiv:
An error occured when trying to read response from export.arxiv.org: timed
out
}}}
The timeout is probably reported by {{{socket.error}}} (for example at
source:modules/bibharvest/lib/oai_harvest_getter.py@55a26d516ec820a5905d9a506419e4215e05573a#L291
as well as on lines 276/285?).
It would be nice if timeouts were handled similarly to HTTP errors (i.e.
trigger other attempts to harvest).
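To illustrate the request, here is a minimal sketch of a retry loop that treats a socket timeout like any other retryable failure. It is not the code from oai_harvest_getter.py: the names {{{fetch_with_retries}}}, {{{fetch}}}, {{{attempts}}}, {{{wait_seconds}}} and {{{sleep}}} are hypothetical and only serve the example.

```python
import socket
import time


def fetch_with_retries(fetch, attempts=3, wait_seconds=10, sleep=time.sleep):
    """Call fetch() until it succeeds or `attempts` calls have failed.

    All names here are hypothetical. socket.timeout is a subclass of
    socket.error, so a timed-out read is retried in the same way as any
    other socket-level failure.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except socket.error:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            sleep(wait_seconds)  # pause before the next attempt
```

The {{{sleep}}} argument exists only so that the wait can be replaced in tests; in the harvester the existing retry loop would presumably be extended instead of wrapped.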
Some comments:
* It might (?) be as simple as adding some statement like '{{{if i <
attempt: time.sleep(10); continue}}}' before the '{{{raise}}}' statement
on line 292.
* Setting a higher timeout with '{{{socket.settimeout(..)}}}' might help
(one should be careful with side-effects, such as the one described at
https://twiki.cern.ch/twiki/bin/view/CDS/PythonGotchas#3_2_Incompatibility_between_SSL),
so the timeout should be reset once the request is done. Note that a
'{{{timeout}}}' parameter was added to the HTTPConnection/HTTPSConnection
classes in Python 2.6 (probably calling '{{{socket.settimeout(..)}}}'
behind the scenes).
* A larger refactoring could have the oaiharvest task re-submitted once
several attempts have failed. The behaviour could be as follows: once the
maximum number of attempts is reached, if the task runs periodically
(with the '-s' option), terminate it gracefully (without updating the
'lastrun' field) and reschedule it to run 5 minutes later. If not handled
properly, this would make the daily execution time of the task drift
slowly. One could think of alternative options to get the harvesting
postponed/retried. Some might also prefer to keep the current behaviour
(failing task) or to simply wait for the next regularly scheduled
execution of the task (for example the next day).
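For the second comment above (raising the timeout, then resetting it), one possible shape is a context manager that restores the previous value even if harvesting fails. This is only a sketch: it assumes the process-wide {{{socket.setdefaulttimeout()}}} is the setting meant, and the name {{{default_socket_timeout}}} is hypothetical.

```python
import contextlib
import socket


@contextlib.contextmanager
def default_socket_timeout(seconds):
    """Temporarily set the process-wide default socket timeout.

    Hypothetical helper: the previous value (possibly None) is restored on
    exit, so code running afterwards does not inherit the longer timeout.
    """
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(seconds)
    try:
        yield
    finally:
        socket.setdefaulttimeout(previous)
```

The harvesting request would then run inside {{{with default_socket_timeout(120):}}}. Since the default timeout is global to the process, passing '{{{timeout}}}' to HTTPConnection/HTTPSConnection (Python 2.6 and later) is probably the less intrusive option where it is available.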
--
Ticket URL: <http://invenio-software.org/ticket/853>
Invenio <http://invenio-software.org>