#511: oaiharvest: better handling of non available remote OAI sources
------------------------+-----------------
Reporter: jcaffaro | Owner:
Type: defect | Status: new
Priority: major | Milestone:
Component: BibHarvest | Version:
Keywords: |
------------------------+-----------------
Currently the oaiharvest task fails if the remote site is not available:
{{{
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/invenio/bibtask.py", line 756, in _task_run
    if callable(task_run_fnc) and task_run_fnc():
  File "/usr/lib64/python2.4/site-packages/invenio/oai_harvest_daemon.py", line 190, in task_run_core
    setspecs=setspecs)
  File "/usr/lib64/python2.4/site-packages/invenio/oai_harvest_daemon.py", line 475, in oai_harvest_get
    sets, secure, user, password, cert_file, key_file)
  File "/usr/lib64/python2.4/site-packages/invenio/oai_harvest_getter.py", line 217, in harvest
    key_file=key_file)
  File "/usr/lib64/python2.4/site-packages/invenio/oai_harvest_getter.py", line 95, in OAI_Session
    secure, user, password, cert_file, key_file)
  File "/usr/lib64/python2.4/site-packages/invenio/oai_harvest_getter.py", line 310, in OAI_Request
    response = conn.getresponse()
  File "/usr/lib64/python2.4/httplib.py", line 872, in getresponse
    response.begin()
  File "/usr/lib64/python2.4/httplib.py", line 336, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.4/httplib.py", line 294, in _read_status
    line = self.fp.readline()
  File "/usr/lib64/python2.4/socket.py", line 325, in readline
    data = recv(1)
timeout: timed out
}}}
The code currently catches httplib.HTTPException, but should probably also
catch socket.error, since the timeout in the traceback above is raised by
the socket layer and is not an HTTPException. In addition, the task could
probably re-submit itself upon failure, though one should be careful not to
harvest the same source multiple times, even partially.
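
A minimal sketch of the idea, not the actual Invenio code: wrap a single
request in a helper that catches both exception families and retries a few
times before giving up. The helper name request_with_retry, its parameters,
and the do_request callable are hypothetical stand-ins for whatever
OAI_Request does internally. It uses the "except ... as" syntax, so it would
need adapting for Python 2.4.

```python
import socket
import time

try:
    import httplib  # Python 2 module name, as seen in the traceback
except ImportError:
    import http.client as httplib  # same module under its Python 3 name


def request_with_retry(do_request, attempts=3, delay=60.0):
    """Call do_request() and retry on HTTP or socket-level failures.

    do_request is a hypothetical zero-argument callable standing in for
    one OAI request; it is not an existing Invenio function.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return do_request()
        except (httplib.HTTPException, socket.error) as err:
            # socket.timeout is a subclass of socket.error, so the
            # "timed out" case from the traceback is handled here too.
            last_error = err
            if attempt + 1 < attempts:
                time.sleep(delay)  # wait before the next try
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```

Retrying at the level of one request, rather than re-submitting the whole
task, is one possible way to avoid fetching already-harvested records again.
Whether that fits the task scheduler is an open question for this ticket.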
--
Ticket URL: <https://invenio-software.org/ticket/511>
Invenio <http://invenio-software.org>