For record 12, I get the following in invenio.err:
[...]
Frame _task_run_core in
/usr/lib64/python2.6/site-packages/invenio/refextract_daemon.py at line 408
-------------------------------------------------------------------------------
405 except Exception, err:
406 write_message("Error: Unable to extract references.\n%s\n" % \
407 (err.args[0]), stream=sys.stdout, verbose=0)
----> 408 raise StandardError
409
410 try:
411 ## Always move contents of file holding xml into a file
-------------------------------------------------------------------------------
_append_recid_collection_list = '<function _append_recid_collection_list at 0x2389050>'
task_info = 'None'
fulltexts_for_collection = "['12:/opt/invenio/var/data/files/g0/24/.text;1']"
err = "KeyError('authors',)"
daemon_cli_opts = "{'kb-journal': 0, 'fulltext':
['12:/opt/invenio/var/data/files/g0/24/.text;1'],
'treat_as_reference_section': 0, 'inspire': 0, 'dictfile': 0, 'xmlfile':
'/opt/invenio/var/tmp/refextract/refextract_task_109.xml',
'extraction-mode': 'ref', 'output_raw': 0, 'kb-report-number': 0,
'verbosity': 0}"
/opt/invenio/var/data/files/g0/24/.text\;1 does contain the text extracted
from the PDF (and the word "authors" appears in it twice).
Any ideas?
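
As a side note, a minimal sketch of why the task log shows only the word
"authors": the handler at lines 405-407 prints err.args[0], and for a
KeyError that is just the missing dictionary key, so it points to a failed
dict lookup rather than to the contents of the text file. The function
names and the sample dict below are hypothetical stand-ins, not refextract
code, and where the real lookup fails is not visible from this trace.

```python
import traceback


def extract_references(record):
    # Hypothetical stand-in for the failing step: a lookup of a key
    # that is not present in the dict raises KeyError('authors').
    return record['authors']


def run_task(record):
    try:
        extract_references(record)
    except Exception as err:
        # Same pattern as the handler in the traceback: only args[0] is
        # reported, so the exception type and location are lost.
        short = "Error: Unable to extract references.\n%s\n" % (err.args[0],)
        # The full traceback keeps the exception type and the raising line.
        full = traceback.format_exc()
        return short, full
    return None, None


short, full = run_task({'title': 'example'})
print(short)
print(full)
```

Logging the full traceback inside the except block (instead of only
err.args[0]) would show which lookup of 'authors' is actually failing.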
On 14/5/2012 3:46 PM, Samuele Kaplun wrote:
Hi Theodoros,
On Monday, 14 May 2012 15:29:37, Theodoros Theodoropoulos wrote:
2012-05-14 14:09:31 --> Task #104 started.
2012-05-14 14:09:31 --> Updating task status to RUNNING.
2012-05-14 14:09:31 --> Error: Unable to extract references.
authors
2012-05-14 14:09:31 --> Updating task status to CERROR.
2012-05-14 14:09:31 --> Task #104 finished. [CERROR]
Is there anything potentially interesting in /opt/invenio/var/log/invenio.err?
Cheers!
Sam