#1063: Error when converting from .docx to text upon
-------------------------+-------------------------------------------------
Reporter: skaplun | Owner: skaplun
Type: defect | Status: new
Priority: critical | Component: WebSubmit
Version: v1.0.0 | Keywords: libreoffice openoffice docx
| textification fulltext
-------------------------+-------------------------------------------------
How to reproduce: with a .docx file in /tmp (CDS-list.docx below), run:
{{{
sudo -u www-data python /opt/invenio/lib/python/invenio/websubmit_file_converter.py -c CDS-list.docx -f txt -o prova.txt -d
DEBUG:root:Executing: ('sudo', '-u', 'nobody', '/usr/bin/python', '-c',
'import os; open(os.path.join(\'/opt/invenio/var/tmp/ooffice-tmp-files\',
"test"), "w").write(\'1337334979.47\')')
DEBUG:root:Preparing IO for input=CDS-list.docx, output=prova.txt,
output_ext=.text
DEBUG:root:IO prepared: input_file=/tmp/CDS-list.docx,
output_file=/tmp/prova.txt, working_dir=None
DEBUG:root:Executing: ('sudo', '-u', 'nobody', '/usr/bin/python',
'/opt/invenio/lib/python/invenio/unoconv.py', '-v', '-s', 'localhost',
'-p', 2002, '--outputfile',
'/opt/invenio/var/tmp/ooffice-tmp-files/tmpLLYteh.text', '-f', 'text',
'/tmp/CDS-list.docx')
Exception AttributeError: "'Process' object has no attribute
'_Process__process'" in <bound method Process.__del__ of
<invenio.asyncproc.Process object at 0x32f43d0>> ignored
ERROR: Unexpected error when converting from CDS-list.docx to .txt (<type
'exceptions.TypeError'>): execv() arg 2 must contain only strings
}}}
In {{{invenio.err}}} one can find:
{{{
>> 2012-05-18 11:56:20 -> TypeError: execv() arg 2 must contain only
strings
>>> User details
No client information available
>>> Traceback details
Traceback (most recent call last):
File "/opt/invenio/lib/python/invenio/websubmit_file_converter.py", line
378, in convert_file
return converter(current_input, current_output, **final_params)
File "/opt/invenio/lib/python/invenio/websubmit_file_converter.py", line
422, in unoconv
execute_command('sudo', '-u', CFG_OPENOFFICE_USER,
CFG_PATH_OPENOFFICE_PYTHON, os.path.join(CFG_PYLIBDIR, 'invenio',
'unoconv.py'), '-v', '-s', CFG_OPENOFFICE_SERVER_HOST, '-p',
CFG_OPENOFFICE_SERVER_PORT, '--outputfile', tmpfile, '-f', unoconv_format,
input_file)
File "/opt/invenio/lib/python/invenio/websubmit_file_converter.py", line
941, in execute_command
res, stdout, stderr = run_process_with_timeout(args,
cwd=argd.get('cwd'), filename_out=argd.get('filename_out'),
filename_err=argd.get('filename_err'))
File "/usr/local/lib/python2.7/dist-packages/invenio/shellutils.py",
line 226, in run_process_with_timeout
the_process = Process(args, shell=shell, stdin=stdin, cwd=cwd)
File "/usr/local/lib/python2.7/dist-packages/invenio/asyncproc.py", line
138, in __init__
self.__process = subprocess.Popen(*params, **kwparams)
File "/usr/lib/python2.7/subprocess.py", line 679, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1249, in _execute_child
raise child_exception
TypeError: execv() arg 2 must contain only strings
}}}
The error is fixed in the master branch, so a suitable bugfix needs to be
backported to 1.0.
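
The likely cause is visible in the debug output above: the port is passed as
the integer 2002 rather than the string '2002', and subprocess rejects
non-string arguments. The following is only a minimal sketch of one possible
fix, not necessarily the change made in master. The helper name
stringify_args is hypothetical and does not exist in Invenio; the idea is
simply to convert every argument to a string before it reaches
subprocess.Popen.

```python
import subprocess
import sys


def stringify_args(args):
    """Return a copy of args with every element converted to str.

    Hypothetical helper for illustration; not part of Invenio.
    """
    return [str(arg) for arg in args]


# The port is an int here, as in the debug output above ('-p', 2002).
port = 2002  # stands in for CFG_OPENOFFICE_SERVER_PORT
command = [sys.executable, '-c', 'import sys; print(sys.argv[1:])',
           '-p', port]

# Passing the int through unchanged reproduces the class of failure
# reported in this ticket: subprocess raises TypeError.
try:
    subprocess.Popen(command)
except TypeError as err:
    print("unconverted arguments rejected: %s" % err)

# After conversion, the child process starts and receives '2002'.
output = subprocess.check_output(stringify_args(command))
print(output)
```

Doing the conversion once inside execute_command would cover every caller,
whereas wrapping only CFG_OPENOFFICE_SERVER_PORT in str() at the call site in
unoconv would be the smaller change for a backport.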
--
Ticket URL: <http://invenio-software.org/ticket/1063>
Invenio <http://invenio-software.org>