tags 574731 + moreinfo
thanks

* Michael Below <[email protected]>, 2010-03-20, 16:41:
> I am trying to scan and recognize some files that contain text and
> graphics. I am scanning them using xsane and save them as PDF files
> (120-200 pages, 300dpi greyscale). Then I turn them into DJVU files
> using pdf2djvu --monochrome and try to ocr them with
> ocrodjvu --language=deu
>
> The result is usually an error like this:
>
> ocroscript: /usr/share/ocropus/scripts//recognize.lua:183: narray: index out of range
> Exception in thread Thread-5:
> Traceback (most recent call last):
>   File "/usr/lib/python2.5/threading.py", line 486, in __bootstrap_inner
>     self.run()
>   File "/usr/lib/python2.5/threading.py", line 446, in run
>     self.__target(*self.__args, **self.__kwargs)
>   File "/usr/share/ocrodjvu/lib/_ocrodjvu.py", line 443, in page_thread
>     result = self.process_page(page)
>   File "/usr/share/ocrodjvu/lib/_ocrodjvu.py", line 428, in process_page
>     html_file.close()
>   File "/usr/lib/python2.5/contextlib.py", line 33, in __exit__
>     self.gen.throw(type, value, traceback)
>   File "/usr/share/ocrodjvu/lib/_ocrodjvu.py", line 189, in recognize
>     ocropus.wait()
>   File "/usr/share/ocrodjvu/lib/ipc.py", line 58, in wait
>     raise CalledProcessError(return_code, self.__command)
> CalledProcessError: Command 'ocroscript' returned non-zero exit status 1
> ocroscript: /usr/share/ocropus/scripts//recognize.lua:183: narray: index out of range
> ocroscript: /usr/share/ocropus/scripts//recognize.lua:183: narray: index out of range
> ocroscript: /usr/share/ocropus/scripts//recognize.lua:183: narray: index out of range

Thanks for your bug report.

The fact that the document has many pages is most likely irrelevant. Could you try to determine which page is causing this crash, save the offending page as a separate document, and send it to me in a private mail so that I can investigate further?

Thanks in advance.
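
In case it helps with narrowing things down, here is a rough sketch of one way to find the offending page: run the OCR on each page separately and note which ones fail. It is untested against this report and rests on a few assumptions: that your ocrodjvu version accepts --pages and --dry-run, and that it exits with a non-zero status when recognition of a page fails. The function names and the file name scan.djvu are made up for illustration.

```python
import subprocess


def count_pages(djvu_path):
    # `djvused -e n FILE` prints the number of pages in the document.
    output = subprocess.check_output(["djvused", "-e", "n", djvu_path])
    return int(output.decode().strip())


def ocr_page_succeeds(djvu_path, page, language="deu"):
    # Assumption: ocrodjvu supports --pages and --dry-run, and returns a
    # non-zero exit status when recognition of the selected page fails.
    command = [
        "ocrodjvu",
        "--language=" + language,
        "--pages=%d" % page,
        "--dry-run",
        djvu_path,
    ]
    return subprocess.call(command) == 0


def find_failing_pages(page_count, page_succeeds):
    # Try each page on its own and collect the 1-based numbers that fail.
    return [page for page in range(1, page_count + 1) if not page_succeeds(page)]


# Hypothetical usage (scan.djvu is a placeholder file name):
#   total = count_pages("scan.djvu")
#   bad = find_failing_pages(total, lambda p: ocr_page_succeeds("scan.djvu", p))
#   print(bad)
```

Once a failing page number is known, something like `djvused scan.djvu -e 'select 57; save-page page57.djvu'` should write that page to a separate file (57 is only an example).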
-- Jakub Wilk

