clone 572081 -1
retitle 572081 ocrodjvu: assumes that foreground/background separation is correct
reassign -1 cuneiform 0.7.0+dfsg-5
retitle -1 cuneiform: fails with an unhelpful error message on empty pages
thanks

* jsb...@mimuw.edu.pl <jsb...@mimuw.edu.pl>, 2010-03-01, 14:46:
ocrodjvu --engine cuneiform --language pol --clear-text -o out.djvu in.djvu
Processing 'in.djvu':
- Page #1
PUMA_XFinalrecognition failed.
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.5/threading.py", line 486, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.5/threading.py", line 446, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/share/ocrodjvu/lib/_ocrodjvu.py", line 424, in page_thread
    result = self.process_page(page)
  File "/usr/share/ocrodjvu/lib/_ocrodjvu.py", line 391, in process_page
    with self._engine.recognize(pfile, language=self._options.language, details=self._options.details) as result:
  File "/usr/share/ocrodjvu/lib/_ocrodjvu.py", line 210, in recognize
    return cuneiform.recognize(pbm_file, language)
  File "/usr/share/ocrodjvu/lib/cuneiform.py", line 72, in recognize
    worker.wait()
  File "/usr/share/ocrodjvu/lib/ipc.py", line 58, in wait
    raise CalledProcessError(return_code, self.__command)
CalledProcessError: Command 'cuneiform' returned non-zero exit status 1

There are two problems here:

1. ocrodjvu assumes that foreground/background separation has been done correctly and feeds the OCR engine page masks only. However, in the case of this document, all page masks are solid white.
2. Cuneiform fails (with an unhelpful error message) on a completely blank page.
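
To illustrate the first point, here is a minimal sketch of how a caller could detect a solid-white mask before handing it to an OCR engine. It is not ocrodjvu's actual code: the function name is_blank_pbm is hypothetical, and the sketch assumes the mask is a raw (P4) PBM image held in memory, with no '#' comments in its header.

```python
import re


def is_blank_pbm(data):
    """Return True if a raw (P4) PBM image contains no black pixels.

    Hypothetical helper for illustration; assumes a comment-free header.
    """
    # Header: magic "P4", width, height, then exactly one whitespace byte.
    match = re.match(rb'P4\s+(\d+)\s+(\d+)\s', data)
    if match is None:
        raise ValueError('not a raw PBM (P4) image')
    width, height = int(match.group(1)), int(match.group(2))
    row_bytes = (width + 7) // 8  # each row is padded to a whole byte
    pixels = data[match.end():]
    if len(pixels) < row_bytes * height:
        raise ValueError('truncated PBM data')
    # In PBM a set bit is a black pixel. Bits beyond the image width in
    # the last byte of a row are padding and must be ignored.
    pad = row_bytes * 8 - width
    last_mask = (0xFF << pad) & 0xFF
    for y in range(height):
        row = pixels[y * row_bytes:(y + 1) * row_bytes]
        if any(row[:-1]) or (row and row[-1] & last_mask):
            return False
    return True
```

A caller could use such a check to skip the engine for blank masks (or fall back to rendering the full page) instead of letting the engine exit with a non-zero status.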
-- Jakub Wilk