Bug#574731: ocrodjvu: crashes on > 100 page file

Michael Below Sat, 20 Mar 2010 09:09:31 -0700

Package: ocrodjvu
Version: 0.4.2-1
Severity: normal


I am trying to scan and OCR some documents that contain text and graphics.
I scan them with xsane and save them as PDF files (120-200 pages, 300 dpi
greyscale). Then I convert them to DjVu files using pdf2djvu --monochrome
and try to OCR them with ocrodjvu --language=deu.
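
As a rough sketch, the workflow described above corresponds to two commands
run in sequence. The Python helper below only assembles those command lines.
The file names, the helper names build_pipeline and run_pipeline, and the -o
output options are assumptions for illustration and were not part of the
original report; only --monochrome and --language=deu are taken from it.

```python
import subprocess
from pathlib import Path


def build_pipeline(pdf_path, language="deu"):
    """Return the two command lines (argv lists) for PDF -> DjVu -> OCR.

    Hypothetical helper: output file names are derived from the input
    name, and the -o options are assumed, not quoted from the report.
    """
    pdf = Path(pdf_path)
    djvu = pdf.with_suffix(".djvu")                  # e.g. scan.djvu
    ocr_out = pdf.with_name(pdf.stem + "-ocr.djvu")  # e.g. scan-ocr.djvu

    # Step 1: convert the scanned PDF to a monochrome DjVu file.
    convert = ["pdf2djvu", "--monochrome", "-o", str(djvu), str(pdf)]
    # Step 2: run OCR on the DjVu file with the requested language.
    recognize = ["ocrodjvu", "--language=" + language,
                 "-o", str(ocr_out), str(djvu)]
    return [convert, recognize]


def run_pipeline(pdf_path, language="deu"):
    """Run both steps in order; raises CalledProcessError on failure."""
    for argv in build_pipeline(pdf_path, language):
        subprocess.run(argv, check=True)
```

Calling run_pipeline("scan.pdf") would run both steps and stop at the first
command that exits with a non-zero status.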

The result is usually an error like this:

ocroscript: /usr/share/ocropus/scripts//recognize.lua:183: narray: index out of range
Exception in thread Thread-5:
Traceback (most recent call last):
  File "/usr/lib/python2.5/threading.py", line 486, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.5/threading.py", line 446, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/share/ocrodjvu/lib/_ocrodjvu.py", line 443, in page_thread
    result = self.process_page(page)
  File "/usr/share/ocrodjvu/lib/_ocrodjvu.py", line 428, in process_page
    html_file.close()
  File "/usr/lib/python2.5/contextlib.py", line 33, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/share/ocrodjvu/lib/_ocrodjvu.py", line 189, in recognize
    ocropus.wait()
  File "/usr/share/ocrodjvu/lib/ipc.py", line 58, in wait
    raise CalledProcessError(return_code, self.__command)
CalledProcessError: Command 'ocroscript' returned non-zero exit status 1

ocroscript: /usr/share/ocropus/scripts//recognize.lua:183: narray: index out of range
ocroscript: /usr/share/ocropus/scripts//recognize.lua:183: narray: index out of range
ocroscript: /usr/share/ocropus/scripts//recognize.lua:183: narray: index out of range



-- System Information:
Debian Release: squeeze/sid
  APT prefers testing
  APT policy: (500, 'testing'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.32-3-amd64 (SMP w/4 CPU cores)
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages ocrodjvu depends on:
ii  djvulibre-bin                 3.5.22-8   Utilities for the DjVu image forma
ii  python                        2.5.4-9    An interactive high-level object-o
ii  python-argparse               1.1-1      optparse-inspired command-line par
ii  python-djvu                   0.1.17-1   Python support for the DjVu image 
ii  python-lxml                   2.2.4-1+b1 pythonic binding for the libxml2 a
ii  python-support                1.0.6.1    automated rebuilding support for P

Versions of packages ocrodjvu recommends:
ii  ocropus                       0.3.1-2    document analysis and OCR system
ii  python-pyicu                  0.9-2      Python extension wrapping the ICU 
ii  tesseract-ocr                 2.04-2     Command line OCR tool

Versions of packages ocrodjvu suggests:
ii  cuneiform                   0.7.0+dfsg-5 multi-language OCR system

-- no debconf information


