Package: gscan2pdf
Version: 1.0.3-1
Severity: normal

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hi,

When running gscan2pdf --debug and trying to do OCR, I got:

INFO - echo tessedit_create_hocr 1 > hocr.config;tesseract /tmp/TZHdnZXQb3.tif /tmp/7DDKyZd_Rl -l deu +hocr.config 2> /dev/null;rm hocr.config
Tesseract Open Source OCR Engine v3.02 with Leptonica
utf8 "\xC0" does not map to Unicode at /usr/share/perl5/Gscan2pdf.pm line 921, <> chunk 1.
*** unhandled exception in callback:
***   Malformed UTF-8 character (fatal) at /usr/share/perl5/Gscan2pdf/Page.pm line 114.
***  ignoring at /usr/bin/gscan2pdf line 10729.

I could reproduce this with several documents and resolutions. When I ran
tesseract manually on the .tif file, I did indeed see non-UTF-8 characters in
the HTML it produced.
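
For anyone trying to confirm the same symptom, here is a minimal sketch of how
the offending bytes in the tesseract output could be located. It is written in
Python rather than Perl and is not taken from gscan2pdf; the function name
find_invalid_utf8 is hypothetical and chosen only for illustration.

```python
def find_invalid_utf8(data: bytes):
    """Return a list of (offset, byte_value) for bytes that break UTF-8 decoding.

    Hypothetical helper for illustration only; not part of gscan2pdf.
    """
    bad = []
    pos = 0
    while pos < len(data):
        try:
            # Try to decode the remainder; success means no further bad bytes.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as err:
            # err.start is relative to the slice, so shift it by pos.
            offset = pos + err.start
            bad.append((offset, data[offset]))
            # Skip the offending byte and keep scanning.
            pos = offset + 1
    return bad
```

Reading the hOCR file in binary mode and passing its contents to this function
would list each offset where decoding fails, for example the "\xC0" byte
reported above. If the goal is only to keep processing, decoding with
data.decode("utf-8", errors="replace") substitutes U+FFFD for such bytes
instead of raising an error.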

Regards,

Thomas Koch

- -- System Information:
Debian Release: wheezy/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 3.2.0-2-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages gscan2pdf depends on:
ii  graphicsmagick-imagemagick-compat [imagemagick]  1.3.12-1.1
ii  libconfig-general-perl                           2.50-1
ii  libgoo-canvas-perl                               0.06-1+b2
ii  libgtk2-ex-simple-list-perl                      0.50-2
ii  libgtk2-imageview-perl                           0.05-1+b2
ii  libhtml-parser-perl                              3.69-2
ii  liblocale-gettext-perl                           1.05-7+b1
ii  liblog-log4perl-perl                             1.29-1
ii  libpdf-api2-perl                                 2.019-1
ii  libproc-processtable-perl                        0.45-3+b1
ii  libreadonly-perl                                 1.03-3
ii  librsvg2-common                                  2.36.1-1
ii  libsane-perl                                     0.05-1
ii  libset-intspan-perl                              1.16-1
ii  libtiff-tools                                    4.0.1-5
ii  perlmagick                                       8:6.7.4.0-5
ii  sane-utils                                       1.0.22-7.1

Versions of packages gscan2pdf recommends:
ii  cuneiform                  <none>
ii  djvulibre-bin              3.5.25.2-4
ii  gocr                       <none>
ii  libgtk2-ex-podviewer-perl  0.18-1
ii  sane                       1.0.14-9
ii  tesseract-ocr              3.02.01-4
ii  unpaper                    0.3-1
ii  xdg-utils                  1.1.0~rc1+git20111210-6

gscan2pdf suggests no packages.

- -- no debconf information

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iQIcBAEBCAAGBQJPnTTyAAoJEAf8SJEEK6ZaGlAP/28TRSC7BbGOPQ0OGa2Gmu49
iqHgd2QfABV4cZQJj8tvk663vsPiEO5Sc6go5AWOobSWGMyERUQo5ZkOLSvEWjTv
ZNumcwSCa/84H7x4K/t9aQljdX9p/hued9vPkGwRU0eH1AHc0bkzpfwo3shJ6F2g
AMyJNuLxWPm0D8Mh4/Dil/usJennCaxOAN5BFVUmn2Vuhd79xRDJSWW9eF6IFxdo
4knxmDC20Y7YZ2rBVegFiA++BGjN0dgYsgZINtMWvOeHtnx6SLlxf4yN1/CMnEQw
5TfImCZrI/+alUGu1KTSQmfgeVK+mteDiZFND1+aLjbTgIiLZq+ghFGpeebVAIPj
5kiZn2oCCVLEbQCKuYL00RH2MoRcEeWBS9rv250xM/fxPukM8ahMX6NFwSf4ZSA/
43zXDMA3oyNbX1PVgDp0MgoU7l7mKcZyefJvkUyaSNo4BzK02XCtvcp/5atXGehv
XW434tBb/WEr9y0UESLo54BnAiCCri4FGv/6KIa+Cuw26WpNh8vFRuMCZQyMMGrV
bSpuiu96B1VLvcNj+3gK4aNXgetsFO10V6u7XJ+t2W8XUTdU+lKFE55QxfKsQQvv
ttPr3P1kMrYC/sEXyzxq4Hk4tvvLQFe9I/qC5liUgymzpe9qwsrJFq5AQ2vbgKgq
2MQ3FAoqhN6szCs9yYdY
=6uA9
-----END PGP SIGNATURE-----


