** Attachment added: "test.tif" https://bugs.launchpad.net/bugs/912648/+attachment/2659608/+files/test.tif
** Description changed:

+ Summary:
+ wget -O test.tif https://bugs.launchpad.net/ubuntu/+source/tesseract/+bug/912648/+attachment/2659608/+files/test.tif && tesseract test.tif testout
+ 
+ Expected results: Run to completion. Actual results: Aborts with an
+ assertion error.
+ 
+ --------------------------------
+ 
  tesseract consistently crashes with the following assertion error:
  
  tesseract: unicharset.cpp:76: const UNICHAR_ID UNICHARSET::unichar_to_id(const char*, int) const: Assertion `ids.contains(unichar_repr, length)' failed.
  Aborted
  
  ...when passed certain files generated by ocrfeeder. Attached is a
  sample file captured from an ocrfeeder run.
  
  To reproduce, run
  tesseract <attached sample tif file> outputfilename
  
  ProblemType: Bug
  DistroRelease: Ubuntu 11.10
  Package: tesseract-ocr 2.04-2.1ubuntu1
  ProcVersionSignature: Ubuntu 3.0.0-14.23-generic 3.0.9
  Uname: Linux 3.0.0-14-generic x86_64
  NonfreeKernelModules: fglrx
  ApportVersion: 1.23-0ubuntu4
  Architecture: amd64
  Date: Thu Jan  5 22:32:11 2012
  InstallationMedia: Xubuntu 11.10 "Oneiric Ocelot" - Release amd64 (20111012)
  ProcEnviron:
-  PATH=(custom, user)
-  LANG=en_US.UTF-8
-  SHELL=/bin/bash
+  PATH=(custom, user)
+  LANG=en_US.UTF-8
+  SHELL=/bin/bash
  SourcePackage: tesseract
  UpgradeStatus: No upgrade log present (probably fresh install)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/912648

Title:
  crash with certain tif inputs: unicharset.cpp:76: const UNICHAR_ID
  UNICHARSET::unichar_to_id(const char*, int) const: Assertion
  `ids.contains(unichar_repr, length)' failed.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/tesseract/+bug/912648/+subscriptions
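
For anyone scripting the reproduction, here is a minimal sketch of a wrapper that separates "aborted on an assertion" from other failures. It is not part of the original report: the function name check_abort is hypothetical, and it assumes a Linux system where a failed assert() ends the process with SIGABRT (signal 6), which bash and dash report as exit status 134.

```shell
#!/bin/sh
# Hypothetical helper, not from the bug report: run a command and say
# whether it was killed by SIGABRT, the signal raised by a failed assert().
check_abort() {
    status=0
    "$@" || status=$?
    # Assumption: the shell reports "killed by signal N" as 128+N,
    # and SIGABRT is signal 6 (true on Linux), giving 134.
    if [ "$status" -eq 134 ]; then
        echo "aborted (SIGABRT)"
    elif [ "$status" -eq 0 ]; then
        echo "ran to completion"
    else
        echo "failed with exit status $status"
    fi
    return "$status"
}

# Usage, assuming test.tif was downloaded as in the summary:
#   check_abort tesseract test.tif testout
```

With the attached sample, the expected result from the report corresponds to "ran to completion" and the actual result to "aborted (SIGABRT)".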
