What about osd.traineddata and config files? Are they in your tessdata
directory?
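
A quick way to check is sketched below. The helper name check_tessdata and the list of entries it looks for are my own assumptions for illustration, not anything tesseract ships, so adjust them to whatever you expect in that directory.

```shell
#!/bin/sh
# Report which of the expected entries exist in a tessdata directory.
# The entry list is an assumption for illustration; edit it as needed.
check_tessdata() {
    dir="$1"
    missing=0
    for entry in osd.traineddata configs tessconfigs; do
        if [ -e "$dir/$entry" ]; then
            echo "found:   $entry"
        else
            echo "missing: $entry"
            missing=1
        fi
    done
    # Exit status 0 only if every entry was found.
    return "$missing"
}
```

Calling it as `check_tessdata /usr/share/tessdata` prints one line per entry and returns non-zero if anything is missing.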

- excuse the brevity, sent from mobile

On 01-Jan-2017 9:22 PM, <ruediger.k...@deutschebahn.com> wrote:

> Hi all,
>
> I'm in a time-critical situation. I want to deliver new software to our
> customer on 5th January 2017.
> While things worked well in the test environment, problems came up after
> deploying the software to the production environment.
> Before describing the failure in detail, here is some information about the
> setup and the environment.
>
>
> Environment & Installation
>
> *Operating System: Suse Enterprise Linux Server 12 SP 1*
> $ uname -a
> Linux 3.12.62-60.64.8-default #1 SMP Tue Oct 18 12:21:38 UTC 2016
> (42e0a66) x86_64 x86_64 x86_64 GNU/Linux
> Since this environment is managed, I cannot update any system libraries
> like glibc etc.
> *So the newest and only officially supported version of tesseract that I
> found for "Suse 12 SP1 x86_64" is 3.02*
>
> *Installed Packages:*
> libgif4-4.1.6-34.1.1.x86_64.rpm
> liblept3-1.69-16.1.x86_64.rpm
> libtesseract3-3.02.02-3.2.1.x86_64.rpm
> libwebp4-0.3.1-34.1.x86_64.rpm
> tesseract-3.02.02-59.1.x86_64.rpm
>
> *tesseract version*
> $ tesseract -v
> tesseract 3.02.02
>     leptonica-1.69
>         libgif 4.1.6 : libjpeg 8d : libpng 1.5.22 : libtiff 4.0.6 : zlib
> 1.2.8
>
> *Release details*
> $ zypper info tesseract
> Information for package tesseract:
> ----------------------------------
> Repository: @System
>
>
> *Name: tesseract*
> *Version: 3.02.02-59.1*
> *Arch: x86_64*
> Vendor: obs://build.opensuse.org/home:koprok
> Support Level: unknown
> Installed: Yes
> Status: up-to-date
> Installed Size: 3.8 MiB
> Summary: Open Source OCR Engine
> Description: […]
>
>
> Traindata & Languages
>
> *Traindata*
> The traindata has been manually downloaded from GitHub
> <https://github.com/tesseract-ocr/tesseract/wiki/Data-Files#data-files-for-version-302>:
>
>    - https://sourceforge.net/projects/tesseract-ocr-alt/files/tesseract-ocr-3.02.eng.tar.gz/download
>    - https://sourceforge.net/projects/tesseract-ocr-alt/files/tesseract-ocr-3.02.deu.tar.gz/download
>
> *The files have been placed in /usr/share/tessdata/*
> $ ls -la /usr/share/tessdata/
> drwxr-xr-x 1 root root      230 Dec 31 16:37 configs/
> -rw-r--r-- 1 root root  2438081 Dec 30 15:31 deu.traineddata
> -rw-r--r-- 1 root root   171918 Dec 30 20:16 eng.cube.bigrams
> -rw-r--r-- 1 root root       38 Dec 30 20:16 eng.cube.fold
> -rw-r--r-- 1 root root      181 Dec 30 20:16 eng.cube.lm
> -rw-r--r-- 1 root root   857304 Dec 30 20:16 eng.cube.nn
> -rw-r--r-- 1 root root      254 Dec 30 20:16 eng.cube.params
> -rw-r--r-- 1 root root 13020078 Dec 30 20:16 eng.cube.size
> -rw-r--r-- 1 root root  2444187 Dec 30 20:16 eng.cube.word-freq
> -rw-r--r-- 1 root root      996 Dec 30 20:16 eng.tesseract_cube.nn
> -rw-r--r-- 1 root root 21876572 Dec 30 20:16 eng.traineddata
> drwxr-xr-x 1 root root       88 Dec 31 16:37 tessconfigs/
>
> *tesseract detects 'deu' and 'eng' as available languages*
> $ tesseract --list-langs
> List of available languages (2):
> deu
> eng
>
>
> Application & Problem
>
> *The software application is built on the Spring Boot framework*
> Runtime.getRuntime().exec(new String[] {
>  "tesseract",
>  "--tessdata-dir", "/usr/share/tessdata",
>  "-l", lang.getISO3Language(),
>  inputTiff.toAbsolutePath().toString(), extractedcntPath });
>
> *The application logfile says*
> 2016-12-30 20:30:02,320 [https-jsse-nio-8443-exec-7] WARN
> PDFContentExtractor - read_params_file: parameter not found: II*
>
> *Executing tesseract with tessdata dir fails*
> $ tesseract --tessdata-dir /usr/share/tessdata -l deu inputPdf6632237754781472255.tiff out4
> read_params_file: parameter not found: II*
>
> *Executing tesseract without the tessdata dir works well*
> $ tesseract -l deu inputPdf6632237754781472255.tiff out5
> Tesseract Open Source OCR Engine v3.02.02 with Leptonica
>
>
> Questions & Ideas
> Why does tesseract work well and detect the available languages without
> the --tessdata-dir parameter set?
> Why does tesseract crash during initialization when the
> --tessdata-dir parameter is set?
> Is there any difference between running tesseract with and without the
> --tessdata-dir parameter set?
>
> What can I do to fix this problem?
> Install a newer version of tesseract?
> Compile a version from sources?
> Use other traindata/tessdata?
> Run tesseract without the --tessdata-dir param?
>
> If anybody can help me get this issue solved in the upcoming week, it
> would make not only me happy, but our whole team.
>
> Thank you very much in advance!
> Rüdiger Kurz
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To post to this group, send email to tesseract-ocr@googlegroups.com.
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/f046ae79-d687-45f8-af41-289cd84da2b9%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>
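
One more observation on the error text, offered as a guess rather than a diagnosis: "II*" is how the first three bytes of a little-endian TIFF file (0x49 0x49 0x2A) look when printed as text. That would fit the image having been read as a config file. As far as I recall, --tessdata-dir only arrived in 3.03, so 3.02.02 may be taking the arguments positionally; please verify that against your build. The sketch below checks the signature and runs the OCR without the option. The function names are made up for illustration, and setting TESSDATA_PREFIX to the parent directory of tessdata/ is my assumption about how 3.02 locates its data.

```shell
#!/bin/sh
# Print the first three bytes of a file as text.
# A little-endian TIFF starts with "II*".
first_bytes() {
    head -c 3 "$1"
}

# Run tesseract without --tessdata-dir (hypothetical helper).
# Assumption: TESSDATA_PREFIX points at the parent directory of tessdata/.
run_ocr() {
    TESSDATA_PREFIX=/usr/share/ tesseract -l deu "$1" "$2"
}
```

For example, `first_bytes inputPdf6632237754781472255.tiff` shows whether the file starts with that signature, and `run_ocr inputPdf6632237754781472255.tiff out5` mirrors the call that already works in the original message.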
