Hi,

I tried it with version 3.03 (on openSUSE 13.1) and there was no segfault
(3.02 also segfaults for me).
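
If you want to repeat the comparison on several installs, a small script can run the same command and report how the process ended. This is only a minimal sketch: the helper names `describe_exit` and `run_tesseract` and the `binary` parameter are made up for illustration, and the arguments are the ones from the report quoted below. It relies on Python's `subprocess` returning a negative code when the child is killed by a signal on POSIX systems.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable status."""
    if returncode < 0:
        # On POSIX, subprocess reports "killed by signal N" as -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by " + name
    if returncode == 0:
        return "ok"
    return "exited with status %d" % returncode


def run_tesseract(image, outbase, config, binary="tesseract"):
    """Run the command from the report and describe how it ended.

    `binary` is a hypothetical parameter so that a second install
    (for example a 3.03 build) can be tested by passing its path.
    """
    cmd = [binary, image, outbase, "-psm", "3", config]
    return describe_exit(subprocess.call(cmd))
```

Calling `run_tesseract("testImages/test-002.png", "thetext", "bazaar_test")` from the tests directory should then report a SIGSEGV kill on an affected build and "ok" on a build that handles the config.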

Zdenko


On Fri, May 30, 2014 at 6:22 PM, Christopher Smeenk <[email protected]>
wrote:

> I would like to use tesseract to read data from a scanned high school
> transcript. The form contains a bunch of fields (student name, gender,
> address) and corresponding values (characters, words or numbers).
>
> I understand the way to do this is using config files augmented with user
> data [see the man page
> <http://tesseract-ocr.googlecode.com/svn/trunk/doc/tesseract.1.html>,
>  patterns are explained in more detail in the file
> /path/to/tesseract-ocr/dict/trie.h].
>
> However, when I try to set my own eng.user-words or eng.user-patterns,
> tesseract returns a *Segmentation Fault*.
>
> First, here is a test image I am using to check the pattern matching:
> (attached file test-002.png)
>
> Here is some info about my install:
> cs@pleco:/data/OCR/tesseract/tests$ lsb_release -a
> No LSB modules are available.
> Distributor ID: Ubuntu
> Description: Ubuntu 12.04.4 LTS
> Release: 12.04
> Codename: precise
>
>
> cs@pleco:/data/OCR/tesseract/tests$ tesseract -v
> tesseract 3.02.02
>  leptonica-1.69
>   libjpeg 6b : libpng 1.2.46 : libtiff 3.9.5 : zlib 1.2.3.4
>
>
> Here is a good run, showing the output:
> cs@pleco:/data/OCR/tesseract/tests$ tesseract testImages/test-002.png thetext -psm 3
> Tesseract Open Source OCR Engine v3.02.02 with Leptonica
> cs@pleco:/data/OCR/tesseract/tests$ cat thetext.txt
> Na me: Roosevelt, Fra nklin
>
>
> Age: 102
>
>
> Name: Harper, Stephen
> Age: 58
>
>
> Name: Hawk, Tony
> Age: 34
>
>
> Nane: Shakespeare, Bill
> Age: 432
>
>
> Here are the config file and user pattern files:
> cs@pleco:/usr/share/tesseract-ocr/tessdata$ cat configs/bazaar_test
> load_system_dawg F
> load_freq_dawg F
> user_words_suffix test-words
> user_patterns_suffix test-patterns
>
>
> cs@pleco:/usr/share/tesseract-ocr/tessdata$ cat eng.test-patterns
> Name: \A\c*, \A\c*
> Age: \d*
>
>
> cs@pleco:/usr/share/tesseract-ocr/tessdata$ cat eng.test-words
> Name:
> Age:
> Roosevelt
> Franklin
> Harper
> Stephen
> Hawk
> Tony
> Shakespeare
>
>
> And here is the result when running tesseract with the config file:
> cs@pleco:/data/OCR/tesseract/tests$ tesseract testImages/test-002.png thetext -psm 3 bazaar_test
> Tesseract Open Source OCR Engine v3.02.02 with Leptonica
> Segmentation fault
>
>
>
> What am I doing wrong? Thanks for reading!
>
> Chris
>
>  --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/bb5b289c-6453-437e-88e1-3506f8d8bf8f%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>
