This appears to be an issue with --find_fonts and/or 
--strip_unrenderable_words. The following command, which omits both flags, 
succeeds for me:

$ text2image --exposure=0 --font "Helvetica Neue Thin" \
    --outputbase=eng.Helvetica_Neue_Thin.exp0 \
    --text=/Users/ryan/source/tesseract/tesseract-ocr.langdata/eng/eng.training_text \
    --leading=32 --char_spacing=0.0 --box_padding=0
Initializing fontconfig
Rendered page 0 to file eng.Helvetica_Neue_Thin.exp0.tif
Rendered page 1 to file eng.Helvetica_Neue_Thin.exp0.tif
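
One way to narrow down which of the two flags is responsible is to rerun the 
same command with each flag added back in turn. The script below is only a 
sketch under assumptions: the TEXT and OUT paths are placeholders taken from 
the original report, try_variant is a hypothetical helper name, and it is not 
confirmed that either flag alone reproduces the crash. By default it only 
prints the command lines; set RUN=1 to actually execute them.

```shell
#!/bin/sh
# Sketch: print (or, with RUN=1, execute) text2image once per flag
# combination to see which suspect flag triggers the crash.
# TEXT and OUT are placeholder paths; adjust them to your setup.
TEXT="${TEXT:-./tesslang/eng/eng.training_text}"
OUT="${OUT:-/tmp/tesstrain/eng/eng.Helvetica_Neue_Thin.exp0}"

try_variant() {
    # $1 is a label; any remaining arguments are the extra flags under test.
    label=$1
    shift
    # Rebuild the positional parameters as the full command line.
    set -- text2image --exposure=0 --font "Helvetica Neue Thin" \
        --outputbase="$OUT" --text="$TEXT" \
        --leading=32 --char_spacing=0.0 --box_padding=0 "$@"
    echo "[$label] $*"
    if [ "${RUN:-0}" = 1 ]; then
        "$@" || echo "[$label] failed with exit status $?"
    fi
}

try_variant baseline
try_variant find_fonts --fonts_dir=/Library/Fonts --find_fonts=true
try_variant strip --strip_unrenderable_words
try_variant both --fonts_dir=/Library/Fonts --find_fonts=true \
    --strip_unrenderable_words
```

If only the variants that include --find_fonts=true fail, that would point at 
the font search rather than at --strip_unrenderable_words, and vice versa.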

-Ryan

On Tuesday, March 31, 2015 at 3:43:23 PM UTC-4, Philip Pearl wrote:
>
> Hi All
>
> I'm trying to train tesseract for the first time on my Mac.  I'm running 
> text2image as follows, but it is crashing in Pango as the priv data on the 
> font is NULL.
>
> /usr/local/Cellar/tesseract/HEAD/bin//text2image --leading=32 
> --fonts_dir=/Library/Fonts --box_padding=0 --strip_unrenderable_words 
> --char_spacing=0.0 --exposure=0 --find_fonts=true 
> --outputbase=/tmp/tesstrain/eng/eng.Helvetica_Neue_Thin.exp0 
> --text=./tesslang/eng/eng.training_text
>
> Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
>
> 0   libpangoft2-1.0.0.dylib             0x00000001090fad9e 
> pango_fc_font_get_glyph + 25
>
> 1   text2image                          0x000000010858bf58 
> tesseract::PangoFontInfo::CanRenderString(char const*, int, 
> std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, 
> std::__1::allocator<char> >, 
> std::__1::allocator<std::__1::basic_string<char, 
> std::__1::char_traits<char>, std::__1::allocator<char> > > >*) const + 322
>
> 2   text2image                          0x000000010858d0ab 
> tesseract::FontUtils::SelectFont(char const*, int, 
> std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, 
> std::__1::allocator<char> >, 
> std::__1::allocator<std::__1::basic_string<char, 
> std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, 
> std::__1::basic_string<char, std::__1::char_traits<char>, 
> std::__1::allocator<char> >*, std::__1::vector<std::__1::basic_string<char, 
> std::__1::char_traits<char>, std::__1::allocator<char> >, 
> std::__1::allocator<std::__1::basic_string<char, 
> std::__1::char_traits<char>, std::__1::allocator<char> > > >*) + 287
>
> 3   text2image                          0x0000000108592c06 
> tesseract::StringRenderer::RenderAllFontsToImage(double, char const*, int, 
> std::__1::basic_string<char, std::__1::char_traits<char>, 
> std::__1::allocator<char> >*, Pix**) + 108
>
> 4   text2image                          0x0000000108584149 main + 2750
>
> 5   libdyld.dylib                       0x00007fff932315fd start + 1
>
>
> I installed from HEAD using homebrew and the instructions I found here 
> https://ryanfb.github.io/etc/2014/11/19/installing_tesseract_training_tools_on_mac_os_x.html
>
>
>    - Any ideas how to get around this crash?
>    - Am I crazy running this on my Mac?  Would I be better off with a 
>    Linux VM?
>    - Does training from fonts work or am I better off starting with 
>    images (my data is analog HD screen captures of TV menus!)? I know the 
>    font the menus use.
>
> Thanks in advance for any help or advice you are able to give me.
>
> Phil
>
>

