Thanks, Nick. After poking through the source, it seems that one of my
assumptions was incorrect: Tesseract defaults to the OEM_TESSERACT_ONLY
mode, so by default it does not try to infer the best mode for
individual languages.

tesseractclass.cpp:

INT_INIT_MEMBER(tessedit_ocr_engine_mode, tesseract::OEM_TESSERACT_ONLY,
                "Which OCR engine(s) to run (Tesseract, Cube, both)."
                " Defaults to loading and running only Tesseract"
                " (no Cube,no combiner)."
                " Values from OcrEngineMode enum in tesseractclass.h)",
                this->params()),
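
For anyone else who wants to set the mode explicitly rather than rely on the default, here is a minimal sketch of the line a config file would need. The enum below is a local mirror of tesseract::OcrEngineMode written for illustration only; the numeric values are my assumption based on the declaration order in the 3.x headers, and EngineModeConfigLine is a hypothetical helper, not part of Tesseract.

```cpp
#include <string>

// Local mirror of tesseract::OcrEngineMode, for illustration only.
// Assumption: the numeric values follow the declaration order in the
// Tesseract 3.x headers; check them against your own copy of the source.
enum OcrEngineMode {
  OEM_TESSERACT_ONLY = 0,           // Tesseract only (the parameter's default)
  OEM_CUBE_ONLY = 1,                // Cube only
  OEM_TESSERACT_CUBE_COMBINED = 2,  // run both and combine the results
  OEM_DEFAULT = 3                   // infer the mode from config files
};

// Hypothetical helper: builds the "name value" line that a Tesseract
// config file uses to set tessedit_ocr_engine_mode.
std::string EngineModeConfigLine(OcrEngineMode mode) {
  return "tessedit_ocr_engine_mode " + std::to_string(static_cast<int>(mode));
}
```

Saving the resulting line in a plain-text file and naming that file as the last argument on the tesseract command line should apply it, assuming the config-file mechanism described on the ControlParams page Nick linked.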
On Monday, July 15, 2013 10:38:00 AM UTC-4, Nick White wrote:
>
> Hi,
>
> > I never set the tessedit_ocr_engine_mode configuration for tesseract,
> > so I assume that it is using the default mode which, from my reading,
> > will infer the best mode to use from the engine for the particular
> > language.
>
> You're right in your assumptions, it will use the default (non-cube)
> mode unless you tell it otherwise. You're also correct that the
> default mode is likely the best for Spanish.
>
> > Finally, where can I set the tessedit_ocr_engine_mode? I cannot find
> > this in any documentation online. Do I need to modify the source
> > before compiling? Is there a configuration file that I can modify or
> > add?
>
> It's a configuration variable, which you set the same way as any
> other configuration variable. That is documented a little here:
> http://code.google.com/p/tesseract-ocr/wiki/ControlParams
>
> I'm afraid I can't help you with performance, as I have no knowledge
> of android stuff. You might find it useful to look at the code of
> Renard's excellent looking Text Fairy app for android:
> https://github.com/renard314/textfairy
>
> Nick
>
--
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en