You can unpack the traineddata file and take a look at the .config file in
it.

E.g., for hin.traineddata the config file selects the combined mode (Cube as
well as Tesseract), which makes it very slow. I changed the config value to use
Tesseract only, recombined the file, and that improved the speed.

Please see
http://tesseract-ocr.googlecode.com/svn/trunk/doc/combine_tessdata.1.html
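
The config edit described above can be sketched roughly as follows. This is
only an illustration, not the exact steps I ran: the helper name
set_engine_mode is made up for this example, the "<name> <value>" line layout
and the numeric enum values are my understanding of the 3.0x sources, and the
file names in the comments assume the hin. prefix was used when unpacking.

```python
# Values of the OcrEngineMode enum as I understand them for Tesseract 3.0x
# (assumption: check tesseractclass.h / publictypes.h in your own checkout).
OEM_TESSERACT_ONLY = 0
OEM_CUBE_ONLY = 1
OEM_TESSERACT_CUBE_COMBINED = 2

PARAM = "tessedit_ocr_engine_mode"


def set_engine_mode(config_text, mode=OEM_TESSERACT_ONLY):
    """Return config_text with tessedit_ocr_engine_mode set to `mode`.

    Hypothetical helper: assumes each config line is "<name> <value>".
    If the parameter is not present, it is appended at the end.
    """
    lines = config_text.splitlines()
    found = False
    for i, line in enumerate(lines):
        parts = line.split()
        if parts and parts[0] == PARAM:
            lines[i] = f"{PARAM} {mode}"
            found = True
    if not found:
        lines.append(f"{PARAM} {mode}")
    return "\n".join(lines) + "\n"


# Intended workflow (shell steps shown as comments, paths are assumptions):
#   combine_tessdata -u hin.traineddata hin.        # unpack the components
#   ... rewrite hin.config with set_engine_mode() ...
#   combine_tessdata -o hin.traineddata hin.config  # put the config back
```

Keep a backup of the original traineddata file before overwriting it, so you
can compare speed and accuracy between the two modes.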

Shree

Shree Devi Kumar
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com


On Mon, Jul 15, 2013 at 9:12 PM, bear <[email protected]> wrote:

>  Thanks, Nick.  After poking through the source, it seems that one of my
> assumptions was incorrect; tesseract will default to the OEM_TESSERACT_ONLY
> mode, therefore it will not try to infer the best mode to use for
> individual languages (by default).
>
> tesseractclass.cpp:
>
> INT_INIT_MEMBER(tessedit_ocr_engine_mode, tesseract::OEM_TESSERACT_ONLY,
>                     "Which OCR engine(s) to run (Tesseract, Cube, both)."
>                     " Defaults to loading and running only Tesseract"
>                     " (no Cube,no combiner)."
>                     " Values from OcrEngineMode enum in tesseractclass.h)",
>                this->params()),
>
> On Monday, July 15, 2013 10:38:00 AM UTC-4, Nick White wrote:
>>
>> Hi,
>>
>> > I never set the tessedit_ocr_engine_mode configuration for tesseract, so I
>> > assume that it is using the default mode which, from my reading, will infer
>> > the best mode to use from the engine for the particular language.
>>
>> You're right in your assumptions, it will use the default (non-cube)
>> mode unless you tell it otherwise. You're also correct that the
>> default mode is likely the best for Spanish.
>>
>> > Finally, where can I set the tessedit_ocr_engine_mode?  I cannot find this
>> > in any documentation online.  Do I need to modify the source before
>> > compiling?  Is there a configuration file that I can modify or add?
>>
>> It's a configuration variable, which you set the same way as any
>> other configuration variable. That is documented a little here:
>> http://code.google.com/p/tesseract-ocr/wiki/ControlParams
>>
>> I'm afraid I can't help you with performance, as I have no knowledge
>> of android stuff. You might find it useful to look at the code of
>> Renard's excellent looking Text Fairy app for android:
>> https://github.com/renard314/textfairy
>>
>> Nick
>>
>  --
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>
> ---
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/groups/opt_out.
>
>
>


