Yes, if a variable isn't mentioned in the config file it will just
fall back to the default, which is Tesseract only.
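
To make the fallback concrete, here is a rough Python sketch of the
lookup. It is not Tesseract's own code: parse_config and engine_mode are
made-up helper names, the "name value" line format is taken from the
Arabic config line quoted below, the skipping of "#" comment lines is my
assumption, and the numeric enum values are what I believe the
OcrEngineMode enum uses (0 = Tesseract only, 1 = Cube only,
2 = combined), so please check them against your own tesseractclass.h.

```python
# Sketch only: mimics how a variable missing from a language .config
# falls back to the compiled-in default. Not Tesseract's real parser.

# Assumed values of the OcrEngineMode enum; verify against your source tree.
OEM_NAMES = {
    0: "OEM_TESSERACT_ONLY",
    1: "OEM_CUBE_ONLY",
    2: "OEM_TESSERACT_CUBE_COMBINED",
}

# Mirrors the INT_INIT_MEMBER default quoted from tesseractclass.cpp.
DEFAULT_OEM = 0


def parse_config(text):
    """Parse 'name value' lines into a dict, skipping blanks and # comments."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split on any whitespace, once
        if len(parts) == 2:
            params[parts[0]] = parts[1].strip()
    return params


def engine_mode(config_text):
    """Return the engine mode name that a language config would select."""
    params = parse_config(config_text)
    mode = int(params.get("tessedit_ocr_engine_mode", DEFAULT_OEM))
    return OEM_NAMES.get(mode, "unknown")


# The Arabic config sets the variable explicitly.
arabic_config = "tessedit_ocr_engine_mode 1\n"
# Hypothetical stand-in for a config with no engine mode line (as in Spanish).
spanish_config = "some_other_variable 1\n"

print(engine_mode(arabic_config))
print(engine_mode(spanish_config))
```

With the values assumed above, the first call reports the Cube mode and
the second falls back to Tesseract only, which is the behaviour you saw
for Arabic and Spanish.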

On Mon, Jul 15, 2013 at 10:53:57AM -0700, bear wrote:
> I see, that was very helpful.  Thanks Shree.  I unpacked Arabic, and noticed
> the engine mode:
> 
> tessedit_ocr_engine_mode 1
> 
> I unpacked Spanish, and it did not contain an engine mode variable
> declaration.  Does that mean that it will default to using tesseract only
> (and not cube) as defined in my tesseractclass.cpp?  Or, will the absence of
> the variable from a language specific .config file default to something else?
> 
> Thanks again.
> 
> On Monday, July 15, 2013 1:23:07 PM UTC-4, shree wrote:
> 
>     You can unpack the traineddata file and take a look at the .config file in
>     it.
> 
>     e.g. In case of hin.traineddata the config file uses combined mode - cube
>     as well as OEM which makes it very slow. I changed the config value to
>     use OEM only and recombined the file and that improved the speed.
> 
>     Please see
>     http://tesseract-ocr.googlecode.com/svn/trunk/doc/combine_tessdata.1.html
> 
>     Shree
> 
>     Shree Devi Kumar
>     ____________________________________________________________
>     भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
> 
> 
>     On Mon, Jul 15, 2013 at 9:12 PM, bear <[email protected]> wrote:
> 
>          Thanks, Nick.  After poking through the source, it seems that one of
>         my assumptions was incorrect; tesseract will default to the
>         OEM_TESSERACT_ONLY mode, therefore it will not try to infer the best
>         mode to use for individual languages (by default).
> 
>         tesseractclass.cpp:
> 
>         INT_INIT_MEMBER(tessedit_ocr_engine_mode, tesseract::OEM_TESSERACT_ONLY,
>                         "Which OCR engine(s) to run (Tesseract, Cube, both)."
>                         " Defaults to loading and running only Tesseract"
>                         " (no Cube,no combiner)."
>                         " Values from OcrEngineMode enum in tesseractclass.h)",
>                         this->params()),
> 
>         On Monday, July 15, 2013 10:38:00 AM UTC-4, Nick White wrote:
> 
>             Hi,
> 
>             > I never set the tessedit_ocr_engine_mode
>             > configuration for tesseract, so I assume that it is using the
>             > default mode which, from my reading, will infer the best mode to
>             > use from the engine for the particular language.
> 
>             You're right in your assumptions, it will use the default
>             (non-cube) mode unless you tell it otherwise. You're also correct
>             that the default mode is likely the best for Spanish.
> 
>             > Finally, where can I set the tessedit_ocr_engine_mode?  I cannot
>             > find this in any documentation online.  Do I need to modify the
>             > source before compiling?  Is there a configuration file that I
>             > can modify or add?
> 
>             It's a configuration variable, which you set the same way as any
>             other configuration variable. That is documented a little here:
>             http://code.google.com/p/tesseract-ocr/wiki/ControlParams
> 
>             I'm afraid I can't help you with performance, as I have no
>             knowledge of Android stuff. You might find it useful to look at
>             the code of Renard's excellent looking Text Fairy app for Android:
>             https://github.com/renard314/textfairy
> 
>             Nick
> 
>         --
>         --
>         You received this message because you are subscribed to the Google
>         Groups "tesseract-ocr" group.
>         To post to this group, send email to [email protected]
>         To unsubscribe from this group, send email to
>         [email protected]
>         For more options, visit this group at
>         http://groups.google.com/group/tesseract-ocr?hl=en
>          
>         ---
>         You received this message because you are subscribed to the Google
>         Groups "tesseract-ocr" group.
>         To unsubscribe from this group and stop receiving emails from it, send
>         an email to [email protected].
>         For more options, visit https://groups.google.com/groups/opt_out.
>          
>          
> 
> 
> 
