On Thu, Mar 8, 2012 at 11:11 AM, Dmitri Silaev <[email protected]> wrote:
> As for existence and effects of specific parameters, currently I don't
> see any other way to find it out but digging in Tesseract's code.
If you are on Windows, I wrote this section on TCC/LE [1] that talks
about how you can use its "ffind" command to display all (most?
some?) configuration parameters defined in the tesseract-ocr source
files (which is not the same thing as those parameters actually being
*used* to do anything). It also mentions how you can do something
similar with Visual Studio 2008, or the bash shell on Linux.
You can also put the following in a config file called, for example,
config-write-params.txt:
tessedit_write_params_to_file currentparams.txt
tessdata_manager_debug_level 1
(NOTE: this file *MUST* use Unix-style line endings, that is, only a
Linefeed character, *NOT* the Windows convention of Carriage Return
followed by Linefeed).
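If your editor insists on Windows line endings, one way around it is to
generate the config file from a script. Here is a minimal Python 3 sketch;
write_config is just a helper name I made up for illustration, and the file
name and the two parameters are the ones shown above:

```python
def write_config(path, settings):
    """Write one 'name value' line per setting, each ending in a bare linefeed."""
    # newline="\n" stops Python on Windows from translating "\n" into "\r\n".
    with open(path, "w", newline="\n") as f:
        for name, value in settings:
            f.write(f"{name} {value}\n")


if __name__ == "__main__":
    write_config("config-write-params.txt", [
        ("tessedit_write_params_to_file", "currentparams.txt"),
        ("tessdata_manager_debug_level", "1"),
    ])
```

Opening the result in binary mode and checking that it contains no "\r"
bytes is a quick way to confirm the line endings came out right.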
Then do:
tesseract.exe eurotext.tif eurotext config-write-params.txt
You'll see:
Wrote parameters to currentparams.txt
Loading Tesseract/Cube with tessedit_ocr_engine_mode 0
Loaded unicharset
Loaded ambigs
Loaded language 'eng' as main language
Tesseract Open Source OCR Engine v3.02 with Leptonica
And looking at the newly created currentparams.txt you'll see something like:
textord_debug_tabfind 0
textord_debug_bugs 0
textord_testregion_left -1
...
textord_noise_hfract 0.015625
textord_noise_rowratio 6
textord_blshift_maxshift 0
textord_blshift_xfraction 9.99
(over 660 lines in my case). This file unfortunately is missing the
Description string that is listed in the source files, but otherwise
it gives a pretty good idea of what can be set. Searching the source
for a particular param will then provide insight into what it does.
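Since the dump is just one parameter per line, it is also easy to load into
a script, for example to compare the dumps from two runs. A rough Python
sketch, assuming each line is a parameter name, then whitespace, then the
value (that is what the excerpt above looks like; I have not checked every
parameter type). parse_params is a made-up helper name:

```python
def parse_params(text):
    """Return {name: value-as-string} from the text of a parameter dump."""
    params = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if not parts:
            continue  # skip blank lines
        name = parts[0]
        # A parameter with an empty value leaves only the name on the line.
        value = parts[1].strip() if len(parts) > 1 else ""
        params[name] = value
    return params


if __name__ == "__main__":
    sample = "textord_debug_tabfind\t0\ntextord_noise_hfract\t0.015625\n"
    print(parse_params(sample))
```

For the real file you would pass in open("currentparams.txt").read()
instead of the sample string.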
For example with TCC/LE, try searching the source for
"tessedit_write_params_to_file":
ffind /s/v/t"tessedit_write_params_to_file" *.cpp
which gives:
---- TesseractSVN\ccmain\tessedit.cpp
if (((STRING &)tessedit_write_params_to_file).length() > 0) {
FILE *params_file = fopen(tessedit_write_params_to_file.string(), "wb");
tessedit_write_params_to_file.string());
tessedit_write_params_to_file.string());
---- TesseractSVN\ccmain\tesseractclass.cpp
STRING_MEMBER(tessedit_write_params_to_file, "",
5 lines in 2 files
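If you don't have TCC/LE handy, a few lines of Python can do roughly the
same search. This is only a sketch of the idea, not a replacement for ffind:
find_in_sources is a name I made up, and "TesseractSVN" is assumed to be
wherever your source checkout lives.

```python
import os


def find_in_sources(root, needle, suffix=".cpp"):
    """Yield (path, line_number, line) for each line containing needle."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            if not filename.endswith(suffix):
                continue
            path = os.path.join(dirpath, filename)
            # errors="replace" keeps odd bytes in a source file from aborting the scan.
            with open(path, encoding="utf-8", errors="replace") as f:
                for number, line in enumerate(f, start=1):
                    if needle in line:
                        yield path, number, line.strip()


if __name__ == "__main__":
    for path, number, line in find_in_sources(
            "TesseractSVN", "tessedit_write_params_to_file"):
        print(f"{path}:{number}: {line}")
```

Passing suffix=".h" searches the headers instead, which is where the
declarations with the description strings tend to live.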
Opening ccmain\tessedit.cpp, we then see the following:
  if (((STRING &)tessedit_write_params_to_file).length() > 0) {
    FILE *params_file = fopen(tessedit_write_params_to_file.string(), "wb");
    if (params_file != NULL) {
      ParamUtils::PrintParams(params_file, this->params());
      fclose(params_file);
      if (tessdata_manager_debug_level > 0) {
        tprintf("Wrote parameters to %s\n",
                tessedit_write_params_to_file.string());
      }
    } else {
      tprintf("Failed to open %s for writing params.\n",
              tessedit_write_params_to_file.string());
    }
  }
and the matching declaration in ccmain\tesseractclass.h (a header, so the
*.cpp search above does not list it) shows:
  STRING_VAR_H(tessedit_write_params_to_file, "",
               "Write all parameters to the given file.");
[1] http://tesseract-ocr.googlecode.com/svn/trunk/vs2008/doc/tools.html#id2