I use tesseract-ocr 3 in a .NET application.
These are the settings I apply before calling Init:
m_tesseract.SetVariable("load_system_dawg", "0");
m_tesseract.SetVariable("load_freq_dawg", "0");
m_tesseract.SetPageSegMode(ePageSegMode.PSM_AUTO);
Here is the relevant code fragment:
private TesseractProcessor m_tesseract = null;
private const string m_path = @"data\";
private const string m_lang = "eng";

private void InitOCR()
{
    m_tesseract = new TesseractProcessor();
    m_tesseract.SetVariable("load_system_dawg", "0");
    m_tesseract.SetVariable("load_freq_dawg", "0");
    //m_tesseract.SetVariable("tessedit_char_whitelist", "0123456789");
    //m_tesseract.SetVariable("tessedit_pageseg_mode", ((int)TesseractPageSegMode.PSM_AUTO).ToString());
    m_tesseract.SetPageSegMode(ePageSegMode.PSM_AUTO);

    bool succeed = m_tesseract.Init(m_path, m_lang, (int)TesseractEngineMode.DEFAULT);
    if (!succeed)
    {
        MessageBox.Show("Tesseract initialization failed. The application will exit.");
        Application.Exit();
    }
    //System.Environment.CurrentDirectory = System.IO.Path.GetFullPath(m_path);
}

private string Ocr(Image image)
{
    string retVal = string.Empty;
    sw.Reset();
    sw.Start();
    m_tesseract.Clear();
    m_tesseract.ClearAdaptiveClassifier();
    retVal = m_tesseract.Recognize(image);
    sw.Stop();
    label1.Text = string.Format("Elapsed time: {0}", sw.ElapsedMilliseconds);
    return retVal;
}
On Thursday, June 5, 2014 at 19:07:03 UTC+4, Nick White wrote:
>
> On Thu, Jun 05, 2014 at 01:51:24PM +0200, zdenko podobny wrote:
> > On Thu, Jun 5, 2014 at 12:10 PM, 'thakobyan' via tesseract-ocr wrote:
> >>
> >> Trying to OCR the portion of the image. For some reason if I
> >> cut only one word (see Fail.png and Fail2.png attached) it
> >> returns empty string.
> >
> > Please provide more detail - e.g. exact version of tesseract, how did
> > you run OCR (API, executable, parameters etc.)
>
> Indeed, give us more information on how you get the failure. Using
> the latest SVN version works well for me:
>
> ; tesseract Fail.png stdout
> FEMALE
> ; tesseract 'Fail 2.png' stdout
> SINGLE
>