Thanks, save_best_choices worked great.
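
In case it helps anyone reading this later, here is a minimal sketch of the "locate probable errors" step once per-character confidences are available. It does not call Tesseract itself. The SymbolResult struct and the FlagSuspects function are hypothetical names made up for this illustration, and the code assumes the confidences were already collected with Confidence(RIL_SYMBOL) after setting save_best_choices, on a 0-100 scale where higher is better.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record for one recognized character; not a Tesseract type.
struct SymbolResult {
    std::string text;   // UTF-8 text of the symbol
    float confidence;   // assumed 0-100 scale, higher means more certain
};

// Returns the indices of all symbols whose confidence is strictly below
// `threshold`, in their original order, so they can be reviewed by hand.
std::vector<std::size_t> FlagSuspects(const std::vector<SymbolResult>& symbols,
                                      float threshold) {
    std::vector<std::size_t> suspects;
    for (std::size_t i = 0; i < symbols.size(); ++i) {
        if (symbols[i].confidence < threshold) {
            suspects.push_back(i);
        }
    }
    return suspects;
}
```

The threshold is something you would have to tune against your own data, for example by checking which cutoff catches most of the known errors in a proofread sample.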

/mikael

On 7 Apr, 06:01, Dmitri Silaev <[email protected]> wrote:
> Use this:
>
>     // This ensures Tesseract's "blob_choices" structures are filled
>     SetVariable("save_best_choices", "T");
>
> Warm regards,
> Dmitri Silaev
>
> On Wed, Apr 6, 2011 at 9:16 PM, micke <[email protected]> wrote:
> > Hi,
>
> > I'm using Tesseract 3.01 on images that basically contain two columns
> > of multi-digit numbers. The source material is fairly poor computer
> > printouts from the 1960s. I've trained Tesseract specifically for that
> > data, using a unicharset containing only the relevant characters, and
> > overall I'm very pleased with the accuracy. At the character level, I'm
> > getting about 99.8 percent. What I'm trying to do now is find a way to
> > locate probable errors to make it easier to fix them.
>
> > My first approach is to make use of Tesseract's confidence data.
> > Having researched this a bit, I realize those numbers may not do me a
> > whole lot of good, but I'd like to at least give it a try. What I've
> > tried so far is to patch TessBaseAPI::GetBoxText to include a new
> > column in the box file containing the confidence values, by calling
> > Confidence(RIL_SYMBOL) on the ResultIterator for each character. The
> > problem is that I get the same confidence value for all characters in
> > a "word", rather than character-specific values. Is this what's meant
> > to happen?
>
> > I've found that for my data, best_choice->blob_choices() always
> > returns NULL in ResultIterator::Confidence. Is this why I get word
> > confidences, or would it be the same thing if I did get choices, and
> > choice_it.data()->certainty() was called instead of
> > best_choice->certainty()? And should I be worried that there are no
> > choices?
>
> > Of course, if there's a better way of getting at the character-level
> > confidence values, I'd appreciate any pointers you may have.
>
> > Thanks in advance,
> > Mikael
>
> > --
> > You received this message because you are subscribed to the Google Groups 
> > "tesseract-ocr" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to 
> > [email protected].
> > For more options, visit this group 
> > at http://groups.google.com/group/tesseract-ocr?hl=en.
