Thanks, save_best_choices worked great. /mikael
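
With per-character confidences available, the goal from the original question (locating probable errors) can be sketched roughly as below. This is only an illustration, not code from the thread. It assumes the patched box file has one symbol per line, with the confidence appended as a seventh column after the page number, on a 0-100 scale where higher means more confident. The names BoxEntry, ParseBoxLine and FlagSuspects are hypothetical, and the threshold would need tuning against real data.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One row of the (assumed) patched box file:
//   <symbol> <left> <bottom> <right> <top> <page> <confidence>
struct BoxEntry {
    std::string symbol;
    int left, bottom, right, top, page;
    float confidence;  // assumed scale: 0-100, higher = more confident
};

// Parses one whitespace-separated box line into *out.
// Returns false if any of the seven fields is missing or malformed.
bool ParseBoxLine(const std::string& line, BoxEntry* out) {
    std::istringstream in(line);
    return static_cast<bool>(in >> out->symbol >> out->left >> out->bottom
                                >> out->right >> out->top >> out->page
                                >> out->confidence);
}

// Returns the entries whose confidence is strictly below the threshold.
// Lines that fail to parse are skipped.
std::vector<BoxEntry> FlagSuspects(const std::vector<std::string>& lines,
                                   float threshold) {
    std::vector<BoxEntry> suspects;
    for (const std::string& line : lines) {
        BoxEntry entry;
        if (ParseBoxLine(line, &entry) && entry.confidence < threshold) {
            suspects.push_back(entry);
        }
    }
    return suspects;
}
```

Calling FlagSuspects(lines, 60.0f) on the lines of such a file would return only the symbols scoring under 60, together with their bounding boxes, so they can be checked by hand against the image.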
On 7 Apr, 06:01, Dmitri Silaev <[email protected]> wrote:
> Use this:
>
> // This ensures Tesseract's "blob_choices" structures are filled
> SetVariable("save_best_choices", "T");
>
> Warm regards,
> Dmitri Silaev
>
> On Wed, Apr 6, 2011 at 9:16 PM, micke <[email protected]> wrote:
> > Hi,
> >
> > I'm using Tesseract 3.01 on images basically containing two columns of
> > multidigit numbers. The source material is semi-poor computer
> > printouts from the 60's. I've trained Tesseract specifically for that
> > data, using a unicharset containing only the relevant characters, and
> > overall I'm very pleased with the accuracy. On character level, I'm
> > getting about 99.8 percent. What I'm trying to do now is find a way to
> > locate probable errors to make it easier to fix them.
> >
> > My first approach is to make use of Tesseract's confidence data.
> > Having researched this a bit, I realize those numbers may not do me a
> > whole lot of good, but I'd like to at least give it a try. What I've
> > tried so far is to patch TessBaseAPI::GetBoxText to include a new
> > column in the box file containing the confidence values, by calling
> > Confidence(RIL_SYMBOL) on the ResultIterator for each character. The
> > problem is that I get the same confidence value for all characters in
> > a "word", rather than character-specific values. Is this what's meant
> > to happen?
> >
> > I've found that for my data, best_choice->blob_choices() always
> > returns NULL in ResultIterator::Confidence. Is this why I get word
> > confidences, or would it be the same thing if I did get choices, and
> > choice_it.data()->certainty() was called instead of
> > best_choice->certainty()? And should I be worried that there are no
> > choices?
> >
> > Of course, if there's a better way of getting at the character-level
> > confidence values, I'd appreciate any pointers you may have.
> >
> > Thanks in advance,
> > Mikael
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "tesseract-ocr" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to
> > [email protected].
> > For more options, visit this group at
> > http://groups.google.com/group/tesseract-ocr?hl=en.

