Hi Efi,

On Mon, Nov 25, 2013 at 07:05:46AM -0800, Efi Jorn wrote:
> I need to perform OCR on tables that contain a product's name in the left
> column and the quantity of this product that was ordered in the right column.
> 
> Will Tesseract's accuracy be greater if I have all possible product names
> in the freq-dawg file and then train Tesseract?

Yes, it should help recognition of the product names. Note that you
don't have to completely re-train Tesseract to do this. You can use
'combine_tessdata -u' to unpack the appropriate .traineddata file,
use dawg2wordlist to extract the original freq-dawg wordlist, add
your words, then use wordlist2dawg and combine_tessdata to create a
new .traineddata file with the extra product names.
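
In case it helps, here is a rough Python sketch of those four steps.
Treat it as an outline under assumptions rather than a tested recipe:
it assumes the training tools are on your PATH under their usual
underscore names (combine_tessdata, dawg2wordlist, wordlist2dawg),
that the language is "eng", and the working directory and the
"freq-words.txt" file name are just placeholders I made up.

```python
import subprocess
from pathlib import Path


def merge_wordlists(original_words, extra_words):
    """Return the original words followed by any new ones, without duplicates."""
    seen = set()
    merged = []
    for word in list(original_words) + list(extra_words):
        word = word.strip()
        if word and word not in seen:
            seen.add(word)
            merged.append(word)
    return merged


def build_commands(traineddata, workdir, lang="eng"):
    """Build the argument lists for the unpack/extract/rebuild/repack steps."""
    prefix = str(Path(workdir) / (lang + "."))  # e.g. work/eng.
    unicharset = prefix + "unicharset"
    freq_dawg = prefix + "freq-dawg"
    wordlist = prefix + "freq-words.txt"  # placeholder name, not a Tesseract file
    return {
        # Unpack every component of the traineddata file next to the prefix.
        "unpack": ["combine_tessdata", "-u", str(traineddata), prefix],
        # Turn the existing freq-dawg back into a plain word list.
        "extract": ["dawg2wordlist", unicharset, freq_dawg, wordlist],
        # Build a new freq-dawg from the edited word list.
        "rebuild": ["wordlist2dawg", wordlist, freq_dawg, unicharset],
        # Overwrite only the freq-dawg component inside the traineddata file.
        "repack": ["combine_tessdata", "-o", str(traineddata), freq_dawg],
    }


def add_product_names(traineddata, workdir, product_names, lang="eng"):
    """Run all four steps; needs the Tesseract training tools installed."""
    cmds = build_commands(traineddata, workdir, lang)
    Path(workdir).mkdir(parents=True, exist_ok=True)
    subprocess.run(cmds["unpack"], check=True)
    subprocess.run(cmds["extract"], check=True)
    wordlist = Path(cmds["extract"][-1])
    original = wordlist.read_text(encoding="utf-8").splitlines()
    merged = merge_wordlists(original, product_names)
    wordlist.write_text("\n".join(merged) + "\n", encoding="utf-8")
    subprocess.run(cmds["rebuild"], check=True)
    subprocess.run(cmds["repack"], check=True)
```

The repack step rewrites the .traineddata file in place, so it is
safer to point add_product_names() at a copy and keep the original.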

Nick

-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
