1. Which --oem value are you using with tesseract 4: the legacy engine or
LSTM?

--oem 0 (legacy) or --oem 1 (LSTM)

2. Is Brazilian Portuguese very different from Portuguese? Please see the
trainingtext and wordlists on
https://github.com/tesseract-ocr/langdata/tree/master/por

3. Provide a sample image with its ground truth and point out the errors
in it. Is the image at 300 dpi?

4. Please share the box/tiff pair so that it can be tested for training.
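
For item 1, here is a minimal Python sketch for running the same image through both engine modes so the two outputs can be compared. The file name sample.tif, the output base names, and the helper build_command are hypothetical, for illustration only; the only tesseract options used are -l and --oem.

```python
import os
import shutil
import subprocess

# Engine modes accepted by tesseract 4's --oem option.
OEM_LEGACY = 0  # legacy engine only
OEM_LSTM = 1    # LSTM engine only


def build_command(image_path, output_base, oem, lang="por"):
    """Return the tesseract argument list for one engine mode.

    build_command is a hypothetical helper, not part of tesseract.
    """
    return ["tesseract", image_path, output_base, "-l", lang, "--oem", str(oem)]


if __name__ == "__main__":
    image = "sample.tif"  # hypothetical input file
    for name, oem in (("legacy", OEM_LEGACY), ("lstm", OEM_LSTM)):
        cmd = build_command(image, "out_" + name, oem)
        print(" ".join(cmd))
        # Only run when tesseract is on PATH and the image actually exists.
        if shutil.which("tesseract") and os.path.exists(image):
            subprocess.run(cmd, check=False)
```

If both runs succeed, they should leave out_legacy.txt and out_lstm.txt next to the image, which can then be compared against the ground truth.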

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Wed, May 17, 2017 at 2:49 AM, Maicon Azevedo <pnpinformat...@gmail.com>
wrote:

> Hello!
>
> Guys, I have tesseract 4 on Ubuntu 16.04.
>
> Running tesseract with -l por (Portuguese from Brazil), I don't get
> good results. The image uses a different font from the trained data (I think).
>
> My question is: is it necessary to train tesseract again? I created the tif
> and box files with jtesseditor, but I don't know what I need to do with these
> files or how to write good training data. I saw
> https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00
> but I didn't find any case similar to mine.
>
> Thanks in advance!
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To post to this group, send email to tesseract-ocr@googlegroups.com.
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/a34d2a11-54d6-416f-87cd-164a8157aed6%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>
