[tesseract-ocr] Re: Difference trained data for Chinese

shree Fri, 11 Aug 2017 05:43:33 -0700

Please see https://github.com/tesseract-ocr/tessdata/issues/72




On Friday, August 11, 2017 at 2:26:55 PM UTC+5:30, Yang Yu wrote:
>
> Good day!
>
> I have recently been using Tesseract (4.0 alpha) for Chinese OCR and it
> works really well. Now I want to pick the best model to use, but I found
> several versions. What is the difference between them?
>
> 1. chi_sim from https://github.com/tesseract-ocr/tesseract/wiki/Data-Files 
> (around 50M)
> 2. chi_sim from https://github.com/tesseract-ocr/tessdata/tree/master/best 
> (around 13M)
> 3. chi_sim_vert from 
> https://github.com/tesseract-ocr/tessdata/tree/master/best (around 13M)
> 4. HanS from https://github.com/tesseract-ocr/tessdata/tree/master/best 
> (around 16M)
>
> All of them work, but the results differ slightly. In my own evaluation
> #4 is the best, but I have no insight into why.
>
> I would appreciate any help.
>
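
For anyone who wants to repeat this kind of comparison, here is a minimal sketch of one way to score the models against each other. It assumes you have a ground-truth transcription for each test image and that character error rate (edit distance divided by ground-truth length, whitespace ignored) is an acceptable metric for Chinese text. The function names are hypothetical; the only Tesseract-specific parts are the `-l` and `--tessdata-dir` command-line options and the `stdout` output base.

```python
import subprocess


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insert, delete and substitute each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                # delete ca
                curr[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),   # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def char_error_rate(ocr_text: str, truth: str) -> float:
    """Edit distance over ground-truth length, with all whitespace removed."""
    hyp = "".join(ocr_text.split())
    ref = "".join(truth.split())
    if not ref:
        raise ValueError("ground truth is empty")
    return edit_distance(hyp, ref) / len(ref)


def run_tesseract(image: str, lang: str, tessdata_dir: str) -> str:
    """Run the tesseract CLI on one image and return the recognized text.

    Assumes the `tesseract` binary is on PATH and that `tessdata_dir`
    contains `<lang>.traineddata`.
    """
    cmd = ["tesseract", image, "stdout", "-l", lang,
           "--tessdata-dir", tessdata_dir]
    result = subprocess.run(cmd, capture_output=True, check=True,
                            encoding="utf-8")
    return result.stdout
```

To use it, call `run_tesseract` once per model (for example with `lang="chi_sim"` and a different `tessdata_dir` for each download), pass each result to `char_error_rate` together with the ground truth, and compare the numbers over more than a handful of pages. The image path, directory layout and model names are placeholders to adapt to your own setup.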

