After several tests, I think "line removal", rather than binarization, is the 
cause.

On Thursday, July 2, 2020 at 5:54:42 PM UTC+8, xian wrote:
>
> For Chinese text, I found that tesseract's binarization gives really bad 
> results.
> I used -c tessedit_write_images=1 to save the image that results from 
> tesseract's binarization.
>
> The attachments are:
> original -> the original image (original.png)
> tess_bin -> tesseract's binarization of original.png
> my_bin -> my own preprocessing applied to original.png
> tess_my_bin -> tesseract's binarization of my_bin.png
>
> You can see that some characters disappear.
> I want to run my own preprocessing function on all the images before 
> passing them to tesseract, but tesseract's binarization makes the result 
> worse.
>
>
> I want to handle the image preprocessing myself.
> How can I disable tesseract's image preprocessing? Or is modifying the 
> source code the only way to do this?
> Thanks!!
>
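
For anyone who wants to repeat the comparison described above, here is a 
minimal Python sketch of the command-line call. It only builds and 
(optionally) runs the command; it does not disable any preprocessing. 
Assumptions that are not from the original message: the helper names 
build_tesseract_command and run_if_possible are hypothetical, the language 
"chi_tra" is a guess at the traineddata in use, and the parameter is spelled 
tessedit_write_images, as it appears in Tesseract's parameter list as far as 
I know.

```python
import os
import shutil
import subprocess


def build_tesseract_command(image_path, output_base, lang="chi_tra", params=None):
    """Return the argv list for a tesseract CLI call with -c overrides.

    'lang' defaults to chi_tra purely as an assumption for this example.
    """
    command = ["tesseract", image_path, output_base, "-l", lang]
    for name, value in (params or {}).items():
        # Each parameter override is passed as: -c name=value
        command += ["-c", f"{name}={value}"]
    return command


def run_if_possible(command):
    """Run the command only if tesseract is installed and the input exists."""
    if shutil.which(command[0]) is None or not os.path.exists(command[1]):
        return None
    return subprocess.run(command, capture_output=True, text=True, check=False)


# Ask tesseract to dump the image it actually works on after its own
# preprocessing, so it can be compared with my_bin.png.
command = build_tesseract_command(
    "my_bin.png", "out", params={"tessedit_write_images": 1}
)
print(" ".join(command))
result = run_if_possible(command)
```

Running the same call once on original.png and once on my_bin.png gives the 
two dumped images to compare, which is how one can check whether characters 
are already missing before recognition starts.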
