Tesseract uses Leptonica. You can try that for preprocessing. See an example
at
http://tpgit.github.io/UnOfficialLeptDocs/leptonica/line-removal.html
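
To give a rough idea of what line removal does before OCR, here is a minimal pure-Python sketch. It is not Leptonica's API and not the method from the linked page: it simply clears long horizontal runs of black pixels in a binary image, where Leptonica would use morphological operations. The function name `remove_horizontal_lines` and the `min_run` threshold are made up for illustration.

```python
def remove_horizontal_lines(image, min_run=20):
    """Return a copy of a binary image with long horizontal runs cleared.

    image:   list of rows, each a list of 0 (white) or 1 (black).
    min_run: hypothetical threshold; a run of black pixels at least this
             long is assumed to be a ruling line rather than text.
    """
    cleaned = [row[:] for row in image]  # leave the input untouched
    for y, row in enumerate(image):
        x = 0
        width = len(row)
        while x < width:
            if row[x] == 1:
                start = x
                # Walk to the end of this run of black pixels.
                while x < width and row[x] == 1:
                    x += 1
                # Erase the run only if it is long enough to be a line.
                if x - start >= min_run:
                    for i in range(start, x):
                        cleaned[y][i] = 0
            else:
                x += 1
    return cleaned
```

Note that this naive version also erases any text strokes that sit directly on a line, so on real receipts and invoices a proper morphological approach is likely to work better.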
ShreeDevi
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
On Wed, May 23, 2018 at
I am trying to read some TIFF files that are receipts, invoices and so on.
The text is in Lithuanian. I've used ImageMagick to remove the background
and make the text stand out more, but I still get pretty bad output.
Here are some examples of my files.