
Re: [tesseract-ocr] does it make sense to train existing languages? how to fix repeatedly wrong letters?

2018-04-02 Thread ShreeDevi Kumar

My suggestion would be to do post-processing of the OCR output.

On Mon 2 Apr, 2018, 6:09 PM JP T wrote:
> Hi
>
> I don't really have an understanding of the consequences of training.
>
> My problem:
> I've got tons of pages with a special format ("one place study"

[tesseract-ocr] does it make sense to train existing languages? how to fix repeatedly wrong letters?

2018-04-02 Thread JP T

Hi

I don't really have an understanding of the consequences of training.

My problem:
I've got tons of pages with a special format ("one place study" about the historic inhabitants of a town). Tesseract repeatedly fails on a few special words:
oo (oh-oh) at start of line for "wedding" is often
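The post-processing route suggested in the reply could look roughly like the sketch below: a small rule-based pass over the OCR text that repairs a marker only when it stands alone at the start of a line. The message is cut off before it says what "oo" is actually misread as, so the entries in `LINE_START_FIXES` are purely hypothetical placeholders, and the function name `fix_line_starts` is made up for this example. The real table would have to be filled in from the errors seen in the scans.

```python
import re

# HYPOTHETICAL misreadings of the line-initial "oo" (wedding) marker.
# The thread does not list the actual wrong outputs; replace these
# with the forms that really show up in your OCR text.
LINE_START_FIXES = {
    "00": "oo",
    "o0": "oo",
    "0o": "oo",
    "OO": "oo",
}

# Match one of the misread tokens only at the start of a line (after
# optional indentation) and only when followed by whitespace, so that
# numbers such as "0012" or words such as "OOPS" are left alone.
_LINE_START_RE = re.compile(
    r"^(?P<indent>[ \t]*)(?P<token>"
    + "|".join(re.escape(bad) for bad in LINE_START_FIXES)
    + r")(?=\s)",
    re.MULTILINE,
)


def fix_line_starts(text: str) -> str:
    """Replace known misreadings of the line-initial marker."""

    def _replace(match: re.Match) -> str:
        return match.group("indent") + LINE_START_FIXES[match.group("token")]

    return _LINE_START_RE.sub(_replace, text)


if __name__ == "__main__":
    sample = "00 12.05.1820 Anna Muster\nborn 1800 in the town\n"
    print(fix_line_starts(sample))
```

Restricting the rule to the start of a line keeps it from touching legitimate digits elsewhere, such as years. Whether this is enough, or whether retraining is worth the effort, depends on how regular the errors are.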