Images that appear readable to human eyes may not be readable to computers.
Therefore, image preprocessing is most likely required prior to the OCR step.
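The kind of preprocessing meant here can be sketched in a few lines: convert the image to grayscale and then binarize it so the text is pure black on pure white. This is a minimal pure-Python illustration over nested lists of RGB tuples; in practice you would do the same with a library such as Pillow or OpenCV (the fixed threshold of 128 is an assumption for the example).

```python
def to_grayscale(pixels):
    """pixels: rows of (r, g, b) tuples -> rows of gray ints (0..255)."""
    # Standard luma weights (ITU-R BT.601), scaled to integer arithmetic.
    return [[(r * 299 + g * 587 + b * 114) // 1000 for (r, g, b) in row]
            for row in pixels]

def binarize(gray, threshold=128):
    """Map each gray value to pure white (255) or pure black (0)."""
    return [[255 if v >= threshold else 0 for v in row] for row in gray]

# Tiny 2x2 example image: light and dark pixels.
image = [[(250, 250, 250), (12, 10, 8)],
         [(30, 30, 30), (240, 245, 250)]]
bw = binarize(to_grayscale(image))
print(bw)  # [[255, 0], [0, 255]]
```

For screenshots, upscaling to roughly 300 DPI before binarizing also tends to help Tesseract considerably.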
Sure, you can use jTessBoxEditor to train for your language. The generated
.traineddata file will be placed in a tessdata folder, and you can use the *Validate
Hi Sach,
I am having a very similar problem; did you have any luck getting a full
screenshot to OCR close to 100%?
On Thursday, June 9, 2016 at 10:38:26 AM UTC-4, Sach wrote:
>
> Expected the OCR of a screenshot to be 100%. Please see the attached PNG
> image. Most of the labels are not
On Monday, June 13, 2016 at 10:22:32 AM UTC-4, Matthias Schneider wrote:
>
> I'm using latest dev version 3.05.00dev and I used peirick/leptonica (
> https://github.com/peirick/leptonica) to build libtesseract.dll and
> liblept.dll with Visual Studio 2015.
> However, the resulting DLLs I'm using
If you look at the readme files in the different subdirectories starting with
OCR under
https://github.com/Shreeshrii/imagessan/tree/master, you will see results for
character- and word-level accuracy. Depending on the font, character-level
accuracy is around 80% and word-level accuracy around 60%.
I have
I haven't trained this font and I haven't encountered the same problem as you. This
might mean that you haven't dropped your trained data file into the right directory.
If you have installed Tesseract for Windows, you will have to drop the file in
that directory. Tesseract-OCR uses some Environment
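Tesseract finds its language data through the TESSDATA_PREFIX environment variable, which should point at the directory containing the .traineddata files. A minimal sketch of setting it from Python before invoking the engine (the Windows install path and the `san` language name are assumptions for the example):

```python
import os

# Point Tesseract at the folder holding the .traineddata files.
# Path below is the typical Windows install location (an assumption).
os.environ["TESSDATA_PREFIX"] = r"C:\Program Files\Tesseract-OCR\tessdata"

# The engine would then look for e.g. san.traineddata here:
traineddata = os.path.join(os.environ["TESSDATA_PREFIX"], "san.traineddata")
print(traineddata)
```

Equivalently, from a shell you would export TESSDATA_PREFIX before running the `tesseract` command.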
Thanks again for replying. I will surely check them out.
My experience is that OCR on Sanskrit data with hin.traineddata gives
better results than san.traineddata. I do not know whether that is due to cube mode
or Devanagari preprocessing (segmentation, I guess).
I wonder why such