Re: [tesseract-ocr] Tesseract makes different predictions on seemingly equal images. How to make it more robust?

2020-07-15 Thread Lorenzo Bolzani
I think the reason is that your input is bad, so the model is confused and a few pixels are enough for it to see an extra letter. Your input is "bad" because it is different from the one used to train the neural network. The difference between the two images is small but the difference from the training

Re: [tesseract-ocr] Tesseract makes different predictions on seemingly equal images. How to make it more robust?

2020-07-15 Thread MysteriousGuy
This seems like an ad-hoc approach. I am already converting images to grayscale. If I apply blurring, binarisation, etc., I will solve this case but cause another case to fail as a result. There is something in Tesseract that fails to generalize on clearly near-identical images,

Re: [tesseract-ocr] Large app size of a tesseract app on Android

2020-07-15 Thread JB Data31
To see details, you can *unzip -v* the *apk* file. @*JB*Δ On Wed, 15 Jul 2020 at 10:01, Kunal Singh wrote: > Hi, > > I am using Tesseract 4 in an Android/iOS app. The OCR part is working > fine. But the app has a large installed size on the device
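
As a rough illustration of the same idea as *unzip -v*: an APK is a zip archive, so its entries can be listed and sorted by size to see what dominates the package. This is only a sketch; the function name `largest_entries` and the `top` parameter are made up for this example and are not part of any Tesseract or Android tooling.

```python
import sys
import zipfile


def largest_entries(archive_path, top=10):
    """Return (name, compressed_size, uncompressed_size) for the biggest entries.

    Entries are ordered by compressed size, largest first, because the
    compressed size is what contributes to the size of the archive itself.
    """
    with zipfile.ZipFile(archive_path) as zf:
        infos = sorted(zf.infolist(), key=lambda i: i.compress_size, reverse=True)
        return [(i.filename, i.compress_size, i.file_size) for i in infos[:top]]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python apk_sizes.py app-release.apk
    for name, packed, unpacked in largest_entries(sys.argv[1]):
        print(f"{packed:>12} {unpacked:>12}  {name}")
```

The listing only shows where the bytes are; deciding which entries can be dropped or shrunk still depends on the app.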

[tesseract-ocr] Tesseract makebox config with known lines of text

2020-07-15 Thread a.f...@sheffield.ac.uk
I'm using a loop around "tesseract $X $X batch.nochop makebox" to produce box files to be corrected and re-used for training, and have two questions. Is there a way to make it produce the line-by-line format (rather than character-by-character) that newer versions of tesseract support as
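
For readers following along, the loop described above could look roughly like the sketch below. It only reproduces the command quoted in the post (`tesseract $X $X batch.nochop makebox`); the helper names, the `*.tif` pattern and the `dry_run` switch are assumptions made for this example, and it does not address the line-by-line format the post asks about.

```python
import subprocess
from pathlib import Path


def makebox_command(image):
    """Build the argument list for: tesseract $X $X batch.nochop makebox"""
    image = str(image)
    # The image path is passed twice, as in the quoted command: once as the
    # input and once as the output base name.
    return ["tesseract", image, image, "batch.nochop", "makebox"]


def run_makebox(image_dir, pattern="*.tif", dry_run=True):
    """Build (and optionally run) one makebox command per matching image.

    With dry_run=True nothing is executed; the commands are only returned,
    which makes it easy to check them before touching any files.
    """
    commands = [makebox_command(p) for p in sorted(Path(image_dir).glob(pattern))]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # raises if tesseract exits non-zero
    return commands
```

Calling `run_makebox("pages")` returns the commands without running them; passing `dry_run=False` executes them and requires `tesseract` to be on the PATH.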

[tesseract-ocr] Large app size of a tesseract app on Android

2020-07-15 Thread Kunal Singh
Hi, I am using Tesseract 4 in an Android/iOS app. The OCR part is working fine. But the app has a large installed size on the device (about 90 MB). I am only using Tesseract at this point, and don't have any other assets like images, etc. What can be causing such large app sizes, and is there

Re: [tesseract-ocr] Tesseract makes different predictions on seemingly equal images. How to make it more robust?

2020-07-15 Thread Tuan Ardouin
You need to apply some pre-processing to your image. On Wednesday, July 15, 2020 at 9:01:14 AM UTC+2, MysteriousGuy wrote: > > Hi. Latest stable version (4.1.1) produces the same error. > > On Tuesday, July 14, 2020 at 17:13:40 UTC+3, zdenop wrote: >> >> Try to use the latest version of
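
The post does not say which pre-processing is meant. As one possible example only, the sketch below shows binarisation (a step named elsewhere in this thread) using Otsu's method, written in plain Python over a flat list of 8-bit grayscale values. The function names are invented for this illustration, and nothing here is claimed to fix the reported case.

```python
def otsu_threshold(pixels):
    """Return a threshold t (0..255) that maximises between-class variance.

    `pixels` is a flat sequence of integers in 0..255. Values <= t form one
    class and values > t the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of those pixel values
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break               # upper class empty from here on
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2   # proportional to between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarise(pixels, threshold):
    """Map every pixel to 0 (<= threshold) or 255 (> threshold)."""
    return [255 if p > threshold else 0 for p in pixels]
```

In practice an imaging library would normally do this on a real image; the pure-Python version is only meant to make the step concrete.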

Re: [tesseract-ocr] Tesseract makes different predictions on seemingly equal images. How to make it more robust?

2020-07-15 Thread MysteriousGuy
Hi. Latest stable version (4.1.1) produces the same error. On Tuesday, July 14, 2020 at 17:13:40 UTC+3, zdenop wrote: > > Try to use the latest version of tesseract. > > Zdenko > > > On Tue, 14 Jul 2020 at 16:04, MysteriousGuy > > wrote: > >> I am using Tesseract to extract text from images

Re: [tesseract-ocr] tessaract ocr on capcha images--how to perform well?

2020-07-15 Thread Omar Hasan
It is okay. I was just experimenting with how to recognize characters in these worst-case scenarios. Thank you for your response. On Wednesday, July 15, 2020 at 12:08:56 PM UTC+6, zdenop wrote: > > there is absolutely no intention to help you (or others) to use OCR for > breaking captcha. > >

Re: [tesseract-ocr] tessaract ocr on capcha images--how to perform well?

2020-07-15 Thread Zdenko Podobny
There is absolutely no intention to help you (or others) to use OCR for breaking captcha. Zdenko On Tue, 14 Jul 2020 at 19:53, Omar Hasan wrote: > Hello! I am trying to run OCR on captcha images. Well, for normal images > Tesseract performs well, but for the images attached below, it performs