Re: [tesseract-ocr] Tesseract not recognizing ancient language's code

2020-03-09 Thread aby tesh
Hey, I followed the steps in the readme file and started lstmtraining, but it seems my current computer's processor can't handle the training for a longer period of time. What can I do about it? When should I abort the training to get a good traineddata file? or is there one which is

[tesseract-ocr] Tesseract 4 + OpenCL?

2020-03-09 Thread Matt Chapman
Does Tesseract 4.0+ work with OpenCL? I'm going through this guide to install, but some strange dependency issues make me think it doesn't. The documentation doesn't mention the tesseract version, which makes me think the

Re: [tesseract-ocr] Tesseract not recognizing ancient language's code

2020-03-09 Thread Shree Devi Kumar
https://github.com/tesseract-ocr/tessdoc/blob/master/TrainingTesseract-4.00.md#hardware-software-requirements On Tue, Mar 10, 2020, 03:41 aby tesh wrote: > Hey, > > I followed the steps in the readme file, and I started the lstmtraining, > but it seems my current computer's processor can't

Re: [tesseract-ocr] Tesseract not recognizing ancient language's code

2020-03-09 Thread Shree Devi Kumar
If you can share a large enough training text and fonts, I can rerun the training. On Tue, Mar 10, 2020, 03:41 aby tesh wrote: > Hey, > > I followed the steps in the readme file, and I started the lstmtraining, > but it seems my current computer's processor can't handle the training for > a

[tesseract-ocr] Tesseract unable to read simple image correctly

2020-03-09 Thread Velectico Consulting
*Environment* Tesseract Version: tesseract v5.0.0-alpha.20200223 Platform: Windows 64-bit *Problem:* The attached image below is not read correctly with the language set to English. Not being a pro, I tried some preprocessing of the image as suggested in the following link but it did not help.

[tesseract-ocr] Re: Ban some characters on tessseract ( '/' , '|' , ',' , ...)

2020-03-09 Thread Shubhranshu Panda
You can also use a regex to filter them out. On Friday, 6 March 2020 18:19:41 UTC+5:30, Guillaume de Rybel wrote: > > Hi, my work is to recognize license plates, and sometimes tesseract > recognizes some special characters. I need to 'ban' those characters : '/' , > '|' , ',' . > May I have some help ? > >
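A minimal sketch of the regex suggestion above, assuming the OCR result is already available as a Python string. The function name `strip_banned` and the sample plate string are hypothetical and only illustrate the idea; the banned characters are the three named in the question.

```python
import re

# Characters to drop from the OCR output: '/', '|' and ',' (from the question).
BANNED = re.compile(r"[/|,]")


def strip_banned(ocr_text: str) -> str:
    """Return ocr_text with every banned character removed."""
    return BANNED.sub("", ocr_text)


# Hypothetical OCR output for a license plate, not real Tesseract output.
print(strip_banned("AB|12/3,CD"))
```

This only cleans the text after recognition; it does not change what the engine itself recognizes.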

Re: [tesseract-ocr] Supplying a different DPI param per page

2020-03-09 Thread Zdenko Podobny
Just a quick reply (I did not test it :-) ): - TIFF is a "container of images" and AFAIK each image can have its own resolution (DPI is just information for correct printing/displaying of the image) - tesseract should read a multi-page TIFF image by image and process each image individually

Re: [tesseract-ocr] Tesseract unable to read simple image correctly

2020-03-09 Thread Zdenko Podobny
Please tell us what you have already tried from the tesseract documentation. Zdenko On Mon, 9 Mar 2020 at 10:02, Velectico Consulting wrote: > *Environment* > Tesseract Version: tesseract v5.0.0-alpha.20200223 > Platform: Windows 64-bit > > *Problem: * > The attached image below is not read

[tesseract-ocr] [SOLVED] Using different resolutions for OCR-ing and background image (Was: New user's questions)

2020-03-09 Thread Dr Rainer Woitok
Greetings, On Friday, 2020-02-21 18:18:21 +0100, I myself wrote: > ... > after playing a while with "tesseract" and after having read plenty of > manual pages and documentation on the web I still have some questions. > I want to create a PDF file with an OCR layer, but: > > 1. Some of my

Re: [tesseract-ocr] Tesseract not recognizing ancient language's code

2020-03-09 Thread aby tesh
Ohh, a very nice repo, I will check it out and get back to you. Thanks!