Re: [tesseract-ocr] Could anyone help me about pytessract?

2019-09-18 Thread luffy monky
Sorry, I can't understand why the output is empty. Other code that works the same way does output a string, but it shows 03 instead of 09. I just want to debug these problems. -- You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To unsubscribe

Re: [tesseract-ocr] Could anyone help me about pytessract?

2019-09-18 Thread Zdenko Podobny
What have you tried from what is already published here in the forum and on the wiki? Zdenko On Thu, 19 Sep 2019 at 5:52, luffy monky wrote: > Hi all, > I tried to use sample code from Google. > But it shows nothing in my code. > Could I trouble you for any advice? > Here is my sample code >

[tesseract-ocr] Could anyone help me about pytessract?

2019-09-18 Thread luffy monky
Hi all, I tried to use sample code from Google, but it shows nothing in my code. Could I trouble you for any advice? Here is my sample code:
    import pytesseract
    from PIL import Image
    image = Image.open("test3.jpg")
    code = pytesseract.image_to_string(image)

Re: [tesseract-ocr] Tutorial for fine-tuning Tesseract 4 for a new font?

2019-09-18 Thread shree
Please see https://github.com/ameera3/OCR_Expiration_Date On Wednesday, September 18, 2019 at 5:47:55 PM UTC+5:30, Jochen Naumann wrote: > > thanks, I read the recent post, but it only pointed to a tutorial using > old tesseract 3. > However, I have found a wonderful youtube tutorial, so if an

Re: [tesseract-ocr] problems with upper-case character

2019-09-18 Thread Zdenko Podobny
IMO the only solution is to send longer text for OCR (e.g. a paragraph). Zdenko On Wed, 18 Sep 2019 at 17:19, 'Sandra M.' via tesseract-ocr < tesseract-ocr@googlegroups.com> wrote: > I'm using Tesseract with Python. I have an image with 1-6 words in it and > need to read the text. Sometimes the charact

Re: [tesseract-ocr] problems with upper-case character

2019-09-18 Thread Timothy Snyder
No configs I know of but I have similar functionality implemented in a text post-processing step in my OCR pipeline. On Wed, Sep 18, 2019 at 11:19 AM 'Sandra M.' via tesseract-ocr < tesseract-ocr@googlegroups.com> wrote: > I'm using Tesseract with Python. I have an image with 1-6 words in it and

[tesseract-ocr] problems with upper-case character

2019-09-18 Thread 'Sandra M.' via tesseract-ocr
I'm using Tesseract with Python. I have an image with 1-6 words in it and need to read the text. Sometimes the character "C", which looks the same in upper and lower case, is detected as lower-case c instead of upper-case C. I see the problem, but in the context of the following letters it should b
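The post-processing step mentioned in the reply above is not shown in the thread, so the following is only a minimal sketch of one possible rule: if every unambiguous letter in a word is upper case, upper-case the ambiguous ones too. The function names and the AMBIGUOUS letter set are hypothetical, chosen for illustration, and the rule covers only all-caps words.

```python
import re

# Letters whose upper- and lower-case shapes differ mainly in size, so OCR
# may confuse them. This set is an illustrative assumption, not a Tesseract list.
AMBIGUOUS = set("cosuvwxz")


def fix_case(word: str) -> str:
    """Upper-case ambiguous lower-case letters inside an otherwise upper-case word."""
    others = [ch for ch in word if ch.isalpha() and ch not in AMBIGUOUS]
    # Act only when there is at least one unambiguous letter and all are upper case.
    if others and all(ch.isupper() for ch in others):
        return "".join(ch.upper() if ch in AMBIGUOUS else ch for ch in word)
    return word


def fix_text(text: str) -> str:
    """Apply fix_case to every whitespace-separated word, keeping the spacing."""
    return re.sub(r"\S+", lambda m: fix_case(m.group(0)), text)
```

A word such as "cHOCOLATE" would be corrected, while mixed-case or all-lower-case words are left alone. Deciding the case of a word-initial "c" in a mixed-case word would need a dictionary or similar context, which this sketch does not attempt.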

Re: [tesseract-ocr] Next problem with training (tesseract 4.0)

2019-09-18 Thread J Adam Funk
Those look very useful --- thanks! On Tuesday, 17 September 2019 16:38:19 UTC+1, shree wrote: > > config files are there for some languages. They will be in langdata or > langdata_lstm repos. radical_stroke.txt is also there. > > You can also look at training instructions in wiki or in > shreeshrii

[tesseract-ocr] Difference between --psm 1 and --psm 3 (default)?

2019-09-18 Thread Pranas Žiaukas
Page segmentation modes:
  0  Orientation and script detection (OSD) only.
  1  Automatic page segmentation with OSD.
  2  Automatic page segmentation, but no OSD, or OCR. (not implemented)
  3  Fully automatic page segmentation, but no OSD. (Default)
  4  Assume a single column of t
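One way to see what the two modes do differently on a given page is to run both and compare the text. The sketch below assumes the pytesseract wrapper and Pillow are installed; build_config and compare_modes are hypothetical helper names, and the 0 to 13 range is the one Tesseract 4 documents for --psm.

```python
from typing import Dict, Iterable


def build_config(psm: int) -> str:
    """Return the Tesseract option string for a page segmentation mode."""
    if not 0 <= psm <= 13:
        raise ValueError(f"psm must be between 0 and 13, got {psm}")
    return f"--psm {psm}"


def compare_modes(path: str, modes: Iterable[int] = (1, 3)) -> Dict[int, str]:
    """OCR the same image once per mode and return the text keyed by mode."""
    # Imported lazily so build_config works without the OCR stack installed.
    import pytesseract
    from PIL import Image

    with Image.open(path) as image:
        return {
            psm: pytesseract.image_to_string(image, config=build_config(psm))
            for psm in modes
        }
```

Calling compare_modes("page.png") (the file name is a placeholder) returns one string per mode, which can then be diffed. Note that mode 1 needs the OSD traineddata to be available.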

Re: [tesseract-ocr] Tutorial for fine-tuning Tesseract 4 for a new font?

2019-09-18 Thread Jochen Naumann
Thanks, I read the recent post, but it only pointed to a tutorial using the old Tesseract 3. However, I have found a wonderful YouTube tutorial, so if anyone reads this and is also interested: https://www.youtube.com/watch?v=TpD76k2HYms&t=314s On Wed, 18 Sep 2019 at 12:29, Shree Devi Ku

Re: [tesseract-ocr] Small script to generate all boxes for ocrd-train

2019-09-18 Thread Shree Devi Kumar
Please submit as a PR to https://github.com/tesseract-ocr/tesstrain On Wed, Sep 18, 2019 at 4:08 PM Lorenzo Bolzani wrote: > > Hi, > I wrote this small script to speed up OCRD-train > training startup. > > It generates the boxes for all the images provided

[tesseract-ocr] Small script to generate all boxes for ocrd-train

2019-09-18 Thread Lorenzo Bolzani
Hi, I wrote this small script to speed up OCRD-train training startup. It generates the boxes for all the images provided on the command line (it works only for single line images). It is a simple conversion of the generate_line_box.py from ocrd-train. I used
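The script itself is not included in this snippet. As a rough illustration of the idea only, the sketch below builds box lines for a single-line image under the assumption that every character is given the bounding box of the whole image and that a tab entry closes the line. The function name line_boxes is hypothetical, and handling of combining characters is left out.

```python
from typing import List


def line_boxes(text: str, width: int, height: int) -> List[str]:
    """Build box-file lines for one text line that fills a width x height image."""
    # Assumed format: "<char> <left> <bottom> <right> <top> <page>",
    # with each character covering the full image.
    lines = [f"{ch} 0 0 {width} {height} 0" for ch in text]
    # A tab entry marks the end of the text line.
    lines.append(f"\t {width} {height} {width + 1} {height + 1} 0")
    return lines
```

In a batch version, the width and height would come from each image (for example via Pillow's Image.size) and the text from the matching ground-truth file.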

Re: [tesseract-ocr] Tutorial for fine-tuning Tesseract 4 for a new font?

2019-09-18 Thread Shree Devi Kumar
Please search the forum archive. There was a recent mention regarding dot-matrix font training. On Wed, Sep 18, 2019, 15:06 Jochen Naumann wrote: > Does somebody know a good tutorial of how to fine-tune tesseract for > 1. a new ttf font > 2. images of characters > > I am especially interested in tra

Re: [tesseract-ocr] Regarding box file creation using tesseract 4.0.0 and tesseract 5.0.0 (alpha version)

2019-09-18 Thread Shree Devi Kumar
Your installation of tesseract must be old. You need the config file lstmbox. On Wed, Sep 18, 2019, 15:11 isuri anuradha wrote: > Hi, > > when I try to create box files by following the > https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00#making-box-files > why error prompt

[tesseract-ocr] Regarding box file creation using tesseract 4.0.0 and tesseract 5.0.0 (alpha version)

2019-09-18 Thread isuri anuradha
Hi, when I try to create box files by following https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00#making-box-files, why does it report the error "read_params_file: Can't open lstmbox" and generate a .txt file instead of a box file?

[tesseract-ocr] Tutorial for fine-tuning Tesseract 4 for a new font?

2019-09-18 Thread Jochen Naumann
Does somebody know a good tutorial on how to fine-tune Tesseract for 1. a new ttf font 2. images of characters? I am especially interested in training files for dot-matrix fonts. Maybe somebody already has training data for dot-matrix based fonts? Thanks in advance

Re: [tesseract-ocr] Is there any way to load model(tesseract custom model) at ones instead of loading every time?

2019-09-18 Thread Zdenko Podobny
You have to use the Tesseract API, not the executable, for this. Zdenko On Wed, 18 Sep 2019 at 9:45, jitendra dubey wrote: > We have to pass the model name every time while predicting the text from images. > So I want to know any way to load tesseract custom generated model in > "in-Memory" and predict fast

Re: [tesseract-ocr] Is there any way to load model(tesseract custom model) at ones instead of loading every time?

2019-09-18 Thread Lorenzo Bolzani
Yes, you can create an instance and reuse it. You call Init once and just reuse it. Performance does improve. If you have multiple threads see this: https://stackoverflow.com/questions/4827924/is-tesseractan-ocr-engine-reentrant For multi threading I created a pool of instances. On each instance
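The pool mentioned above is not shown in the snippet. A minimal sketch of the general pattern, not the poster's code, could look like the following. InstancePool and its factory argument are hypothetical names; in practice the factory would construct and initialise one API object per call, for example through a Python binding such as tesserocr, which is an assumption since the thread does not name a binding.

```python
import queue
from contextlib import contextmanager


class InstancePool:
    """Fixed-size pool of pre-initialised objects shared between threads."""

    def __init__(self, factory, size):
        self._items = queue.Queue()
        for _ in range(size):
            # The expensive initialisation happens once per instance, here.
            self._items.put(factory())

    @contextmanager
    def borrow(self):
        """Hand out one instance and return it to the pool afterwards."""
        item = self._items.get()  # blocks until an instance is free
        try:
            yield item
        finally:
            self._items.put(item)
```

Each worker thread would wrap its OCR call in "with pool.borrow() as api:", so no instance is ever used by two threads at once and the model is loaded only as many times as the pool size.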

[tesseract-ocr] Is there any way to load model(tesseract custom model) at ones instead of loading every time?

2019-09-18 Thread jitendra dubey
We have to pass the model name every time while predicting the text from images. So I want to know if there is any way to load a custom Tesseract model into memory once and predict faster. Thanks in advance.