[tesseract-ocr] Re: Unable to read the circled text from an image

2018-02-15 Thread Mateusz Dudek
Hello, In brief: 1) I want to extract only "Text2" from this picture. Right now I use tesseract 4.0 to extract all the text from the image, so I get: "text1 @" (I don't know why it returns @, maybe because "text2" is in a circle) -> Then I'm using imagemagick to clear colors from image -> then read image

[tesseract-ocr] bank card OCR

2018-02-15 Thread Olivier Demin
Hi all. I'm completely new to tesseract, so please excuse any "dummy" questions. You're free to give "dummy" answers as well :-) I would like to OCR pictures of bank cards in order to extract bank account numbers. I can easily post-process the recognized text with regular
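
The post-processing step mentioned above could look roughly like the following. This is only a sketch, assuming the truncated sentence refers to regular expressions; the pattern (12 to 19 digits, optionally separated by single spaces or hyphens) and the function name `extract_number_candidates` are hypothetical choices for illustration, not something taken from the thread.

```python
import re
from typing import List

# Hypothetical pattern: 12 to 19 digits, each optionally preceded by a single
# space or hyphen, and not embedded in a longer run of digits.
_CANDIDATE = re.compile(r"(?<!\d)\d(?:[ -]?\d){11,18}(?!\d)")


def extract_number_candidates(ocr_text: str) -> List[str]:
    """Return digit-only strings for every number-like run found in OCR text."""
    candidates = []
    for match in _CANDIDATE.finditer(ocr_text):
        # Drop the separators so "4000 1234 ..." and "4000-1234-..." compare equal.
        candidates.append(re.sub(r"[ -]", "", match.group(0)))
    return candidates
```

The function takes whatever text the OCR step returned and yields candidates only; the accepted length range would need adjusting to the number format actually printed on the cards.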

Re: [tesseract-ocr] Error in training Tesseract 4.0. Training gets completed somehow but then the output it gives after reading the pdf is incorrect.

2018-02-15 Thread ShreeDevi Kumar
> I have fixed the Langdata folder now. And also the previous files are different from the file now. Look at the error messages. Search for 'Failed'. You now have more langdata related errors.

Re: [tesseract-ocr] Error in training Tesseract 4.0. Training gets completed somehow but then the output it gives after reading the pdf is incorrect.

2018-02-15 Thread Adarsh Shukla
Thanks a lot for replying, Shree. I will be asking more questions in the future because of people like you. I'll get back to you if the problem still exists. Thanks a lot. Regards, Adarsh

Re: [tesseract-ocr] When using text2image for training, I get the error: Could not find font named... how can I know the correct name of a font?

2018-02-15 Thread ShreeDevi Kumar
You can check the fonts available on your system by using --find_fonts with text2image, to find the font names used by tesseract. Example command with output (please modify the paths to match your setup): text2image --find_fonts --text ./langdata/eng/eng.training_text --outputbase ./langdata/eng/
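
Once a font listing is available, picking out the name that Pango actually recognises can be scripted. The sketch below assumes a listing in which each line has the shape `<index>: <font name>`, which is what `text2image --list_available_fonts` is expected to print; that line shape, and the helper names `parse_font_listing` and `rank_font_matches`, are assumptions made for illustration.

```python
import re
from typing import List

# Assumed line shape: "<index>: <font name>", for example "  0: Arial Bold".
_FONT_LINE = re.compile(r"^\s*\d+:\s*(?P<name>.+?)\s*$")


def parse_font_listing(listing: str) -> List[str]:
    """Collect font names from a listing, skipping lines that do not match."""
    names = []
    for line in listing.splitlines():
        match = _FONT_LINE.match(line)
        if match:
            names.append(match.group("name"))
    return names


def rank_font_matches(wanted: str, available: List[str]) -> List[str]:
    """Order available names by how many words they share with `wanted`.

    Word order and case are ignored; names sharing no word are dropped.
    """
    wanted_words = set(wanted.lower().split())
    scored = [(len(wanted_words & set(name.lower().split())), name)
              for name in available]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [name for score, name in ranked if score > 0]
```

Feeding `parse_font_listing` the captured output of the listing command and then calling `rank_font_matches` with the name that was rejected gives a short list of spellings to try with `--font`.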

Re: [tesseract-ocr] Been able to create tessdata from a text and a font, but can I do it from an image?

2018-02-15 Thread ShreeDevi Kumar
Depends on what version of tesseract you are using. tesseract can be used to make box files which work well with 3.0x. Training with images is not supported for 4.0alpha.

Re: [tesseract-ocr] Re: Tesseract recognition accuracy is low

2018-02-15 Thread ShreeDevi Kumar
Read the wiki pages about improving the quality of your input images. Also try with the latest tesseract code and traineddata files from github.

Re: [tesseract-ocr] Error in training Tesseract 4.0. Training gets completed somehow but then the output it gives after reading the pdf is incorrect.

2018-02-15 Thread ShreeDevi Kumar
You are missing langdata files. Failed to load script unicharset from:/home/adarsh/tesseract/langdata/Latin.unicharset Failed to read data from: /home/adarsh/tesseract/langdata/radical-stroke.txt Error reading radical code table /home/adarsh/tesseract/langdata/radical-stroke.txt Even after you

[tesseract-ocr] When using text2image for training, I get the error: Could not find font named... how can I know the correct name of a font?

2018-02-15 Thread Ernesto Borio
When using text2image for training, I get the error: $ text2image --text=charset.txt --outputbase=[eng].[HeroicCondensedBoldRegular].exp0 --font='Heroic Condensed Bold Regular' --fonts_dir=. (process:29818): Pango-WARNING **: couldn't load font "Heroic Bold Condensed", modified

[tesseract-ocr] Re: Tesseract recognition accuracy is low

2018-02-15 Thread tofailshaheen
Same problem. If you find a solution, please share it with me. Thanks in advance. On Wednesday, February 7, 2018 at 5:52:42 PM UTC+5, Niti Rohilla wrote: > Hi All, > I am using tesseract for OCR but I am not getting higher accuracy. For some characters results are completely wrong for

[tesseract-ocr] Been able to create tessdata from a text and a font, but can I do it from an image?

2018-02-15 Thread Ernesto Borio
I used jTessBoxEditor to create tessdata from a text and a font file. It works pretty well. However, I need to OCR photos of printed labels. The font is very similar to the one used on the labels, but I guess it would be better to train tesseract directly on the label letters themselves. It's

[tesseract-ocr] OCR in only application

2018-02-15 Thread Stanley Denman
I want to create an online application that takes a large pdf file and extracts information that is valuable for the user. The key to the application is going to be speed - I basically want to provide a minimal service for free that builds up an e-mail address. I know when I OCRed