[tesseract-ocr] Anyone willing to answer the question I ask. Tesseract is not giving Decimals for numbers having 9 or more digits.

2018-03-26 Thread adarsh shukla
Will anyone care to answer my question? I am in urgent need of answers, but nobody seems to bother. @shree, I hope you can offer some insight into this. -- You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To unsubscribe from this group and stop

[tesseract-ocr] Re: Recognize random characters with tesseract

2018-03-16 Thread adarsh shukla
I ran tesseract on your image seite1_1.PNG and got the same results you showed. Using an appropriate page segmentation mode (PSM) seems to resolve the issue. For example: tesseract -psm 6 /home/adarsh/Downloads/seite1_1.PNG out This gives the following output: AAEAYj
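The command in this reply can also be assembled from a script. Below is a minimal Python sketch that only builds the argument list and does not run Tesseract. The function name `build_tesseract_command` and the `legacy_flag` parameter are hypothetical names introduced for illustration. The `-psm` spelling is the one used in the message above; Tesseract 4.x spells the same option `--psm`.

```python
def build_tesseract_command(image_path, output_base, psm=6, legacy_flag=True):
    """Build an argv list for a tesseract call with a page segmentation mode.

    legacy_flag=True uses "-psm" as in the message above (3.x style);
    legacy_flag=False uses "--psm" (4.x style). Both names here are
    illustrative helpers, not part of Tesseract itself.
    """
    flag = "-psm" if legacy_flag else "--psm"
    # Same argument order as the command quoted in the message.
    return ["tesseract", flag, str(psm), image_path, output_base]


if __name__ == "__main__":
    cmd = build_tesseract_command("/home/adarsh/Downloads/seite1_1.PNG", "out")
    print(" ".join(cmd))
```

The resulting list could be handed to `subprocess.run` on a machine where Tesseract is installed.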

[tesseract-ocr] Re: tesseract 4.00 beta is released ? I saw the who use the tesseract 4.00 beta

2018-03-12 Thread adarsh shukla
There is no official release of Tesseract 4.0 beta. There might be an unofficial release, but I could not find one on Google. On Monday, March 12, 2018 at 10:17:35 AM UTC+5:30, 이경준 wrote: > > tesseract 4.00 beta is released ? I saw the who use the tesseract 4.00 > beta (in the github issue)

[tesseract-ocr] Tesseract reading "-" as "~" and reading "|" as "/".

2018-03-07 Thread adarsh
Tesseract is reading the complete file properly, but in some places it reads "-" as "~". I have Ubuntu 16.04 LTS and Tesseract 4.0 alpha. Can anyone help ASAP?

[tesseract-ocr] Re: Differentiate "I" and "|" in Tesseract.

2018-02-28 Thread adarsh
to train my tesseract to differentiate similar characters like > "I", "l" and "|". > > There are errors at some places in the pdf. I hope that someone helps. > > Thanks in advance. > > Adarsh

[tesseract-ocr] Differentiate "I" and "|" in Tesseract.

2018-02-28 Thread adarsh
I want help to train my tesseract to differentiate similar characters like "I", "l" and "|". There are errors at some places in the pdf. I hope that someone helps. Thanks in advance. Adarsh

[tesseract-ocr] Re: only one symbol recognized incorrectly, rest all are read perfectly. Kindly help

2018-02-27 Thread adarsh
rror. Correct output should be "/ > 725775" and the incorrect output given is "| 725775 ". > And in the rest of the page, all other text is recognized correctly. > > @shree help needed. > > Regards and thanks in advance > Adarsh

[tesseract-ocr] only one symbol recognized incorrectly, rest all are read perfectly. Kindly help

2018-02-27 Thread adarsh
hted text contains the error. Correct output should be "/ 725775" and the incorrect output given is "| 725775 ". And in the rest of the page, all other text is recognized correctly. @shree help needed. Regards and thanks in advance Adarsh

Re: [tesseract-ocr] Tesseract is giving column data on the last line of file

2018-02-26 Thread adarsh
Thanks a lot, shree. On Monday, February 26, 2018 at 2:04:04 PM UTC+5:30, shree wrote: > > try > > -c page_separator= "\n" > > or the code for CRLF >
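As an alternative to setting `page_separator` on the command line, the form feed character (U+000C, the "000c" box mentioned elsewhere in this thread) can be replaced after the fact in the text file Tesseract writes. The following is a minimal Python sketch of that post-processing step. It is not the method suggested by shree, and the function name `replace_page_separator` is a hypothetical helper.

```python
FORM_FEED = "\f"  # U+000C, the page separator character discussed in this thread


def replace_page_separator(text, replacement="\n"):
    """Return text with every form feed replaced by `replacement`."""
    return text.replace(FORM_FEED, replacement)


if __name__ == "__main__":
    sample = "first page\fsecond page\f"
    # repr() makes the newline characters visible in the printed result.
    print(repr(replace_page_separator(sample)))
```

Passing `replacement=""` instead would drop the separator entirely.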

Re: [tesseract-ocr] Tesseract is giving column data on the last line of file

2018-02-25 Thread adarsh
Can you please suggest a way to print a newline instead of FF? I am able to print any character other than form feed by using the " -c page_separator="Hello" " option, but I don't know how to print a newline. Thanks in advance. Regards Adarsh On Friday, February 23, 2018

Re: [tesseract-ocr] Tesseract is giving column data on the last line of file

2018-02-22 Thread adarsh
Is there any way to remove the End of page symbol that appears in the image? It looks like a box with some 000c written at the end. Regards Adarsh On Thursday, February 22, 2018 at 4:22:21 PM UTC+5:30, shree wrote: > > What --psm are you using? > > Tesseract might be treat

[tesseract-ocr] Re: tesseract on Google Summer of Code

2018-02-22 Thread adarsh
Hi Carlos, I am a final-year student pursuing a bachelor's degree in CSE and currently working at a company on a Tesseract project. I have done fairly extensive work with Tesseract, including the Tesseract 4.0 alpha version. Hope to hear from you soon. Regards Adarsh SHUKLA On Thursday, February 22

Re: [tesseract-ocr] Tesseract is giving column data on the last line of file

2018-02-22 Thread adarsh
@shree You are awesome. Your suggestion solved the problem straight away. I really appreciate your help. You have responded whenever I needed it. :) Keep up the good work. Regards Adarsh SHUKLA On Thursday, February 22, 2018 at 4:22:21 PM UTC+5:30, shree wrote: > >

[tesseract-ocr] Tesseract is giving column data on the last line of file

2018-02-22 Thread adarsh

[tesseract-ocr] Re: bank card OCR

2018-02-21 Thread adarsh
. The text output then can be expected to be okay. Keep up the Good work, Regards Adarsh On Thursday, February 15, 2018 at 5:33:44 PM UTC+5:30, Olivier Demin wrote: > > Hi all. I'm completely new to tesseract, so please apologise for > potential "dummy" questions. You're free to m

[tesseract-ocr] Re: Read Local Charter (Hindi , Tamil, Sinhala)

2018-02-21 Thread adarsh
Hi Aruna, I need to know which languages you have installed libraries for, and which version of Tesseract you are using. Regards Adarsh On Wednesday, February 21, 2018 at 12:53:59 PM UTC+5:30, Aruna Gamage wrote: > > Dear Sir, > > I need to read local language. (Hindi , Ta

[tesseract-ocr] Re: How to limit the number of characters in Tesseract 3.0.2 C#?

2018-02-21 Thread adarsh
Hi Abraham, I need you to explain the problem more precisely so that I can think of a possible solution. Regards Adarsh On Thursday, February 22, 2018 at 1:03:41 AM UTC+5:30, Abraham Rivera wrote: > > I am doing OCR with card id like the one I upload in this post, the only >

[tesseract-ocr] Re: bank card OCR

2018-02-21 Thread adarsh
Olivier, you would need to change your image processing before sending the image to Tesseract. Training will not help Tesseract read this image any better, because the pixels around the text are noise that must be removed before the image is sent to Tesseract. On Thursday, February 15, 2018 at

Re: [tesseract-ocr] Error in training Tesseract 4.0. Training gets completed somehow but then the output it gives after reading the pdf is incorrect.

2018-02-15 Thread Adarsh Shukla
Thanks a lot for replying, shree. I will be asking more questions in future because of people like you. I'll get back to you if the problem persists. Thanks a lot. Regards Adarsh REGARDS ADARSH SHUKLA Junior Developer Trainee *TURNING CLOUD SOLUTIONS +91 9717783099* On Thu, Feb 15, 2018 at 1:34 PM

[tesseract-ocr] Re: Hide names in a scan

2018-02-14 Thread adarsh
someone read out the image that you will provide and then differentiate the names from other text. Hope you get what I mean. You could look into NLP for this purpose. Get back to me with any queries. Keep up the good work. Regards Adarsh On Tuesday, February 13, 2018 at 7:33:43 PM UTC+5:30

[tesseract-ocr] Re: tesseract to recognize the cropped digits

2018-02-14 Thread adarsh
Hi Abhishek, hope you are doing well. What you need to do is pre-process the image. Convert it to a binary image or invert the colors to black and white. Then you can use OpenCV to draw a contour around the number. That contour can be cut out and sent to Tesseract. I can provide you
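The pre-processing idea in this reply (binarize, locate the digits, cut that region out) can be illustrated without any imaging library. The sketch below is a simplified, pure-Python stand-in and not the OpenCV approach itself: it thresholds a grayscale grid and crops to the bounding box of all dark pixels instead of tracing a contour. The function names and the threshold of 128 are assumptions chosen for illustration.

```python
def binarize(gray, threshold=128):
    """Return a 0/1 grid where 1 marks pixels darker than the threshold (ink)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def bounding_box(mask):
    """Return (top, left, bottom, right) around all ink pixels, or None if empty.

    bottom and right are exclusive, so they can be used directly in slices.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(grid, box):
    """Cut the boxed region out of a grid of pixel values."""
    top, left, bottom, right = box
    return [row[left:right] for row in grid[top:bottom]]


if __name__ == "__main__":
    gray = [
        [255, 255, 255, 255],
        [255, 0, 0, 255],
        [255, 255, 0, 255],
        [255, 255, 255, 255],
    ]
    box = bounding_box(binarize(gray))
    print(box, crop(gray, box))
```

On a real image, the cropped region would be saved to a file and passed to Tesseract. A contour-based version would handle several separate digits, which this single bounding box does not.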

[tesseract-ocr] Error in training Tesseract 4.0. Training gets completed somehow but then the output it gives after reading the pdf is incorrect.

2018-02-14 Thread adarsh
adarsh@adarsh-X555LJ:~/tesseract$ training/tesstrain.sh --fonts_dir /usr/share/fonts --lang eng --noextract_font_properties --langdata_dir /home/adarsh/tesseract/langdata --training_text /home/adarsh/tesseract/langdata/eng/eng.training_text --linedata_only --tessdata_dir /home/tessdata

Re: [tesseract-ocr] Training a new font with tesstrain.sh failed at phase M

2018-02-12 Thread adarsh
Hello shree, I am running into the same issue, and the error is shown below: /home/adarsh/tesseract/font_properties does not exist or is not readable On Tuesday, September 27, 2016 at 11:20:14 AM UTC+5:30, shree wrote: > > Are you trying to train for English language with an Arabi

Re: [tesseract-ocr] ERROR: Could not find training text file

2018-01-29 Thread adarsh
eng --linedata_only --noextract_font_properties --langdata_dir ../langdata --tessdata_dir ./tessdata --output_dir ~/tesstutorial/engtrain --training_text ../home/adarsh/tes1/tesseract/langdata/eng/eng.training_text \ On Monday, July 31, 2017 at 5:10:14 PM UTC+5:30, shree wrote: > >