[tesseract-ocr] Re: Checking if Searchable or Image Only PDF

2016-05-11 Thread Gunasekaran Velu
Hi Rob, using iTextSharp you can check whether a PDF is already searchable. Regards, Guna On Tuesday, May 10, 2016 at 5:04:58 PM UTC+5:30, Robert Williams wrote: > Hi > Within code - is it possible to check if a PDF is already "searchable"? > We get documents from a third party and

[tesseract-ocr] Re: Failure on certain types of images

2016-05-11 Thread Gunasekaran Velu
Hi, what is your input image format, PNG or TIFF? If it is PNG, convert the image to TIFF and run Tesseract again. I had the same issue and that fixed it. Regards, Guna On Monday, March 12, 2012 at 9:52:35 PM UTC+5:30, TestTest wrote: > Does somebody have any idea how to solve this issue? >

[tesseract-ocr] Re: OCR Recognition for Underlined text

2016-04-16 Thread Gunasekaran Velu
Hi Tom, is it possible to use a config variable for underlined text images? Looking forward to your reply. Regards, Guna On Monday, March 7, 2016 at 6:08:03 AM UTC+5:30, Gunasekaran Velu wrote: > Hi > I just sent you an image I created myself in Paint. > Now I have attached

[tesseract-ocr] Detect table region and dump table images from images using tesseract3.02 or later

2016-04-02 Thread Gunasekaran Velu
Hi, I am using tesseract3.05.exe for Windows. My input image has tables and images. How can I detect the table regions and dump the table images using the following config variables: textord_dump_table_images 0 (Paint table detection output); textord_show_tables 0 (Show table regions)? Should I add these variable

[tesseract-ocr] Set value for config variable in tesseract 3.05

2016-03-31 Thread Gunasekaran Velu
Hi, I am using Tesseract 3.05 for Windows. My input images contain underlined text, and I need to remove the underlines before OCR for better accuracy. I am using the following command-line arguments in Tesseract 3.05 to remove the underlines: >tesseract.exe OriginalTest-1.tif under -l eng

Re: [tesseract-ocr] Re: Improve accuracy of underlined text

2016-03-28 Thread Gunasekaran Velu
[1]. B > > > > art > > --- > > 1. http://www.leptonica.com/line-removal.html > > > > *From:* tesser...@googlegroups.com [mailto: > tesser...@googlegroups.com ] *On Behalf Of *Gunasekaran Velu > *Sent:* Thursday, March 24, 2016 2:47 AM > *To:* tesseract

[tesseract-ocr] Re: Image to PDF(Searchabel text) in teseract 3.03

2016-03-24 Thread Gunasekaran Velu
Solved. Regards, Guna On Monday, March 21, 2016 at 1:26:24 PM UTC+5:30, Gunasekaran Velu wrote: > Hi > I am using Tesseract 3.03 on Windows. > Is it possible to convert a PNG image to a PDF (searchable text) using Tesseract 3.03? > Looking forward to your re

[tesseract-ocr] Re: Improve accuracy of underlined text

2016-03-24 Thread Gunasekaran Velu
Hi Art Rhyno, I have the same problem (underline issue). If you can, could you share your line removal code? Looking forward to your reply. Regards, Guna On Monday, March 21, 2016 at 1:48:38 AM UTC+5:30, akhil katpally wrote: > Some underlines in my image are very close to the text. For that >

[tesseract-ocr] Image to PDF(Searchabel text) in teseract 3.03

2016-03-21 Thread Gunasekaran Velu
Hi, I am using Tesseract 3.03 on Windows. Is it possible to convert a PNG image to a PDF (searchable text) using Tesseract 3.03? Looking forward to your reply. Regards, Guna -- You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To unsubscribe from this
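
For the PNG-to-searchable-PDF question above, one possible approach is to pass the `pdf` output config as the last argument on the tesseract command line. The sketch below is in Python and only assembles and (optionally) runs that command. It assumes a `tesseract` executable on the PATH whose tessdata includes the `pdf` config, which may not hold for every 3.03 build; the helper names `build_pdf_command` and `make_searchable_pdf` and the file names are hypothetical.

```python
import shutil
import subprocess


def build_pdf_command(image_path, output_base, lang="eng"):
    """Return the argv list for a run that should write <output_base>.pdf.

    Assumption: the installed tesseract ships a 'pdf' config file.
    """
    return ["tesseract", image_path, output_base, "-l", lang, "pdf"]


def make_searchable_pdf(image_path, output_base, lang="eng"):
    """Run tesseract if it is installed; return True when the run succeeded."""
    if shutil.which("tesseract") is None:
        return False  # tesseract is not on the PATH, nothing was run
    result = subprocess.run(build_pdf_command(image_path, output_base, lang))
    return result.returncode == 0
```

Calling `make_searchable_pdf("scan.png", "scan")` would, under these assumptions, leave the result in `scan.pdf`; the command list is kept in a separate function so it can be inspected without running anything.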

[tesseract-ocr] Re: OCR Recognition for Underlined text

2016-03-09 Thread Gunasekaran Velu
Hi Tom, any update regarding the underlined text problem? Regards, Guna On Monday, March 7, 2016 at 6:08:03 AM UTC+5:30, Gunasekaran Velu wrote: > Hi > I just sent you an image I created myself in Paint. > Now I have attached the real document (cropped from the full image due

[tesseract-ocr] Re: How to get check page orientaion

2016-03-06 Thread Gunasekaran Velu
it gets looked at (and > include your example files): > > https://github.com/tesseract-ocr/tesseract/issues > > Tom > > On Thursday, March 3, 2016 at 6:30:17 AM UTC-5, Gunasekaran Velu wrote: >> >> >> Hi Tom >> >> Thanks for your informat

[tesseract-ocr] Re: OCR Recognition for Underlined text

2016-03-06 Thread Gunasekaran Velu
On Saturday, March 5, 2016 at 10:42:18 PM UTC+5:30, Tom Morris wrote: > > On Saturday, March 5, 2016 at 5:11:55 AM UTC-5, Gunasekaran Velu wrote: >> >> >> >tesseract.exe Underline.png Underline -l eng -psm 1 >> >> Result: This is underline word @ >&

[tesseract-ocr] OCR Recognition for Underlined text

2016-03-05 Thread Gunasekaran Velu
Hi, >tesseract.exe Underline.png Underline -l eng -psm 1 Result: This is underline word @ Is it possible to run OCR on underlined text/words in an image, or does some image processing need to be applied to the image first? Sample image attached. Looking forward to your reply. Regards, Guna

[tesseract-ocr] How to get check page orientaion

2016-02-29 Thread Gunasekaran Velu
Hi, I have a multi-page document in which some pages are upright and some are rotated by 90 degrees. How can I detect the rotated pages, or get the orientation value for a particular page, so that I can rotate the page before OCR? Looking forward to your reply. Regards, Guna
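
One way to approach the orientation question above is Tesseract's orientation and script detection (OSD) mode, page segmentation mode 0 (`-psm 0` in 3.x). The Python sketch below does not call Tesseract; it only parses the text of an OSD report. It assumes the report contains a line of the form `Rotate: <degrees>`, which may not be present in every 3.0x build. The function names and the sample report are hypothetical, and the sample is not output captured from a real run.

```python
import re


def parse_osd_report(report):
    """Turn the 'Key: value' lines of an OSD report into a dict of strings."""
    fields = {}
    for line in report.splitlines():
        match = re.match(r"\s*([^:]+):\s*(.+?)\s*$", line)
        if match:
            fields[match.group(1).strip()] = match.group(2)
    return fields


def rotation_needed(report):
    """Return the 'Rotate' value in degrees, or None if the report lacks it."""
    value = parse_osd_report(report).get("Rotate")
    if value is not None and value.isdigit():
        return int(value)
    return None


# Hypothetical sample report (assumed format, not real program output).
SAMPLE = """Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 4.21
Script: Latin
Script confidence: 1.50
"""
```

With a value from `rotation_needed`, each page could be rotated by that amount before the normal OCR pass; pages for which it returns `None` would need to be handled separately.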

[tesseract-ocr] Re: OCR text problem using -psm 6 hocr

2016-01-29 Thread Gunasekaran Velu
rd your reply. Regards Guna On Thursday, January 28, 2016 at 11:05:28 PM UTC+5:30, Tom Morris wrote: > > > On Thursday, January 28, 2016 at 4:24:26 AM UTC-5, Gunasekaran Velu wrote: > > I am using following tesseract command to do the HOCR file for bmp image >> >> >

[tesseract-ocr] How can i get actual font name?

2015-04-17 Thread Gunasekaran Velu
Hi, I am working with Tesseract on Windows. I am using the following code to get the font name: const char* m_WordType = ri->WordFontAttributes(&is_bold, &is_italic, &is_underlined, &is_monospace, &is_serif, &is_smallcaps, &pointsize, &font_id); The output I got from the above line is Verdana_Bold, but the actual font name is

Re: [tesseract-ocr] Re: Cast word confidence success rate ?

2015-04-11 Thread Gunasekaran Velu
, Dmitri Silaev www.CustomOCR.com On Wed, Apr 8, 2015 at 1:21 PM, Gunasekaran Velu mail2...@gmail.com wrote: Really sorry for the mistake. I am getting a certainty value of 215 (positive value) from Tesseract for the text Name. Is your formula applicable to this certainty value

Re: [tesseract-ocr] Re: Cast word confidence success rate ?

2015-04-08 Thread Gunasekaran Velu
confusing certainty and confidence here. Please pay close attention to what you're writing or rephrase your question. The formula itself allows no values out of the [0, 100] range. Best regards, Dmitri Silaev www.CustomOCR.com On Wed, Apr 8, 2015 at 8:37 AM, Gunasekaran Velu mail2

[tesseract-ocr] Re: Cast word confidence success rate ?

2015-04-07 Thread Gunasekaran Velu
Hi Dmitri, does your formula apply only to negative confidence scores, or to all scores? I am getting a confidence score of 215 for Name. Is that correct, or do I need to do any calculation on it? Looking forward to your reply. Regards, Guna On Saturday, August 6, 2011 at 12:19:47 PM UTC+5:30, Dmitri

[tesseract-ocr] Re: Cast word confidence success rate ?

2015-04-07 Thread Gunasekaran Velu
Hi Dmitri, does your formula apply only to negative confidence scores, or to all scores? I am getting a confidence score of 215 (positive value) for Name. Is that correct, or do I need to do any calculation on it? Looking forward to your reply. Regards, Guna On Saturday, August 6, 2011 at 12:19:47 PM

[tesseract-ocr] Re: OCR.Init - Failed

2015-03-25 Thread Gunasekaran Velu
Hi, are you asking me? I am not getting the error while using eng. Regards, Guna On Wednesday, March 25, 2015 at 5:31:23 PM UTC+5:30, Dewald Human wrote: Getting the same issue with eng On Monday, 16 March 2015 13:58:46 UTC+2, Gunasekaran Velu wrote: Hi I am using following line

[tesseract-ocr] OCR.Init - Failed

2015-03-16 Thread Gunasekaran Velu
Hi, I am using the following line for OCR: ocr.Init("C:\\temp", "deu", false); The temp folder contains the tessdata folder for the German language, downloaded from https://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-ocr-3.02.deu.tar.gz&can=2&q= But the application crashes with an error.

[tesseract-ocr] Hot to check new line character

2015-03-11 Thread Gunasekaran Velu
Hi, I am using Tesseract OCR from .NET. How can I get the newline character when running OCR on a paragraph in an image? I do not get a \n character at line breaks. Kindly do the needful. Looking forward to your reply. Regards, Guna

[tesseract-ocr] How can i grouping based on font size or Next line

2015-03-09 Thread Gunasekaran Velu
Hi, I have attached an image. I need to split this image into 4 separate fields: paragraph 1 as group 1, the bottom-left line as group 2, the bottom-center line as group 3, and the bottom-right line as group 4. How can I do that? I am using Emgu.Cv.OCR (Tesseract). Kindly do the needful. Looking forward to your reply. Regards, Guna

[tesseract-ocr] OCR not accuracy

2015-02-13 Thread Gunasekaran Velu
Hi, I did not get accurate OCR output using Tesseract English for the attached image. The attached image is 96 DPI. Kindly do the needful. Looking forward to your reply. Regards, Guna

[tesseract-ocr] Re: How can i improve OCR for attached image

2015-02-13 Thread Gunasekaran Velu
Hi, thanks for the reply. Yes, the images are all of the same type; only some information changes, such as the patient ID and patient name. They are basically prescriptions. How can I improve the OCR for them? Please help me with that. Regards, Guna On Wednesday, February 11, 2015 at 12:13:21 PM UTC+5:30, Gunasekaran Velu

[tesseract-ocr] Re: OCR not accuracy

2015-02-13 Thread Gunasekaran Velu
Hi, thanks for the information. I will come back to you if I need any help. Thanks and regards, Guna On Saturday, February 14, 2015 at 10:59:20 AM UTC+5:30, Gunasekaran Velu wrote: Hi I did not get proper OCR using Tesseract English for attached image. The Attached image is 96 DPI. Kindly

[tesseract-ocr] OCR Region for Word instead of each letter

2015-02-13 Thread Gunasekaran Velu
Hi, when I run OCR on an image, it draws a box around each letter. Is it possible to draw the box (rectangle) around each word instead of each character? Looking forward to your reply. Regards, Guna
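
For the word-level boxes asked about above, one option is to read them from Tesseract's hOCR output instead of the per-character box output. The Python sketch below assumes hOCR markup in which each word is a `span` with class `ocrx_word` and a `title` attribute beginning with `bbox x0 y0 x1 y1`. The class `WordBoxParser`, the function `word_boxes`, and the sample markup are hypothetical illustrations, not output captured from a real run.

```python
import re
from html.parser import HTMLParser


class WordBoxParser(HTMLParser):
    """Collect (text, (x0, y0, x1, y1)) for every ocrx_word span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None   # bbox of the word currently being read, if any
        self._text = []     # text fragments seen inside the current word
        self._depth = 0     # spans nested inside the current word span

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self._bbox is None:
            if tag == "span" and attrs.get("class") == "ocrx_word":
                match = re.search(r"bbox (\d+) (\d+) (\d+) (\d+)",
                                  attrs.get("title") or "")
                if match:
                    self._bbox = tuple(int(v) for v in match.groups())
                    self._text = []
                    self._depth = 0
        elif tag == "span":
            self._depth += 1

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._bbox is None or tag != "span":
            return
        if self._depth:
            self._depth -= 1
            return
        self.words.append(("".join(self._text).strip(), self._bbox))
        self._bbox = None


def word_boxes(hocr):
    """Return the list of (text, bbox) pairs found in an hOCR string."""
    parser = WordBoxParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


# Hypothetical hOCR fragment in the assumed format.
SAMPLE = (
    "<span class='ocr_line' title='bbox 36 92 210 116'>"
    "<span class='ocrx_word' id='word_1_1' "
    "title='bbox 36 92 96 116; x_wconf 90'>This</span> "
    "<span class='ocrx_word' id='word_1_2' "
    "title='bbox 109 92 210 116; x_wconf 88'><strong>word</strong></span>"
    "</span>"
)
```

Each returned bounding box can be drawn as one rectangle per word; inline tags such as `strong` inside a word span are ignored and only their text is kept.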