Try the "tessedit_dump_pageseg_images T" config parameter. It dumps the image after the horizontal and vertical lines have been removed, and you can then run OCR on that image.
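For anyone who wants to script this, here is a minimal sketch in Python rather than anything official. It assumes that `tesseract` is on your PATH, that a config file is a plain text file with one "name value" pair per line, and that the path to such a file can be passed as the last command-line argument. The parameter spelling `tessedit_dump_pageseg_images` is what I know from the Tesseract source, so treat it as an assumption and check it against your version. The file name `dump_images.cfg` and the helper names are made up for illustration.

```python
import subprocess
from pathlib import Path

# Assumption: this is how the parameter is spelled in the Tesseract source.
# Adjust it if your build reports an unknown parameter.
DUMP_PARAM = "tessedit_dump_pageseg_images"


def write_config(path, param=DUMP_PARAM, value="T"):
    """Write a one-line config file of the form '<param> <value>'."""
    path = Path(path)
    path.write_text(f"{param} {value}\n")
    return path


def build_command(image, output_base, config_path, lang="eng"):
    """Build the argv list: tesseract <image> <output_base> -l <lang> <config>."""
    return ["tesseract", str(image), str(output_base), "-l", lang, str(config_path)]


def dump_segmentation_images(image, output_base, config_path="dump_images.cfg"):
    """Write the config file and run tesseract with it (needs tesseract installed)."""
    config = write_config(config_path)  # hypothetical file name by default
    subprocess.run(build_command(image, output_base, config), check=True)
```

Calling `dump_segmentation_images("table.jpg", "out")` should be equivalent to running `tesseract table.jpg out -l eng dump_images.cfg` by hand. Afterwards, look in the working directory for the dumped images and feed the one with the lines removed back into Tesseract.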
On Wednesday, April 9, 2014 9:47:41 AM UTC+5:30, ANBU J wrote:
>
> Thanks for the reply, Nick. I'm doing it. It is very hard to figure out the
> functionality of the methods without understanding the whole project. Since I
> have to find out what those header files do and how they relate, it is
> going to take a lot of time. I'd appreciate it if anyone could point me to
> where the output (the text extracted from the table) is passed, so that I
> can add HTML table tags to it and reproduce the table in HTML format.
>
> Anbu
>
> On Tuesday, April 8, 2014 9:08:30 PM UTC+5:30, Nick White wrote:
>>
>> Documentation for the internals of Tesseract is unfortunately rather
>> minimal, indeed. I'd recommend you take a look at the TableFinder
>> class in the code to figure it out. And please do share anything you
>> learn here!
>>
>> Nick
>>
>> On Mon, Apr 07, 2014 at 02:45:51AM -0700, ANBU J wrote:
>> > It's sad that we couldn't find any documentation for the table
>> > manipulation methods in Tesseract. Looks like I have to manually
>> > implement an algorithm to handle tables.
>> > If you have done it already, please share the knowledge.
>> >
>> > On Tuesday, 25 June 2013 14:42:46 UTC+5:30, [email protected] wrote:
>> >
>> >     Hi!
>> >
>> >     I'm going to work on a program that can recognize a table's
>> >     structure and the text in the table.
>> >     I tried to OCR the table image from the command line on Windows 7,
>> >     but the output text was very bad.
>> >
>> >     (Just like this: tesseract table.jpg out -l eng, or with "hocr")
>> >     I tried using TessBaseAPI in VC too (just a simple application).
>> >
>> >     The table lines (especially the column lines) interfere across the
>> >     whole image.
>> >
>> >     And now I have found the class "TableFinder" in the Tesseract source
>> >     code, but I can't find anything else about it on the Internet.
>> >     (Tesseract-OCR-3.02)
>> >     Are there no demos or tutorials?
>> >
>> >     I am new and sincerely hope to get some help. :)
>> >
>> >     Thanks!
--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en

