Try the "tessedit_dump_page_segment T" config parameter; it dumps the
image after the horizontal and vertical lines have been removed. You can then
run OCR on that image.
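
If dumping the intermediate image is not an option, the same idea (erase the
ruling lines, then OCR what is left) can be sketched by hand. The snippet below
is only an illustration and not how Tesseract does it internally: the function
name remove_ruling_lines and the min_run threshold are hypothetical, and the
image is assumed to be already binarized into a list of rows (1 = ink,
0 = background).

```python
def remove_ruling_lines(image, min_run=5):
    """Return a copy of a binary image with long straight ink runs erased.

    image   -- list of rows, each a list of 0/1 values (1 = ink); assumed input
    min_run -- hypothetical threshold: runs at least this long count as lines
    """
    height = len(image)
    width = len(image[0]) if height else 0
    erase = [[False] * width for _ in range(height)]

    # Mark horizontal runs of ink that are at least min_run pixels long.
    for y in range(height):
        x = 0
        while x < width:
            if image[y][x]:
                start = x
                while x < width and image[y][x]:
                    x += 1
                if x - start >= min_run:
                    for i in range(start, x):
                        erase[y][i] = True
            else:
                x += 1

    # Mark vertical runs of ink that are at least min_run pixels long.
    for x in range(width):
        y = 0
        while y < height:
            if image[y][x]:
                start = y
                while y < height and image[y][x]:
                    y += 1
                if y - start >= min_run:
                    for i in range(start, y):
                        erase[i][x] = True
            else:
                y += 1

    # Build the cleaned copy; pixels not marked for erasure are kept unchanged.
    return [
        [0 if erase[y][x] else image[y][x] for x in range(width)]
        for y in range(height)
    ]


if __name__ == "__main__":
    # 7x7 toy page: a border line along the top and the left, plus a 2x2 "glyph".
    page = [[1] * 7] + [[1, 0, 0, 0, 0, 0, 0] for _ in range(6)]
    for gy, gx in ((3, 3), (3, 4), (4, 3), (4, 4)):
        page[gy][gx] = 1
    for row in remove_ruling_lines(page, min_run=5):
        print("".join("#" if v else "." for v in row))
```

On a real scan min_run would have to be well above the tallest and widest
character stroke, otherwise parts of letters are erased along with the table
lines.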

On Wednesday, April 9, 2014 9:47:41 AM UTC+5:30, ANBU J wrote:
>
> Thanks for the reply, Nick. I'm doing it. It is very hard to figure out the
> functionality of the methods without understanding the whole project. Since I
> have to find out what those header files do and how they relate, it is
> going to take a lot of time. I'd appreciate it if anyone could point me to
> where the output (the text extracted from the table) is passed, so that I
> can add HTML table tags to it and reproduce the table in HTML
> format.
>
> Anbu   
>
> On Tuesday, April 8, 2014 9:08:30 PM UTC+5:30, Nick White wrote:
>>
>> Documentation for the internals of Tesseract is unfortunately rather 
>> minimal, indeed. I'd recommend you take a look at the TableFinder 
>> class in the code to figure it out. And please do share anything you 
>> learn here! 
>>
>> Nick 
>>
>> On Mon, Apr 07, 2014 at 02:45:51AM -0700, ANBU J wrote: 
>> > It's sad that we couldn't find any documentation for the table
>> > manipulation methods in Tesseract. It looks like I have to implement an
>> > algorithm manually to handle tables.
>> > If you have done this already, please share the knowledge.
>> > 
>> > On Tuesday, 25 June 2013 14:42:46 UTC+5:30, [email protected] wrote: 
>> > 
>> >     Hi!
>> >
>> >     I'm going to work on a program that can recognize a table's structure
>> >     and the text in the table.
>> >     I tried to OCR the table image using the command line on Windows 7, but
>> >     the output text was very bad.
>> > 
>> >     (Just like this: tesseract table.jpg out -l eng, or with "hocr".)
>> >     I also tried using TessBaseAPI in VC (just a simple application).
>> > 
>> >     The table lines (especially the column lines) interfere with recognition across the whole image.
>> > 
>> >     I have now found the class "TableFinder" in the Tesseract source code,
>> >     but I can't find anything else about it on the Internet (Tesseract-OCR-3.02).
>> >     Are there no demos or tutorials?
>> >
>> >     I am new and sincerely hope to get some help.  :)
>> > 
>> >     Thanks! 
>> > 
>> > -- 
>> > -- 
>> > You received this message because you are subscribed to the Google 
>> > Groups "tesseract-ocr" group. 
>> > To post to this group, send email to [email protected] 
>> > To unsubscribe from this group, send email to 
>> > [email protected] 
>> > For more options, visit this group at 
>> > http://groups.google.com/group/tesseract-ocr?hl=en 
>> > 
>>
>
