[tesseract-ocr] Re: Column based detection

Eugene Shkel' Thu, 24 Apr 2014 13:19:29 -0700

You can segment the image by columns. That gives you two regions, one per 
column. Then recognize each region separately: the text recognized from the 
first region is the first column, and the text recognized from the second 
region is the second column.
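
A minimal sketch of this crop-then-recognize flow, assuming the image is held as a list of pixel rows and the column boundaries are already known. The names `crop_columns`, `recognize_by_column` and the `recognize` callback are hypothetical and only illustrate the idea; `recognize` stands in for whatever call actually runs Tesseract on one region.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]   # rows of pixel values (assumed layout)
Span = Tuple[int, int]    # (left, right) pixel columns, right is exclusive


def crop_columns(image: Image, spans: List[Span]) -> List[Image]:
    """Cut one full-height region out of the image for every span."""
    return [[row[left:right] for row in image] for left, right in spans]


def recognize_by_column(image: Image,
                        spans: List[Span],
                        recognize: Callable[[Image], str]) -> List[str]:
    """Run the recognizer on each region, left to right.

    The result keeps column order: index 0 is the text of the first
    column, index 1 the text of the second column, and so on.
    """
    return [recognize(region) for region in crop_columns(image, spans)]
```

With Pillow and pytesseract, the same idea would roughly be `img.crop((left, 0, right, img.height))` for each span, followed by `pytesseract.image_to_string(region)` on each crop.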


To segment by columns you can use constant widths if you know them, or you 
can locate the gap by looking for a specific number of consecutive "empty" 
vertical lines (pixel columns with no colored pixels) in your picture. Choose 
that number according to the gap between the columns in the picture; a small 
gap (and hence a small threshold) can lead to incorrect segmentation, 
splitting letters instead of columns.
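
The empty-vertical-line search could be sketched as follows. This is only an illustration under assumptions: the image is a list of pixel rows, background pixels have the value 0, and `find_column_spans` and `min_gap` are hypothetical names for the search and for the configurable number of empty lines.

```python
from typing import List, Tuple


def find_column_spans(image: List[List[int]],
                      min_gap: int,
                      background: int = 0) -> List[Tuple[int, int]]:
    """Split the image into (left, right) spans, right exclusive.

    A new span starts whenever at least `min_gap` consecutive pixel
    columns contain only background pixels.
    """
    if not image:
        return []
    width = len(image[0])
    # True where pixel column x has no colored pixels at all
    empty = [all(row[x] == background for row in image) for x in range(width)]

    spans = []
    start = None      # left edge of the span being built
    last_ink = None   # last non-empty pixel column seen so far
    gap = 0           # length of the current run of empty columns
    for x in range(width):
        if empty[x]:
            gap += 1
            continue
        if start is None:
            start = x
        elif gap >= min_gap:
            # The run of empty columns was wide enough: close the span
            spans.append((start, last_ink + 1))
            start = x
        gap = 0
        last_ink = x
    if start is not None:
        spans.append((start, last_ink + 1))
    return spans
```

For a picture whose letters are 1 empty line apart and whose columns are 3 empty lines apart, `min_gap=3` yields one span per column, while `min_gap=1` also cuts between the letters.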

On Monday, 7 April 2014 at 15:36:56 UTC+3, user temp name wrote:
>
> Hello,
>
> I have an image which has text in tabular format, but the borders of the 
> table have been removed. The table has two rows and two columns. 
>
> The first row has some text in both the first and the second column. 
> But the second row's first column is empty, and its second column has 
> some text.
>
> When I tried this image with Tesseract it recognizes the text as " DE Abc 
> FG" .
>
> Does anyone know how to force Tesseract to recognize the text from the 
> first column first, then from the second column?
>
> Thanks in Advance!
>

