Preserving spaces and tabs in tesseract ocr.

raju . sharma . blr . min Sat, 10 Aug 2013 04:32:39 -0700

Hi,
I am using the Tesseract OCR Java library in my web application. I have 
sample tabular data (the table cells are not separated by grid lines) in TIFF 
format.
Tesseract extracts the text fine, but it does not preserve tabs (4 
spaces). It replaces tabs and runs of multiple spaces with a single space.


For example, "abcde          edcba" in the TIFF image is rendered by Tesseract 
as "abcde edcba".

Is this a known issue, and if so, can someone suggest a solution?
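
To make the problem concrete, here is a rough sketch of one possible workaround. It assumes the OCR result can be obtained as per-word bounding boxes (pixel coordinates) rather than as plain text only; whether and how the Java wrapper exposes these is not confirmed here. The names WordBox, rebuildLine, and charWidth are hypothetical and not part of any Tesseract API. The idea is to turn the horizontal gap between neighbouring words back into a proportional number of spaces.

```java
import java.util.Arrays;
import java.util.List;

class SpacingSketch {
    // Hypothetical container: recognized text plus the left/right pixel
    // edges of the word on a single text line. Not a Tesseract type.
    static final class WordBox {
        final String text;
        final int left;
        final int right;

        WordBox(String text, int left, int right) {
            this.text = text;
            this.left = left;
            this.right = right;
        }
    }

    // Rebuilds one line of text, emitting roughly one space per charWidth
    // pixels of horizontal gap between consecutive words (at least one).
    // charWidth is an assumed average character width in pixels.
    static String rebuildLine(List<WordBox> words, double charWidth) {
        StringBuilder sb = new StringBuilder();
        WordBox prev = null;
        for (WordBox w : words) {
            if (prev != null) {
                int gap = w.left - prev.right;
                int spaces = Math.max(1, (int) Math.round(gap / charWidth));
                for (int i = 0; i < spaces; i++) {
                    sb.append(' ');
                }
            }
            sb.append(w.text);
            prev = w;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up coordinates for illustration only.
        List<WordBox> line = Arrays.asList(
                new WordBox("abcde", 0, 50),
                new WordBox("edcba", 150, 200));
        System.out.println("[" + rebuildLine(line, 10.0) + "]");
    }
}
```

With a fixed-width font the gap-to-space ratio is fairly stable; with proportional fonts charWidth would have to be estimated, for example from each word's pixel width divided by its character count.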

-- 
-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en

