Hi. I am using the Tesseract OCR Java library in my web application. I have sample tabular data (the table cells are not separated by grid lines) in TIFF format. Tesseract extracts the text fine, but it does not preserve the tabs (4 spaces): it replaces tabs and runs of multiple spaces with a single space.
For example, "abcde    edcba" in the TIFF image (two words separated by a tab-width gap) is rendered by Tesseract as "abcde edcba". Is this a known issue, and if so, can someone suggest a solution?

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To post to this group, send email to [email protected] To unsubscribe from this group, send email to [email protected] For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en
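One possible workaround, offered only as a rough sketch: Tesseract can report a bounding box for each recognized word (for instance in its hOCR output), so the original spacing can be approximated from the horizontal gaps between boxes. The code below does not call Tesseract at all. `SpacingRebuilder`, `WordBox`, and `rebuildLine` are hypothetical names made up for illustration, and the sketch assumes the words of one text line have already been extracted with their left and right pixel coordinates. It estimates an average character width from the words themselves and turns each gap into a proportional number of spaces.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Tesseract or any Java wrapper.
final class SpacingRebuilder {

    // Hypothetical holder: one recognized word and its horizontal extent in pixels.
    static final class WordBox {
        final String text;
        final int left;
        final int right;

        WordBox(String text, int left, int right) {
            this.text = text;
            this.left = left;
            this.right = right;
        }
    }

    // Rebuilds one text line, inserting spaces proportional to the pixel gaps.
    static String rebuildLine(List<WordBox> words) {
        if (words.isEmpty()) {
            return "";
        }
        // Work on a copy sorted left to right so the caller's list is untouched.
        List<WordBox> sorted = new ArrayList<>(words);
        sorted.sort(Comparator.comparingInt((WordBox w) -> w.left));

        // Estimate the average character width from the word boxes.
        long totalWidth = 0;
        int totalChars = 0;
        for (WordBox w : sorted) {
            totalWidth += w.right - w.left;
            totalChars += w.text.length();
        }
        double charWidth = totalChars > 0 ? (double) totalWidth / totalChars : 1.0;
        if (charWidth <= 0) {
            charWidth = 1.0; // guard against degenerate boxes
        }

        StringBuilder line = new StringBuilder(sorted.get(0).text);
        for (int i = 1; i < sorted.size(); i++) {
            int gap = sorted.get(i).left - sorted.get(i - 1).right;
            // Always keep at least one space between adjacent words.
            long spaces = Math.max(1L, Math.round(gap / charWidth));
            for (long s = 0; s < spaces; s++) {
                line.append(' ');
            }
            line.append(sorted.get(i).text);
        }
        return line.toString();
    }
}
```

Calling `SpacingRebuilder.rebuildLine` with the boxes of one line returns that line with the gaps widened. The estimate depends on the font being roughly fixed-width, so for proportional fonts the column alignment will only be approximate.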

