Hi,

1. Deskew the image so that the text lines are straight.
2. Use Tesseract's PSM 6 mode (--psm 6, "assume a single uniform block of 
text"). In this mode the page is read line by line across its full width, 
which can be very useful for table extraction.
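
A minimal sketch of step 2, assuming the tesseract command-line tool is 
installed and on your PATH, and that the image has already been deskewed. 
The helper names (build_tesseract_cmd, ocr_image) and the "eng" language 
default are only for illustration; they are not part of Tesseract itself.

```python
import shutil
import subprocess
import sys


def build_tesseract_cmd(image_path, output_base="stdout", psm=6, lang="eng"):
    """Build the argument list for a tesseract CLI call.

    Hypothetical helper. With output_base set to "stdout", tesseract
    prints the recognized text instead of writing a .txt file.
    """
    return [
        "tesseract",
        str(image_path),
        str(output_base),
        "--psm", str(psm),   # 6 = assume a single uniform block of text
        "-l", lang,          # assumed language; change to match your document
    ]


def ocr_image(image_path, psm=6, lang="eng"):
    """Run tesseract on an already deskewed image and return the text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_cmd(image_path, "stdout", psm, lang),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Usage: python ocr_psm6.py deskewed_page.png
    print(ocr_image(sys.argv[1]))
```

If the rows of your table still come out split, it may be worth comparing 
the output with other --psm values on the same image.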

The Tesseract engine can give very good results, but its accuracy depends on 
the quality of the image you provide. It will not be 100% accurate every time, 
although with a very clean image 100% accuracy is possible. :)
