I tried all the psm modes, but the result is the same. It's a 4-page document. On 3 of the pages it works fine, but on one page it fails. It's a confidential document, so I am not able to upload it. I will try to find an alternative sample that shows the same problem.
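
For anyone who wants to repeat the comparison on a single failing page, here is a minimal sketch of running that page through several page segmentation modes and writing hocr/pdf output for each. It assumes the page has already been exported to an image file and that the `tesseract` binary is on the PATH. The function names, the output naming scheme and the selection of modes are only illustrative, not something from the original report.

```python
import shutil
import subprocess

# Page segmentation modes worth comparing on a multi-column page
# (descriptions as listed by `tesseract --help-psm`).
PSM_CANDIDATES = {
    1: "Automatic page segmentation with OSD",
    3: "Fully automatic page segmentation, but no OSD (default)",
    4: "Assume a single column of text of variable sizes",
    6: "Assume a single uniform block of text",
}


def build_command(image_path, output_base, psm):
    """Build the tesseract command line for one page segmentation mode.

    Options come before the config files; `hocr` and `pdf` are config
    files that request <output>.hocr and <output>.pdf respectively.
    """
    return [
        "tesseract",
        image_path,
        f"{output_base}_psm{psm}",  # illustrative naming scheme
        "--psm",
        str(psm),
        "hocr",
        "pdf",
    ]


def run_all(image_path, output_base, psm_values=tuple(PSM_CANDIDATES)):
    """Run tesseract once per mode so the outputs can be compared."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    for psm in psm_values:
        subprocess.run(build_command(image_path, output_base, psm), check=True)


# Example (hypothetical file name):
# run_all("page3.png", "page3")
```

Comparing the resulting hocr files mode by mode shows whether any of them detects the columns on the problem page, or whether the layout analysis fails regardless of psm.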

On Tuesday, February 11, 2020 at 9:04:42 PM UTC+5:30, shree wrote:
>
> What psm are you using?
>
> On Tue, Feb 11, 2020, 20:46 KOLLOL CHOWDHURY <[email protected]> wrote:
>
>> Hi,
>>
>> There are certain pages with multi column and when I try to OCR it, it
>> doesn't recognise the multi column and takes all the words in a particular
>> line.
>>
>> I am using Tesseract 4.01 and trying to output an hocr/pdf file.
>>
>> Any help will be appreciated.
>>
>> TIA

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/97c91a34-5abf-45e9-89a8-d6303450a187%40googlegroups.com.

