I am running tesseract 3.03 and I don't see this option available. Is there another way to do it via an option?
$ tesseract -v
tesseract 3.03
leptonica-1.71
libgif 4.1.6(?) : libjpeg 6b : libpng 1.2.50 : libtiff 4.0.3 : zlib 1.2.8 : libwebp 0.4.1 : libopenjp2 2.1.0

$ tesseract --print-parameters | grep -i 'page'
textord_show_page_cuts 0
tessedit_pageseg_mode 6
pageseg_devanagari_split_strategy 0
applybox_page 0
tessedit_page_number -1
tessedit_dump_pageseg_images 0

Thanks!

On Sunday, March 13, 2016 at 1:17:36 AM UTC-5, zdenop wrote:
>
> you have very old version of tesseract.
> page_separator was implemented after 3.02 release
>
> Zdenko
>
> On Sat, Mar 12, 2016 at 10:22 PM, <[email protected]> wrote:
>
>> Thanks Zdenko. I'm still stuck. I OCR'd an 81 page tiff file and I've
>> searched my output txt file for the form feed character (asc 12) and didn't
>> find one. I have windows version of tesseract 3.02. Also I don't see a
>> parameter for page_separator in the command-line options. Do you know what
>> I'm doing wrong?
>>
>> On Saturday, March 12, 2016 at 1:44:12 PM UTC-5, [email protected] wrote:
>>>
>>> If I OCR a multipage tiff file using Tesseract it comes out as one
>>> single page .txt file. Is there a way to maintain the page breaks?
>>> Thanks.
>>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/da280a42-a10d-40e9-a587-46bf31af51a8%40googlegroups.com
>> For more options, visit https://groups.google.com/d/optout.
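For builds that predate page_separator, one possible workaround is to OCR the TIFF one page at a time and insert the form feed (ASCII 12) yourself. The sketch below is only an illustration and has not been tried against 3.02 or 3.03. It assumes that the tessedit_page_number parameter shown in the listing above selects a single 0-based page when it is set to a non-negative value, that the build accepts `-c name=value` on the command line, and that the page count (81 for the file in this thread) is already known. The function names and the page_NNNN output file names are made up for the example.

```python
import subprocess
from pathlib import Path

FORM_FEED = "\f"  # ASCII 12, the character searched for earlier in this thread


def build_command(tiff_path, out_base, page_index):
    """Build a tesseract command line for one page.

    Assumption: tessedit_page_number is 0-based and can be set with -c.
    """
    return [
        "tesseract", str(tiff_path), str(out_base),
        "-c", f"tessedit_page_number={page_index}",
    ]


def join_pages(page_texts):
    """Join per-page text, placing one form feed between consecutive pages."""
    return FORM_FEED.join(page_texts)


def ocr_with_page_breaks(tiff_path, page_count, work_dir="."):
    """OCR each page separately and return the text with form feeds inserted.

    page_count must be supplied by the caller; it is not detected here.
    """
    texts = []
    for index in range(page_count):
        # Hypothetical naming scheme: page_0000.txt, page_0001.txt, ...
        out_base = Path(work_dir) / f"page_{index:04d}"
        subprocess.run(build_command(tiff_path, out_base, index), check=True)
        # tesseract appends ".txt" to the output base name it is given.
        texts.append(Path(f"{out_base}.txt").read_text(encoding="utf-8"))
    return join_pages(texts)
```

Calling `ocr_with_page_breaks("scan.tif", 81)` (with "scan.tif" standing in for the real file name) would return one string that can be written to a .txt file. Running tesseract once per page is slower than a single pass, so upgrading to a release that has page_separator is probably the simpler route if that is an option.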

