We have released version 1.3 of Tesseract Studio with the following 
enhancements:

   - Improved memory management to support large multi-page files.
      - Streaming interface to Leptonica.
      - Eliminated unnecessary image caching.
      - Processed pages are unloaded early.
      - Tested with a scanned file of 2,100 pages.
   - New OCR options:
      - OCR only image objects in vector PDF files, or
      - Fully rasterize and OCR each page.
   - New Save options:
      - Save as vector PDF with existing objects (including visible text) 
      preserved and merged with OCR.
      - Save as searchable PDF where each page has a single image which 
      overlays hidden OCR data.
         - Maintain original color if applicable, or
         - Convert to grayscale before saving, or
         - Convert to monochrome using a dithering algorithm, or
         - Convert to monochrome using dynamic or specified thresholding.
         - Specify or automatically assign resolution to control PDF size.
      - Save as text-only PDF.
         - Use a visible font for OCR and other text objects.
         - Pick standard Type 1 fonts to reduce PDF size.
         - Embed any available font into the PDF file (with some overhead).
         - Format OCR and other text to approximate the original layout 
         (without graphics).
   - Some bug fixes.
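
For readers unfamiliar with the monochrome conversion options listed above, here is a minimal sketch in plain Python. It is not Tesseract Studio's implementation. It assumes that "dynamic" thresholding means a global threshold chosen by Otsu's method and that dithering means Floyd-Steinberg error diffusion; the announcement names neither algorithm. The function names are hypothetical, and an image is represented as a list of rows of 8-bit gray values.

```python
def otsu_threshold(gray):
    """Choose a global threshold by maximizing between-class variance."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def to_monochrome(gray, threshold=None):
    """Binarize with a specified threshold, or a computed one if None."""
    t = otsu_threshold(gray) if threshold is None else threshold
    return [[255 if v > t else 0 for v in row] for row in gray]


def dither(gray):
    """Binarize with Floyd-Steinberg error diffusion."""
    h = len(gray)
    w = len(gray[0]) if h else 0
    buf = [[float(v) for v in row] for row in gray]
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            old = buf[y][x]
            new = 255 if old >= 128 else 0
            out[y][x] = new
            err = old - new
            # Push the quantization error onto not-yet-visited neighbors.
            if x + 1 < w:
                buf[y][x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    buf[y + 1][x - 1] += err * 3 / 16
                buf[y + 1][x] += err * 5 / 16
                if x + 1 < w:
                    buf[y + 1][x + 1] += err * 1 / 16
    return out


page = [[10, 20, 200, 210],
        [15, 25, 205, 215]]
print(to_monochrome(page))                # threshold picked automatically
print(to_monochrome(page, threshold=12))  # threshold specified by the caller
print(dither([[128] * 4 for _ in range(4)]))
```

Thresholding tends to suit clean text scans, while dithering keeps some of the tonal impression of photographs at the cost of noisier output.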
   

Download: https://github.com/OpaitSoftware/TesseractStudio.Net

Thank you,

Farhad Khalafi
Opait Software
