The PDF list is a service provided by PDFzone.com | http://www.pdfzone.com __________________________________________________________________
If Acrobat Paper Capture uses FineReader 6, why is it that, given the SAME document, Abbyy will get 25 hits on a given word search and Paper Capture will only get 18 hits? We've looked at thousands of documents and, with statistical consistency, Acrobat's OCR underperforms both ScanSoft OmniPage Pro 12 and Abbyy FineReader by very wide margins. Acrobat's Paper Capture results are not at all up to par with either ScanSoft or FineReader, so the claim that FineReader is actually bundled into Paper Capture seems highly problematic.

PdfCompressor 2.1 uses ScanSoft's OCR engine, so let's conduct a simple test using these two systems. Here's a comparison of some keyword hits for the same document (the file is posted on Adobe's site):

http://www.adobe.com/products/acrcapture/agentpack/pdfs/pdfimage/AnnualReport.pdf

Running Acrobat 6's Paper Capture vs. PdfCompressor 2.1, we have the following hit results:

                      # of keyword hits
keyword         Acrobat 6        CVISION
                Paper Capture    PdfCompressor 2.1
commission      50               169
section         4                19
recall          31               58
requirements    18               31
corrective      13               23
regulations     11               18

On top of Acrobat 6's OCR inaccuracy, it also runs much slower. For this 68-page document, the time to convert to searchable PDF using Acrobat 6 is 3 mins, 40 secs (220 secs) on a 3 GHz Intel P4 machine; the time to convert to searchable (JBIG2-compressed) PDF using PdfCompressor 2.1 is only 1 min, 28 secs (88 secs). So Acrobat's Paper Capture is roughly 2.5x slower than PdfCompressor 2.1. In addition, the hidden text layer generated by Acrobat Paper Capture is about 5x-7x larger than the hidden text layer generated by CVISION's PdfCompressor 2.1.
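For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the per-keyword hit ratios and the timing ratio from the figures quoted above. The variable and function names are only illustrative, and the script does not run either OCR product; it just works from the numbers in this message.

```python
# Keyword hits as reported in the message: (Acrobat 6 Paper Capture, PdfCompressor 2.1).
# All names here are illustrative, not part of any product's API.
hits = {
    "commission": (50, 169),
    "section": (4, 19),
    "recall": (31, 58),
    "requirements": (18, 31),
    "corrective": (13, 23),
    "regulations": (11, 18),
}

# Conversion times for the 68-page test document, in seconds.
acrobat_secs = 3 * 60 + 40        # 3 mins, 40 secs
pdfcompressor_secs = 1 * 60 + 28  # 1 min, 28 secs


def hit_ratio(acrobat_hits: int, pdfcompressor_hits: int) -> float:
    """How many times more hits PdfCompressor found than Paper Capture."""
    return pdfcompressor_hits / acrobat_hits


# Per-keyword comparison.
for keyword, (acrobat, pdfc) in hits.items():
    print(f"{keyword:<13} {acrobat:>4} {pdfc:>4}  {hit_ratio(acrobat, pdfc):.2f}x")

# Totals across the six keywords.
total_acrobat = sum(a for a, _ in hits.values())
total_pdfc = sum(p for _, p in hits.values())
print(f"{'total':<13} {total_acrobat:>4} {total_pdfc:>4}  "
      f"{hit_ratio(total_acrobat, total_pdfc):.2f}x")

# Timing comparison: how many times longer Acrobat took.
speed_ratio = acrobat_secs / pdfcompressor_secs
print(f"Acrobat took {speed_ratio:.1f}x as long as PdfCompressor")
```

The hit counts themselves depend on how each viewer's search tokenizes the hidden text, so the ratios are only as good as the counts reported above.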
Ari

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Leonard Rosenthol
Sent: Friday, September 26, 2003 7:54 AM
To: [EMAIL PROTECTED]
Subject: RE: [PDF] Searchable pdf

At 12:21 AM -0400 9/26/03, Ari Gross wrote:
>Acrobat 6 Paper Capture is not reliable, nor accurate, and runs very
>slow. Its accuracy is in no way comparable to either Scansoft's
>OmniPage Pro 12 or Abbyy FineReader 6.0.

First, I only said it was better than 5.0 ;). However, under certain circumstances, it is EXACTLY the same as FineReader 6 since that's the engine being used! Paper Capture uses multiple engines based on internally determined criteria (language, quality, platform, etc.) - one of those engines is the Abbyy FineReader one.

>I've seen it fail to process in Paper Capture mode some very
>standard TIFF files. It also runs very slow.
>

It runs slowly on color, works much better on B&W.

Leonard
--
---------------------------------------------------------------------------
Leonard Rosenthol <mailto:[EMAIL PROTECTED]>
Chief Technical Officer <http://www.pdfsages.com>
PDF Sages, Inc.
215-629-3700 (voice) 215-629-0789 (fax)

To change your subscription:
http://www.pdfzone.com/discussions/lists-pdf.html
