[I sent this 15 minutes ago. I do not think it worked, presumably because I attached the Teller paper in HTML format]

These EPRI papers I have uploaded, and the upcoming ICCF-3 book, are in image-over-text Acrobat format. I have discussed this previously. With this format, the page you see on the screen is a facsimile of the original scan, but there is underlying text which was OCR'ed. Unfortunately, the EPRI documents were of poor quality in the first place, and scanning them did not improve them. You end up with a very blurry, noisy image on the screen.

I am tempted to convert parts of these documents into a pure text Acrobat format, such as the talk by Teller. I output his talk in HTML format and . . . [will try sending it in another message.] As you [DON'T] see, the OCR worked remarkably well. This is not edited. He did not use any Greek letters or equations, and only a couple of superscripts, so it appears to be close to 100% accurate, although I did not check.

I am tempted, but I am not going to proofread 710 pages. No thanks!
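
A rough accuracy figure does not require proofreading the whole thing. One option is to hand-correct a single sample page and compare it with the OCR output for that page. The sketch below does this with Python's standard difflib module. The function name ocr_accuracy and the sample strings are made up for illustration, and the score is only a character-level similarity, not a formal OCR error rate.

```python
import difflib


def ocr_accuracy(ocr_text: str, reference_text: str) -> float:
    """Return a rough character-level similarity between 0.0 and 1.0."""
    # Collapse runs of whitespace so line-break differences are not counted as errors.
    ocr_norm = " ".join(ocr_text.split())
    ref_norm = " ".join(reference_text.split())
    # autojunk=False stops difflib from discarding very common characters in long texts.
    matcher = difflib.SequenceMatcher(None, ref_norm, ocr_norm, autojunk=False)
    return matcher.ratio()


if __name__ == "__main__":
    # Hypothetical sample: "m" misread as "rn", a typical OCR confusion.
    reference = "The experiment was repeated three times."
    ocr = "The experirnent was repeated three tirnes."
    print(f"{ocr_accuracy(ocr, reference):.1%}")
```

The comparison gets slow on long inputs, so it is best run on a page or two at a time rather than on a whole book.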

Anyway, if you would like to see the underlying text in one of these files, a couple of methods are available:

1. Put a block around the image, copy, and paste into a word processor or text editor.

2. Get a PDF reader such as PDF Converter Professional ($100) or Foxit Reader (free) that can show the underlying text or save a file as text.

Or, I guess --

3. Ask me for a copy, which I can output in several different formats:

eBook (*.opf)
HTML 4.0 (*.htm)
HTML 3.2 (*.htm)
InfoPath (*.xsn)
Microsoft Excel 97, 2000 (*.xls)
Microsoft Excel XP, 2003 (*.xls)
Microsoft Excel 2007 (*.xlsx)
Microsoft PowerPoint 97 (*.rtf)
Microsoft Publisher 98 (*.rtf)
Microsoft Reader (*.lit) See note 1
Microsoft Word 2007 (*.docx)
Microsoft Word 2003 (WordML) (*.xml)
Microsoft Word 2000, XP (*.doc)
Microsoft Word 97 (*.doc)
PDF, normal (*.pdf)
PDF Edited (*.pdf)
PDF Searchable Image (*.pdf)
PDF with image substitutes (*.pdf)
RTF Word 2000, 97, 6.0/95 (*.rtf)
RTF 2000 ExactWord (*.rtf)
WordPad (*.rtf)
WordPerfect 12, X3 (*.wpd)
XML (*.xml) See note 1
XPS (XML Paper Specification) (*.xps)
XPS Searchable Image (*.xps)
Text (*.txt)
Text and Text with line breaks (*.txt)
Text - Comma Separated (*.csv)
Text - Formatted (*.txt)
Wave Audio Converter (*.wav)
Unicode Text (*.txt)
Unicode Text - Comma Separated (*.csv)
Unicode Text - Formatted (*.txt)
Unicode Text with line breaks (*.txt)
OmniPage Document (*.opd)

See:

http://www.nuance.com/imaging/resources/userGuides/OPUserguide/chapter6/ch6_3.asp

- Jed
