On Mon, 26 Jan 2009 14:06:23 -0800, Gary Kline <kl...@thought.org> wrote:
> So what kind of moron is going to photograph pages --or maybe just
> get-screenshot-of-this-page" and upload it?

The PDF serves as a container for pictorial images in this context.
Another idea would be to have separate image files, one file per page,
that you could view with your favourite image viewer. The advantage of
the PDF container is that you can easily print a bunch of pages (or a
whole book).

> Or a Real question:
> I read an online pdf of "The Art of War" from the 1880's [?], and
> it was in an old-English or olden-Deutsch type font. In PDF. i
> have other p.d. texts in pdf and am wondering in there is some
> sort of scanner than can take a book-length script and create a
> pdf file. Anybody know?

It's very complicated to handle old fonts using OCR techniques. It's
even quite complicated with today's standard fonts. Although there are
(usually expensive) OCR programs with good algorithms, most documents
need some work afterwards. It's not only about correcting misrecognized
characters; you have to handle hyphenation and paragraph typesetting
as well.

I know that there are scanners that can pull a stack of paper sheets
through an automatic feeder, scan them and finally have a PDF file
ready for FTP download. But there's no OCR involved, of course.

> I got a bunch of ^L bytes and nothing
> else.

The Ctrl-L (^L) is the page break character (FF = form feed). You get
one per page and nothing else because the rest of the file contains
images that are not transformable into characters.

> Now I'm looking at the file with od -c and, yup, it's and
> image. The parts inbetween pages are in ASCII. Do you know what
> "MediaBox" is? An image container maybe?

Not quite: "MediaBox" is the PDF page attribute that gives the size of
the page as a rectangle. So every page here has a "MediaBox" and holds
one image.

> At least the web article was not an image!

Never mind, I know "important" web pages where the text content
actually IS an image, and of course there's no alt= or longdesc=
attribute because they're for weenies. :-)

-- 
Polytropon
From Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...
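
P.S. In case it helps to check such a file without staring at od -c
output, here is a minimal Python sketch of the two checks discussed
above. The names only_form_feeds and media_boxes are made up for this
example. It assumes the page dictionaries are stored uncompressed and
that each /MediaBox is written inline as four numbers, which is not
true of every PDF, so treat it as a rough probe and not as a PDF
parser.

```python
import re

# Assumption: "/MediaBox [llx lly urx ury]" appears inline and uncompressed.
# PDFs that keep page objects in compressed streams will yield no matches.
MEDIABOX_RE = re.compile(rb"/MediaBox\s*\[\s*([-+\d.\s]+?)\s*\]")


def only_form_feeds(text):
    """Return True if text holds at least one form feed (^L) and
    otherwise nothing but whitespace, i.e. no recognizable characters."""
    return "\f" in text and text.strip() == ""


def media_boxes(pdf_bytes):
    """Return every inline /MediaBox rectangle found in the raw bytes
    as a tuple of floats (normally four: llx, lly, urx, ury)."""
    boxes = []
    for match in MEDIABOX_RE.finditer(pdf_bytes):
        numbers = match.group(1).decode("ascii").split()
        boxes.append(tuple(float(n) for n in numbers))
    return boxes
```

Feed only_form_feeds() the output of whatever text extraction gave you
the ^L bytes, and media_boxes() the raw contents of the PDF opened in
binary mode. If the first returns True and the second lists one
rectangle per page, you are most likely looking at a scanned,
image-only document.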
_______________________________________________
email@example.com mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscr...@freebsd.org"