pdf edit again.

2007-11-03 Thread Gary Kline

A couple of weeks ago I skimmed through the postings on editing PDF
files.  It wasn't entirely clear what the answer was, because I never
thought I would need to edit a GUI file.  I just found a book
from 1883 in PDF format.  I would like a text/ASCII/ISO_8859-1
version.  I tried pdftotext, but it doesn't work.  In a nutshell: is
there something I can use to edit or look at this book and get rid
of whatever it is that's causing pdftotext to fail?  (Sorry for
the grammar.)

gary



-- 
  Gary Kline  [EMAIL PROTECTED]   www.thought.org  Public Service Unix
  http://jottings.thought.org   http://transfinite.thought.org

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: pdf edit again.

2007-11-03 Thread Mike Jeays
On November 3, 2007 08:38:55 pm Gary Kline wrote:
   A couple of weeks ago I skimmed through the postings on editing PDF
   files.  It wasn't entirely clear what the answer was, because I never
   thought I would need to edit a GUI file.  I just found a book
   from 1883 in PDF format.  I would like a text/ASCII/ISO_8859-1
   version.  I tried pdftotext, but it doesn't work.  In a nutshell: is
   there something I can use to edit or look at this book and get rid
   of whatever it is that's causing pdftotext to fail?  (Sorry for
   the grammar.)

   gary

Try gv and xpdf.  You might get lucky. 

Otherwise - try od :-)




-- 
Mike Jeays
http://www.jeays.ca


Re: pdf edit again.

2007-11-03 Thread cpghost
On Sat, 3 Nov 2007 16:38:55 -0800
Gary Kline [EMAIL PROTECTED] wrote:

   A couple of weeks ago I skimmed through the postings on editing PDF
   files.  It wasn't entirely clear what the answer was, because I never
   thought I would need to edit a GUI file.  I just found a book
   from 1883 in PDF format.  I would like a text/ASCII/ISO_8859-1
   version.  I tried pdftotext, but it doesn't work.  In a nutshell: is
   there something I can use to edit or look at this book and get rid
   of whatever it is that's causing pdftotext to fail?  (Sorry for
   the grammar.)

Old books in PDF are normally scanned bitmaps. There are no characters
in them, just pixels (embedded page images). If you want to convert
that to ASCII, you'd need to extract the page images (use something like
pdfimages from the xpdf port), turn them into a bitmap format your OCR
software accepts, and run OCR on that. It's a slow, unreliable,
error-prone and painful process though.
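
The extract-then-OCR steps above could be scripted roughly as follows.
This is only a sketch under assumptions: tesseract stands in for "some
kind of OCR software" (the thread does not name one), it is assumed to
read the PPM/PBM files that pdfimages writes by default, and the helper
names pdfimages_cmd, ocr_cmd and pdf_to_text are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def pdfimages_cmd(pdf, image_root):
    """Command that dumps the embedded images of `pdf` as <image_root>-NNN.ppm/.pbm."""
    return ["pdfimages", str(pdf), str(image_root)]


def ocr_cmd(image, lang="eng"):
    """Command that OCRs one image into <image stem>.txt.

    Assumption: tesseract is the OCR tool; any other OCR program would
    need its own command line here.
    """
    image = Path(image)
    return ["tesseract", str(image), str(image.with_suffix("")), "-l", lang]


def image_number(path):
    """Numeric part of a pdfimages output name such as page-012.ppm."""
    return int(Path(path).stem.rsplit("-", 1)[1])


def pdf_to_text(pdf, workdir):
    """Extract the page images, OCR each one, and return the joined text."""
    for tool in ("pdfimages", "tesseract"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} is not installed or not on PATH")
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    subprocess.run(pdfimages_cmd(pdf, workdir / "page"), check=True)
    pages = []
    # Sort numerically so the OCR output follows the order of extraction.
    for image in sorted(workdir.glob("page-*.p?m"), key=image_number):
        subprocess.run(ocr_cmd(image), check=True)
        pages.append(image.with_suffix(".txt").read_text(errors="replace"))
    return "\n".join(pages)
```

Calling pdf_to_text("book.pdf", "ocr-work") would leave the images and
per-page text files in ocr-work. The result still needs proofreading by
hand, which is the slow and painful part.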

Good luck!

-cpghost.

-- 
Cordula's Web. http://www.cordula.ws/


Re: pdf edit again.

2007-11-03 Thread Gary Kline
On Sat, Nov 03, 2007 at 09:03:17PM -0400, Mike Jeays wrote:
 Try gv and xpdf.  You might get lucky. 
 
 Otherwise - try od :-)
 

Well, yeah, I can view this file with xpdf or any other viewer,
but can't figure out what's blocking it from being converted to
ASCII.  I've seen pdfedit for Linux, but haven't found it.
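
One way to check whether a PDF holds any text at all is to list its
fonts with pdffonts, which ships alongside pdftotext. A file with no
fonts is almost certainly page images only, so pdftotext has nothing to
extract. The sketch below assumes pdffonts prints a header, a dashed
separator line, and then one row per font; count_fonts and
looks_image_only are hypothetical helper names.

```python
import subprocess


def count_fonts(pdffonts_output):
    """Count the font rows that follow the dashed separator line."""
    lines = pdffonts_output.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("---"):
            return sum(1 for row in lines[i + 1:] if row.strip())
    return 0  # no separator line found, so no font table was printed


def looks_image_only(pdf):
    """True if pdffonts lists no fonts for the file (needs pdffonts on PATH)."""
    result = subprocess.run(
        ["pdffonts", str(pdf)], capture_output=True, text=True, check=True
    )
    return count_fonts(result.stdout) == 0
```

A scan that already carries a hidden OCR text layer will list fonts, so
a False result here does not prove the text is usable.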

thanks,

gary

PS: can't figure out why anybody would take a public domain book
125 years old and say copyright..   *mumble*


 
 
 



Re: pdf edit again.

2007-11-03 Thread Gary Kline
On Sun, Nov 04, 2007 at 02:39:14AM +0100, cpghost wrote:
 Old books in PDF are normally scanned bitmaps. There are no characters
 in them, just pixels (embedded page images). If you want to convert
 that to ASCII, you'd need to extract the page images (use something like
 pdfimages from the xpdf port), turn them into a bitmap format your OCR
 software accepts, and run OCR on that. It's a slow, unreliable,
 error-prone and painful process though.
 
 Good luck!


Arrrgh (Charlie Brown).  If it's that tortured, I'll forget
it; thanks for the clue.  Pretty sure this *was* just photographed
and scanned in.

(Must be how amazon.com has their zillions of books online.
OCR'ing is serious work; I know that first hand.)

gary
 



Re: pdf edit again.

2007-11-03 Thread Jona Joachim
On Sat, 03 Nov 2007 17:42:03 -0800, Gary Kline wrote:

   Well, yeah, I can view this file with xpdf or any other viewer,
   but can't figure out what's blocking it from being converted to
   ASCII.  I've seen pdfedit for Linux, but haven't found it.

   thanks,

   gary

   PS: can't figure out why anybody would take a public domain book
   125 years old and say copyright..   *mumble*

The guy who scanned the book did actually invest a considerable amount
of work into scanning it and he can claim copyright for that work.
The text itself may be in the public domain but the pdf file is subject to
copyright.

Best regards,
Jona




Re: pdf edit again.

2007-11-03 Thread cpghost
On Sat, 3 Nov 2007 17:54:53 -0800
Gary Kline [EMAIL PROTECTED] wrote:

   Arrrgh (Charlie Brown).  If it's that tortured, I'll forget
   it; thanks for the clue.  Pretty sure this *was* just photographed
   and scanned in.

   (Must be how amazon.com has their zillions of books online.
   OCR'ing is serious work; I know that first hand.)

If you need help on imperfectly OCR'ed texts, esp. on texts that
are no longer copyrighted, there's always Distributed Proofreaders
from the venerable Project Gutenberg: http://www.pgdp.net/

Good luck!
-cpghost.



Re: pdf edit again.

2007-11-03 Thread Gary Kline
On Sun, Nov 04, 2007 at 05:12:48AM +0100, Jona Joachim wrote:
 The guy who scanned the book did actually invest a considerable amount
 of work into scanning it and he can claim copyright for that work.
 The text itself may be in the public domain but the pdf file is subject to
 copyright.
 


So then I could consider my HTML version of the philosophy books
that friends and I OCR'd in to be copyrighted.  Yes, it is a lot of
work... but at least for myself, I wouldn't be that crass.

gary





