Hello Emad,

I have looked closely at the pyPdf documentation. It appears to treat the page as its smallest unit of work, whereas I need a line-by-line process to go from PDF to text. I don't think pyPdf will meet my needs, but thank you for bringing it to my attention.
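
To make the line-by-line step concrete, here is a rough sketch of what I have in mind once the text is out of the PDF. It assumes the text has already been extracted by some tool, that each reading sits on its own line as a mm/dd/yy date followed by whitespace and a 10-character result with no embedded spaces, and that the extraction preserves line breaks. The name parse_readings and the exact line layout are my own assumptions, not anything taken from a library.

```python
import re
from datetime import datetime

# Assumed layout: mm/dd/yy date, whitespace, then a 10-character result
# made of digits and special characters (no spaces).
LINE_RE = re.compile(r"^\s*(\d{2}/\d{2}/\d{2})\s+(\S{10})\s*$")


def parse_readings(text):
    """Return a dict mapping each reading date to its result string."""
    readings = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip headers, blank lines, and anything unexpected
        date_str, result = match.groups()
        # %y maps 69-99 to 19xx and 00-68 to 20xx, which covers 1988 to now.
        day = datetime.strptime(date_str, "%m/%d/%y").date()
        readings[day] = result
    return readings
```

Calling parse_readings on the extracted text would give a dictionary keyed by date, which is about all the "database" this application needs.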

Thanks,


Robert Berman

Emad Nawfal (عماد نوفل) wrote:


On Tue, Apr 21, 2009 at 12:54 PM, bob gailer <bgai...@gmail.com> wrote:

    Robert Berman wrote:

        Hi,

        I must convert a history file in PDF format that goes from May
        of 1988 to current date.  Readings are taken twice weekly and
        consist of the date taken mm/dd/yy and the results appearing
        as a 10 character numeric + special characters sequence. This
        is obviously an easy setup for a very small database
        application with the date as the key, the result string as
        the data.

        My problem is converting the PDF file into a text file which I
        can then read and process. I do not see any free python
        libraries having this capacity. I did see a PDFPILOT program
        for Windows but this application is being developed on Linux
        and should also run on Windows; so I do not want to
        incorporate a Windows only application.

        I do not think I am breaking any new frontiers with this
        application. Have any of you worked with such a library, or do
        you know of one or two I can download and work with?
        Hopefully, they have reasonable documentation.


    If this is a one-time conversion, just use the "Save as Text"
    feature of Adobe Reader.



        My development environment is:

        Python
        Linux
        Ubuntu version 8.10


        Thanks for any help  you might be able to offer.


        Robert Berman
        _______________________________________________
        Tutor maillist  -  Tutor@python.org
        http://mail.python.org/mailman/listinfo/tutor



    -- Bob Gailer
    Chapel Hill NC
    919-636-4239




I tried pyPdf once, just for fun, and it was nice:
http://pybrary.net/pyPdf/
--
لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه كالحقيقة.....محمد الغزالي
"No victim has ever been more repressed and alienated than the truth"

Emad Soliman Nawfal
Indiana University, Bloomington
