Thanks guys. :) I'm actually waiting for the design team to send me a sample PO so I can ask more pertinent questions. I am a bit concerned since there are embedded images of the garment in the PDF. At a minimum, we need the "text" part, since it describes the bill of materials and some financial info ... getting the images would be nice too.
Oh, I did ask if the client can send the PO in EDI (much easier all around), but it seems that somebody from the client is saying that "we are your client and you should be happy with what we give you". Cue lightning and organ music!!!

I'll look into what OpenOffice is doing. :)

Thanks.

r/Alex

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Chris Burke
Sent: Wednesday, January 28, 2009 4:32 PM
To: General forum
Subject: Re: [Jgeneral] Parsing PDF documents in J

bill lam wrote:
> On Wed, 28 Jan 2009, Alex Rufon wrote:
>> PDF format, they have to go through these steps:
>>
>> 1. Export the PDF file into HTML
>>
>> 2. Parse the HTML file
>>
>> 3. Insert/Update the databases
>
> Theoretically you can parse pdf file yourself because it is a plain
> text format with possible embedded graphic or compressed text in zlib.
> However I think that you can just export pdf into txt directly using
> some utilities (please google yourself). Openoffice can be used as a
> command line converter between various formats including pdf and txt
> iirc.

Just to add to Bill's comment:

PDF files contain drawing instructions that are arbitrarily complex. You might be able to parse simple examples, but in general it would be very difficult to write a J program to turn these instructions into data.

Why not ask your clients to send edi in plain text?

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
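For what it's worth, Bill's point that a PDF is largely plain text with zlib-compressed parts can be illustrated with a rough Python sketch (Python rather than J, purely for illustration). It assumes the text lives in Flate (zlib) compressed content streams and is shown with simple literal strings and the Tj operator. As Chris notes, real files are usually far messier (custom font encodings, hex strings, TJ arrays, other filters), so treat this as a toy and not a general parser. The function names and the sample document are made up for the example.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords (may include a trailing EOL).
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# Simple literal strings shown with the Tj operator, e.g. (Hello) Tj
TEXT_RE = re.compile(rb"\((.*?)\)\s*Tj")


def inflate_streams(pdf_bytes):
    """Return the decompressed content of every zlib-compressed stream."""
    contents = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates the EOL bytes left after the zlib data.
            contents.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            # Not zlib data (for example an image using another filter); skip it.
            continue
    return contents


def extract_text(pdf_bytes):
    """Collect literal strings passed to Tj in the decompressed streams."""
    texts = []
    for content in inflate_streams(pdf_bytes):
        for m in TEXT_RE.finditer(content):
            texts.append(m.group(1).decode("latin-1"))
    return texts


def make_sample():
    """Build a tiny, hypothetical PDF-like fragment with one compressed stream."""
    content = b"BT /F1 12 Tf 72 720 Td (PO 12345) Tj ET"
    packed = zlib.compress(content)
    header = b"%PDF-1.4\n1 0 obj\n<< /Length " + str(len(packed)).encode()
    header += b" /Filter /FlateDecode >>\nstream\n"
    return header + packed + b"\nendstream\nendobj\n"


if __name__ == "__main__":
    print(extract_text(make_sample()))
```

If the sample PO turns out to be this simple, something along these lines might be enough to pull the text out; otherwise an existing PDF-to-text converter is probably the safer route.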
