Is there specific metadata you're looking for? Creation date, title, etc? Or
stuff like number of pages, page size, and other document layout specifics?

If you're just looking for its title, who created it, and when, you can use
a somewhat brute-force method: load the entire document into a variable
(most likely a byte array, as PDFs can be binary), search for a specific
string, then use that data to find the who/what/when data.

You'll want to search for the "/Info " string (it will look like: "/Info X 0
R" without quotes)

The "X" following "/Info" will be a number. For this example let's say that
number is 106.

Then search for "106 0 obj" (note the 106 is the value of X you find in your
PDF)

The data between "106 0 obj" and the next instance of "endobj" is all of the
document creation data: Creation Date, Author, Producer (the software that
created the PDF), Title, ModDate, Company, and others.
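
To make the steps above concrete, here is a minimal sketch in Python (not
ActionScript; the same idea should carry over to a ByteArray). The function
name find_info_object is made up for illustration. It assumes the "/Info"
reference and the object it points to are stored as plain, uncompressed
bytes, and it simply takes the first match of each.

```python
import re
from typing import Optional


def find_info_object(pdf_bytes: bytes) -> Optional[bytes]:
    """Return the raw bytes between 'X G obj' and 'endobj' for the Info object.

    Hypothetical helper: returns None if no plain-text /Info reference
    or matching object is found.
    """
    # Step 1: find the reference, e.g. b"/Info 106 0 R".
    ref = re.search(rb"/Info\s+(\d+)\s+(\d+)\s+R", pdf_bytes)
    if ref is None:
        return None
    obj_num, gen_num = ref.group(1), ref.group(2)

    # Step 2: find the object header, e.g. b"106 0 obj".
    # The lookbehind stops "106 0 obj" from matching inside "2106 0 obj".
    header = re.search(
        rb"(?<!\d)" + obj_num + rb"\s+" + gen_num + rb"\s+obj\b", pdf_bytes
    )
    if header is None:
        return None

    # Step 3: everything up to the next "endobj" is the document info.
    end = pdf_bytes.find(b"endobj", header.end())
    if end == -1:
        return None
    return pdf_bytes[header.end():end].strip()
```

You would call it with the whole file's bytes, e.g.
find_info_object(open("some.pdf", "rb").read()), where "some.pdf" is a
placeholder path.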

Here's an example of the document info from an old PDF generated with
blazePDF:

106 0 obj
<<
/Title (blazePDF Document)
/Author (g.wygonik)
/Creator (blazePDF in Macromedia Flash)
/CreationDate (D:20061009203227-05'00')
/Producer (blazePDF v2.0)
>>
endobj
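
Pulling the individual fields out of a block like that can be sketched with
a regular expression. This is a rough illustration, not a full PDF string
parser: parse_info_fields is a made-up name, and it only handles simple
"/Key (value)" entries. Hex strings (<...>), nested unescaped parentheses,
and UTF-16 encoded values are not handled.

```python
import re


def parse_info_fields(info_body: bytes) -> dict:
    """Collect simple '/Key (value)' pairs from an Info dictionary body."""
    fields = {}
    # The value part allows backslash escapes such as \( and \),
    # but not nested unescaped parentheses.
    pattern = rb"/(\w+)\s*\(((?:\\.|[^\\()])*)\)"
    for key, value in re.findall(pattern, info_body):
        # Latin-1 maps every byte to a character, so decoding never fails;
        # it is only an approximation of PDF's own text encodings.
        fields[key.decode("ascii")] = value.decode("latin-1")
    return fields
```

Fed the example above, this should give a dictionary with the keys Title,
Author, Creator, CreationDate and Producer.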


I'd suggest looking at the PDF specs on Adobe's site for all the fields that
could be there, as well as how to parse their date format (if you can't
figure it out -- it's not hard).
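
For the date format, here is a small Python sketch. It assumes the usual
layout D:YYYYMMDDHHmmSS followed by an optional "Z" or a +HH'mm' / -HH'mm'
offset, with everything after the year optional. parse_pdf_date is a
made-up name, and real-world files may contain variants this does not
accept.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYY[MM[DD[HH[mm[SS]]]]] then optional Z or +HH'mm' / -HH'mm'
_PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:(Z)|([+-])(\d{2})'?(\d{2})?'?)?$"
)


def parse_pdf_date(text: str) -> datetime:
    """Convert a PDF date string into a datetime (naive if no offset given)."""
    m = _PDF_DATE.match(text.strip())
    if m is None:
        raise ValueError(f"not a PDF date: {text!r}")
    year, month, day, hour, minute, second, zulu, sign, tz_h, tz_m = m.groups()

    tz = None
    if zulu:
        tz = timezone.utc
    elif sign:
        offset = timedelta(hours=int(tz_h), minutes=int(tz_m or 0))
        tz = timezone(-offset if sign == "-" else offset)

    # Missing month/day default to 1, missing time parts default to 0.
    return datetime(
        int(year), int(month or 1), int(day or 1),
        int(hour or 0), int(minute or 0), int(second or 0),
        tzinfo=tz,
    )
```

With the CreationDate from the blazePDF example, this gives October 9, 2006
at 20:32:27 with a UTC-05:00 offset.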

NOW - the big "but..." to all of this -- a PDF's data may be encoded and
compressed (or in some cases encrypted), so the desired data may not be
plain text. In fact, there are several different encoding/compression
methods that could be used, and you'll need to handle them all if you want
to be able to work with any PDF thrown at your app.
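
One cheap way to tell whether the plain-text search is likely to work is to
look for a few marker names first. The sketch below is only a heuristic
under my own assumptions: plain_text_search_caveats is a made-up name, and
it just checks for "/Encrypt" (the file is encrypted) and "/ObjStm" (some
objects are packed into compressed object streams), plus whether "/Info"
shows up at all.

```python
def plain_text_search_caveats(pdf_bytes: bytes) -> list:
    """Return human-readable reasons the string-search approach may fail."""
    caveats = []
    if b"/Encrypt" in pdf_bytes:
        # Encrypted files store strings as ciphertext.
        caveats.append("encrypted: Info strings are probably not readable")
    if b"/ObjStm" in pdf_bytes:
        # Object streams can hide objects inside compressed data.
        caveats.append("object streams present: Info may be compressed")
    if b"/Info" not in pdf_bytes:
        caveats.append("no plain-text /Info reference found")
    return caveats
```

An empty list does not guarantee success; it only means none of these
particular warning signs were seen.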

I would start easy with the string searches and if you find you need more,
look into existing libraries for all of the additional
encodings/formats/etc.

Good luck! :-)

Cheers
g.



On Fri, Aug 1, 2008 at 2:47 AM, Ian Thomas <[EMAIL PROTECTED]> wrote:

> On Fri, Aug 1, 2008 at 8:41 AM, Zárate <[EMAIL PROTECTED]> wrote:
> > If you are happy with AS3, use Alive PDF:
> >
> > http://www.bytearray.org/?p=101
>
> Nice, I'd forgotten about that. Does it read PDFs as well as write?
>
> Ian
>
> >
> > On Thu, Jul 31, 2008 at 10:03 PM, Paul Venton <[EMAIL PROTECTED]>
> wrote:
> >> Just a thought ... couldn't you use the URLStream class to parse the
> file as
> >> it's being downloaded and once you have the metadata, close the
> connection?
> >>
>
> _______________________________________________
> Flashcoders mailing list
> [email protected]
> http://chattyfig.figleaf.com/mailman/listinfo/flashcoders
>



-- 
weblog: broadcast.artificialcolors.com
