Re: [PDFdev] stats on PDF Name Objects

Robert A Jones Fri, 27 Jun 2003 07:13:16 -0700

PDFdev is a service provided by PDFzone.com | http://www.pdfzone.com
_____________________________________________________________


Carolyn:

Your question about tags was a perfectly reasonable one.  You don't need
to be apologetic.  Leonard has been 'cranky' with many people the last
few weeks, and I hope you can overlook it and continue to contribute to
this forum.

Bob Jones

On Thu, 26 Jun 2003 14:25:46 -0600 "Carolyn Briles" <[EMAIL PROTECTED]>
writes:
> 
> Hello All,
> 
> Ooops!  My many and humble apologies for not being specific
> and using the wrong semantics for PDF in my previous question.
> I work on another project in which the protocol uses the word "tag"
> for what is called (at least I think is called....) a name object
> that is a key in PDF protocol.
> 
> So, let me rephrase.  Do any statistics exist on the frequency of 
> common name objects that are keys in PDF files?  
> 
> When I say "name object that is a key," I am talking about things 
> like these:
> /Filter   /MediaBox  /CropBox  /F1  /BitsPerComponent  /ColorSpace 
> /Type  /Page  /Parent  /Kids  /Contents  /D  /Title  /Dest  
> ...............
> 
> Here is another way to ask this question:
> Does a signature (meaning a unique identifier, not the approval kind
> of signature) exist for the "typical" PDF file?  In other words, if I
> take a million PDF files and histogram the frequency of the keys,
> will "most" PDF files have the same basic kind of histogram or
> will they be all over the map?
> 
> There are lots of details that I am leaving out here, like exactly how
> I would index the dependent axis of the histogram.  But in general,
> I am trying to see if a "typical" PDF file can be described (if it
> exists) by an examination of the contents of the raw file (not the
> printed or viewed result of the reader).
> 
> Here is yet another try at what I'm trying to get at.  Suppose you get
> a raw PDF file and run it through a parser and simply histogram
> the occurrence of keys.  Could I look at the frequency of the keys and
> say to myself, "Hmmmm.  This is a catalog.  This one is a form.  This
> one is a ......."
> 
> Please no flames for what I am asking here.  It is not a "development"
> kind of question, so I apologize if it wastes bitspace and does not
> belong here.  I just thought that people who develop apps for PDF
> and deal with the files frequently might have insight.  Believe it or
> not, there are people out there who would like to know this.
> 
> Thanks in advance, and again, apologies for the vagueness of the
> first post.
> 
> Carolyn
> 
> We are what we repeatedly do.  Excellence, then, is not an act, 
>   but a habit.   -- Aristotle
> 
> /****************************************************/
>   * M. Carolyn Briles
>   * Software Engineer    P-21/NIS-9 
>   * MS J570
>   * Los Alamos National Laboratory               
>   * Los Alamos, NM   87545                          
>   * office: 505-665-0980  cell: 505-690-6660 
> /****************************************************/
> 
> 
> 
> To change your subscription:
> http://www.pdfzone.com/discussions/lists-pdfdev.html
> 
> 
> 
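For what it's worth, the histogram Carolyn describes can be roughed out
in a few lines of Python.  This is only a sketch under stated
assumptions, not a real PDF parser: it scans the raw bytes for anything
that looks like a name token (a slash followed by non-whitespace,
non-delimiter characters), so it counts names used as values (such as
/Page) as well as keys, it sees nothing inside compressed streams, it
does not decode #xx escapes, and binary stream data may produce false
matches.  The function names count_names and cosine_similarity are made
up for illustration.

```python
import math
import re
from collections import Counter

# A name token: "/" followed by one or more characters that are neither
# PDF whitespace (NUL, TAB, LF, FF, CR, space) nor PDF delimiters.
NAME_RE = re.compile(rb"/([^\x00\t\n\x0c\r /()<>\[\]{}%]+)")


def count_names(raw: bytes) -> Counter:
    """Histogram every name-like token found in the raw bytes."""
    return Counter(m.group(1).decode("latin-1") for m in NAME_RE.finditer(raw))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Compare two histograms: 1.0 means same shape, 0.0 means no shared names."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Tiny hand-written fragment standing in for a real file's contents.
sample = (
    b"<< /Type /Page /Parent 3 0 R /MediaBox [0 0 612 792] >>\n"
    b"<< /Type /Pages /Kids [4 0 R] >>"
)
histogram = count_names(sample)
```

For a real file one would pass in the bytes read from disk in binary
mode.  Whether the similarity between two such histograms actually
separates "catalogs" from "forms" is exactly the open question in the
post; the sketch only shows how the comparison could be set up.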

