PDFdev is a service provided by PDFzone.com | http://www.pdfzone.com
_____________________________________________________________
Still not enough info to help you (top-secret PDF classification project? :), but I would guess the PDFs would cluster by the producing software (of course, most of the time you can just look that up in the Producer entry of the Info dictionary) and by the type of source document (HTML, LaTeX, MS Word, scanned file, etc.).

> Here is another way to ask this question:
> Does a signature (meaning a unique identifier, not the approval kind
> of signature) exist for the "typical" PDF file. In other words, if I
> take a million PDF files and histogram the frequency of the keys,
> will "most" PDF files have the same basic kind of histogram or
> will they be all over the map?

This is confusing. Are you looking for a unique signature of one PDF file within a set of many PDF files, or a signature of PDF files as opposed to other document types?

max.
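
For what it's worth, the histogram experiment described in the quoted question could be sketched roughly as below. This is only a minimal sketch under loud assumptions: it scans the raw bytes of a file for name tokens (`/Type`, `/Font`, ...) with an approximate regular expression, so it counts all names rather than only dictionary keys, and it sees nothing inside compressed streams. The function names `key_histogram` and `cosine_similarity` are made up for illustration.

```python
import math
import re
from collections import Counter

# Approximation of a PDF name token: "/" followed by ordinary characters.
# Assumption: this ignores "#xx" escapes beyond treating "#" as a plain char.
NAME_RE = re.compile(rb"/([A-Za-z][A-Za-z0-9_.+\-#]*)")


def key_histogram(data: bytes) -> Counter:
    """Count name tokens visible in the raw (uncompressed) bytes of one PDF."""
    return Counter(name.decode("latin-1") for name in NAME_RE.findall(data))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Compare two histograms; 1.0 means the same proportions, 0.0 no overlap."""
    dot = sum(count * b.get(name, 0) for name, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Tiny hand-written fragments stand in for real files here.
    doc_a = b"<< /Type /Catalog /Pages 2 0 R >> << /Type /Page >>"
    doc_b = b"<< /Type /Font /Subtype /Type1 >>"
    print(key_histogram(doc_a).most_common(3))
    print(cosine_similarity(key_histogram(doc_a), key_histogram(doc_b)))
```

For real files you would pass in `open(path, "rb").read()` and compare the histograms pairwise, or against an average histogram, to see whether "most" files look alike or whether they split into groups, for example by producing software.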
