>> There is no such thing as "canonical" PDF - anything that complies with the 
>> PDF specification is valid. That allows for various uses of compression, 
>> ASCII encoding, etc.
>
>Well, not really. If there are rules for the PDF standard, then you could in 
>fact create some alternative representation. It could be very big, verbose, 
>complicated, etc., but it may be a useful intermediate form for various types 
>of work, such as debugging or ad hoc editing, where you don't want to waste 
>time writing custom code to do something simple. 
>
No argument!   

BUT an "intermediate format" (or an "alternative format") and a "canonical 
format" are VERY VERY different things...

There are many folks who have developed alternative representations of PDF, 
whether in XML or other formats, including us at Adobe.  For example, Adobe 
has a project codenamed "Mars" on our Labs site 
(<http://labs.adobe.com/wiki/index.php/Mars>) which defines an XML+ZIP-based 
representation of PDF.  It supports all of the features of PDF as of PDF 1.7.  
We provide some tooling for Acrobat & Reader, and you are welcome to develop 
your own. 

But again, that's NOT canonical - just alternative.
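
To illustrate why there is no single "canonical" byte form, here is a minimal 
Python sketch. It assumes a made-up content stream (not taken from any real 
file) and uses only the standard zlib and binascii modules to mimic what the 
FlateDecode and ASCIIHexDecode filters do to a stream body. The helper names 
are hypothetical. The point is just that several different byte sequences all 
decode to the same content, and each would be equally valid in a PDF.

```python
import binascii
import zlib

# Hypothetical page content stream, invented for this illustration.
content = b"BT /F1 12 Tf 72 720 Td (Hello) Tj ET"


def flate_encode(data: bytes, level: int) -> bytes:
    """Compress the way a /FlateDecode stream body is stored (zlib format)."""
    return zlib.compress(data, level)


def ascii_hex_encode(data: bytes) -> bytes:
    """Encode as hex digits followed by '>' (the ASCIIHexDecode end marker)."""
    return binascii.hexlify(data).upper() + b">"


def ascii_hex_decode(data: bytes) -> bytes:
    """Drop the trailing '>' end marker and convert hex digits back to bytes."""
    return binascii.unhexlify(data.rstrip(b">"))


# Three different on-disk representations of the same stream.
stored = flate_encode(content, 0)       # zlib wrapper, no actual compression
compressed = flate_encode(content, 9)   # zlib wrapper, maximum compression
hexed = ascii_hex_encode(content)       # plain ASCII hex

# Each one decodes back to the identical content.
decoded = [
    zlib.decompress(stored),
    zlib.decompress(compressed),
    ascii_hex_decode(hexed),
]

all_same_content = all(d == content for d in decoded)
distinct_encodings = len({stored, compressed, hexed})
print(all_same_content, distinct_encodings)
```

A byte-for-byte comparison of two PDFs therefore says very little about 
whether they carry the same content, which is why comparison and validation 
tools have to work on the decoded objects.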


>> That's why libraries such as iText exist - to provide you with higher level 
>> APIs (where possible). They are what one would use to create 
>> automated test tools, validators, etc. And many such tools already do exist 
>> - so it's definitely doable (and has been done).
>>
>If you took that attitude, you couldn't even hide behind "but PDF is a 
>standard", since then the argument is "well, I have API xyz and we can do 
>anything with it, if you use my ABC format".  I guess having a list would 
>help. Is there a PDF developer download somewhere with tools like this? 
>
Adobe Acrobat Professional includes a PDF validator feature as part of its 
Preflight module, and has since version 7.  It is the only publicly available 
validator that I am aware of, though I have spoken to at least a half-dozen 
commercial PDF vendors that have told me that they have developed their own 
validators for their own use.

There used to be two limited open source validators - JHOVE 
(<http://hul.harvard.edu/jhove/pdf-hul.html>) and Multivalent 
(<http://multivalent.sourceforge.net/Tools/pdf/Validate.html>).   But to my 
knowledge, neither is currently supported/updated.   Since both were Java-based 
OSS, I would think you could pick them up and run with them if you wished.


Leonard

