>> OK - wasn’t precise enough - token types didn’t change but there are newer
>> tokens introduced.
>
> Yes.
>
>> As the syntax has changed do we need version and standards support in the
>> parsing phase then?
>
> I don’t think so, no. I don’t know what the use-case would be. You’d have to
> go back and read all seven versions of the PDF Reference and make sure that
> the parser implements the correct handling for each version, that’s an awful
> lot of work.

OK - so the parser should concentrate on parsing according to the spec (which
is mostly the case with NonSequentialParser today). We also need a strict and
a relaxed way of parsing for files whose base syntax is not correct:
standards-compliant parsing has to catch such errors (we don't have that in
core, only in the PDF/A project), while relaxed parsing would ignore them if
they can be corrected.

>
>> Other way would be to parse what’s in there and do validation etc. purely on
>> the parsing result (COS model, PD model). Need to do that anyway.
>
> Yes, I prefer this approach, you can always write a tool which inspects a
> PDDocument and determines whether or not it uses features available in a
> given PDF version. It seems better to do this as a separate feature than to
> try and build it into the parser or the PD model directly.

Fine with me - that would be something like a 'profile' per standard, which
could be used for validation as well as for writing. To complete that we need
to revisit the PD model, as not all PDF features are reflected in it yet. That
could be done when implementing the profiles.

>
>> What about writing?
>
> Yes, we want versions for writing, because a user may want to generate e.g a
> PDF 1.6 file. This is going to be even more important in the near future
> because the PDF 2.0 standard is supposed to be introduced in 2014.

Some base features are still missing for writing a PDF today, but I think
Andreas has something in the works. The 'profile' mentioned above could be
used for writing too, e.g. to check whether PD model keys are permitted for a
certain standard/version or not.

>
> -- John
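
To make the 'profile' idea a bit more concrete, here is a rough sketch of what
I have in mind. Everything in it is hypothetical (the class PdfProfile and its
methods feature, permits and violations are not PDFBox API), and the three
keys are only illustrative entries, not a complete table. It assumes the keys
in use have already been collected from the COS/PD model by some other step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not PDFBox API: a profile for one target PDF version.
// Versions are encoded as major * 10 + minor (PDF 1.4 -> 14) to avoid
// floating point comparisons.
final class PdfProfile {
    private final int targetVersion;
    // Dictionary key or feature name -> first PDF version that defines it.
    private final Map<String, Integer> introducedIn = new LinkedHashMap<>();

    PdfProfile(int targetVersion) {
        this.targetVersion = targetVersion;
    }

    // Registers a key together with the version that introduced it.
    PdfProfile feature(String key, int sinceVersion) {
        introducedIn.put(key, sinceVersion);
        return this;
    }

    // True if the key is unknown to the profile or old enough for the target.
    boolean permits(String key) {
        Integer since = introducedIn.get(key);
        return since == null || since <= targetVersion;
    }

    // Keys that are too new for the target version, in input order.
    List<String> violations(Iterable<String> usedKeys) {
        List<String> result = new ArrayList<>();
        for (String key : usedKeys) {
            if (!permits(key)) {
                result.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative entries only: soft masks (1.4), object streams (1.5),
        // optional content (1.5).
        PdfProfile pdf14 = new PdfProfile(14)
                .feature("SMask", 14)
                .feature("ObjStm", 15)
                .feature("OCProperties", 15);
        System.out.println(pdf14.violations(Arrays.asList("SMask", "ObjStm", "Type")));
    }
}
```

The same permits() check could be used on the writing side to reject a key
before it is written for a given target version. Treating unknown keys as
permitted is just one possible choice; a strict profile might reject them
instead.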
