PDFdev is a service provided by PDFzone.com | http://www.pdfzone.com

At 3:25 AM -0400 10/14/03, Lori DeFurio wrote:
> In the current product and specification, there are two different places to store a list of keywords associated with a document.

However, as of Acrobat 6, the Info dictionary is ignored in the presence of the XMP metadata. If XMP isn't present, then (and only then) does Acrobat read from the Info dict.
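As a rough illustration of that precedence rule, here is a minimal Python sketch. The function name `effective_keywords` and the use of plain dictionaries to stand in for the XMP packet and the Info dict are assumptions made for this example; they are not Acrobat's API.

```python
from typing import Any, Mapping, Optional


def effective_keywords(xmp: Optional[Mapping[str, Any]],
                       info: Mapping[str, Any]) -> Any:
    """Return the keywords a reader following the rule above would see.

    `xmp` is None when the document has no XMP metadata at all
    (hypothetical representation, for illustration only).
    """
    if xmp is not None:
        # XMP is present, so the Info dict is ignored entirely,
        # even if XMP has no dc:subject entry.
        return xmp.get("dc:subject")
    # Only when XMP is absent is the Info dict consulted.
    return info.get("Keywords")


print(effective_keywords({"dc:subject": ["PDF", "XMP"]}, {"Keywords": "old"}))
print(effective_keywords(None, {"Keywords": "old"}))
```

Note the third case: XMP present but without dc:subject yields nothing, even though /Info has a Keywords entry. That gap is what the proposed syncing would paper over.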



> We are considering a change to Acrobat (and the Adobe PDF Library) so that the two lists of keywords will be automatically kept in sync (much as Author and Title have been since Acrobat 5).

Given that the PDF/A committee has already deprecated /Info in favor of XMP, that the PDF/X committee is considering a similar change for a future revision of their spec, AND that Acrobat 6 already prefers XMP, wouldn't it make more sense to just kill /Info once and for all?!



> 1. When the Info dictionary Keywords entry has to be transcribed into the dc:subject entry in XMP, a heuristic will be applied to separate the string into individual keywords based on separator characters such as commas, semicolons, and spaces.

What do you do about languages where those aren't wordbreak characters, such as CJK? Will you do dictionary lookup for wordbreaks in those cases?
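To make the concern concrete, here is a minimal Python sketch of a separator-based split. The function name `split_keywords` and the exact separator set are assumptions for illustration; this is not the actual heuristic Acrobat would use, which has not been described in detail.

```python
import re
from typing import List

# Assumed separator set: commas, semicolons, and whitespace.
_SEPARATORS = re.compile(r"[,;\s]+")


def split_keywords(info_keywords: str) -> List[str]:
    """Split an Info dict Keywords string into individual keywords."""
    return [k for k in _SEPARATORS.split(info_keywords) if k]


# Works as expected for Roman-script keywords.
print(split_keywords("PDF, metadata; XMP"))   # ['PDF', 'metadata', 'XMP']

# A run of CJK text with no separators stays one "keyword".
print(split_keywords("文書管理メタデータ"))

# The ideographic comma (U+3001) is not in the separator set.
print(split_keywords("東京、大阪"))

# A multi-word keyword is broken apart by the space rule.
print(split_keywords("document management"))
```

Under these assumptions, the last three cases all come out differently from what the author of the keywords most likely intended.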


What will you do about the fact that /Info strings can be in UCS-2 Unicode, but that XMP is usually going to be in UTF-8? Will Acrobat perform the necessary re-encoding?
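For reference, the re-encoding in this direction might look like the following Python sketch. It relies on the fact that a Unicode text string in PDF starts with the byte order marker FE FF. The function name `info_string_to_utf8` is hypothetical, and the Latin-1 fallback is only an approximation of PDFDocEncoding (the two differ in some code points), so treat it as an assumption rather than a correct decoder.

```python
import codecs


def info_string_to_utf8(raw: bytes) -> bytes:
    """Re-encode the raw bytes of an Info dict text string as UTF-8."""
    if raw.startswith(codecs.BOM_UTF16_BE):
        # Unicode text string: drop the FE FF marker, decode big-endian.
        # Decoding as UTF-16 also covers plain UCS-2 content.
        text = raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    else:
        # Assumption: approximate PDFDocEncoding with Latin-1.
        text = raw.decode("latin-1")
    return text.encode("utf-8")


raw = codecs.BOM_UTF16_BE + "日本語".encode("utf-16-be")
assert info_string_to_utf8(raw) == "日本語".encode("utf-8")
assert info_string_to_utf8(b"PDF") == b"PDF"
```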


> 2. When the dc:subject entry in XMP is to be transcribed into the Info dictionary Keywords entry, the individual keyword entries will be concatenated together with semicolons as a separator character.

Will you also do the correct thing with re-encoding from (most likely) UTF-8 to UCS-2? What about the handling of semicolons in non-Roman scripts?
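A sketch of the reverse direction, again in Python and again under stated assumptions: the function names are hypothetical, the separator is a bare semicolon as described in the proposal, and the result is always written as a Unicode text string (FE FF marker plus UTF-16BE), which a real implementation may not do when PDFDocEncoding would suffice.

```python
import codecs
from typing import List


def join_keywords(subjects: List[str]) -> str:
    """Concatenate dc:subject entries with a semicolon separator."""
    return ";".join(subjects)


def to_info_text_string(subjects: List[str]) -> bytes:
    """Encode the joined keywords as a PDF Unicode text string."""
    joined = join_keywords(subjects)
    return codecs.BOM_UTF16_BE + joined.encode("utf-16-be")


out = to_info_text_string(["PDF", "メタデータ"])
assert out[:2] == b"\xfe\xff"
assert out[2:].decode("utf-16-be") == "PDF;メタデータ"

# The join is lossy if a keyword itself contains the separator:
# two entries go in, three come back out after splitting.
print(join_keywords(["a;b", "c"]).split(";"))   # ['a', 'b', 'c']
```

The last line shows why the separator question matters: once the list is flattened into one string, a semicolon inside a keyword can no longer be told apart from a separator.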



> So, here's my question - would an automatic process along these lines cause problems for your development or solutions? We look forward to your feedback.


I personally think it's a waste of engineering time, and I'd rather just see /Info go away in favor of XMP, which is MUCH more modern, flexible, and extensible.



Leonard
--
---------------------------------------------------------------------------
Leonard Rosenthol                            <mailto:[EMAIL PROTECTED]>
Chief Technical Officer                      <http://www.pdfsages.com>
PDF Sages, Inc.                              215-629-3700 (voice)
                                             215-629-0789 (fax)

To change your subscription:
http://www.pdfzone.com/discussions/lists-pdfdev.html


