Hi,

From: Grant Ingersoll [mailto:[email protected]]
> Hmm, it does look like I'm still getting the Keywords, but this
> AAPL:Keywords is an additional one. Looks like it is coming from
> PDFBox. I will update my tests.

Tika 0.8 exposes quite a bit more document metadata, and in some cases these additional fields duplicate previously exposed information. For backwards compatibility we didn't remove the old metadata fields, even in cases where the new field is more accurately named or formatted.

In Tika 1.0 we should probably review all such cases and drop the old metadata fields to avoid confusion later on, so you may want to prepare for some extra upgrade work with 1.0.

BR,

Jukka Zitting
