I imagine the problem may be not just extracting metadata from the real title page of the PDF, but also the ability to create the document group (as Anurag mentioned in the talk) that matches the same document across multiple sources. That matching might employ techniques such as a content hash of the PDF, which would obviously change with any modification of the PDF file. The message was clear: do not tamper with the original file.
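To illustrate the point about content hashes, here is a minimal Python sketch. It assumes the hash is SHA-256 over the raw file bytes, which is only my guess and not something stated in the talk; the function name content_hash and the sample byte strings are made up for illustration and are not real PDFs.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return the SHA-256 hex digest of the raw file bytes (assumed scheme)."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical stand-ins for the bytes of a PDF file; not valid PDFs.
original = b"%PDF-1.4\n... original pages ...\n%%EOF\n"
identical_copy = bytes(original)  # the same file held in another repository
with_cover_page = b"%PDF-1.4\n... added cover page ...\n... original pages ...\n%%EOF\n"

# Byte-identical copies hash the same, so they can be grouped together.
copies_match = content_hash(original) == content_hash(identical_copy)

# Any change to the file, such as a prepended cover page, yields a different hash.
modified_differs = content_hash(original) != content_hash(with_cover_page)

print("identical copies match:", copies_match)
print("modified file differs:", modified_differs)
```

If grouping really works along these lines, a repository that adds a cover page to the PDF would end up with a file that no longer matches the copies held elsewhere.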
Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
------------------------------------------------------------------------------
_______________________________________________
Dspace-general mailing list
Dspace-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-general