[ https://issues.apache.org/jira/browse/TIKA-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17484923#comment-17484923 ]
Tim Allison commented on TIKA-3666:
-----------------------------------

> I'm working to get a non-sensitive one (or several in the different office
> formats) available for testing.

That's critical. Thank you!

> Detect and indicate file encrypted with Rights Management Service RMS/IRM
> -------------------------------------------------------------------------
>
>                 Key: TIKA-3666
>                 URL: https://issues.apache.org/jira/browse/TIKA-3666
>             Project: Tika
>          Issue Type: Improvement
>          Components: metadata
>            Reporter: August Valera
>            Priority: Major
>
> Rights Management Service (RMS), implemented in MS Office as Information
> Rights Management (IRM), allows organizations to set file permissions that
> are stored within the file. In most cases, this results in the file
> getting a new extension (prefixed with p, such as {{.txt}} becoming
> {{.ptxt}}), but in the case of MS Office and PDF files, which support
> this natively, the file contents are encrypted without any extension
> change.
> h4. Current behavior
> Running such files through Tika produces the same results as running an
> empty file through {{DefaultParser}} and {{OfficeParser}}.
> h4. Expected behavior
> Extract more metadata about the permissions needed to view the file (if
> possible), and throw {{EncryptedDocumentException}}, as is the case with
> Office files encrypted in the more traditional manner.
> Reference:
> [https://docs.microsoft.com/en-us/azure/information-protection/rms-client/clientv2-admin-guide-file-types#supported-file-types-for-classification-and-protection]

--
This message was sent by Atlassian Jira
(v8.20.1#820001)