[ 
https://issues.apache.org/jira/browse/TIKA-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14233064#comment-14233064
 ] 

Nick Burch commented on TIKA-1489:
----------------------------------

Can someone find an existing, externally defined specification for 
permissions-related metadata that we could use, covering most or all of these? 
Perhaps something in XMP or one of the XMP extensions?

(Wherever possible, we prefer not to make up Tika-specific keys for new 
items of metadata, and instead re-use existing, well-known definitions.)

> PDF Text extraction without permission
> --------------------------------------
>
>                 Key: TIKA-1489
>                 URL: https://issues.apache.org/jira/browse/TIKA-1489
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.7
>            Reporter: Tilman Hausherr
>
> In TIKA-1442, text extraction works on files like 717226.pdf that don't grant 
> text extraction permission. The permissions in PDF files are only enforced 
> by the application (i.e. PDFBox); that is, the text information isn't stored 
> separately in encrypted form. 
> The PDFBox ExtractText command-line tool does throw an exception.
> So I wonder why Tika is able to extract text. Either Tika or the PDFBox call 
> it uses bypasses the permission checking.
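
To illustrate the point in the quoted description that permissions are only 
enforced by the application: in the standard PDF security handler, the 
permissions are a set of flag bits in the /P entry of the encryption 
dictionary, and bit 5 (value 16) covers copying or extracting text and 
graphics. The following is a minimal sketch of that check, not Tika or PDFBox 
code; the class and method names are hypothetical.

```java
public class PdfPermissionBits {
    // Bit 5 (counting from 1) of the /P flags: copy or extract text and graphics.
    public static final int EXTRACT_BIT = 1 << 4;

    // Hypothetical helper: true if the /P value grants extraction.
    public static boolean canExtract(int p) {
        return (p & EXTRACT_BIT) != 0;
    }

    public static void main(String[] args) {
        // Assumed sample values: every permission bit set (the two low bits are
        // reserved and stay 0), then the same value with bit 5 cleared.
        int allAllowed = 0xFFFFFFFC;
        int noExtract = allAllowed & ~EXTRACT_BIT;

        System.out.println("all allowed -> " + canExtract(allAllowed));
        System.out.println("no extract  -> " + canExtract(noExtract));
    }
}
```

Nothing in this flag stops a reader that ignores it, so whether extraction 
is refused depends on the calling code checking the bit before it proceeds.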



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
