[ https://issues.apache.org/jira/browse/SOLR-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jan Høydahl updated SOLR-1929:
------------------------------
Attachment: SOLR-1929.patch
Updated patch for trunk, which uses the new Tika feature from TIKA-850.
It contains a RegexRulesPasswordProvider backed by a regex rules file and/or
an explicit password.
New Solr Cell request params:
* resource.password - explicit password for this file
* passwordsFile - name of a properties file listing known passwords, keyed by
filename regex. Loaded using ResourceLoader
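
To illustrate the lookup idea behind the two params, here is a minimal
standalone sketch. It is not the code from the patch: the class name
PasswordLookupSketch, its methods, the "first matching rule wins" ordering and
the rule that an explicit password overrides the regex rules are all
assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch of a filename-regex password lookup; not the patch's class. */
public class PasswordLookupSketch {

    /** One rule: a compiled filename regex and the password it maps to. */
    private static final class Rule {
        final Pattern pattern;
        final String password;

        Rule(Pattern pattern, String password) {
            this.pattern = pattern;
            this.password = password;
        }
    }

    // Rules are kept in insertion order; the first match wins (assumption).
    private final List<Rule> rules = new ArrayList<>();

    // Plays the role of the resource.password request param (assumption).
    private String explicitPassword;

    /** Adds a rule, as one line of a passwordsFile might (format assumed). */
    public void addRule(String filenameRegex, String password) {
        rules.add(new Rule(Pattern.compile(filenameRegex), password));
    }

    public void setExplicitPassword(String password) {
        this.explicitPassword = password;
    }

    /** Returns the password for the given file name, or null if none is known. */
    public String lookup(String fileName) {
        if (explicitPassword != null) {
            return explicitPassword; // explicit password takes precedence (assumption)
        }
        if (fileName == null) {
            return null;
        }
        for (Rule rule : rules) {
            // matches() requires the regex to match the whole file name
            if (rule.pattern.matcher(fileName).matches()) {
                return rule.password;
            }
        }
        return null;
    }
}
```

With rules such as ".*\.pdf" and ".*\.docx" registered, lookup("report.pdf")
would return the PDF rule's password, and a file name matching no rule yields
null, so the caller can fall back to treating the file as unencrypted.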
Note that Tika currently supports passwords for PDF and DOCX files only, not
legacy DOC files or any other type. I tried to decrypt the existing test file
password-is-solrcell.docx, but it fails due to an unsupported encryption method
in Apache POI.
To apply this patch and have the tests pass, you also need to add two
binary files by unzipping SOLR-1929-extra-docs.zip in the project root.
> Index encrypted files
> ---------------------
>
> Key: SOLR-1929
> URL: https://issues.apache.org/jira/browse/SOLR-1929
> Project: Solr
> Issue Type: Improvement
> Components: contrib - Solr Cell (Tika extraction)
> Reporter: Yiannis Pericleous
> Assignee: Jan Høydahl
> Priority: Minor
> Fix For: 4.0, 5.0
>
> Attachments: SOLR-1929.patch, SOLR-1929.patch
>
>
> SolrCell should be able to index encrypted files (pdfs, word docs).