RE: Indexing PDF and MS Office files

2015-04-16 Thread Allison, Timothy B.
This sounds like a Tika issue; let's move the discussion to that list. If you are still having problems after you upgrade to Tika 1.8, please at least submit the stack traces (if you can) to the Tika JIRA. We may be able to find a document that triggers that stack trace in govdocs1 or the slice of

RE: Indexing PDF and MS Office files

2015-04-16 Thread Allison, Timothy B.
Let's move this to the Tika users' list. I'm aware that [1] is quite common in govdocs1, and it might (?) be the source of your problem with MS Word files. If you can share a stack trace, we'll be better able to diagnose. Best, Tim [1]