On 12/7/2011 11:15 AM, Michael McCandless wrote:
This looks just like:

     https://issues.apache.org/jira/browse/TIKA-801

Likely Tika's parser is (incorrectly) producing invalid XHTML tags for
your document... when you open the Jira issue can you attach the
problematic document?  Thanks.


Maybe, because ironically enough the document is an actual e-mail exchange between the CTO of our company and an alpha test customer about a non-disclosure agreement. :-(

If I could binary-edit it to drop the names referenced and the attached NDA document itself, I might be able to generate an example that still fails. I will experiment with things like forwarding it without the attachment and then hacking some bytes. If it still fails, I'll send it your way.

-Paul
