Amit Humnabadkar created TIKA-2441:
--------------------------------------

Summary: Unable to extract text present in a table inside a textbox in MS Word
Key: TIKA-2441
URL: https://issues.apache.org/jira/browse/TIKA-2441
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 1.15
Environment: Windows, Linux, Apache Tika 1.15 used with Apache Solr-6.6.0
Reporter: Amit Humnabadkar

Hello,

I am using Tika 1.15 with Solr 6.6.0 for indexing and searching. This setup fails to index text that sits in a table inside a textbox in a Word document.

The attached MS Word document contains two words:
1. Germany - present in a table inside a textbox
2. Africa - present in a textbox

Germany is not indexed, while Africa is indexed successfully. It looks like Tika fails to extract content that is in a table inside a textbox.

Please have a look.

Thanks,
Amit Humnabadkar

[^doc001.zip]

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
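
To confirm that the missing word really lives in a table inside a textbox, the document's XML can be inspected directly. The following is only a rough sketch and rests on assumptions not stated in the report: that the attached file is a .docx (OOXML) rather than a binary .doc, and that the textbox content is stored under `w:txbxContent` in `word/document.xml`. The function name `textbox_text` is made up for illustration. It uses only the Python standard library and does not call Tika.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by .docx files.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def textbox_text(docx_bytes):
    """Return (in_table, outside_table): text runs found inside textboxes.

    Hypothetical helper. Assumes a .docx whose textbox content is stored
    under w:txbxContent. Real files often store each textbox twice
    (a DrawingML copy and a VML fallback), so duplicates may appear.
    """
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    in_table, outside_table = [], []
    for box in root.iter(W + "txbxContent"):
        # Identify the w:t elements that sit under a w:tbl within this textbox.
        table_runs = {
            id(t) for tbl in box.iter(W + "tbl") for t in tbl.iter(W + "t")
        }
        for t in box.iter(W + "t"):
            target = in_table if id(t) in table_runs else outside_table
            target.append(t.text or "")
    return in_table, outside_table
```

Calling `textbox_text(open("doc001.docx", "rb").read())` (the file name is assumed) on the document would be expected to list Germany in the first result and Africa in the second if the structure matches the description above. Any word that shows up in the first list but not in the Tika output would point at the table-inside-textbox case.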